[xsmith-dev] XSmith questions

Tue Jul 20 10:57:43 MDT 2021

We are primarily focused on different compiler backends for Dafny (Dafny
can be compiled to Java, C#, Rust, JS, Go, C++, and perhaps more).

On Tue, Jul 20, 2021 at 9:34 AM William G Hatch <william at hatch.uno> wrote:

> >> Do you have a roadmap for implementing ...
>
> Not at the moment.  Right now I'm focused on getting some of our
> mostly-finished fuzzers *really* finished and ready for fuzzing,
> running some fuzzing campaigns to find some bugs, and some features
> that are immediately helpful for running fuzzing (eg. some stuff to
> support feedback-directed fuzzing and choice weighting).
>
>
> >Also, I'm curious about the setup of various fuzzers that you are running.
> >How many programs are you able to generate and test per minute? Do you
> have
> >any particular parallelization setup that works well for you?
>
> There is a harness repository that we use that was mostly written by
> Guy Watson:
>
> https://gitlab.flux.utah.edu/xsmith/harness
>
> There is a `harness.py` script that takes a configuration file to run
> an xsmith fuzzer, running its output through various implementations
> of whatever language, comparing the output, and binning the result
> (eg. everything matching, diff, crash, everything crashing (IE bad
> program), xsmith itself crashing, ...).  That harness script is
> basically for running a single job, and there are 2 ways of doing it
> in parallel.
>
> The first one is the `go-button`, which can run fuzzing on multiple
> machines at once.  It is fairly dependent on Emulab/Cloudlab
> (https://www.emulab.net/) to have things set up just right, and
> requires at least 2 machines.  It also generates a bunch of live
> graphs about fuzzing.  It uses ansible to run setup scripts to prep
> each machine before starting the fuzz campaign.
>
> I mostly use `local-multi-harness` instead, which just runs on a
> single machine, but runs multiple harness processes.  It doesn't
> automatically run the setup scripts like the `go-button` does, so
> eg. if you want to use the `racket-kernel-fuzzer` configuration you
> need to run the `racket-install.rkt` script first manually.
>
> As far as speed, xsmith is not particularly fast.  There is a server
> mode that at least saves on startup time, but I think the harness
> might not run it particularly well (Guy correct me if I'm wrong -- at
> any rate I haven't been using it with server mode, but that might be
> because I never bothered changing my config after bugs were fixed).
> Aside from Racket's slow startup speed (particularly bad with Xsmith
> because it loads a lot of stuff), the runtime varies a lot based on
> the max-depth configuration, the size of the grammar, etc.
>
> I haven't focused a lot on performance, necessarily.  Basically
> whenever I decide Xsmith is too slow for my taste I spend some time
> trying to optimize it, and generally make a few gains until I think
> “that's enough optimization for now”.  If you can find any ways to
> make improvements, that would be great!  I might have taken all of the
> low-hanging fruit in my previous optimization passes, however.  In
> previous optimization passes I got the most out of optimizing in the
> type system, but I made some major improvements to type checking speed
> last time so it may not dominate anymore.
>
>
> As an aside, a question for you: Are there multiple Dafny
> implementations?  Or different optimization levels?  Or are you
> running different versions of the compiler against each other?
>
>
>
> On Mon, Jul 19, 2021 at 04:15:04PM -0700, Sorawee Porncharoenwase wrote:
> >Also, I'm curious about the setup of various fuzzers that you are running.
> >How many programs are you able to generate and test per minute? Do you
> have
> >any particular parallelization setup that works well for you?
> >
> >On Fri, Jul 16, 2021 at 8:32 PM Sorawee Porncharoenwase <
> >sorawee.pwase at gmail.com> wrote:
> >
> >> Do you have a roadmap for implementing "Provide mechanism for precluding
> >> types in `fresh-type-variable` invocations"? (
> >> https://gitlab.flux.utah.edu/xsmith/xsmith/-/issues/56). I also really
> >> want this feature.
> >>
> >> On Thu, Jul 15, 2021 at 9:22 PM William G Hatch <william at hatch.uno>
> wrote:
> >>
> >>> That would be great, thanks!
> >>>
> >>> On Thu, Jul 15, 2021 at 08:09:13PM -0700, Sorawee Porncharoenwase
> wrote:
> >>> >Yes, this is very helpful. I will let you know if I encounter any
> further
> >>> >issues after updating XSmith.
> >>> >
> >>> >By the way, do you want any help with Scribble? I notice a lot of
> typos
> >>> and
> >>> >link errors and can submit PRs to fix them.
> >>> >
> >>> >On Thu, Jul 15, 2021 at 8:00 PM William G Hatch <william at hatch.uno>
> >>> wrote:
> >>> >
> >>> >> Sorry for the delay.  I wanted to do the debugging before
> responding,
> >>> >> but then I didn't get to it for a while.
> >>> >>
> >>> >> Anyway, I now see what's going wrong, and I've fixed some bugs.
> >>> >>
> >>> >> First, that issue with the bud-node was a bug in the `edit` property
> >>> >> which the canned-components `ProcedureApplication` uses.  It's fixed
> >>> >> now.
> >>> >>
> >>> >> >> Hmm, it seems like first-order functions would be encoded as a
> >>> >> >> function application node that only allows a reference as its RHS
> >>> >> >> rather than allowing arbitrary expressions.  But that said, I'm
> not
> >>> >> >> sure I fully understand what you're trying to do.
> >>> >> >>
> >>> >> >I’m not sure what you mean by that. But:
> >>> >>
> >>> >> Erm, I misspoke (mistyped) a bit.  I meant that it would be function
> >>> >> application that only allows a reference in the procedure
> position.  I
> >>> >> don't know why I wrote RHS.
> >>> >>
> >>> >> Anyway, the strategy of having separate definition nodes for
> functions
> >>> >> and everything else is used in Cish.  I don't recommend following
> >>> >> anything Cish does.  However, it accomplishes this by having the
> >>> >> program node have non-function definitions first, and disallowing
> use
> >>> >> of functions on the RHS of those functions.
> >>> >>
> >>> >> The problem where it can't find a reference is a bug (that I haven't
> >>> >> fixed yet) caused by the lifting mechanics not really being
> completely
> >>> >> in sync with the reference mechanics with respect to nodes that have
> >>> >> multiple definition fields.  When there are multiple definition
> >>> >> fields, definitions in the lower field can reference the definitions
> >>> >> in the higher field, but not vice versa.  The lifting machinery
> >>> >> doesn't know about that ordering, it just stuffs the lifted
> definition
> >>> >> into the front of the definitions in whatever field it is
> targetting.
> >>> >> Thus it places a definition in a place that the reference machinery
> >>> >> thinks is out of scope.
> >>> >>
> >>> >> So... if you want to have different binders that get different
> fields
> >>> >> in the program node you can only make it work if you disallow stuff
> to
> >>> >> be sure you never need to reference something that will be lifted
> into
> >>> >> the lower field.
> >>> >>
> >>> >> That said, I've just fixed some issues to allow definition nodes to
> be
> >>> >> subtypable, so now you can have a definition node that is a subtype
> of
> >>> >> Definition, but has some different rules.  Importantly, though, it
> can
> >>> >> be lifted to the same field.
> >>> >>
> >>> >> Eg. if I change the MyDefinition declaration in your example to look
> >>> >> like this:
> >>> >>
> >>> >> ```
> >>> >>   [MyDefinition Definition ()
> >>> >>                 #:prop fresh (hash 'Expression (make-hole
> >>> >> 'ProcedureApplication))]
> >>> >> ```
> >>> >>
> >>> >> It can now be lifted among normal definitions, but always have
> >>> >> ProcedureApplication as its RHS.
> >>> >>
> >>> >> However, I'm still not convinced that you actually want
> >>> >> ProcedureApplication as the RHS.  Maybe you want to restrict lambda
> >>> >> nodes to only be generated on the RHS of a FunctionDefinition node
> >>> >> (that's a subtype of Definition), and make FunctionDefinition always
> >>> >> have a Lambda node as its RHS.
> >>> >>
> >>> >> There is an example in the documentation that is like this, in the
> >>> >> explanation of the `choice-filters-to-apply` property.  But I'll
> paste
> >>> >> it here:
> >>> >>
> >>> >> ```
> >>> >> (add-choice-method
> >>> >>   my-component
> >>> >>   no-lambda-except-global-def
> >>> >>   [#f (λ () #t)]
> >>> >>   [Lambda (λ ()
> >>> >>             (and (parent-node current-hole)
> >>> >>                  (equal? (ast-node-type (parent-node current-hole))
> >>> >>                          'Definition)
> >>> >>                  (parent-node (parent-node current-hole))
> >>> >>                  (equal? (ast-node-type (parent-node (parent-node
> >>> >> current-hole)))
> >>> >>                          'Program)))])
> >>> >> (add-property my-component
> >>> >>            choice-filters-to-apply
> >>> >>            [#f (no-lambda-except-global-def)])
> >>> >> ```
> >>> >>
> >>> >> You can use something like that to allow Lambda nodes to only be
> >>> >> generated on the RHS of FunctionDefinition nodes, and define
> >>> >> FunctionDefinition like MyDefinition above to always generate Lambda
> >>> >> as its RHS, and I think it should work for a first-order language.
> >>> >>
> >>> >>
> >>> >> >(Also note that I tried unify! these unrestricted type variables
> >>> >> >with some concrete non-function type variables, but down the road,
> I
> >>> get
> >>> >> an
> >>> >> >internal error about unification failure.). Perhaps I'm not
> supposed
> >>> to
> >>> >> >unify! inside add-choice-method?
> >>> >>
> >>> >> Well, probably not.  Generally the only place you want to `unify!`
> is
> >>> >> inside the type property in the function that determines child types
> >>> >> based on the parent type.  Inside that function you can `unify!`,
> but
> >>> >> the unification needs to be consistent (IE not unified based on the
> >>> >> result of `random`) OR the choice needs to be saved in a way that it
> >>> >> can be looked up when that function is run again.
> >>> >>
> >>> >> The problem is that the choice method gets its type information
> based
> >>> >> on a hole node that is later replaced, so the type is recomputed
> >>> >> without necessarily consulting the type that the hole said it was.
> >>> >> You should be able to constrain the type using unification if you
> save
> >>> >> that type to re-unify it consistently later.  But without seeing the
> >>> >> exact error and code that generated it, I'm not 100% certain.
> >>> >>
> >>> >>
> >>> >> > In the last email, I asked if lift-type->ast-binder-type could be
> >>> used
> >>> >> > nondeterministically in this manner, and my experience with it so
> far
> >>> >> seems
> >>> >> > to indicate no.
> >>> >>
> >>> >> I think this should work, barring the issues mentioned above about
> the
> >>> >> lifting vs referencing bug when you have the different definition
> >>> >> types in different fields.  Adjusting your example to have
> >>> >> MyDefinition as a subtype, and then only having a Definition field
> in
> >>> >> MyProgram, random choice seems to work fine.  But sometimes it
> >>> >> generates gargantuan programs (that I kill before it finishes)
> because
> >>> >> it keeps choosing MyDefinition, requiring a function application,
> >>> >> which likely lifts a reference to a function, which might choose
> >>> >> MyDefinition, etc, in a bad lift chain that generates bigger and
> >>> >> bigger types.
> >>> >>
> >>> >>
> >>> >> Anyway, does that all help?
> >>> >>
> >>> >>
> >>> >> On Tue, Jul 13, 2021 at 02:55:10PM -0700, Sorawee Porncharoenwase
> >>> wrote:
> >>> >> >>
> >>> >> >> Hmm, it seems like first-order functions would be encoded as a
> >>> >> >> function application node that only allows a reference as its RHS
> >>> >> >> rather than allowing arbitrary expressions.  But that said, I'm
> not
> >>> >> >> sure I fully understand what you're trying to do.
> >>> >> >>
> >>> >> >I’m not sure what you mean by that. But:
> >>> >> >
> >>> >> >   1. I have no problem with the definition site. That’s easy to
> deal
> >>> >> with.
> >>> >> >   2. I also have no problem with restricting function application.
> >>> That’s
> >>> >> >   easy to deal with as well.
> >>> >> >   3. The problem is primarily with a bare variable reference bound
> >>> to a
> >>> >> >   closure value, which must only occur in function application.
> For
> >>> >> example,
> >>> >> >   the following code is invalid.
> >>> >> >
> >>> >> >(define (f) (void))
> >>> >> >
> >>> >> >(define g f) ;; this is invalid
> >>> >> >(define h (g))
> >>> >> >(define a (list f)) ;; this is also invalid
> >>> >> >(define b (f)) ;; this is ok
> >>> >> >
> >>> >> >I have been using choice-filters-to-apply to restrict variable
> >>> references
> >>> >> >whose type can-unify? with function-type that doesn't directly
> appear
> >>> >> under
> >>> >> >function application. But this is obviously not ideal because the
> >>> filtered
> >>> >> >nodes are mostly those whose type is totally unrestricted. So
> >>> filtering
> >>> >> >them would prevent them from being concretized to other
> non-function
> >>> types
> >>> >> >as well. (Also note that I tried unify! these unrestricted type
> >>> variables
> >>> >> >with some concrete non-function type variables, but down the road,
> I
> >>> get
> >>> >> an
> >>> >> >internal error about unification failure.). Perhaps I'm not
> supposed
> >>> to
> >>> >> >unify! inside add-choice-method?
> >>> >> >
> >>> >> >That's probably a bug.  “Lifting” is what I've called the situation
> >>> >> >> where a reference is generated that causes a new definition to be
> >>> >> >> added such that the variable is in scope at the use site.  IE the
> >>> >> >> binding is lifted from the use site to some outer node that
> contains
> >>> >> >> definitions.  `lift_X` is the name Xsmith gives to definitions
> that
> >>> >> >> are created automatically in this way (which is ultimately most
> >>> >> >> definitions).  The exception is caused because somehow the lift
> >>> ended
> >>> >> >> up in a place where it wasn't actually in scope.  That shouldn't
> >>> >> >> happen.  Anyway, if you point me to a version of a fuzzer that
> does
> >>> >> >> this, I'll try to debug it.
> >>> >> >>
> >>> >> >#lang clotho
> >>> >> >
> >>> >> >(require
> >>> >> > xsmith
> >>> >> > racr
> >>> >> > xsmith/racr-convenience
> >>> >> > xsmith/canned-components
> >>> >> > racket/string
> >>> >> > racket/list
> >>> >> > racket/pretty)
> >>> >> >
> >>> >> >;; We first define a basic component and add a bunch of
> expressions.
> >>> >> >
> >>> >> >(define-basic-spec-component somelisp)
> >>> >> >
> >>> >> >(add-basic-expressions somelisp
> >>> >> >                       #:ProgramWithSequence #t
> >>> >> >                       #:VoidExpression #t
> >>> >> >                       #:AssignmentExpression #t
> >>> >> >                       #:VariableReference #t
> >>> >> >                       #:ProcedureApplication #t
> >>> >> >                       #:IfExpression #t
> >>> >> >                       #:ExpressionSequence #t
> >>> >> >                       #:LetSequential #t
> >>> >> >                       #:LambdaWithExpression #t
> >>> >> >                       #:Numbers #t
> >>> >> >                       #:Booleans #t
> >>> >> >                       #:Strings #t
> >>> >> >                       #:ImmutableList #t)
> >>> >> >
> >>> >> >(add-loop-over-container
> >>> >> > somelisp
> >>> >> > #:name Map
> >>> >> > #:collection-type-constructor (λ (inner) (immutable (list-type
> >>> inner)))
> >>> >> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
> >>> >> >
> >>> >> >(add-to-grammar
> >>> >> > somelisp
> >>> >> > [MyProgram #f
> >>> >> >            ([decls : Definition *]
> >>> >> >             [declsTwo : MyDefinition *]
> >>> >> >             [body : Expression])
> >>> >> >            #:prop type-info
> >>> >> >            [(fresh-type-variable)
> >>> >> >             (λ (n t)
> >>> >> >               (hash 'decls (λ (c) (fresh-type-variable))
> >>> >> >                     'declsTwo (λ (c) (fresh-type-variable))
> >>> >> >                     'body (fresh-type-variable)))]]
> >>> >> > [MyDefinition #f ([type]
> >>> >> >                   [name = (fresh-var-name "b_")]
> >>> >> >                   [e : ProcedureApplication])
> >>> >> >               #:prop binder-info ()
> >>> >> >               #:prop type-info [(fresh-type-variable) (λ (n t)
> (hash
> >>> 'e
> >>> >> t))]])
> >>> >> >
> >>> >> >(add-property
> >>> >> > somelisp
> >>> >> > lift-type->ast-binder-type
> >>> >> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
> >>> >> >
> >>> >> >;; Sometimes we have a type variable that is not concrete but needs
> >>> to be.
> >>> >> >;; Here we provide a list of options we can pick for an
> unconstrained
> >>> >> >;; type variable.
> >>> >> >(define (type-thunks-for-concretization)
> >>> >> >  (list
> >>> >> >   (λ()int-type)
> >>> >> >   (λ()bool-type)
> >>> >> >   (λ()string-type)
> >>> >> >   (λ()(immutable (list-type (fresh-type-variable))))))
> >>> >> >
> >>> >> >(define (somelisp-format-render s-exps)
> >>> >> >  (define out (open-output-string))
> >>> >> >  (for ([symex s-exps])
> >>> >> >    (pretty-print symex out 1))
> >>> >> >  (get-output-string out))
> >>> >> >
> >>> >> >(define-xsmith-interface-functions
> >>> >> >  [somelisp]
> >>> >> >  #:program-node MyProgram
> >>> >> >  #:type-thunks type-thunks-for-concretization
> >>> >> >  #:comment-wrap (λ (lines)
> >>> >> >                   (string-join
> >>> >> >                    (map (λ (l) (format ";; ~a" l)) lines)
> >>> >> >                    "\n"))
> >>> >> >  #:format-render somelisp-format-render)
> >>> >> >
> >>> >> >(module+ main (somelisp-command-line))
> >>> >> >
> >>> >> >
> >>> >> >> Probably something is going wrong in the `fresh` property.  Have
> you
> >>> >> >> changed that?
> >>> >> >>
> >>> >> >I don’t think so. See the code at the end of the email.
> >>> >> >
> >>> >> >> I should change that error message.  Anyway, it seems to think
> there
> >>> >> >> is no outer node that has a `Definition *` child that can be the
> >>> >> >> target for lifting a definition.  Generally this happens when you
> >>> >> >> forget to have an outer Program node or something that can hold
> >>> >> >> definitions.
> >>> >> >>
> >>> >> >I don’t think that’s the case, but I could be wrong.
> >>> >> >
> >>> >> >#lang clotho
> >>> >> >
> >>> >> >(require
> >>> >> > xsmith
> >>> >> > racr
> >>> >> > xsmith/racr-convenience
> >>> >> > xsmith/canned-components
> >>> >> > racket/string
> >>> >> > racket/list
> >>> >> > racket/pretty)
> >>> >> >
> >>> >> >;; We first define a basic component and add a bunch of
> expressions.
> >>> >> >
> >>> >> >(define-basic-spec-component somelisp)
> >>> >> >
> >>> >> >(add-basic-expressions somelisp
> >>> >> >                       #:ProgramWithSequence #t
> >>> >> >                       #:VoidExpression #t
> >>> >> >                       #:AssignmentExpression #t
> >>> >> >                       #:VariableReference #t
> >>> >> >                       #:ProcedureApplication #t
> >>> >> >                       #:IfExpression #t
> >>> >> >                       #:ExpressionSequence #t
> >>> >> >                       #:LetSequential #t
> >>> >> >                       #:LambdaWithExpression #t
> >>> >> >                       #:Numbers #t
> >>> >> >                       #:Booleans #t
> >>> >> >                       #:Strings #t
> >>> >> >                       #:ImmutableList #t)
> >>> >> >
> >>> >> >(add-loop-over-container
> >>> >> > somelisp
> >>> >> > #:name Map
> >>> >> > #:collection-type-constructor (λ (inner) (immutable (list-type
> >>> inner)))
> >>> >> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
> >>> >> >
> >>> >> >(add-to-grammar
> >>> >> > somelisp
> >>> >> > [MyProgram #f
> >>> >> >            ([decls : Definition *]
> >>> >> >             [body : SubProgram])
> >>> >> >            #:prop type-info
> >>> >> >            [(fresh-type-variable)
> >>> >> >             (λ (n t)
> >>> >> >               (hash 'decls (λ (c) (fresh-type-variable))
> >>> >> >                     'body (fresh-type-variable)))]]
> >>> >> > [SubProgram #f
> >>> >> >             ([decls : MyDefinition *]
> >>> >> >              [body : Expression])
> >>> >> >             #:prop type-info
> >>> >> >             [(fresh-type-variable)
> >>> >> >              (λ (n t)
> >>> >> >                (hash 'decls (λ (c) (fresh-type-variable))
> >>> >> >                      'body (fresh-type-variable)))]]
> >>> >> > [MyDefinition #f ([type]
> >>> >> >                   [name = (fresh-var-name "b_")]
> >>> >> >                   [e : ProcedureApplication])
> >>> >> >               #:prop binder-info ()
> >>> >> >               #:prop type-info [(fresh-type-variable) (λ (n t)
> (hash
> >>> 'e
> >>> >> t))]])
> >>> >> >
> >>> >> >(add-property
> >>> >> > somelisp
> >>> >> > lift-type->ast-binder-type
> >>> >> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
> >>> >> >
> >>> >> >;; Sometimes we have a type variable that is not concrete but needs
> >>> to be.
> >>> >> >;; Here we provide a list of options we can pick for an
> unconstrained
> >>> >> >;; type variable.
> >>> >> >(define (type-thunks-for-concretization)
> >>> >> >  (list
> >>> >> >   (λ()int-type)
> >>> >> >   (λ()bool-type)
> >>> >> >   (λ()string-type)
> >>> >> >   (λ()(immutable (list-type (fresh-type-variable))))))
> >>> >> >
> >>> >> >(define (somelisp-format-render s-exps)
> >>> >> >  (define out (open-output-string))
> >>> >> >  (for ([symex s-exps])
> >>> >> >    (pretty-print symex out 1))
> >>> >> >  (get-output-string out))
> >>> >> >
> >>> >> >(define-xsmith-interface-functions
> >>> >> >  [somelisp]
> >>> >> >  #:program-node MyProgram
> >>> >> >  #:type-thunks type-thunks-for-concretization
> >>> >> >  #:comment-wrap (λ (lines)
> >>> >> >                   (string-join
> >>> >> >                    (map (λ (l) (format ";; ~a" l)) lines)
> >>> >> >                    "\n"))
> >>> >> >  #:format-render somelisp-format-render)
> >>> >> >
> >>> >> >(module+ main (somelisp-command-line))
> >>> >>
> >>>
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </listarchives/xsmith-dev/attachments/20210720/0b4a4aee/attachment.html>