[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [xsmith-dev] XSmith questions



Also, I'm curious about the setup of various fuzzers that you are running. How many programs are you able to generate and test per minute? Do you have any particular parallelization setup that works well for you?

On Fri, Jul 16, 2021 at 8:32 PM Sorawee Porncharoenwase <sorawee.pwase@gmail.com> wrote:
Do you have a roadmap for implementing "Provide mechanism for precluding types in `fresh-type-variable` invocations"? (https://gitlab.flux.utah.edu/xsmith/xsmith/-/issues/56). I also really want this feature.

On Thu, Jul 15, 2021 at 9:22 PM William G Hatch <william@hatch.uno> wrote:
That would be great, thanks!

On Thu, Jul 15, 2021 at 08:09:13PM -0700, Sorawee Porncharoenwase wrote:
>Yes, this is very helpful. I will let you know if I encounter any further
>issues after updating XSmith.
>
>By the way, do you want any help with Scribble? I notice a lot of typos and
>link errors and can submit PRs to fix them.
>
>On Thu, Jul 15, 2021 at 8:00 PM William G Hatch <william@hatch.uno> wrote:
>
>> Sorry for the delay.  I wanted to do the debugging before responding,
>> but then I didn't get to it for a while.
>>
>> Anyway, I now see what's going wrong, and I've fixed some bugs.
>>
>> First, that issue with the bud-node was a bug in the `edit` property
>> which the canned-components `ProcedureApplication` uses.  It's fixed
>> now.
>>
>> >> Hmm, it seems like first-order functions would be encoded as a
>> >> function application node that only allows a reference as its RHS
>> >> rather than allowing arbitrary expressions.  But that said, I'm not
>> >> sure I fully understand what you're trying to do.
>> >>
>> >I’m not sure what you mean by that. But:
>>
>> Erm, I misspoke (mistyped) a bit.  I meant that it would be function
>> application that only allows a reference in the procedure position.  I
>> don't know why I wrote RHS.
>>
>> Anyway, the strategy of having separate definition nodes for functions
>> and everything else is used in Cish.  I don't recommend following
>> anything Cish does.  However, it accomplishes this by having the
>> program node have non-function definitions first, and disallowing use
>> of functions on the RHS of those functions.
>>
>> The problem where it can't find a reference is a bug (that I haven't
>> fixed yet) caused by the lifting mechanics not really being completely
>> in sync with the reference mechanics with respect to nodes that have
>> multiple definition fields.  When there are multiple definition
>> fields, definitions in the lower field can reference the definitions
>> in the higher field, but not vice versa.  The lifting machinery
>> doesn't know about that ordering, it just stuffs the lifted definition
>> into the front of the definitions in whatever field it is targetting.
>> Thus it places a definition in a place that the reference machinery
>> thinks is out of scope.
>>
>> So... if you want to have different binders that get different fields
>> in the program node you can only make it work if you disallow stuff to
>> be sure you never need to reference something that will be lifted into
>> the lower field.
>>
>> That said, I've just fixed some issues to allow definition nodes to be
>> subtypable, so now you can have a definition node that is a subtype of
>> Definition, but has some different rules.  Importantly, though, it can
>> be lifted to the same field.
>>
>> Eg. if I change the MyDefinition declaration in your example to look
>> like this:
>>
>> ```
>>   [MyDefinition Definition ()
>>                 #:prop fresh (hash '_expression_ (make-hole
>> 'ProcedureApplication))]
>> ```
>>
>> It can now be lifted among normal definitions, but always have
>> ProcedureApplication as its RHS.
>>
>> However, I'm still not convinced that you actually want
>> ProcedureApplication as the RHS.  Maybe you want to restrict lambda
>> nodes to only be generated on the RHS of a FunctionDefinition node
>> (that's a subtype of Definition), and make FunctionDefinition always
>> have a Lambda node as its RHS.
>>
>> There is an example in the documentation that is like this, in the
>> explanation of the `choice-filters-to-apply` property.  But I'll paste
>> it here:
>>
>> ```
>> (add-choice-method
>>   my-component
>>   no-lambda-except-global-def
>>   [#f (λ () #t)]
>>   [Lambda (λ ()
>>             (and (parent-node current-hole)
>>                  (equal? (ast-node-type (parent-node current-hole))
>>                          'Definition)
>>                  (parent-node (parent-node current-hole))
>>                  (equal? (ast-node-type (parent-node (parent-node
>> current-hole)))
>>                          'Program)))])
>> (add-property my-component
>>            choice-filters-to-apply
>>            [#f (no-lambda-except-global-def)])
>> ```
>>
>> You can use something like that to allow Lambda nodes to only be
>> generated on the RHS of FunctionDefinition nodes, and define
>> FunctionDefinition like MyDefinition above to always generate Lambda
>> as its RHS, and I think it should work for a first-order language.
>>
>>
>> >(Also note that I tried unify! these unrestricted type variables
>> >with some concrete non-function type variables, but down the road, I get
>> an
>> >internal error about unification failure.). Perhaps I'm not supposed to
>> >unify! inside add-choice-method?
>>
>> Well, probably not.  Generally the only place you want to `unify!` is
>> inside the type property in the function that determines child types
>> based on the parent type.  Inside that function you can `unify!`, but
>> the unification needs to be consistent (IE not unified based on the
>> result of `random`) OR the choice needs to be saved in a way that it
>> can be looked up when that function is run again.
>>
>> The problem is that the choice method gets its type information based
>> on a hole node that is later replaced, so the type is recomputed
>> without necessarily consulting the type that the hole said it was.
>> You should be able to constrain the type using unification if you save
>> that type to re-unify it consistently later.  But without seeing the
>> exact error and code that generated it, I'm not 100% certain.
>>
>>
>> > In the last email, I asked if lift-type->ast-binder-type could be used
>> > nondeterministically in this manner, and my experience with it so far
>> seems
>> > to indicate no.
>>
>> I think this should work, barring the issues mentioned above about the
>> lifting vs referencing bug when you have the different definition
>> types in different fields.  Adjusting your example to have
>> MyDefinition as a subtype, and then only having a Definition field in
>> MyProgram, random choice seems to work fine.  But sometimes it
>> generates gargantuan programs (that I kill before it finishes) because
>> it keeps choosing MyDefinition, requiring a function application,
>> which likely lifts a reference to a function, which might choose
>> MyDefinition, etc, in a bad lift chain that generates bigger and
>> bigger types.
>>
>>
>> Anyway, does that all help?
>>
>>
>> On Tue, Jul 13, 2021 at 02:55:10PM -0700, Sorawee Porncharoenwase wrote:
>> >>
>> >> Hmm, it seems like first-order functions would be encoded as a
>> >> function application node that only allows a reference as its RHS
>> >> rather than allowing arbitrary expressions.  But that said, I'm not
>> >> sure I fully understand what you're trying to do.
>> >>
>> >I’m not sure what you mean by that. But:
>> >
>> >   1. I have no problem with the definition site. That’s easy to deal
>> with.
>> >   2. I also have no problem with restricting function application. That’s
>> >   easy to deal with as well.
>> >   3. The problem is primarily with a bare variable reference bound to a
>> >   closure value, which must only occur in function application. For
>> example,
>> >   the following code is invalid.
>> >
>> >(define (f) (void))
>> >
>> >(define g f) ;; this is invalid
>> >(define h (g))
>> >(define a (list f)) ;; this is also invalid
>> >(define b (f)) ;; this is ok
>> >
>> >I have been using choice-filters-to-apply to restrict variable references
>> >whose type can-unify? with function-type that doesn't directly appear
>> under
>> >function application. But this is obviously not ideal because the filtered
>> >nodes are mostly those whose type is totally unrestricted. So filtering
>> >them would prevent them from being concretized to other non-function types
>> >as well. (Also note that I tried unify! these unrestricted type variables
>> >with some concrete non-function type variables, but down the road, I get
>> an
>> >internal error about unification failure.). Perhaps I'm not supposed to
>> >unify! inside add-choice-method?
>> >
>> >That's probably a bug.  “Lifting” is what I've called the situation
>> >> where a reference is generated that causes a new definition to be
>> >> added such that the variable is in scope at the use site.  IE the
>> >> binding is lifted from the use site to some outer node that contains
>> >> definitions.  `lift_X` is the name Xsmith gives to definitions that
>> >> are created automatically in this way (which is ultimately most
>> >> definitions).  The exception is caused because somehow the lift ended
>> >> up in a place where it wasn't actually in scope.  That shouldn't
>> >> happen.  Anyway, if you point me to a version of a fuzzer that does
>> >> this, I'll try to debug it.
>> >>
>> >#lang clotho
>> >
>> >(require
>> > xsmith
>> > racr
>> > xsmith/racr-convenience
>> > xsmith/canned-components
>> > racket/string
>> > racket/list
>> > racket/pretty)
>> >
>> >;; We first define a basic component and add a bunch of expressions.
>> >
>> >(define-basic-spec-component somelisp)
>> >
>> >(add-basic-expressions somelisp
>> >                       #:ProgramWithSequence #t
>> >                       #:VoidExpression #t
>> >                       #:AssignmentExpression #t
>> >                       #:VariableReference #t
>> >                       #:ProcedureApplication #t
>> >                       #:IfExpression #t
>> >                       #:ExpressionSequence #t
>> >                       #:LetSequential #t
>> >                       #:LambdaWithExpression #t
>> >                       #:Numbers #t
>> >                       #:Booleans #t
>> >                       #:Strings #t
>> >                       #:ImmutableList #t)
>> >
>> >(add-loop-over-container
>> > somelisp
>> > #:name Map
>> > #:collection-type-constructor (λ (inner) (immutable (list-type inner)))
>> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
>> >
>> >(add-to-grammar
>> > somelisp
>> > [MyProgram #f
>> >            ([decls : Definition *]
>> >             [declsTwo : MyDefinition *]
>> >             [body : _expression_])
>> >            #:prop type-info
>> >            [(fresh-type-variable)
>> >             (λ (n t)
>> >               (hash 'decls (λ (c) (fresh-type-variable))
>> >                     'declsTwo (λ (c) (fresh-type-variable))
>> >                     'body (fresh-type-variable)))]]
>> > [MyDefinition #f ([type]
>> >                   [name = (fresh-var-name "b_")]
>> >                   [e : ProcedureApplication])
>> >               #:prop binder-info ()
>> >               #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e
>> t))]])
>> >
>> >(add-property
>> > somelisp
>> > lift-type->ast-binder-type
>> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
>> >
>> >;; Sometimes we have a type variable that is not concrete but needs to be.
>> >;; Here we provide a list of options we can pick for an unconstrained
>> >;; type variable.
>> >(define (type-thunks-for-concretization)
>> >  (list
>> >   (λ()int-type)
>> >   (λ()bool-type)
>> >   (λ()string-type)
>> >   (λ()(immutable (list-type (fresh-type-variable))))))
>> >
>> >(define (somelisp-format-render s-exps)
>> >  (define out (open-output-string))
>> >  (for ([symex s-exps])
>> >    (pretty-print symex out 1))
>> >  (get-output-string out))
>> >
>> >(define-xsmith-interface-functions
>> >  [somelisp]
>> >  #:program-node MyProgram
>> >  #:type-thunks type-thunks-for-concretization
>> >  #:comment-wrap (λ (lines)
>> >                   (string-join
>> >                    (map (λ (l) (format ";; ~a" l)) lines)
>> >                    "\n"))
>> >  #:format-render somelisp-format-render)
>> >
>> >(module+ main (somelisp-command-line))
>> >
>> >
>> >> Probably something is going wrong in the `fresh` property.  Have you
>> >> changed that?
>> >>
>> >I don’t think so. See the code at the end of the email.
>> >
>> >> I should change that error message.  Anyway, it seems to think there
>> >> is no outer node that has a `Definition *` child that can be the
>> >> target for lifting a definition.  Generally this happens when you
>> >> forget to have an outer Program node or something that can hold
>> >> definitions.
>> >>
>> >I don’t think that’s the case, but I could be wrong.
>> >
>> >#lang clotho
>> >
>> >(require
>> > xsmith
>> > racr
>> > xsmith/racr-convenience
>> > xsmith/canned-components
>> > racket/string
>> > racket/list
>> > racket/pretty)
>> >
>> >;; We first define a basic component and add a bunch of expressions.
>> >
>> >(define-basic-spec-component somelisp)
>> >
>> >(add-basic-expressions somelisp
>> >                       #:ProgramWithSequence #t
>> >                       #:VoidExpression #t
>> >                       #:AssignmentExpression #t
>> >                       #:VariableReference #t
>> >                       #:ProcedureApplication #t
>> >                       #:IfExpression #t
>> >                       #:ExpressionSequence #t
>> >                       #:LetSequential #t
>> >                       #:LambdaWithExpression #t
>> >                       #:Numbers #t
>> >                       #:Booleans #t
>> >                       #:Strings #t
>> >                       #:ImmutableList #t)
>> >
>> >(add-loop-over-container
>> > somelisp
>> > #:name Map
>> > #:collection-type-constructor (λ (inner) (immutable (list-type inner)))
>> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
>> >
>> >(add-to-grammar
>> > somelisp
>> > [MyProgram #f
>> >            ([decls : Definition *]
>> >             [body : SubProgram])
>> >            #:prop type-info
>> >            [(fresh-type-variable)
>> >             (λ (n t)
>> >               (hash 'decls (λ (c) (fresh-type-variable))
>> >                     'body (fresh-type-variable)))]]
>> > [SubProgram #f
>> >             ([decls : MyDefinition *]
>> >              [body : _expression_])
>> >             #:prop type-info
>> >             [(fresh-type-variable)
>> >              (λ (n t)
>> >                (hash 'decls (λ (c) (fresh-type-variable))
>> >                      'body (fresh-type-variable)))]]
>> > [MyDefinition #f ([type]
>> >                   [name = (fresh-var-name "b_")]
>> >                   [e : ProcedureApplication])
>> >               #:prop binder-info ()
>> >               #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e
>> t))]])
>> >
>> >(add-property
>> > somelisp
>> > lift-type->ast-binder-type
>> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
>> >
>> >;; Sometimes we have a type variable that is not concrete but needs to be.
>> >;; Here we provide a list of options we can pick for an unconstrained
>> >;; type variable.
>> >(define (type-thunks-for-concretization)
>> >  (list
>> >   (λ()int-type)
>> >   (λ()bool-type)
>> >   (λ()string-type)
>> >   (λ()(immutable (list-type (fresh-type-variable))))))
>> >
>> >(define (somelisp-format-render s-exps)
>> >  (define out (open-output-string))
>> >  (for ([symex s-exps])
>> >    (pretty-print symex out 1))
>> >  (get-output-string out))
>> >
>> >(define-xsmith-interface-functions
>> >  [somelisp]
>> >  #:program-node MyProgram
>> >  #:type-thunks type-thunks-for-concretization
>> >  #:comment-wrap (λ (lines)
>> >                   (string-join
>> >                    (map (λ (l) (format ";; ~a" l)) lines)
>> >                    "\n"))
>> >  #:format-render somelisp-format-render)
>> >
>> >(module+ main (somelisp-command-line))
>>