Re: [xsmith-dev] XSmith questions

On Thu, Jul 15, 2021 at 9:22 PM William G Hatch <william@hatch.uno> wrote:

That would be great, thanks!

On Thu, Jul 15, 2021 at 08:09:13PM -0700, Sorawee Porncharoenwase wrote:
>Yes, this is very helpful. I will let you know if I encounter any further
>issues after updating XSmith.
>
>By the way, do you want any help with Scribble? I notice a lot of typos and
>link errors and can submit PRs to fix them.
>
>On Thu, Jul 15, 2021 at 8:00 PM William G Hatch <william@hatch.uno> wrote:
>
>> Sorry for the delay. I wanted to do the debugging before responding,
>> but then I didn't get to it for a while.
>>
>> Anyway, I now see what's going wrong, and I've fixed some bugs.
>>
>> First, that issue with the bud-node was a bug in the `edit` property
>> which the canned-components `ProcedureApplication` uses. It's fixed
>> now.
>>
>> >> Hmm, it seems like first-order functions would be encoded as a
>> >> function application node that only allows a reference as its RHS
>> >> rather than allowing arbitrary expressions. But that said, I'm not
>> >> sure I fully understand what you're trying to do.
>> >>
>> >I’m not sure what you mean by that. But:
>>
>> Erm, I misspoke (mistyped) a bit. I meant that it would be function
>> application that only allows a reference in the procedure position. I
>> don't know why I wrote RHS.
>>
>> Anyway, the strategy of having separate definition nodes for functions
>> and everything else is used in Cish. I don't recommend following
>> anything Cish does. However, it accomplishes this by having the
>> program node have non-function definitions first, and disallowing use
>> of functions on the RHS of those functions.
>>
>> The problem where it can't find a reference is a bug (that I haven't
>> fixed yet) caused by the lifting mechanics not really being completely
>> in sync with the reference mechanics with respect to nodes that have
>> multiple definition fields. When there are multiple definition
>> fields, definitions in the lower field can reference the definitions
>> in the higher field, but not vice versa. The lifting machinery
>> doesn't know about that ordering, it just stuffs the lifted definition
>> into the front of the definitions in whatever field it is targetting.
>> Thus it places a definition in a place that the reference machinery
>> thinks is out of scope.
>>
>> So... if you want to have different binders that get different fields
>> in the program node you can only make it work if you disallow stuff to
>> be sure you never need to reference something that will be lifted into
>> the lower field.
>>
>> That said, I've just fixed some issues to allow definition nodes to be
>> subtypable, so now you can have a definition node that is a subtype of
>> Definition, but has some different rules. Importantly, though, it can
>> be lifted to the same field.
>>
>> Eg. if I change the MyDefinition declaration in your example to look
>> like this:
>>
>> ```
>> [MyDefinition Definition ()
>> #:prop fresh (hash '_expression_ (make-hole
>> 'ProcedureApplication))]
>> ```
>>
>> It can now be lifted among normal definitions, but always have
>> ProcedureApplication as its RHS.
>>
>> However, I'm still not convinced that you actually want
>> ProcedureApplication as the RHS. Maybe you want to restrict lambda
>> nodes to only be generated on the RHS of a FunctionDefinition node
>> (that's a subtype of Definition), and make FunctionDefinition always
>> have a Lambda node as its RHS.
>>
>> There is an example in the documentation that is like this, in the
>> explanation of the `choice-filters-to-apply` property. But I'll paste
>> it here:
>>
>> ```
>> (add-choice-method
>> my-component
>> no-lambda-except-global-def
>> [#f (λ () #t)]
>> [Lambda (λ ()
>> (and (parent-node current-hole)
>> (equal? (ast-node-type (parent-node current-hole))
>> 'Definition)
>> (parent-node (parent-node current-hole))
>> (equal? (ast-node-type (parent-node (parent-node
>> current-hole)))
>> 'Program)))])
>> (add-property my-component
>> choice-filters-to-apply
>> [#f (no-lambda-except-global-def)])
>> ```
>>
>> You can use something like that to allow Lambda nodes to only be
>> generated on the RHS of FunctionDefinition nodes, and define
>> FunctionDefinition like MyDefinition above to always generate Lambda
>> as its RHS, and I think it should work for a first-order language.
>>
>>
>> >(Also note that I tried unify! these unrestricted type variables
>> >with some concrete non-function type variables, but down the road, I get
>> an
>> >internal error about unification failure.). Perhaps I'm not supposed to
>> >unify! inside add-choice-method?
>>
>> Well, probably not. Generally the only place you want to `unify!` is
>> inside the type property in the function that determines child types
>> based on the parent type. Inside that function you can `unify!`, but
>> the unification needs to be consistent (IE not unified based on the
>> result of `random`) OR the choice needs to be saved in a way that it
>> can be looked up when that function is run again.
>>
>> The problem is that the choice method gets its type information based
>> on a hole node that is later replaced, so the type is recomputed
>> without necessarily consulting the type that the hole said it was.
>> You should be able to constrain the type using unification if you save
>> that type to re-unify it consistently later. But without seeing the
>> exact error and code that generated it, I'm not 100% certain.
>>
>>
>> > In the last email, I asked if lift-type->ast-binder-type could be used
>> > nondeterministically in this manner, and my experience with it so far
>> seems
>> > to indicate no.
>>
>> I think this should work, barring the issues mentioned above about the
>> lifting vs referencing bug when you have the different definition
>> types in different fields. Adjusting your example to have
>> MyDefinition as a subtype, and then only having a Definition field in
>> MyProgram, random choice seems to work fine. But sometimes it
>> generates gargantuan programs (that I kill before it finishes) because
>> it keeps choosing MyDefinition, requiring a function application,
>> which likely lifts a reference to a function, which might choose
>> MyDefinition, etc, in a bad lift chain that generates bigger and
>> bigger types.
>>
>>
>> Anyway, does that all help?
>>
>>
>> On Tue, Jul 13, 2021 at 02:55:10PM -0700, Sorawee Porncharoenwase wrote:
>> >>
>> >> Hmm, it seems like first-order functions would be encoded as a
>> >> function application node that only allows a reference as its RHS
>> >> rather than allowing arbitrary expressions. But that said, I'm not
>> >> sure I fully understand what you're trying to do.
>> >>
>> >I’m not sure what you mean by that. But:
>> >
>> > 1. I have no problem with the definition site. That’s easy to deal
>> with.
>> > 2. I also have no problem with restricting function application. That’s
>> > easy to deal with as well.
>> > 3. The problem is primarily with a bare variable reference bound to a
>> > closure value, which must only occur in function application. For
>> example,
>> > the following code is invalid.
>> >
>> >(define (f) (void))
>> >
>> >(define g f) ;; this is invalid
>> >(define h (g))
>> >(define a (list f)) ;; this is also invalid
>> >(define b (f)) ;; this is ok
>> >
>> >I have been using choice-filters-to-apply to restrict variable references
>> >whose type can-unify? with function-type that doesn't directly appear
>> under
>> >function application. But this is obviously not ideal because the filtered
>> >nodes are mostly those whose type is totally unrestricted. So filtering
>> >them would prevent them from being concretized to other non-function types
>> >as well. (Also note that I tried unify! these unrestricted type variables
>> >with some concrete non-function type variables, but down the road, I get
>> an
>> >internal error about unification failure.). Perhaps I'm not supposed to
>> >unify! inside add-choice-method?
>> >
>> >That's probably a bug. “Lifting” is what I've called the situation
>> >> where a reference is generated that causes a new definition to be
>> >> added such that the variable is in scope at the use site. IE the
>> >> binding is lifted from the use site to some outer node that contains
>> >> definitions. `lift_X` is the name Xsmith gives to definitions that
>> >> are created automatically in this way (which is ultimately most
>> >> definitions). The exception is caused because somehow the lift ended
>> >> up in a place where it wasn't actually in scope. That shouldn't
>> >> happen. Anyway, if you point me to a version of a fuzzer that does
>> >> this, I'll try to debug it.
>> >>
>> >#lang clotho
>> >
>> >(require
>> > xsmith
>> > racr
>> > xsmith/racr-convenience
>> > xsmith/canned-components
>> > racket/string
>> > racket/list
>> > racket/pretty)
>> >
>> >;; We first define a basic component and add a bunch of expressions.
>> >
>> >(define-basic-spec-component somelisp)
>> >
>> >(add-basic-expressions somelisp
>> > #:ProgramWithSequence #t
>> > #:VoidExpression #t
>> > #:AssignmentExpression #t
>> > #:VariableReference #t
>> > #:ProcedureApplication #t
>> > #:IfExpression #t
>> > #:ExpressionSequence #t
>> > #:LetSequential #t
>> > #:LambdaWithExpression #t
>> > #:Numbers #t
>> > #:Booleans #t
>> > #:Strings #t
>> > #:ImmutableList #t)
>> >
>> >(add-loop-over-container
>> > somelisp
>> > #:name Map
>> > #:collection-type-constructor (λ (inner) (immutable (list-type inner)))
>> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
>> >
>> >(add-to-grammar
>> > somelisp
>> > [MyProgram #f
>> > ([decls : Definition *]
>> > [declsTwo : MyDefinition *]
>> > [body : _expression_])
>> > #:prop type-info
>> > [(fresh-type-variable)
>> > (λ (n t)
>> > (hash 'decls (λ (c) (fresh-type-variable))
>> > 'declsTwo (λ (c) (fresh-type-variable))
>> > 'body (fresh-type-variable)))]]
>> > [MyDefinition #f ([type]
>> > [name = (fresh-var-name "b_")]
>> > [e : ProcedureApplication])
>> > #:prop binder-info ()
>> > #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e
>> t))]])
>> >
>> >(add-property
>> > somelisp
>> > lift-type->ast-binder-type
>> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
>> >
>> >;; Sometimes we have a type variable that is not concrete but needs to be.
>> >;; Here we provide a list of options we can pick for an unconstrained
>> >;; type variable.
>> >(define (type-thunks-for-concretization)
>> > (list
>> > (λ()int-type)
>> > (λ()bool-type)
>> > (λ()string-type)
>> > (λ()(immutable (list-type (fresh-type-variable))))))
>> >
>> >(define (somelisp-format-render s-exps)
>> > (define out (open-output-string))
>> > (for ([symex s-exps])
>> > (pretty-print symex out 1))
>> > (get-output-string out))
>> >
>> >(define-xsmith-interface-functions
>> > [somelisp]
>> > #:program-node MyProgram
>> > #:type-thunks type-thunks-for-concretization
>> > #:comment-wrap (λ (lines)
>> > (string-join
>> > (map (λ (l) (format ";; ~a" l)) lines)
>> > "\n"))
>> > #:format-render somelisp-format-render)
>> >
>> >(module+ main (somelisp-command-line))
>> >
>> >
>> >> Probably something is going wrong in the `fresh` property. Have you
>> >> changed that?
>> >>
>> >I don’t think so. See the code at the end of the email.
>> >
>> >> I should change that error message. Anyway, it seems to think there
>> >> is no outer node that has a `Definition *` child that can be the
>> >> target for lifting a definition. Generally this happens when you
>> >> forget to have an outer Program node or something that can hold
>> >> definitions.
>> >>
>> >I don’t think that’s the case, but I could be wrong.
>> >
>> >#lang clotho
>> >
>> >(require
>> > xsmith
>> > racr
>> > xsmith/racr-convenience
>> > xsmith/canned-components
>> > racket/string
>> > racket/list
>> > racket/pretty)
>> >
>> >;; We first define a basic component and add a bunch of expressions.
>> >
>> >(define-basic-spec-component somelisp)
>> >
>> >(add-basic-expressions somelisp
>> > #:ProgramWithSequence #t
>> > #:VoidExpression #t
>> > #:AssignmentExpression #t
>> > #:VariableReference #t
>> > #:ProcedureApplication #t
>> > #:IfExpression #t
>> > #:ExpressionSequence #t
>> > #:LetSequential #t
>> > #:LambdaWithExpression #t
>> > #:Numbers #t
>> > #:Booleans #t
>> > #:Strings #t
>> > #:ImmutableList #t)
>> >
>> >(add-loop-over-container
>> > somelisp
>> > #:name Map
>> > #:collection-type-constructor (λ (inner) (immutable (list-type inner)))
>> > #:loop-type-constructor (λ (inner) (immutable (list-type inner))))
>> >
>> >(add-to-grammar
>> > somelisp
>> > [MyProgram #f
>> > ([decls : Definition *]
>> > [body : SubProgram])
>> > #:prop type-info
>> > [(fresh-type-variable)
>> > (λ (n t)
>> > (hash 'decls (λ (c) (fresh-type-variable))
>> > 'body (fresh-type-variable)))]]
>> > [SubProgram #f
>> > ([decls : MyDefinition *]
>> > [body : _expression_])
>> > #:prop type-info
>> > [(fresh-type-variable)
>> > (λ (n t)
>> > (hash 'decls (λ (c) (fresh-type-variable))
>> > 'body (fresh-type-variable)))]]
>> > [MyDefinition #f ([type]
>> > [name = (fresh-var-name "b_")]
>> > [e : ProcedureApplication])
>> > #:prop binder-info ()
>> > #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e
>> t))]])
>> >
>> >(add-property
>> > somelisp
>> > lift-type->ast-binder-type
>> > [#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])
>> >
>> >;; Sometimes we have a type variable that is not concrete but needs to be.
>> >;; Here we provide a list of options we can pick for an unconstrained
>> >;; type variable.
>> >(define (type-thunks-for-concretization)
>> > (list
>> > (λ()int-type)
>> > (λ()bool-type)
>> > (λ()string-type)
>> > (λ()(immutable (list-type (fresh-type-variable))))))
>> >
>> >(define (somelisp-format-render s-exps)
>> > (define out (open-output-string))
>> > (for ([symex s-exps])
>> > (pretty-print symex out 1))
>> > (get-output-string out))
>> >
>> >(define-xsmith-interface-functions
>> > [somelisp]
>> > #:program-node MyProgram
>> > #:type-thunks type-thunks-for-concretization
>> > #:comment-wrap (λ (lines)
>> > (string-join
>> > (map (λ (l) (format ";; ~a" l)) lines)
>> > "\n"))
>> > #:format-render somelisp-format-render)
>> >
>> >(module+ main (somelisp-command-line))
>>