[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [xsmith-dev] XSmith questions



Sorry for the delay.  I wanted to do the debugging before responding,
but then I didn't get to it for a while.

Anyway, I now see what's going wrong, and I've fixed some bugs.

First, that issue with the bud-node was a bug in the `edit` property
which the canned-components `ProcedureApplication` uses.  It's fixed
now.

Hmm, it seems like first-order functions would be encoded as a
function application node that only allows a reference as its RHS
rather than allowing arbitrary expressions.  But that said, I'm not
sure I fully understand what you're trying to do.

I’m not sure what you mean by that. But:

Erm, I misspoke (mistyped) a bit.  I meant that it would be function
application that only allows a reference in the procedure position.  I
don't know why I wrote RHS.

Anyway, the strategy of having separate definition nodes for functions
and everything else is used in Cish.  I don't recommend following
anything Cish does.  However, it accomplishes this by having the
program node have non-function definitions first, and disallowing use
of functions on the RHS of those functions.

The problem where it can't find a reference is a bug (that I haven't
fixed yet) caused by the lifting mechanics not really being completely
in sync with the reference mechanics with respect to nodes that have
multiple definition fields.  When there are multiple definition
fields, definitions in the lower field can reference the definitions
in the higher field, but not vice versa.  The lifting machinery
doesn't know about that ordering, it just stuffs the lifted definition
into the front of the definitions in whatever field it is targetting.
Thus it places a definition in a place that the reference machinery
thinks is out of scope.

So... if you want to have different binders that get different fields
in the program node you can only make it work if you disallow stuff to
be sure you never need to reference something that will be lifted into
the lower field.

That said, I've just fixed some issues to allow definition nodes to be
subtypable, so now you can have a definition node that is a subtype of
Definition, but has some different rules.  Importantly, though, it can
be lifted to the same field.

Eg. if I change the MyDefinition declaration in your example to look
like this:

```
 [MyDefinition Definition ()
               #:prop fresh (hash 'Expression (make-hole 'ProcedureApplication))]
```

It can now be lifted among normal definitions, but always have
ProcedureApplication as its RHS.

However, I'm still not convinced that you actually want
ProcedureApplication as the RHS.  Maybe you want to restrict lambda
nodes to only be generated on the RHS of a FunctionDefinition node
(that's a subtype of Definition), and make FunctionDefinition always
have a Lambda node as its RHS.

There is an example in the documentation that is like this, in the
explanation of the `choice-filters-to-apply` property.  But I'll paste
it here:

```
(add-choice-method
 my-component
 no-lambda-except-global-def
 [#f (λ () #t)]
 [Lambda (λ ()
           (and (parent-node current-hole)
                (equal? (ast-node-type (parent-node current-hole))
                        'Definition)
                (parent-node (parent-node current-hole))
                (equal? (ast-node-type (parent-node (parent-node current-hole)))
                        'Program)))])
(add-property my-component
          choice-filters-to-apply
          [#f (no-lambda-except-global-def)])
```

You can use something like that to allow Lambda nodes to only be
generated on the RHS of FunctionDefinition nodes, and define
FunctionDefinition like MyDefinition above to always generate Lambda
as its RHS, and I think it should work for a first-order language.


(Also note that I tried unify! these unrestricted type variables
with some concrete non-function type variables, but down the road, I get an
internal error about unification failure.). Perhaps I'm not supposed to
unify! inside add-choice-method?

Well, probably not.  Generally the only place you want to `unify!` is
inside the type property in the function that determines child types
based on the parent type.  Inside that function you can `unify!`, but
the unification needs to be consistent (IE not unified based on the
result of `random`) OR the choice needs to be saved in a way that it
can be looked up when that function is run again.

The problem is that the choice method gets its type information based
on a hole node that is later replaced, so the type is recomputed
without necessarily consulting the type that the hole said it was.
You should be able to constrain the type using unification if you save
that type to re-unify it consistently later.  But without seeing the
exact error and code that generated it, I'm not 100% certain.


In the last email, I asked if lift-type->ast-binder-type could be used
nondeterministically in this manner, and my experience with it so far seems
to indicate no.

I think this should work, barring the issues mentioned above about the
lifting vs referencing bug when you have the different definition
types in different fields.  Adjusting your example to have
MyDefinition as a subtype, and then only having a Definition field in
MyProgram, random choice seems to work fine.  But sometimes it
generates gargantuan programs (that I kill before it finishes) because
it keeps choosing MyDefinition, requiring a function application,
which likely lifts a reference to a function, which might choose
MyDefinition, etc, in a bad lift chain that generates bigger and
bigger types.


Anyway, does that all help?


On Tue, Jul 13, 2021 at 02:55:10PM -0700, Sorawee Porncharoenwase wrote:

Hmm, it seems like first-order functions would be encoded as a
function application node that only allows a reference as its RHS
rather than allowing arbitrary expressions.  But that said, I'm not
sure I fully understand what you're trying to do.

I’m not sure what you mean by that. But:

  1. I have no problem with the definition site. That’s easy to deal with.
  2. I also have no problem with restricting function application. That’s
  easy to deal with as well.
  3. The problem is primarily with a bare variable reference bound to a
  closure value, which must only occur in function application. For example,
  the following code is invalid.

(define (f) (void))

(define g f) ;; this is invalid
(define h (g))
(define a (list f)) ;; this is also invalid
(define b (f)) ;; this is ok

I have been using choice-filters-to-apply to restrict variable references
whose type can-unify? with function-type that doesn't directly appear under
function application. But this is obviously not ideal because the filtered
nodes are mostly those whose type is totally unrestricted. So filtering
them would prevent them from being concretized to other non-function types
as well. (Also note that I tried unify! these unrestricted type variables
with some concrete non-function type variables, but down the road, I get an
internal error about unification failure.). Perhaps I'm not supposed to
unify! inside add-choice-method?

That's probably a bug.  “Lifting” is what I've called the situation
where a reference is generated that causes a new definition to be
added such that the variable is in scope at the use site.  IE the
binding is lifted from the use site to some outer node that contains
definitions.  `lift_X` is the name Xsmith gives to definitions that
are created automatically in this way (which is ultimately most
definitions).  The exception is caused because somehow the lift ended
up in a place where it wasn't actually in scope.  That shouldn't
happen.  Anyway, if you point me to a version of a fuzzer that does
this, I'll try to debug it.

#lang clotho

(require
xsmith
racr
xsmith/racr-convenience
xsmith/canned-components
racket/string
racket/list
racket/pretty)

;; We first define a basic component and add a bunch of expressions.

(define-basic-spec-component somelisp)

(add-basic-expressions somelisp
                      #:ProgramWithSequence #t
                      #:VoidExpression #t
                      #:AssignmentExpression #t
                      #:VariableReference #t
                      #:ProcedureApplication #t
                      #:IfExpression #t
                      #:ExpressionSequence #t
                      #:LetSequential #t
                      #:LambdaWithExpression #t
                      #:Numbers #t
                      #:Booleans #t
                      #:Strings #t
                      #:ImmutableList #t)

(add-loop-over-container
somelisp
#:name Map
#:collection-type-constructor (λ (inner) (immutable (list-type inner)))
#:loop-type-constructor (λ (inner) (immutable (list-type inner))))

(add-to-grammar
somelisp
[MyProgram #f
           ([decls : Definition *]
            [declsTwo : MyDefinition *]
            [body : Expression])
           #:prop type-info
           [(fresh-type-variable)
            (λ (n t)
              (hash 'decls (λ (c) (fresh-type-variable))
                    'declsTwo (λ (c) (fresh-type-variable))
                    'body (fresh-type-variable)))]]
[MyDefinition #f ([type]
                  [name = (fresh-var-name "b_")]
                  [e : ProcedureApplication])
              #:prop binder-info ()
              #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e t))]])

(add-property
somelisp
lift-type->ast-binder-type
[#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])

;; Sometimes we have a type variable that is not concrete but needs to be.
;; Here we provide a list of options we can pick for an unconstrained
;; type variable.
(define (type-thunks-for-concretization)
 (list
  (λ()int-type)
  (λ()bool-type)
  (λ()string-type)
  (λ()(immutable (list-type (fresh-type-variable))))))

(define (somelisp-format-render s-exps)
 (define out (open-output-string))
 (for ([symex s-exps])
   (pretty-print symex out 1))
 (get-output-string out))

(define-xsmith-interface-functions
 [somelisp]
 #:program-node MyProgram
 #:type-thunks type-thunks-for-concretization
 #:comment-wrap (λ (lines)
                  (string-join
                   (map (λ (l) (format ";; ~a" l)) lines)
                   "\n"))
 #:format-render somelisp-format-render)

(module+ main (somelisp-command-line))


Probably something is going wrong in the `fresh` property.  Have you
changed that?

I don’t think so. See the code at the end of the email.

I should change that error message.  Anyway, it seems to think there
is no outer node that has a `Definition *` child that can be the
target for lifting a definition.  Generally this happens when you
forget to have an outer Program node or something that can hold
definitions.

I don’t think that’s the case, but I could be wrong.

#lang clotho

(require
xsmith
racr
xsmith/racr-convenience
xsmith/canned-components
racket/string
racket/list
racket/pretty)

;; We first define a basic component and add a bunch of expressions.

(define-basic-spec-component somelisp)

(add-basic-expressions somelisp
                      #:ProgramWithSequence #t
                      #:VoidExpression #t
                      #:AssignmentExpression #t
                      #:VariableReference #t
                      #:ProcedureApplication #t
                      #:IfExpression #t
                      #:ExpressionSequence #t
                      #:LetSequential #t
                      #:LambdaWithExpression #t
                      #:Numbers #t
                      #:Booleans #t
                      #:Strings #t
                      #:ImmutableList #t)

(add-loop-over-container
somelisp
#:name Map
#:collection-type-constructor (λ (inner) (immutable (list-type inner)))
#:loop-type-constructor (λ (inner) (immutable (list-type inner))))

(add-to-grammar
somelisp
[MyProgram #f
           ([decls : Definition *]
            [body : SubProgram])
           #:prop type-info
           [(fresh-type-variable)
            (λ (n t)
              (hash 'decls (λ (c) (fresh-type-variable))
                    'body (fresh-type-variable)))]]
[SubProgram #f
            ([decls : MyDefinition *]
             [body : Expression])
            #:prop type-info
            [(fresh-type-variable)
             (λ (n t)
               (hash 'decls (λ (c) (fresh-type-variable))
                     'body (fresh-type-variable)))]]
[MyDefinition #f ([type]
                  [name = (fresh-var-name "b_")]
                  [e : ProcedureApplication])
              #:prop binder-info ()
              #:prop type-info [(fresh-type-variable) (λ (n t) (hash 'e t))]])

(add-property
somelisp
lift-type->ast-binder-type
[#f (λ (type) (if (zero? (random 2)) 'MyDefinition 'Definition))])

;; Sometimes we have a type variable that is not concrete but needs to be.
;; Here we provide a list of options we can pick for an unconstrained
;; type variable.
(define (type-thunks-for-concretization)
 (list
  (λ()int-type)
  (λ()bool-type)
  (λ()string-type)
  (λ()(immutable (list-type (fresh-type-variable))))))

(define (somelisp-format-render s-exps)
 (define out (open-output-string))
 (for ([symex s-exps])
   (pretty-print symex out 1))
 (get-output-string out))

(define-xsmith-interface-functions
 [somelisp]
 #:program-node MyProgram
 #:type-thunks type-thunks-for-concretization
 #:comment-wrap (λ (lines)
                  (string-join
                   (map (λ (l) (format ";; ~a" l)) lines)
                   "\n"))
 #:format-render somelisp-format-render)

(module+ main (somelisp-command-line))