
Re: [csmith-dev] Generate two slightly different files

John Regehr <regehr@cs.utah.edu> writes:

> Additionally, since Csmith and Frama-C cooperate so well, a
> guess-and-check approach can work: instead of being careful in Csmith
> you instead generate whatever you like and then use Frama-C to verify
> absence of UB.

Thank you for your fast feedback, which was definitely faster than my
reply. I took a look at the references and hints you gave me. The
basic design I have in mind for mutation-based testing is to generate
a C file and derive one or more mutants from it.

- cmutate has the problem that mutants are generated on a line-by-line
  basis. cmutate has no abstract understanding of C semantics, which
  results in uncompilable mutants. I think this approach will have an
  even larger problem with test cases generated by csmith, since they
  contain some really complicated pointer reference semantics (as far
  as I can tell).

- Proteum and Milu have that understanding of the semantics. Milu does
  compile and produces some mutants for my test case.

One problem I see with all mutation-based testing tools is that they do
not try to mutate every part of the C file. The 77 mutation operators
defined by Agrawal et al.[1] (Milu is based on them) only mutate
statements, some expressions, and constant values. Types are never
changed, fields are never added, and no intermediate typedefs are
introduced. That is fine and appropriate for mutation-based testing.

Regarding Xuejun's question about the undefined behavior: I do not care
much about it for my use case. So using csmith plus some transformations
might be a good option.

Since John suggested reusing the creduce infrastructure, how do creduce
and csmith relate to one another? Are there shared sources or concepts?

Another idea that came to my mind was to manipulate the random stream
csmith uses to generate files. If I understand it correctly, csmith uses
lrand48() to generate a sequence of random numbers that are used to
construct the test case. Given the same seed, the same test case would
be built. What would happen if I dropped or replaced one of these
numbers when generating the "mutant"? For example, one number that is
used relatively late in the generation process:

 - $ csmith > test.c
   Results: test.c,
            L = length of random sequence
   $ csmith -seed $Seed -drop-seq-item $(randint (L * 0.9) L) > test-mutated.c

Could this approach generate similar C files?
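
To make the idea more concrete, here is a minimal Python sketch of the
drop-one-item experiment on a toy generator. Nothing in it is csmith
code: record_stream(), toy_generate(), and drop_item() are hypothetical
names, Python's random module stands in for lrand48(), and the toy
generator consumes exactly one number per "statement". Under that
assumption, everything produced before the dropped position is
identical in both outputs.

```python
import random

def record_stream(seed, length):
    """Record `length` draws for a seed (stand-in for the lrand48() sequence)."""
    rng = random.Random(seed)
    return [rng.randrange(2**31) for _ in range(length)]

def toy_generate(stream):
    """Hypothetical generator: each random number selects one statement."""
    kinds = ["assign", "if", "for", "call", "return"]
    return [f"{kinds[n % len(kinds)]}_{n % 97}" for n in stream]

def drop_item(stream, index):
    """Return a copy of the stream with the item at `index` removed."""
    return stream[:index] + stream[index + 1:]

def common_prefix_len(a, b):
    """Count how many leading elements two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

seed = 1234
L = 1000  # length of the recorded random sequence
stream = record_stream(seed, L)

# Pick the dropped position in the last 10% of the sequence.
drop_at = random.Random(seed + 1).randrange(int(L * 0.9), L)

original = toy_generate(stream)
mutant = toy_generate(drop_item(stream, drop_at))
shared = common_prefix_len(original, mutant)

print(f"dropped item {drop_at} of {L}; shared prefix: {shared} statements")
```

In a real generator, one number can decide how many further numbers
are consumed (for example, the size of a block), so the part after the
dropped item may diverge completely instead of just shifting by one.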


[1] http://web.soccerlab.polymtl.ca/log6305/protected/papers/CMutation.pdf
Christian Dietrich, M.Sc. (Research Staff)
Computer Science 4 (Distributed Systems and Operating Systems)
Friedrich-Alexander-Universität Erlangen-Nürnberg
Martensstr. 1
91058 Erlangen

Tel:    (09131) 85-27280
Fax:    (09131) 85-28732
eMail:  christian.dietrich@fau.de
WWW:    http://www4.cs.fau.de/~dietrich