On 02/09/11 15:52, John Regehr wrote:
Perhaps the larger problem here is that Csmith produces many programs
that are small and (worse) effectively identical.

My early driver scripts kept a hash of every tested program to avoid
testing duplicates, but eventually I realized it was simpler and more
effective to just throw away every program smaller than some size. We've
variously used numbers like 5 KB and 20 KB.

That's definitely an interesting idea. Thanks for the tip.
Did you use file size or source lines to determine if a test is big enough?

Also, is there any study of the size of programs generated by Csmith in general?
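
To make sure I understand the two filters described above (a minimum size plus a hash of every tested program), here is a minimal sketch in Python. It is only my guess at what such a driver step could look like, not the actual scripts: the name keep_program, the choice of SHA-256, and measuring size in bytes rather than source lines are all my own assumptions, and the 5 KB default is just one of the thresholds quoted above.

```python
import hashlib

# Assumption: threshold in bytes; 5 KB is one of the values quoted in the thread.
MIN_SIZE_BYTES = 5 * 1024


def keep_program(source: bytes, seen_hashes: set, min_size: int = MIN_SIZE_BYTES) -> bool:
    """Return True if a generated program looks worth testing.

    Hypothetical helper: rejects programs below min_size bytes and
    programs whose exact contents have already been seen.
    """
    # Size filter: throw away every program smaller than the threshold.
    if len(source) < min_size:
        return False

    # Duplicate filter: hash the full source and skip exact repeats.
    digest = hashlib.sha256(source).hexdigest()
    if digest in seen_hashes:
        return False

    seen_hashes.add(digest)
    return True
```

A driver would call keep_program(data, seen) on the bytes of each generated file, with seen being a set shared across the run. Note that the hash only catches byte-for-byte duplicates, so programs that are merely "effectively identical" would still get through, which may be one reason the size filter alone turned out to be more effective.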

At some level it would be nice if Csmith just failed to produce these
programs, but on the other hand I strongly believe in keeping complexity
outside of Csmith when this is feasible (I'm not claiming we've done a
good job with this).

I agree: if there are things that can easily be done by a postprocessor and kept outside Csmith, so much the better! :)


