That's definitely an interesting idea. Thanks for the tip.
Did you use file size or the number of source lines to determine
whether a test is big enough?
I just use bytes, for example in Perl:
use File::stat;    # needed for the object-style stat(...)->size

my $filesize = stat($cfile)->size;    # size in bytes
if ($filesize < $MIN_PROGRAM_SIZE) {
    print "FILE TOO SMALL\n";
    return;
}
Nobody has systematically looked for the best value for
$MIN_PROGRAM_SIZE, but a study I did a while ago (this is in the PLDI
paper, I think) showed that Csmith's bug-finding power is maximized
for programs in the 80 KB range. So it's surely safe to throw away
anything less than 10 or 20 KB.
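
For reference, here is a minimal self-contained sketch of that size check. The helper name is_big_enough and the 20 KB threshold are assumptions made for illustration (20 KB is just the upper end of the range mentioned above, not a tuned value), and this is not the actual driver script.

```perl
use strict;
use warnings;
use File::stat;

# Assumed threshold: 20 KB. Illustrative only, not a tuned value.
my $MIN_PROGRAM_SIZE = 20 * 1024;

# Hypothetical helper: returns 1 if the file is at least $min_size bytes,
# 0 otherwise (including when the file cannot be stat'ed).
sub is_big_enough {
    my ($cfile, $min_size) = @_;
    my $st = stat($cfile) or return 0;
    return $st->size >= $min_size ? 1 : 0;
}

# Report on each file named on the command line.
for my $cfile (@ARGV) {
    if (is_big_enough($cfile, $MIN_PROGRAM_SIZE)) {
        print "$cfile: OK\n";
    } else {
        print "$cfile: FILE TOO SMALL\n";
    }
}
```

Run it as, for example, "perl sizecheck.pl *.c" (the script name is made up) to see which generated programs would be discarded.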
Also, is there any study on the size of programs generated by Csmith in
general?
I looked at this a while ago, but the size distribution changes every
time someone changes Csmith, so the info is stale.
Qualitatively, many programs are quite short but there's a long tail of
huge (> 1 MB) programs.
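
Since the numbers go stale, it may be easier to re-measure the distribution for whatever Csmith version you are running. The sketch below is one rough way to do that. The bucket boundaries and the helper name size_bucket are arbitrary choices for illustration, not values from any study.

```perl
use strict;
use warnings;

# Hypothetical helper: map a size in bytes to a coarse bucket label.
# The boundaries are arbitrary, chosen only for illustration.
sub size_bucket {
    my ($bytes) = @_;
    return '< 10 KB'     if $bytes < 10 * 1024;
    return '10-100 KB'   if $bytes < 100 * 1024;
    return '100 KB-1 MB' if $bytes < 1024 * 1024;
    return '>= 1 MB';
}

# Count the files named on the command line per bucket.
my %count;
for my $cfile (@ARGV) {
    my $bytes = -s $cfile;          # size in bytes, undef if missing
    next unless defined $bytes;
    $count{ size_bucket($bytes) }++;
}

printf "%-12s %d\n", $_, $count{$_} for sort keys %count;
```

Pointing this at a directory of freshly generated programs gives a quick count per bucket, which should be enough to see both the short programs and the long tail.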