[csmith-dev] Alignment of struct members cause target dependent checksum

Wed Jun 5 11:01:37 MDT 2013

On Wed, Jun 5, 2013 at 9:38 AM, Nicholas Mc Guire <der.herr at hofr.at> wrote:
> On Wed, 05 Jun 2013, John Regehr wrote:
>>
>> It is not generally expected that Csmith programs will have the same
>> results across compilers or platforms except when those
>> compilers/platforms make the same choices for implementation-defined
>> characteristics such as alignment, integer width and representation, etc.
>>
>> When creating Csmith we had a choice between generating more portable
>> code, which would permit differential testing across more platforms, and
>> less portable code, which (we think) finds more compiler bugs.
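
For anyone following along, here is a minimal sketch of the kind of
implementation-defined layout John is describing. The struct and the
function names are made up for illustration; they are not taken from any
Csmith output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: a 1-byte member followed by an 8-byte member. */
struct sample {
    char    tag;
    int64_t value;
};

/* Offset of `value`; the ABI decides how much padding follows `tag`. */
size_t value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Total size of the struct, including any padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}

/* Bytes in the struct that belong to neither member. */
size_t padding_bytes(void)
{
    return sizeof(struct sample) - sizeof(char) - sizeof(int64_t);
}
```

On a typical x86-64 Linux target I would expect value_offset() to be 8,
while 32-bit x86 Linux commonly gives 4. Any result that depends on
those offsets or sizes can legitimately differ between the two.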

Incidentally, John, I was also surprised that Csmith would generate
code that writes to a union member and then reads back from a
different (and not even equivalent) member. On reflection I agree this
might be useful, but shouldn't there at least be a command-line switch
to prevent Csmith from generating undefined behavior? (If Kees
described the problem correctly, the test program was definitely UB.)
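
To make the union concern concrete, here is a small sketch of writing
one member and reading another. This is my own hypothetical example, not
the test case Kees saw. I deliberately read back through a byte array so
that the only effect on display is byte order.

```c
#include <stdint.h>

/* Hypothetical union, not taken from any Csmith-generated program. */
union pun {
    uint32_t word;
    uint8_t  bytes[4];
};

/* Write `word`, then read `bytes[0]`: the answer depends on byte order. */
uint8_t first_byte_via_union(uint32_t value)
{
    union pun u;
    u.word = value;
    /* For 0x11223344: 0x44 on a little-endian target, 0x11 on big-endian. */
    return u.bytes[0];
}

/* Target-independent alternative: extract the low byte arithmetically. */
uint8_t low_byte_portable(uint32_t value)
{
    return (uint8_t)(value & 0xFFu);
}
```

A checksum computed from first_byte_via_union() would differ between a
little-endian and a big-endian machine even with a correct compiler on
both, whereas low_byte_portable() gives the same answer everywhere.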

Kees: In order for anything useful to happen, you'll probably have to
post the exact git revision of Csmith you were using, the exact
command line, and the "seed" value which produced the bad test case.

>> Most compiler bugs can be found simply by comparing different
>> optimization levels of the same compiler.  We've seen a few bugs where a
>> compiler produces the same wrong result at all optimization levels, but
>> this is quite rare.
>
> Would you have a quantification/estimate of "quite rare"?
>
> Would it make sense to have a two-step process to minimize false positives,
> something like:
>  1) same compiler -O -O2 -Os (or similar)
>  2) those that did not trigger in 1) rerun against compiler A/B/C ?

Nicholas: Isn't that protocol exactly equivalent to the "one-step" protocol
1) compiler A, -O -O2 -Os; compiler B; compiler C
except that the "one-step" protocol will "unnecessarily" run compilers
B and C even when compiler A fails?
But compiler A isn't expected to fail a significant (let's say >5%)
fraction of the time. We're expecting maybe one in 1000 of our test
cases to fail, or else there's something seriously wrong with
compiler A!  Skipping B and C only pays off on those failing cases,
so the two-step protocol gives a 0.1% speedup at best.
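
Here is the arithmetic as a rough sketch. It assumes every compile-and-run
costs one unit, which is my simplification, and the function names are
made up for illustration.

```c
/* Expected compiler runs per test case, one unit of cost per run. */

/* One-step: always run every optimization level of A plus B, C, ... */
double one_step_cost(int a_levels, int other_compilers)
{
    return (double)a_levels + (double)other_compilers;
}

/* Two-step: run A first; run the others only if A did not fail. */
double two_step_cost(int a_levels, int other_compilers, double p_fail_a)
{
    return (double)a_levels + (1.0 - p_fail_a) * (double)other_compilers;
}

/* Fraction of the one-step cost that the two-step protocol saves. */
double relative_saving(int a_levels, int other_compilers, double p_fail_a)
{
    double one = one_step_cost(a_levels, other_compilers);
    double two = two_step_cost(a_levels, other_compilers, p_fail_a);
    return (one - two) / one;
}
```

With three optimization levels of A, two other compilers, and a failure
rate of one in 1000, relative_saving(3, 2, 0.001) comes to 0.002 / 5,
about 0.04%. The saving is always below the failure rate itself, which
is where the 0.1% ceiling comes from.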

–Arthur