I love your project, your publications, and your findings.
I'm pondering a small side project based on Csmith. I would like to compare binary code size between different compilers. For this use case, avoiding undefined behaviour would be less relevant.
I'd run the test suites and the code generated by your tool through my harness, which measures function sizes. I'm initially thinking of two versions:
One would measure the sizes of all functions, the other only of the available ones (those still present in the compiled output). Thus the first version would interfere a bit with the inliner and DCE, while the other would respect them.
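To make the measuring step concrete, here is a rough sketch of what I have in mind. It assumes the harness reads the output of GNU `nm --print-size` on an object file; the helper names (`parse_nm_sizes`, `compare_sizes`, `read_nm`) and the sample symbol listings are made up for illustration and are not part of Csmith.

```python
import subprocess


def read_nm(path):
    """Run GNU nm on an object file and return its text output.

    Assumes `nm` from binutils is on PATH; --print-size adds a size column.
    """
    result = subprocess.run(
        ["nm", "--print-size", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_nm_sizes(nm_output):
    """Map each code symbol to its size in bytes."""
    sizes = {}
    for line in nm_output.splitlines():
        fields = line.split()
        # Sized symbols have four fields: address, size, type, name.
        # Undefined or unsized symbols have fewer and are skipped.
        if len(fields) != 4:
            continue
        _addr, size_hex, sym_type, name = fields
        if sym_type in ("T", "t"):  # global or local symbol in the text section
            sizes[name] = int(size_hex, 16)
    return sizes


def compare_sizes(sizes_a, sizes_b):
    """Return {name: (size_a, size_b)} for functions present in both outputs."""
    common = sizes_a.keys() & sizes_b.keys()
    return {name: (sizes_a[name], sizes_b[name]) for name in sorted(common)}


# Hypothetical nm listings for the same source built by two compilers.
SAMPLE_A = """\
0000000000000000 0000000000000015 T func_1
0000000000000020 000000000000002a t func_2
                 U printf
"""
SAMPLE_B = """\
0000000000000000 0000000000000010 T func_1
                 U printf
"""

if __name__ == "__main__":
    for name, (a, b) in compare_sizes(
        parse_nm_sizes(SAMPLE_A), parse_nm_sizes(SAMPLE_B)
    ).items():
        print(f"{name}: {a} vs {b} bytes")
```

In this sketch the second version comes for free: a function that was inlined or removed simply has no symbol, so it drops out of the comparison. The first version would need the functions forced to stay in the output, which is where the interference with the inliner and DCE comes from.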
To start, I'm thinking of comparing clang and gcc, possibly on the set of targets common to both compilers.
Please let me know if that makes sense.
Looking forward to it.