Re: [creduce-dev] parallel tuning

On 2015.11.17 at 11:00 +0100, John Regehr wrote:
> C-Reduce's strategy of querying the number of CPUs and running that many
> parallel reduction attempts is bad in some cases, such as on my Macbook
> where it runs with concurrency 8, where 3 would be a better choice.
> We did a bunch of benchmarking of this a few years ago but I'm afraid that
> the results are very specific to not only the platforms but also the
> interestingness tests.  Some of those have very light cache footprints
> whereas others (for example those that invoke static analyzers) tend to blow
> out the shared cache.
> My current idea is that first we need to detect real cores instead of
> hyperthreaded cores, which is sort of a pain but we can special-case Mac OS
> and Linux I guess.  Then maybe something like:
> - parallelism 2 on a dual core
> - 3 on a 4-core
> - 4 on a >4 core
> How does this match with your experience?

I tested creduce on a machine with six physical cores (no hyperthreading),
using a 2 MB C++ testcase:

creduce -n 1 --backup ./check.sh bug244.cc  2576.49s user 300.02s system 100% cpu 47:47.16 total

creduce -n 4 --backup ./check.sh bug244.cc  3714.57s user 480.69s system 243% cpu 28:46.14 total

creduce -n 6 --backup ./check.sh bug244.cc  4759.06s user 578.17s system 270% cpu 32:54.00 total
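To make the comparison easier to read, here is a small sketch that converts the wall-clock times above into speedups relative to -n 1 and sums the user and system CPU time. The numbers are copied from the three runs; the helper name to_seconds and the dictionary layout are only for illustration.

```python
# Wall-clock and CPU times copied from the three creduce runs above.
# Each entry: -n value -> (user seconds, system seconds, "MM:SS.ss" wall time)
runs = {
    1: (2576.49, 300.02, "47:47.16"),
    4: (3714.57, 480.69, "28:46.14"),
    6: (4759.06, 578.17, "32:54.00"),
}


def to_seconds(stamp):
    """Convert a 'MM:SS.ss' wall-clock stamp into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds(runs[1][2])

# Speedup of each run relative to the single-process run.
speedups = {n: baseline / to_seconds(wall) for n, (_, _, wall) in runs.items()}

# Total CPU time (user + system) burned by each run.
cpu_totals = {n: user + system for n, (user, system, _) in runs.items()}

for n in sorted(runs):
    print(f"-n {n}: wall {to_seconds(runs[n][2]):8.2f}s, "
          f"speedup {speedups[n]:.2f}x, cpu {cpu_totals[n]:8.2f}s")
```

By this measure -n 4 is roughly 1.66x faster than -n 1, while -n 6 drops back to roughly 1.45x even though it uses more CPU time in total.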

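Here -n 4 finished fastest (28:46, versus 32:54 for -n 6 and 47:47 for -n 1),
so capping the parallelism at 4 on machines with more than four cores looks
good to me.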
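
For what it's worth, the proposed heuristic could be sketched roughly as below. This is only an illustration in Python, not C-Reduce code, and all function names are made up. The assumptions: on Mac OS the physical core count is read with `sysctl -n hw.physicalcpu`; on Linux it is derived from the "physical id" and "core id" fields of /proc/cpuinfo (which not every architecture provides); anything else falls back to the logical CPU count. The proposal does not say what to do on 1-core or 3-core machines, so the values for those are guesses.

```python
import os
import platform
import subprocess


def count_physical_cores_from_cpuinfo(text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo text.

    Hyperthreaded siblings share the same pair, so they are counted once.
    Returns 0 if the fields are missing (assumed possible on some platforms).
    """
    cores = set()
    physical_id = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


def detect_physical_cores():
    """Best-effort physical core count; falls back to logical CPUs."""
    try:
        system = platform.system()
        if system == "Darwin":
            out = subprocess.check_output(
                ["sysctl", "-n", "hw.physicalcpu"], text=True)
            return int(out.strip())
        if system == "Linux":
            with open("/proc/cpuinfo") as f:
                n = count_physical_cores_from_cpuinfo(f.read())
            if n > 0:
                return n
    except (OSError, ValueError, subprocess.CalledProcessError):
        pass
    return os.cpu_count() or 1


def suggested_parallelism(physical_cores):
    """Map physical cores to parallel reduction attempts.

    From the proposal: 2 on a dual core, 3 on a 4-core, 4 above that.
    The 1-core and 3-core cases are assumptions, not part of the proposal.
    """
    if physical_cores < 1:
        raise ValueError("core count must be positive")
    if physical_cores <= 2:
        return physical_cores
    if physical_cores <= 4:
        return 3
    return 4


if __name__ == "__main__":
    cores = detect_physical_cores()
    print(f"{cores} physical cores -> -n {suggested_parallelism(cores)}")
```

On my 6-core box this mapping would pick -n 4, which matches the fastest run above.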