
Re: [creduce-dev] Making creduce more parallel

To: Nick Fitzgerald <fitzgen@gmail.com>
Subject: Re: [creduce-dev] Making creduce more parallel
From: John Regehr <regehr@cs.utah.edu>
Date: Fri, 27 Jan 2017 11:27:39 -0700
Cc: creduce-dev@flux.utah.edu
In-reply-to: <CAN1aaTXqFu+1DQK2M00KRhBCN5YaLRrRb8Oan+Txk79RUkUsTg@mail.gmail.com>
References: <CAN1aaTUxQH8n9G-JWEbETS_y8LB6ap4F5cRUBVcQrLZTSbs8tQ@mail.gmail.com> <3bf0290b-70fb-9813-8bb2-76af8d8b608a@cs.utah.edu> <CAN1aaTXqFu+1DQK2M00KRhBCN5YaLRrRb8Oan+Txk79RUkUsTg@mail.gmail.com>

How are the passes / reducers structured right now? If we can generate
all of a pass's potential reductions up front, then they can be
inserted into the queue in a random order to reduce the likelihood of
conflicts. If the passes don't separate generating a potential
reduction from testing it, then we may need to refactor more.

Let's talk about the line reduction pass at granularity 1 (each variant is created by removing one line). We're running it on a 1000-line file.

The search tree here has 2^1000 leaves, so we certainly don't want to try to generate all variants up-front.

What we can do is speculate: assume the variants are unsuccessful (statistically this is the right guess) and now we only have 1000 variants, so that is feasible, but not particularly fast since we're manipulating ~50 KB of text. Worse, this line of speculation becomes more and more out of sync with the current best reduction, assuming that some line removals succeed -- so merge conflicts are going to start happening.

The upshot is that while the coordinator can run ahead, it should run only far enough ahead that the queue doesn't empty out.
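To make the speculation concrete, here is a rough Python sketch (not C-Reduce code; the names `speculative_variants` and `take_ahead` are made up for illustration). It generates single-line-removal variants lazily from one fixed base, which amounts to assuming every earlier removal failed, and pulls only a bounded number of them ahead of the workers:

```python
from itertools import islice
from typing import Iterator, List, Tuple


def speculative_variants(base: List[str], start: int = 0) -> Iterator[Tuple[int, List[str]]]:
    """Lazily yield (index, variant) pairs, each variant dropping one line.

    All variants derive from the same base: we speculate that every
    earlier removal turned out to be uninteresting.
    """
    for i in range(start, len(base)):
        yield i, base[:i] + base[i + 1:]


def take_ahead(variants: Iterator[Tuple[int, List[str]]], depth: int) -> List[Tuple[int, List[str]]]:
    """Materialize at most `depth` variants, i.e. run only that far ahead."""
    return list(islice(variants, depth))


base = ["a", "b", "c", "d"]
batch = take_ahead(speculative_variants(base), 2)

# Suppose removing line 1 succeeded: the variant becomes the new base, and
# speculation restarts at the same index (the old line 2 now sits at index 1).
new_base = batch[1][1]
rebased = take_ahead(speculative_variants(new_base, start=1), 1)
```

Anything still queued from the old base after a success is stale, which is why a small look-ahead depth keeps the wasted work and the merge conflicts down.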

The API for a C-Reduce pass is purely functional. The pass takes a current test case and a pass state object, and either produces a new variant + pass state or else says that it has no more variants to generate. This API is not designed to facilitate picking random variants.
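As an illustration of that shape only (this is not the actual C-Reduce pass interface; `LineState` and `line_pass` are hypothetical names), a line-removal pass in this functional style might look like the following in Python:

```python
from typing import NamedTuple, Optional, Tuple


class LineState(NamedTuple):
    """Hypothetical pass state: index of the next line to try removing."""
    index: int


def line_pass(test_case: str, state: LineState) -> Optional[Tuple[str, LineState]]:
    """Return (variant, next_state), or None when no variants remain.

    Neither the test case nor the state is mutated; the caller decides
    what to do with the returned values.
    """
    lines = test_case.splitlines(keepends=True)
    if state.index >= len(lines):
        return None  # the pass has no more variants to generate
    variant = "".join(lines[:state.index] + lines[state.index + 1:])
    return variant, LineState(state.index + 1)
```

Because the state only says where to continue from, the pass enumerates variants in a fixed order, which is why there is no cheap way to ask it for the k-th variant directly.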

However, a quick hack to get a random variant is to just repeatedly invoke the pass a random number of times. This is not fast but it'll get things off the ground with very little effort. Some experimentation will be required to determine how this parameter interacts with the likelihood of merge conflicts.
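A sketch of that hack in Python, under the assumption that a pass is a function from (test case, state) to either (variant, new state) or None. The toy pass `drop_line`, the helper `random_variant`, and the `max_steps` parameter are all invented for illustration:

```python
import random
from typing import Callable, Optional, Tuple

Step = Optional[Tuple[str, int]]


def drop_line(test_case: str, index: int) -> Step:
    """Toy pass: remove line `index`, or return None when out of lines."""
    lines = test_case.splitlines(keepends=True)
    if index >= len(lines):
        return None
    return "".join(lines[:index] + lines[index + 1:]), index + 1


def random_variant(pass_fn: Callable[[str, int], Step], test_case: str,
                   state: int, max_steps: int, rng: random.Random) -> Step:
    """Invoke the pass a random number of times and keep the last variant.

    Returns None only if the pass was already exhausted on the first call.
    """
    result = None
    for _ in range(rng.randint(1, max_steps)):
        step = pass_fn(test_case, state)
        if step is None:
            break  # ran off the end; fall back to the last variant seen
        result = step
        state = step[1]
    return result


picked = random_variant(drop_line, "a\nb\nc\n", 0, 5, random.Random(0))
```

Here `max_steps` plays the role of the parameter mentioned above: larger values spread the workers over more of the file, at the cost of more wasted pass invocations per variant.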

I was imagining that the "orchestrator" process would spawn worker
threads that spawn-and-wait on interestingness processes and use
CSP-style channels to communicate with the main thread that owns the
queue. This would leverage the modularity of passes that
generate potential reductions and of the interesting test.

Something like this diagram:
https://gist.githubusercontent.com/fitzgen/bf1acdc6dad217f2ed5accbabce9cf73/raw/981bff47be0f818e69041908eb63035fabb4e25a/orchestrator-diagram.txt

I was planning on prototyping in Rust, which has channels in its
standard library. Python's `queue.Queue` should also be able to handle
the job.
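A minimal Python sketch of that arrangement, assuming `queue.Queue` stands in for the channels. The names `worker` and `run_orchestrator` are made up, and the interestingness test is passed in as a plain callable so the example runs on its own; in practice it would presumably spawn the interestingness script with `subprocess.run` and check the exit status:

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple


def worker(jobs: queue.Queue, results: queue.Queue,
           is_interesting: Callable[[str], bool]) -> None:
    """Pull variants from the job queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, variant = job
        results.put((job_id, variant, is_interesting(variant)))


def run_orchestrator(variants: Iterable[str], is_interesting: Callable[[str], bool],
                     num_workers: int = 4, queue_depth: int = 8) -> List[Tuple[int, str, bool]]:
    """Main thread owns the queues; the bounded job queue limits look-ahead."""
    jobs: queue.Queue = queue.Queue(maxsize=queue_depth)
    results: queue.Queue = queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results, is_interesting))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    submitted = 0
    for job in enumerate(variants):
        jobs.put(job)  # blocks while the queue is full
        submitted += 1
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(results.get() for _ in range(submitted))


outcome = run_orchestrator(["int x;", "", "int y;"], lambda v: "x" in v,
                           num_workers=2, queue_depth=1)
```

This version only collects verdicts. A real orchestrator would also adopt an interesting variant as the new best test case and discard or rebase whatever is still queued against the old one.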

If you have other suggestions, I am all ears.

This sounds fine, the only suggestion I have is that you might consider using a network-friendly mechanism for the orchestration in case we wanted to see how reduction across multiple machines works. Or at least design things so that this isn't difficult if anyone wants to try it out later.

I would suggest that you start out dealing only with one or two passes, perhaps the line remover and the token remover. These do some heavy lifting, hopefully never crash, and are always easy to understand.


John
