22 May 18:20
ForkJoinPool does not achieve expected parallelism
Romain Colle <rco <at> quartetfs.com>
2012-05-22 16:20:47 GMT
2012-05-22 16:20:47 GMT
Hi Doug and all,
Using the latest version of the ForkJoinPool available on the jsr166y website, we ran into an issue/regression with our existing code base.
In our code, we have a single task that scans a fairly large fact table while applying some filtering conditions. Any row that passes this condition must then be processed.
The pattern that we use is that the scanning task saves (in an array) the rows that passed the condition and forks a processing task as soon as we have enough rows to process (say 1024).
We expect that these processing tasks will be picked up (i.e. stolen) by the other threads in the pool and executed while we continue scanning and filtering.
Unrelated to the current issue, we have a completion phase at the end of the scanning that ensures all the forked tasks have been executed.
To sum it up, we have a single task that scans some data and forks lots of processing task.
Previously (with the version from a few months back without the worker queues), this was working perfectly and all the worker threads were kept busy and executed the processing tasks.
Now, we see that only a few threads are being kept busy with the processing tasks (in addition to the scanning task). Dozens of threads are idle while work is piling up.
If have put together a simple test that exhibits this issue:
Looking at the ForkJoinPool code, it looks like the issue could be in ForkJoinPool.WorkQueue.push().
More specifically, we have the following:
if ((n = (top = s + 1) - base) <= 2) {
if ((p = pool) != null)
p.signalWork();
}
If tasks are being forked quickly enough, we will only signal work twice while the local queue is getting fairly large, and no extra help is being made available.
I naively modified this code to the following and got back to the initial behavior (full threads usage):
if ((n = (top = s + 1) - base) <= pool.parallelism) {
if ((p = pool) != null)
p.signalWork();
}
I'm not quite sure why the initial "2" was put there. Was it to avoid flooding the threads with signals if everybody is already at work? In that case we could do the signaling only if the AC count is < 0.
I'm sure there is a more elegant solution available, so any advice on whether this is indeed a core issue or if there is an issue with our pool usage would be appreciated!
Thanks,
Romain Colle
R&D Project Manager
QuartetFS
2 rue Jean Lantier, 75001 Paris, France
http://www.quartetfs.com
_______________________________________________ Concurrency-interest mailing list Concurrency-interest <at> cs.oswego.edu http://cs.oswego.edu/mailman/listinfo/concurrency-interest
RSS Feed