10

Possible Duplicate:
Parallel.ForEach vs Task.Factory.StartNew

I need to run about 1,000 tasks in a ThreadPool on a nightly basis (the number may grow in the future). Each task performs a long-running operation (reading data from a web service) and is not CPU intensive. Async I/O is not an option for this particular use case.

Given an IList<string> of parameters, I need to call DoSomething(string x) for each one. I am trying to pick between the following two options:

IList<Task> tasks = new List<Task>();
foreach (var p in parameters)
{
    tasks.Add(Task.Factory.StartNew(() => DoSomething(p), TaskCreationOptions.LongRunning));
}
Task.WaitAll(tasks.ToArray());

OR

Parallel.ForEach(parameters, new ParallelOptions {MaxDegreeOfParallelism = Environment.ProcessorCount*32}, DoSomething);

Which option is better and why?

Note:

The answer should include a comparison between the usage of TaskCreationOptions.LongRunning and MaxDegreeOfParallelism = Environment.ProcessorCount * SomeConstant.


  • @zooone9243: Perhaps I'm nitpicking a bit and it is just a language thing, but on a 4-core machine only 4 threads run concurrently. You can still create 1,000 threads (unless you run out of memory), but a proper solution is to use threads from a pool, exactly as you intend to do. It is the "I need to run about 1,000 threads" part that confuses me. - Martin Liversage
  • Please don't tell us that async I/O is not an option without explaining it. This sounds like an X/Y problem if I've ever heard one. Async I/O is the correct way to perform these types of tasks. If you're sure it doesn't apply in your case, then explain your problem so that we can actually try to provide the best possible solution. - Aaronaught
  • Then modify it so that it does. This is an architecture problem, not a performance concern. You will not get acceptable performance using the TPL (which includes both Task and Parallel). At best you're asking to choose between the lesser of two grave evils. - Aaronaught
  • Just an aside, I realize this is an old question: Task/Parallel are 4.0 features; async is a 4.5 feature (yes, I realize there was a CTP). So it could just be mandated from God that only true 4.0 features will be in the code. Or, as the questioner mentions, the web services library that must be used might be third party, with no ability to modify it and decorate everything with async/await. - Mike
  • Something to note here - if you use Parallel.ForEach on long running (I/O bound tasks), the thread scheduler gets impatient. It assumes that the reason for the slow progress is that tasks are overly CPU intensive, so it starts adding threads to the thread pool at a rate of 2/minute. This basically "leaks" threads in this manner until the parallel foreach is complete. - Steven Padfield

3 Answers


35

Perhaps you aren't aware of this, but the members of the Parallel class are simply (complicated) wrappers around Task objects. In case you're wondering, the Parallel class creates the Task objects with TaskCreationOptions.None. However, MaxDegreeOfParallelism affects those task objects no matter what creation options were passed to the task object's constructor.

TaskCreationOptions.LongRunning gives a "hint" to the underlying TaskScheduler that it might perform better with oversubscription of threads. Oversubscription is good for high-latency work, for example I/O, because it assigns more than one thread (yes, thread, not task) to a single core so that it always has something to do, instead of waiting around for an operation to complete while the thread sits in a waiting state. On the TaskScheduler that uses the ThreadPool, LongRunning tasks are run on their own dedicated thread (the only case where you have a thread per task); otherwise they run normally, with scheduling and work stealing (really, what you want here anyway).
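To see that hint in action, here's a minimal sketch of my own (not from the original discussion) that runs on the default ThreadPoolTaskScheduler and prints whether each task landed on a pool thread:

    // Requires System, System.Threading and System.Threading.Tasks.
    // On the default scheduler, a plain task runs on a ThreadPool thread,
    // while a LongRunning task typically gets its own dedicated thread.
    Task normal = Task.Factory.StartNew(() =>
        Console.WriteLine("Default     -> pool thread: {0}",
            Thread.CurrentThread.IsThreadPoolThread));            // usually True

    Task dedicated = Task.Factory.StartNew(() =>
        Console.WriteLine("LongRunning -> pool thread: {0}",
            Thread.CurrentThread.IsThreadPoolThread),             // usually False
        TaskCreationOptions.LongRunning);

    Task.WaitAll(normal, dedicated);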

MaxDegreeOfParallelism controls the number of concurrent operations run. It's similar to specifying the maximum number of partitions that the data will be split into and processed from. If TaskCreationOptions.LongRunning could be specified as well, all this would do is limit the number of tasks running at a single time, similar to a TaskScheduler whose maximum concurrency level is set to that value, as in this example.

You might want the Parallel.ForEach. However, setting MaxDegreeOfParallelism to such a high number actually won't guarantee that there will be that many threads running at once, since the tasks will still be controlled by the ThreadPoolTaskScheduler. That scheduler will keep the number of threads running at once to the smallest amount possible, which I suppose is the biggest difference between the two methods. You could write (and specify) your own TaskScheduler that mimics the max degree of parallelism behavior and get the best of both worlds, but I doubt that's something you're interested in doing.
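If you did want to go that route, a hedged sketch of the wiring might look like this, assuming you copy the LimitedConcurrencyLevelTaskScheduler sample class from the MSDN TaskScheduler documentation into your project (it is not part of the framework), and reusing the parameters and DoSomething from your question:

    // Cap concurrency at the same figure you'd pass to MaxDegreeOfParallelism,
    // but keep the task-per-item model from your first option.
    // (Needs using System.Linq for Select.)
    var scheduler = new LimitedConcurrencyLevelTaskScheduler(Environment.ProcessorCount * 32);
    var factory = new TaskFactory(scheduler);

    Task[] tasks = parameters.Select(p => factory.StartNew(() => DoSomething(p))).ToArray();
    Task.WaitAll(tasks);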

My guess is that, depending on latency and the number of actual requests you need to make, using tasks will perform better in many(?) cases, though it will wind up using more memory, while Parallel will be more consistent in resource usage. Of course, async I/O will perform monstrously better than either of these two options, but I understand you can't do that because you're using legacy libraries. So, unfortunately, you'll be stuck with mediocre performance no matter which one you choose.

A real solution would be to figure out a way to make async I/O happen; since I don't know your situation, I don't think I can be more helpful than that. With async I/O, your program (read: thread) continues execution while the kernel waits for the I/O operation to complete (this is also known as using I/O completion ports). Because the thread is not in a waiting state, the runtime can do more work on fewer threads, which usually ends up in an optimal relationship between the number of cores and the number of threads. Adding more threads, as much as I wish it would, does not equate to better performance (actually, it can often hurt performance because of things like context switching).
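For what it's worth, here is a rough sketch of what that could look like on .NET 4.0 without async/await, using the Begin/End (APM) pattern wrapped in a task via Task.Factory.FromAsync; the helper name and URLs below are placeholders, not your actual service calls:

    // No thread is blocked while a request is in flight; the continuation runs
    // when the I/O completion port signals that the response has arrived.
    static Task<string> DownloadAsync(string url)   // hypothetical helper
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        return Task.Factory
            .FromAsync<WebResponse>(request.BeginGetResponse, request.EndGetResponse, null)
            .ContinueWith(t =>
            {
                using (var response = t.Result)
                using (var reader = new StreamReader(response.GetResponseStream()))
                    return reader.ReadToEnd();
            });
    }

    // Usage: kick everything off, then block only once at the very end.
    var urls = new[] { "http://example.com/a", "http://example.com/b" };  // placeholders
    Task<string>[] downloads = urls.Select(DownloadAsync).ToArray();
    Task.WaitAll(downloads);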

However, this entire answer is useless for determining a final answer to your question, though I hope it gives you some needed direction. You won't know what performs better until you profile it. If you don't try both (and I should clarify that I mean the Task version without the LongRunning option, letting the scheduler handle thread switching) and profile them to determine what is best for your particular use case, you're selling yourself short.
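In the spirit of that last point, a rough (hypothetical) harness to time both of your options against your real parameters and DoSomething:

    var sw = System.Diagnostics.Stopwatch.StartNew();
    Task[] plainTasks = parameters.Select(p => Task.Factory.StartNew(() => DoSomething(p))).ToArray();
    Task.WaitAll(plainTasks);
    Console.WriteLine("Plain tasks:      {0} ms", sw.ElapsedMilliseconds);

    sw.Restart();
    Parallel.ForEach(parameters,
        new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 32 },
        DoSomething);
    Console.WriteLine("Parallel.ForEach: {0} ms", sw.ElapsedMilliseconds);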


  • Thanks for a great answer. I wonder, if the Parallel class creates Task objects, how come it is able to create foreground threads, while the Task library creates background threads and doesn't seem to give you the option to create foreground threads? - Zaid Masud
  • @zooone9243 - It's not actually creating foreground threads. Instead, it just calls Wait(), which blocks execution until it's finished or canceled. - Christopher Currens
  • @zooone9243 - It's a little bit more complicated than I'm making it out to be. If you want to get a good understanding of the inner workings, I'd recommend you check out the .NET Reference Source - Christopher Currens

4

Both options are entirely inappropriate for your scenario.

TaskCreationOptions.LongRunning is certainly a better choice for tasks that are not CPU-bound, as the TPL (the Parallel classes/extensions) is almost exclusively meant for maximizing the throughput of a CPU-bound operation by running it on multiple cores (not threads).

However, 1000 tasks is an unacceptable number for this. Whether or not they're all running at once isn't exactly the issue; even 100 threads waiting on synchronous I/O is an untenable situation. As one of the comments suggests, your application will be using an enormous amount of memory and end up spending almost all of its time in context-switching. The TPL is not designed for this scale.

If your operations are I/O bound - and if you are using web services, they are - then async I/O is not only the correct solution, it's the only solution. If you have to re-architect some of your code (for example, adding asynchronous methods to major interfaces where there were none originally), do it, because I/O completion ports are the only mechanism in Windows or .NET that can properly support this particular type of concurrency.

I've never heard of a situation where async I/O was somehow "not an option". I cannot even conceive of any valid use case for this constraint. If you are unable to use async I/O then this would indicate a serious design problem that must be fixed, ASAP.


  • "I've never heard of a situation where async I/O was somehow 'not an option'" ... each web service call requires an expensive handshake. Establishing a connection is the real killer, almost more so than the actual calls themselves. My knowledge of IO completion ports is limited, can they be used for this scenario? If you have any good references on these please share. Thanks. - Zaid Masud
  • @zooone9243, I don't see why that would have to mean you can't use async IO. And it's hard to tell you how exactly to do that unless your tell us more. - svick
  • @svick I need to learn more about these I/O completion ports... are we talking about using unmanaged Windows I/O threads as discussed here? blogs.msdn.com/b/ericeil/archive/2008/06/20/… - Zaid Masud
  • @zooone9243, no, it means using BeginXxx()/EndXxx() methods instead of just Xxx() method. Which method(s) exactly are we talking about depends on what exactly you are doing (they can be methods on WebRequest or Socket or maybe something else). The Begin/End methods then use I/O completion ports internally. - svick
  • @zooone9243: Perhaps I've misinterpreted the meaning of "web service" here, but in my experience, the entire premise of a web service is that it uses a standard web protocol and format (i.e. SOAP, XML, JSON, all over HTTP or HTTPS) which you don't need a proprietary library to access. Is this some totally opaque binary-encoded RPC service for which you have no source or specifications? - Aaronaught

4

While this is not a direct comparison, I think it may help you. I do something similar to what you describe (in my case I know there is a load-balanced server cluster on the other end serving REST calls). I get good results using Parallel.ForEach to spin up an optimal number of worker threads, provided that I also use the following code to allow more than the default number of concurrent connections to each endpoint.

    // Find (or create) the ServicePoint for this endpoint and raise its
    // concurrent-connection limit (the default for client apps is only 2).
    var servicePoint = System.Net.ServicePointManager.FindServicePoint(Uri);
    servicePoint.ConnectionLimit = 250;

Note that you have to call this once for each unique endpoint (scheme, host, and port) you connect to.
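If you talk to many different URLs, one alternative (the same knob, applied process-wide rather than per endpoint, and not from the original answer) is to set the global default once at startup:

    // Applies to every ServicePoint created afterwards; set it before the
    // first request goes out.
    System.Net.ServicePointManager.DefaultConnectionLimit = 250;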
