Does Python have a concurrency problem?

by Jeremy Jones

There are a number of concurrency models available for Python, both in the standard library and in home-grown solutions. Standard threading, which uses the OS's threading library, is perhaps the most common. Select-based concurrency, such as that used by Twisted, is also quite popular in the Python community. Generator-based "threads", such as those described by David Mertz, are another mechanism to support concurrent tasks in Python. A problem is that none of these methods currently scales across multiple CPUs and takes full advantage of them. This isn't as much of a problem for IO-bound processes (such as network applications) as it is for CPU-bound applications, so a select-based concurrency model does have an advantage there. The only options currently available (that I know of) which can take full advantage of multiple CPUs involve multiple processes, either forking or using shared memory...or both. I'm sure this approach works very well in a number of situations, but it just feels like a mess. I have a hard time attaching the words "elegant" and "Pythonic" to "forked processes" and "shared memory".
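
As a rough illustration of the generator-based approach mentioned above, here is a minimal round-robin scheduler sketch. The names `worker` and `run_round_robin` are invented for this example and are not taken from David Mertz's article; each `yield` marks a point where a task voluntarily hands control back.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: record one unit of work, then yield control."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them are exhausted."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        ready.append(task)

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Everything here runs inside one OS thread, which is why this style of concurrency cannot, by itself, occupy more than one CPU.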

CPython comes equipped with the global interpreter lock (GIL) which allows only one op-code of Python bytecode to execute at a time, regardless of how many threads may be running in a given Python process. This is by design and, from what I gather, is a protection mechanism which keeps the internals of the Python interpreter (and I assume running code) from being mangled by threads accessing the same spots of memory. The end result is that a single CPython process of threaded code will not fully utilize more than a single processor in a single system. This means that, all things being equal, a single process, even threaded, will run no faster on a 128-CPU machine than it will on a single-CPU machine.
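
One way to observe this is to time a pure-Python, CPU-bound loop run twice in sequence and then in two threads. This is only a sketch, assuming a standard CPython build with the GIL enabled; the function names are made up for the example and the actual timings depend on the machine and interpreter version.

```python
import threading
import time

def count_down(n):
    """Pure-Python CPU-bound loop; it needs the GIL to execute bytecode."""
    while n > 0:
        n -= 1

def time_sequential(n, jobs):
    """Run the loop `jobs` times, one after another, and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start

def time_threaded(n, jobs):
    """Run the loop in `jobs` threads at once and return the elapsed time."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 2_000_000
    print("sequential:", time_sequential(n, 2))
    print("threaded:  ", time_threaded(n, 2))
```

On a GIL-enabled interpreter the threaded figure is typically no better than the sequential one, however many CPUs are present.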

There has been some talk recently about a more scalable (and more Pythonic) concurrency model. Bruce Eckel started a discussion around this topic the other day on the python-dev mailing list. There doesn't appear to be a consensus just yet on the exact approach, but some good ideas floated around for a bit. Unfortunately, the discussion appears to be done - prematurely by my estimation. I'd love to see a PEP come out of this, though. There are so many sticking points, both technically and ideologically, that it will take some time to formulate a PEP that will gain general acceptance. This sounds like a case where some really bright person (of which there are plenty in the Python community and on the Python-dev list specifically) needs to just write a PEP that isn't too strongly hated by any one side and let it get BDFLed into existence.

I'm no language-writing expert and the only experience I've had with concurrent programming has been with Python, so I'm sure there are nuances of lower-level concurrency that I'm missing, but I am formulating in my mind what kind of concurrency model I'd like to see. I liked the idea of each task creating another Python interpreter instance in the Python process. Why not just spawn a new process? It seems like that just makes it a bit harder to share information between the starting task and the started task. Of course, you want the ability to share information, but you don't want too much shared. Another idea that I liked was a queue-like interface between the starting task and the started task. The starting task should have to explicitly pass in the specific pieces of information it wants the started task to work on or have available to it. The starting task should have the ability to query the started task and find out if it's working on the task or if it's done. Now, if the started task needs to return something to the starting task, how does it do it? I don't know. I really don't like the thought of the starting task polling a queue to see if there is anything in there. What if the started task isn't intended to return anything? I know, you can set flags when starting it... You can't really make its "run" method return anything or you would block until it finished, which is self-defeating. I'm sure one or more of the Pythonian intelligentsia will come up with something brilliant. It will probably look nothing like what I've described and I'm sure I'll love it and think that it is better than I could have imagined. I would just like to see it happen.
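
To make the queue-like interface a little more concrete, here is a small sketch using the standard `queue` and `threading` modules. It is only an approximation of the idea: it uses ordinary threads rather than separate interpreter instances, and the names `started_task` and `start` are hypothetical.

```python
import queue
import threading

def started_task(inbox, outbox):
    """Work only on the data the starting task explicitly passed in."""
    numbers = inbox.get()
    outbox.put(sum(n * n for n in numbers))

def start(numbers):
    """Launch the task and hand it its input; return a handle and a result queue."""
    inbox, outbox = queue.Queue(), queue.Queue()
    task = threading.Thread(target=started_task, args=(inbox, outbox))
    task.start()
    inbox.put(numbers)  # explicit hand-off; nothing else is passed in
    return task, outbox

task, outbox = start([1, 2, 3])
result = outbox.get()  # blocks until a result arrives, so no polling loop is needed
task.join()
print(task.is_alive(), result)
```

`task.is_alive()` covers the "is it still working?" query, while the blocking `get()` is one possible answer to the return-value question, at the cost of making the starting task wait at that point.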

You may think that I presupposed that Python needs a new concurrency scheme. Well, maybe that'll be a discussion for another day.


2005-10-05 22:10:43
I think forking is a good solution.

The thing about forking is that it allows the whole interpreter to be not only shuffled off of one processor onto another but, in HPC cluster environments, from one node to another. This means that the controller need only spawn copies of itself and then handle communication through other means. Forking also has fairly little overhead on most systems where the interpreter isn't being moved off of the physical system, because only one copy of the code sits in memory and making a copy of the memory takes very little time. This means two Python processes will take less time to spawn via forking than via two separate commands, as there is no second initialization phase, and less physical memory will be used. Sounds good to me.

How forking is handled may not be as graceful as it could be, although once you're used to it, it's not bad.

That's just how I understand it, and how I feel based on that understanding. I could be out to lunch, I'm no expert.
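
For readers who have not used it, the fork-and-communicate pattern described in this comment can be sketched roughly as follows. This assumes a Unix-like system where `os.fork()` is available; the function name `square_in_child` is invented for the example, and a pipe is just one of several possible communication channels.

```python
import os

def square_in_child(n):
    """Fork a child, compute n * n there, and read the answer back over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts out as a copy of the parent's memory image.
        try:
            os.close(read_fd)
            os.write(write_fd, str(n * n).encode())
        finally:
            os._exit(0)  # exit immediately without running the parent's cleanup
    # Parent: close the write end so that read() sees end-of-file.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)  # reap the child
    return int(data)

print(square_in_child(12))
```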

2005-10-05 23:01:03
What about multiple core processors?
Do the same limitations in the current threading methods with multiple processors apply to multiple core processors?

If they do, I think a solution will come up really soon because of popular demand. But I also think that it should come after watching how several projects deal with the coming wave of multiple core processors, and choosing the most elegant solution.

So, I second what you said at the beginning of your article, it is too early.

2005-10-06 00:14:58
Look at Erlang
Check out Erlang for an example of a very nice and successful concurrency model.
2005-10-06 04:58:29
Look at Erlang
I thought about mentioning Erlang, but since I know next to nothing about it, I decided not to. There is a project, I think its name is "Candygram" or something like that, which tries to bring an Erlangish concurrency model into Python. There is an Erlang book which focuses on concurrency freely available online. Guess I'll put that on my reading list.
2005-10-06 05:01:38
What about multiple core processors?
I believe that it treats multicore systems the same as if they were multi-CPU systems. I think this is a huge reason people are starting to clamor for it. But I think the time is currently right to get this into the language (and the CPython implementation specifically).
2005-10-06 05:09:39
I think forking is a good solution.
I'm not opposed to forking, not totally anyway. It does solve a number of problems that other concurrency schemes have. It mostly gives the isolation of the started task by creating a copy of the starting task. My understanding is, though, that file descriptors aren't copied; if you have an open file in the starting process, you get the same file descriptor in the started process, which could cause problems. This also handles the scalability problem. But according to the Python lib ref, it's only available on Mac and *NIX. So, instead of creating some Pythonic wrapper for fork(), I think something more along the line of threads will be selected, at least under the hood.
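
One way to check the file-descriptor behaviour described above is the following sketch, which assumes a Unix-like system with `os.fork()`. The function name is hypothetical. The child reads a few bytes through the inherited descriptor, and the parent then asks where its own file position is.

```python
import os
import tempfile

def offset_after_child_read():
    """Return the parent's file offset after a forked child reads 4 bytes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)  # rewind before forking
        pid = os.fork()
        if pid == 0:
            try:
                os.read(fd, 4)  # child reads through the inherited descriptor
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        # Parent and child descriptors refer to the same open file, so the
        # child's read has moved the position seen by the parent as well.
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.unlink(path)

print(offset_after_child_read())
```

The parent sees an offset of 4 even though it never read anything itself, which is the kind of surprise that shared descriptors can cause.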
2005-10-06 08:20:14
The Python-dev thread and related work
On the Python-dev thread a proposal was made for a process-based concurrency API:

One library that doesn't go as far as that proposal in terms of concurrency "limits", and which doesn't enforce some kind of function-oriented approach, instead providing a channel-oriented approach, is the parallel module:

This module was written before I became aware of the above discussion, but given the availability of transparent process migration for various operating systems and the likelihood that such techniques will become more widespread, I don't think fork-based concurrency should be written off. Apart from issues of duplicated file descriptors, similarities to thread-based concurrency in terms of preserving the program "image" (albeit in a copied state) make the approach more palatable than other distributed computing techniques.

2005-10-06 08:57:11
Look at Erlang
Erlang is interesting, in that it has extremely light processes, and I get this impression that processes are similar to what we think of as objects. Except encapsulated much more firmly, since you can really only send messages.
2005-10-06 08:59:26
Saying "I like forking" is like saying "I like vector graphics... therefore we shouldn't give access to raster graphics". Both forking and threads, believe it or not, have value. I think Python either needs to give me access to both or prepare to be retired in favor of a more serious language (Ruby?).

Nobody is asking Python to implement threads. The thread library already exists. Just please get out of the way.

2005-10-06 11:30:53
Look at Erlang
Yes, and one of the creators of Erlang even coined the expression: "Concurrency Object Programming".
2005-10-06 14:16:56
FYI, Ruby doesn't have OS-level threads. But anyway, there's no way threading support is going away. It's only a question of whether the alternatives are going to get more serious development done on them.
2005-10-06 15:30:58
Yes, python has a concurrency problem
I have seen this question debated in so many forums, and I must confess, at the risk of being flamed, that the defenses of Python's poor support for threading are lame. If you have an application for which multiple threads of execution are required, and there needs to be significant communication between the threads of execution, then forking processes simply will not do. Correct me if I'm wrong, but I believe DBMSs like Oracle and MySQL are multi-threaded. There are numerous apps that can get away with being multi-process because they use the DBMS as their means of dealing with the communication between threads. There is nothing wrong with that, but one should realize that those applications are using threading, it is just hidden in the DBMS.

Writing an app in which the threads of execution need to have a significant amount of interaction, and they are in multiple processes, adds lots of difficulties. The IPC needed to communicate between the processes is generally a major point of failure. There needs to be code that figures out if a process has died and how to start it again. There needs to be code to kill all the processes when the app is supposed to terminate. There needs to be code to figure out if one of the processes has stayed alive when it shouldn't have. The headaches never end.

The usual arguments against threading are that it is too hard and too error-prone. I agree that writing multi-threaded apps is not for the faint of heart, but if you need significant communication between the threads of execution, and you can't handle the complexity of threads, then you probably can't handle the complexity of multiple processes either.

Regarding python's support for threading, in addition to the point you make about its inability to utilize multiple processors effectively, the GIL also appears to preclude setting thread priorities.

Java seems to have done multi-threading right in its JVM. Why not python?

2005-10-06 23:39:16
Python and Concurrency
I think I hear this rubbish every two months. I hate to repeat myself but...

Python code runs at approximately one tenth the speed of C code (on a good day). If you are serious about performance, this should concern you.

Threads are not the answer. Most applications don't need threading (shared image). Binding workers together with a single shared memory image seems like a gain. Later on, you will have to scale across multiple machines. Maybe for failover, maybe to scale--however, you will need to do so.

When you do, you will wish you had written your application forked w/ IPC, because it's easier to port local IPC to network IPC than it is to convert a SSI-threaded model to an IPC model. So again, with the exception of a few very high performance, very limited applications, threads are only a middle ground between scalable and not.

Removing the GIL slows down Python. The gauntlet has been thrown down numerous times. No one has yet been able to realize a single performance increase by removing the GIL. The savings they thought they were getting were usually smashed by increased cache contention and locking overhead.

Finally, the GIL does not make Python single threaded. Python has threading support. The GIL makes individual Python opcodes single threaded. Python is made to be a hybrid solution. Write in Python, optimize with C extensions. C extension code has all of the facilities it needs to run as many threads as it wants to use. I have Python applications that demonstrably consume more than 1 CPU of power when carefully profiled.
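
A small sketch of that hybrid idea, using a C-implemented standard library module rather than a hand-written extension: CPython's `hashlib` documentation says the GIL is released while hashing larger inputs, so threads doing this work can overlap. The helper name `digest_all` is made up for the example, and how much speedup appears depends on the build and the machine.

```python
import hashlib
import threading

def digest_all(chunks):
    """Hash each chunk in its own thread and return the hex digests in order."""
    results = [None] * len(chunks)

    def work(i):
        # The heavy lifting happens in C code, which can run without the GIL.
        results[i] = hashlib.sha256(chunks[i]).hexdigest()

    threads = [threading.Thread(target=work, args=(i,)) for i in range(len(chunks))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

chunks = [bytes([i]) * 1_000_000 for i in range(4)]
digests = digest_all(chunks)
print(digests[0][:16])
```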

This is really not rocket science. The fact of the matter is that Python makes development quick and painless. When scalability and performance are needed, a little profiling, a little bit of high-level refactoring of your Python code, and careful implementation of critical logic in C extensions give you a level of flexibility and performance at an unmatched price point.

To sum up:

Python has no less threading performance than C does, you just may have to write some C before it's over.

Threads are not the answer. Threads are the question. No is the answer. Why? Scalable != threading. Scalable == clustered / peer to peer / grid. Threading is no more the height of scalability than purgatory is heaven.

2005-10-07 04:43:54
Python and Concurrency
Python performs well enough. As you mention, profiling and C extensions make Python's performance "problem" a non-problem.

You seem to be focusing on scalability across a multitude of machines. The majority of programs written won't ever be run in that sort of situation. This blog entry was really intended to discuss scaling across multiple CPUs on a single system.

I'm not sure your understanding of the GIL is correct. The GIL does in a sense make Python single threaded. Yes, there is still thread support, yes, multiple threads can actually be running simultaneously, but only one op-code is executed at a time regardless of the number of threads running. At least that is my understanding from newsgroup discussions and what I've read in Python's ceval.c.
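
The op-code granularity is easy to see with the standard `dis` module. In this sketch (the function `increment` is just an example), a single Python statement compiles to several op-codes. A thread switch can happen between op-codes but never inside one, so a statement like this is not atomic even with the GIL in place. The exact op-code names differ between CPython versions.

```python
import dis

def increment(counter):
    counter["n"] += 1  # one statement, but several bytecode instructions

# Collect the names of the op-codes the interpreter will execute.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(len(ops), ops)
```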

I'm not sure threads are the answer, but they are certainly an answer and I would like to see a concurrency model which utilized threads better. I like the idea of creating child processes either by fork() or some other mechanism, but I think (maybe I'm wrong here) that threads will have better cross-platform support.

2005-10-07 04:52:48
Yes, python has a concurrency problem
I'm going off of memory here, so I may be wrong, but I thought at least one of the major RDBMSs used multiple processes and shared memory (and each of those processes probably used threads). Googling for it right now turns up nothing, though. Regardless, your point stands that sometimes using threads is a preferable alternative.

Again, I'd love to see Python have a better threading model. I would also love to see a new concurrency model which is both Pythonic and scalable across multiple CPUs on the same machine.

2005-10-07 09:18:32
Yes, python has a concurrency problem
Excellent point about Java's ability to allow full access to real threads. And Jython at least gives a programmer a choice. You can use the usual GIL-constrained threads to keep in step with CPython or simply import native Java threads in order to bypass the bottleneck and take advantage of multiple CPUs. So much for the notion that the Python language and native threads are somehow necessarily mutually exclusive.

2005-10-07 09:43:27
No OS-level threads in Ruby? Perhaps you mean... not yet!

From another thread, dated April 2005: "Ruby v2, which will start trickling out later this year apparently contains support for native threading."

Well, at least it's a sign that Ruby developers are on the right track. Meanwhile... Python gets decorators??!?

2005-10-07 15:22:13
Ruby v2 is still very much a moving target. The current implementation of Ruby, like Python, has threading issues. Your only purpose was to say "Ruby" in this discussion. Saying v2 of Ruby may have native threads is bluster since the next version of Python "could" have native threads as well. The question is, what are we dealing with now?
2005-10-07 15:59:08
I believe Python currently supports native threads. Here is a quote from the top of thread.c:

/* Thread package.
This is intended to be usable independently from Python.
The implementation for system foobar is in a file thread_foobar.h
which is included by this file dependent on config settings.
Stuff shared by all thread_*.h files is collected here. */

Each system has its own threading package which appears to import the proper threading library for that system. If I'm wrong, someone please correct me.

2005-10-10 09:42:14
You're correct, I believe. It's just that the GIL gets in the way as far as taking advantage of multiple CPUs.