Rambles around computer science

Diverting trains of thought, wasting precious time

Fri, 02 Oct 2009

Python, threading and the madness of language implementors

I just stumbled across this video of a talk by David Beazley about the implementation of Python and, in particular, threading. In a nutshell, threading in Python is implemented as a complete minimal-effort hack on top of an interpreter that is essentially single-threaded by design. Static state is again the culprit---there's a big lock, called the Global Interpreter Lock, much like Linux used to have the Big Kernel Lock. But, rather than just protecting some carefully-selected critical regions, it protects all Python computation!

So, alarmingly, the performance of Python threading is absolutely disastrous, at least for CPU-bound work: only one thread can execute Python code at a time, however many cores are available. It's ironic that Google are heavy users of Python, given that they work with some of the largest datasets on the planet and generally have to know a thing or two about optimisation and concurrency. Of course they don't use Python for their core systems, and are sponsoring an improvement effort called Unladen Swallow.

I have some research ideas that are predicated heavily on the position that language implementors often don't do a great job, particularly in the case of dynamic languages. So if we have to rewrite a bunch of dynamic language implementations, that's not really a terrible thing. I'm glad to have yet more evidence of this.


Shell Gotme number 1 -- the Heisenbergian process trees

On my Lab machine, the X login system doesn't clean up stray child processes that your X session may have left behind. (I've a feeling the Debian xinit scripts do this, but I'm not sure.) This was bothering me, because I start some background jobs in my .xsession which I want to die naturally when the session ends. So I put the following towards the end of my .xsession.

echo -n "End of .xsession: detected living child pids: " 1>&2
ps --no-header f -o pid --ppid $$ | \
	while read pid; do kill $pid; done 2>/dev/null

Unfortunately, I found that those pesky children were still alive. (Can you tell what's wrong? Yes, that's right.) Both the ps process and the while subshell are among the children which are being killed. So one way or another, the pipeline gets broken before the loop has managed to kill the children I actually wanted to kill. A version which doesn't suffer this problem is as follows.

child_pids=$( ps --no-header f -o pid --ppid $$ )
for pid in $child_pids; do kill $pid 2>/dev/null; done
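
Why the while loop counts as a child at all can be seen in isolation. This is a minimal sketch, assuming bash or dash; ksh and zsh run the last pipeline stage in the current shell, so they behave differently. The pids 101 to 103 are made up and nothing is killed. The counters only show which loop body runs in the parent shell.

```shell
#!/bin/sh
# Hypothetical pids standing in for the output of ps; nothing is killed here.

# Loop as the last stage of a pipeline: in bash and dash this stage is a
# forked subshell, i.e. a child process, so its assignments are lost.
seen_in_pipeline=0
printf '%s\n' 101 102 103 | while read -r pid; do
    seen_in_pipeline=$((seen_in_pipeline + 1))
done

# Capture the list first, then loop: the for loop runs in the parent shell.
child_pids=$(printf '%s\n' 101 102 103)
seen_in_loop=0
for pid in $child_pids; do
    seen_in_loop=$((seen_in_loop + 1))
done

echo "pipeline loop counted: $seen_in_pipeline"
echo "for loop counted:      $seen_in_loop"
```

The pipeline counter stays at zero because the increments happened in a separate process. That same process is the one that shows up in the ps listing and ends up being killed by its own loop.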

It's another instance of the familiar Heisenbergian nature of inspecting process trees from the shell: doing many basic things from the shell entails creating processes, so inspecting the shell's own process tree is likely to yield perturbed results. Usually it's just an unwanted entry in ps (as with the old ps | grep foo gotcha) but the above shows it sometimes causes subtler problems.
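
The ps | grep foo gotcha can be reproduced without touching the real process table. This is a sketch over a made-up listing; the pids and the foo-daemon name are hypothetical. grep's own command line contains the pattern, so it matches itself. The common bracket trick writes the pattern so that its literal text no longer matches.

```shell
#!/bin/sh
# Hypothetical snapshots of a process listing, taken while grep is running.
listing_plain='  101 foo-daemon
  202 grep foo'
listing_bracket='  101 foo-daemon
  203 grep [f]oo'

# The plain pattern matches the daemon and the grep process itself.
plain_matches=$(printf '%s\n' "$listing_plain" | grep -c 'foo')

# The regex [f]oo still matches "foo", but the literal text "[f]oo" on
# grep's own command line does not contain the string "foo".
bracket_matches=$(printf '%s\n' "$listing_bracket" | grep -c '[f]oo')

echo "plain pattern matched:   $plain_matches lines"
echo "bracket pattern matched: $bracket_matches lines"
```

Against a live system the equivalent would be ps ax | grep '[f]oo', though that only hides the observer rather than removing it.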


