Cheedoong Drung has started a Chinese translation of “Is Parallel Programming Hard, And, If So, What Can You Do About It?”. He is making good progress, and would also be happy to have others join in. So if you have good Chinese and English skills (I guess I am one for two on that one), get in touch with Cheedoong. (His contact information is on his web site).

Parallel Programming: Temporary Home

Due to some unscheduled downtime, perfbook's usual home is unavailable. I have therefore posted a recent PDF of perfbook here.

Parallel Programming: August 2011 Update

This release of the parallel programming book features a new data-ownership chapter, the start of a new validation chapter, and lots of fixes, primarily from Elie De Brauwer.

I expect to put out the next release in December. I will continue on validation, and hopefully get a few other things accomplished.

As always, git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git will be updated in real time.

Parallel Programming: May 2011 Update

Once again, a big thank you for everyone who contributed to the parallel-programming book. This edition includes:

  1. Nicer fonts, courtesy of Tom Gunderson.
  2. Front and back cover art, courtesy of Melissa Broussard.
  3. Greatly expanded locking chapter (though this chapter never really will be complete).
  4. An appendix covering recent history of RCU in the Linux kernel.
  5. A section introducing RCU concepts as preparation for RCU fundamentals.
  6. A section covering seqlock.
  7. A conflicting-views section covering CPU evolution.

I expect to put out the next update in August. My guess is that I will focus on data ownership and perhaps a bit on data structures, but I do reserve the right to change plans at any point. ;-)

As always, git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git will be updated in real time.

Parallel Programming: Update

Thanks to everyone who provided comments, criticisms, and, most of all, patches to the parallel-programming book announced last month. Several of the contributions were quite substantial, particularly a couple that fixed my extensive abuse of the \url{} LaTeX command. A number of the comments had to do with format, and so the new version has a number of improvements:

  1. Embedded fonts, which make the PDF compatible with various printing services.
  2. PDF hyperlinks, both internal and external.
  3. A set of PDF bookmarks corresponding to the table of contents.
  4. An experimental single-column version.
  5. A tarball of an experimental HTML version.

A number of people convinced me to move from latex to pdflatex, which I should have done years ago. Quarter-century-old habits die hard, I guess.

I expect to release updates every three to six months. However, for the impatient, git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git will be updated in real time, as always.

Parallel Programming: What Ted Said!!!

Ted gave a great presentation on improving ext4's performance and scalability, but the best part was his call for people to document which lock protects each variable and structure field. This would be a great step forward, even given that this information would often be outdated. You would at least know what the intended locking design was some time in the past. ;-)
I presented “Is Parallel Programming Hard, And, If So, Why?” at the Multicore and Parallel Computing Miniconf, and had the good fortune of having Linus in the audience. As you will quickly learn if you post to LKML, Linus often gives very insightful (and sometimes quite pointed) feedback. My presentation was no exception.

Linus noted that many parallel programming experts are unwilling to admit that there are algorithms that do not parallelize nicely.

As you might expect, I responded by saying that parallelism is an optimization, and like other optimizations has places where it works well and places where it does not.

What else could I have said?

Parallel Programming: Announcement

As some of you know, I have been working on a book on parallel programming. My thought had been to complete it, then announce it. But I finally realized that it really never will be complete, at least not as long as people keep coming up with new parallel-programming ideas, which I most definitely hope will continue for a very long time.

So, here it is!

Parallel Programming: The Summing Up

So what can we say about parallel programming? The many answers to this question fill many a book, but the following points seem especially worth emphasizing:

  1. Parallelism is universal and intuitive. It is programming, whether parallel or not, that is counter-intuitive.
  2. The primary cause of the Great Multicore Software Crisis is the new-found low price and large supply of shared-memory parallel systems. Many lessons can be learned from the Great Software Crisis of the 1970s and 1980s, which had similar economic underpinnings.
  3. “Embarrassing parallelism” is really not an embarrassment at all, except perhaps to those who thoughtlessly use this as a derogatory term. To the rest of us, it is truly an embarrassment of riches. Where they apply, trivial solutions are indeed a wonderful thing.
  4. Parallel programmers can be trained, and the conditions for training them have improved greatly, and continue to improve. However, in parallel programming, raw intelligence without experience or expert guidance is almost always a negative: The more intelligent you are, the deeper a hole you will dig for yourself before realizing that you are in trouble.
  5. During the decades when parallel systems were rare and expensive, Darwinian selection favored projects whose leaders were afraid of or repulsed by parallelism. This fear and revulsion must be overcome if such projects are to be successful on multicore systems.
  6. Sequential software is likely to have made design and coding choices that, while eminently reasonable for a sequential program, are completely inappropriate for parallel systems. These historical choices must be dealt with if the software is to be converted to run multithreaded.
  7. Many of the complications that parallelism introduces are global in nature. For example, the root cause of a deadlock might be distributed widely throughout the parallel program. People debugging parallel programs therefore have a greater than usual need to view the source code for the entire application.
  8. They say that you get what you measure. Therefore, be very careful what benchmarks you use when evaluating multicore systems. Of course, the same caution applies when evaluating single-CPU systems.
  9. Although Amdahl's law is useful, it measures only scalability. Raw performance is, if anything, more important than is scalability.
  10. They say that a developer is only as good as his or her tools. This is as true of parallel programming as it is of sequential programming.
  11. Multithreaded programs place added stress on testing and validation. Interestingly enough, running parallel programs on systems containing CPUs running at different frequencies can cause certain kinds of race conditions to occur more frequently. So, if you choose to develop multithreaded software, beef up your validation efforts accordingly.
  12. Beware the expert who claims something is impossible. Perhaps it is in fact impossible for him or her, but that does not mean that it is impossible for everyone.
  13. Creating a new artifact requires a different set of skills than does maintaining and operating an existing artifact. But parallelizing an existing artifact is more an act of creation than of maintenance or operation. However, there are some common special cases that can easily be parallelized by operations and maintenance personnel.
  14. Just as the Great Software Crisis of the 1970s and 1980s spawned the good, the fad, and the ugly, so too will the current Great Multicore Software Crisis. But never forget that the definition of “good” is something that delivers either order-of-magnitude improvements in productivity or order-of-magnitude improvements in the size of population able to make good use of multicore systems. Preferably both.
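On point 9, it may help to recall Amdahl's law in its usual form, which makes clear why it speaks only to scalability:

```latex
% Amdahl's law: speedup S(n) from n CPUs, given that a fraction p
% of the work parallelizes perfectly and (1 - p) remains serial.
S(n) = \frac{1}{(1 - p) + p/n}
```

As n grows without bound, S(n) approaches 1/(1 - p): the serial fraction caps scalability. But nothing in this formula says anything about the program's raw single-CPU performance, which is why scalability alone is an incomplete measure.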

But how can parallel software ever hope to keep up with the exponential growth in the number of cores?
Back in the Heeding History posting in this series, I argued that the solutions put forward to deal with the Great Software Crisis of the 1970s and 1980s could be grouped into the good, the fad, and the ugly. Is a similar grouping appropriate for the present-day Great Multicore Software Crisis?

Perhaps a review of the definition of “good” would be helpful: a good multicore programming language will (1) increase the productivity of existing multicore developers by orders of magnitude, (2) increase the fraction of the population who can make good use of multicore/parallel systems, again by orders of magnitude, or preferably (3) both. Yes, this is a high bar, but this is the bar that must be cleared.

The programming languages in the “ugly” category are quite apparent: MPI, OpenMP, C/C++ with POSIX pthreads or fork/wait, Bourne shell and relatives (don't forget the & operator and the wait command!), and numerous other sturdy indefensibles. Some say that these languages should have been strangled at birth, but on the other hand they are always there when you need them.
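For those who have forgotten just how sturdy the Bourne shell's parallelism is, here is a minimal sketch: the & operator launches jobs in the background, and wait blocks until they have all completed. The job names are of course made up for the example:

```shell
#!/bin/sh

# A stand-in for some independent unit of work.
run_job() {
	sleep 1
	echo "job $1 done"
}

run_job one &	# '&' runs the job in the background
run_job two &	# this job runs concurrently with the first
wait		# block until all background jobs complete
echo "all jobs done"
```

Because the two jobs sleep concurrently, the script finishes in about one second rather than two, which is the whole point of the & operator.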

It is too early to tell exactly which of the much-hyped multicore programming languages and environments will fall into the “fad” category, but it seems clear that a great many of them will. If you don't like my attitude, feel free to protest all you like. However, I have heard it all long ago from the many and vociferous proponents of such wondrous gems as PASCAL, Modula, Eiffel, ADA, and many others besides. Rest assured that your protestations sound very much like theirs did. And if you really don't like my attitude, the only correct response is to prove me wrong. Just keep the definition of “good” clearly in mind, and also mind the gap between research prototypes and production-quality software.

What if you cannot see that there is a gap between good research and solid practice?

And just what are the good multicore parallel programming languages?