Related technologies

This page will host FastFlow related technologies and works as well as new ideas we are working on.

Sequential Consistency and Lock-Free synchronizations:Lamport's Bakery Algorithm

It is possible to implement mutual exclusion without special atomic machine instructions like CAS or LL/SC.

One of the most famous lock-free algorithm that uses only read and write operations to ensure mutual exclusion among N processes, is due to Leslie Lamport and is known as Bakery Algorithm (see the page on wikipedia).

The algorithm is elegant and apparently very simple. It is based on the policy that is sometimes used in a bakery. Upon entering the bakery a customer gets a number which is greater than the numbers of other customers that are waiting for service. The holder of the lowest number is the next to be served.

The Bakery algorithm satisfies FIFO ordering and is wait-free, but as the numbers can grow without bound, its implementation uses unbounded size registers (this might be a nonissue if we use a 64-bit counter). Gadi Taubenfeld proposed the Black-White Bakery Algorithm (link), which preserve the simplicity and elegance of the original algorithm and bounds the amount of space required by coloring the numbers (using only one additional shared bit).

The Bakery Algorithm works correctly under Sequential Consistency (SC) memory model. The SC memory model was formally defined by Lamport as follows:

A multiprocessor system is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.

As reported in ”Shared Memory Consistency Models: A Tutorial” by Sarita V. Adve and Kourosh Gharachorloo link, “There are two aspects to sequential consistency: (1) maintaining program order among operations from individual processors, and (2) maintaining a single sequential order among operations from all processors. The latter aspect makes it appear as if a memory operation executes atomically or instantaneously with respect to other memory operations.”

Unfortunately, it is costly in term of performance to implement SC because prevents widely used compiler optimizations based on memory instruction reordering. Furthermore, modern pipeline microprocessors feature out-of-order instructions execution to maxymize CPU utilization. So, in most modern multiprocessor/multicore architectures, memory reads and writes are not sequentially consistent.

When the programmes need SC memory model, they must enforce it explicitly by using architecture's special instruction called memory barriers (or memory fences). Such special instructions tell the processor to immediatly propagate updates to and from memory hierarchy in order to guarantee that “every load and store instruction that precedes in program order the fence instruction is globally visible before any load or store instruction that follows the fence instruction is globally visible”.

In the paper ”Consistency Requirements of Distributed Shared Memory for Lamport's Bakery Algorithm for Mutual Exclusion” by Jerzy Brzezinski and Dariusz Wawrzyniak link, has been proved that the Bakery algorithm is correct (without any explicit synchronisation), provided that write operations to the array “choosing” are sequentially consistent, and write operations to the array “number” are at least PRAM consistent. So the paper states that sequential consistency is not necessary for all memory operations but only for some of them. It is worth to say that PRAM consistency is a quite relaxed memory model, in particular much weaker than TSO-like models often used in modern multiprocessors.

The Bakery Algorithm is not used in practice, not only because it does not work on modern architectures that exploit relaxed memory consistency model (like TSO memory models), but also because it requires to read and write N distict memory locations where N is the maximum number of concurrent processes.

Maurice Herlihy and Nir Shavit in "The Art Of Multiprocessor Programming" proved that “any deadlock-free Lock algorithm requires allocating and then reading or writing at least N distinct locations in the worst case. This result is crucially important, because it motivates us to add to our multiprocessor machines, synchronization operations stronger than read-write, and use them as the basis of our mutual exclusion algorithms.”

We have implemented the Bakery Algorithm under Linux using GNU g++; the code can be found here.

For any comments, errors or clarifications please drop a mail to Massimo Torquati torquati AT di (dot) unipi (dot) it.

  • Bookmark at
  • Bookmark "Related technologies" at del.icio.us
  • Bookmark "Related technologies" at Digg
  • Bookmark "Related technologies" at Furl
  • Bookmark "Related technologies" at Reddit
  • Bookmark "Related technologies" at Ask
  • Bookmark "Related technologies" at BlinkList
  • Bookmark "Related technologies" at blogmarks
  • Bookmark "Related technologies" at Google
  • Bookmark "Related technologies" at Technorati
  • Bookmark "Related technologies" at Twitter
  • Bookmark "Related technologies" at Slashdot
 
ffnamespace/related.txt · Last modified: 2010/09/23 20:02 by aldinuc · [Old revisions]
Recent changes RSS feed Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki