Until HTM implementations provide some sort of forward-progress guarantee, HTM will be limited by its fallback code sequences. For example, if the fallback code uses locking, then HTM will be just as vulnerable to deadlock as is the fallback code.
And during the past year, the IBM Mainframe announced an HTM implementation that includes constrained transactions in addition to the usual best-effort HTM implementation. A constrained transaction starts with the
tbegincinstruction instead of the
tbegininstruction that is used for best-effort transactions. Constrained transactions are guaranteed to always complete (eventually), so if a transaction aborts, rather than branching to a fallback path (as is done for best-effort transactions), the hardware instead restarts the transaction at the
The Mainframe architects needed to take extreme measures to deliver on this forward-progress guarantee. If a given constrained transaction repeatedly fails, the CPU might disable branch prediction, force in-order execution, and even completely disable pipelining. If the repeated failures are due to high contention, the CPU might disable speculative fetches, introduce random delays, and even serialize execution of the conflicting CPUs. “Interesting” forward-progress scenarios involve as few as two CPUs or as many as one hundred CPUs. Perhaps these extreme measures provide some insight as to why other CPUs have thus far refrained from offering constrained transactions.
As the name implies, constrained transactions are in fact severely constrained:
- The maximum data footprint is four blocks of memory, where each block can be no larger than 32 bytes.
- The maximum code footprint is 256 bytes.
- If a given 4K page contains a constrained transaction’s code, then that page may not contain that transaction’s data.
- The maximum number of assembly instructions that may be executed is 32.
- Backwards branches are forbidden.
Nevertheless, these constraints support a number of important data structures, including linked lists, stacks, queues, and arrays. Perhaps HTM with constrained transactions will eventually become an important tool in the parallel programmer's toolbox.