

HTM is interesting, but what about hardware support for locks? Since the main performance problem with locking seems to be cache misses under contention, I wonder what would happen if a multicore chip devoted some of its gates to a reasonable number (32K?) of shared 1-bit registers, along with instructions for read, write and test-and-set (TAS), so that locks could be implemented without the overhead of memory latency. The OS would need to allocate a register to each lock at initialisation, or fall back to a traditional memory-based lock if it runs out of registers. With cheaper locks, though, many structures could be locked at coarser granularity, so there should actually be significantly fewer locks.
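To make the idea concrete, here is a rough sketch in C of how the OS side might look. No such instructions exist as far as I know, so the register file is emulated with an ordinary array of C11 atomics, and every name here (`lock_regs`, `hw_lock`, `hw_lock_init` and so on) is made up for illustration. The emulation still goes through memory, so it shows only the allocation and fallback logic, not the hoped-for latency win.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_LOCK_REGS 32768   /* the "32K?" guess from above */

/* Software stand-in for the proposed on-chip 1-bit registers.
 * Static atomics are zero-initialised, i.e. every "register" starts clear. */
static atomic_bool lock_regs[NUM_LOCK_REGS];

/* Next unallocated register; a simple bump allocator with no freeing. */
static atomic_int next_reg;

typedef struct {
    int reg;               /* register index, or -1 if we ran out */
    atomic_bool mem_lock;  /* traditional memory-based fallback lock */
} hw_lock;

/* Allocate a register to the lock, or fall back to memory if none is left. */
void hw_lock_init(hw_lock *l)
{
    int r = atomic_fetch_add(&next_reg, 1);
    l->reg = (r >= 0 && r < NUM_LOCK_REGS) ? r : -1;
    atomic_init(&l->mem_lock, false);
}

/* Pick the bit that backs this lock: a "register" or the fallback word. */
static atomic_bool *lock_bit(hw_lock *l)
{
    return l->reg >= 0 ? &lock_regs[l->reg] : &l->mem_lock;
}

/* Emulated TAS: set the bit and report whether it was previously clear. */
bool hw_lock_try(hw_lock *l)
{
    return !atomic_exchange_explicit(lock_bit(l), true, memory_order_acquire);
}

/* Spin until the TAS succeeds. */
void hw_lock_acquire(hw_lock *l)
{
    while (!hw_lock_try(l))
        ;
}

/* Emulated register write: clear the bit to release the lock. */
void hw_lock_release(hw_lock *l)
{
    assert(atomic_load(lock_bit(l)));   /* debug check: lock must be held */
    atomic_store_explicit(lock_bit(l), false, memory_order_release);
}
```

A caller would run `hw_lock_init(&l)` once, then bracket the critical section with `hw_lock_acquire(&l)` and `hw_lock_release(&l)`. In this software version the packed array would suffer badly from false sharing; the point of the proposal is that real registers would not live in cache lines at all.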

Has anyone done this? If not, why not? The circuitry would be essentially identical to that which maintains the cache lines.