When we're byte-logging we store a third word, the mask, and use it in the isValid() operation. The value is stored in masked form; this costs one extra operation during single-threaded execution, but saves us a masking operation during every validation.
inline bool stm::ByteLoggingValueListEntry::isValid() const
When we're dealing with byte-granularity we need to check values on a per-byte basis.
We believe that this implementation is safe because the logged address is always word-aligned; promoting a subword load to an aligned word load followed by a masking operation therefore will not cause any undesired hardware behavior (page fault, etc.).
We're also assuming that the masking operation renders any potential "low-level" race that we introduce immaterial; this may or may not be safe in C++1X. As an example, suppose someone is nontransactionally writing the first byte of a word while we are transactionally reading the second byte. There is no language-level race; however, when we promote the transactional byte read to a word read, we read the same location the nontransactional access is writing, with no intervening synchronization. We are safe from bad behavior because of the atomicity of word-level accesses, and because we mask out the first byte, the racing part of the read is dead. There are no executions in which the source program can observe the race, so we conclude that it is race-free.
I don't know if this argument is valid, but it certainly holds for now, since C/C++ does not yet have a memory model.
If this becomes a problem, we can switch to an approach that loops over individual bytes whenever mask != ~0x0.