[Day43] Read Rust Atomics and Locks - Read-Modify-Write in the x86 Processor
by Mara Bos
At Topic: Chapter 7 Read-Modify-Write Operations
Recall
Compare-and-Exchange Operations
- Compare-and-Exchange checks if the atomic value is equal to a given value, and only if that is the case does it replace it with a new value (as a single operation)
compare_exchange_weak
: Similar tocompare_exchange
, but the difference is that the weak version may still sometimes leave the value untouched and return an Err, even though the atomic value matched the expected value.- We could use Compare-and-Exchange to implement all the other atomic operations by putting it in a loop to retry if it did change.
// source: https://doc.rust-lang.org/std/sync/atomic/struct.AtomicUsize.html#examples-11 use std::sync::atomic::{AtomicUsize, Ordering}; let some_var = AtomicUsize::new(5); // some_var compares to the first argument (5) and if it is equal, it will be replaced by the second argument (10) assert_eq!(some_var.compare_exchange(5, 10, Ordering::Acquire, Ordering::Relaxed), Ok(5)); assert_eq!(some_var.load(Ordering::Relaxed), 10);
Notes
An example of read-modify-write operation in x86-64 and ARM64:
// Rust code pub fn a(x: &mut i32) { *x += 10; }
=>
// Compiled x86-64
a:
add dword ptr [rdi], 10
ret
// Compiled ARM64
a:
ldr w8, [x0]
add w8, w8, #10
str w8, [x0]
ret
- While loading and storing happens in different steps in
ARM64
, it is clear that the instructions are not atomic. x86-64
'sadd
instruction will be separated by the process into several micro-instructions behind the scenes, so it is not atomic as well.
If instructions run on a single core, the atomic issue is not a problem, as switching a processor core between threads generally only happens between instructions. But if multiple cores are involved, it is crucial to consider the multiple steps of a read-modify-write operation.
x86
systems provide an instruction prefix called lock
as a modifier to make instructions like add
atomic.
a:
lock add dword ptr [rdi], 10
ret
- In modern processors, The
lock
prefix blocks the using memory while allowing other cores to operate on unrelated memory. - The
lock
prefix can only be applied to a few instructions, includingadd
,sub
,and
,not
,or
, andxor
. - The
xchg
instruction (exchange), used for atomic swaps, implicitly behaves as if it has a lock prefix, ensuring it is always atomic.
A fetch_add
example in x86-64
:
pub fn a(x: &AtomicI32) -> i32 { x.fetch_add(10, Relaxed) }
=>
a:
mov eax, 10
lock xadd dword ptr [rdi], eax
ret
Instead of the add
instruction, the xadd
(exchange and add) instruction is used, as it places the original value of x
into a register (eax
) before performing the addition.
add
can provide a little of useful to next instructions though, such as whether the updated value was zero or negative.
xadd
(O),xchf
(O),xsub
(X),xand
(X),xor
(X)- For subtraction, this isn't problematic since
xadd
can be used with negative values. - For
and
,or
andxor
(Bitwise Operations) =>bts
(bit test and set),btr
(bit test and reset), andbtc
(bit test and complement) - When operations affect more than one bit, they cannot be represented by a single
x86-64
instruction. - The
x86-64
architecture supports a lock-prefixablecmpxchg
(compare and exchange) instruction, which can be used to implement atomic operations that are not directly supported by thex86-64
instruction such asfetch_or
:
pub fn a(x: &AtomicI32) -> i32 { x.fetch_or(10, Relaxed) }
=>
a:
mov eax, dword ptr [rdi] ; Load the value from the atomic variable (pointed to by rdi) into eax
.L1:
mov ecx, eax ; Copy the value in eax to ecx (prepare for OR operation)
or ecx, 10 ; Perform a bitwise OR on ecx with 10, storing the result in ecx
lock cmpxchg dword ptr [rdi], ecx ; Atomically compare and exchange:
; - Compare the value in eax with the value at the memory address rdi
; - If they are equal, store ecx (new value) at eax
; - If not, store the current value at rdi in eax
jne .L1 ; If the comparison failed (values were not equal), jump back to .L1 to retry
ret ; Return from the function, with the original value in eax
On
x86-64
, there is no difference betweencompare_exchange
andcompare_exchange_weak
. Both compile down to alock cmpxchg
instruction.