os(atomic): add portable pointer-width CAS __os_atomic_cas_ptr#33

Merged
gburd merged 1 commit into master from perf/read-path on Jun 25, 2026

@gburd gburd commented Jun 25, 2026

Collaborator

What

Add a pointer-width compare-and-swap, __os_atomic_cas_ptr, to the atomic
layer. Until now the layer exposed only 32-bit and 64-bit integer CAS.

How

  • Native implementations in the GCC __atomic and legacy __sync builtin
    tiers, expressed over intptr_t. Integer atomics generate more uniform code
    than pointer-typed C11 atomics, and the round trip through intptr_t
    preserves the value of object pointers.
  • A generic fallback covers the remaining tiers: where the tier has a sized
    CAS, it uses the 64-bit CAS when pointers are 8 bytes and the 32-bit CAS
    otherwise; when no native atomics are configured, it serializes on the
    atomic-pool mutex used by the mutex-emulated atomics; in the
    single-threaded build, it is a plain store.
  • Prototypes added to os_ext.h under the matching tier guards.
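
For readers unfamiliar with the intptr_t approach, here is a minimal sketch of what the native __atomic tier could look like. It is not the code from this PR: the names sketch_ptr_slot, sketch_cas_ptr and sketch_load_ptr are hypothetical, the real signature of __os_atomic_cas_ptr is not shown here, and sequentially consistent ordering is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot type: the pointer is held as a pointer-width integer. */
typedef struct { intptr_t bits; } sketch_ptr_slot;

/* The intptr_t round trip is only useful if the widths agree. */
_Static_assert(sizeof(intptr_t) == sizeof(void *),
    "intptr_t must be pointer-width");

/*
 * Hypothetical stand-in for a pointer-width CAS (not the PR's signature).
 * Returns 1 if the slot held `expected` and now holds `desired`, else 0.
 * Memory ordering (SEQ_CST) is an assumption of this sketch.
 */
static int
sketch_cas_ptr(sketch_ptr_slot *slot, void *expected, void *desired)
{
	intptr_t want = (intptr_t)expected;

	return (__atomic_compare_exchange_n(&slot->bits, &want,
	    (intptr_t)desired, 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? 1 : 0);
}

/* Atomically read the slot back as a pointer. */
static void *
sketch_load_ptr(sketch_ptr_slot *slot)
{
	return ((void *)__atomic_load_n(&slot->bits, __ATOMIC_SEQ_CST));
}

/* Single-threaded self-check of the success and failure paths. */
static int
sketch_cas_ptr_selfcheck(void)
{
	int a = 1, b = 2;
	sketch_ptr_slot slot = { (intptr_t)(void *)&a };

	assert(sketch_cas_ptr(&slot, &a, &b) == 1);	/* matches: swapped */
	assert(sketch_load_ptr(&slot) == (void *)&b);
	assert(sketch_cas_ptr(&slot, &a, &a) == 0);	/* stale: unchanged */
	assert(sketch_load_ptr(&slot) == (void *)&b);
	return (1);
}
```

Keeping the slot typed as intptr_t, rather than casting a void ** at the call site, sidesteps aliasing questions in the sketch; whether the real primitive does the same is not stated in this PR.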

Status

No current caller. This lands the primitive as the substrate that lock-free
and RCU data structures (Treiber stacks, epoch reclamation) in the multi-core
scaling work will build on. It depends on real hardware atomics, which #32
made available on modern toolchains.
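
As an illustration of the kind of caller this primitive is meant for, the following is a rough sketch of a Treiber stack built on a pointer-width CAS. Everything here is hypothetical (tnode, tstack, tstack_cas_ptr and friends are names invented for the example, and the CAS helper is a local stand-in, not __os_atomic_cas_ptr). The pop path is deliberately naive: without a reclamation scheme such as the epoch reclamation mentioned above, it is exposed to ABA and use-after-free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tnode {
	struct tnode *next;
	int value;
};

/* Top-of-stack pointer, held as a pointer-width integer. */
struct tstack {
	intptr_t head;
};

/* Local stand-in for a pointer-width CAS; returns 1 on success, else 0. */
static int
tstack_cas_ptr(intptr_t *slot, void *expected, void *desired)
{
	intptr_t want = (intptr_t)expected;

	return (__atomic_compare_exchange_n(slot, &want, (intptr_t)desired,
	    0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? 1 : 0);
}

/* Atomically read the current top of the stack. */
static struct tnode *
tstack_top(struct tstack *s)
{
	return ((struct tnode *)(void *)__atomic_load_n(&s->head,
	    __ATOMIC_SEQ_CST));
}

/* Push: link the node to the observed top, retry if the top moved. */
static void
tstack_push(struct tstack *s, struct tnode *n)
{
	struct tnode *top;

	do {
		top = tstack_top(s);
		n->next = top;
	} while (!tstack_cas_ptr(&s->head, top, n));
}

/*
 * Pop: swing head to top->next. Reading top->next is only safe if `top`
 * cannot be freed or reused concurrently, which this sketch does not
 * guarantee; a reclamation scheme would have to provide that.
 */
static struct tnode *
tstack_pop(struct tstack *s)
{
	struct tnode *top;

	do {
		top = tstack_top(s);
		if (top == NULL)
			return (NULL);
	} while (!tstack_cas_ptr(&s->head, top, top->next));
	return (top);
}

/* Single-threaded self-check: LIFO order, then empty. */
static int
tstack_selfcheck(void)
{
	struct tstack s = { (intptr_t)(void *)NULL };
	struct tnode a = { NULL, 1 }, b = { NULL, 2 };

	tstack_push(&s, &a);
	tstack_push(&s, &b);
	assert(tstack_pop(&s) == &b);
	assert(tstack_pop(&s) == &a);
	assert(tstack_pop(&s) == NULL);
	return (1);
}
```

The self-check only exercises the single-threaded path; it says nothing about behavior under contention.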

Verification

Builds clean on top of current master (meson, macOS clang, GCC __atomic
tier); __os_atomic_cas_ptr is present as a native text symbol. On x86_64 it
compiles to a single lock cmpxchg.

Add a pointer-width compare-and-swap to the atomic layer, which until now
exposed only 32-bit and 64-bit integer CAS.  Native implementations in the
GCC __atomic and legacy __sync builtin tiers (over intptr_t, whose codegen
is more uniform than pointer-typed C11 atomics and is value-preserving for
object pointers).  A generic fallback synthesizes it from the tier's sized
CAS (64-bit when pointers are 8 bytes, else 32-bit), from the atomic-pool
mutex used by the mutex-emulated atomics when there are no native atomics,
or from a plain store in the single-threaded build.

This is foundational for lock-free/RCU data structures (Treiber stacks,
epoch reclamation) used in the multi-core scaling work.  No current caller;
added as the substrate the RCU-for-shared-structures track builds on.
@gburd gburd merged commit 8a39a7d into master Jun 25, 2026
36 of 39 checks passed
@gburd gburd deleted the perf/read-path branch June 25, 2026 11:58