fix(tsan): suppress false-positive race from VA reuse in LinearAllocator#593
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8eda1b6039
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
CI Test ResultsRun: #27410334436 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled Summary: Total: 32 | Passed: 32 | Failed: 0 Updated: 2026-06-12 10:49:17 UTC |
4edbf2b to
402ea86
Compare
This comment has been minimized.
This comment has been minimized.
402ea86 to
9062753
Compare
9062753 to
a31cc4a
Compare
…AD__ The clang CI sanitizer build does not define __SANITIZE_THREAD__, so every TSan annotation guarded on it was dead code. Add a TSAN_ENABLED macro in common.h (clang __has_feature + gcc __SANITIZE_THREAD__) and migrate all sites. Also assert the mmap MAP_FIXED re-map, fix stale comments, and drop the stale write_range spec. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR aims to eliminate ThreadSanitizer (TSan) false-positive data race reports caused by virtual address reuse after munmap/mmap in LinearAllocator, and to standardize sanitizer detection across toolchains via common.h.
Changes:
- Introduces toolchain-agnostic sanitizer detection macros (
ASAN_ENABLED/TSAN_ENABLED) incommon.hand updates call sites to use them. - Updates
LinearAllocator’s TSan-specific handling for freshlysafeAlloc’d chunks to prevent stale-shadow false positives on reused virtual addresses. - Adjusts test and gtest crash-handler conditional compilation to use
TSAN_ENABLED.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| ddprof-lib/src/main/cpp/common.h | Adds uniform sanitizer detection macros for clang/gcc. |
| ddprof-lib/src/main/cpp/linearAllocator.cpp | Switches TSan gating to TSAN_ENABLED and changes the TSan VA-reuse mitigation in allocateChunk(). |
| ddprof-lib/src/main/cpp/gtest_crash_handler.h | Uses TSAN_ENABLED to disable custom handlers under TSan. |
| ddprof-lib/src/test/cpp/ddprof_ut.cpp | Uses TSAN_ENABLED to skip the fork-based test under TSan. |
| ddprof-lib/src/test/cpp/stress_callTraceStorage.cpp | Includes common.h and uses TSAN_ENABLED to gate a TSan-only regression test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Replace assert() with an unconditional abort on mmap(MAP_FIXED) failure so the failure is loud even under NDEBUG, and reword the comment so it no longer implies TSAN_ENABLED is gtest-scoped (it tracks compiler sanitizer detection; the test-only property comes from the build). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What does this PR do?:
Re-map each freshly-allocated
LinearAllocatorchunk's virtual address through the libcmmap()wrapper withMAP_FIXED(TSan builds only) inLinearAllocator::allocateChunk, so TSan's interceptor clears stale shadow state left over from virtual-address reuse aftermmap/munmap.Motivation:
The stress test
ConcurrentExpansionAndCollectStressTestsurfaced a TSan data race report atlinearAllocator.cpp:237(chunk->prev = current). The race is a false positive caused by OS VA reuse: whenOS::safeFree(munmap) frees a chunk andOS::safeAlloc(mmap) later returns the same VA to a different allocator, TSan's shadow memory for that VA still holds the old access history.OS::safeAllocuses a rawmmapsyscall that bypasses TSan's interceptors by design. TSan's interceptors normally callMemoryRangeImitateWriteon mapped ranges, setting all 4 shadow slots for the range to the current thread's write and erasing stale entries; without interception, stale shadow persists across VA reuse → false-positive races.An earlier attempt used
__tsan_acquire(chunk), but that only establishes happens-before at the header address (VA+0), not user-data addresses (VA+n), and__tsan_reset_rangedoes not exist in the TSan API. The working fix re-maps the same VA through the libcmmap()wrapper withMAP_FIXEDafter the rawsafeAlloc; TSan intercepts that call and clears the stale shadow.OS::safeAllocitself is unchanged, so the production allocation path does not diverge.The
__tsan_release(chunk)after writing the header is retained so thatfreeChunks/detachChunks'__tsan_acquirecalls can establish happens-before when readingchunk->prev.Additional Notes:
The
mmap(MAP_FIXED)re-map and TSan interceptor are not async-signal-safe, so the code is gated onTSAN_ENABLEDand relies on the build applying-fsanitize=threadonly to the isolated gtest binaries (never the JVM-loaded shared library). A failedMAP_FIXEDaborts unconditionally (not viaassert) so it cannot silently leave a hole under NDEBUG. Observed in the reliability/chaos CI run on PR #589; not related to that PR's changes.How to test the change?:
Run the TSan build stress test:
./gradlew :ddprof-lib:testTsan --tests "*ConcurrentExpansionAndCollectStressTest*"— the reported race atlinearAllocator.cpp:237should no longer appear.For Datadog employees: