Potentially infinite loop and store trace in unowned slot in CallTraceHashTable::putWithExistingId() #578

Open
zhengyu123 wants to merge 5 commits into main from zgu/calltrace_storage

Conversation

@zhengyu123

@zhengyu123 zhengyu123 commented Jun 5, 2026

Contributor

What does this PR do?:
This PR fixes a corner case that may result in an infinite loop and/or in storing a trace in an unowned slot.

Motivation:
Improve stability.

Additional Notes:

How to test the change?:

  • Regular CI tests
  • New test cases for this corner case.

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.
  • JIRA: PROF-14915

Unsure? Have a question? Request a review!

@zhengyu123 zhengyu123 requested a review from a team as a code owner June 5, 2026 00:39
@zhengyu123 zhengyu123 marked this pull request as draft June 5, 2026 00:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 699ab3f1e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread ddprof-lib/src/main/cpp/callTraceHashTable.cpp
@dd-octo-sts

dd-octo-sts Bot commented Jun 5, 2026

Contributor

CI Test Results

Run: #27231185436 | Commit: aaf8ef2 | Duration: 12m 29s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 32 | Failed: 0


Updated: 2026-06-09 19:57:25 UTC

@zhengyu123 zhengyu123 marked this pull request as ready for review June 5, 2026 14:02

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53f193d0df

Comment thread ddprof-lib/src/test/cpp/test_callTraceStorage.cpp
@zhengyu123 zhengyu123 requested a review from jbachorik June 5, 2026 16:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9533310501

std::this_thread::sleep_for(std::chrono::milliseconds(50));
}

bool ok = completed.load();

[P2] Avoid detaching the timed-out worker

If this timeout fires, the detached worker can keep running after the test returns while completed and other stack state captured by reference have been destroyed; if the regression is only very slow rather than permanently hung, the worker will later write through a dangling reference and can corrupt/crash subsequent tests instead of reporting a clean failure.
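One way to address this concern, sketched here under assumptions rather than taken from the PR: give the worker shared ownership of the completion flag instead of capturing stack state by reference. The helper name `runWithTimeout` and its signature are hypothetical; only `completed` and the polling `sleep_for` loop echo the quoted test code.

```cpp
#include <atomic>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical helper (not from the PR): runs `work` on a worker thread and
// waits up to `timeout` for it to finish.
// The completion flag lives on the heap and is co-owned by the worker, so a
// worker that finishes late never writes through a dangling reference.
template <typename Work>
bool runWithTimeout(Work work, std::chrono::milliseconds timeout) {
    auto completed = std::make_shared<std::atomic<bool>>(false);

    // Capture by value: the worker holds its own copy of `work` and a
    // shared_ptr that keeps the flag alive for as long as the worker runs.
    std::thread worker([completed, work]() mutable {
        work();
        completed->store(true, std::memory_order_release);
    });

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!completed->load(std::memory_order_acquire) &&
           std::chrono::steady_clock::now() < deadline) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    const bool ok = completed->load(std::memory_order_acquire);
    if (ok) {
        worker.join();    // normal path: reclaim the thread
    } else {
        worker.detach();  // timed out: worker only touches state it co-owns
    }
    return ok;
}
```

This only makes the flag safe. Any other state the worker touches (for example the table under test) would need the same treatment, typically by moving it into a `std::shared_ptr` captured by value.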


@datadog-prod-us1-6

datadog-prod-us1-6 Bot commented Jun 8, 2026


Pipelines

⚠️ Warnings

🚦 3 Pipeline jobs failed

  • CodeQL | Analyze (java-kotlin) (GitHub Actions)
  • DataDog/java-profiler | gtest-tsan-amd64 (GitLab)
  • DataDog/java-profiler | gtest-tsan-arm64 (GitLab)


Commit SHA: 239a008

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ba2856759


Comment on lines +264 to +266
if (size >= capacity * 3 / 4) {
    // Allocate new table with double capacity using LinearAllocator
    LongHashTable* new_table = LongHashTable::allocate(table, capacity * 2, &_allocator);

[P2] Avoid allocating replacement tables after expansion

When one thread has already swapped _table to an expanded table, other put() callers that loaded the old table can still reach this helper after incSize(). Since the new >= condition stays true for every later insert and the replacement table is allocated before the compare-exchange, each failed CAS leaves a full LongHashTable allocated in the LinearAllocator from the hot signal-handler path. Under concurrent high-cardinality traces once the old table is past 75% full, stale writers can burn megabytes per sample until the storage is cleared; recheck that _table is still this table before allocating, or only allocate on the threshold transition.
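The recheck suggested above could look roughly like the following. This is a minimal sketch, not the PR's code: `Table`, `Storage`, `maybeExpand`, and the `allocations` counter are hypothetical stand-ins, and plain `new`/`delete` replaces the LinearAllocator purely so the example is self-contained (neither would be appropriate in a signal handler).

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the real table type; only what the sketch needs.
struct Table {
    Table* prev;
    std::size_t capacity;
};

struct Storage {
    std::atomic<Table*> current{nullptr};
    std::atomic<std::size_t> allocations{0};  // illustration only

    // Called by a writer that inserted into `table` and observed `size` entries.
    // Returns true if this call installed a larger table.
    bool maybeExpand(Table* table, std::size_t size) {
        if (size < table->capacity * 3 / 4) {
            return false;  // below the 75% threshold
        }
        // Recheck before allocating: if another thread already swapped in a
        // larger table, `table` is stale and allocating would be pure waste.
        if (current.load(std::memory_order_acquire) != table) {
            return false;
        }
        Table* bigger = new Table{table, table->capacity * 2};
        allocations.fetch_add(1, std::memory_order_relaxed);

        Table* expected = table;
        if (current.compare_exchange_strong(expected, bigger,
                                            std::memory_order_acq_rel)) {
            return true;
        }
        // Lost the narrow race between the recheck and the CAS. This sketch can
        // free the table; an arena-style allocator generally could not.
        delete bigger;
        return false;
    }
};
```

The recheck does not eliminate wasted allocations, since two writers can still pass it at the same moment, but it confines the waste to that short window instead of every insert by a stale writer.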


@jbachorik

Collaborator

@copilot resolve the merge conflicts in this pull request

Copilot AI commented Jun 10, 2026

Contributor

@copilot resolve the merge conflicts in this pull request

Resolved the merge conflicts locally in commit 239a0084 (merged origin/main and reconciled ddprof-lib/src/test/cpp/test_callTraceStorage.cpp). The branch push was blocked by repository protection rules (GH013: Cannot update this protected ref), so please apply this commit through your normal protected-branch flow.



3 participants