Add process-tree-based LLM detection telemetry property #54223

Open
marcpopMSFT wants to merge 1 commit into main from marcpopMSFT-addprocesstreecheck

Conversation

@marcpopMSFT
Member

Add a new 'llm_process' telemetry common property that walks the process ancestor tree to detect known LLM process names. This complements the existing env-var-based 'llm' property, which has been unreliable because LLMs don't consistently set environment variables.

The new property is additive - both 'llm' and 'llm_process' are sent in every telemetry event so their detection rates can be compared.

Implementation details:

  • Cross-platform parent PID resolution: Windows uses existing ProcessExtensions.GetParentProcessId (CsWin32/NtQueryInformationProcess), Linux reads /proc/{pid}/stat, macOS shells out to ps
  • All approaches are AOT-compatible (no WMI/System.Management)
  • Max 20 ancestor traversal depth with visited set to prevent loops
  • All exceptions caught - telemetry never crashes the CLI
  • Known LLM processes: claude, cursor, code (vscode), windsurf, zed, gemini, codex, aider, goose, amp
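
The bounded ancestor walk described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's `LLMProcessTreeDetector`: the class name `ProcessTreeWalkSketch`, the `Detect` signature, and the two injected lookup delegates are all hypothetical, chosen so the loop can run without touching real processes. Stripping a trailing `.exe` and matching case-insensitively are also assumptions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch; names and signatures are not taken from the PR.
public static class ProcessTreeWalkSketch
{
    private const int MaxDepth = 20;

    // Process name -> reported identifier; "code" is reported as "vscode".
    private static readonly Dictionary<string, string> KnownNames =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["claude"] = "claude", ["cursor"] = "cursor", ["code"] = "vscode",
            ["windsurf"] = "windsurf", ["zed"] = "zed", ["gemini"] = "gemini",
            ["codex"] = "codex", ["aider"] = "aider", ["goose"] = "goose", ["amp"] = "amp",
        };

    // getParentPid returns a value <= 0 when the parent is unknown.
    // getName returns null when the process cannot be inspected.
    public static string? Detect(int startPid, Func<int, int> getParentPid, Func<int, string?> getName)
    {
        var visited = new HashSet<int> { startPid };
        var found = new List<string>();
        int pid = startPid;

        try
        {
            for (int depth = 0; depth < MaxDepth; depth++)
            {
                pid = getParentPid(pid);
                if (pid <= 0 || !visited.Add(pid))
                {
                    break; // reached the root, or a PID cycle was detected
                }

                string? name = getName(pid);
                if (name is null)
                {
                    continue; // unreadable ancestor: keep walking upward
                }

                if (name.EndsWith(".exe", StringComparison.OrdinalIgnoreCase))
                {
                    name = name[..^4];
                }

                if (KnownNames.TryGetValue(name, out string? id) && !found.Contains(id))
                {
                    found.Add(id);
                }
            }
        }
        catch
        {
            // Telemetry must never crash the caller; report what was found so far.
        }

        return found.Count > 0 ? string.Join(",", found) : null;
    }
}
```

Passing the lookups as delegates keeps the traversal logic testable with a fake process table, while the OS-specific parent PID resolution stays behind `getParentPid`.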

@marcpopMSFT
Member Author

marcpopMSFT commented May 7, 2026

This solution for walking the process tree seems a bit hacky to me. Not sure if there is a better way. Also, I tried testing locally, but I could only ever get telemetry to send for my manual execution of dotnet. For running in copilot CLI and VSCode, I was able to get the vscode process listed, but the copilot CLI just had null as it was a standard terminal window. I don't know if there's a better way to detect the copilot CLI, as that appears to be a gap in this logic.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new telemetry common property (llm_process) that detects likely LLM/assistant usage by walking the current process’s ancestor tree, complementing the existing env-var-based llm detection.

Changes:

  • Add ILLMProcessTreeDetector abstraction and LLMProcessTreeDetector implementation with OS-specific parent PID resolution.
  • Emit the new llm_process common telemetry property alongside existing common properties.
  • Add unit tests for the new property and document it in the telemetry reference.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `test/dotnet.Tests/TelemetryTests/TelemetryCommonPropertiesTests.cs` | Adds tests asserting llm_process exists and flows detector output through. |
| `src/Cli/dotnet/Telemetry/TelemetryCommonProperties.cs` | Adds llm_process to common properties and wires in the detector dependency. |
| `src/Cli/dotnet/Telemetry/LLMProcessTreeDetector.cs` | Implements process-tree traversal and OS-specific parent PID lookup. |
| `src/Cli/dotnet/Telemetry/ILLMProcessTreeDetector.cs` | Introduces a detector interface for process-tree-based LLM detection. |
| `documentation/project-docs/telemetry.md` | Documents the new LLM Process common property. |
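
For the Linux side of the OS-specific parent PID lookup, the awkward part is that the process name in `/proc/{pid}/stat` may itself contain spaces and parentheses. The sketch below isolates that parsing step as a pure function over the file contents so it can be tried on a sample line. The class and method names (`ProcStatSketch`, `ParsePpid`) are hypothetical, and the sample `stat` line is made up for illustration.

```csharp
using System;

// Hypothetical helper; it mirrors the "find the last ')'" approach only.
public static class ProcStatSketch
{
    // Returns the parent PID from the contents of /proc/{pid}/stat, or -1 if it cannot be parsed.
    public static int ParsePpid(string stat)
    {
        // The name field is wrapped in parentheses and may contain ')' itself,
        // so everything up to the LAST ')' is skipped.
        int lastParen = stat.LastIndexOf(')');
        if (lastParen < 0 || lastParen + 2 >= stat.Length)
        {
            return -1;
        }

        // After ") " the fields are: state, ppid, pgrp, ...
        string[] fields = stat[(lastParen + 2)..].Split(' ', StringSplitOptions.RemoveEmptyEntries);
        return fields.Length >= 2 && int.TryParse(fields[1], out int ppid) ? ppid : -1;
    }
}
```

With an invented line such as `4242 (my (odd) name) S 1337 4242 4242 0 -1`, the text after the last `)` is `S 1337 ...`, so the second field, `1337`, is taken as the parent PID.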

Comment on lines +167 to +170
ps.Start();

string output = ps.StandardOutput.ReadToEnd().Trim();
ps.WaitForExit(1000);
Comment on lines +162 to +166
ps.StartInfo.FileName = "ps";
ps.StartInfo.Arguments = $"-o ppid= -p {pid}";
ps.StartInfo.RedirectStandardOutput = true;
ps.StartInfo.UseShellExecute = false;
ps.StartInfo.CreateNoWindow = true;
Comment thread documentation/project-docs/telemetry.md Outdated
| **Docker Container** | Whether running in Docker container | `True` or `False` |
| **CI** | Whether running in CI environment | `True` or `False` |
| **LLM** | Detected LLM/assistant environment identifiers (comma-separated) | `claude`, `cursor`, `gemini`, `copilot`, `codex`, `aider`, `plandex`, `amp`, `qwen`, `droid`, `opencode`, `zed`, `kimi`, `openhands`, `goose`, `cline`, `roo`, `windsurf`, `generic_agent` |
| **LLM Process** | Detected LLM process names found in the process ancestor tree (comma-separated). Complements the env-var-based LLM property by walking the process tree to find known LLM processes. Known limitations: may produce false positives from shell-escape commands (`!dotnet build`) and false negatives when LLMs launch via intermediate terminal windows. | `claude`, `cursor`, `vscode`, `windsurf`, `zed`, `gemini`, `codex`, `aider`, `goose`, `amp` |
Comment on lines +104 to +183
    private static int GetParentProcessIdWindows(int pid)
    {
#if NET
        if (!OperatingSystem.IsWindows())
        {
            return -1;
        }
#endif
        try
        {
            using var process = Process.GetProcessById(pid);
            return Microsoft.DotNet.Cli.Utils.Extensions.ProcessExtensions.GetParentProcessId(process);
        }
        catch
        {
            return -1;
        }
    }

    private static int GetParentProcessIdLinux(int pid)
    {
        // Read /proc/{pid}/stat; field index 3 (0-based) is the ppid.
        try
        {
            string statPath = $"/proc/{pid}/stat";
            if (!File.Exists(statPath))
            {
                return -1;
            }

            string stat = File.ReadAllText(statPath);
            // The process name in field 2 may contain spaces/parens, so find the last ')' to skip past it.
            int lastParen = stat.LastIndexOf(')');
            if (lastParen < 0 || lastParen + 2 >= stat.Length)
            {
                return -1;
            }

            string[] fields = stat[(lastParen + 2)..].Split(' ', StringSplitOptions.RemoveEmptyEntries);
            // fields[0] = state, fields[1] = ppid
            if (fields.Length >= 2 && int.TryParse(fields[1], out int ppid))
            {
                return ppid;
            }

            return -1;
        }
        catch
        {
            return -1;
        }
    }

    private static int GetParentProcessIdMacOS(int pid)
    {
        try
        {
            using var ps = new Process();
            ps.StartInfo.FileName = "ps";
            ps.StartInfo.Arguments = $"-o ppid= -p {pid}";
            ps.StartInfo.RedirectStandardOutput = true;
            ps.StartInfo.UseShellExecute = false;
            ps.StartInfo.CreateNoWindow = true;
            ps.Start();

            string output = ps.StandardOutput.ReadToEnd().Trim();
            ps.WaitForExit(1000);

            if (int.TryParse(output, out int ppid))
            {
                return ppid;
            }

            return -1;
        }
        catch
        {
            return -1;
        }
    }
Member

For coverage purposes, each of these methods should have the SupportedOSPlatformAttribute set, so that trimming tooling can yell at us if we ever mess up using them on other platforms.
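
A minimal sketch of what that annotation could look like, assuming a .NET 5 or later target where `System.Runtime.Versioning.SupportedOSPlatformAttribute` and the `OperatingSystem.Is*` guards are available (the `#if NET` block in the diff suggests the real file may also build for other targets, which this sketch ignores). The class name `ParentPidDispatchSketch`, the dispatcher method, and the placeholder bodies are hypothetical; only the three per-OS method names come from the diff.

```csharp
using System;
using System.Runtime.Versioning;

// Hypothetical sketch; method bodies are placeholders, not the PR's implementations.
public static class ParentPidDispatchSketch
{
    public static int GetParentProcessId(int pid)
    {
        // The platform compatibility analyzer (CA1416) recognizes these guards,
        // so the annotated methods can be called here without warnings.
        if (OperatingSystem.IsWindows()) return GetParentProcessIdWindows(pid);
        if (OperatingSystem.IsLinux()) return GetParentProcessIdLinux(pid);
        if (OperatingSystem.IsMacOS()) return GetParentProcessIdMacOS(pid);
        return -1; // unsupported platform
    }

    [SupportedOSPlatform("windows")]
    private static int GetParentProcessIdWindows(int pid) => -1; // placeholder body

    [SupportedOSPlatform("linux")]
    private static int GetParentProcessIdLinux(int pid) => -1; // placeholder body

    [SupportedOSPlatform("macos")]
    private static int GetParentProcessIdMacOS(int pid) => -1; // placeholder body
}
```

With the attributes in place, calling one of the per-OS methods outside a matching guard should be flagged at build time instead of failing at run time on the wrong platform.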

@baronfel
Member

baronfel commented May 7, 2026

@marcpopMSFT when I tested with copilot just now, using a powershell script executed in pwsh as a proxy for 'some process attempting to walk its own process tree', here are the results I got:

│ 32448    pwsh.exe
│ 16112    copilot.exe
│ 21468    copilot.exe
│ 30712    node.exe
│ 17484    pwsh.exe
│ 2468     WindowsTerminal.exe
│ 11048    explorer.exe

Here's the command run inside a copilot session to get this:

! pwsh -c { 
  $currentPid = $PID

  function Show-ProcessTree {
    param (
      [int]$ProcessId,
      [int]$Level = 0
     )
    try {
      $proc = Get-CimInstance Win32_Process -Filter "ProcessId = $ProcessId"
      if (-not $proc) { return }
      $indent = ' ' * ($Level * 4)
      Write-Host "$indent$($proc.Name) (PID: $($proc.ProcessId))"
      $children = Get-CimInstance Win32_Process -Filter "ParentProcessId = $ProcessId"
      foreach ($child in $children) {
        Show-ProcessTree -ProcessId $child.ProcessId -Level ($Level + 1)
      }
    }
    catch {
      Write-Warning "Error retrieving process info for PID $ProcessId : $_"
    }
  }
  
  Show-ProcessTree -ProcessId $currentPid
}

Add a new 'llm_process' telemetry common property that walks the process
ancestor tree to detect known LLM process names. This complements the
existing env-var-based 'llm' property, which has been unreliable because
LLMs don't consistently set environment variables.

The new property is additive - both 'llm' and 'llm_process' are sent
in every telemetry event so their detection rates can be compared.

Implementation details:
- Cross-platform parent PID resolution: Windows uses existing
  ProcessExtensions.GetParentProcessId (CsWin32/NtQueryInformationProcess),
  Linux reads /proc/{pid}/stat, macOS shells out to ps
- All approaches are AOT-compatible (no WMI/System.Management)
- Max 20 ancestor traversal depth with visited set to prevent loops
- All exceptions caught - telemetry never crashes the CLI
- Known LLM processes: claude, cursor, code (vscode), windsurf, zed,
  gemini, codex, aider, goose, amp

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
marcpopMSFT force-pushed the marcpopMSFT-addprocesstreecheck branch from 6f42259 to fb584fb on May 7, 2026 16:32
@marcpopMSFT
Member Author

@baronfel I asked copilot why its own data didn't show up, and it figured out that it wasn't tracking the copilot process itself. Pushed the update. That doesn't make me confident that this will work for all the other copilots, though. We could potentially take it and follow up based on missing data, but that's a slow iteration cycle. I did confirm that, with this fix, a manual ! in a copilot CLI will be treated as copilot data.

3 participants