A fast CLI/TUI/GUI code scanner and indexer that analyzes source code at the class:method level with git blame integration, stores results in a local SQLite database with full-text and graph search, and provides command-line, terminal, and local web interfaces.
Built as a single native AOT binary with .NET 10.0.
- Multi-language analysis — Extracts classes, methods, comments, and dependency hints across common source languages
- Git blame integration — Associates each method with its last author, date, and commit
- Full-text search — FTS5 with trigram tokenizer for substring and CJK language support
- Hybrid search — Combines indexed DB search with live `git log --grep` results
- Graph search — Neo4j-style source knowledge graph stored in embedded SQLite
- Cypher-like graph query — Safe `MATCH ... WHERE ... LIMIT ...` subset for structured graph retrieval
- Hybrid dependency graphing — Regex-first dependency edges, with language/project metadata probes for future semantic analyzers
- Interactive TUI — Terminal.Gui v2 interface for browsing, scanning, keyword search, graph search, and graph query
- Local web GUI — Keyword search, graph search/query, interactive 2D graph exploration, and controllable 3D view on port 8085 by default
- Project management — Register, describe, update, and delete indexed projects
- Single binary — Native AOT compiled, no runtime dependency required
The local GUI provides keyword search, graph search, node/edge detail inspection, 2D graph controls, and a camera-controlled 3D graph view.
The TUI supports project browsing, scanning, project management, keyword search, and graph search from the terminal.
Scanning can be launched from the terminal interface with method/comment extraction, git blame enrichment, and DB graph indexing.
| Language | Extensions | Class / Type Detection | Method Detection | Dependency Hints |
|---|---|---|---|---|
| C# | `.cs` | class / struct / record / interface | access + return + name | using, inheritance/interface, new, type usage |
| Java | `.java` | class / interface / enum | access + return + name | import, extends/implements, new, type usage |
| Kotlin | `.kt`, `.kts` | class / object / data class / sealed class | fun / suspend fun | import, base type, constructor/type usage |
| JavaScript | `.js`, `.jsx` | class | function / arrow / const / export | import, extends/implements-style hints, new, type-like usage |
| TypeScript | `.ts`, `.tsx` | class | function / arrow / const / export | import, extends/implements, new, type annotations |
| PHP | `.php` | class / interface / trait | function | use, extends/implements, new, type hints |
| Python | `.py` | class (indent-based) | def / async def (indent-based) | import, base class, constructor-like calls |
| Go | `.go` | type struct/interface | graph dependency scan only | import, constructor/type usage |
| Rust | `.rs` | struct / enum / trait | graph dependency scan only | use, associated constructor/type usage |
| C/C++ | `.c`, `.cc`, `.cpp`, `.cxx`, `.h`, `.hpp`, `.hh`, `.hxx` | class / struct | graph dependency scan only | include, inheritance, new, type usage |
One release pipeline → four native binaries → three package channels. GitHub Actions release.yml
| OS | Architecture | Primary command | Alternative |
|---|---|---|---|
| Windows | x64 | `winget install psmon.CodeScan` | `npm install -g codescan-cli` (if Node is already installed) |
| macOS | arm64 (Apple Silicon) | `brew install psmon/codescan/codescan` | — |
| Linux | x64 / arm64 | `npm install -g codescan-cli` | — |
After install, verify:
codescan --version # should print: codescan v0.5.0 (or newer)
codescan --help
Channel status (v1): The GitHub Release pipeline is live and produces the binaries every channel pulls from.
- Homebrew tap — live at `psmon/homebrew-codescan`. `brew tap psmon/codescan && brew install codescan` works today on Apple Silicon Macs.
- winget — manifest at `packaging/winget/manifests/p/psmon/CodeScan/`, pending PR to `microsoft/winget-pkgs`. Until merged, test locally — see Testing winget locally on Windows below.
- npm (`codescan-cli`) — package at `packaging/npm/codescan-cli/`, pending publish to npm registry.
Until each channel goes live, the direct installers below work today.
The winget manifest is generated for every release. To install from a local manifest file (before the PR to microsoft/winget-pkgs is merged), winget requires a one-time opt-in. Run once in an elevated PowerShell:
winget settings --enable LocalManifestFiles
Then install from the in-repo manifest (no admin needed after the opt-in):
# from a fresh clone of this repo
winget install --manifest packaging\winget\manifests\p\psmon\CodeScan\0.5.0
codescan --version
This is winget's built-in safety guard against arbitrary-yaml installs — once enabled, you can install / validate any local manifest. To disable later: `winget settings --disable LocalManifestFiles` (elevated).
- winget (Windows) — Microsoft's native Windows package manager. Portable install, no admin needed, PATH handled automatically.
- Homebrew (macOS) — De-facto package manager for macOS developers. v1 ships arm64 only (Apple Silicon); Intel Mac users should build from source or use Rosetta with the arm64 build. Intel Mac shipping is a v2 candidate.
- npm (Linux + Windows alternative) — Picked over apt/dnf/snap because npm is universally available across Linux distros and the CodeScan release pipeline can serve all four binaries (`linux-x64`, `linux-arm64`, `osx-arm64`, `win-x64`) from a single wrapper package. The npm package is a thin postinstall wrapper that downloads the right native binary from GitHub Releases. On Windows, `winget` stays the recommended path (no Node.js required), but `npm install -g codescan-cli` also works if you already have Node and want toolchain consistency.
Linux arm64 is a deliberate first-class target — see "Why AOT? — Edge AI trend and the value of a single binary" at the bottom of this README for why arm64 SBC (Raspberry Pi / Jetson / Latte Panda) deployment is a key forward-looking scenario for this tool.
The npm wrapper auto-detects your CPU architecture and downloads the matching tarball:
| `process.arch` | Asset downloaded |
|---|---|
| `x64` | `codescan-linux-x64.tar.gz` |
| `arm64` | `codescan-linux-arm64.tar.gz` |
If postinstall cannot reach GitHub (corporate proxy, air-gapped), set CODESCAN_SKIP_DOWNLOAD=1 during install and grab the binary manually from the latest release.
v1 ships glibc-based Linux only. musl/Alpine support is a v2 candidate.
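The architecture-to-asset lookup described above can be sketched in a few lines. This is only an illustration in Python, not the wrapper's actual code (the real wrapper runs under Node). The two asset names come from the table above; `ARCH_ALIASES` and `pick_linux_asset` are hypothetical names introduced here.

```python
# Asset names as listed in the table above.
LINUX_ASSETS = {
    "x64": "codescan-linux-x64.tar.gz",
    "arm64": "codescan-linux-arm64.tar.gz",
}

# Assumption: normalize common machine names to Node-style process.arch values.
ARCH_ALIASES = {
    "x64": "x64",
    "x86_64": "x64",
    "amd64": "x64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def pick_linux_asset(machine: str) -> str:
    """Return the Linux release asset name for a CPU architecture string."""
    arch = ARCH_ALIASES.get(machine.lower())
    if arch is None:
        # Fail loudly instead of downloading a binary that cannot run.
        raise ValueError(f"unsupported architecture: {machine}")
    return LINUX_ASSETS[arch]


print(pick_linux_asset("aarch64"))  # codescan-linux-arm64.tar.gz
```

Rejecting unknown architectures up front keeps a failed install easy to diagnose, which matters when the fallback is a manual download.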
For environments without a package manager — or when you want to pin to a specific release.
Windows (PowerShell):
iwr https://raw.githubusercontent.com/psmon/CodeScan/main/Script/install-win.ps1 -OutFile install-win.ps1
.\install-win.ps1 # latest
.\install-win.ps1 -Version 0.5.0 # pinned
Linux / macOS (bash):
curl -fsSL https://raw.githubusercontent.com/psmon/CodeScan/main/Script/install.sh -o install.sh
sh install.sh # latest
sh install.sh --version 0.5.0 # pinned
Both installers download the matching release asset from GitHub, verify SHA256 against `checksums.txt`, install to a user-local path (Win: `~/.codescan/bin`, Unix: `~/.local/bin`), and never touch user data under `~/.codescan/{db,logs,config}`.
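For anyone verifying a manually downloaded asset, the SHA256 check can be reproduced by hand. The sketch below is not the installers' code. It assumes `checksums.txt` uses the common `sha256sum` layout (`<hex digest>  <file name>` per line), which this README does not confirm, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksums_text: str, asset_name: str) -> str:
    """Look up an asset's digest. Assumes sha256sum-style lines."""
    for line in checksums_text.splitlines():
        parts = line.split()
        # sha256sum marks binary-mode entries with a leading '*'.
        if len(parts) == 2 and parts[1].lstrip("*") == asset_name:
            return parts[0].lower()
    raise KeyError(f"{asset_name} is not listed in the checksum file")


def verify_asset(asset: Path, checksums_text: str) -> bool:
    """True when the file on disk matches its listed digest."""
    return sha256_of(asset) == expected_digest(checksums_text, asset.name)
```

A call such as `verify_asset(Path("codescan-linux-x64.tar.gz"), Path("checksums.txt").read_text())` would return `False` on a corrupted or tampered download.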
| OS | Binary install path | User data |
|---|---|---|
| Windows | `%USERPROFILE%\.codescan\bin` (or winget-managed) | `%USERPROFILE%\.codescan\{db,logs,config}` |
| Linux | `~/.local/bin` (or npm-managed) | `~/.codescan/{db,logs,config}` |
| macOS | `$(brew --prefix)/bin` | `~/.codescan/{db,logs,config}` |
User data is preserved across install / upgrade / uninstall.
git clone https://github.com/psmon/CodeScan.git
cd CodeScan
dotnet build # debug build
dotnet publish -c Release # release publish (single-file)
Prerequisites:
- .NET 10.0 SDK (for building)
- Git (for blame integration)
Output: bin/Release/net10.0/<rid>/codescan (or codescan.exe on Windows).
For repo developers who want to bypass GitHub Releases and install directly from a local checkout:
- Windows: `Script/deploy-win.ps1`
- Linux: `Script/deploy-linux.sh`
These scripts run `dotnet publish`, install the result to `~/.codescan/bin`, and register it on PATH. They are handy during local development but are not the recommended path for users.
See Docs/install-distribution-strategy.md for the v1 confirmed plan (asset naming, signing posture, SBOM, CI flow, channel submission procedures).
# Scan current directory (register + analyze + display)
codescan scan
# Scan a specific path
codescan scan /path/to/project
# Search across all indexed projects
codescan search "HttpClient"
# Graph search
codescan graph "HttpClient"
codescan search "HttpClient" --graph --depth 2
# Cypher-like graph query
codescan query "MATCH (c:class)-[r:uses_type]->(t:type) WHERE t.label = 'HttpClient'"
# Launch interactive TUI
codescan tui
# Start local GUI viewer
codescan gui start --port 8085
| Command | Description |
|---|---|
| `scan [path]` | Register and analyze a directory (shortcut for `list` with defaults) |
| `list <path>` | Scan with custom filtering and output options |
| `search <query>` | Hybrid full-text + git log search |
| `graph [query]` | Search and inspect source knowledge graph |
| `query <graph-query>` | Run the CodeScan Cypher-like graph query subset |
| `cypher <graph-query>` | Alias for `query` |
| `gui start` / `gui stop` | Start or stop the local GUI server |
| `projects` | List all registered projects with stats |
| `project <id>` | Show project summary or `--detail` for full view |
| `project-addinfo <id> <text>` | Add an AI-friendly description to a project |
| `project-update <id>` | Update project path or description |
| `project-delete <id>` | Remove a project from the database |
| `tui` | Launch interactive terminal UI |
| `help [command]` | Show help for a specific command |
# Search methods
codescan search "async" --type method
# Search comments
codescan search "TODO" --type comment
# Search within a specific project
codescan search "config" --project 1
# Search the graph
codescan search "HttpClient" --graph --depth 2
codescan graph "SearchCommand" --project 1
# Treat a search argument as a graph query
codescan search "MATCH (f:file)-[r:imports]->(m:module) LIMIT 20" --query
CodeScan supports a Cypher-like query subset for the graph data it actually stores. It is designed for CLI users, AI agents, and automation scripts that need structured graph retrieval without direct SQL access.
This is not full Cypher. It maps to CodeScan's SQLite-backed source graph and returns a GraphData result that CLI, TUI, and GUI can render.
Supported patterns:
MATCH (n:kind)
MATCH (a:kind)-[r:edge_kind]->(b:kind)
Supported WHERE fields:
| Alias Type | Fields |
|---|---|
| Node aliases | kind, label, path, detail |
| Edge aliases | kind, label |
Supported operators:
| Operator | Example |
|---|---|
| `=` | `t.label = 'HttpClient'` |
| `CONTAINS` | `c.label CONTAINS 'Command'` |
| `STARTS WITH` | `m.label STARTS WITH 'System'` |
| `ENDS WITH` | `f.path ENDS WITH '.cs'` |
Supported clauses:
| Clause | Behavior |
|---|---|
| `WHERE ... AND ...` | Filters matched nodes/edges |
| `RETURN ...` | Accepted for readability, ignored by the renderer |
| `LIMIT <n>` | Limits matched seed nodes/edges |
Examples:
# Find class nodes
codescan query "MATCH (c:class) WHERE c.label CONTAINS 'Service' LIMIT 20"
# Find classes that use a type
codescan query "MATCH (c:class)-[r:uses_type]->(t:type) WHERE t.label = 'HttpClient'"
# Find file imports
codescan query "MATCH (f:file)-[r:imports]->(m:module) WHERE m.label CONTAINS 'System.Net'"
# Find author-to-method relationships and expand one neighbor hop
codescan query "MATCH (a:author)-[r:authored]->(m:method) WHERE a.label CONTAINS 'kim'" --depth 1
# `graph` auto-detects MATCH queries
codescan graph "MATCH (c:class)-[r:creates]->(t:type) LIMIT 30"
Common node kinds:
project, directory, file, class, method, comment, doc, author, type, module
Common edge kinds:
contains, defines, authored, has_comment, documents, imports, inherits_or_implements, creates, uses_type
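To make the shape of the supported subset concrete, here is a minimal parser sketch for the two `MATCH` patterns, the four operators, and the `WHERE` / `RETURN` / `LIMIT` clauses listed above. It is written in Python for illustration and is not CodeScan's parser (which lives in `GraphQuery.cs`). `ParsedQuery` and `parse_query` are hypothetical names, and the sketch assumes quoted values never contain the words `AND`, `RETURN`, or `LIMIT`.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

NODE_FIELDS = {"kind", "label", "path", "detail"}
EDGE_FIELDS = {"kind", "label"}

# MATCH (a:kind) with an optional -[r:edge_kind]->(b:kind) tail and clauses.
QUERY = re.compile(
    r"""^\s*MATCH\s+
        \(\s*(?P<a>\w+)\s*:\s*(?P<a_kind>\w+)\s*\)
        (?:\s*-\s*\[\s*(?P<r>\w+)\s*:\s*(?P<r_kind>\w+)\s*\]\s*->\s*
           \(\s*(?P<b>\w+)\s*:\s*(?P<b_kind>\w+)\s*\))?
        (?:\s+WHERE\s+(?P<where>.+?))?
        (?:\s+RETURN\s+(?P<ret>.+?))?
        (?:\s+LIMIT\s+(?P<limit>\d+))?
        \s*$""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)

CONDITION = re.compile(
    r"^\s*(?P<alias>\w+)\.(?P<field>\w+)\s*"
    r"(?P<op>=|CONTAINS|STARTS\s+WITH|ENDS\s+WITH)\s*'(?P<value>[^']*)'\s*$",
    re.IGNORECASE,
)


@dataclass
class ParsedQuery:
    nodes: dict                      # alias -> node kind
    edge: Optional[tuple] = None     # (alias, edge kind) when a relationship is matched
    conditions: list = field(default_factory=list)  # (alias, field, op, value)
    limit: Optional[int] = None


def parse_query(text: str) -> ParsedQuery:
    match = QUERY.match(text)
    if match is None:
        raise ValueError("query is outside the supported MATCH subset")
    query = ParsedQuery(nodes={match["a"]: match["a_kind"]})
    if match["r"]:
        query.edge = (match["r"], match["r_kind"])
        query.nodes[match["b"]] = match["b_kind"]
    if match["limit"]:
        query.limit = int(match["limit"])
    # RETURN is captured but deliberately ignored, mirroring the table above.
    for part in re.split(r"\s+AND\s+", match["where"] or "", flags=re.IGNORECASE):
        if not part.strip():
            continue
        cond = CONDITION.match(part)
        if cond is None:
            raise ValueError(f"unsupported condition: {part!r}")
        alias, name = cond["alias"], cond["field"].lower()
        if alias in query.nodes:
            allowed = NODE_FIELDS
        elif query.edge and alias == query.edge[0]:
            allowed = EDGE_FIELDS
        else:
            raise ValueError(f"unknown alias: {alias}")
        if name not in allowed:
            raise ValueError(f"field {name!r} is not supported for {alias}")
        op = " ".join(cond["op"].upper().split())
        query.conditions.append((alias, name, op, cond["value"]))
    return query
```

Rejecting anything outside the whitelist of patterns, fields, and operators is one way such a subset can stay safe to expose to scripts and agents, since no user text needs to reach SQL unparsed.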
# Start on the default port
codescan gui start
# Start on a custom port
codescan gui start --port 8090
# Stop the GUI server
codescan gui stop
Open http://127.0.0.1:8085/ after starting the GUI. The viewer provides keyword search, graph search, Cypher-like graph query, a Neo4jClient-like 2D graph canvas, and a controllable 3D graph view.
GUI graph controls:
| Control | Behavior |
|---|---|
| `Keyword` | Run full-text keyword search |
| `Graph Search` | Search graph nodes by keyword and expand neighbors |
| `Query` | Run `MATCH ...` graph query and render the result |
| 2D drag background | Pan the graph |
| 2D mouse wheel | Zoom around the cursor |
| 2D drag node | Reposition a node |
| Node click | Show node detail and visible relationships |
| Edge click | Show relationship detail |
| Legend chips | Toggle node kinds on/off |
| `Fit` | Fit visible nodes into the canvas |
| `Reset Camera` | Reset 2D viewport or 3D camera |
| 3D drag | Orbit camera |
| 3D Shift-drag / right-drag | Pan camera |
| 3D mouse wheel | Zoom camera |
# Tree view with method details
codescan list /path/to/project --detail --tree
# Filter by extension
codescan list /path --include .ts,.tsx
# Limit depth and include git blame
codescan list /path --depth 3 --blame
All data is stored under `~/.codescan/`:
~/.codescan/
├── db/
│ └── codescan.db # SQLite database with FTS5 index
└── logs/
└── *.log # Scan logs (--devmode only)
| Table | Contents |
|---|---|
| `projects` | Indexed projects with path, scan date, stats |
| `scans` | Scan history per project |
| `files` | File metadata (path, size, extension, depth) |
| `methods` | Class:method definitions with git blame data |
| `comments` | Comment blocks with surrounding code context |
| `project_docs` | Auto-discovered README / AGENT / CLAUDE.md content |
| `search_index` | FTS5 virtual table (trigram tokenizer) |
| `graph_nodes` | Source graph nodes: projects, directories, files, classes, methods, comments, docs, authors |
| `graph_edges` | Source graph relationships: contains, defines, authored, documents, comments, imports, creates, uses_type, inherits_or_implements |
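The trigram tokenizer behind `search_index` is what lets substring and CJK queries hit an index rather than a table scan. The idea can be shown without SQLite: the Python sketch below builds a tiny in-memory trigram index. It illustrates the technique only and is not how FTS5 is implemented. `TrigramIndex` is a hypothetical class, and lowercasing all text is an assumption of this sketch.

```python
from collections import defaultdict


def trigrams(text: str) -> set:
    """Every run of three consecutive characters, lowercased."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


class TrigramIndex:
    def __init__(self) -> None:
        self.docs = {}                    # doc id -> original text
        self.postings = defaultdict(set)  # trigram -> ids of docs containing it

    def add(self, doc_id: int, text: str) -> None:
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, needle: str) -> list:
        grams = trigrams(needle)
        if grams:
            # Only documents holding every trigram of the needle can match.
            candidates = set.intersection(
                *(self.postings.get(gram, set()) for gram in grams)
            )
        else:
            # Needles under three characters have no trigrams: scan everything.
            candidates = set(self.docs)
        lowered = needle.lower()
        # Confirm candidates, since shared trigrams do not guarantee adjacency.
        return sorted(d for d in candidates if lowered in self.docs[d].lower())


index = TrigramIndex()
index.add(1, "GetHttpClientFactory")
index.add(2, "한국어 주석 검색 지원")
index.add(3, "ParseConfig")
print(index.search("httpclient"))  # [1]
print(index.search("주석 검색"))    # [2]
```

Because trigrams are cut by character position rather than by word boundary, the same index serves identifiers like `GetHttpClientFactory` and text in languages that do not separate words with spaces.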
Structural edges:
| Edge | Meaning |
|---|---|
| `project -[contains]-> directory/file` | Project file tree |
| `directory -[contains]-> directory/file` | Directory file tree |
| `file -[contains]-> class` | Class/type found in a source file |
| `class/file -[defines]-> method` | Method/function definition |
| `file -[has_comment]-> comment` | Comment block found in a source file |
| `author -[authored]-> method` | Git blame last-author relationship |
| `project -[documents]-> doc` | Auto-discovered project document |
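A graph like this fits in two ordinary SQLite tables, and neighbor expansion (as with `--depth`) becomes a breadth-first walk over the edge table. The Python sketch below shows one way to do that. The column names (`id`, `kind`, `label`, `src`, `dst`) are assumptions, not CodeScan's actual schema, and treating edges as undirected during expansion is also an assumption.

```python
import sqlite3


def build_demo_graph() -> sqlite3.Connection:
    """Create a tiny in-memory graph using the edge kinds from the table above."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE graph_nodes (id INTEGER PRIMARY KEY, kind TEXT, label TEXT);
        CREATE TABLE graph_edges (src INTEGER, dst INTEGER, kind TEXT);
        """
    )
    db.executemany(
        "INSERT INTO graph_nodes VALUES (?, ?, ?)",
        [(1, "project", "Demo"), (2, "file", "Api.cs"), (3, "class", "ApiClient"),
         (4, "method", "ApiClient:Send"), (5, "author", "kim")],
    )
    db.executemany(
        "INSERT INTO graph_edges VALUES (?, ?, ?)",
        [(1, 2, "contains"), (2, 3, "contains"), (3, 4, "defines"), (5, 4, "authored")],
    )
    return db


def expand(db: sqlite3.Connection, seed_id: int, depth: int) -> set:
    """Return ids of all nodes within `depth` hops of the seed, in either direction."""
    seen = {seed_id}
    frontier = {seed_id}
    for _ in range(depth):
        if not frontier:
            break
        ids = list(frontier)
        marks = ",".join("?" * len(ids))
        rows = db.execute(
            f"SELECT src, dst FROM graph_edges "
            f"WHERE src IN ({marks}) OR dst IN ({marks})",
            ids + ids,
        ).fetchall()
        neighbors = {node for row in rows for node in row}
        frontier = neighbors - seen  # keep only newly discovered nodes
        seen |= frontier
    return seen


db = build_demo_graph()
print(sorted(expand(db, seed_id=3, depth=1)))  # [2, 3, 4]
```

Each hop is a single indexed lookup on the edge table, so shallow expansions stay cheap even without a dedicated graph engine.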
Dependency hint edges:
| Edge | Source |
|---|---|
| `file/class -[imports]-> module` | `using`, `import`, `use`, `#include` |
| `class -[inherits_or_implements]-> type` | Base class / interface / trait-style declarations |
| `class -[creates]-> type` | Constructor or constructor-like calls such as `new Type()` |
| `class -[uses_type]-> type` | Type annotations, fields, parameters, returns, or local declarations detected by regex strategy |
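As a rough illustration of how regex-only extraction can yield these edges without a build, the Python sketch below pulls `imports`, `inherits_or_implements`, and `creates` hints from a C#-like snippet. The patterns are simplified assumptions written for this example, not the ones CodeScan ships, and `extract_edges` is a hypothetical helper.

```python
import re

# Simplified, C#-flavoured patterns (assumptions for illustration only).
USING = re.compile(r"^\s*using\s+([\w.]+)\s*;", re.MULTILINE)
CLASS_DECL = re.compile(r"\bclass\s+(\w+)(?:\s*:\s*([\w\s,.<>]+?))?\s*\{")
NEW_CALL = re.compile(r"\bnew\s+([A-Z]\w*)\s*\(")


def extract_edges(file_label: str, source: str) -> list:
    """Return (source, edge kind, target) hints found by regex alone."""
    edges = [(file_label, "imports", module) for module in USING.findall(source)]

    classes = []  # (offset, class name), used to attribute `new` calls
    for decl in CLASS_DECL.finditer(source):
        name, bases = decl.group(1), decl.group(2) or ""
        classes.append((decl.start(), name))
        for base in bases.split(","):
            if base.strip():
                edges.append((name, "inherits_or_implements", base.strip()))

    for call in NEW_CALL.finditer(source):
        # Attribute the call to the nearest class declared before it.
        owners = [name for start, name in classes if start < call.start()]
        if owners:
            edges.append((owners[-1], "creates", call.group(1)))
    return edges


SAMPLE = """
using System.Net.Http;

public class ApiClient : BaseClient, IDisposable
{
    private readonly HttpClient _http = new HttpClient();
}
"""

for edge in extract_edges("Api.cs", SAMPLE):
    print(edge)
```

The results are hints rather than resolved symbols: the sketch cannot tell which namespace `HttpClient` belongs to, which is the gap a compiler-backed analyzer would close.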
The dependency graph is intentionally hybrid. CodeScan first uses language-neutral regex strategies so graph edges exist even when the project cannot be built. It also probes for semantic analysis capability using project metadata:
| Language | Semantic Probe |
|---|---|
| C# | .sln, .csproj for future Roslyn analyzers |
| Java | pom.xml, build.gradle, build.gradle.kts for future JDT/Spoon analyzers |
| TypeScript/JavaScript | tsconfig.json, jsconfig.json for future TypeScript Compiler API analyzers |
| Go | go.mod, go.work for future go/packages analyzers |
| Rust | Cargo.toml for future rust-analyzer/Cargo metadata analyzers |
| C/C++ | compile_commands.json for future Clang LibTooling analyzers |
Current semantic probes detect whether the required project model exists; regex remains the active fallback until a language-specific semantic strategy is added.
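A probe of this kind amounts to checking whether the marker files from the table exist. The Python sketch below shows the idea. It only inspects the top level of the project directory, which is an assumption (the real probe may search deeper), and `PROBES` and `detect_probes` are hypothetical names.

```python
from pathlib import Path

# Marker files per language, copied from the table above.
PROBES = {
    "C#": ["*.sln", "*.csproj"],
    "Java": ["pom.xml", "build.gradle", "build.gradle.kts"],
    "TypeScript/JavaScript": ["tsconfig.json", "jsconfig.json"],
    "Go": ["go.mod", "go.work"],
    "Rust": ["Cargo.toml"],
    "C/C++": ["compile_commands.json"],
}


def detect_probes(root: Path) -> dict:
    """Map each language to the project-model files found directly under root."""
    found = {}
    for language, patterns in PROBES.items():
        hits = sorted(
            path.name for pattern in patterns for path in root.glob(pattern)
        )
        if hits:
            found[language] = hits
    return found
```

Calling `detect_probes(Path("."))` in a repository root returns only the languages whose project model is present, so a caller could decide per language whether a semantic analyzer is worth attempting before falling back to regex.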
CodeScan/
├── Program.cs # Entry point and CLI routing
├── Commands/ # Command implementations
├── Models/ # Data structures (FileEntry, MethodEntry, CommentBlock, SourceDependency)
├── Services/ # Core logic
│ ├── DirectoryScanner.cs # Recursive traversal with filtering
│ ├── SourceAnalyzer.cs # Multi-language class/method extraction
│ ├── SourceGraphAnalyzer.cs # Hybrid dependency edge extraction
│ ├── CommentExtractor.cs # Comment extraction with context
│ ├── GitBlameService.cs # Git blame per method
│ ├── GitLogSearchService.cs # Hybrid git log search
│ ├── GraphQuery.cs # Cypher-like MATCH query parser
│ ├── GraphModels.cs # Source graph DTOs
│ ├── SqliteStore.cs # SQLite DB with FTS5 full-text search
│ └── TreeFormatter.cs # Tree/flat output formatting
├── Tui/
│ └── TuiApp.cs # Terminal.Gui v2 interactive UI
└── Script/ # Deployment scripts (Windows/Linux)
| Package | Purpose |
|---|---|
| Microsoft.Data.Sqlite | Embedded SQLite with FTS5 support |
| Terminal.Gui v2 | Cross-platform terminal UI framework |
- Centralized storage — All data under `~/.codescan/` regardless of where the tool is run
- Recent-first sorting — Files and directories sorted by modification time (newest first)
- Smart defaults — `.git`, `node_modules`, `bin`, `obj`, `dist`, `build`, `__pycache__` excluded automatically
- Markdown always included — `.md` files are always indexed even when `--include` filters are active
- Git root detection — Walks directory tree to find `.git/` without spawning subprocesses
- Trigram FTS — Enables effective substring search for CJK languages (Korean, Chinese, Japanese)
- Regex-first graphing — Produces dependency graph hints without requiring a successful build
- Semantic-ready strategy layer — Language-specific compiler analyzers can be added behind `ISourceDependencyStrategy`
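The git root detection item in the list above is simple enough to sketch: walk from the starting directory up through its parents and stop at the first one containing `.git`. The Python version below is an illustration of that approach, not CodeScan's code, and `find_git_root` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor (or start itself) that contains .git."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        # exists() also accepts a .git file, as used by worktrees and submodules.
        if (candidate / ".git").exists():
            return candidate
    return None
```

Pure filesystem checks avoid the cost of launching `git rev-parse` for every scanned directory and still work when git is not on PATH, although blame enrichment itself needs git.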
TL;DR — Through 2026 and into 2027, AI infrastructure is shifting from cloud-hosted frontier models toward on-device SLMs (Small Language Models) and edge agents. CodeScan is a .NET 10 Native AOT build — a single binary with no runtime dependency — designed to ride that wave: the same artifact runs on a developer laptop, a Raspberry Pi, or a drone-grade SBC.
Up to 2025, "edge LLM" was mostly demoware and benchmark posts. 2026 is when that changed:
- Google Gemma 3 / 3n / 3 270M — Gemma 3 has been measured at 14.5 tok/s on a Raspberry Pi and survived a 12-hour Jetson run with no memory leak or slowdown. The 270M variant uses just 0.75% of a Pixel 9 Pro battery for 25 conversations thanks to INT4 quantization and Per-Layer Embedding (PLE) caching — small enough to land naturally on everyday devices. (Gemma 3 270M announcement, Gemma 3n overview)
- NVIDIA Nemotron 3 Nano (4B and 30B-A3B) — A hybrid Mixture-of-Experts design: 30B total parameters but only 3B active per forward pass. The 4B variant, quantized to 4-bit, fits under 3 GB of VRAM and runs on consumer RTX cards and Jetson-class edge boards. NVIDIA claims 9× throughput over comparable open models. (Nemotron 3 Nano Omni announcement, Nemotron 3 Nano 4B hybrid architecture)
- 3B parameters as the 2026 sweet spot — With production-grade 3–8 bit quantization and a fresh wave of small NPUs landing on single-board computers, the community has converged on ~3B parameters as the practical sweet spot for SBC inference. (The Small Model Revolution 2026)
Extrapolating the curve, the following is likely to be the 2027 baseline:
- Drones — Whisper-class speech recognition + a 3B-class SLM handles autonomous mission parsing and replanning without GPS — moving from academic demos to production payloads.
- Raspberry Pi 5 + AI HAT+ 2 — The Hailo-10H accelerator (40 TOPS INT4) and 8 GB LPDDR4X turn an SBC into a real LLM host. (Raspberry Pi AI HAT+ 2 release — The Register, Jan 2026)
- x86/ARM SBCs (Latte Panda, Khadas Edge, Orange Pi, …) — Sitting next to PLCs on the factory floor, local SLMs handle log triage, anomaly detection, and natural-language operator UIs.
- Laptops and tablets — NPU-equipped SoCs (Apple Silicon, Snapdragon X, AMD Strix Halo) make 4B-class on-device inference an OS-level default. Samsung, Google, and Motorola's 2026 flagships already ship support for 4B models at Q4 quantization. (2026 SLM comparison: Phi-4 vs Gemma 3 vs Qwen)
All of these targets share the same structural constraints:
| Constraint | Implication |
|---|---|
| No runtime present — drone firmware and SBC minimal images rarely carry a .NET / Java / Python runtime, and adding one is expensive | A single self-contained binary is effectively a requirement |
| Memory and storage pressure — the model already owns most of the RAM; surrounding tools must be small | AOT trimming and single-file compression matter |
| Cold-start cost — battery-powered and event-triggered workloads must respond immediately | "No JIT warmup" is a decisive advantage |
| Supply-chain trust — edge updates are infrequent, so the integrity of the artifact you ship matters more | Single file + SHA256 + SBOM is a natural fit |
CodeScan's build shape lines up with each of those constraints:
- Instant startup (no JIT) — Decisive when an edge agent must respond within ~50 ms of a voice trigger. (Native AOT deployment overview — Microsoft Learn)
- Runtime-free single file — Copy a single `~/.codescan/bin/codescan` and it runs on a Raspberry Pi with no .NET installed.
- Smaller memory footprint — AOT drops the JIT, its metadata, and unreachable runtime services, leaving more RAM for the model.
- Reduced attack surface — Dynamic code generation and most reflection paths are stripped; pairing a single file with an SBOM is friendly to supply-chain audit.
- First-class multi-arch — From v1 the same pipeline publishes `linux-x64`, `linux-arm64`, `osx-arm64`, and `win-x64` as peer artifacts. SBC deployment needs no separate build procedure.
CodeScan does not host an LLM itself. It is the indexing and retrieval layer an agent needs whenever it has to interact with a codebase:
- An FTS5 + graph backend that lets a code-aware agent (e.g., Gemma 3 4B with tool-use) on an SBC sweep a local repository quickly.
- A Cypher-like graph-query surface that an autonomous build/deploy bot can use to reason about change impact.
- A RAG-lite component for drone or robot SDK repos — analyze offline, then feed code context into the SLM.
The "agent" half of the picture is being built in parallel as a sibling research project — psmon/AgentZeroLite. AgentZeroLite focuses on running and evaluating on-device SLMs (the Gemma 3 / Nemotron Nano-class models discussed above) on real consumer-grade hardware, while CodeScan acts as the code-aware retrieval layer those agents call into. The two are designed to compose:
- AgentZeroLite — hosts the on-device model, manages prompting, tool-use, and evaluation loops for edge inference scenarios.
- CodeScan — answers "what's in this codebase?" with FTS5 keyword hits, the source graph, and Cypher-like queries — the kind of structured context an SLM needs to do useful code work without a frontier model.
If you want to see how this plays out end-to-end (on-device SLM ↔ structured code retrieval), AgentZeroLite is the natural next stop.
In short — as small models start doing real work on the edge, the tooling around those models has to be small, instant, and runtime-free too. A Native AOT single binary is the most direct answer to that requirement, and CodeScan is built along that line.
For the full build/distribution spec, see `Docs/install-distribution-strategy.md`.
See repository for license information.




