Graphize

CLI for transforming polyglot codebases into queryable knowledge graphs, with optional LLM-powered semantic analysis.

Overview

Graphize extracts structure from polyglot codebases and builds queryable knowledge graphs stored in GraphFS format. It combines deterministic AST extraction with optional LLM semantic analysis to create rich, navigable representations of code architecture.

Supported Languages

Language	Parser	Framework Detection
Go	Native `go/ast`	-
Java	Tree-sitter	Spring (Controller, Service, Repository)
TypeScript/JavaScript	Tree-sitter	-
Swift	Tree-sitter	-

External extractors can be added via the provider interface.
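
As a rough illustration of what an extractor provider could look like, the Python sketch below defines a hypothetical `Extractor` protocol and a registry that dispatches on file extension. None of these names (`Extractor`, `Registry`, `KotlinFunctionExtractor`, the node fields) come from Graphize, and its actual provider interface may differ substantially.

```python
from pathlib import PurePath
from typing import Protocol


class Extractor(Protocol):
    """Hypothetical provider contract; not Graphize's real interface."""

    extensions: tuple[str, ...]

    def extract(self, path: str, source: str) -> list[dict]:
        """Return node records found in one source file."""
        ...


class KotlinFunctionExtractor:
    """Toy provider: treats every line starting with 'fun ' as a function node."""

    extensions = (".kt",)

    def extract(self, path: str, source: str) -> list[dict]:
        nodes = []
        for line in source.splitlines():
            stripped = line.strip()
            if stripped.startswith("fun "):
                # Take the text between 'fun ' and the first '(' as the name.
                name = stripped[4:].split("(", 1)[0].strip()
                nodes.append({"id": f"{path}#{name}", "type": "function", "name": name})
        return nodes


class Registry:
    """Maps file extensions to the provider that handles them."""

    def __init__(self) -> None:
        self._by_ext: dict[str, Extractor] = {}

    def register(self, provider: Extractor) -> None:
        for ext in provider.extensions:
            self._by_ext[ext] = provider

    def extract(self, path: str, source: str) -> list[dict]:
        # Files with no registered provider yield no nodes.
        provider = self._by_ext.get(PurePath(path).suffix)
        return provider.extract(path, source) if provider else []
```

A real provider would use a proper parser rather than line matching; the point here is only the shape of the contract: declare which files you handle, return structured records.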

Features

🌍 Multi-Language - Go, Java, TypeScript/JavaScript, Swift with extensible provider interface
📊 AST Extraction - Fast, deterministic extraction of functions, types, and relationships
🤖 LLM Enhancement - Optional semantic analysis to discover implicit dependencies
🔍 Graph Queries - BFS/DFS traversal, path finding, community detection
📈 Analysis Reports - God nodes, surprising connections, corpus health, suggested questions
💡 Node Explanation - Get context with community membership and centrality metrics
🌐 MCP Server - Integrate with Claude Desktop and Claude Code
🔌 Platform Installers - One-command setup for Claude, Cursor, Copilot, Codex, Gemini, Aider
📤 Multiple Exports - HTML, TOON, JSON, GraphML, Neo4j Cypher, Obsidian vault
👁️ Watch Mode - Auto-rebuild graph on file changes
🔗 Git Hooks - Automatic analysis on commit/checkout
📝 Doc Extraction - Link markdown/text documentation to code entities
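
To make the graph-query and analysis features more concrete, here is a minimal Python sketch of breadth-first path finding and a degree-based ranking over a small call graph. The edge list and function names are invented for illustration, and ranking by total degree is only an assumed stand-in for however Graphize actually identifies god nodes.

```python
from collections import Counter, deque

# Hypothetical edge list: (source, target) pairs such as call relationships.
EDGES = [
    ("main", "LoadConfig"),
    ("main", "NewServer"),
    ("NewServer", "NewRouter"),
    ("NewRouter", "HandleUser"),
    ("NewRouter", "HandleOrder"),
    ("HandleUser", "LoadConfig"),
    ("HandleOrder", "LoadConfig"),
]


def adjacency(edges):
    """Build a directed adjacency map from (source, target) pairs."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    return adj


def shortest_path(edges, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    adj = adjacency(edges)
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def god_node_candidates(edges, top=3):
    """Rank nodes by total degree (in + out), highest first; ties by name."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

In this toy graph, `shortest_path(EDGES, "main", "HandleUser")` walks through `NewServer` and `NewRouter`, while `LoadConfig` and `NewRouter` surface as the most connected nodes.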

Quick Start

# Initialize a new graph database
graphize init

# Add your repository (Go, Java, TypeScript/JavaScript, Swift)
graphize add .

# Extract the graph (AST-based)
graphize analyze

# Generate an analysis report
graphize report

# Export interactive visualization
graphize export html

Two-Step Extraction Pipeline

Extraction runs in two passes, shown as Parts A and B of Step 2 in the pipeline diagram below:

1. Deterministic AST extraction - Fast, reproducible, always available
2. LLM semantic extraction - Optional, adds inferred relationships and rationale

┌────────────────────────────────────────────────────────────┐
│                      GRAPHIZE PIPELINE                     │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Step 1: Scan     Step 2: Extract        Step 3: Build     │
│  ┌──────────┐     ┌─────────────────┐    ┌──────────────┐  │
│  │ Detect   │     │ Part A: AST     │    │ Merge AST +  │  │
│  │ sources  │────>│ (deterministic) │─┬─>│ Semantic     │  │
│  │          │     ├─────────────────┤ │  │ results      │  │
│  └──────────┘     │ Part B: LLM     │ │  └──────────────┘  │
│                   │ (optional)      │─┘         │          │
│                   └─────────────────┘           ▼          │
│                                          ┌──────────────┐  │
│  Step 4: Analyze      Step 5: Export     │ GraphFS      │  │
│  ┌──────────┐         ┌─────────────┐    │ Store        │  │
│  │ Cluster  │<────────│ God nodes   │<───└──────────────┘  │
│  │ Detect   │         │ Surprises   │                      │
│  └──────────┘         │ Questions   │                      │
│                       └─────────────┘                      │
└────────────────────────────────────────────────────────────┘
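
The merge in Step 3 can be pictured with a short Python sketch. It assumes edges are dictionaries keyed by `from`, `to`, and `type`, that AST results take precedence over LLM results, and that an `origin` tag records where each edge came from. All of these are illustrative assumptions rather than Graphize's documented merge rules.

```python
def merge_edges(ast_edges, llm_edges):
    """Merge two edge lists keyed by (from, to, type).

    AST edges are treated as ground truth. An LLM edge is added only when no
    AST edge with the same key exists, and is tagged so it can be filtered later.
    """
    merged = {}
    for edge in ast_edges:
        key = (edge["from"], edge["to"], edge["type"])
        merged[key] = {**edge, "origin": "ast"}
    for edge in llm_edges:
        key = (edge["from"], edge["to"], edge["type"])
        if key not in merged:
            merged[key] = {**edge, "origin": "llm"}
    # Sort by key so the output does not depend on input order.
    return [merged[k] for k in sorted(merged)]
```

Tagging the origin keeps the deterministic and inferred parts of the graph separable, so a consumer can ignore inferred edges when it needs only reproducible facts.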

Target Users

Developers exploring unfamiliar codebases
AI Agents (Claude, Codex) needing codebase context
Architects documenting system design
Teams onboarding new members

Output Formats

Format	Use Case
TOON	Agent-friendly, token-efficient (default)
JSON	Machine-readable, full fidelity
HTML	Interactive Cytoscape.js visualization
GraphML	Import into Gephi, yEd, Cytoscape desktop
Cypher	Neo4j CREATE statements
Obsidian	Wiki-style vault with wikilinks
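
As an example of what a Cypher export involves, the Python sketch below renders a small graph as Neo4j `CREATE` statements. The node and edge fields (`label`, `id`, `name`, `from`, `to`, `type`) are assumed for illustration and are not taken from Graphize's schema; labels and relationship types are interpolated as-is, so they must be plain identifiers.

```python
def cypher_string(value):
    """Quote a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_cypher(nodes, edges):
    """Render nodes and edges as Cypher statements, one per line."""
    lines = []
    for node in nodes:
        lines.append(
            f"CREATE (:{node['label']} {{id: {cypher_string(node['id'])}, "
            f"name: {cypher_string(node['name'])}}});"
        )
    for edge in edges:
        # Look up both endpoints by id, then create the relationship.
        lines.append(
            f"MATCH (a {{id: {cypher_string(edge['from'])}}}), "
            f"(b {{id: {cypher_string(edge['to'])}}}) "
            f"CREATE (a)-[:{edge['type']}]->(b);"
        )
    return "\n".join(lines)
```

Emitting nodes before edges matters here: each relationship statement matches its endpoints by `id`, so those nodes must already exist when it runs.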

Storage

Graphize stores graphs in GraphFS format:

One file per node/edge (git-friendly)
Deterministic JSON serialization
Schema validation
Referential integrity

.graphize/
├── manifest.json      # Tracked sources
├── nodes/             # One file per node
├── edges/             # One file per edge
└── cache/             # Per-file extraction cache
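
The storage properties above can be sketched in a few lines of Python. This is not GraphFS itself: the hashed file-name scheme, the record fields, and the helper names are assumptions chosen to show why one file per node plus deterministic serialization produces stable, diff-friendly output.

```python
import hashlib
import json
from pathlib import Path


def node_filename(node_id):
    """Derive a filesystem-safe, stable file name from a node ID (assumed scheme)."""
    return hashlib.sha256(node_id.encode("utf-8")).hexdigest()[:16] + ".json"


def serialize(record):
    """Serialize with sorted keys and fixed indentation so equal records yield equal bytes."""
    return json.dumps(record, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def write_node(root, node):
    """Write one node to <root>/nodes/<hash>.json and return the path."""
    path = Path(root) / "nodes" / node_filename(node["id"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(serialize(node), encoding="utf-8")
    return path


def dangling_edges(nodes, edges):
    """Referential integrity check: edges whose endpoints are not known nodes."""
    ids = {n["id"] for n in nodes}
    return [e for e in edges if e["from"] not in ids or e["to"] not in ids]
```

Because key order is normalized, re-extracting an unchanged node rewrites its file with identical bytes, so version control shows a diff only when the graph actually changes.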

Next Steps

Getting Started - Installation and first graph
CLI Workflow - The analyze → enhance → merge flow
MCP Server - Claude Desktop/Code integration
Architecture - Technical design details