Doh MCP Server — Modern Developer Toolkit

A fast, organized Model Context Protocol (MCP) server for Doh.js projects. It exposes powerful local code intelligence, headful browser automation, full-stack debugging (client and server), and cloud instance management via a simple SSE transport.

This README focuses on what the system offers at a glance, highlighting capabilities designed to be used by LLMs while remaining intuitive for human operators.

Why MCP for Doh.js

Unified tools that operate on the same Doh.js project state and Pods
Smart defaults: automatic detection and enablement (debug domains, connections, caching)
Built for AI+human collaboration inside modern IDEs (Cursor, Claude Code)

Quick Start

# Start the MCP server (SSE on port 4000 by default)
doh mcp

# PM2-managed (optional)
doh mcp-pm2 setup   # initial setup (creates process)
doh mcp-pm2 restart # restart the MCP server process

Endpoint: http://localhost:4000/sse (SSE transport)
Auto-installs editor configuration and rule files in .cursor/ and manages .gitignore entries

Capabilities at a Glance

Codebase tools: intelligent search across code and docs; structural outline; targeted line edits
Browser automation: navigate, execute JS, collect logs, prompt users — via a single persistent, headful instance
Unified inspector: one interface for browser and server debugging with automatic connection management
Cloud management: list/restart instances, run commands, retrieve logs, and sync files (local↔cloud and cross-instance)

Tool Categories and APIs

Each tool is exposed as an MCP method, validated with zod, returning structured, human-readable results.

Core Development Tools

search — Advanced codebase search (code + docs)
- Strategy: ripgrep-based code search + BM25/Vector documentation search, fused via dynamic RRF
- AST-aware snippets for higher signal results
- Order-insensitive multi-term fallback: requires all terms on a single line when specific/context hits are sparse (PCRE2 lookaheads)
- OR-mode fallback: matches any term, then de-noises by requiring ≥2 distinct terms within a small cross-line window
- Cross-line window: default ±2 lines; configurable via mcp.search.ripgrep.crossline_window_lines (alias: window_lines)
- Automatic PCRE2: advanced patterns automatically enable ripgrep --pcre2 when needed
grep_search — Fast exact regex search across the workspace
file_search — Fuzzy file/path finder
outline_code — AST-based file outline (functions/classes/exports/calls)
get_local_server_logs — Get recent logs from the local Doh server process
restart_local_server — Restart the local Doh server process via PM2

Browser Automation (Playwright)

Requires mcp.browser_tools_enabled: true.

browser_goto — Navigate to a URL; auto-launches a single persistent, headful browser instance
browser_execute_js — Run JS in page context; structured results and robust error capture
browser_get_console_logs — Retrieve recent console output with type filtering
browser_reload — Hard reload of current page
browser_prompt_user — Compact, draggable dialog for human collaboration
- Persistent state across navigations via ai_prompt_restoration
- Timeout gently defaults to “wait” (auto-extend) to preserve user progress

Notes

Single default browser instance; cookies/localStorage persist across navigations
System dependencies handled automatically; use bunx playwright install-deps if needed

Unified Inspector (Browser & Bun)

Requires mcp.browser_debugging_enabled: true and/or mcp.bun_inspector_tools_enabled: true.

A single, elegant interface for client and server debugging; set target: browser | bun per call
Smart auto-management: on-demand connections, automatic domain enablement (runtime/debugger/network/console), event buffering, heartbeats, reconnection
Browser target uses the default headful instance (no instance IDs)
Server target auto-detects Bun and connects to ws://localhost:6499/inspect

Core tools (prefix: inspector_)

evaluate, call_function, get_properties
set_breakpoint, remove_breakpoint
step (over/into/out), pause, resume
get_events (console/network/debugger), get_response_body
clear_event_buffer

Cloud Management

Requires mcp.cloud_tools_enabled: true, anchoring the local instance, and remote instances must explicitly opt-in with cloud.mcp_controllable: true.

Core tools

cloud_list_instances — Discover controllable instances you own
cloud_get_status — Detailed status for a single instance
cloud_restart_instances — Restart one or more instances with optional delay
cloud_get_logs — Tail recent logs from a remote instance
cloud_run_command — Execute a shell command with timeout/workingDir
cloud_sync_files_from_local_to_instances — Upload/download files or folders
cloud_sync_files_cross_instance — Server-side compare and sync between instances (fast, MD5-aware, with dry-run)

Security

Cloud anchor token must exist on the local machine
Only mcp_controllable: true instances are visible/controllable
All commands validated server-side and logged

Deep Tech Highlights (What makes this powerful)

Documentation Search Engine (hybrid BM25 + Vector)

Hierarchical chunking: Markdown is segmented by headings → paragraphs → long-paragraph splits, preserving semantic context and line ranges for precise citations.
BM25SearchEngine: Tuned boosts for headings vs. paragraphs; priority docs receive extra weight. Index offers per-chunk metadata for better ranking and formatting.
Local embeddings: Uses Transformers.js (Xenova/all-MiniLM-L6-v2) entirely on-device. Vectors are cached in SQLite with mtime freshness; loaded into an in-memory vector store at startup for low-latency queries.
Hybrid fusion: Combines BM25 ranks and cosine similarity via Reciprocal Rank Fusion (RRF) with dynamic K and weighted scoring; gracefully degrades if embeddings are disabled (embedder: noop).
Query intelligence: Symbol extraction, stopword handling, synonym expansion, filename variants, and intent classification (definition/usage/concept) guide search phases and limits.
Result quality: AST-aware snippets for code, deduplication by similarity, diversity filtering, and smart truncation to keep outputs high-signal.
Live updates: A chokidar-backed markdown watcher incrementally re-indexes changed files; stale vectors are removed and re-embedded in batches automatically.

Code Intelligence & Navigation

AST snippet generation: Acorn + acorn-walk identify functions/classes/methods/imports/exports/object shapes/calls to return stable, readable blocks (not arbitrary line ranges).
Outline generation: outline_code parses JS structure to quickly map large files (functions, classes, methods, exports, calls) with minimal cognitive load.
Fast text search: Ripgrep integration (@lvce-editor/ripgrep) honors .gitignore, supports glob/pattern filters, and is guarded by filesize limits to avoid runaway scans.
Inclusive multi-term matching: unordered all-terms (same line) and OR-mode with a small cross-line window; auto-enables PCRE2 for advanced patterns.

Reliability, Safety, and Performance

Protective wrappers: Timeouts (withTimeout, withBatchTimeout) and CircuitBreaker guard long or flaky operations; failures return actionable, human-friendly messages.
Error UX: Central MCPErrorHandler adds context-aware guidance, suggestions, and compact technical detail toggles to keep results helpful yet calm.
Connection resilience: SSE heartbeats keep sessions healthy; inspector connections auto-enable required domains and reconnect with exponential backoff.
Sensible limits: Dynamic ceilings for candidates, symbols, and results prevent noise and preserve responsiveness on huge repos.

Human-in-the-Loop Browser UX

Single persistent browser: One headful Chromium instance with a dedicated user data dir preserves cookies/localStorage for seamless workflows.
Log capture: Structured, timestamped console buffers with type filters (log/warn/error/info/debug) and safe truncation.
Collaborative prompts: Compact, draggable UI with countdown, notes, and isolated CSS; default timeout action is “wait” to avoid losing human progress.
Prompt restoration: The ai_prompt_restoration module revives interrupted prompts via localStorage and cleans up state post-completion.

Cloud Sync Intelligence

Always-fresh integrity: MD5 checks computed on demand (single or batch) ensure transfers only happen when needed.
Efficient transfers: Cross-instance sync executes close to the data; multi-file operations are batched and filtered with smart patterns.
Secure-by-default: Only instances with cloud.mcp_controllable: true are visible; every operation is validated and logged.

Configuration

Configure via Pod files (pod.yaml, boot.pod.yaml) and module defaults.

`mcp` pod section (common)

mcp:
  port: 4000
  search_engine: bm25           # docs: BM25 ranking
  embedder: local               # local | noop (disables embeddings)
  vector_store: inmemory
  read_whole_files: false

  # Browser
  browser_tools_enabled: true   # enable Playwright-powered tools
  browser_debugging_enabled: true  # enable browser inspector tools
  browser_devtools_port: 9222   # DevTools port for browser debugging
  browser_user_data_dir: "/.doh/browser_data"  # persistent profile

  # Inspector
  bun_inspector_tools_enabled: true  # enable Bun server inspector tools

  # Cloud
  cloud_tools_enabled: true     # enable cloud management tools

  # Error handling
  error_handling:
    show_technical_details: false
    include_error_codes: true
    max_error_length: 2000
    debug_mode: false
    auto_suggest_alternatives: true
    context_aware_guidance: true

  # Search tuning
  search:
    definition_results: 5
    instance_results: 10
    explanation_results: 5
    example_results: 5
    cache:
      default_duration: 30000
      pattern_duration: 60000
      complex_duration: 45000
      low_confidence_factor: 0.7
      cleanup_age_multiplier: 2
    thresholds:
      min_specific_matches_for_context_skip: 5
      min_results_before_enhanced_search: 10
      max_enhanced_results: 5000
      max_candidates_multiplier: 5
      min_candidates: 15
    boost:
      doc_boost_factor: 1.0
      heading_boost_factor: 1.5
      paragraph_boost_factor: 1.0
      priority_doc_boost: 2.0
      priority_explanation_multiplier: 2
      definition_heading_boost: 2.5
      usage_paragraph_boost: 1.5
      concept_heading_boost: 2.0
      rrfSymbolBoostFactor: 0.1
    rrf:
      default_k: 60
      score_weights:
        normalized_rrf: 0.4
        similarity: 0.6
    docs:
      fetch_multiplier: 5
      base_fetch_add: 10
    ripgrep:
      symbol_limit: 50
      pattern_limit: 30
      related_terms_limit: 40
      context_search_limit: 100
      enhanced_search_limit: 50
      filename_search_limit: 25
       crossline_window_lines: 2     # Lines above/below when de-noising OR-mode matches
       # Alias supported if the above is absent:
       window_lines: 2

  # Optional self-improvement helpers
  self_improve: false  # if true, enables get/restart MCP server helpers

  # Optional rule/hook management for editors
  manage_claude_md: true
  claude_managed_files:
    - .cursor/rules/doh-reference.mdc
    - .cursor/rules/codebase.mdc
  hooks:
    manage_hooks: true
    hook_files:
      - .cursor/hooks/doh-path-suggestions.json

`cloud` pod section (remote instances)

cloud:
  endpoint: http://localhost:3000   # or your cloud manager
  mcp_controllable: true             # required for remote control

How It Works (High Level)

Transport: HTTP SSE (/sse) for events; /messages?sessionId=... for requests
Code search: ripgrep symbol/context search + inclusive multi-term fallbacks (unordered all-terms, OR-mode with cross-line window) + AST-based snippet extraction
Doc search: BM25 ranking with vector enhancements (optional) and RRF fusion
Browser tooling: Playwright persistent contexts (cookies/localStorage preserved)
Unified inspector: automatic connections, domain enablement, and buffered events for both Bun and browser

Troubleshooting

Port in use: set mcp.port in your pod before doh mcp
Docs not found or stale: wait for initial indexing; it updates live on markdown changes
Slow or huge result sets: refine queries, use directory scoping, or rely on search intent-aware flow
Browser launch fails: run bunx playwright install-deps (or npx), or install listed system libraries
Cloud tools unavailable: ensure anchor token exists and restart MCP server; remote instances must set cloud.mcp_controllable: true
Multi-term queries feel strict: the search includes order-insensitive and OR-mode fallbacks. Tune mcp.search.thresholds.min_specific_matches_for_context_skip and mcp.search.ripgrep.crossline_window_lines to shift recall vs precision.
PCRE2 errors: lookahead-based patterns require ripgrep with PCRE2. The server auto-adds --pcre2 when needed; ensure your ripgrep build supports PCRE2 if errors appear.

Notes

The server manages .cursor/mcp.json and rule files automatically for editor integration
For PM2-managed workflows, use doh mcp-pm2 (supports start/stop/restart/log/inspect)
ai_prompt_restoration transparently restores in-page prompts after navigation and cleans its state to avoid residue

Doh MCP Server — Modern Developer Toolkit

Why MCP for Doh.js

Quick Start

Capabilities at a Glance

Tool Categories and APIs

Core Development Tools

Browser Automation (Playwright)

Unified Inspector (Browser & Bun)

Cloud Management

Deep Tech Highlights (What makes this powerful)

Documentation Search Engine (hybrid BM25 + Vector)

Code Intelligence & Navigation

Reliability, Safety, and Performance

Human-in-the-Loop Browser UX

Cloud Sync Intelligence

Configuration

mcp pod section (common)

cloud pod section (remote instances)

How It Works (High Level)

Troubleshooting

Notes

`mcp` pod section (common)

`cloud` pod section (remote instances)