Misar IO Docs

RAG API

Index workspace files and retrieve relevant code chunks using hybrid BM25 + vector search powered by PostgreSQL and pgvector.

Overview

The RAG (Retrieval-Augmented Generation) API indexes your workspace files and retrieves the most relevant code chunks before each agent response. This gives the agent accurate, up-to-date knowledge of your codebase without consuming the full context window.

Architecture

Workspace files
      │
      ▼
tree-sitter chunker
  ├── L1: file-level summary
  ├── L2: symbol-level (functions, classes, types)
  └── L3: sub-block (branches, loops, expressions)
      │
      ▼
BGE-small embeddings  +  BM25 full-text index
      │                        │
      └──────── hybrid search ─┘
                     │
                     ▼
           ranked chunk results

Storage: PostgreSQL with the pgvector extension. Hybrid retrieval combines dense vector similarity with BM25 lexical scoring, so a query can match on meaning as well as on exact identifiers and keywords.

Index Files

POST /rag/index
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Request

{
  "workspace_id": "my-project",
  "files": [
    {
      "path": "src/auth/session.ts",
      "content": "import { cookies } from 'next/headers';\n..."
    },
    {
      "path": "src/lib/db.ts",
      "content": "import { createClient } from '@supabase/supabase-js';\n..."
    }
  ]
}

| Field | Type | Description |
|-------|------|-------------|
| workspace_id | string | Unique identifier for the workspace |
| files | array | Files to index. Each item has path (string) and content (string) |

Limits

| Limit | Value |
|-------|-------|
| Max files per request | 200 |
| Max file size | 512 KB |
| Max chunks per workspace | 10,000 |
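A client that indexes a large workspace has to respect these limits before sending anything. The following is a minimal sketch of one way to do that, not part of the API itself: `batch_files` is a hypothetical helper, and it assumes that "512 KB" means 512 * 1024 bytes measured on the UTF-8 encoded content.

```python
from __future__ import annotations

from typing import Iterable

MAX_FILES_PER_REQUEST = 200      # from the Limits table
MAX_FILE_BYTES = 512 * 1024      # assumption: 1 KB = 1024 bytes, UTF-8 encoded


def batch_files(files: Iterable[dict]) -> tuple[list[list[dict]], list[str]]:
    """Split files into request-sized batches.

    Returns (batches, skipped_paths). Files over the size limit are
    skipped rather than truncated, and their paths are reported back.
    """
    batches: list[list[dict]] = []
    skipped: list[str] = []
    current: list[dict] = []
    for f in files:
        if len(f["content"].encode("utf-8")) > MAX_FILE_BYTES:
            skipped.append(f["path"])
            continue
        current.append(f)
        if len(current) == MAX_FILES_PER_REQUEST:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches, skipped
```

Each returned batch can be sent as the `files` array of one `POST /rag/index` request. The per-workspace chunk limit cannot be checked this way, because the number of chunks a file produces is only known after indexing.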

Response

{
  "indexed": 2,
  "chunks_created": 47,
  "duration_ms": 312
}

Re-indexing a previously indexed file replaces its old chunks automatically, so you do not need to delete anything before re-indexing.
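As an illustration of the request shape described above, the sketch below builds and sends an index request using only the Python standard library. The base URL `https://api.example.com` is a placeholder, since this page does not state the API host, and the function names are hypothetical.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not given in this documentation.
BASE_URL = "https://api.example.com"


def build_index_request(api_key, workspace_id, files, base_url=BASE_URL):
    """Build (but do not send) a POST /rag/index request."""
    body = json.dumps({"workspace_id": workspace_id, "files": files})
    return urllib.request.Request(
        url=f"{base_url}/rag/index",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def index_files(api_key, workspace_id, files, base_url=BASE_URL):
    """Send the request and return the parsed JSON response."""
    req = build_index_request(api_key, workspace_id, files, base_url)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Because re-indexing replaces old chunks, a CI job could call `index_files` with only the files changed in a commit. Error handling (for example, catching `urllib.error.HTTPError`) is left out to keep the sketch short.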

Query

POST /rag/query
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Request

{
  "workspace_id": "my-project",
  "query": "how are user sessions validated",
  "mode": "hybrid",
  "limit": 10
}

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| workspace_id | string | (none) | Workspace to search |
| query | string | (none) | Natural language or code query |
| mode | string | "hybrid" | Retrieval mode: hybrid, vector, or lexical |
| limit | number | 10 | Maximum number of chunks to return |

Retrieval Modes

| Mode | Description | Best For |
|------|-------------|----------|
| hybrid | BM25 + vector, scores merged with RRF | General use; best accuracy |
| vector | Dense embedding similarity only | Semantic / conceptual queries |
| lexical | BM25 full-text only | Exact identifier or keyword search |
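RRF (reciprocal rank fusion) merges ranked lists by position rather than by raw score, which avoids having to normalize BM25 scores against vector similarities. The sketch below shows the general technique only; how the service implements it is not documented here, and the constant `k = 60` is a commonly used default that is assumed, not confirmed. The chunk IDs are hypothetical.

```python
from __future__ import annotations


def rrf_merge(
    rankings: list[list[str]], k: int = 60, limit: int = 10
) -> list[tuple[str, float]]:
    """Merge ranked lists of chunk IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per chunk, with rank starting
    at 1. Chunks that appear high in several lists accumulate the
    largest totals.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:limit]


# Hypothetical rankings from the two retrievers.
bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
merged = rrf_merge([bm25_ranking, vector_ranking])
```

Here chunk `b` ranks first in the merged list because it sits near the top of both rankings, while `c` and `d`, each found by only one retriever, fall below the chunks found by both.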

Response

{
  "chunks": [
    {
      "path": "src/auth/session.ts",
      "content": "export async function validateSession(token: string) {",
      "level": "L2",
      "score": 0.94,
      "start_line": 12,
      "end_line": 34
    }
  ],
  "query_ms": 18
}
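Custom tooling will typically turn the returned chunks into a text block that can be placed in a prompt. The following sketch shows one possible way to do that with the response fields above. `format_chunks` and its `max_chars` budget are hypothetical and not part of the API.

```python
def format_chunks(response: dict, max_chars: int = 4000) -> str:
    """Render query results as annotated snippets within a size budget.

    Chunks are taken in the order returned and the loop stops at the
    first chunk that would exceed max_chars.
    """
    parts = []
    used = 0
    for chunk in response.get("chunks", []):
        header = (
            f"# {chunk['path']}:{chunk['start_line']}-{chunk['end_line']} "
            f"({chunk['level']}, score {chunk['score']:.2f})"
        )
        block = f"{header}\n{chunk['content']}\n"
        if used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "\n".join(parts)


# The example response shown above.
sample = {
    "chunks": [
        {
            "path": "src/auth/session.ts",
            "content": "export async function validateSession(token: string) {",
            "level": "L2",
            "score": 0.94,
            "start_line": 12,
            "end_line": 34,
        }
    ],
    "query_ms": 18,
}
context = format_chunks(sample)
```

The header line keeps the file path and line range next to each snippet, so whatever consumes the context can point back to the source location.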

Status

POST /rag/status
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{ "workspace_id": "my-project" }

Response

{
  "workspace_id": "my-project",
  "chunk_count": 1842,
  "file_count": 63,
  "last_indexed": "2026-03-21T14:32:00Z",
  "index_size_kb": 4096
}

The VS Code extension handles indexing automatically when misar.autoContext is enabled. Use the API directly when building custom tooling or CI pipelines that need to pre-index large workspaces.