Misar.io Documentation

Monitor provider health, rate limits, token usage, cost, latency, and full request traces in real time.

Overview

The Observability API gives you live visibility into how the backend is performing — circuit breaker states, rate limit headroom, per-model metrics, and full execution traces for debugging.

All endpoints require authentication:

Authorization: Bearer YOUR_API_KEY
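
As a minimal sketch of attaching this header from Python's standard library: the base URL below is a placeholder (the documentation does not state the real host), and `build_request` is a hypothetical helper name.

```python
import urllib.request

# Placeholder host: an assumption, not the documented base URL.
BASE_URL = "https://api.example.com"

def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the Bearer token header."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

req = build_request("/circuit-state", "YOUR_API_KEY")
# Passing req to urllib.request.urlopen would send the call.
```

The request object is only constructed here, so the sketch can be inspected without network access.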

Provider Health

Circuit Breaker State

GET /circuit-state

Returns the circuit breaker state for each configured AI provider.

{
  "providers": {
    "gemini": { "state": "closed", "failure_count": 0, "last_failure": null },
    "groq":   { "state": "open",   "failure_count": 5, "last_failure": "2026-03-21T14:55:00Z", "retry_after": "2026-03-21T15:00:00Z" },
    "mistral":{ "state": "half_open", "failure_count": 2, "last_failure": "2026-03-21T14:58:00Z" }
  }
}

| State | Meaning |
|-------|---------|
| closed | Provider healthy: requests flow normally |
| open | Provider failing: requests are blocked until retry_after |
| half_open | Testing recovery: one probe request allowed through |
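
One way a client might act on these states is sketched below. It treats only closed providers as routable and reports how long each open provider remains blocked. The function names are hypothetical, and leaving half_open providers out of both groups is an assumption of this sketch rather than documented behavior.

```python
from datetime import datetime, timezone

def parse_ts(value: str) -> datetime:
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # accepts it on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def classify(payload: dict, now: datetime) -> dict:
    """Split providers into routable ones and ones still blocked."""
    routable, waiting = [], {}
    for name, info in payload["providers"].items():
        state = info["state"]
        if state == "closed":
            routable.append(name)
        elif state == "open" and info.get("retry_after"):
            remaining = (parse_ts(info["retry_after"]) - now).total_seconds()
            waiting[name] = max(0.0, remaining)
        # half_open providers are skipped: only a probe request gets through.
    return {"routable": routable, "waiting_seconds": waiting}

sample = {
    "providers": {
        "gemini": {"state": "closed", "failure_count": 0, "last_failure": None},
        "groq": {"state": "open", "failure_count": 5,
                 "last_failure": "2026-03-21T14:55:00Z",
                 "retry_after": "2026-03-21T15:00:00Z"},
        "mistral": {"state": "half_open", "failure_count": 2,
                    "last_failure": "2026-03-21T14:58:00Z"},
    }
}
now = datetime(2026, 3, 21, 14, 57, tzinfo=timezone.utc)
result = classify(sample, now)
```

With the sample response and a clock three minutes before retry_after, only gemini is routable and groq is reported as blocked for 180 seconds.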

Rate Limit Status

GET /rate-status

Returns the remaining request and token quota for each provider, along with the time at which the current window resets.

{
  "providers": {
    "gemini": { "requests_remaining": 580, "tokens_remaining": 950000, "reset_at": "2026-03-21T15:00:00Z" },
    "groq":   { "requests_remaining": 28,  "tokens_remaining": 120000,  "reset_at": "2026-03-21T14:59:00Z" }
  }
}
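
A short sketch of flagging providers that are close to their limits, using the response shape above. The thresholds and the `low_headroom` name are illustrative assumptions; pick values that match your own traffic.

```python
def low_headroom(payload: dict, min_requests: int, min_tokens: int) -> list:
    """Return providers whose remaining quota is below either threshold."""
    flagged = []
    for name, info in payload["providers"].items():
        if (info["requests_remaining"] < min_requests
                or info["tokens_remaining"] < min_tokens):
            flagged.append(name)
    return flagged

rate_status = {
    "providers": {
        "gemini": {"requests_remaining": 580, "tokens_remaining": 950000,
                   "reset_at": "2026-03-21T15:00:00Z"},
        "groq": {"requests_remaining": 28, "tokens_remaining": 120000,
                 "reset_at": "2026-03-21T14:59:00Z"},
    }
}
flagged = low_headroom(rate_status, min_requests=50, min_tokens=100000)
```

Here groq is flagged because it has 28 requests left, below the illustrative threshold of 50.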

Combined Provider Status

GET /providers/status

Single endpoint combining circuit state and rate limits for all providers.

{
  "providers": {
    "gemini": {
      "circuit": "closed",
      "requests_remaining": 580,
      "tokens_remaining": 950000,
      "healthy": true
    }
  },
  "healthy_count": 5,
  "total_count": 7
}

Metrics

Aggregated Metrics

GET /metrics

Returns request volume, error rate, and latency percentiles for the reporting period, both overall and per model.

{
  "period": "1h",
  "total_requests": 1240,
  "error_rate": 0.012,
  "latency": {
    "p50_ms": 420,
    "p95_ms": 1840
  },
  "by_model": {
    "gemini-2.0-flash": { "requests": 830, "errors": 8, "p50_ms": 380 },
    "deepseek-chat":    { "requests": 410, "errors": 7, "p50_ms": 510 }
  }
}
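
The top-level error_rate can be broken down per model from the by_model counts. The sketch below assumes the rate is simply errors divided by requests, which is consistent with the sample (15 errors over 1240 requests is about 0.012); `error_rates` is a hypothetical helper.

```python
def error_rates(metrics: dict) -> dict:
    """Compute errors / requests for each model, skipping idle models."""
    rates = {}
    for model, stats in metrics["by_model"].items():
        if stats["requests"] > 0:
            rates[model] = stats["errors"] / stats["requests"]
    return rates

metrics = {
    "period": "1h",
    "total_requests": 1240,
    "error_rate": 0.012,
    "by_model": {
        "gemini-2.0-flash": {"requests": 830, "errors": 8, "p50_ms": 380},
        "deepseek-chat": {"requests": 410, "errors": 7, "p50_ms": 510},
    },
}
rates = error_rates(metrics)
worst = max(rates, key=rates.get)  # model with the highest error rate
```

In the sample, deepseek-chat has the higher per-model rate even though it reports fewer errors in absolute terms.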

Usage Dashboard

GET /usage

Real-time per-model usage breakdown including token counts, cost, and latency.

{
  "window": "24h",
  "models": [
    {
      "model": "gemini-2.0-flash",
      "requests": 2840,
      "input_tokens": 4200000,
      "output_tokens": 980000,
      "cost_usd": 1.24,
      "avg_latency_ms": 395
    }
  ],
  "totals": {
    "requests": 3100,
    "input_tokens": 4600000,
    "output_tokens": 1050000,
    "cost_usd": 1.87
  }
}
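
To see which models drive spend, each model's cost_usd can be compared against the totals. This is a minimal sketch over the response shape above; `cost_share` is a hypothetical helper name.

```python
def cost_share(usage: dict) -> dict:
    """Return each model's fraction of total cost in the window."""
    total = usage["totals"]["cost_usd"]
    if total <= 0:
        return {}  # nothing spent, so no meaningful shares
    return {m["model"]: m["cost_usd"] / total for m in usage["models"]}

usage = {
    "window": "24h",
    "models": [
        {"model": "gemini-2.0-flash", "requests": 2840,
         "input_tokens": 4200000, "output_tokens": 980000,
         "cost_usd": 1.24, "avg_latency_ms": 395},
    ],
    "totals": {"requests": 3100, "input_tokens": 4600000,
               "output_tokens": 1050000, "cost_usd": 1.87},
}
shares = cost_share(usage)
```

For the sample window, gemini-2.0-flash accounts for roughly two thirds of the total cost.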

Request History

GET /history?limit=20

Returns summaries of the most recent requests, up to the number given by limit (20 in the example above): model used, token counts, latency, and outcome.

{
  "requests": [
    {
      "request_id": "req_abc123",
      "model": "gemini-2.0-flash",
      "input_tokens": 1240,
      "output_tokens": 387,
      "latency_ms": 412,
      "status": "success",
      "timestamp": "2026-03-21T14:59:01Z"
    }
  ]
}
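
The history response lends itself to a quick filter for requests worth a closer look. The sketch below flags anything slower than a chosen latency budget or with a status other than "success". Since "success" is the only status value shown above, treating every other value as a failure is an assumption, and the second entry in the sample data is hypothetical.

```python
def needs_attention(history: dict, max_latency_ms: int) -> list:
    """Return request IDs that were slow or did not succeed."""
    flagged = []
    for entry in history["requests"]:
        slow = entry["latency_ms"] > max_latency_ms
        failed = entry["status"] != "success"
        if slow or failed:
            flagged.append(entry["request_id"])
    return flagged

history = {
    "requests": [
        {"request_id": "req_abc123", "model": "gemini-2.0-flash",
         "input_tokens": 1240, "output_tokens": 387, "latency_ms": 412,
         "status": "success", "timestamp": "2026-03-21T14:59:01Z"},
        # Hypothetical entry for illustration; "error" is not a documented status.
        {"request_id": "req_example", "model": "deepseek-chat",
         "input_tokens": 900, "output_tokens": 0, "latency_ms": 2100,
         "status": "error", "timestamp": "2026-03-21T14:59:05Z"},
    ]
}
flagged_ids = needs_attention(history, max_latency_ms=1500)
```

Each flagged request_id can then be passed to the trace endpoint for a full breakdown.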

Traces

Request Trace

GET /trace?request_id=req_abc123

Full execution DAG for a single request — every model call, tool call, retrieval, and ensemble vote recorded as a node.

{
  "request_id": "req_abc123",
  "duration_ms": 1240,
  "nodes": [
    { "id": "n1", "type": "retrieval",  "label": "RAG query",         "duration_ms": 18  },
    { "id": "n2", "type": "model_call", "label": "gemini-2.0-flash",  "duration_ms": 412, "parent": "n1" },
    { "id": "n3", "type": "tool_call",  "label": "read_file",         "duration_ms": 3,   "parent": "n2" },
    { "id": "n4", "type": "model_call", "label": "gemini-2.0-flash",  "duration_ms": 807, "parent": "n3" }
  ]
}
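
Two common questions about a trace are where the time went and how a given node was reached. The sketch below answers both from the nodes array, assuming each node has at most one parent as in the sample; both function names are hypothetical.

```python
from collections import defaultdict

def time_by_type(trace: dict) -> dict:
    """Sum node durations grouped by node type."""
    totals = defaultdict(int)
    for node in trace["nodes"]:
        totals[node["type"]] += node["duration_ms"]
    return dict(totals)

def path_to_root(trace: dict, node_id: str) -> list:
    """Follow parent links from a node back to the first node."""
    by_id = {n["id"]: n for n in trace["nodes"]}
    path = []
    current = by_id.get(node_id)
    while current is not None:
        path.append(current["id"])
        current = by_id.get(current.get("parent"))
    return list(reversed(path))

trace = {
    "request_id": "req_abc123",
    "duration_ms": 1240,
    "nodes": [
        {"id": "n1", "type": "retrieval", "label": "RAG query", "duration_ms": 18},
        {"id": "n2", "type": "model_call", "label": "gemini-2.0-flash",
         "duration_ms": 412, "parent": "n1"},
        {"id": "n3", "type": "tool_call", "label": "read_file",
         "duration_ms": 3, "parent": "n2"},
        {"id": "n4", "type": "model_call", "label": "gemini-2.0-flash",
         "duration_ms": 807, "parent": "n3"},
    ],
}
totals = time_by_type(trace)
slowest = max(trace["nodes"], key=lambda n: n["duration_ms"])["id"]
chain = path_to_root(trace, "n4")
```

In the sample trace the per-node durations add up to the request's duration_ms of 1240, and nearly all of that time sits in the two model calls.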

Recent Traces

GET /traces?limit=10

Summaries of the most recent request traces — useful for spotting slow or failed requests at a glance.

Model Catalog

Available Models

GET /models

Lists all configured models with current health status.

{
  "models": [
    { "id": "gemini-2.0-flash", "provider": "gemini", "healthy": true,  "latency_p50_ms": 380 },
    { "id": "deepseek-chat",    "provider": "deepseek","healthy": true,  "latency_p50_ms": 510 },
    { "id": "groq-llama",       "provider": "groq",   "healthy": false, "circuit": "open" }
  ]
}
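
A client could use this list to choose a model at request time. The sketch below picks the healthy model with the lowest latency_p50_ms. Skipping entries without a latency figure (as with the unhealthy groq-llama entry in the sample) is a choice made for this sketch, and `fastest_healthy` is a hypothetical name.

```python
from typing import Optional

def fastest_healthy(payload: dict) -> Optional[str]:
    """Return the ID of the healthy model with the lowest p50 latency."""
    candidates = [
        m for m in payload["models"]
        if m.get("healthy") and "latency_p50_ms" in m
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["latency_p50_ms"])["id"]

models = {
    "models": [
        {"id": "gemini-2.0-flash", "provider": "gemini",
         "healthy": True, "latency_p50_ms": 380},
        {"id": "deepseek-chat", "provider": "deepseek",
         "healthy": True, "latency_p50_ms": 510},
        {"id": "groq-llama", "provider": "groq",
         "healthy": False, "circuit": "open"},
    ]
}
choice = fastest_healthy(models)
```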

Full Catalog

GET /models/catalog

Complete model catalog with capability metadata.

{
  "models": [
    {
      "id": "gemini-2.0-flash",
      "provider": "gemini",
      "context_window": 1000000,
      "speed": "fast",
      "quality": "high",
      "cost_per_1m_input_tokens_usd": 0.10,
      "cost_per_1m_output_tokens_usd": 0.40,
      "supports_tools": true,
      "supports_vision": true
    }
  ]
}
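
The per-million-token prices in a catalog entry make it possible to estimate the cost of a call before sending it. The sketch below assumes cost scales linearly with token counts and ignores any other pricing factors the backend may apply; `estimate_cost_usd` is a hypothetical helper.

```python
def estimate_cost_usd(entry: dict, input_tokens: int, output_tokens: int) -> float:
    """Estimate cost from per-1M-token input and output prices."""
    input_cost = input_tokens / 1_000_000 * entry["cost_per_1m_input_tokens_usd"]
    output_cost = output_tokens / 1_000_000 * entry["cost_per_1m_output_tokens_usd"]
    return input_cost + output_cost

catalog_entry = {
    "id": "gemini-2.0-flash",
    "provider": "gemini",
    "context_window": 1000000,
    "cost_per_1m_input_tokens_usd": 0.10,
    "cost_per_1m_output_tokens_usd": 0.40,
}
# One million input tokens and half a million output tokens.
estimate = estimate_cost_usd(catalog_entry, 1_000_000, 500_000)
```

For this input the estimate comes to about 0.30 USD: 0.10 for input plus 0.20 for output.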

Poll /providers/status in your dashboard to surface provider degradation to users before they hit errors. Circuit breaker state changes are a leading indicator of upstream issues.
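
A polling dashboard mostly cares about transitions, so one approach is to compare each /providers/status response with the previous one and report only the providers whose circuit value changed. The sketch below shows that comparison step. The fetch-and-sleep loop around it is omitted, `circuit_changes` is a hypothetical name, and the second snapshot is invented for illustration.

```python
def circuit_changes(previous: dict, current: dict) -> dict:
    """Map provider name to (old_state, new_state) for changed circuits."""
    changes = {}
    old_providers = previous.get("providers", {})
    for name, info in current["providers"].items():
        old = old_providers.get(name, {}).get("circuit")
        new = info.get("circuit")
        if old != new:
            changes[name] = (old, new)
    return changes

first_poll = {"providers": {"gemini": {"circuit": "closed", "healthy": True}}}
# Hypothetical later snapshot in which the gemini circuit has opened.
second_poll = {"providers": {"gemini": {"circuit": "open", "healthy": False}}}
changes = circuit_changes(first_poll, second_poll)
```

A provider that appears for the first time is reported with None as its old state, which lets the caller decide whether to alert on it.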