Infrastructure Layer

Intelligent AI Routing

One unified API. 15 frontier models. Automatic task-to-model matching based on capability scoring, tier, and budget constraints.

Three Routing Tiers

Economy: < $1 / M tokens

Fast, cheap inference for high-volume agent tasks: summaries, simple Q&A, structured extraction.

Llama 3.3 70B Instruct (OpenRouter free)
Step 3.5 Flash (OpenRouter free)
GPT-OSS 20B (OpenRouter free)
GPT-OSS 120B (direct OpenAI)
GPT-5 Nano (direct OpenAI)
Mid: $0.2–$2.5 / M tokens

Balanced capability for tool use, coding, and multi-step reasoning at reasonable cost.

GPT-5 Mini (direct OpenAI)
Gemini 2.5 Flash (direct Google)
DeepSeek V3.2 (direct DeepSeek)
Grok 4.1 Fast (direct xAI)
Kimi K2.5 (direct Moonshot)
Premium: $2–$25 / M tokens

Frontier models for complex reasoning, agentic workflows, long-context tasks, and highest-stakes outputs.

GPT-5 (direct OpenAI)
Claude Sonnet 4.6 (direct Anthropic)
Claude Opus 4.6 (direct Anthropic)
Gemini 3 Pro Preview (direct Google)
MiniMax M2.5 (direct MiniMax)

How It Works

Automatic Model Selection

Pass model: "auto" and the router scores your task across 8 capabilities, matches the task profile to each model's strength vector, and picks the best-fit model within your tier.

8-Dimension Capability Scoring

Every model has a strength vector across coding, reasoning, creative writing, structured output, long context, conversation, tool use, and multilingual. Routing is dot-product similarity.
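As a rough illustration of dot-product routing, the sketch below scores a task vector against each model's strength vector and keeps the highest-scoring model. The model IDs, the numeric strengths, and the helper names (`vec`, `dot`, `pickModel`) are hypothetical; the router's real vectors and tie-breaking rules are not documented here.

```typescript
// The eight dimensions named in the text, in a fixed order.
const DIMENSIONS = [
  'coding', 'reasoning', 'creativeWriting', 'structuredOutput',
  'longContext', 'conversation', 'toolUse', 'multilingual',
] as const;

type Vector = Record<(typeof DIMENSIONS)[number], number>;

interface ModelProfile {
  id: string;        // hypothetical model ID
  strengths: Vector; // hypothetical scores, assumed to lie in [0, 1]
}

// Build a full vector from a partial one, filling missing dimensions with 0.
function vec(partial: Partial<Vector>): Vector {
  const v = {} as Vector;
  for (const d of DIMENSIONS) v[d] = partial[d] ?? 0;
  return v;
}

// Dot product over the eight dimensions.
function dot(a: Vector, b: Vector): number {
  return DIMENSIONS.reduce((sum, d) => sum + a[d] * b[d], 0);
}

// Return the model whose strength vector has the largest dot product with the task.
function pickModel(task: Vector, models: ModelProfile[]): ModelProfile {
  if (models.length === 0) throw new Error('no candidate models');
  return models.reduce((best, m) =>
    dot(task, m.strengths) > dot(task, best.strengths) ? m : best);
}

const models: ModelProfile[] = [
  { id: 'hypothetical-coder', strengths: vec({ coding: 0.9, reasoning: 0.6, toolUse: 0.7 }) },
  { id: 'hypothetical-writer', strengths: vec({ creativeWriting: 0.9, conversation: 0.8, coding: 0.3 }) },
];

// A coding-heavy task that also needs some tool use.
const task = vec({ coding: 0.9, toolUse: 0.4 });
const chosen = pickModel(task, models);
```

A plain dot product rewards models with large strengths overall, so a production router might normalize the vectors first; whether this one does is not stated.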

Budget-Aware Routing

Set a budget cap on your agent and the router stays within the economy or mid tier. Premium models are selected only when the task score exceeds your tier threshold.
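One way to picture the tier gate, under stated assumptions: the task score is a single number and the threshold is a per-agent setting. The names `BudgetPolicy`, `premiumThreshold`, and `allowedTiers` are hypothetical and only illustrate the rule described above.

```typescript
type Tier = 'economy' | 'mid' | 'premium';

interface BudgetPolicy {
  // Hypothetical setting: task score above which premium models become eligible.
  premiumThreshold: number;
}

// Economy and mid are always eligible; premium only when the score exceeds the threshold.
function allowedTiers(taskScore: number, policy: BudgetPolicy): Tier[] {
  const tiers: Tier[] = ['economy', 'mid'];
  if (taskScore > policy.premiumThreshold) tiers.push('premium');
  return tiers;
}

const policy: BudgetPolicy = { premiumThreshold: 0.8 };
const routineTiers = allowedTiers(0.5, policy);
const demandingTiers = allowedTiers(0.9, policy);
```

With this policy, a task scoring 0.5 stays in the economy and mid tiers, while a task scoring 0.9 may also reach premium models.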

Direct Model Pinning

Need a specific model? Pin it directly with model: "claude-sonnet-4-6" or any supported model ID. The router passes through pinned requests without scoring overhead.

15 Models, 8 Providers

OpenAI, Anthropic, Google, DeepSeek, xAI, Moonshot, MiniMax, and OpenRouter — all behind one unified OpenAI-compatible API. Switch providers without changing your agent code.

Streaming + Tool Use

Full support for streaming responses and function calling across all compatible models. The router respects each model's tool-use capability flag and never routes a tool-use task to a model that does not support function calling.
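The capability check can be sketched as a filter applied before scoring. The `supportsTools` flag and the `eligibleModels` helper are hypothetical names; the `tools` field mirrors the OpenAI chat completions request shape.

```typescript
interface Candidate {
  id: string;             // hypothetical model ID
  supportsTools: boolean; // hypothetical capability flag
}

interface RoutedRequest {
  tools?: unknown[]; // function definitions, as in an OpenAI-style request
}

// Drop models without tool support whenever the request declares tools.
function eligibleModels(req: RoutedRequest, candidates: Candidate[]): Candidate[] {
  const needsTools = (req.tools?.length ?? 0) > 0;
  return needsTools ? candidates.filter((c) => c.supportsTools) : candidates;
}

const candidates: Candidate[] = [
  { id: 'hypothetical-tools-model', supportsTools: true },
  { id: 'hypothetical-text-only-model', supportsTools: false },
];

const forToolCall = eligibleModels({ tools: [{ type: 'function' }] }, candidates);
const forPlainChat = eligibleModels({}, candidates);
```

Filtering before scoring means a strong match on other dimensions can never override a missing capability.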

8 Task Dimensions

The router classifies every prompt across these dimensions, then finds the model whose strength vector best matches the task profile.

Coding
Reasoning
Creative Writing
Structured Output
Long Context
Conversation
Tool Use
Multilingual

API Surface

Drop-in OpenAI-compatible. Works with any SDK that targets the OpenAI API.

POST /v1/chat/completions
OpenAI-compatible chat completions with auto-routing

POST /v1/routing/models
List all available models with tier, pricing, and capabilities

POST /v1/routing/select
Preview which model would be selected for a given prompt + tier

GET /v1/routing/keys
List routing API keys (requires auth)
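A minimal sketch of preparing a preview call to /v1/routing/select. The host comes from the Quick Start base URL; the JSON field names (`prompt`, `tier`) and the Bearer authorization header are assumptions, since the request schema is not given here. The helper `buildSelectPreview` is hypothetical and only builds the request without sending it.

```typescript
interface SelectPreview {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Build (but do not send) a preview request for the routing decision.
function buildSelectPreview(apiKey: string, prompt: string, tier: string): SelectPreview {
  return {
    url: 'https://api.swarmsync.ai/v1/routing/select',
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify({ prompt, tier }), // assumed field names
    },
  };
}

const preview = buildSelectPreview('test-key', 'Summarize this changelog', 'economy');
// To send it: const res = await fetch(preview.url, preview.init);
```

Previewing the selection first is a cheap way to check which model a prompt would land on before spending tokens on the completion itself.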

Quick Start

Use any OpenAI-compatible client. Set model: "auto" to enable intelligent routing.

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.swarmsync.ai/v1',
  apiKey: process.env.SWARMSYNC_ROUTING_KEY,
});

const response = await client.chat.completions.create({
  model: 'auto',          // auto-routes to best model
  // model: 'economy',    // cap to economy tier
  // model: 'premium',    // force premium models only
  messages: [{ role: 'user', content: 'Your prompt here' }],
});

One API Key. Every Frontier Model.

Get a routing key from your console settings and start calling 15 models through a single OpenAI-compatible endpoint.