API Reference

Complete guide to integrating the Decisor API into your application.

Quick Start

Decisor provides a unified API for accessing multiple AI providers. Use the same OpenAI-compatible format with intelligent routing, caching, and optimization.

Base URL: https://decisor.io/v1

Authentication

All API requests require authentication using a Bearer token in the Authorization header.

Authorization: Bearer sk_live_xxx

Get your API key from the API Keys dashboard.
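Every request carries the same two headers. A minimal helper for building them (the function name `buildHeaders` is ours for illustration, not part of any Decisor SDK; `sk_live_xxx` is a placeholder for your own key):

```javascript
// Build the headers required by every Decisor API request.
function buildHeaders(apiKey) {
  return {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${apiKey}`
  };
}

const headers = buildHeaders('sk_live_xxx');
```

Pass the resulting object as the `headers` option of `fetch` (see the full example under Code Examples).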

Chat Completions

POST /v1/chat/completions

Request Body

{
  "model": "auto",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "routing": {
    "prefer": "speed"
  }
}
model (required)

Recommended: Use "auto" for intelligent routing. We automatically select the best model and provider based on your request.

Advanced (Starter+ only): Specify a model explicitly (e.g., "gpt-4", "claude-3-opus"). Free plan users must use "auto".

messages (required)

Array of message objects with "role" (user/assistant/system) and "content".

temperature (optional)

Sampling temperature between 0 and 2. Default: 1

max_tokens (optional)

Maximum number of tokens to generate. Default: varies by model

routing (optional)

Object controlling routing behavior. Its fields are described below.

prefer (optional)

Routing preference: "speed", "quality", or "cost". Default: cost-optimized routing.

fallback (optional)

Enable automatic fallback to alternative providers if the primary provider fails. Default: true.
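Putting the parameters above together, a small helper can assemble a valid request body. This is a sketch (the `buildRequest` helper and its defaults are ours for illustration; only the field names come from the reference above):

```javascript
// Assemble a /v1/chat/completions request body with optional routing controls.
// Defaults mirror the documented ones: cost-optimized routing, fallback enabled.
function buildRequest(userContent, { prefer = 'cost', fallback = true } = {}) {
  return {
    model: 'auto',
    messages: [{ role: 'user', content: userContent }],
    routing: { prefer, fallback }
  };
}

const body = buildRequest('Hello, how are you?', { prefer: 'speed' });
```

`JSON.stringify(body)` is what you would send as the POST body.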

Response

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "llama-3.1-8b-instant",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! I'm doing well, thank you..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  },
  "_optimization": {
    "cached": false,
    "original_tokens": 15,
    "optimized_tokens": 10,
    "tokens_saved": 5
  },
  "_routing": {
    "provider": "groq",
    "original_model": "auto",
    "mapped_model": "llama-3.1-8b-instant",
    "reason": "Best speed for this request"
  }
}
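Beyond the standard OpenAI-style fields, the `_optimization` and `_routing` objects tell you what Decisor did with the request. A sketch of pulling them out (the `summarize` helper is ours; the field names match the response shown above):

```javascript
// Extract the answer plus Decisor-specific metadata from a response body.
function summarize(response) {
  const opt = response._optimization || {};
  const routing = response._routing || {};
  return {
    answer: response.choices[0].message.content,
    cached: Boolean(opt.cached),
    tokensSaved: opt.tokens_saved || 0,
    provider: routing.provider,
    model: routing.mapped_model
  };
}

// Sample response mirroring the one documented above.
const sample = {
  choices: [{ index: 0, message: { role: 'assistant', content: 'Hello!' }, finish_reason: 'stop' }],
  usage: { prompt_tokens: 10, completion_tokens: 20, total_tokens: 30 },
  _optimization: { cached: false, original_tokens: 15, optimized_tokens: 10, tokens_saved: 5 },
  _routing: { provider: 'groq', original_model: 'auto', mapped_model: 'llama-3.1-8b-instant', reason: 'Best speed for this request' }
};
const info = summarize(sample);
```

Logging `tokensSaved` and `cached` over time is a simple way to see what the optimization layer is doing for you.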

Routing Preferences

Control how requests are routed by specifying your preference for speed, quality, or cost.

By default, the system optimizes for cost. You can override this behavior by setting the routing preference.

Speed Priority

Prioritize fastest response times. Routes to Groq (fastest provider) when speed is preferred.

"prefer": "speed"

Quality Priority

Prioritize highest quality responses. Routes to providers with better models for complex tasks.

"prefer": "quality"

Cost Priority

Prioritize lowest cost. Routes to Together AI (cheapest provider) when cost is preferred. This is the default behavior.

"prefer": "cost"

Example Request:

{
  "model": "auto",
  "messages": [
    {"role": "user", "content": "Explain quantum computing"}
  ],
  "routing": {
    "prefer": "quality"
  }
}

Code Examples

const response = await fetch('https://decisor.io/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sk_live_xxx'
  },
  body: JSON.stringify({
    model: 'auto',
    messages: [
      { role: 'user', content: 'Hello, how are you?' }
    ],
    routing: {
      prefer: 'speed'  // 'speed', 'quality', or 'cost'
    }
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);

Available Models

Free Plan: Only "auto" is available. Intelligent routing automatically selects the best model for you.

Starter+ Plans: You can use "auto" or specify any model below for explicit control.

auto
Recommended

Intelligent routing - automatically selects the best provider and model based on your request. Recommended for all users.

llama-3.3-70b-versatile
Starter+

Groq Llama 3.3 70B - Fast inference with high quality (12 cents/1M tokens)

llama-3.1-8b-instant
Starter+

Groq Llama 3.1 8B - Fastest inference (12 cents/1M tokens)

meta-llama/Llama-3.2-3B-Instruct-Turbo
Starter+

Together AI Llama 3.2 3B - Most cost-effective (5.5 cents/1M tokens)

meta-llama/Llama-3.1-8B-Instruct
Starter+

Together AI Llama 3.1 8B - Good balance (10 cents/1M tokens)

accounts/fireworks/models/llama-v3p1-8b-instruct
Starter+

Fireworks AI Llama 3.1 8B - Fast and reliable (9 cents/1M tokens)

accounts/fireworks/models/llama-v3p3-70b-instruct
Starter+

Fireworks AI Llama 3.3 70B - High quality for complex tasks (9 cents/1M tokens)

Rate Limits

Rate limits are applied per API key to ensure fair usage and system stability.

Default: 60 requests per minute
Free tier: 200,000 tokens per month
Limits can be adjusted in your dashboard settings
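When you hit the 60 requests/minute limit, the usual client-side response is to wait before retrying, with the wait growing on each failure. A sketch of capped exponential backoff (the 1 s base and 30 s cap are our choices, not Decisor's):

```javascript
// Delay before retry attempt N: 1s, 2s, 4s, ... capped at 30s.
function backoffMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Call it as `await new Promise(r => setTimeout(r, backoffMs(attempt)))` inside your retry loop whenever the API returns 429.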

Error Handling

| Code | Status | Description |
| --- | --- | --- |
| invalid_api_key | 401 | The API key provided is invalid or missing |
| missing_model | 400 | The model parameter is required |
| invalid_messages | 400 | Messages must be a non-empty array with valid role and content |
| rate_limit_exceeded | 429 | Too many requests. Check your rate limit settings |
| token_limit_exceeded | 429 | Monthly token limit exceeded. Upgrade your plan |
| routing_error | 400 | Unable to route request to a provider |
| provider_error | 500 | Error communicating with the AI provider |
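Not every error deserves a retry: `rate_limit_exceeded` and `provider_error` are transient, while 400/401 errors and a spent monthly token budget need a code or plan change. A sketch of that split (the `isRetryable` helper is ours; the codes come from the table above):

```javascript
// Transient errors worth retrying vs. errors that need caller action.
function isRetryable(code) {
  return code === 'rate_limit_exceeded' || code === 'provider_error';
}
```

Combine this with a backoff delay for a robust request loop: retry only when `isRetryable(error.code)` is true, otherwise surface the error.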

Features

Intelligent Routing

Automatically routes requests to the best provider based on your preferences (speed, quality, cost).

Smart Caching

Automatic response caching reduces costs by serving cached responses for identical requests.

Prompt Optimization

Automatically optimizes your prompts to reduce token usage while maintaining quality.

OpenAI Compatible

Drop-in replacement for OpenAI API. No code changes needed - just update the base URL.

Ready to get started?

Create your free account and get 200,000 tokens to start building with Decisor API.