Introduction
Infiner provides a unified API to access 50+ AI models with industry-leading latency and 99.99% uptime. Our intelligent routing automatically selects the best model for your use case.
12ms Latency
Average response time across all models
50+ Models
Access GPT-4, Claude, Llama, and more
One API Key
Single key for all model providers
Quickstart
Get started with Infiner in under 5 minutes. Install our SDK and make your first API call.
1. Install the SDK
npm install @infiner/sdk
2. Initialize the client
import { Infiner } from '@infiner/sdk'

const infiner = new Infiner({ apiKey: 'YOUR_API_KEY' })
3. Make your first request
const response = await infiner.chat.completions.create({
  model: 'gpt-4-turbo',
  messages: [
    { role: 'user', content: 'Hello, how are you?' }
  ]
})

console.log(response.choices[0].message.content)
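The final line of the request example assumes the response always contains at least one choice with text content. A slightly more defensive sketch is shown below. `extractReply` is a hypothetical helper, not part of the SDK, and the sample object is hand-written to mirror the OpenAI-style response shape this guide describes.

```javascript
// Pull the assistant's reply out of a chat completion, failing loudly
// if the expected fields are missing. Hypothetical helper, not an SDK call.
function extractReply(response) {
  const choice = response?.choices?.[0]
  if (!choice || typeof choice.message?.content !== 'string') {
    throw new Error('Completion contained no text reply')
  }
  return {
    text: choice.message.content,
    finishReason: choice.finish_reason ?? null
  }
}

// Hand-written object standing in for a real API response.
const sampleResponse = {
  choices: [
    {
      index: 0,
      message: { role: 'assistant', content: 'Hi there!' },
      finish_reason: 'stop'
    }
  ]
}

console.log(extractReply(sampleResponse).text)
```

Throwing on a missing reply keeps an empty or malformed response from surfacing later as an `undefined` string.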
Authentication
All API requests require an API key. You can create and manage API keys in your dashboard.
Using your API key
Include your API key in the Authorization header:
curl https://api.infiner.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Chat Completions
Create chat completions using any supported model. The API is compatible with the OpenAI format.
Request
const response = await infiner.chat.completions.create({
  model: 'claude-3-opus',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing.' }
  ],
  temperature: 0.7,
  max_tokens: 1000
})
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703654321,
  "model": "claude-3-opus",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
Embeddings
Generate vector embeddings for text using models like text-embedding-3-large.
const embedding = await infiner.embeddings.create({
  model: 'text-embedding-3-large',
  input: 'The quick brown fox jumps over the lazy dog'
})

console.log(embedding.data[0].embedding) // [0.0023, -0.0045, ...]
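Embedding vectors are usually compared rather than read directly. As an illustration, the sketch below computes cosine similarity between two vectors. `cosineSimilarity` is an illustrative helper, not an SDK function, and the inputs are toy vectors rather than real embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Illustrative helper; not part of the Infiner SDK.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector')
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Toy vectors standing in for real embeddings.
console.log(cosineSimilarity([1, 0], [1, 0])) // 1
console.log(cosineSimilarity([1, 0], [0, 1])) // 0
```

Values near 1 indicate vectors pointing in the same direction, and values near 0 indicate unrelated directions.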
Models
List available models and retrieve model information.
// List all available models
const models = await infiner.models.list()

// Get specific model info
const model = await infiner.models.retrieve('gpt-4-turbo')
console.log(model.context_length) // 128000
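One use for `context_length` is checking a request before sending it. The sketch below assumes the context window covers prompt tokens and completion tokens together, which is common for chat models but is not stated in this guide. `fitsContext` is a hypothetical helper, and the token counts must come from your own tokenizer.

```javascript
// Check that a prompt plus the requested completion budget fits a context window.
// Assumption: the window is shared by prompt and completion tokens.
// Hypothetical helper; not part of the Infiner SDK.
function fitsContext(contextLength, promptTokens, maxTokens) {
  for (const n of [contextLength, promptTokens, maxTokens]) {
    if (!Number.isInteger(n) || n < 0) {
      throw new Error('Token counts must be non-negative integers')
    }
  }
  return promptTokens + maxTokens <= contextLength
}

// 128000 is the context_length reported for gpt-4-turbo in the example above.
console.log(fitsContext(128000, 25, 1000))     // true
console.log(fitsContext(128000, 127500, 1000)) // false
```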
Model Routing
Let Infiner automatically select the best model for your request based on cost, latency, and capability.
const response = await infiner.chat.completions.create({
  model: 'auto', // Infiner selects the best model
  messages: [
    { role: 'user', content: 'Write a haiku about coding' }
  ],
  routing: {
    optimize: 'cost', // 'cost' | 'latency' | 'quality'
    fallback: ['gpt-4-turbo', 'claude-3-opus']
  }
})
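Since `optimize` accepts only three values, it can help to validate routing options before a request is sent. The sketch below builds the request body shown above. `buildAutoRequest` is a hypothetical helper, and how the service itself treats an invalid value is not documented here.

```javascript
// Allowed values for routing.optimize, taken from the comment in the example above.
const OPTIMIZE_MODES = ['cost', 'latency', 'quality']

// Build a request body for model: 'auto', rejecting unknown optimize modes
// and malformed fallback lists. Hypothetical helper; not part of the SDK.
function buildAutoRequest(messages, optimize, fallback = []) {
  if (!OPTIMIZE_MODES.includes(optimize)) {
    throw new Error(`optimize must be one of: ${OPTIMIZE_MODES.join(', ')}`)
  }
  const validFallback =
    Array.isArray(fallback) &&
    fallback.every((name) => typeof name === 'string' && name !== '')
  if (!validFallback) {
    throw new Error('fallback must be an array of model names')
  }
  return {
    model: 'auto',
    messages,
    routing: { optimize, fallback: [...fallback] }
  }
}

const autoRequest = buildAutoRequest(
  [{ role: 'user', content: 'Write a haiku about coding' }],
  'cost',
  ['gpt-4-turbo', 'claude-3-opus']
)
console.log(JSON.stringify(autoRequest.routing))
```

The returned object can be passed directly to `infiner.chat.completions.create`.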
Streaming
Stream responses token by token for real-time user experiences.
const stream = await infiner.chat.completions.create({
  model: 'gpt-4-turbo',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
})

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '')
}
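Some callers also need the complete text once streaming finishes, for example to store it. The sketch below accumulates the same `delta.content` fields into one string. `collectStream` is a hypothetical helper, and `fakeStream` is a local stand-in for a real stream so the example runs without a network call.

```javascript
// Accumulate streamed chunks into the full reply text.
// Hypothetical helper; works with any async iterable of OpenAI-style chunks.
async function collectStream(stream) {
  let text = ''
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content || ''
  }
  return text
}

// Local stand-in for a real stream, for illustration only.
async function* fakeStream() {
  yield { choices: [{ delta: { content: 'Once ' } }] }
  yield { choices: [{ delta: { content: 'upon a time' } }] }
  yield { choices: [{ delta: {} }] } // a chunk without content is skipped
}

collectStream(fakeStream()).then((text) => console.log(text))
```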
Function Calling
Enable models to call functions and interact with external tools.
const response = await infiner.chat.completions.create({
  model: 'gpt-4-turbo',
  messages: [{ role: 'user', content: 'What is the weather in Tokyo?' }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get current weather for a location',
        parameters: {
          type: 'object',
          properties: {
            location: { type: 'string', description: 'City name' }
          },
          required: ['location']
        }
      }
    }
  ]
})
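The request above only declares the tool. The application still has to run it when the model asks. The sketch below assumes responses follow the OpenAI tool-call shape (`message.tool_calls`, with `function.arguments` as a JSON string), which this guide implies through its compatibility claim but does not show. `runToolCalls`, `toolHandlers`, and the stubbed weather result are all hypothetical.

```javascript
// Local implementations keyed by tool name. The weather result is a stub.
const toolHandlers = {
  get_weather: ({ location }) => ({ location, forecast: 'unknown (stub)' })
}

// Run each requested tool call and build the 'tool' messages to send back.
// Assumes the OpenAI-style tool_calls shape; hypothetical helper.
function runToolCalls(message, handlers) {
  const calls = message.tool_calls ?? []
  return calls.map((call) => {
    const name = call.function.name
    if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
      throw new Error(`No handler registered for tool: ${name}`)
    }
    const args = JSON.parse(call.function.arguments)
    return {
      role: 'tool',
      tool_call_id: call.id,
      content: JSON.stringify(handlers[name](args))
    }
  })
}

// Hand-written assistant message standing in for a real response.
const assistantMessage = {
  role: 'assistant',
  content: null,
  tool_calls: [
    {
      id: 'call_1',
      type: 'function',
      function: { name: 'get_weather', arguments: '{"location":"Tokyo"}' }
    }
  ]
}

console.log(runToolCalls(assistantMessage, toolHandlers))
```

In the OpenAI format, the resulting messages are appended to the conversation after the assistant message, and a second request lets the model produce its final answer.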
API Keys
Manage your API keys securely. Never expose keys in client-side code.
Security Warning: Keep your API keys secure. Do not share them or expose them in browser code.
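On a server, one way to follow this advice is to read the key from the environment at startup and fail fast if it is missing. The sketch below assumes the key is stored in an environment variable named `INFINER_API_KEY`. `loadApiKey` and `authHeaders` are hypothetical helpers, and the key in the demo is a placeholder.

```javascript
// Read the API key from an environment-like object, failing fast if absent.
// Hypothetical helper; defaults to the real process environment.
function loadApiKey(env = process.env) {
  const key = env.INFINER_API_KEY
  if (!key) {
    throw new Error('INFINER_API_KEY is not set')
  }
  return key
}

// Build the headers used by the curl example in the Authentication section.
function authHeaders(apiKey) {
  return {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  }
}

// Demo with an explicit object so the example runs without a real key.
const demoHeaders = authHeaders(loadApiKey({ INFINER_API_KEY: 'placeholder-key' }))
console.log(Object.keys(demoHeaders))
```

Passing the environment in as a parameter keeps the helper easy to test and keeps the key out of source code.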
# Store your API key in environment variables
export INFINER_API_KEY="inf_sk_..."
Rate Limits
Rate limits vary by plan. Check response headers for your current usage.
| Plan | Requests/min | Tokens/min |
|---|---|---|
| Developer | 60 | 40,000 |
| Pro | 500 | 200,000 |
| Enterprise | Unlimited | Custom |
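A client can use the rate limit response headers to decide how long to wait before retrying. The sketch below reads `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset`. It assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this guide does not state. `readRateLimit` and `msUntilRetry` are hypothetical helpers.

```javascript
// Parse rate limit headers from a plain object, ignoring header-name case.
// Missing or non-numeric values become null. Hypothetical helper.
function readRateLimit(headers) {
  const lower = {}
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value
  }
  const num = (name) => {
    const n = Number(lower[name])
    return Number.isFinite(n) ? n : null
  }
  return {
    limit: num('x-ratelimit-limit'),
    remaining: num('x-ratelimit-remaining'),
    reset: num('x-ratelimit-reset')
  }
}

// Milliseconds to wait before the next request; 0 if quota remains.
// Assumption: reset is a Unix timestamp in seconds.
function msUntilRetry(headers, nowMs = Date.now()) {
  const { remaining, reset } = readRateLimit(headers)
  if (remaining === null || reset === null || remaining > 0) return 0
  return Math.max(0, reset * 1000 - nowMs)
}

// Quota exhausted, 60 seconds before the reset time.
const exhausted = {
  'X-RateLimit-Limit': '60',
  'X-RateLimit-Remaining': '0',
  'X-RateLimit-Reset': '1703654400'
}
console.log(msUntilRetry(exhausted, 1703654340 * 1000)) // 60000
```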
Rate limit headers
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1703654400
Questions?
Contact our team