Introduction
Choose the right Ando path for an agent that needs inference.
Agents use Ando when model calls need a clear boundary: who can call the model, which account or payment rail carries the request, and how usage is kept visible after the response returns.
The agent can either belong to an Ando account or pay at request time with Tempo MPP. Keep those paths separate. A Virtual Key request uses bearer auth. A Tempo MPP request uses a payment challenge and a request-bound payment credential.

Agents should stay legible: one route, one credential shape, one usage trail, and no hidden spend.
Choose the agent path
Best when the agent belongs to a user, team, app, or managed workflow with an Ando account and spend controls. Read Virtual Keys.
Best when the agent has a Tempo-compatible wallet and should pay for inference at request time. Read Tempo MPP.
Use the account-owned path when an operator should see usage inside Ando, apply connection limits, and manage monthly or connection-level spend caps.
Use Tempo MPP when the agent should not hold an Ando API key, does not belong to an Ando account, or needs a portable payment rail for accountless model access.
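The choice above can be sketched as a small decision function. This is an illustration only: AgentProfile, its fields, and chooseAgentPath are hypothetical names, not part of any Ando SDK.

```typescript
// Hypothetical description of an agent; the field names are illustrative.
interface AgentProfile {
  belongsToAndoAccount: boolean; // owned by a user, team, app, or managed workflow
  mayHoldAndoKey: boolean;       // the runtime is allowed to store a Virtual Key
  hasTempoWallet: boolean;       // a Tempo-compatible wallet is available
}

type AgentPath = "virtual-key" | "tempo-mpp";

// Prefer the account-owned path when it applies; otherwise fall back to
// request-time payment. The two paths are never combined.
function chooseAgentPath(agent: AgentProfile): AgentPath {
  if (agent.belongsToAndoAccount && agent.mayHoldAndoKey) {
    return "virtual-key";
  }
  if (agent.hasTempoWallet) {
    return "tempo-mpp";
  }
  throw new Error("No usable path: the agent needs an Ando account or a Tempo-compatible wallet");
}
```

Resolving the path once at startup, rather than per request, keeps the agent on one route with one credential shape.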
Account-owned agent API
Account-owned agents call the standard OpenAI-compatible routes with a Virtual Key. Create the connection in Ando, reveal the Virtual Key only when the agent runtime is ready, then store it as a server-side secret.
Authorization: Bearer <your-virtual-key>
POST /v1/chat/completions
GET /v1/models returns the models allowed for that key.
curl https://inference.andoai.xyz/v1/chat/completions \
  -H "Authorization: Bearer $ANDO_VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-8B-AWQ",
    "messages": [
      {"role": "user", "content": "Give me one calm setup step for this agent."}
    ],
    "max_tokens": 96,
    "temperature": 0.2
  }'

OpenAI-compatible SDKs can use the same base URL:
import OpenAI from "openai";

// Point the OpenAI SDK at Ando and authenticate with the Virtual Key.
const ando = new OpenAI({
  apiKey: process.env.ANDO_VIRTUAL_KEY,
  baseURL: "https://inference.andoai.xyz/v1",
});

// Cap output tokens so spend and latency stay predictable.
const response = await ando.chat.completions.create({
  model: "Qwen/Qwen3-8B-AWQ",
  messages: [
    { role: "user", content: "Write a short handoff note for the next agent." },
  ],
  max_tokens: 128,
});

console.log(response.choices[0]?.message?.content);

Use /v1/chat/completions for ordinary OpenAI-compatible chat clients. Use /v1/responses for the buffered text Responses subset and /v1/think for the Ando reasoning route. Call /v1/models before hard-coding a model in an agent.
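A minimal sketch of checking /v1/models before committing to a model. It assumes the route returns the OpenAI-style list shape ({ data: [{ id }] }), which this page does not spell out. listAllowedModels, pickModel, and the HttpGet type are hypothetical helpers, not Ando APIs.

```typescript
// Minimal response surface, declared locally so the sketch needs no DOM typings.
interface ModelsResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpGet = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<ModelsResponse>;

// Assumption: GET /v1/models answers with { data: [{ id: "..." }, ...] }.
async function listAllowedModels(
  baseUrl: string,
  virtualKey: string,
  httpGet: HttpGet
): Promise<string[]> {
  const res = await httpGet(`${baseUrl}/models`, {
    headers: { Authorization: `Bearer ${virtualKey}` },
  });
  if (!res.ok) {
    throw new Error(`Model listing failed with status ${res.status}`);
  }
  const payload = (await res.json()) as { data?: Array<{ id: string }> };
  return (payload.data ?? []).map((m) => m.id);
}

// Return the first preferred model that the key is allowed to use.
function pickModel(preferred: string[], allowed: string[]): string {
  const match = preferred.find((id) => allowed.indexOf(id) !== -1);
  if (match === undefined) {
    throw new Error("None of the preferred models is allowed for this key");
  }
  return match;
}
```

In a runtime with a global fetch, the call could look like listAllowedModels("https://inference.andoai.xyz/v1", key, fetch), followed by pickModel(["Qwen/Qwen3-8B-AWQ"], allowed). Failing fast here keeps a disallowed model from surfacing later as a request error.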
Accountless paid agents
Tempo MPP is the accountless path. The agent sends the same inference request
to the MPP endpoint, receives a 402 Payment Required challenge, pays with a
Tempo session credential, then retries the exact same request body.
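The challenge-and-retry flow can be sketched as below. This is an assumption-laden illustration, not the MPP client: the endpoint URL is passed in by the caller, the challenge is assumed to arrive in a WWW-Authenticate header, and payChallenge stands in for whatever wallet or Tempo tooling produces the credential. The MPP API Reference defines the real challenge format.

```typescript
// Minimal response surface, declared locally so the sketch needs no DOM typings.
interface MppResponse {
  status: number;
  headers: { get(name: string): string | null };
}

type HttpPost = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string }
) => Promise<MppResponse>;

// Hypothetical: turns a payment challenge into a request-bound credential.
type PayChallenge = (challenge: string) => Promise<string>;

async function postWithPayment(
  url: string,
  payload: unknown,
  post: HttpPost,
  payChallenge: PayChallenge
): Promise<MppResponse> {
  // Serialize once so the paid retry sends byte-for-byte the same body.
  const body = JSON.stringify(payload);
  const baseHeaders = { "Content-Type": "application/json" };

  const first = await post(url, { method: "POST", headers: baseHeaders, body });
  if (first.status !== 402) {
    return first; // No payment was requested.
  }

  // Assumption: the challenge is carried in the WWW-Authenticate header.
  const challenge = first.headers.get("WWW-Authenticate");
  if (!challenge) {
    throw new Error("402 response did not include a payment challenge");
  }

  const credential = await payChallenge(challenge);
  return post(url, {
    method: "POST",
    headers: { ...baseHeaders, Authorization: `Payment ${credential}` },
    body,
  });
}
```

Reusing the serialized string, instead of re-serializing the payload, is the simplest way to guarantee that the retry matches the request that received the challenge.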
Set up Tempo CLI and call Ando with request-time payment.
Agent Skills: Give agents a compact operating contract for both auth modes.
MPP API Reference: Read request fields, payment challenge behavior, and error codes.
Operating rules
Never send a Virtual Key to the MPP endpoint, and never send Authorization: Payment to bearer-token routes.
Agents should set max_tokens or max_completion_tokens so spend and latency stay predictable.
Do not expose Virtual Keys, payment credentials, signatures, wallet addresses, receipts, or prompt content in logs.
For Tempo MPP, paid retries must use the same body that received the payment challenge.
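A few of these rules can be enforced in code rather than by convention. The helpers below are a sketch under stated assumptions: authorizationFor, ensureTokenCap, redactForLog, and the SENSITIVE_KEYS list are hypothetical, and the key names should be adapted to the fields an agent actually logs.

```typescript
type AuthMode = "virtual-key" | "tempo-mpp";

// One credential shape per path: Bearer for Virtual Keys, Payment for Tempo MPP.
function authorizationFor(mode: AuthMode, secret: string): string {
  return mode === "virtual-key" ? `Bearer ${secret}` : `Payment ${secret}`;
}

// Add a default cap only when the request sets neither token limit.
function ensureTokenCap<T extends { max_tokens?: number; max_completion_tokens?: number }>(
  request: T,
  defaultCap: number
): T {
  if (request.max_tokens !== undefined || request.max_completion_tokens !== undefined) {
    return request;
  }
  return { ...request, max_tokens: defaultCap };
}

// Hypothetical key names (lowercase) whose values should never reach logs.
const SENSITIVE_KEYS = new Set([
  "authorization", "apikey", "api_key", "credential", "signature",
  "wallet", "walletaddress", "receipt", "messages", "prompt", "content",
]);

// Return a deep copy of the value with sensitive fields replaced.
function redactForLog(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redactForLog);
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[redacted]" : redactForLog(source[key]);
    }
    return out;
  }
  return value;
}
```

Routing every log call through redactForLog and every outgoing request through ensureTokenCap makes the rules hard to skip by accident.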