Introduction

Choose the right Ando path for an agent that needs inference.

Agents use Ando when model calls need a clear boundary: who can call the model, which account or payment rail carries the request, and how usage is kept visible after the response returns.

The agent can either belong to an Ando account or pay at request time with Tempo MPP. Keep those paths separate. A Virtual Key request uses bearer auth. A Tempo MPP request uses a payment challenge and a request-bound payment credential.

Present, together.

Agents should stay legible: one route, one credential shape, one usage trail, and no hidden spend.

Choose the agent path

Account-owned: Use an Ando account directly

Best when the agent belongs to a user, team, app, or managed workflow with an Ando account and spend controls.

Read Virtual Keys

Accountless: Use Tempo MPP

Best when the agent has a Tempo-compatible wallet and should pay for inference at request time.

Read Tempo MPP

Use the account-owned path when an operator should see usage inside Ando, apply connection limits, and manage monthly or connection-level spend caps.

Use Tempo MPP when the agent should not hold an Ando API key, does not belong to an Ando account, or needs a portable payment rail for accountless model access.
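The choice between the two paths can be sketched as a small function. This is only an illustration: the `AgentProfile` fields and the `AgentPath` labels are hypothetical names, not part of any Ando API, and preferring the account-owned path when both are possible is an assumption of this sketch.

```typescript
// Hypothetical description of an agent; field names are illustrative only.
interface AgentProfile {
  hasAndoAccount: boolean; // belongs to a user, team, app, or workflow with an Ando account
  mayHoldApiKey: boolean;  // runtime is allowed to store a Virtual Key as a secret
  hasTempoWallet: boolean; // controls a Tempo-compatible wallet
}

type AgentPath = "virtual-key" | "tempo-mpp" | "none";

function chooseAgentPath(agent: AgentProfile): AgentPath {
  // Account-owned: the operator sees usage in Ando and manages spend caps.
  if (agent.hasAndoAccount && agent.mayHoldApiKey) return "virtual-key";
  // Accountless: the agent pays at request time through Tempo MPP.
  if (agent.hasTempoWallet) return "tempo-mpp";
  // Neither path is available to this agent.
  return "none";
}
```

Making the decision once, at startup, keeps the agent on a single credential shape for its whole lifetime.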

Account-owned agent API

Account-owned agents call the standard OpenAI-compatible routes with a Virtual Key. Create the connection in Ando, reveal the Virtual Key only when the agent runtime is ready, then store it as a server-side secret.
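One way to handle the server-side secret is to build the bearer headers from the environment and fail fast when the key is missing. This is a minimal sketch: `bearerHeaders` is a hypothetical helper, and `ANDO_VIRTUAL_KEY` is simply the variable name used in the examples on this page.

```typescript
// Builds bearer-auth headers from an environment map.
// Passing the environment in (instead of reading it globally) keeps the helper testable.
function bearerHeaders(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const key = env["ANDO_VIRTUAL_KEY"]?.trim();
  if (!key) {
    // Fail fast, and never echo secret material in the error message.
    throw new Error("ANDO_VIRTUAL_KEY is not set");
  }
  return {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  };
}
```

In a Node.js runtime this would typically be called once at startup as `bearerHeaders(process.env)`.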

Header

Authorization: Bearer <your-virtual-key>

Chat route

POST /v1/chat/completions

Model discovery

GET /v1/models returns the models allowed for that key.

curl https://inference.andoai.xyz/v1/chat/completions \
  -H "Authorization: Bearer $ANDO_VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-8B-AWQ",
    "messages": [
      {"role": "user", "content": "Give me one calm setup step for this agent."}
    ],
    "max_tokens": 96,
    "temperature": 0.2
  }'

OpenAI-compatible SDKs can use the same base URL:

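import OpenAI from "openai";

// Point the OpenAI SDK at Ando. The Virtual Key is read from a server-side environment variable.
const ando = new OpenAI({
  apiKey: process.env.ANDO_VIRTUAL_KEY,
  baseURL: "https://inference.andoai.xyz/v1",
});

const response = await ando.chat.completions.create({
  model: "Qwen/Qwen3-8B-AWQ",
  messages: [
    { role: "user", content: "Write a short handoff note for the next agent." },
  ],
  max_tokens: 128, // cap output so spend and latency stay predictable
});

// Print the assistant's reply, if there is one.
console.log(response.choices[0]?.message?.content);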

Use /v1/chat/completions for ordinary OpenAI-compatible chat clients, /v1/responses for the buffered, text-only subset of the Responses API, and /v1/think for the Ando reasoning route. Call /v1/models before hard-coding a model in an agent.
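The model check can be sketched as a pure helper over the /v1/models response. The response shape below is an assumption based on the OpenAI list-models format (`data` as an array of objects with an `id`). The `pickModel` name and the fallback to the first allowed model are illustrative choices, not documented Ando behavior.

```typescript
// Assumed response shape, following the OpenAI list-models format.
interface ModelList {
  data: { id: string }[];
}

// Returns the preferred model when the key is allowed to use it,
// otherwise the first allowed model, or undefined when the list is empty.
function pickModel(list: ModelList, preferred: string): string | undefined {
  const ids = list.data.map((m) => m.id);
  return ids.includes(preferred) ? preferred : ids[0];
}
```

With the OpenAI SDK client configured as above, the list would likely come from `await ando.models.list()`. An agent that must not silently switch models could treat a mismatch as an error instead of falling back.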

Accountless paid agents

Tempo MPP is the accountless path. The agent sends the same inference request to the MPP endpoint, receives a 402 Payment Required challenge, pays with a Tempo session credential, then retries the exact same request body.
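The challenge-and-retry loop above can be sketched as follows. Everything here is an assumption made for illustration: the transport types, the `pay` wallet hook, the endpoint URL passed by the caller, and the exact `Authorization: Payment <credential>` value format. Only the 402 status, the `Payment` auth scheme, and the rule that the retry reuses the same body come from this page.

```typescript
// Minimal transport types so the sketch does not depend on a specific HTTP client.
interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string; // serialized once and reused unchanged
}
interface HttpResponse {
  status: number;
  headers: Record<string, string>;
}
type Send = (req: HttpRequest) => Promise<HttpResponse>;
// Hypothetical wallet hook: turns the 402 challenge into a payment credential.
type Pay = (challenge: HttpResponse, body: string) => Promise<string>;

async function callWithMpp(
  send: Send,
  pay: Pay,
  url: string,
  payload: unknown,
): Promise<HttpResponse> {
  const body = JSON.stringify(payload); // serialize exactly once
  const headers = { "Content-Type": "application/json" };

  // First attempt carries no credential and no Virtual Key.
  const first = await send({ url, headers, body });
  if (first.status !== 402) return first;

  // Pay for this specific request, then retry with the identical body.
  const credential = await pay(first, body);
  return send({
    url,
    headers: { ...headers, Authorization: `Payment ${credential}` },
    body,
  });
}
```

Serializing the payload once and passing the same string to both attempts is what keeps the paid retry bound to the request that was challenged.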

Operating rules

01 Do not mix auth modes

Never send a Virtual Key to the MPP endpoint, and never send Authorization: Payment to bearer-token routes.

02 Set output limits

Agents should set max_tokens or max_completion_tokens so spend and latency stay predictable.

03 Keep secrets server-side

Do not expose Virtual Keys, payment credentials, signatures, wallet addresses, receipts, or prompt content in logs.

04 Preserve request bodies

For Tempo MPP, paid retries must use the same body that received the payment challenge.
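Rules 01 to 03 can be enforced with a pre-flight check before any request leaves the agent. The sketch below is one possible shape, not an Ando library: `OutgoingRequest`, `checkOperatingRules`, and `loggableSummary` are hypothetical names, and the check assumes the header key is spelled exactly `Authorization`.

```typescript
type AuthMode = "bearer" | "payment";

interface OutgoingRequest {
  mode: AuthMode; // which path this request targets
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// Returns a list of rule violations; an empty list means the request passes.
function checkOperatingRules(req: OutgoingRequest): string[] {
  const problems: string[] = [];
  const auth = req.headers["Authorization"] ?? "";

  // Rule 01: the credential scheme must match the route's auth mode.
  if (req.mode === "payment" && auth.startsWith("Bearer ")) {
    problems.push("Virtual Key sent to the MPP endpoint");
  }
  if (req.mode === "bearer" && auth.startsWith("Payment ")) {
    problems.push("Payment credential sent to a bearer-token route");
  }

  // Rule 02: require an explicit, positive output cap.
  const cap = req.body["max_tokens"] ?? req.body["max_completion_tokens"];
  if (typeof cap !== "number" || cap <= 0) {
    problems.push("missing max_tokens or max_completion_tokens");
  }
  return problems;
}

// Rule 03: log only non-sensitive metadata, never headers or message content.
function loggableSummary(req: OutgoingRequest): Record<string, unknown> {
  return {
    mode: req.mode,
    model: req.body["model"],
    hasAuth: "Authorization" in req.headers,
  };
}
```

Rule 04 is a property of the retry path rather than of a single request, so it is not covered by this check.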
