Tempo MPP Inference
Let an accountless agent pay for Ando inference with Tempo session payments.
Tempo MPP gives accountless agents a payment rail for Ando inference. The
agent does not need an Ando API key. It sends a request, receives an HTTP 402
payment challenge, pays with a Tempo session credential, and retries the same
request body.
This follows the Machine Payments Protocol pattern for agent payments: the agent can pay for an API request in the same HTTP flow, with receipts, request binding, and a reusable session for repeated low-value calls.
Set up an agent
Tempo recommends giving agents a wallet with spend controls, then using
tempo request to handle service discovery and payment negotiation. For wider
Tempo context, read Tempo's agentic payments page,
Using Tempo with AI, and
the MPP overview.
- Install the Tempo docs skill or point the agent at Tempo's Markdown docs and llms.txt files.
- Use Tempo Wallet CLI so the agent can manage balances, access keys, and payment sessions.
- Use tempo request against Ando's MPP endpoint. The CLI handles the 402, credential, retry, and receipt.
Use Tempo's agent setup prompt when you want the agent to configure the wallet itself:
codex exec "Read https://tempo.xyz/SKILL.md and set up Tempo Wallet"

For direct CLI setup, sign in and verify the wallet:
tempo wallet login
tempo wallet whoami

Tempo also exposes documentation context for agents:
npx skills add tempoxyz/docs
codex mcp add tempo --url https://docs.tempo.xyz/api/mcp

Those links give the agent Tempo documentation. Ando MPP inference itself is an HTTP endpoint, not an Ando MCP tool.
Endpoint
POST /v1/mpp/chat/completions
Tempo Mainnet session payments with USDC.e.
Buffered chat completions. Streaming MPP is not enabled yet.
HEAD on the same URL is a different HTTP operation from POST. The
HEAD /v1/mpp/chat/completions operation is reserved for later MPP state
metadata and does not run inference. Use POST /v1/mpp/chat/completions for
model calls.
Recommended request
Default to model_optimization set to true. Ando then chooses a plan-eligible model, and the agent's request stays simple.
tempo request -X POST \
  --json '{
    "plan": "Starter",
    "model_optimization": true,
    "messages": [
      {"role": "user", "content": "Give me one calm setup step for this agent."}
    ],
    "max_tokens": 96,
    "temperature": 0.2
  }' \
  https://inference.andoai.xyz/v1/mpp/chat/completions

tempo request handles the unpaid request, the WWW-Authenticate: Payment
challenge, the Tempo session credential, the paid retry, and the final response.
Use a named model only when you need deterministic testing or a fixed route:
tempo request -X POST \
  --json '{
    "plan": "Starter",
    "model_optimization": false,
    "model": "Qwen/Qwen3-8B-AWQ",
    "messages": [
      {"role": "user", "content": "Reply with one short sentence."}
    ],
    "max_tokens": 32,
    "temperature": 0
  }' \
  https://inference.andoai.xyz/v1/mpp/chat/completions

Payment flow
The first valid unpaid request returns 402 Payment Required with WWW-Authenticate: Payment.
The agent opens or reuses a Tempo session and signs a request-bound payment credential.
The agent retries the exact same body with Authorization: Payment <credential>.
Ando returns Payment-Receipt only after paid usage is committed.
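
The four steps above can be sketched as a small client loop. This is an illustration of the sequence only, not Ando's or Tempo's client code: `send` and `sign_credential` are hypothetical callables standing in for an HTTP transport and a Tempo session signer, and a response is modelled as a plain `(status, headers, body)` tuple with header names assumed to be spelled exactly as in this page. In practice, tempo request performs these steps for you.

```python
import json
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


class PaymentError(Exception):
    """Raised when the paid path does not complete as documented."""


def paid_completion(
    body: dict,
    send: Callable[[bytes, Dict[str, str]], Response],  # hypothetical transport
    sign_credential: Callable[[str, bytes], str],        # hypothetical signer
) -> Response:
    # Serialize once so the unpaid request and the paid retry carry identical bytes.
    raw = json.dumps(body, separators=(",", ":"), sort_keys=True).encode()
    base_headers = {"Content-Type": "application/json"}

    # Step 1: unpaid request; a 402 with a Payment challenge is expected.
    status, headers, _ = send(raw, dict(base_headers))
    challenge = headers.get("WWW-Authenticate", "")
    if status != 402 or not challenge.startswith("Payment"):
        raise PaymentError(f"expected a 402 Payment challenge, got {status}")

    # Step 2: sign a credential bound to this challenge and this exact body.
    credential = sign_credential(challenge, raw)

    # Step 3: retry the same bytes with Authorization: Payment <credential>.
    paid_headers = dict(base_headers, Authorization=f"Payment {credential}")
    status, headers, text = send(raw, paid_headers)

    # Step 4: a 200 only counts when Payment-Receipt is present.
    if status == 200 and "Payment-Receipt" not in headers:
        raise PaymentError("200 without Payment-Receipt: failed payment path")
    return status, headers, text
```

Serializing the body once and reusing the bytes is one simple way to guarantee the retry matches the challenged request. A non-200 paid response is returned to the caller unchanged so it can be handled separately.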
For the first production rollout, Ando advertises a 1 USDC.e cumulative
voucher bucket and a 5 USDC.e suggested deposit. Treat the deposit as a
reusable-session hint. The session can be reused across requests until the
current voucher bucket is exhausted or stale.
Close the Tempo session when the agent is done with the workflow. Closing settles accepted voucher spend to Ando and returns unused deposit to the agent. Closing is not required between individual requests when the session will be reused.
Request controls
Every request must include:
- plan: Starter, Plus, or Premium
- model_optimization: true or false
- messages
Set max_tokens or max_completion_tokens explicitly so spend and latency
stay predictable. When model_optimization is true, omit model. When it is
false, include a model that is available to the selected plan.
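
A client can check these rules locally before spending on a request. The sketch below is a hypothetical helper (`check_mpp_request` is not part of any Ando or Tempo SDK) that encodes only the rules stated in this section. It cannot tell whether a named model is available to the selected plan; only the endpoint can.

```python
from typing import List

PLANS = {"Starter", "Plus", "Premium"}


def check_mpp_request(body: dict) -> List[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []

    # Required fields.
    if body.get("plan") not in PLANS:
        problems.append("plan must be Starter, Plus, or Premium")
    optimization = body.get("model_optimization")
    if not isinstance(optimization, bool):
        problems.append("model_optimization must be true or false")
    if "messages" not in body:
        problems.append("messages is required")

    # model must agree with model_optimization.
    if optimization is True and "model" in body:
        problems.append("omit model when model_optimization is true")
    if optimization is False and not body.get("model"):
        problems.append("include model when model_optimization is false")

    # Recommended in this section rather than required: an explicit token cap.
    if "max_tokens" not in body and "max_completion_tokens" not in body:
        problems.append("recommended: set max_tokens or max_completion_tokens")
    return problems
```

For example, a body that sets model_optimization to true and also names a model yields a single problem asking the caller to omit model.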
Troubleshooting
If the paid retry returns 402 with
mpp_session_deposit_below_request_cost, top up or reopen the Tempo session so
the channel deposit covers the current voucher bucket, then retry the same
request body.
If the paid retry returns 402 with mpp_voucher_bucket_refresh_required,
sign a fresh or higher cumulative Tempo session voucher, then retry the same
request body.
If the Tempo client fails locally with a message such as Deposit too low for request, the paid retry did not reach Ando. Increase the active session deposit
and run the same request again.
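
The three cases above can be collected into one lookup so that an agent reacts consistently. This is a sketch under assumptions: `next_action` is a hypothetical helper, and it takes the error code as an argument because this page does not say where the code appears in the 402 response. See the MPP endpoint reference for that detail.

```python
from typing import Optional

# Error codes and remedies as described in this section.
RETRY_ACTIONS = {
    "mpp_session_deposit_below_request_cost":
        "top up or reopen the Tempo session, then retry the same request body",
    "mpp_voucher_bucket_refresh_required":
        "sign a fresh or higher cumulative voucher, then retry the same request body",
}


def next_action(
    status: Optional[int],
    error_code: Optional[str] = None,
    local_error: Optional[str] = None,
) -> str:
    # A local client failure means the paid retry never reached Ando.
    if local_error and "Deposit too low for request" in local_error:
        return "increase the active session deposit, then run the same request again"
    # Server-side 402 responses with a known MPP error code.
    if status == 402 and error_code in RETRY_ACTIONS:
        return RETRY_ACTIONS[error_code]
    return "consult the MPP endpoint reference"
```

In every handled case the request body stays the same; only the session deposit or voucher changes.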
Safety rules
- Do not send an Ando Virtual Key to the MPP endpoint.
- Do not send Authorization: Payment to ordinary Virtual Key endpoints.
- Do not change the body between the challenge and paid retry.
- Do not log raw payment credentials, signatures, wallet addresses, receipts, idempotency keys, or prompt content.
- Treat a 200 without Payment-Receipt as a failed payment path.
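
One way to honor the logging rule is to build log entries from an allowlist instead of trying to scrub sensitive values afterwards. The helper below is a hypothetical sketch: `loggable_summary` and the choice of fields treated as safe are assumptions for illustration, not an Ando requirement. It records whether a receipt was present without recording the receipt itself.

```python
# Request fields assumed safe to log; everything else is dropped.
SAFE_BODY_FIELDS = (
    "plan",
    "model_optimization",
    "model",
    "max_tokens",
    "max_completion_tokens",
    "temperature",
)


def loggable_summary(status: int, response_headers: dict, body: dict) -> dict:
    """Build a log entry that omits credentials, receipts, and prompt content."""
    summary = {key: body[key] for key in SAFE_BODY_FIELDS if key in body}
    # Record how many messages were sent, never their content.
    summary["message_count"] = len(body.get("messages", []))
    summary["status"] = status
    # Record only the presence of the receipt header, not its value.
    summary["has_receipt"] = "Payment-Receipt" in response_headers
    return summary
```

A summary with status 200 and has_receipt set to false is the case the last rule flags as a failed payment path.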
For the exact request fields and error codes, read the MPP endpoint reference.