Tempo MPP Inference

Let an accountless agent pay for Ando inference with Tempo session payments.

Tempo MPP gives accountless agents a payment rail for Ando inference. The agent does not need an Ando API key. It sends a request, receives an HTTP 402 payment challenge, pays with a Tempo session credential, and retries the same request body.

This follows the Machine Payments Protocol pattern for agent payments: the agent can pay for an API request in the same HTTP flow, with receipts, request binding, and a reusable session for repeated low-value calls.

Set up an agent

Tempo recommends giving agents a wallet with spend controls, then using tempo request to handle service discovery and payment negotiation. For wider Tempo context, read Tempo's agentic payments page, Using Tempo with AI, and the MPP overview.

01 Give the agent Tempo context

Install the Tempo docs skill or point the agent at Tempo's Markdown docs and llms.txt files.

02 Set up the wallet

Use Tempo Wallet CLI so the agent can manage balances, access keys, and payment sessions.

03 Call Ando with MPP

Use tempo request against Ando's MPP endpoint. The CLI handles the 402, credential, retry, and receipt.

Use Tempo's agent setup prompt when you want the agent to configure the wallet itself:

codex exec "Read https://tempo.xyz/SKILL.md and set up Tempo Wallet"

For direct CLI setup, sign in and verify the wallet:

tempo wallet login
tempo wallet whoami

Tempo also exposes documentation context for agents:

npx skills add tempoxyz/docs
codex mcp add tempo --url https://docs.tempo.xyz/api/mcp

These commands only give the agent Tempo documentation context. Ando MPP inference itself is an HTTP endpoint, not an Ando MCP tool.

Endpoint

Path

POST /v1/mpp/chat/completions

Payment

Tempo Mainnet session payments with USDC.e.

Response

Buffered chat completions. Streaming MPP is not enabled yet.

HEAD on the same URL is a different HTTP operation from POST. The HEAD /v1/mpp/chat/completions operation is reserved for later MPP state metadata and does not run inference. Use POST /v1/mpp/chat/completions for model calls.

Set model_optimization to true by default. Ando then chooses a plan-eligible model, and the agent keeps the request simple.

tempo request -X POST \
  --json '{
    "plan": "Starter",
    "model_optimization": true,
    "messages": [
      {"role": "user", "content": "Give me one calm setup step for this agent."}
    ],
    "max_tokens": 96,
    "temperature": 0.2
  }' \
  https://inference.andoai.xyz/v1/mpp/chat/completions

tempo request handles the unpaid request, the WWW-Authenticate: Payment challenge, the Tempo session credential, the paid retry, and the final response.

Use a named model only when you need deterministic testing or a fixed route:

tempo request -X POST \
  --json '{
    "plan": "Starter",
    "model_optimization": false,
    "model": "Qwen/Qwen3-8B-AWQ",
    "messages": [
      {"role": "user", "content": "Reply with one short sentence."}
    ],
    "max_tokens": 32,
    "temperature": 0
  }' \
  https://inference.andoai.xyz/v1/mpp/chat/completions

Payment flow

01 Challenge

The first valid unpaid request returns 402 Payment Required with WWW-Authenticate: Payment.

02 Session

The agent opens or reuses a Tempo session and signs a request-bound payment credential.

03 Retry

The agent retries the exact same body with Authorization: Payment <credential>.

04 Receipt

Ando returns Payment-Receipt only after paid usage is committed.
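
The four steps above can be sketched in Python for readers who want to see the control flow outside the CLI. This is a minimal illustration, not Ando's or Tempo's implementation: `paid_request`, `send`, and `sign` are hypothetical names, the transport and signer are passed in as plain callables, and how a real Tempo session credential is produced is deliberately left out.

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: (body_bytes, request_headers) -> (status, response_headers, body)
Transport = Callable[[bytes, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]
# Hypothetical signer: (challenge_header_value, body_bytes) -> credential string
Signer = Callable[[str, bytes], str]


def paid_request(body: bytes, send: Transport, sign: Signer) -> Tuple[bytes, str]:
    """Run challenge -> credential -> retry -> receipt for one request body."""
    # 01 Challenge: the unpaid request is expected to come back as 402.
    status, headers, payload = send(body, {})
    if status != 402:
        raise RuntimeError(f"expected a 402 challenge, got {status}")
    # Assumption: header names arrive with this exact casing; a real client
    # should look them up case-insensitively.
    challenge = headers.get("WWW-Authenticate", "")
    if not challenge.startswith("Payment"):
        raise RuntimeError("402 response did not carry a Payment challenge")

    # 02 Session: the signer stands in for opening or reusing a Tempo session
    # and producing a credential bound to this request.
    credential = sign(challenge, body)

    # 03 Retry: send the exact same bytes, now with the payment credential.
    status, headers, payload = send(body, {"Authorization": f"Payment {credential}"})
    if status != 200:
        raise RuntimeError(f"paid retry failed with status {status}")

    # 04 Receipt: a 200 without Payment-Receipt is treated as a failed payment path.
    receipt = headers.get("Payment-Receipt")
    if not receipt:
        raise RuntimeError("200 without Payment-Receipt")
    return payload, receipt
```

Passing the body as bytes and reusing the same object for both calls is one simple way to guarantee the retry matches the challenged request.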

For the first production rollout, Ando advertises a 1 USDC.e cumulative voucher bucket and a 5 USDC.e suggested deposit. Treat the deposit as a reusable-session hint. The session can be reused across requests until the current voucher bucket is exhausted or stale.

Close the Tempo session when the agent is done with the workflow. Closing settles accepted voucher spend to Ando and returns unused deposit to the agent. Closing is not required between individual requests when the session will be reused.

Request controls

Every request must include:

  • plan: Starter, Plus, or Premium
  • model_optimization: true or false
  • messages

Set max_tokens or max_completion_tokens explicitly so spend and latency stay predictable. When model_optimization is true, omit model. When it is false, include a model that is available to the selected plan.
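
An agent can check these rules locally before it spends anything. The sketch below is one possible pre-flight check in Python; `check_mpp_body` is a hypothetical helper, not part of any Ando or Tempo library. The token-limit check reflects the recommendation above rather than a documented server requirement, and plan-specific model availability is not checked.

```python
from typing import Any, Dict, List

ALLOWED_PLANS = {"Starter", "Plus", "Premium"}


def check_mpp_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the body passes these checks."""
    problems: List[str] = []
    if body.get("plan") not in ALLOWED_PLANS:
        problems.append("plan must be Starter, Plus, or Premium")
    optimization = body.get("model_optimization")
    if not isinstance(optimization, bool):
        problems.append("model_optimization must be true or false")
    if not body.get("messages"):
        problems.append("messages is required")
    # model and model_optimization are mutually constrained.
    if optimization is True and "model" in body:
        problems.append("omit model when model_optimization is true")
    if optimization is False and not body.get("model"):
        problems.append("include model when model_optimization is false")
    # Advisory: keeps spend and latency predictable.
    if "max_tokens" not in body and "max_completion_tokens" not in body:
        problems.append("set max_tokens or max_completion_tokens explicitly")
    return problems
```

Returning every problem at once, instead of raising on the first, lets the agent fix the body in a single pass before it triggers a payment challenge.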

Troubleshooting

If the paid retry returns 402 with mpp_session_deposit_below_request_cost, top up or reopen the Tempo session so the channel deposit covers the current voucher bucket, then retry the same request body.

If the paid retry returns 402 with mpp_voucher_bucket_refresh_required, sign a fresh or higher cumulative Tempo session voucher, then retry the same request body.

If the Tempo client fails locally with a message such as Deposit too low for request, the paid retry did not reach Ando. Increase the active session deposit and run the same request again.
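
For agents that handle these cases programmatically, the two server-side codes above can be mapped to recovery steps. This is a sketch under assumptions: `recovery_for` is a hypothetical helper, and it takes the error code as an argument because where the code appears in the 402 response is defined in the MPP endpoint reference, not here.

```python
from typing import Optional

# Recovery steps for the two error codes described above.
RECOVERY = {
    "mpp_session_deposit_below_request_cost":
        "top up or reopen the Tempo session, then retry the same request body",
    "mpp_voucher_bucket_refresh_required":
        "sign a fresh or higher cumulative voucher, then retry the same request body",
}


def recovery_for(error_code: Optional[str]) -> Optional[str]:
    """Return the documented recovery step for a 402 error code, or None if unknown."""
    if error_code is None:
        return None
    return RECOVERY.get(error_code)
```

A `None` result means the code is not one of the two covered on this page, so the agent should stop and consult the endpoint reference instead of retrying blindly.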

Safety rules

  • Do not send an Ando Virtual Key to the MPP endpoint.
  • Do not send Authorization: Payment to ordinary Virtual Key endpoints.
  • Do not change the body between the challenge and paid retry.
  • Do not log raw payment credentials, signatures, wallet addresses, receipts, idempotency keys, or prompt content.
  • Treat a 200 without Payment-Receipt as a failed payment path.
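
The logging rule can be enforced with a small redaction step before anything is written to a log. The following Python sketch is illustrative only: `redact_for_log` is a hypothetical helper, and the `Idempotency-Key` header name is an assumption, since this page does not name the header that carries idempotency keys. Wallet addresses and signatures embedded in other fields would need their own handling.

```python
from typing import Any, Dict

# Headers that may carry credentials, receipts, or idempotency keys.
# "idempotency-key" is an assumed header name, not confirmed by this page.
SENSITIVE_HEADERS = {"authorization", "payment-receipt", "idempotency-key"}


def redact_for_log(headers: Dict[str, str], body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request that is safe to log."""
    safe_headers = {
        name: "[redacted]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
    # Drop prompt content entirely; keep only a count for debugging.
    safe_body = {key: value for key, value in body.items() if key != "messages"}
    safe_body["message_count"] = len(body.get("messages", []))
    return {"headers": safe_headers, "body": safe_body}
```

Logging the message count instead of the messages keeps enough signal to debug request shape without recording prompt content.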

For the exact request fields and error codes, read the MPP endpoint reference.
