Receipt Specification
ProofMeter provides cryptographic spend attestation for AI agent actions. This document defines the receipt format, budget capability model, signing scheme, and verification rules. Patent Pending.
Core loop
Before an agent, benchmark run, or workflow can incur cost, it receives a signed budget capability. Each billable action produces a signed, hash-chained spend receipt. When the task completes, receipts are aggregated into a Merkle-rooted settlement. Any party can independently verify any receipt or settlement without trusting the runner or the platform.
Budget Capability
A budget capability is a signed permission granting an agent a maximum spend within defined scope and time bounds.
{
  "schema": "proofmeter.capability.v1",
  "capability_id": "cap_01HX...",
  "namespace_id": "ns_benchd",
  "authorized_agent_id": "benchd-runner",
  "max_budget_cents": 500,
  "currency": "USD",
  "scope": {
    "allowed_providers": ["openai", "anthropic"],
    "allowed_endpoint_classes": ["chat", "embedding"]
  },
  "expires_at": "2026-05-18T00:00:00Z",
  "metadata": {
    "task_id": "run_abc123",
    "source": "benchd-harness"
  },
  "signature": {
    "algorithm": "Ed25519",
    "key_id": "key_01HX...",
    "value": "base64..."
  }
}
max_budget_cents: Hard spending limit in cents. Enforced per-receipt.
scope: Restricts which providers and endpoint classes the agent may use.
expires_at: Capability becomes invalid after this timestamp.
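The three fields above can be read as a pre-flight check that runs before each billable action. The following is a minimal sketch, not the ProofMeter implementation: the function name `check_capability` and its `cost_cents` / `spent_cents` parameters are hypothetical, and treating a call made exactly at `expires_at` as still valid is an assumption.

```python
from datetime import datetime, timezone

def check_capability(capability, provider, endpoint_class,
                     cost_cents, spent_cents, now):
    """Return the reasons an action is refused (empty list if allowed)."""
    reasons = []
    # Rewrite the trailing "Z" so fromisoformat accepts it on Python < 3.11.
    expires_at = datetime.fromisoformat(
        capability["expires_at"].replace("Z", "+00:00"))
    if now > expires_at:
        reasons.append("capability expired")
    scope = capability["scope"]
    if provider not in scope["allowed_providers"]:
        reasons.append("provider not in scope")
    if endpoint_class not in scope["allowed_endpoint_classes"]:
        reasons.append("endpoint class not in scope")
    # Hypothetical budget rule: running total plus this action must fit.
    if spent_cents + cost_cents > capability["max_budget_cents"]:
        reasons.append("budget exceeded")
    return reasons

capability = {
    "max_budget_cents": 500,
    "scope": {
        "allowed_providers": ["openai", "anthropic"],
        "allowed_endpoint_classes": ["chat", "embedding"],
    },
    "expires_at": "2026-05-18T00:00:00Z",
}
now = datetime(2026, 5, 17, 14, 30, tzinfo=timezone.utc)

print(check_capability(capability, "openai", "chat", 2, 100, now))
print(check_capability(capability, "mistral", "chat", 2, 499, now))
```

The first call falls inside scope, time, and budget; the second is refused on both provider and budget grounds.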
Design principle: Usage is fact. Cost is derived.
Receipts attest to provable facts: tokens consumed, provider called, model used, and timestamps. Cost is a derived computation that depends on who is computing it (list price, enterprise discount, internal chargeback). The receipt never claims to know what the customer actually paid. Different parties can compute different cost views from the same provable token counts.
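To make the principle concrete, the sketch below derives two cost views from one set of token counts. It is illustrative only: the function name `estimate_cost_usd`, the price-book layout, and the per-million-token rates are assumptions, chosen so the results line up with the 0.006 and 0.003 figures used in the pricing-mode examples later in this document.

```python
from decimal import Decimal

def estimate_cost_usd(usage, price_book):
    """Derive a cost estimate from proven token counts and a price book.

    price_book maps model name -> USD per one million tokens (assumed layout).
    """
    rates = price_book[usage["model"]]
    million = Decimal(1_000_000)
    return (Decimal(usage["input_tokens"]) * rates["input"]
            + Decimal(usage["output_tokens"]) * rates["output"]) / million

# The proven usage is the same for every party.
usage = {"model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300}

# Illustrative rates only; not taken from any real price list.
list_prices = {"gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")}}
customer_prices = {"gpt-4o": {"input": Decimal("1.25"), "output": Decimal("5.00")}}

print(estimate_cost_usd(usage, list_prices))      # list-price view
print(estimate_cost_usd(usage, customer_prices))  # customer price-book view
```

Decimal arithmetic is used here so that small per-call amounts are not distorted by binary floating point.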
Usage Receipt
Every API/LLM call produces a receipt with two distinct sections: proven usage (signed, verifiable forever) and an optional cost estimate (derived from a declared pricing table, re-computable by anyone).
{
  "schema": "proofmeter.receipt.v1.1",
  "receipt_id": "rcpt_01HX...",
  "namespace_id": "ns_benchd",
  "actor_id": "benchd-runner",
  "capability_id": "cap_01HX...",
  "task_id": "run_abc123",
  "proven_usage": {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "endpoint_class": "chat",
    "input_tokens": 1200,
    "output_tokens": 300,
    "total_tokens": 1500,
    "latency_ms": 842,
    "occurred_at": "2026-05-17T14:30:00Z"
  },
  "cost_estimate": {
    "estimated_cost_usd": 0.00033,
    "cost_confidence": "estimated",
    "pricing_basis": "list_price",
    "pricing_version": "2026-05",
    "note": "Derived from public list prices. Actual billed amount may differ."
  },
  "metadata": {
    "question_id": "q_014",
    "benchmark": "reliability",
    "adapter": "verifiedstate"
  },
  "chain": {
    "previous_hash": "sha256:abc123...",
    "event_hash": "sha256:def456..."
  },
  "signature": {
    "algorithm": "Ed25519",
    "key_id": "key_01HX...",
    "value": "base64..."
  }
}
proven_usage: Signed, verifiable facts from the API response. This never changes.
cost_estimate: Derived from a declared pricing table. Can be recomputed by anyone with a different pricing table. Explicitly labeled as an estimate.
cost_confidence: One of estimated (list prices), customer_supplied (their rates), invoice_reconciled (matched to billing), usage_only (no cost calculated).
Hash Chaining
Receipts are hash-chained: each receipt's event_hash is computed over the canonical JSON of the receipt payload plus the previous receipt's hash. This creates a tamper-evident chain: modifying or removing any receipt breaks the chain for all subsequent receipts.
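The chaining rule can be sketched in Python as follows. This is a hedged illustration, not the reference code: `canonical` only approximates JCS (sorted keys, compact separators), which is adequate for ASCII keys and integer values but is not a full RFC 8785 implementation, and the choice to concatenate the previous hash as its `sha256:`-prefixed string is an assumption. The helper names and sample payloads are hypothetical.

```python
import hashlib
import json
from typing import List, Optional

def canonical(payload: dict) -> bytes:
    # Approximation of JCS: sorted keys, no insignificant whitespace.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def event_hash(payload: dict, previous_hash: Optional[str]) -> str:
    # The first receipt has no predecessor, so the literal "null" is used.
    prev = previous_hash if previous_hash is not None else "null"
    digest = hashlib.sha256(canonical(payload) + prev.encode("utf-8"))
    return "sha256:" + digest.hexdigest()

def build_chain(payloads: List[dict]) -> List[str]:
    hashes, prev = [], None
    for payload in payloads:
        prev = event_hash(payload, prev)
        hashes.append(prev)
    return hashes

payloads = [{"receipt_id": f"rcpt_{i}", "total_tokens": 1500} for i in range(3)]
original = build_chain(payloads)

# Tampering with the first payload changes every later hash as well.
payloads[0]["total_tokens"] = 15
tampered = build_chain(payloads)
print(all(a != b for a, b in zip(original, tampered)))
```

Because each hash feeds into the next, a verifier only needs the first divergent position to know where the chain was altered.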
Receipt 1: event_hash = SHA-256(canonical(payload) + "null")
Receipt 2: event_hash = SHA-256(canonical(payload) + receipt_1.event_hash)
Receipt 3: event_hash = SHA-256(canonical(payload) + receipt_2.event_hash)
...
Settlement: merkle_root = Merkle(all event_hashes)
Settlement
When a task completes, all receipts are settled into a Merkle-rooted batch. The settlement is the final audit record for the task.
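The specification does not spell out the Merkle construction here, so the following sketch makes its own assumptions: leaves are the receipts' event hashes in chain order, parents are SHA-256 over the concatenated raw child digests, and an odd node is paired with itself. The function name `merkle_root` is hypothetical, and a production design might add leaf/node domain separation, which this sketch omits.

```python
import hashlib
from typing import List

def merkle_root(event_hashes: List[str]) -> str:
    """Fold a list of "sha256:<hex>" event hashes into a single root."""
    if not event_hashes:
        raise ValueError("cannot settle an empty receipt list")
    # Strip the "sha256:" prefix and work on raw 32-byte digests.
    level = [bytes.fromhex(h.split(":", 1)[1]) for h in event_hashes]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the odd node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return "sha256:" + level[0].hex()

leaves = ["sha256:" + hashlib.sha256(data).hexdigest()
          for data in (b"receipt-1", b"receipt-2", b"receipt-3")]
print(merkle_root(leaves))
```

Under these assumptions, changing, reordering, or dropping any leaf yields a different root, which is what lets a single signed value stand in for the whole batch.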
{
  "schema": "proofmeter.settlement.v1.1",
  "settlement_id": "stl_01HX...",
  "namespace_id": "ns_benchd",
  "task_id": "run_abc123",
  "capability_id": "cap_01HX...",
  "receipt_count": 294,
  "proven_totals": {
    "total_input_tokens": 352800,
    "total_output_tokens": 88200,
    "total_tokens": 441000
  },
  "cost_estimate": {
    "estimated_total_usd": "0.2646",
    "cost_confidence": "estimated",
    "pricing_basis": "list_price",
    "pricing_version": "2026-05"
  },
  "merkle_root": "sha256:...",
  "signature": {
    "algorithm": "Ed25519",
    "key_id": "key_01HX...",
    "value": "base64..."
  }
}
Signing scheme
Algorithm: Ed25519
Canonicalization: JCS (RFC 8785) — deterministic JSON serialization
Hash: SHA-256 over canonical JSON bytes
Signature: Ed25519 sign over the SHA-256 hash
Key format: Hex-encoded 32-byte public keys
This is the same signing scheme used by the Bench'd harness for benchmark manifests. A single manifest may carry both a harness signature (proving the score) and ProofMeter receipts (proving the cost).
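The canonicalize-then-hash steps can be sketched with the standard library alone. This is an approximation under stated assumptions: `json.dumps` with sorted keys stands in for JCS and does not cover every RFC 8785 rule (number formatting, UTF-16 key ordering), and excluding the `signature` field from the signed bytes is an assumption about the payload boundary. The names `signing_digest`, `parse_public_key`, and the `cap_demo` identifier are hypothetical. The Ed25519 step itself needs a third-party library and is deliberately left out.

```python
import hashlib
import json

def canonical_bytes(obj) -> bytes:
    # Stand-in for JCS: deterministic key order and compact separators.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def signing_digest(document: dict) -> bytes:
    # Assumption: the signature block is not part of what gets signed.
    unsigned = {k: v for k, v in document.items() if k != "signature"}
    return hashlib.sha256(canonical_bytes(unsigned)).digest()

def parse_public_key(hex_key: str) -> bytes:
    key = bytes.fromhex(hex_key)
    if len(key) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    return key

# Key order and the presence of a signature block do not change the digest.
a = {"capability_id": "cap_demo", "max_budget_cents": 500,
     "signature": {"algorithm": "Ed25519"}}
b = {"max_budget_cents": 500, "capability_id": "cap_demo"}
print(signing_digest(a) == signing_digest(b))
print(len(parse_public_key("00" * 32)))
```

The 32-byte digest returned by `signing_digest` is the value that would be handed to an Ed25519 signer or verifier.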
Verification
Any party can verify a receipt or settlement without trusting the runner:
- Re-canonicalize the receipt payload using JCS
- Recompute SHA-256 hash of canonical bytes + previous_hash
- Verify Ed25519 signature against the computed hash
- Check that event_hash matches the recomputed hash (chain integrity)
- For settlements: verify Merkle root against all receipt hashes
- Check budget: total spend across receipts ≤ capability max_budget_cents
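The checks above can be strung together roughly as follows. This is a sketch under several assumptions rather than the reference verifier: each receipt is reduced to a hypothetical `payload` / `chain` / `signature` shape, `cost_cents` is borrowed from the reference client's `record_spend` parameter, JCS is approximated with sorted-key `json.dumps`, and signature checking is delegated to a caller-supplied `verify_signature` callable because Ed25519 is not in the standard library. The `accept_all` stub exists only so the example runs; it performs no cryptographic check.

```python
import hashlib
import json

def canonical(payload):
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def chained_hash(payload, previous_hash):
    prev = previous_hash if previous_hash is not None else "null"
    digest = hashlib.sha256(canonical(payload) + prev.encode("utf-8"))
    return "sha256:" + digest.hexdigest()

def verify_receipts(receipts, max_budget_cents, verify_signature):
    """Return a list of problems; an empty list means every check passed."""
    problems, previous_hash, total_cents = [], None, 0
    for index, receipt in enumerate(receipts):
        recomputed = chained_hash(receipt["payload"], previous_hash)
        if receipt["chain"]["event_hash"] != recomputed:
            problems.append(f"receipt {index}: event_hash mismatch")
        if not verify_signature(recomputed, receipt["signature"]):
            problems.append(f"receipt {index}: bad signature")
        total_cents += receipt["payload"]["cost_cents"]
        previous_hash = recomputed  # chain on what was actually recomputed
    if total_cents > max_budget_cents:
        problems.append("budget exceeded")
    return problems

def make_receipts(costs):
    # Build a small, internally consistent chain for demonstration.
    receipts, prev = [], None
    for cost in costs:
        payload = {"actor_id": "my-agent", "cost_cents": cost}
        prev = chained_hash(payload, prev)
        receipts.append({"payload": payload,
                         "chain": {"event_hash": prev}, "signature": {}})
    return receipts

def accept_all(digest, signature):
    return True  # placeholder only: substitute a real Ed25519 verification

receipts = make_receipts([2, 3, 1])
print(verify_receipts(receipts, 500, accept_all))
receipts[1]["payload"]["cost_cents"] = 0  # tamper with the middle receipt
print(verify_receipts(receipts, 500, accept_all))
```

Chaining on the recomputed hash rather than the stored one means a single altered receipt is reported at its own position and at every position after it.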
MCP integration
ProofMeter is accessible as MCP tools. Any MCP-compatible agent can authorize budgets, record spend, and verify receipts:
| Tool | Action |
|---|---|
| meter_authorize | Create budget capability |
| meter_spend | Record spend event, get signed receipt |
| meter_budget | Check remaining budget |
| meter_settle | Settle receipts into Merkle batch |
| meter_verify | Verify receipt signature and chain |
| meter_receipts | List and filter receipts |
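MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request, so a call to one of the tools in the table might be framed as in the sketch below. The tool name comes from the table; the argument names are hypothetical and simply mirror the reference client's `record_spend` parameters, since the tools' input schemas are not given here.

```python
import json

def tool_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical arguments; the real meter_spend schema may differ.
request = tool_call_request(1, "meter_spend", {
    "actor_id": "my-agent",
    "provider_id": "openai",
    "usage_unit": "tokens",
    "usage_quantity": 1500,
    "cost_cents": 2,
})
print(request)
```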
Receipt examples by pricing mode
The same usage event looks different depending on the pricing mode. The proven_usage section is identical in all cases — only the cost section changes.
1. Usage-only (no cost)
{
  "proven_usage": { "provider": "openai", "model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300 },
  "cost_estimate": { "estimated_cost_usd": null, "cost_confidence": "usage_only" }
}
2. Public estimated (default)
{
  "proven_usage": { "provider": "openai", "model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300 },
  "cost_estimate": {
    "estimated_cost_usd": 0.006,
    "cost_confidence": "estimated",
    "pricing_basis": "public_estimate",
    "price_book_id": "proofmeter_public_2026_05",
    "price_book_hash": "sha256:8cfb89..."
  }
}
3. Customer price book (private rates)
{
  "proven_usage": { "provider": "openai", "model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300 },
  "cost_estimate": {
    "estimated_cost_usd": 0.003,
    "cost_confidence": "customer_supplied",
    "pricing_basis": "customer_price_book",
    "price_book_id": "acme_openai_q2_2026",
    "price_book_hash": "sha256:a6f9bf..."
  }
}
Note: Raw rates are excluded from shared receipts when rates_private: true.
4. Invoice-reconciled (future)
{
  "proven_usage": { "provider": "openai", "model": "gpt-4o", "input_tokens": 1200, "output_tokens": 300 },
  "cost_estimate": {
    "estimated_cost_usd": 0.003,
    "cost_confidence": "invoice_reconciled",
    "pricing_basis": "customer_price_book",
    "reconciliation_status": "reconciled",
    "invoice_reference": "inv_2026_05_openai"
  }
}
Usage in benchd-harness
# Run a benchmark with $5 budget and spend tracking
benchd run -a verifiedstate -b reliability --budget 5.00
# The signed manifest will include a proofmeter section:
# - total_spend_usd
# - receipt_count
# - per-model cost breakdown
# - settlement_merkle_root
# - settlement_signature
Reference implementation
The reference implementation ships with benchd-harness on PyPI as the benchd_harness.proofmeter submodule. It can be imported independently:
from benchd_harness.proofmeter import ProofMeterClient
client = ProofMeterClient(api_key="vs_live_...", namespace_id="ns_...")
client.connect()
budget = client.authorize_budget(
    agent_id="my-agent",
    max_budget_cents=500,
)
receipt = client.record_spend(
    actor_id="my-agent",
    provider_id="openai",
    usage_unit="tokens",
    usage_quantity=1500,
    cost_cents=2,
    capability_id=budget.capability_id,
)
settlement = client.settle(capability_id=budget.capability_id)
Non-goals
- ProofMeter does not prevent fraud or validate request reasonableness.
- ProofMeter does not enforce budgets at the provider layer — only at the meter.
- ProofMeter does not validate vendor honesty about token counts (providers do not sign responses).
- ProofMeter does not match invoices unless the customer provides reconciliation data.
- ProofMeter does not claim to know actual billed cost — only estimated cost under a declared pricing basis.
- ProofMeter does not provide regulatory compliance (SOC 2, GDPR, HIPAA) on its own.
- ProofMeter does not attest to timestamp accuracy beyond signer assertion.
See Trust Boundaries for the full analysis of what ProofMeter proves, estimates, and does not prove.
Stable URL: benchd.ai/methodology/receipt-spec
Version: 1.1 | Protocol by VerifiedState. Patent Pending.