Practical formula
- Cost per task = input tokens + output tokens + embeddings + vector search + orchestration + logs + retries + human review where applicable.
- Measure by product workflow: answer, user/month, document, ticket, recommendation, successful automation, or transaction.
- Compare cost together with quality, latency, and error rate; cost alone leads to cheap models that fail more often.