February 7, 2026 · 7 min read

The 5 Dashboards Problem: Why AI Cost Tracking Is Broken

Five dashboards, a spreadsheet, and a guess. That's the state of AI cost tracking in 2026.

Deborah, CTO at botanu

Here's how most AI teams track cost today. Open the OpenAI dashboard. Note the number. Open the AWS billing console. Note the number. Open Pinecone. Note the number. Maybe check Deepgram. Open a spreadsheet. Add the numbers together. Divide by... something. Monthly active users? API calls? Vibes?

This is the state of AI cost tracking in 2026. Five dashboards, a spreadsheet, and a guess.

Where It Breaks

Vendor-level visibility, not feature-level. The OpenAI bill says $6,000/month. Great. But how much of that is the Voice Agent vs the Chatbot vs the Escalation flow? OpenAI doesn't know what those features are. They see API calls. The breakdown we found: $2,400 goes to Voice Agent, $2,700 to Chatbot, $900 to Escalations. That changes everything about how to think about pricing and roadmap.
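One way to get a feature-level view is to tag every vendor call with the feature that made it and aggregate on that tag. The sketch below is a minimal illustration, not botanu's implementation: the record shape, the `feature` key, and both function names are assumptions. The monthly figures are the ones quoted above.

```python
from collections import defaultdict


def cost_by_feature(calls):
    """Sum call cost per feature tag; calls without a tag land in 'untagged'."""
    totals = defaultdict(float)
    for call in calls:
        totals[call.get("feature", "untagged")] += call["cost_usd"]
    return dict(totals)


def shares(totals):
    """Express each feature's spend as a percentage of the total."""
    grand_total = sum(totals.values())
    return {name: round(100 * amount / grand_total, 1) for name, amount in totals.items()}


# Monthly OpenAI spend per feature, taken from the breakdown above.
monthly = {"voice_agent": 2400, "chatbot": 2700, "escalations": 900}
print(shares(monthly))  # {'voice_agent': 40.0, 'chatbot': 45.0, 'escalations': 15.0}
```

The `untagged` bucket is a deliberate choice: it makes calls that bypass tagging visible instead of silently dropping them.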

No concept of "outcome." LLM observability tools (Helicone, Langfuse, Portkey) show cost per trace, per span, per model call. Great for debugging. Not useful for business decisions. A single customer resolution might trigger 4 agent calls, run an average of 3.2 RAG queries, and touch 6 infrastructure services. The cost per model call is $0.063 for the intent router. The cost per outcome is $1.98. Only the second number is useful for a pricing conversation.
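The difference between per-call and per-outcome cost comes down to the grouping key. A rough sketch, assuming every span already carries an `outcome_id` (a hypothetical field; most tracing tools would need it propagated explicitly). The span values are illustrative, not measured data.

```python
from collections import defaultdict

# Hypothetical spans: illustrative values only.
spans = [
    {"outcome_id": "res-1", "kind": "model", "name": "intent_router", "cost_usd": 0.06},
    {"outcome_id": "res-1", "kind": "model", "name": "responder", "cost_usd": 0.40},
    {"outcome_id": "res-1", "kind": "rag", "name": "kb_lookup", "cost_usd": 0.10},
    {"outcome_id": "res-2", "kind": "model", "name": "intent_router", "cost_usd": 0.06},
]


def cost_per_outcome(spans):
    """Roll every span up to the business outcome it contributed to."""
    totals = defaultdict(float)
    for span in spans:
        totals[span["outcome_id"]] += span["cost_usd"]
    return {outcome_id: round(total, 4) for outcome_id, total in totals.items()}


print(cost_per_outcome(spans))
```

The intent router costs the same in both resolutions, yet the two outcomes differ by almost an order of magnitude, which the per-call view cannot show.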

Infrastructure is invisible. Nobody looks at their EC2 bill and thinks "that's AI cost." But a voice gateway on EC2 ($1,400/month) exists because of the Voice Agent feature. Same with Lambda ($880), API Gateway ($420), S3 for audio storage ($145). Infrastructure is 26% of total AI cost in the workloads we've instrumented. It doesn't get tracked as "AI cost" because it lives in a different billing console.
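Infrastructure can be pulled into the same view by giving each billing line item a split across features. In the sketch below the dollar amounts are the ones quoted above, but the `split` fractions are purely hypothetical: the text only says these resources exist because of AI features, not how they divide between them.

```python
# Monthly line items from the paragraph above.
# The "split" fractions are assumptions for illustration only.
line_items = [
    {"resource": "ec2_voice_gateway", "cost_usd": 1400, "split": {"voice_agent": 1.0}},
    {"resource": "lambda", "cost_usd": 880, "split": {"voice_agent": 0.5, "chatbot": 0.5}},
    {"resource": "api_gateway", "cost_usd": 420, "split": {"voice_agent": 0.5, "chatbot": 0.5}},
    {"resource": "s3_audio", "cost_usd": 145, "split": {"voice_agent": 1.0}},
]


def allocate(line_items):
    """Distribute each line item's cost across features by its split fractions."""
    totals = {}
    for item in line_items:
        for feature, fraction in item["split"].items():
            totals[feature] = totals.get(feature, 0.0) + item["cost_usd"] * fraction
    return totals


allocated = allocate(line_items)
print(allocated)                # {'voice_agent': 2195.0, 'chatbot': 650.0}
print(sum(allocated.values()))  # 2845.0
```

Whatever split is chosen, the allocated total should still equal the billed total; that check catches line items that were dropped or counted twice.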

No cost variance tracking. Typical cost per outcome is $0.82. But that's the median (P50). P90 is $1.74. P99 is $4.31. And 6.2% of runs cost over $3. Anyone pricing at $0.99 per resolution is losing money on every P90+ outcome. Week over week, P95 cost went from $2.94 to $3.18 (up 8.2%). P99 went from $3.84 to $4.31 (up 12.4%). Something changed. Without variance tracking, that shows up next month as a billing surprise.
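Variance tracking needs little more than percentiles over per-outcome costs and a week-over-week comparison. The sketch below uses the nearest-rank percentile definition, which is one of several common conventions and an assumption here; all function names are hypothetical.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def share_over(values, threshold):
    """Percentage of runs whose cost exceeds the threshold."""
    return 100 * sum(1 for v in values if v > threshold) / len(values)


def pct_change(previous, current):
    """Relative change from the previous value to the current one, in percent."""
    return 100 * (current - previous) / previous


# The P95 move quoted above: $2.94 to $3.18.
print(round(pct_change(2.94, 3.18), 1))  # 8.2
```

Running `percentile` at 50, 90, 95, and 99 over each week's outcomes and alerting when `pct_change` crosses a chosen threshold would flag a tail regression within days rather than at invoice time.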

No connection between cost and quality. Someone optimizes the RAG pipeline and cuts costs 15%. Did resolution quality drop? Is CSAT still 4.3/5? Is first-contact resolution still 78%? Is average resolution time still 2.4 minutes? Cost optimization without quality guardrails is just degradation with a nicer name.
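A quality guardrail can be as simple as comparing post-change metrics to a baseline before accepting a cost optimization. In this sketch the baselines are the figures quoted above, while the tolerances, metric keys, and function name are hypothetical.

```python
# Baselines from the paragraph above; the tolerances are assumptions.
BASELINE = {"csat": 4.3, "fcr_pct": 78.0, "resolution_min": 2.4}
TOLERANCE = {"csat": 0.1, "fcr_pct": 2.0, "resolution_min": 0.2}
HIGHER_IS_BETTER = {"csat": True, "fcr_pct": True, "resolution_min": False}


def guardrail_violations(current):
    """Return the metrics that degraded beyond tolerance relative to baseline."""
    violations = []
    for metric, baseline in BASELINE.items():
        delta = current[metric] - baseline
        # Degradation is a drop for higher-is-better metrics, a rise otherwise.
        degraded_by = -delta if HIGHER_IS_BETTER[metric] else delta
        if degraded_by > TOLERANCE[metric]:
            violations.append(metric)
    return violations


after_rag_change = {"csat": 4.0, "fcr_pct": 77.0, "resolution_min": 2.9}
print(guardrail_violations(after_rag_change))  # ['csat', 'resolution_min']
```

An empty list means the cost cut held quality steady; anything else means the savings came out of the customer experience.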

What Tracking Should Actually Look Like

The fix isn't better dashboards. It's a different data model. Start with the business outcome (resolution, conversation, task). Trace every cost that contributed to it (models, agents, RAG, infra, data). Surface it at the feature level, the customer level, and the channel level. Add variance tracking to catch regressions before they become monthly surprises. Add SLO correlation so cost cuts don't silently break quality.
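As a rough sketch of that data model (an illustration under assumed names, not botanu's actual schema): the outcome is the root record, every cost hangs off it, and feature, customer, and channel are plain dimensions to group by.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CostItem:
    source: str       # e.g. "model", "agent", "rag", "infra", "data"
    cost_usd: float


@dataclass
class Outcome:
    outcome_id: str
    feature: str
    customer: str
    channel: str
    items: list = field(default_factory=list)

    @property
    def total_cost(self):
        return sum(item.cost_usd for item in self.items)


def average_cost_by(outcomes, dimension):
    """Average cost per outcome grouped by 'feature', 'customer', or 'channel'."""
    grouped = defaultdict(list)
    for outcome in outcomes:
        grouped[getattr(outcome, dimension)].append(outcome.total_cost)
    return {key: round(sum(costs) / len(costs), 4) for key, costs in grouped.items()}
```

Because the dimensions live on the outcome rather than on individual calls, the same records answer the feature, customer, and channel questions without re-instrumenting anything.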

That's what we've been building. One place, full stack, outcome-level.

AI FinOps, Observability

Track cost per outcome across your AI stack

See how botanu gives engineering teams full visibility into workflow-level unit economics.