Poppi — AI Engineering Costs Ballooning?
Poppi shows you where your team’s AI work pays off, where it’s leaking spend, and the specific weak spots to fix next — across Claude, Codex, Cursor, and Gemini.
Insights · Transparency · Weak spots, named
The problem
Your team’s AI bill doubled. Nobody can tell you why.
The receipt arrives in dollars. The decisions that produced it are scattered across thousands of sessions in Claude, Codex, Cursor, and Gemini.
Your provider’s dashboard stops at the org total. It has every reason to show you the number — and none to show you the waste underneath it.
Your bill doubled last quarter. The dashboard your provider gives you stops at the org level, with no breakdown by repo, engineer, or model.
Half the spend runs on the most expensive model. Nobody knows which half should have been on the cheap one.
Every session reloads the same memory, skills, and MCP tools. Most are never touched — and you pay for all of them.
The dashboard
One workspace. Every tool. The waste, ranked.
Connect your providers and GitHub. Poppi reconciles spend, context budget, and outcomes — then hands you a ranked list of what to change next, not another chart to interpret.
Where Acme Inc's AI spend went
Illustrative figures. Poppi is pre-launch; the dashboard shows live data once your sessions are connected.
What Poppi shows you
Six kinds of waste, called out by name.
Each one ships with the thing you’d actually change next — not just a chart that says something’s wrong.
- Root-context bloat
Every CLAUDE.md, AGENTS.md, and always-on skill loads into every session — including the ones nothing ever references.
Trim 40% of your project memory.
- Skill & tool sprawl
Duplicate MCP tools and overlapping skills that load into context every run but never get invoked.
Cut the ten tools nobody calls.
- Noisy failures
Tool errors, retries, and compaction events that burn context without producing a single usable token.
Quiet the loops that go nowhere.
- Session looping
Agents that re-read the same files and chase the same dead end across run after run.
Break the loop — supply the missing constraint.
- Wrong-sized models
Opus handling work that Haiku already completes elsewhere in your repo. Routine output bought at premium prices.
Move 30% of routine work down a tier.
- Spend outliers
Sessions costing 10× the median for the same outcome — and the upstream ambiguity that produced them.
Triage the long tail before it doubles.
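To make the "10× the median" idea concrete, here is a minimal sketch of how spend outliers could be flagged. It is an illustration only, not Poppi's implementation: the record fields (`id`, `outcome`, `cost_usd`), the function name `find_spend_outliers`, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def find_spend_outliers(sessions, factor=10.0):
    """Return sessions costing at least `factor` times the median cost
    of sessions that share the same outcome label (hypothetical schema)."""
    by_outcome = defaultdict(list)
    for session in sessions:
        by_outcome[session["outcome"]].append(session)

    outliers = []
    for group in by_outcome.values():
        med = median(s["cost_usd"] for s in group)
        if med <= 0:
            continue  # a zero median gives no meaningful ratio
        for s in group:
            if s["cost_usd"] >= factor * med:
                outliers.append(
                    {**s, "median_usd": med, "ratio": s["cost_usd"] / med}
                )
    # Worst offenders first, so the list reads as a triage queue.
    return sorted(outliers, key=lambda o: o["ratio"], reverse=True)


# Made-up sample data for illustration only.
sessions = [
    {"id": "a", "outcome": "merged_pr", "cost_usd": 1.0},
    {"id": "b", "outcome": "merged_pr", "cost_usd": 1.2},
    {"id": "c", "outcome": "merged_pr", "cost_usd": 0.8},
    {"id": "d", "outcome": "merged_pr", "cost_usd": 14.0},
    {"id": "e", "outcome": "abandoned", "cost_usd": 3.0},
]
outliers = find_spend_outliers(sessions)
for o in outliers:
    print(o["id"], round(o["ratio"], 1))
```

Grouping by outcome before taking the median keeps the comparison fair: an expensive session is only flagged against sessions that achieved the same kind of result.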
How it works
From raw sessions to ranked recommendations.
- Step 01
Load your AI coding sessions
Connect Claude, Codex, Cursor, Gemini, and GitHub. Poppi ingests session telemetry and PR metadata from your whole team into one hosted workspace.
- Step 02
Analyze across providers
Spend, context budget, and outcomes — reconciled across every tool. Poppi looks for bloated memory, looping sessions, model mis-fits, and dead-end exploration.
- Step 03
Reports & recommendations
A ranked action list — what to trim, which models to right-size, which loops to break — delivered between blocks of work, not during them.
For engineering leadership
A defensible answer when finance asks about AI spend.
Poppi attributes spend to the unit that matters — per repo, per engineer, per model. You see who spent what, on what kind of work, and whether it landed.
Bring the bill back into the conversation as a number you can actually argue with.
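As a rough sketch of what per-repo, per-engineer, and per-model attribution means in practice, the example below totals the same set of sessions along each dimension. The field names, the `attribute_spend` helper, and the sample values are assumptions made for illustration; they do not describe Poppi's actual data model.

```python
from collections import defaultdict


def attribute_spend(sessions, keys=("repo", "engineer", "model")):
    """Total session cost along each dimension (hypothetical schema).

    Each dimension sums to the same overall total, so the breakdowns
    can be reconciled against the provider's org-level bill.
    """
    totals = {key: defaultdict(float) for key in keys}
    for session in sessions:
        for key in keys:
            totals[key][session[key]] += session["cost_usd"]
    # Largest spenders first within each dimension.
    return {
        key: dict(sorted(values.items(), key=lambda kv: kv[1], reverse=True))
        for key, values in totals.items()
    }


# Made-up sample data for illustration only.
sessions = [
    {"repo": "api", "engineer": "dana", "model": "large", "cost_usd": 4.0},
    {"repo": "api", "engineer": "lee", "model": "small", "cost_usd": 0.5},
    {"repo": "web", "engineer": "dana", "model": "large", "cost_usd": 2.0},
]
report = attribute_spend(sessions)
for dimension, breakdown in report.items():
    print(dimension, breakdown)
```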
How we handle your data
Your prompts and your code stay yours.
We’re pre-launch. These are the commitments we’re building around — and the questions we expect you to ask before you connect anything.
- Metadata-first by default
Tool calls, token counts, model choices, latency, error patterns, and GitHub PR metadata — the shape of a session and its outcomes, not their contents. Deeper inspection only on the repos you opt into.
- You control the scope
Exclude repos or engineers. Pause ingestion at any time. Delete a workspace and the data goes with it.
- Encrypted, isolated, never used to train
Encrypted in transit and at rest. Your data lives in your workspace only — we don't train models on it and we don't resell it.
- On the roadmap: self-host, SOC 2, EU residency
If you need any of these before GA, say so when you join — we're prioritizing by what design partners actually need.
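To illustrate the metadata-first and scope commitments above, here is one way a metadata-only session record and a scope filter could look. This is a sketch under stated assumptions, not Poppi's schema: every class, field, and method name below is hypothetical. The point it shows is that the record describes the shape of a session (counts, model, latency, errors, PR outcome) and has no field for prompt text or source code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SessionMetadata:
    """Hypothetical metadata-only record: no prompt or source-file fields."""
    session_id: str
    repo: str
    engineer: str
    model: str
    input_tokens: int
    output_tokens: int
    tool_calls: int
    tool_errors: int
    latency_ms: int
    pr_number: Optional[int] = None   # set only if the session produced a PR
    pr_merged: Optional[bool] = None


@dataclass
class Scope:
    """Hypothetical ingestion scope: exclusions plus a pause switch."""
    excluded_repos: set = field(default_factory=set)
    excluded_engineers: set = field(default_factory=set)
    paused: bool = False

    def allows(self, record: SessionMetadata) -> bool:
        if self.paused:
            return False
        return (
            record.repo not in self.excluded_repos
            and record.engineer not in self.excluded_engineers
        )


# Made-up sample record for illustration only.
record = SessionMetadata(
    session_id="s-1", repo="api", engineer="dana", model="large",
    input_tokens=12000, output_tokens=800,
    tool_calls=14, tool_errors=2, latency_ms=5400,
)
scope = Scope(excluded_repos={"internal-tools"})
print(scope.allows(record))
```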
FAQ
The questions we hear first.
- What data does Poppi actually need?
Session metadata from your AI tools (tool calls, token counts, model choices, latency, error patterns) plus GitHub PR metadata — which sessions produced PRs, what landed, what got reverted. Your prompts and source files aren't required for the core analysis; opt in per repo if you want deeper inspection.
- How does the integration work?
A small collector runs alongside your existing tools and ships session telemetry to your Poppi workspace. A read-only GitHub app pulls PR metadata so we can attribute outcomes. Setup takes minutes per machine, and your team's day-to-day workflow doesn't change.
- Which tools do you support?
At launch: Claude, Codex, Cursor, and Gemini. If you use something else, tell us when you join — we're shaping the supported list with design partners.
- What will it cost?
Per-seat pricing. We're working out beta terms with the first design partners and will publish GA numbers before anyone gets charged.
- Can we self-host?
Not at launch — it's on the roadmap. If self-hosting is a hard requirement for you, mention it when you join and we'll factor it into priorities.
Design partners
Help shape what Poppi becomes.
The first wave of teams gets early access, white-glove setup, and direct input on what we build next.
No spam. We’ll email when Poppi opens up.