LiteLLM connector

Use your LiteLLM data for reporting, automation and AI.

Data Panda brings your LiteLLM proxy data together with the data from the rest of your business. From one place, we turn it into dashboards, automations, AI workflows and custom apps your team uses every day.

About LiteLLM

One gateway in front of every model you call.

LiteLLM is an open-source LLM gateway built by BerriAI, a Y Combinator company started in 2023 by Krrish Dholakia and Ishaan Jaff. The repo is on GitHub at BerriAI/litellm with 44k+ stars and a thousand-plus contributors. Stripe, Netflix, Google ADK, Greptile and OpenHands run it in production, alongside the long tail of teams that wanted one endpoint instead of ten provider SDKs.

The product has two shapes. The Python SDK lets code call litellm.completion() with the same call signature regardless of provider. The Proxy is a self-hosted server (Docker, Kubernetes or the LiteLLM CLI) that exposes one OpenAI-compatible REST endpoint and routes the call to OpenAI, Anthropic, Azure OpenAI, Amazon Bedrock, Google Vertex AI, Cohere, Mistral, Hugging Face, Groq or any of the 100+ providers it speaks. Around that proxy sit virtual API keys per team or user, per-key and per-team budgets, rate limits, automatic fallback and retry across deployments, response caching and structured spend logs. The Postgres tables behind the proxy (LiteLLM_VerificationToken, LiteLLM_TeamTable, LiteLLM_UserTable, LiteLLM_SpendLogs, LiteLLM_BudgetTable) hold every request with the api_key, user, team_id, end_user, model, model_group, prompt_tokens, completion_tokens, total_tokens, spend, request_tags and metadata attached, which is what turns a stack of provider invoices into something a finance and ML team can query.
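
A minimal sketch of the SDK shape described above: the same completion call sent to two different providers through one function. The model names are illustrative and the provider API keys are assumed to be set as environment variables.

  import litellm

  # Same call signature whether the request is served by OpenAI or Anthropic;
  # litellm maps each provider's API behind one function and returns an
  # OpenAI-shaped response in both cases.
  messages = [{"role": "user", "content": "Summarise this ticket in one line."}]

  openai_reply = litellm.completion(model="openai/gpt-4o-mini", messages=messages)
  anthropic_reply = litellm.completion(model="anthropic/claude-3-5-sonnet-20240620", messages=messages)

  print(openai_reply.choices[0].message.content)
  print(anthropic_reply.choices[0].message.content)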

What your LiteLLM data is for

What you get once LiteLLM is connected.

LLM spend attributed to teams, keys and end-users

Spend, tokens and model mix per virtual key, team and end-user across every provider on one timeline.

  • Spend per virtual key joined to the team or feature that owns the key (see the sketch after this list)
  • OpenAI versus Anthropic versus Bedrock versus Vertex split per team and per week
  • Cost per end-user from the spend log so you know which customer is producing which line
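
As a rough sketch of that roll-up, assuming read access to the proxy's Postgres database: the table and column names follow the LiteLLM schema described above, but the connection string is a placeholder and the exact column set can differ between proxy versions.

  import pandas as pd
  from sqlalchemy import create_engine

  # Hypothetical read-only connection to the proxy's Postgres database.
  engine = create_engine("postgresql://readonly:***@litellm-db:5432/litellm")

  # Spend and token totals per team and virtual key over the last 30 days.
  query = """
      SELECT team_id,
             api_key,
             SUM(spend)             AS spend,
             SUM(prompt_tokens)     AS prompt_tokens,
             SUM(completion_tokens) AS completion_tokens
      FROM "LiteLLM_SpendLogs"
      WHERE "startTime" >= NOW() - INTERVAL '30 days'
      GROUP BY team_id, api_key
      ORDER BY spend DESC
  """
  spend_per_key = pd.read_sql(query, engine)
  print(spend_per_key.head(10))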

Cost-control automation

Push usage signals back into the tools where decisions about LLM spend really get made.

  • Slack alert when a team crosses 80% of its monthly LiteLLM budget (sketched below)
  • Virtual key paused when a single end-user burns more than the contracted token allowance in a day
  • CRM contact tagged when a customer's AI usage crosses what their plan assumed
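
A sketch of that budget alert, assuming a Slack incoming-webhook URL and a max_budget column on the team table; the webhook, the 80% threshold and the column names are placeholders to adapt to your deployment.

  import pandas as pd
  import requests
  from sqlalchemy import create_engine

  SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # hypothetical webhook
  engine = create_engine("postgresql://readonly:***@litellm-db:5432/litellm")  # placeholder

  # Month-to-date spend per team next to the team's monthly budget.
  # max_budget on LiteLLM_TeamTable is an assumption; check your schema.
  burn = pd.read_sql("""
      SELECT s.team_id,
             SUM(s.spend)      AS month_to_date,
             MAX(t.max_budget) AS monthly_budget
      FROM "LiteLLM_SpendLogs" s
      JOIN "LiteLLM_TeamTable" t ON t.team_id = s.team_id
      WHERE s."startTime" >= date_trunc('month', NOW())
      GROUP BY s.team_id
  """, engine)

  for row in burn.itertuples():
      if row.monthly_budget and row.month_to_date >= 0.8 * row.monthly_budget:
          requests.post(SLACK_WEBHOOK, json={
              "text": f"Team {row.team_id} is at "
                      f"{row.month_to_date / row.monthly_budget:.0%} of its monthly LLM budget."
          })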

AI workflows on LLM usage

Use LiteLLM history to feed the next round of routing and prompt decisions.

  • Routing scoring that picks the cheapest provider per request based on past quality and cost on the same prompt
  • Fallback analysis showing which primary models keep failing over to secondary deployments, and why
  • Cache-hit ratio per template, so prompts that lost the cache get caught the same week (sketched below)
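
For the cache-hit point, a sketch that computes the hit ratio per request tag from the spend log; the cache_hit column and the stored format of request_tags vary between LiteLLM versions, so both are assumptions to verify against your proxy.

  import pandas as pd
  from sqlalchemy import create_engine

  engine = create_engine("postgresql://readonly:***@litellm-db:5432/litellm")  # placeholder

  logs = pd.read_sql("""
      SELECT request_tags, cache_hit
      FROM "LiteLLM_SpendLogs"
      WHERE "startTime" >= NOW() - INTERVAL '7 days'
  """, engine)

  # request_tags is assumed to arrive as a list per call; explode gives one row per tag.
  logs = logs.explode("request_tags").dropna(subset=["request_tags"])

  # cache_hit is assumed to arrive as the strings "True" / "False".
  logs["hit"] = logs["cache_hit"].astype(str).str.lower() == "true"

  ratio = logs.groupby("request_tags")["hit"].mean().sort_values()
  print(ratio)  # tags at the top missed the cache most often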

Custom apps on your data

Internal tools on LiteLLM data for teams that do not log into the proxy admin UI.

  • LLM cost dashboard per team, per product feature, per week
  • Per-customer AI-usage view next to MRR for finance and customer success
  • Provider mix and fallback dashboard so platform engineering sees which providers are carrying load
Use cases

Use cases we deliver with LiteLLM data.

A list of concrete reports, automations and AI features we have built on LiteLLM data. Pick the one that matches your situation.

  • Spend per virtual key: Total spend, prompt tokens and completion tokens per LiteLLM virtual key over any time window.
  • Spend per team: Roll-up of every virtual key under a team, against the team's monthly budget.
  • Provider mix per team: Share of calls and spend across OpenAI, Anthropic, Bedrock, Vertex and the rest, per team and per week.
  • Cost per end-user: Spend per end_user tag on the spend log, joined to the CRM customer behind the tag.
  • Fallback and retry rate: How often a primary model falls back to a secondary deployment, and what that costs in extra tokens.
  • Cache-hit ratio per template: Cached responses divided by total calls per request_tag, to spot prompts that lost the cache after a release.
  • Budget burn per team: Spend against budget per team, per key and per end-user, with projected month-end position.
  • Model_group routing audit: Which underlying deployment served each call, useful for verifying the load-balancing config behaves as written.
  • Per-feature unit economics: Spend per request_tag joined to product events, for cost per AI action instead of cost per call.
  • Multi-deployment consolidation: Usage across several LiteLLM proxy instances rolled up into one warehouse view.
Real business questions

Answers you will finally get.

Which team is driving our LLM bill, and on which provider?

Spend per LiteLLM team over the last thirty days, split by underlying provider (OpenAI, Anthropic, Bedrock, Vertex, the rest), with the model_group and request_tag breakdown on top. It surfaces the one team whose agent loop on GPT-4 class models produces most of the bill while the support copilot on a cheaper tier barely registers, and it does so before the next provider invoice arrives as one number per provider.

Are we cheaper on OpenAI or on Anthropic for this workflow?

Spend and total_tokens per request_tag joined to model_group, with cost-per-call and quality signals from your eval pipeline alongside. Lets the ML team answer "is the same prompt cheaper on Sonnet or on GPT-4o mini?" with a number instead of a hunch, and lets routing rules be set on something other than the loudest opinion in the room.

Which customers are running AI features beyond what their plan assumed?

Spend per end_user from the LiteLLM spend log, joined to CRM customer, plan tier and MRR. Shows the customer on a small plan whose AI assistant is producing tens of thousands of euros of LLM spend a month, so account management gets a real number to take into the renewal conversation, and customer success can flag accounts whose usage is creeping toward what the contract assumed.
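
A sketch of that join, assuming the CRM side arrives as a simple export keyed by the same end-user identifier; the file name and the CRM column names are placeholders.

  import pandas as pd
  from sqlalchemy import create_engine

  engine = create_engine("postgresql://readonly:***@litellm-db:5432/litellm")  # placeholder

  # Month-to-date spend per end_user from the spend log.
  usage = pd.read_sql("""
      SELECT end_user, SUM(spend) AS monthly_spend
      FROM "LiteLLM_SpendLogs"
      WHERE "startTime" >= date_trunc('month', NOW())
      GROUP BY end_user
  """, engine)

  # Hypothetical CRM export: one row per customer with plan tier and MRR.
  crm = pd.read_csv("crm_accounts.csv")  # columns: end_user, customer, plan_tier, mrr

  report = usage.merge(crm, on="end_user", how="left")
  report["spend_vs_mrr"] = report["monthly_spend"] / report["mrr"]
  print(report.sort_values("spend_vs_mrr", ascending=False).head(20))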

Value for everyone in the organisation

Where each function gets value.

For finance leaders

LLM spend per team, per virtual key and per end-user instead of one line on each provider's invoice. AI cost moves from a fixed unknown spread over OpenAI, Anthropic and Bedrock into one metric tied to the teams and customers that produce it, in time to act on it before the next quarterly commitment renewal.

For sales leaders

AI usage per customer in the same record reps already open, so a customer running heavy GPT-4 traffic on a small plan becomes a renewal conversation instead of a surprise on the year-end review.

For operations

Provider mix, fallback rate, model_group routing and cache-hit ratio per template followed as a curve over ninety days. The behaviour of the AI features stops being rediscovered the morning a deploy quietly tripled the bill on one provider.

Your existing tools

Your data lands in a warehouse. Your BI tools read from it.

You keep the reporting tool you already have. We connect it to the warehouse where your LiteLLM data lives.

  • Power BI (Microsoft)
  • Microsoft Fabric
  • Snowflake (data warehouse)
  • BigQuery (Google)
  • Tableau (visualisation)
  • Excel (sheets & pivots)
Three steps

From LiteLLM to answers in three steps.

01

Connect securely

OAuth authentication. Read-only by default. We sign a DPA and your admin keeps the keys.

02

Land in your warehouse

Data flows into your warehouse on your schedule. Near real time or nightly, your call. You own the data.

03

Reporting, automation, AI

We build the first dashboard, workflow or AI feature with you, then hand over the keys. Or we stay on for ongoing delivery.

Two ways to work with us

Pick the track that fits how you work.

Track 01

Self-serve

We set up the foundation. Your team builds on top.

  • LiteLLM connector configured and running
  • Warehouse set up in your cloud account
  • Clean access for your Power BI, Fabric or Tableau team
  • Documentation on what's in the data model
  • Sync monitoring so you're warned before reports break

Best fit Teams that already have a BI analyst or data engineer and want to own the build.

Track 02

Done for you

We build the whole thing, end to end.

  • Everything in Self-serve
  • Dashboards built to the questions your team actually asks
  • Automations between your systems
  • AI workflows scoped to real tasks your team runs
  • Custom apps where a dashboard does not cut it
  • Ongoing delivery at a pace that fits your team

Best fit Teams without in-house BI or dev capacity. You tell us what you need and we deliver it.

Before you book

Frequently asked questions.

Who owns the data?

You do. It lands in your warehouse, on your cloud account. We don't resell or aggregate it. If you stop working with us, the warehouse stays yours and keeps running.

How fresh is the data?

Near real time for most operational systems. For heavier sources we schedule hourly or nightly. You pick based on what the reports need.

Do I need a warehouse already?

No. If you don't have one, we help you pick one and set it up as part of the first delivery. Common starting points are Snowflake, Microsoft Fabric, or a small Postgres instance.

Which LiteLLM data does the connector really pull?

The proxy's Postgres tables are the primary source: LiteLLM_SpendLogs (one row per call, with api_key, user, team_id, end_user, model, model_group, prompt_tokens, completion_tokens, total_tokens, spend, request_tags and metadata), LiteLLM_VerificationToken (the virtual keys), LiteLLM_TeamTable, LiteLLM_UserTable and LiteLLM_BudgetTable. Customer prompts and completions are only available if your proxy is configured to log them, and even then the connector defaults to pulling metering data only.

We already pull data from OpenAI and Anthropic directly. Why also pull LiteLLM?

The provider connectors give you what each provider sees: tokens and spend per their own API key. LiteLLM gives you what your organisation sees: the same call attributed to a virtual key, a team, an end-user and a request_tag, across every provider on one schema. The two are complementary. The provider connectors keep the invoice honest; the LiteLLM connector keeps the internal allocation honest.

Does it matter whether we self-host the proxy or run the LiteLLM Cloud version?

The schema is the same in both cases, so the warehouse model does not change. Self-hosted deployments expose Postgres directly to the connector; cloud deployments expose the same data through the LiteLLM API. Multiple proxy instances (for example, one per environment or one per region) can be unioned in the warehouse so spend is reported once across the lot.
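
A sketch of that union as a single warehouse view, assuming each proxy instance's spend log has already landed as its own table; the schema names, proxy labels and warehouse connection are placeholders.

  from sqlalchemy import create_engine, text

  warehouse = create_engine("postgresql://analytics:***@warehouse:5432/analytics")  # placeholder

  # One view over spend logs landed from several proxy instances,
  # with each row tagged by the proxy it came from.
  with warehouse.begin() as conn:
      conn.execute(text("""
          CREATE OR REPLACE VIEW all_litellm_spend AS
          SELECT 'eu-prod' AS proxy, * FROM litellm_eu_prod.spend_logs
          UNION ALL
          SELECT 'us-prod' AS proxy, * FROM litellm_us_prod.spend_logs
      """))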

GDPR-compliant
Data stays in the EU
You own the warehouse

A first deliverable live in four to six weeks.

We review your LiteLLM setup and the systems around it. Together we pick the first thing worth building.