BerriAI/litellm / LLM gateway

LiteLLM: read this before you install it

LiteLLM looks like a simple proxy until cost, keys, model names, routing, and error handling matter. I would start with one model alias and one test request, then add logging and budgets before letting a team point applications at it.

Project source: BerriAI/litellm
Author / organization: BerriAI
This page is a private experience note, not official documentation.
Future ad placement. Separated from navigation and action links.

Do not proxy every model on day one

I would not start LiteLLM by adding every provider I have. I start with one model alias and one real request. If the first alias is not boringly reliable, a giant routing config will only multiply the confusion.

The prep question is why I need a gateway. If I only have one app and one provider, direct SDK calls may be enough. LiteLLM becomes interesting when I need a consistent OpenAI-compatible surface, centralized keys, budgets, logging, fallback, or provider switching.

I also decide who owns the master key. A gateway that hides provider keys still becomes a sensitive service. Treat it like infrastructure, not just a dev tool.

When an LLM gateway saves real pain

LiteLLM fits when multiple apps or people need model access and I do not want every codebase to know every provider. It is useful for spend tracking, auth, rate limits, virtual keys, and fallback rules.

I would not put it into a small prototype unless the prototype is specifically testing multi-provider behavior. A gateway adds another service to monitor.

My fit check is whether central policy matters. If I need budgets, logs, or model routing, LiteLLM earns its place. If not, it may be premature.

Aliases, providers, keys, budgets, and logs

I read LiteLLM as a policy layer in front of providers. Model aliases map to provider models, keys are stored or referenced through environment variables, and clients call the proxy like an OpenAI-compatible endpoint.

The config file is the real source of truth. I check model names, provider prefixes, environment variable references, and whether each alias is something I would be willing to expose to an app.

I also pay attention to logs. A gateway without request/error visibility becomes a black box between users and providers. I want to know which model was called, what failed, and whether retries happened.

One alias, one request, one failure

My setup path is local CLI first: install proxy extras, run one model alias, call `/chat/completions` with curl, then move to a config file. I do not start with database-backed management unless I need it.

After one call works, I intentionally break the provider key. The proxy should fail clearly. If the error is vague, I enable detailed debug before adding more providers.

Only after the basic path is visible do I add virtual keys or team usage. Policy should be introduced when I can already see request flow.

My LiteLLM command path

Use the prep panel to decide gateway purpose, install LiteLLM, and confirm the CLI starts. If you cannot explain the first alias, do not add a config full of models.

Use the verify panel after creating one alias. Send a curl request through the proxy and confirm the response, logs, and model mapping.

Use the debug panel when the proxy runs but provider calls fail. Check model name, provider prefix, env variable, key scope, base URL, and request format before touching application code.

When the proxy runs but calls fail

If the proxy starts but requests fail, I check whether the alias points to a real provider model and whether the environment variable is actually loaded in the proxy process.

If the app works directly with the provider but fails through LiteLLM, I compare request payloads. Gateways make things cleaner only when the mapping is correct.

If costs are surprising, I check logs and virtual key usage. Cost control is one of the reasons to use the gateway; if you cannot inspect it, you have not finished deployment.

The first gateway rule I would keep

The first rule I would keep is one internal alias like `fast-cheap-chat` pointing to a non-critical model. One app calls it, logs show it, and a broken key produces a clear error.

Then I add a second alias with a stronger model. I do not add automatic fallback until I understand both single-model paths.

After that I add virtual keys and budgets. LiteLLM becomes valuable when policy is visible, not when the config is impressive.

How I would use the command panel

Use the LiteLLM commands by routing rule

model map — Before proxy mode, write the model aliases, providers, keys, budgets, fallback rules, and which clients will call the proxy.

one provider path — Test one provider through the proxy, then one fallback or alias. Keep cost and headers visible before exposing it to apps.

proxy logs first — When calls fail, inspect proxy logs, model alias, provider key, endpoint, rate limits, and whether the client is sending the expected model name.

Field commands I would keep beside this note

# LiteLLM prep

python --version
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
pip install "litellm[proxy]"
litellm --help
# LiteLLM verify

# simple local proxy with one model alias
litellm --model huggingface/bigcode/starcoder

# new shell
curl http://localhost:4000/v1/models

# then test one chat completion using your configured provider/model
# LiteLLM debug

# detailed logs
litellm --model <provider/model> --detailed_debug

# check env loaded in same shell
printenv | grep -E "OPENAI|ANTHROPIC|AZURE|GEMINI|LITELLM"

# compare direct provider call vs proxy call