LLM proxy that lets Claude Code talk to any model
[Original Reddit post](https://www.reddit.com/r/ClaudeCode/comments/1t2jlsk/llm_proxy_that_lets_claude_code_talk_to_any_model/)
I built
rosetta-llm
— an open-source multi-format LLM proxy that acts as a drop-in Claude Code gateway.
- **Works as a Claude Code LLM gateway** — set `ANTHROPIC_BASE_URL` and all configured models appear in `/model` picker
- **Translates between formats** — Anthropic Messages ↔ OpenAI Chat ↔ OpenAI Responses at the wire level
- **Thinking blocks round-trip correctly** — this is the hard part and why I built this
- **Provider routing** — `openai/gpt-5.4`, `anthropic/claude-opus-4-7`, `groq/llama-4` all through one endpoint
- **Streaming on everything** — passthrough fast path + cross-format translation with proper SSE handling
## The thinking-block problem
Most proxies lose reasoning continuity. LiteLLM has had open PRs for thinking block handling for a long time — some dating back months — and they're still not merged. Without proper round-tripping, prompt caching breaks across turns and Claude Code loses context.
Rosetta encodes encrypted reasoning into Anthropic's `signature` field and decodes it back — so multi-turn agentic workflows keep their prompt-cache hits.
## Zero-setup Hugging Face Space
Literally a two-line Dockerfile:
```dockerfile
FROM
ghcr.io/lokesh-chimakurthi/rosetta-llm:latest
COPY --chown=app:app config.json /app/config.json
```
Drop config.json file and above Dockerfile into a HF Space (Docker SDK) and it's running. No clone, no build, no venv. The GHCR image has everything baked in.
## Also works with
```bash
# No install — ephemeral
uvx rosetta-llm
# Persistent install
uv tool install rosetta-llm
rosetta-llm --config ~/.rosetta-llm/config.json
# Docker
docker run -p 7860:7860 \
-v ~/.rosetta-llm/config.json:/app/config.json \
ghcr.io/lokesh-chimakurthi/rosetta-llm:latest
```
## Why another proxy?
I looked at existing solutions:
- **LiteLLM** — thinking block round-trip PRs going nowhere, too many abstractions
- **OpenRouter** — great but closed-source, no self-hosting
- **Direct passthrough proxies** — don't translate between formats
Nothing gave me lossless cross-format translation with proper reasoning fidelity.
## Links
- **GitHub:**
https://github.com/Lokesh-Chimakurthi/rosetta-llm
- **PyPI:**
https://pypi.org/project/rosetta-llm/
## Contributions welcome
I built this for myself and it works for my use cases. But there's a lot more it could do — better multimodal handling, embeddings support, rate limiting, an admin UI. If any of this sounds interesting, PRs are absolutely welcome. Happy to answer questions in the comments.
submitted by
/u/DataNebula
Originally posted by u/DataNebula on r/ClaudeCode
I built
rosetta-llm
— an open-source multi-format LLM proxy that acts as a drop-in Claude Code gateway.
- **Works as a Claude Code LLM gateway** — set `ANTHROPIC_BASE_URL` and all configured models appear in `/model` picker
- **Translates between formats** — Anthropic Messages ↔ OpenAI Chat ↔ OpenAI Responses at the wire level
- **Thinking blocks round-trip correctly** — this is the hard part and why I built this
- **Provider routing** — `openai/gpt-5.4`, `anthropic/claude-opus-4-7`, `groq/llama-4` all through one endpoint
- **Streaming on everything** — passthrough fast path + cross-format translation with proper SSE handling
## The thinking-block problem
Most proxies lose reasoning continuity. LiteLLM has had open PRs for thinking block handling for a long time — some dating back months — and they're still not merged. Without proper round-tripping, prompt caching breaks across turns and Claude Code loses context.
Rosetta encodes encrypted reasoning into Anthropic's `signature` field and decodes it back — so multi-turn agentic workflows keep their prompt-cache hits.
## Zero-setup Hugging Face Space
Literally a two-line Dockerfile:
```dockerfile
FROM
ghcr.io/lokesh-chimakurthi/rosetta-llm:latest
COPY --chown=app:app config.json /app/config.json
```
Drop config.json file and above Dockerfile into a HF Space (Docker SDK) and it's running. No clone, no build, no venv. The GHCR image has everything baked in.
## Also works with
```bash
# No install — ephemeral
uvx rosetta-llm
# Persistent install
uv tool install rosetta-llm
rosetta-llm --config ~/.rosetta-llm/config.json
# Docker
docker run -p 7860:7860 \
-v ~/.rosetta-llm/config.json:/app/config.json \
ghcr.io/lokesh-chimakurthi/rosetta-llm:latest
```
## Why another proxy?
I looked at existing solutions:
- **LiteLLM** — thinking block round-trip PRs going nowhere, too many abstractions
- **OpenRouter** — great but closed-source, no self-hosting
- **Direct passthrough proxies** — don't translate between formats
Nothing gave me lossless cross-format translation with proper reasoning fidelity.
## Links
- **GitHub:**
https://github.com/Lokesh-Chimakurthi/rosetta-llm
- **PyPI:**
https://pypi.org/project/rosetta-llm/
## Contributions welcome
I built this for myself and it works for my use cases. But there's a lot more it could do — better multimodal handling, embeddings support, rate limiting, an admin UI. If any of this sounds interesting, PRs are absolutely welcome. Happy to answer questions in the comments.
submitted by
/u/DataNebula
Originally posted by u/DataNebula on r/ClaudeCode