Why I Built My Own ElevenLabs MCP Server

After mapping the gaps in ElevenLabs’ official MCP connector, the decision was obvious. The Conversational AI API — agents, conversations, transcripts, knowledge bases — had zero MCP coverage. I could wait for ElevenLabs to ship it, or I could build it myself.

I built it myself.

Architecture

The server is straightforward: a single Axios client with shared auth (the API key is read from an environment variable), stdio transport, and tool files organized by domain: agents, conversations, knowledge-base, voices. Every tool input gets Zod validation. No surprises.
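To make that layout concrete, here is a minimal sketch of one tool definition with validated input and shared auth. It is an illustration under assumptions, not the project's actual code: the names (`ToolSketch`, `getAgentSketch`, `callToolSketch`) are hypothetical, a hand-written validator stands in for the Zod schema so the snippet runs with no dependencies, and the `xi-api-key` header and `/v1/convai/agents/...` path reflect my understanding of the ElevenLabs API rather than anything stated above.

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type Domain = "agents" | "conversations" | "knowledge-base" | "voices";

type Validation<T> =
  | { status: "valid"; value: T }
  | { status: "invalid"; error: string };

interface ToolSketch<In> {
  name: string;
  domain: Domain;
  // Stand-in for a Zod schema: checks unknown input, returns a typed value.
  validate: (input: unknown) => Validation<In>;
  // Describes the request the tool would make with the validated input.
  describe: (input: In) => string;
}

// Shared auth, built once and reused by every tool.
// Assumption: the API expects the key in an "xi-api-key" header.
function authHeaders(apiKey: string): Record<string, string> {
  if (apiKey.trim() === "") {
    throw new Error("API key is missing");
  }
  return { "xi-api-key": apiKey };
}

// One tool in the "agents" domain (tool name and path are assumptions).
const getAgentSketch: ToolSketch<{ agentId: string }> = {
  name: "get_agent",
  domain: "agents",
  validate: (input) => {
    if (typeof input !== "object" || input === null) {
      return { status: "invalid", error: "input must be an object" };
    }
    const agentId = (input as Record<string, unknown>)["agentId"];
    if (typeof agentId !== "string" || agentId.length === 0) {
      return { status: "invalid", error: "agentId must be a non-empty string" };
    }
    return { status: "valid", value: { agentId: agentId } };
  },
  describe: (input) => "GET /v1/convai/agents/" + input.agentId,
};

// Validate first; only validated input ever reaches the tool body.
function callToolSketch<In>(tool: ToolSketch<In>, rawInput: unknown): string {
  const result = tool.validate(rawInput);
  if (result.status === "invalid") {
    return "invalid input for " + tool.name + ": " + result.error;
  }
  return tool.describe(result.value);
}
```

In the real server the validator would be a Zod schema and `describe` would be an Axios call made with the shared headers. The point of the shape is that bad input is rejected before any HTTP request is attempted.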

Sixteen tools span those four domains. The agents tools handle listing agents, reading full configs, and updating prompts and settings. The conversations tools pull session lists with metadata, retrieve full transcripts, and fetch scoring analysis. Knowledge-base tools manage documents: listing, reading, adding text or URLs, and deleting. Voice tools list available voices with their metadata.

None of this is complex individually. The value is in having all of it accessible from a single Claude conversation.

The Real Win

The key differentiator over the ElevenLabs dashboard isn’t any single tool — it’s composition. Claude can now read an agent’s full config, compare it side-by-side with another agent, pull the last 20 conversation transcripts, identify patterns in where users get confused, and suggest specific prompt changes. All in one conversation.

That feedback loop used to take 10 minutes of clicking through dashboard tabs, copying text into documents, and manually cross-referencing. Now it takes one sentence: “Compare the tutoring agent’s prompt against the last week of transcripts and tell me where users are getting stuck.”

This is exactly the kind of problem I hit during production hardening — iterating on agent prompts while monitoring conversation quality. Back then I was doing it by hand. Now I have tooling.

Built in a Session

The entire server went from zero to working in a single session with Claude Code. The MCP SDK handles the transport layer and tool registration. Zod handles validation. Axios handles HTTP. The actual work is just mapping API endpoints to tool definitions — mechanical, but the kind of mechanical work that an AI pair programmer eats for breakfast.
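The "mapping endpoints to tool definitions" step can be sketched as a small declarative table plus a path builder. This is a hedged illustration: the tool names, the `EndpointSpec` shape, and the two example paths are assumptions for demonstration, not the project's actual definitions, and the MCP SDK registration and Axios call are left out so the snippet stays dependency-free.

```typescript
// Hypothetical sketch: a table of endpoint specs, one row per tool.
interface EndpointSpec {
  tool: string; // tool name exposed to the model (assumed)
  method: "GET" | "POST" | "PATCH" | "DELETE";
  path: string; // path template; {param} marks a path parameter
}

// Two example rows; the paths are assumptions about the API layout.
const endpointTable: EndpointSpec[] = [
  { tool: "list_agents", method: "GET", path: "/v1/convai/agents" },
  {
    tool: "get_conversation",
    method: "GET",
    path: "/v1/convai/conversations/{conversation_id}",
  },
];

// Fill {param} placeholders, URL-encoding each value.
function buildPath(template: string, params: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = params[key];
    if (value === undefined) {
      throw new Error("missing path parameter: " + key);
    }
    return encodeURIComponent(value);
  });
}

// Look up a tool by name and describe the request it maps to.
function describeRequest(
  toolName: string,
  params: Record<string, string>
): string {
  for (const spec of endpointTable) {
    if (spec.tool === toolName) {
      return spec.method + " " + buildPath(spec.path, params);
    }
  }
  throw new Error("unknown tool: " + toolName);
}
```

With a table like this, adding a tool is mostly a matter of adding a row and an input schema, which is why the work is mechanical.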

One note on the name: the project is codenamed “TwelveLabs.” This has nothing to do with the TwelveLabs video understanding platform. It’s just a codename. I mention this because someone will Google it and get confused.

What’s Next

Sixteen tools, four domains, one session. The server worked. But “works on my machine” is a familiar trap. The tool definitions needed tighter validation, error handling was optimistic at best, and there was no test coverage.

Before open-sourcing, I needed to harden it.