The audit before the publish

The MCP server worked. It had been running against live ElevenLabs agents for weeks. But “works on my machine” is a different standard from “strangers will install this from npm.” Before publishing, I did a full audit across four areas: security, error handling, AI discoverability, and state management.

It found more than I expected.

Path traversal

Every tool interpolated user-supplied IDs directly into URL paths:

```
/v1/convai/agents/${agent_id}
```

A malformed ID like ../../v1/user/subscription could walk up the path and hit an entirely different endpoint. ElevenLabs might reject it. Or it might not. I don’t control their routing, and I shouldn’t rely on it.

The fix was a validatePathSegment() guard that rejects anything containing slashes, dots, or other characters that have no business in an ID. Every tool that interpolates a user value into a URL path calls it before constructing the request.
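The published guard isn't shown here, but a minimal sketch of such a check might look like the following — the regex, error message, and `agentUrl` helper are illustrative assumptions, not the actual implementation:

```typescript
// Sketch of a path-segment guard (the real validatePathSegment may differ).
// Rejects anything that could alter the URL path: slashes, dots, percent
// escapes, or any character outside a conservative ID alphabet.
const SAFE_SEGMENT = /^[A-Za-z0-9_-]+$/;

function validatePathSegment(value: string, name: string): string {
  if (!SAFE_SEGMENT.test(value)) {
    throw new Error(`Invalid ${name}: only letters, digits, _ and - allowed`);
  }
  return value;
}

// Every URL-building site calls the guard before interpolating:
function agentUrl(agentId: string): string {
  return `/v1/convai/agents/${validatePathSegment(agentId, "agent_id")}`;
}
```

An allowlist is the right shape here: rejecting known-bad characters (`/`, `.`) invites bypasses via encodings, while accepting only a known-good alphabet fails closed.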

Error body leaks

The formatError() helper was dumping raw API response bodies into MCP error messages. That seems reasonable until you consider what happens if the upstream API echoes request headers in its error response: you’ve just handed the user’s API key to the LLM context window.

I added field-level redaction and a 500-character cap on error bodies. Enough detail to debug, not enough to leak secrets.
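A sketch of what that sanitization could look like — the field names and the exact truncation behavior are assumptions, not the published code:

```typescript
// Sketch of error-body sanitization: redact sensitive fields, then cap length.
const MAX_ERROR_BODY = 500;
const SENSITIVE_FIELDS = ["xi-api-key", "authorization", "api_key"];

function sanitizeErrorBody(body: string): string {
  let out = body;
  for (const field of SENSITIVE_FIELDS) {
    // Redact `"field": "value"` pairs, case-insensitively.
    const pattern = new RegExp(`("${field}"\\s*:\\s*")[^"]*(")`, "gi");
    out = out.replace(pattern, "$1[REDACTED]$2");
  }
  return out.length > MAX_ERROR_BODY
    ? out.slice(0, MAX_ERROR_BODY) + "…[truncated]"
    : out;
}
```

Redacting before truncating matters: a secret that straddles the 500-character boundary would otherwise survive in partial form.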

Missing input limits

Zod schemas had .min(1) on string inputs but no maximum. An LLM could theoretically send megabytes in a prompt field. I added reasonable caps — 10,000 characters for prompt text, 2,048 for URLs, shorter limits for IDs and names.

HTTP webhooks

Webhook URL fields accepted plain HTTP. For a tool that configures where an AI agent sends conversation data, that’s not acceptable. Added HTTPS enforcement via a Zod refinement — the schema itself rejects http:// URLs before the tool logic ever runs.

process.exit considered harmful

A utility function called process.exit(1) on fatal errors. Fine for a CLI tool, terrible for a library. Anyone importing this module for testing would watch their test runner die mid-suite. Changed it to throw, letting the caller decide what “fatal” means.
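The shape of that change, sketched with an assumed `fatal()` helper and error class (names are illustrative, not from the source):

```typescript
// Before (assumed shape): process.exit(1) killed the whole process,
// including any test runner that imported this module.

// After: throw a typed error and let the caller decide what "fatal" means.
class FatalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FatalError";
  }
}

function fatal(message: string): never {
  throw new FatalError(message);
}

// A CLI entry point can still catch FatalError and exit(1);
// a test harness simply catches and asserts.
```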

AI discoverability

This one isn’t security, but it matters for an MCP server. Tools that modify live agents — updating system prompts, changing LLM settings, configuring webhooks — now carry explicit warnings in their descriptions. The AI reads these before deciding to call the tool, and the warnings tell it to confirm with the user first.

I also added cross-references between related tools. If you’re looking at get_agent, the description mentions update_agent_settings exists. Small thing, but it helps the AI navigate a 15-tool surface without the user having to spell out every step.
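The two techniques combine in the tool metadata itself. The descriptions below are paraphrased illustrations, not the published strings:

```typescript
// Illustrative tool metadata: a destructive-action warning plus
// cross-references between related tools.
const updateAgentSettings = {
  name: "update_agent_settings",
  description:
    "Modifies a LIVE agent's configuration (system prompt, LLM settings, " +
    "webhooks). Confirm the exact changes with the user before calling. " +
    "To inspect current values first, see get_agent.",
};

const getAgent = {
  name: "get_agent",
  description:
    "Returns an agent's full configuration. Read-only. " +
    "To change settings, see update_agent_settings.",
};
```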

Published

The server is live on npm as twelvelabs-mcp-server (the name predates a rename — it targets ElevenLabs Conversational AI, not TwelveLabs video) and registered on the official MCP Registry.