Tonari Lab
Research, findings, and build logs from Tonari Lab. Currently exploring audio-to-synth-parameter estimation, ML for sound design, and whatever else catches our ear.
The concrete step-by-step plan for building the retrieval-first MVP — target synth, renderer, dataset generation, embedding pipeline, and the four experiments that determine whether it works.
Existing synth parameter datasets, why we chose Surge XT for data legality, and the synthetic generation strategy.
Most popular audio embedding models are the wrong tool for synth patch retrieval. Here's the model-by-model breakdown and our two-stage architecture.
An honest assessment of Google Magenta's DDSP for subtractive synth parameter inference — what it can do, what breaks, and how we'll use it instead.
Supervised regression, DDSP, retrieval-based matching, reinforcement learning, and generative models — what works, what doesn't, and what we're using.
Why turning audio into synth parameters is mathematically hard, what tools exist today, and what open-source research is available.
What it was like to build an ML audio tool using Claude Code, the mistakes that come from moving fast without understanding deeply, and why we're doing research before code this time.
A walkthrough of the four-layer architecture behind the original Patch Pilot — from input handling to synth parameter output — and the wild December sprint that brought it to life.
Why we shelved v1, what we learned, and how we're approaching the research reboot for an audio-to-synth-parameter tool.