The Problem Space: What Reverse Sound Design Actually Is
Why turning audio into synth parameters is mathematically hard, what tools exist today, and what open-source research is available.
What we’re trying to solve
Reverse sound design — also called automatic synthesizer programming — is the task of deriving synthesizer control parameters from an audio sample. The forward problem is trivial: given parameters, a synth deterministically produces audio. The reverse problem is hard for three fundamental reasons that took us a while to properly internalize.
Why the inverse problem is hard
It’s mathematically ill-posed. Multiple different parameter configurations can produce perceptually identical audio. This isn’t a data problem — it’s a property of the synthesis process itself. A 2025 ISMIR paper formally characterizes this as arising from synthesizer symmetries: swapping two identical oscillators produces the same output. Standard regression models struggle because they’re forced to pick one “correct” answer from many valid ones, often averaging them into a garbled result.
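The swap symmetry and the averaging failure can both be seen in a toy model. The sketch below is illustrative only (a two-oscillator additive "synth", not Patch Pilot code or any real synthesizer): two distinct parameter vectors render the exact same audio, while their mean renders something else entirely.

```python
import numpy as np

def two_osc_synth(freq_a, freq_b, sr=16000, dur=0.1):
    # Toy additive "synth": the sum of two identical sine oscillators.
    t = np.arange(int(sr * dur)) / sr
    return np.sin(2 * np.pi * freq_a * t) + np.sin(2 * np.pi * freq_b * t)

# Two distinct parameter vectors related by the oscillator-swap symmetry:
audio_a = two_osc_synth(220.0, 440.0)
audio_b = two_osc_synth(440.0, 220.0)
same = np.allclose(audio_a, audio_b)  # identical audio, different "patches"

# An MSE regressor on parameters is pulled toward the mean of the valid
# answers, which renders a *different* sound (a single doubled 330 Hz partial):
audio_mean = two_osc_synth(330.0, 330.0)
different = not np.allclose(audio_a, audio_mean)
```

The point is that the loss should live in audio (or feature) space, not parameter space: parameter-space regression penalizes valid answers for disagreeing with an arbitrary labeling.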
Synthesis involves discontinuities and discrete choices. Waveform selection (saw vs. square vs. triangle), routing choices, and modulation topology are categorical, not continuous. Standard automatic differentiation yields zero or undefined gradients at these discontinuities, which is why early approaches relied on genetic algorithms rather than gradient descent.
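A minimal sketch of the gradient problem, using a hypothetical toy synth with a hard waveform switch (not code from any of the systems discussed): a finite-difference probe of the loss with respect to the selector parameter is exactly flat inside each category and jumps at the category boundary, so gradient descent receives no usable signal.

```python
import numpy as np

def render(waveform_idx, freq=220.0, sr=16000, dur=0.05):
    # Toy synth with a categorical waveform choice driven by one parameter.
    t = np.arange(int(sr * dur)) / sr
    if waveform_idx < 0.5:                       # "saw" branch
        return 2.0 * ((freq * t) % 1.0) - 1.0
    return np.sign(np.sin(2 * np.pi * freq * t))  # "square" branch

def loss(w):
    # Mean squared error against a square-wave target.
    target = render(1.0)
    return float(np.mean((render(w) - target) ** 2))

eps = 1e-4
# Inside the "saw" region the loss is constant: the probe reads exactly zero.
g_inside = (loss(0.3 + eps) - loss(0.3)) / eps
# Across the category boundary the loss steps, so the probe blows up.
g_boundary = (loss(0.5) - loss(0.5 - eps)) / eps
```

Differentiable-synthesis work (e.g. the DiffMoog line discussed below) has to smooth or relax exactly these switches before gradients become informative.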
Domain gap. A model trained on clean synth renders encounters real-world audio that’s mixed, processed, compressed, and recorded through imperfect equipment. There are no ground-truth parameters for “a violin” in a subtractive synth.
This reframed our entire approach: Patch Pilot will not attempt to output “the correct patch.” Instead, it returns a set of plausible patches that recreate a similar timbre. This is the mathematically honest position, and it aligns with how every successful production tool actually works.
What exists today
The market does not have a broadly reliable “upload any song snippet → get accurate subtractive synth settings” solution. What exists works under tight constraints.
Synplant 2 (Genopatch) by Sonic Charge analyzes imported audio and generates FM-based patches iteratively, showing multiple "branch" candidates. Accuracy varies widely with source complexity; it works best on isolated, simple sources.
Replicate by MicroMusic is an AI preset generator that outputs actual VST preset files with stem splitting for mixed audio. Technical implementation isn’t publicly documented.
MYTH by Dawesome/Tracktion does drag-and-drop audio → internal resynthesis, but it’s not synth parameter inference — it transforms audio into its own proprietary oscillator state.
None of these are general-purpose “infer knobs for any external synth” tools. Patch Pilot would not be entering a solved space.
Open-source research codebases
Several projects provide building blocks worth studying:
SpiegeLib — Full pipeline: VST automation via RenderMan, dataset generation, MLP/CNN/LSTM estimators, evaluation metrics. The closest existing open-source analog to Patch Pilot. Linux/macOS only.
InverSynth / InverSynth II (ISMIR 2023) — Differentiable synthesizer-proxy approach with inference-time finetuning. Directly applicable to our architecture.
DiffMoog (2024) — Differentiable modular synth with explicit module routing and signal-chain loss. The clearest example of a differentiable synth that mirrors real synth UIs.
Sound2Synth (IJCAI 2022) — FM parameter estimation for Dexed (DX7 emulation). 155-parameter space. Good FM reference.
Syntheon — Open-source parameter inference for music synthesizers. Direct analog, useful architecture reference.
This is part of a research series for Patch Pilot v2. Next: ML approaches compared.