torchsynth is a synthesizer written in PyTorch and modeled on traditional modular synthesis. It is GPU-optional and differentiable.
Most synthesizers are fast in terms of latency. torchsynth is fast in terms of throughput: it synthesizes audio 16200x faster than real time (714 MHz, i.e. about 714 million audio samples per second) on a single GPU. This is of particular interest to audio ML researchers seeking large training corpora.
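To make the throughput figure concrete, the short calculation below relates the 16200x real-time factor to the 714 MHz rate. It assumes a 44.1 kHz sample rate, which is not stated above but is the value that makes the two numbers agree. The variable names are illustrative only.

```python
# Assumption: audio is rendered at 44.1 kHz (not stated in the text above).
SAMPLE_RATE_HZ = 44_100
REALTIME_FACTOR = 16_200  # "16200x faster than real time"

# Audio samples produced per second of wall-clock time.
samples_per_second = SAMPLE_RATE_HZ * REALTIME_FACTOR

# Hours of audio produced per second of wall-clock time.
audio_hours_per_second = REALTIME_FACTOR / 3600

print(f"{samples_per_second / 1e6:.0f} MHz")                 # 714 MHz
print(f"{audio_hours_per_second} hours of audio per second")  # 4.5
```

Under that assumption, one second of compute yields about 4.5 hours of audio, which is why throughput rather than latency matters when building large corpora.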
Additionally, all synthesized audio is returned together with the latent parameters used to generate it. This is useful for multi-modal training regimes.
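The idea of returning audio together with its generating parameters can be sketched with a toy example. This is not torchsynth's API: `render_batch`, its arguments, and the `freq`/`amp` parameters are hypothetical, and the "synth" is a plain sine oscillator written with the Python standard library so the sketch runs anywhere.

```python
import math
import random

SAMPLE_RATE = 44_100  # assumed sample rate in Hz


def render_batch(seed, batch_size=4, n_samples=1024):
    """Hypothetical stand-in for a synth call: returns (audio, params)."""
    rng = random.Random(seed)  # fixed seed so the same batch can be regenerated
    audio, params = [], []
    for _ in range(batch_size):
        freq = rng.uniform(110.0, 880.0)  # latent parameter: pitch in Hz
        amp = rng.uniform(0.1, 1.0)       # latent parameter: peak amplitude
        params.append({"freq": freq, "amp": amp})
        audio.append(
            [amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n in range(n_samples)]
        )
    return audio, params


audio, params = render_batch(seed=0)
# Each clip is paired with the parameters that produced it, so a model can
# be trained to map audio -> parameters or parameters -> audio.
for clip, p in zip(audio, params):
    print(len(clip), round(p["freq"], 1), round(p["amp"], 2))
```

Because every clip arrives with its own labels, no separate annotation step is needed to build a supervised dataset from the output.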
Here is another set of sounds created with the Drum Nebula. In torchsynth a nebula is a set of hyperparameters that defines how synthesizer parameters are sampled. The hyperparameters of the Drum Nebula were hand-tuned to increase the likelihood of the Voice producing percussive