Neural Acoustic Fields (NAF) is a research implementation that models acoustic propagation in physical scenes as a continuous implicit function. By treating sound propagation as a linear time-invariant system, NAF learns to map any emitter-listener location pair to a neural impulse response that can be applied to arbitrary audio sources.
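Because the scene is treated as linear and time-invariant, applying an impulse response to a source reduces to a convolution. The following is a minimal sketch of that step only, assuming NumPy and a mono signal; the function name `render_at_listener` and the toy impulse response are illustrative and are not taken from the NAF code.

```python
import numpy as np

def render_at_listener(dry: np.ndarray, impulse_response: np.ndarray) -> np.ndarray:
    """Apply an impulse response to a dry mono signal via FFT convolution."""
    n = len(dry) + len(impulse_response) - 1   # length of the full linear convolution
    n_fft = 1 << (n - 1).bit_length()          # next power of two >= n
    spectrum = np.fft.rfft(dry, n_fft) * np.fft.rfft(impulse_response, n_fft)
    return np.fft.irfft(spectrum, n_fft)[:n]

# Toy impulse response (hypothetical): direct path plus one echo
# at half amplitude, three samples later.
ir = np.array([1.0, 0.0, 0.0, 0.5])
dry = np.array([1.0, 2.0, 3.0])
wet = render_at_listener(dry, ir)
```

In a real pipeline the impulse response would come from the trained model for a given emitter-listener pair, and `dry` would be any recorded or synthesized source.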
The system enables continuous spatial audio rendering for listeners at any position in a scene, including novel locations not seen during training. NAF learns magnitude-only representations and uses random phase for reconstruction, similar to Image2Reverb. It also demonstrates how acoustic structure emerges as a byproduct of learning spatial sound propagation, and the learned representations can improve visual learning tasks when only sparse views are available.
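To illustrate what a magnitude-only representation with random phase means in practice, the sketch below builds a time-domain signal from a magnitude spectrum by assigning uniformly random phase to each bin. It is a simplified illustration assuming NumPy and a single full-length spectrum; a time-frequency representation would apply the same idea frame by frame. The function name `random_phase_ir` and the synthetic target are hypothetical, not part of the NAF code.

```python
import numpy as np

def random_phase_ir(magnitude: np.ndarray, n_samples: int, seed: int = 0) -> np.ndarray:
    """Build a real signal whose spectrum has the given magnitude and random phase."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, size=magnitude.shape)
    # The DC bin (and the Nyquist bin for even lengths) of a real signal is real-valued.
    phase[0] = 0.0
    if n_samples % 2 == 0:
        phase[-1] = 0.0
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n_samples)

# Hypothetical target: magnitude spectrum of a decaying noise burst.
rng = np.random.default_rng(1)
n = 64
reference = rng.standard_normal(n) * np.exp(-np.arange(n) / 16.0)
target_mag = np.abs(np.fft.rfft(reference))

ir = random_phase_ir(target_mag, n)
```

The result matches the target magnitude spectrum but not the original waveform, which is the trade-off a magnitude-only model accepts.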
This is research code accompanying a NeurIPS 2022 paper. It provides training and evaluation pipelines for learning acoustic fields from 3D scene data, and includes baseline comparisons against codec-based interpolation methods (AAC-LC, Opus) as well as tools for analyzing spectral accuracy, T60 error, and learned feature representations.
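T60 is the time it takes for the energy of an impulse response to decay by 60 dB. One common way to estimate it is Schroeder backward integration followed by a line fit on the decay curve. The sketch below assumes NumPy, a fit range of -5 dB to -25 dB, and a relative percentage error; these choices and the function names are illustrative assumptions, not the repository's evaluation code.

```python
import numpy as np

def estimate_t60(ir, sample_rate, fit_range_db=(-5.0, -25.0)):
    """Estimate T60 in seconds from an impulse response."""
    energy = np.asarray(ir, dtype=float) ** 2
    # Schroeder backward integration: energy remaining from each sample onward.
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)
    start_db, end_db = fit_range_db
    mask = (edc_db <= start_db) & (edc_db >= end_db)
    t = np.arange(len(energy)) / sample_rate
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # decay rate in dB per second
    return -60.0 / slope                              # extrapolate to a 60 dB drop

def t60_error_percent(predicted_t60, true_t60):
    """Relative T60 error as a percentage of the reference value."""
    return 100.0 * abs(predicted_t60 - true_t60) / true_t60

# Synthetic check: an envelope whose amplitude falls by 60 dB every 0.5 s.
sample_rate = 8000
t = np.arange(int(1.5 * sample_rate)) / sample_rate
synthetic_ir = 10.0 ** (-3.0 * t / 0.5)
estimated = estimate_t60(synthetic_ir, sample_rate)
```

Measured impulse responses contain a noise floor, so the fit range usually needs to stay above it.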
NoiseBandNet
Standalone NoiseBandNet is a neural network architecture for synthesizing controllable sound effects using filterbanks. It provides multiple control schemes: automatic extraction using loudness and spectral centroid, loudness-only control for loudness transfer between sounds, and user-defined control parameters drawn directly on spectrograms. The system uses a DDSP-inspired approach with learned filter banks, allowing real-time parameter manipulation and amplitude randomization for variations.
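The automatic control scheme relies on two frame-wise curves, loudness and spectral centroid. The sketch below shows one simple way to compute such curves, assuming NumPy, RMS level in dB as the loudness measure, and a Hann-windowed FFT for the centroid. The frame sizes and the function name `control_curves` are illustrative assumptions; the actual project may use a different loudness definition.

```python
import numpy as np

def control_curves(audio, sample_rate, frame=1024, hop=256):
    """Return frame-wise loudness (dB RMS) and spectral centroid (Hz)."""
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, d=1.0 / sample_rate)
    loudness_db, centroid_hz = [], []
    for start in range(0, len(audio) - frame + 1, hop):
        chunk = audio[start:start + frame]
        rms = np.sqrt(np.mean(chunk ** 2))
        loudness_db.append(20.0 * np.log10(rms + 1e-9))
        mag = np.abs(np.fft.rfft(chunk * window))
        # Centroid: magnitude-weighted mean frequency of the frame.
        centroid_hz.append(np.sum(freqs * mag) / (np.sum(mag) + 1e-9))
    return np.array(loudness_db), np.array(centroid_hz)

# Example input: one second of a 1 kHz sine at half amplitude.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 1000.0 * t)
loudness, centroid = control_curves(tone, sample_rate)
```

For loudness transfer, a loudness curve extracted from one sound would be fed as the control signal when synthesizing another.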
The tool includes training workflows for custom sound effect datasets and inference notebooks demonstrating loudness transfer, amplitude randomization for stereo generation, and custom control curve synthesis. Users can train models on their own sound libraries and define control parameters through an interactive labeling interface that displays waveforms and spectrograms.
Implemented in PyTorch, NoiseBandNet outputs synthesis parameters that remain controllable after training, with no retraining required, making it suitable for adaptive sound design and procedural audio generation in interactive contexts.
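As a rough illustration of filterbank-based synthesis, the sketch below splits white noise into frequency bands and scales each band with a time-varying amplitude curve before summing. It assumes NumPy, equal-width bands built by FFT masking, and linear interpolation of frame-rate amplitudes; in a trained system the amplitudes would come from the network. All names here are hypothetical and the band design is a simplification, not NoiseBandNet's own.

```python
import numpy as np

def make_noise_bands(n_bands, n_samples, sample_rate, seed=0):
    """Split white noise into contiguous, equal-width frequency bands."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    edges = np.linspace(0.0, sample_rate / 2, n_bands + 1)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spectrum * mask, n=n_samples))
    return np.stack(bands)  # shape: (n_bands, n_samples)

def synthesize(bands, frame_amplitudes):
    """Scale each band by its amplitude curve (n_bands, n_frames) and sum."""
    n_bands, n_samples = bands.shape
    n_frames = frame_amplitudes.shape[1]
    frame_pos = np.linspace(0, n_samples - 1, n_frames)
    sample_pos = np.arange(n_samples)
    out = np.zeros(n_samples)
    for band, amps in zip(bands, frame_amplitudes):
        # Upsample the frame-rate amplitudes to audio rate.
        out += band * np.interp(sample_pos, frame_pos, amps)
    return out

sample_rate = 16000
bands = make_noise_bands(n_bands=4, n_samples=4096, sample_rate=sample_rate)
amplitudes = np.zeros((4, 10))
amplitudes[1] = 1.0  # only the second band (2 kHz to 4 kHz) is audible
audio = synthesize(bands, amplitudes)
```

Changing the random seed of the noise while keeping the amplitude curves fixed yields variations of the same sound, which is one way to picture generating decorrelated stereo channels.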
Raveler
Wwise Raveler is a Wwise plugin that runs RAVE (Realtime Audio Variational autoEncoder) models for real-time timbre transfer via neural audio synthesis in game audio contexts. The plugin provides direct integration of trained RAVE models into Wwise effect chains, enabling neural processing of game audio with adjustable latent space manipulation.
The plugin exposes controls for model performance parameters including latent noise injection, prior sampling, and dry/wet mixing. It offers direct manipulation of up to 8 latent dimensions with bias and scaling controls, all of which can be bound to RTPCs for dynamic runtime control. Buffer settings allow balancing between audio quality and latency based on project requirements.
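The latent controls described above can be pictured as a per-dimension affine transform followed by a dry/wet blend. The sketch below assumes NumPy, a latent layout of (dimensions, frames), scale applied before bias, and Gaussian noise for the injection step. These are illustrative assumptions with hypothetical names, not the plugin's actual processing code.

```python
import numpy as np

MAX_CONTROLLED_DIMS = 8  # the plugin exposes up to 8 latent dimensions

def shape_latents(latents, scale, bias, noise_amount=0.0, seed=0):
    """Apply per-dimension scale and bias to the first latent dimensions.

    latents: array of shape (n_dims, n_frames)
    scale, bias: one value per controlled dimension (at most 8)
    """
    out = np.array(latents, dtype=float)
    n = min(len(scale), MAX_CONTROLLED_DIMS, out.shape[0])
    out[:n] = out[:n] * np.asarray(scale)[:n, None] + np.asarray(bias)[:n, None]
    if noise_amount > 0.0:
        rng = np.random.default_rng(seed)
        out += noise_amount * rng.standard_normal(out.shape)
    return out

def dry_wet(dry, wet, mix):
    """Linear blend: mix=0 returns the dry signal, mix=1 the processed one."""
    return (1.0 - mix) * np.asarray(dry) + mix * np.asarray(wet)

# Hypothetical values, as if driven by game parameters at runtime.
latents = np.ones((16, 4))
shaped = shape_latents(latents, scale=np.full(8, 2.0), bias=np.full(8, 0.5))
blended = dry_wet(np.zeros(4), np.ones(4), mix=0.25)
```

Binding `scale`, `bias`, and `mix` to RTPCs would let gameplay state drive these values continuously.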
Based on the RAVE VST project, Raveler brings research-grade neural audio synthesis techniques into production game audio workflows through Wwise's standard plugin architecture. Note: the core is released under CC BY-NC 4.0 (non-commercial), which restricts use in commercial products.