Direction-of-arrival (DOA) estimation with a microphone array: classical MUSIC vs a learned estimator.
A focused comparison between a classical array-processing method and a small neural network for estimating where sound sources are, using a uniform linear array. The study shows where a data-driven estimator helps most: in the low-SNR regime, where the classical subspace method becomes unreliable.
- Simulates multi-microphone snapshots from a uniform linear array (ULA) for one or more sources at given angles, with controllable SNR and number of snapshots.
- Estimates the source directions two ways:
  - MUSIC: the classical subspace method (eigendecompose the spatial covariance, scan angle for peaks).
  - Learned model: a small neural network mapping the covariance features to a spatial spectrum whose peaks give the DOAs.
- Compares their accuracy (RMSE) across SNR over many random scenarios.
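The simulation and MUSIC steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in src/ula.py or src/music.py: the function names, the half-wavelength spacing, the unit-power uncorrelated sources, and the simple local-maximum peak picker are all assumptions made for the example.

```python
import numpy as np

def steering_matrix(thetas_deg, n_mics, spacing=0.5):
    """ULA steering vectors as columns; spacing is in wavelengths (assumed 0.5)."""
    thetas = np.deg2rad(np.atleast_1d(np.asarray(thetas_deg, dtype=float)))
    k = np.arange(n_mics)[:, None]
    return np.exp(-2j * np.pi * spacing * k * np.sin(thetas)[None, :])

def complex_noise(shape, rng):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_snapshots(thetas_deg, n_mics, n_snapshots, snr_db, rng):
    """Narrowband snapshots X of shape (n_mics, n_snapshots)."""
    A = steering_matrix(thetas_deg, n_mics)
    S = complex_noise((A.shape[1], n_snapshots), rng)   # unit-power sources
    noise_std = 10.0 ** (-snr_db / 20.0)                # per-source SNR in dB
    return A @ S + noise_std * complex_noise((n_mics, n_snapshots), rng)

def music_spectrum(X, n_sources, grid_deg):
    """MUSIC pseudo-spectrum evaluated on an angle grid (degrees)."""
    n_mics = X.shape[0]
    R = X @ X.conj().T / X.shape[1]                     # sample spatial covariance
    _, eigvecs = np.linalg.eigh(R)                      # eigenvalues in ascending order
    En = eigvecs[:, : n_mics - n_sources]               # noise subspace
    A = steering_matrix(grid_deg, n_mics)
    proj = np.linalg.norm(En.conj().T @ A, axis=0) ** 2
    return 1.0 / np.maximum(proj, 1e-12)

def pick_peaks(spectrum, grid_deg, n_sources):
    """Angles of the n_sources largest interior local maxima, sorted ascending."""
    mid = spectrum[1:-1]
    idx = np.where((mid > spectrum[:-2]) & (mid >= spectrum[2:]))[0] + 1
    top = idx[np.argsort(spectrum[idx])[::-1][:n_sources]]
    return np.sort(grid_deg[top])

rng = np.random.default_rng(0)
grid = np.arange(-90.0, 90.5, 0.5)
X = simulate_snapshots([-20.0, 30.0], n_mics=8, n_snapshots=200, snr_db=20.0, rng=rng)
print(pick_peaks(music_spectrum(X, 2, grid), grid, 2))
```

Note that this sketch passes the true number of sources to MUSIC; how the repository handles an unknown source count is not covered here.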
src/ula.py # ULA signal model: steering vectors, snapshots, covariance
src/music.py # classical MUSIC DOA estimator
src/learned.py # dataset generation, neural model, training
src/evaluate.py # RMSE-vs-SNR comparison and example spectra
results/ # figures
pip install -r requirements.txt
python src/evaluate.py
This trains the learned model and produces both figures in results/.
The learned estimator stays accurate at low SNR, while MUSIC's error rises sharply at -5 dB because its subspace estimate degrades. At higher SNR, classical MUSIC is excellent and slightly more accurate than the learned model, whose accuracy is limited by its angle-grid resolution. The takeaway: a data-driven estimator is most valuable precisely in the hard, noisy regime.
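For reference, an RMSE of this kind can be computed as below. This is a hedged sketch: the function name `doa_rmse` is hypothetical, and matching estimated to true angles by sorting each trial is an assumption that may differ from what src/evaluate.py does.

```python
import numpy as np

def doa_rmse(true_deg, est_deg):
    """RMSE in degrees; each row is one trial, angles are paired after sorting."""
    t = np.sort(np.asarray(true_deg, dtype=float), axis=-1)
    e = np.sort(np.asarray(est_deg, dtype=float), axis=-1)
    return float(np.sqrt(np.mean((t - e) ** 2)))

# One trial with two sources, each estimate off by one degree.
print(doa_rmse([[-20.0, 30.0]], [[31.0, -21.0]]))  # 1.0
```

Sorting is the simplest pairing rule and works when every trial returns the same number of estimates as there are sources; trials with missed or extra peaks would need an explicit assignment step.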
- Uniform linear array, narrowband model. The standard setting for DOA, with directly controllable SNR and snapshot count, which makes the low-SNR comparison clean.
- Covariance features. Both methods consume the spatial covariance; the network sees the real and imaginary parts of its upper triangle, so the comparison is fair.
- Soft-label spectrum. The network regresses a Gaussian-bump spatial spectrum rather than raw angles, which handles a variable number of sources naturally and mirrors the MUSIC pseudo-spectrum.
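The covariance-feature and soft-label choices above could look roughly like this. The sketch is illustrative only: the function names, the inclusion of the diagonal in the upper triangle, the bump width `sigma_deg`, and the use of a maximum (rather than a sum) over bumps are assumptions, not details taken from src/learned.py.

```python
import numpy as np

def covariance_features(R):
    """Real and imaginary parts of the upper triangle of R (diagonal included)."""
    upper = R[np.triu_indices(R.shape[0])]
    return np.concatenate([upper.real, upper.imag]).astype(np.float32)

def soft_label_spectrum(thetas_deg, grid_deg, sigma_deg=2.0):
    """Target spectrum: one Gaussian bump per source, combined with a maximum."""
    thetas = np.asarray(thetas_deg, dtype=float)[:, None]
    grid = np.asarray(grid_deg, dtype=float)[None, :]
    bumps = np.exp(-0.5 * ((grid - thetas) / sigma_deg) ** 2)
    return bumps.max(axis=0)

# Example with a random Hermitian 8x8 matrix standing in for a sample covariance.
rng = np.random.default_rng(0)
Z = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
R = Z @ Z.conj().T / 8
features = covariance_features(R)               # 36 upper-triangle entries, real + imag
grid = np.arange(-90.0, 91.0, 1.0)
target = soft_label_spectrum([-20.0, 30.0], grid)
print(features.shape, target.shape)
```

Because the target has the same length regardless of how many sources are present, the network output size does not depend on the source count; the DOAs are then read off as peaks, as with the MUSIC pseudo-spectrum.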
- Coherent / correlated sources, where MUSIC struggles and learning can help more.
- Wideband signals and real recorded array data.
- Joint detection of the number of sources, not just their angles.


