microphone-array-doa

Direction-of-arrival (DOA) estimation with a microphone array: classical MUSIC vs a learned estimator.

A focused comparison between a classical array-processing method and a small neural network for estimating where sound sources are, using a uniform linear array. The study shows where a data-driven estimator helps most: in the low-SNR regime, where the classical subspace method becomes unreliable.

Pipeline

What it does

Simulates multi-microphone snapshots from a uniform linear array (ULA) for one or more sources at given angles, with controllable SNR and number of snapshots.
Estimates the source directions two ways:
- MUSIC - the classical subspace method (eigendecompose the spatial covariance, scan angle for peaks).
- Learned model - a small neural network mapping the covariance features to a spatial spectrum whose peaks give the DOAs.
Compares their accuracy (RMSE) across SNR over many random scenarios.
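
The simulate-then-estimate steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in src/ula.py or src/music.py: the function names, the half-wavelength spacing, the broadside angle convention, the unit-power sources, the 1-degree scan grid, and the simple peak picker are all assumptions made for the example.

```python
import numpy as np

def steering_vector(theta_deg, n_mics, spacing=0.5):
    """ULA steering vectors, shape (n_mics, n_angles).
    Assumes spacing in wavelengths and angles measured from broadside."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    m = np.arange(n_mics)[:, None]
    return np.exp(-2j * np.pi * spacing * m * np.sin(theta)[None, :])

def complex_noise(shape, rng):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_snapshots(doas_deg, n_mics, n_snapshots, snr_db, rng):
    """Narrowband snapshots X = A s + n, shape (n_mics, n_snapshots).
    Assumes uncorrelated unit-power sources, so SNR is per source."""
    A = steering_vector(doas_deg, n_mics)
    s = complex_noise((A.shape[1], n_snapshots), rng)
    noise_std = 10.0 ** (-snr_db / 20.0)
    return A @ s + noise_std * complex_noise((n_mics, n_snapshots), rng)

def sample_covariance(X):
    """Spatial sample covariance, shape (n_mics, n_mics)."""
    return X @ X.conj().T / X.shape[1]

def music_spectrum(R, n_sources, grid_deg):
    """MUSIC pseudo-spectrum on an angle grid."""
    _, eigvecs = np.linalg.eigh(R)                 # eigenvalues in ascending order
    En = eigvecs[:, : R.shape[0] - n_sources]      # noise subspace
    A = steering_vector(grid_deg, R.shape[0])
    denom = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return 1.0 / np.maximum(denom, 1e-12)

def pick_peaks(spectrum, grid_deg, n_sources):
    """Return the angles of the n_sources highest interior local maxima, sorted."""
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] >= spectrum[2:])
    idx = np.arange(1, len(spectrum) - 1)[is_peak]
    top = idx[np.argsort(spectrum[idx])[::-1][:n_sources]]
    return np.sort(grid_deg[top])

# Example: two sources at -20 and 30 degrees, 8 microphones, 20 dB SNR.
rng = np.random.default_rng(0)
grid = np.arange(-90.0, 91.0, 1.0)
X = simulate_snapshots([-20.0, 30.0], n_mics=8, n_snapshots=200, snr_db=20.0, rng=rng)
est = pick_peaks(music_spectrum(sample_covariance(X), 2, grid), grid, 2)
print(est)
```

Lowering snr_db or n_snapshots in this sketch is a quick way to see the noise-subspace estimate degrade, which is the regime the comparison targets.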

Repository layout

src/ula.py        # ULA signal model: steering vectors, snapshots, covariance
src/music.py      # classical MUSIC DOA estimator
src/learned.py    # dataset generation, neural model, training
src/evaluate.py   # RMSE-vs-SNR comparison and example spectra
results/          # figures

How to run

pip install -r requirements.txt
python src/evaluate.py

This trains the learned model and writes both figures (the RMSE-vs-SNR comparison and the example spectra) to results/.

Result

The learned estimator stays accurate at low SNR, while MUSIC's error rises sharply at -5 dB because its subspace estimate degrades. At higher SNR, classical MUSIC is excellent and slightly more accurate than the learned model, whose precision is limited only by its angle-grid resolution. The takeaway: a data-driven estimator is most valuable precisely in the hard, noisy regime.
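
For reference, one common way to compute a DOA RMSE is sketched below. It is an assumption about the metric, not the code in src/evaluate.py: the hypothetical helper doa_rmse pairs true and estimated angles by sorting each scenario, which presumes both sets contain the same number of sources.

```python
import numpy as np

def doa_rmse(true_deg, est_deg):
    """RMSE in degrees over scenarios and sources.
    Inputs have shape (n_scenarios, n_sources); angles are paired by sorting."""
    true_sorted = np.sort(np.asarray(true_deg, dtype=float), axis=-1)
    est_sorted = np.sort(np.asarray(est_deg, dtype=float), axis=-1)
    return float(np.sqrt(np.mean((est_sorted - true_sorted) ** 2)))

# One scenario, two sources, each estimate off by one degree.
rmse = doa_rmse([[-20.0, 30.0]], [[31.0, -21.0]])
print(rmse)
```

Evaluating this helper once per SNR value over many random scenarios would give one point per SNR on an RMSE-vs-SNR curve.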

Design choices

Uniform linear array, narrowband model. The standard setting for DOA, with directly controllable SNR and snapshot count, which makes the low-SNR comparison clean.
Covariance features. Both methods consume the spatial covariance; the network sees the real and imaginary parts of its upper triangle, so the comparison is fair.
Soft-label spectrum. The network regresses a Gaussian-bump spatial spectrum rather than raw angles, which handles a variable number of sources naturally and mirrors the MUSIC pseudo-spectrum.
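
The network's input and target described above can be sketched as follows. This is an illustrative guess rather than the code in src/learned.py: the function names, the inclusion of the diagonal in the upper triangle, the 1-degree grid, the bump width sigma_deg, and the use of a maximum to merge overlapping bumps are all assumptions.

```python
import numpy as np

def covariance_features(R):
    """Flatten the upper triangle of a complex covariance matrix into a
    real vector: real parts first, then imaginary parts.
    Assumes the diagonal is included."""
    iu = np.triu_indices(R.shape[0])
    upper = R[iu]
    return np.concatenate([upper.real, upper.imag]).astype(np.float32)

def gaussian_soft_label(doas_deg, grid_deg, sigma_deg=2.0):
    """Target spectrum on the angle grid: one Gaussian bump per source.
    Works for any number of sources, including none (all-zero target)."""
    doas = np.atleast_1d(np.asarray(doas_deg, dtype=float))
    if doas.size == 0:
        return np.zeros(len(grid_deg))
    bumps = np.exp(-0.5 * ((grid_deg[None, :] - doas[:, None]) / sigma_deg) ** 2)
    return bumps.max(axis=0)

# Example: a 4-microphone covariance and a two-source target.
grid = np.arange(-90.0, 91.0, 1.0)
R = np.eye(4, dtype=complex)
features = covariance_features(R)
label = gaussian_soft_label([-20.0, 30.0], grid)
print(features.shape, label.shape)
```

Because the target has a fixed length regardless of how many sources are present, the same network output can represent one, two, or more sources, and the DOAs are read off as peaks just as with the MUSIC pseudo-spectrum.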

Possible extensions

Coherent / correlated sources, where MUSIC struggles and learning can help more.
Wideband signals and real recorded array data.
Joint detection of the number of sources, not just their angles.
