Discrete Audio and Speech Benchmark

Benchmark for a diverse set of discrete audio encoders from all three categories: semantic, compression, and hybrid.

DASB is a benchmark for assessing discrete audio tokens across a range of tasks. It covers multiple evaluation metrics, downstream architectures, and bitrates so that tokenizers can be compared thoroughly. It also provides an automated pipeline for dataset downloading, data loading, evaluation, and leaderboard submission. DASB evolves based on community feedback. To contribute your audio tokenizer or report issues, please email us or visit our GitHub page.

Diverse Tasks

We consider a wide range of discriminative tasks, including speech, speaker, and emotion recognition, keyword spotting, and intent classification. We also cover generative tasks, such as speech enhancement, speech separation, and text-to-speech.

Multiple Tokenizers

DASB is a modular code repository built on the SpeechBrain toolkit. It supports a range of discrete audio encoders across three categories: semantic, compression, and hybrid.

Unified Evaluation

DASB offers dataset splits, various bitrates, and evaluators to ensure reproducible and standardized evaluations. For more reliable results, we use two different downstream architectures for each task.
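
To make the role of bitrate concrete: the nominal bitrate of a discrete tokenizer is commonly computed as frame rate times number of codebooks times bits per codebook index. The sketch below illustrates that formula only. It is not taken from the DASB code base, and the function name and example configuration are hypothetical.

```python
import math


def token_bitrate(frame_rate_hz: float, num_codebooks: int, codebook_size: int) -> float:
    """Nominal bitrate (bits per second) of a discrete audio tokenizer.

    Each frame emits one index per codebook, and each index carries
    log2(codebook_size) bits.
    """
    return frame_rate_hz * num_codebooks * math.log2(codebook_size)


# Hypothetical configuration (not a DASB setting):
# 50 frames per second, 8 codebooks, 1024 entries per codebook.
print(token_bitrate(50, 8, 1024))  # 4000.0 bits/s, i.e. 4 kbps
```

Under this formula, halving the number of codebooks halves the bitrate, which is one way the same tokenizer can be evaluated at several bitrates.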
