Automatic speech recognition – OpenAI Whisper

Whisper is a recently released transformer-based automatic speech recognition (ASR) model from OpenAI.

It can be used for:

🗣 Language identification

🗣 Voice activity detection

🗣 Multilingual speech recognition

🗣 Multilingual speech translation
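
Several of the tasks above can be tried with the Python package from the repo. What follows is a minimal sketch, not an official recipe: it assumes the package is installed (pip install openai-whisper) along with ffmpeg, and the helper name run_whisper and the default "base" model size are illustrative choices, not part of the library.

```python
def run_whisper(audio_path, model_name="base", translate=False):
    """Return (detected_language, text) for one audio file.

    run_whisper is a hypothetical helper written for this post.
    """
    # Imported lazily so the module loads even where whisper is not installed.
    import whisper

    # Downloads the checkpoint on first use; "base" is one of the published sizes.
    model = whisper.load_model(model_name)

    # task="transcribe" keeps the spoken language,
    # task="translate" produces English text instead.
    task = "translate" if translate else "transcribe"
    result = model.transcribe(audio_path, task=task)

    # The result dict carries the detected language code and the full text.
    return result["language"], result["text"]
```

Calling run_whisper("speech.mp3") would cover language identification and recognition in one pass, while run_whisper("speech.mp3", translate=True) would request an English translation; "speech.mp3" is a placeholder file name.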

When evaluated on the End-to-end Speech Benchmark (ESB) datasets, which include LibriSpeech and Common Voice, Whisper outperformed NVIDIA's Conformer RNN-T and Meta's Wav2Vec2.

Link to blog: https://openai.com/blog/whisper/
Link to repo: https://github.com/openai/whisper
Link to benchmarking study: https://arxiv.org/abs/2210.13352
