Automatic speech recognition – OpenAI Whisper

Whisper is a recently released transformer-based automatic speech recognition (ASR) model from OpenAI.

It can be used for:

🗣Language identification

🗣Voice activity detection

🗣Multi-lingual speech recognition

🗣Multi-lingual speech translation
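As a rough illustration of the tasks listed above, here is a minimal sketch that assumes the open-source `openai-whisper` Python package (and ffmpeg) is installed. The helper names `load_whisper_model` and `recognize_and_translate` are hypothetical and only for illustration. The sketch relies on `whisper.load_model` and on `model.transcribe`, whose result dictionary carries the recognized `text` and the detected `language`.

```python
from typing import Any, Dict


def load_whisper_model(size: str = "base") -> Any:
    """Load a Whisper checkpoint (assumes `pip install openai-whisper`)."""
    # Imported lazily so the rest of this sketch works without the package.
    import whisper

    return whisper.load_model(size)


def recognize_and_translate(model: Any, audio_path: str) -> Dict[str, str]:
    """Hypothetical helper: identify the language, transcribe, and translate."""
    # Default task: transcribe in the spoken language. The language is
    # detected automatically when none is passed in.
    transcript = model.transcribe(audio_path)
    # task="translate" asks the model for English output instead.
    translation = model.transcribe(audio_path, task="translate")
    return {
        "language": transcript["language"],
        "text": transcript["text"].strip(),
        "english": translation["text"].strip(),
    }
```

With the package installed, something like `recognize_and_translate(load_whisper_model("base"), "speech.mp3")` should return the detected language code, the transcript, and an English translation. Here `speech.mp3` is a placeholder file name.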

When evaluated on the ESB (End-to-end Speech Benchmark) datasets, which include LibriSpeech and Common Voice, Whisper outperformed Conformer RNN-T from NVIDIA and Wav2Vec2 from Meta.
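The post does not name the metric behind this comparison. ASR benchmarks such as ESB conventionally report word error rate (WER), so the sketch below assumes that. It is a plain-Python illustration of the metric, not the benchmark's actual scoring code, and it skips the text normalization (casing, punctuation) that real evaluations usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A lower WER is better. A model that "outperforms" another on a dataset would, under this assumption, have the lower average WER on that dataset's test set.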

Link to blog:
Link to repo:
Link to benchmarking study:
