How can I find a specific word in an audio file?

Question

I'm trying to train and use a neural network to detect a specific word in an audio file. The input to the network is a 2-3 second audio clip of a person speaking, and the network must determine whether the clip contains the word "hello" or not.

I do not know what kind of network to use. I tried a self-organizing map (SOM), but I did not get the desired result. My training data contains a large number of voice recordings that include the word "hello".

Is there any Python code for this problem?

You can use a CNN or an LSTM model. For the CNN, you can start from a pre-trained model; for the LSTM, you can try a bidirectional LSTM. — Let's try, Aug 03 '20 at 09:42
What dataset should I use to train the class for "not hello"? — Ali.kavari76, Aug 03 '20 at 13:08
Have you tried looking at some of the models on TensorFlow Hub? I'm positive there should be some useful stuff there. — pedrum, Aug 03 '20 at 20:02

score 2 · Answer 1 · answered Jan 27 '21 at 09:02

After some research online, I found that the VOSK speech recognition toolkit for Python can detect any particular word in an audio file or in a real-time audio stream.

https://alphacephei.com/vosk/
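
As a rough illustration of that approach, here is a minimal sketch that transcribes a WAV file with Vosk and reports where a target word occurs. It assumes the `vosk` package is installed, that a Vosk model has already been downloaded and unpacked to a local directory, and that the audio is mono 16-bit PCM. The function names `word_hits` and `find_word_in_wav` are made up for this example and are not part of Vosk.

```python
import json
import wave


def word_hits(result_json, target):
    """Return (start, end) times in seconds for each occurrence of `target`
    in one Vosk result string (the JSON returned by Result()/FinalResult())."""
    result = json.loads(result_json)
    return [
        (w["start"], w["end"])
        for w in result.get("result", [])  # key is absent when nothing was recognized
        if w["word"].lower() == target.lower()
    ]


def find_word_in_wav(wav_path, model_path, target="hello"):
    """Transcribe a mono 16-bit PCM WAV file and return the target's timestamps."""
    # Imported here so the helper above can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    hits = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM WAV")
        rec = KaldiRecognizer(Model(model_path), wf.getframerate())
        rec.SetWords(True)  # ask for per-word timestamps in the results
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            if rec.AcceptWaveform(data):  # True once an utterance is complete
                hits += word_hits(rec.Result(), target)
        hits += word_hits(rec.FinalResult(), target)  # flush the last utterance
    return hits
```

Calling something like `find_word_in_wav("clip.wav", "model", "hello")` (both paths are placeholders) would return a list of `(start, end)` pairs, which is empty if the word was not recognized. Because this matches against a general-purpose transcript rather than a classifier trained on "hello", accuracy depends on the model chosen and on the audio quality.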

Ali.kavari76
