I'm trying to train and use a neural network to detect a specific word in an audio file. The input is a 2-3 second audio clip of a person speaking, and the network must determine whether the clip contains the word "hello" or not.

I do not know what kind of network to use. I tried a self-organizing map (SOM), but I did not get the desired result. My training data contains a large number of voice recordings that contain the word "hello".

Is there any Python code for this problem?

nbro
Ali.kavari76
  • You can use a CNN or an LSTM model. For the CNN, you could start from a pre-trained model; for the LSTM, you could try a bidirectional LSTM. – Let's try Aug 03 '20 at 09:42
  • What dataset should I use to train the class corresponding to "not hello"? – Ali.kavari76 Aug 03 '20 at 13:08
  • Have you tried looking at some of the models on TensorFlow Hub? I’m positive there should be some useful stuff there. – pedrum Aug 03 '20 at 20:02

1 Answer

After some research on the internet, I found that the VOSK toolkit for Python can detect any particular word in an audio file or in a real-time audio stream.

https://alphacephei.com/vosk/
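
A minimal sketch of how this might look, not a tested recipe: it transcribes a WAV file with VOSK and checks whether the target word appears in the recognized text. The function names `contains_word` and `detect_word` and the `model_path` argument are placeholders of mine; the sketch assumes a VOSK model has already been downloaded and unpacked to that path, and that the input is a 16-bit mono PCM WAV file.

```python
import json
import wave


def contains_word(result_json: str, target: str = "hello") -> bool:
    """Return True if the "text" field of a VOSK result JSON contains `target` as a whole word."""
    text = json.loads(result_json).get("text", "")
    return target.lower() in text.lower().split()


def detect_word(wav_path: str, model_path: str, target: str = "hello") -> bool:
    """Transcribe `wav_path` with VOSK and report whether `target` was recognized."""
    # Imported here so that contains_word() can be used without VOSK installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        # Assumption: the recording is 16-bit mono PCM.
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono PCM WAV file")

        recognizer = KaldiRecognizer(Model(model_path), wf.getframerate())
        found = False
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform() returns True when an utterance has been finalized.
            if recognizer.AcceptWaveform(data):
                found = found or contains_word(recognizer.Result(), target)

        # Flush whatever audio is still buffered in the recognizer.
        return found or contains_word(recognizer.FinalResult(), target)
```

A call would look like `detect_word("clip.wav", "path/to/model")`, where both paths are hypothetical. Because this relies on a general speech recognizer rather than a network trained only on "hello", no separate "not hello" training set is needed.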

Ali.kavari76