Dialects differ a lot between cities in my country, Syria. People often use different local phrases and idioms to express the same thing. So I came up with the idea of creating an Android application that shows a limited set of sentences or expressions and asks you to say them aloud in the local dialect of your region; the application then tries to figure out what your dialect is. I'm going to launch an Android application for a short period of time in order to collect the needed dataset, which will be a new contribution. First of all, I need some helpful answers to my questions:
- In general, is a period of 6 months enough for such a project to be done by a single student who is a beginner in this field, or is it harder than it seems?
- Are the libraries and tools needed to do this project available for free?
- I know that more training data leads to more accurate results. What is the estimated minimum amount of training data this model needs in order to give good results?
- How would you advise me to begin?
- How relevant is my suggested project to the project attached in this link?
Kindly write down your suggested edits and recommendations, if any.
Edit for the 5th question: also see this paper.