First of all, I recommend that you have a look at similar questions on this network, e.g. https://stackoverflow.com/questions/6499880/ios-gesture-recognition-utilizing-accelerometer-and-gyroscope and https://stackoverflow.com/questions/6368618/store-orientation-to-an-array-and-compare
Your problem can be divided into three parts.
How to gather sensor data.
How to use the gathered data to train a model.
How to use the trained model to make a prediction.
A modern smartphone packs several sensors into one device. To implement your application I recommend that you use the raw sensor data from either the gyroscope or the accelerometer.
On the Android platform, you can access these sensors and acquire the raw sensor data through the Android sensor framework, which is part of the android.hardware package.
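As a rough sketch of what that looks like in code (the class name here is just for illustration), an Activity can register a SensorEventListener for the accelerometer and ask for a sampling period of roughly 20 Hz:

```java
import android.app.Activity;
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;
import android.os.Bundle;

// Minimal sketch: register for raw accelerometer data via the Android sensor framework.
public class GestureCaptureActivity extends Activity implements SensorEventListener {

    private SensorManager sensorManager;
    private Sensor accelerometer;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        sensorManager = (SensorManager) getSystemService(SENSOR_SERVICE);
        accelerometer = sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
    }

    @Override
    protected void onResume() {
        super.onResume();
        // Request a sample roughly every 50 ms (about 20 Hz); the OS treats this as a hint.
        sensorManager.registerListener(this, accelerometer, 50_000);
    }

    @Override
    protected void onPause() {
        super.onPause();
        sensorManager.unregisterListener(this); // stop sampling to save battery
    }

    @Override
    public void onSensorChanged(SensorEvent event) {
        float x = event.values[0];
        float y = event.values[1];
        float z = event.values[2];
        // hand the raw values to your recording/buffering code here
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) { /* not needed here */ }
}
```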
To capture the raw sensor data, take regular samples (at 20 Hz or more) and save the maximum x, y and z values, each in its own array (so you can recognize the gesture in all three planes). If you want the gesture to be 5 seconds long, keep 100 samples per axis (at 20 Hz). Then analyze whether any of the three arrays holds values that change sinusoidally. If one does, you've got a gesture.
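A minimal sketch of such a recording buffer, assuming 100 samples per axis for a 5-second window at 20 Hz (the class and method names are made up for illustration), could look like this:

```java
import java.util.Arrays;

// Sketch of a fixed-length recording window: 100 samples per axis (~5 s at 20 Hz).
public class GestureWindow {

    public static final int WINDOW_SIZE = 100; // 20 Hz * 5 s

    private final float[] xs = new float[WINDOW_SIZE];
    private final float[] ys = new float[WINDOW_SIZE];
    private final float[] zs = new float[WINDOW_SIZE];
    private int count = 0;

    /** Add one accelerometer sample; returns true once the window is full. */
    public boolean addSample(float x, float y, float z) {
        if (count < WINDOW_SIZE) {
            xs[count] = x;
            ys[count] = y;
            zs[count] = z;
            count++;
        }
        return count == WINDOW_SIZE;
    }

    /** Largest absolute value seen on each axis, useful as a quick first check. */
    public float[] axisMaxima() {
        float mx = 0f, my = 0f, mz = 0f;
        for (int i = 0; i < count; i++) {
            mx = Math.max(mx, Math.abs(xs[i]));
            my = Math.max(my, Math.abs(ys[i]));
            mz = Math.max(mz, Math.abs(zs[i]));
        }
        return new float[] { mx, my, mz };
    }

    public float[] xAxis() { return Arrays.copyOf(xs, count); }
    public float[] yAxis() { return Arrays.copyOf(ys, count); }
    public float[] zAxis() { return Arrays.copyOf(zs, count); }
}
```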
You could store these values in an array while the user is in 'record mode'. Then, when the user tries to replicate that movement, the model could match the replicated movement's array against the recorded ones. The question is: how do you compare the arrays in a smart way? (Randy M 2011)
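Just to illustrate why a naive comparison is not enough: a per-sample distance between the recorded array and the replicated one is easy to write, but it breaks as soon as the replicated gesture is shifted or stretched in time, which is exactly why a smarter comparison, or a learned model, is needed. Something like this (illustrative only):

```java
// Naive baseline (illustrative only): mean absolute difference between two
// equally long recordings of the same axis. It fails when the replicated
// gesture is shifted or stretched in time.
public final class NaiveCompare {

    public static double meanAbsoluteDifference(float[] recorded, float[] replicated) {
        int n = Math.min(recorded.length, replicated.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += Math.abs(recorded[i] - replicated[i]);
        }
        return n == 0 ? 0.0 : sum / n;
    }

    /** Treat the gestures as "the same" when the average deviation is small. */
    public static boolean roughlyMatches(float[] recorded, float[] replicated, double threshold) {
        return meanAbsoluteDifference(recorded, replicated) < threshold;
    }
}
```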
This leads us to the next step, which is applying ML.
To train your model, you can choose to either train the model in the cloud or train it locally on the device. In most cases the problem with training a model on a mobile device is computational limitations: machine learning algorithms running on a mobile device need to be carefully designed and implemented, since most mobile devices have weak processors and limited RAM.
It's quite a challenge to squeeze a large neural network (NN) into the small amount of RAM that smartphones have, since NNs require the model to be fully loaded into memory. In many cases it is advisable to slim the model down, for example by setting weights that are near zero to exactly zero. (Chomba B 2017)
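In practice, "setting weights near zero to zero" boils down to a magnitude threshold. A toy sketch (the threshold value is purely illustrative), after which the zeroed weights can be stored in a sparse format:

```java
// Toy sketch of magnitude-based pruning: weights whose absolute value falls
// below a threshold are clamped to exactly zero so the model can be stored
// sparsely. The 0.01 threshold is illustrative, not a recommendation.
public final class WeightPruning {

    public static int pruneInPlace(float[] weights, float threshold) {
        int pruned = 0;
        for (int i = 0; i < weights.length; i++) {
            if (Math.abs(weights[i]) < threshold) {
                weights[i] = 0f;
                pruned++;
            }
        }
        return pruned; // how many weights were zeroed out
    }

    public static void main(String[] args) {
        float[] weights = { 0.5f, -0.004f, 0.2f, 0.0009f, -0.7f };
        int pruned = pruneInPlace(weights, 0.01f);
        System.out.println(pruned + " weights set to zero");
    }
}
```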
In case you decide to use the cloud, your mobile app simply sends an HTTPS request to a cloud web service along with the required raw sensor data, and within seconds the service replies with the prediction results. There are several cloud services you can leverage to host the server side of your application, e.g. Microsoft Azure Cognitive Services, Clarifai and Google Cloud Cognition, but personally I recommend that you consider reality.ai, which is an AI tool built specifically for engineers working with signals and sensors.
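The exact request format depends entirely on the service you pick (reality.ai and the others each have their own APIs and client libraries), so the following is only the general pattern, with a placeholder URL and payload, using plain HttpURLConnection; in a real app you would run this off the UI thread:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// General pattern only: POST a window of sensor samples to a prediction endpoint
// and read back the result. The URL and payload format are placeholders; every
// cloud service defines its own API.
public final class CloudPredictionClient {

    public static String predict(float[] xs, float[] ys, float[] zs) throws Exception {
        // Hypothetical endpoint, replace with the one your service provides.
        URL url = new URL("https://example.com/api/gesture/predict");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        String body = "{\"x\":" + toJsonArray(xs)
                + ",\"y\":" + toJsonArray(ys)
                + ",\"z\":" + toJsonArray(zs) + "}";

        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                response.append(line);
            }
            return response.toString(); // e.g. a JSON body with the predicted gesture
        } finally {
            conn.disconnect();
        }
    }

    private static String toJsonArray(float[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }
}
```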
The next step is to select an appropriate algorithm for classifying the gesture. Here you can employ Logistic Regression, Support Vector Machines, Random Forests or Extremely Randomized Trees, depending on your app's use case. To train the model, you provide the learning algorithm with labeled examples. The ML algorithm then extracts features from the raw sensor data and constructs a mathematical model that can accurately describe gestures such as roll and pan.
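Whichever classifier you pick, the usual first step is turning each recorded window into a fixed-length feature vector. The features below (mean, standard deviation and peak magnitude per axis) are just one common starting point, not a prescription:

```java
// Sketch of turning one recorded window into a fixed-length feature vector for a
// classifier (logistic regression, SVM, random forest, ...). The chosen features
// (mean, standard deviation, peak magnitude per axis) are one common starting point.
public final class FeatureExtractor {

    /** 3 axes x 3 statistics = 9 features per gesture window. */
    public static double[] extract(float[] xs, float[] ys, float[] zs) {
        double[] features = new double[9];
        fill(features, 0, xs);
        fill(features, 3, ys);
        fill(features, 6, zs);
        return features;
    }

    private static void fill(double[] out, int offset, float[] axis) {
        if (axis.length == 0) return; // leave features at zero for an empty window

        double mean = 0.0, peak = 0.0;
        for (float v : axis) {
            mean += v;
            peak = Math.max(peak, Math.abs(v));
        }
        mean /= axis.length;

        double variance = 0.0;
        for (float v : axis) {
            variance += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(variance / axis.length);

        out[offset] = mean;
        out[offset + 1] = stdDev;
        out[offset + 2] = peak;
    }
}
```

A labeled example is then simply one such feature vector paired with the name of the gesture the user performed, e.g. 'roll' or 'pan'.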