Questions tagged [object-recognition]

For questions related to object recognition, the problem of determining the type, class, or category of an object in an image; object recognition could therefore also be called object classification. This differs from object detection, which refers either to object localization (i.e. finding the coordinates of the object in the image) combined with object classification, or to object localization alone.

Object recognition is a technology in the field of computer vision (though it can apply to other domains as well) for finding and identifying objects in an image or video sequence. Humans recognize a multitude of objects in images with little effort, even though the image of an object may vary with viewpoint, size, and scale, or when the object is translated or rotated. Objects can even be recognized when they are partially obstructed from view. This task is still a challenge for computer vision systems. Many approaches to the task have been implemented over multiple decades.

Object Recognition - Wikipedia

109 questions
20
votes
1 answer

Would Google's self-driving car stop when it sees somebody wearing a T-shirt with a stop sign printed on it?

In the article "Hidden Obstacles for Google’s Self-Driving Cars" we can read that: Google’s cars can detect and respond to stop signs that aren’t on its map, a feature that was introduced to deal with temporary signs used at construction sites. Google…
kenorb • 10,423 • 3 • 43 • 91
11
votes
0 answers

Extending FaceNet’s triplet loss to object recognition

FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…
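The triplet loss described in this excerpt can be illustrated with a minimal sketch. It assumes plain Python lists as embeddings, squared Euclidean distance, and a margin of 0.2; the function names and the toy 2-D vectors are hypothetical and only serve the illustration, not any particular library's API.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (margin value is an assumption here).
    """
    pos_dist = squared_distance(anchor, positive)
    neg_dist = squared_distance(anchor, negative)
    return max(pos_dist - neg_dist + margin, 0.0)


# Toy 2-D embeddings (hypothetical values).
anchor = [1.0, 0.0]
same_class = [0.9, 0.1]
other_class = [0.0, 1.0]

loss_easy = triplet_loss(anchor, same_class, other_class)  # well-separated triplet
loss_hard = triplet_loss(anchor, other_class, same_class)  # roles swapped on purpose
```

Nothing in this formulation is specific to faces: it only needs a notion of "same class" and "different class" for the three inputs, which is why the question of extending it to general objects arises.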
7
votes
1 answer

What will happen when you place a fake speed sign on a highway?

I was wondering what will happen when somebody places a fake speed sign of 10 miles per hour on a highway. Will an autonomous car slow down? Is this a current issue for autonomous cars?
7
votes
4 answers

What could an oscillating training loss curve represent?

I tried to create a simple model that receives an $80 \times 130$ pixel image. I only had 35 images and 10 test images. I trained this model for a binary classification task. The architecture of the model is described below. conv2d_1 (Conv2D) …
6
votes
2 answers

Are there any pretrained models for human recognition from all angles?

I need to be able to detect and track humans from all angles, especially above. There are, obviously, quite a few well-studied models for human detection and tracking, usually as part of general-purpose object detection, but I haven't been able to…
T3db0t • 161 • 1 • 4
6
votes
2 answers

Can one use an Artificial Neural Network to determine the size of an object in a photograph?

My question relates to but doesn't duplicate a question that has been asked here. I've Googled a lot for an answer to the question: Can you find the dimensions of an object in a photo if you don't know the distance between the lens and the object,…
6
votes
1 answer

How to detect LEGO bricks by using a deep learning approach?

In my thesis I dealt with the question of how a computer can recognize LEGO bricks. With multiple object detection, I chose a deep learning approach. I also looked at an existing training set of LEGO brick images and tried to optimize it. My…
5
votes
2 answers

Can translational invariance of CNNs be unwanted if the object is likely to be in certain positions?

Various texts on using CNNs for object detection in images talk about how their translation invariance is a good thing, which makes sense for tasks where the object could be anywhere in the image. Let's say detecting a kitten in household…
5
votes
1 answer

In YOLO, when is $\mathbb{1}_{i j}^{\mathrm{obj}} = 1$, and what are the ground-truth labels for $x_i$ and $y_i$?

I'm trying to implement a custom version of the YOLO neural network. Originally, it was described in the paper You Only Look Once: Unified, Real-Time Object Detection (2016). I have some problems understanding the loss function they used. Basic…
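As background for this question, the YOLO paper divides the image into an $S \times S$ grid, makes the cell containing an object's center responsible for that object, and expresses $x$ and $y$ as offsets relative to that cell. The sketch below shows only this cell assignment under stated assumptions: the function name, the pixel-coordinate input, and the 448-pixel image with $S = 7$ are illustrative choices, and selecting which of a cell's box predictors is responsible (the $j$ index) is not covered.

```python
def yolo_cell_target(center_x, center_y, img_w, img_h, S=7):
    """Map a ground-truth box center (in pixels) to its responsible grid cell.

    Returns (row, col, x_offset, y_offset), where the offsets lie in [0, 1)
    and are measured from the top-left corner of the cell, in cell units.
    """
    grid_x = center_x / img_w * S  # center position measured in cells
    grid_y = center_y / img_h * S
    col = min(int(grid_x), S - 1)  # clamp so a center on the far edge stays in range
    row = min(int(grid_y), S - 1)
    return row, col, grid_x - col, grid_y - row


# Hypothetical example: a 448x448 image with a box centered at (100, 200).
row, col, x_off, y_off = yolo_cell_target(100, 200, 448, 448, S=7)
```

With these numbers each cell is 64 pixels wide, so the center falls in row 3, column 1, and the indicator can only be 1 for a predictor belonging to that cell.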
5
votes
1 answer

Precise localization and characterization of rudimentary shapes with neural networks

I understand that there are flavors of (convolutional) neural networks that are useful for object localization and detection tasks of reasonable difficulty. In all of the examples I have seen so far, localization is formulated as finding the corners…
5
votes
1 answer

Why are object detection algorithms poor at optical character recognition?

OCR is still a very hard problem. We don't have universal powerful solutions. We use the CTC loss function ("An Intuitive Explanation of Connectionist Temporal Classification", Towards Data Science; "Sequence Modeling With CTC", Distill), which is very…
4
votes
1 answer

How does data augmentation like rotation affect the quality of detection?

I'm using an object detection neural network and I employ data augmentation to slightly enlarge my small dataset. More specifically, I do rotation, translation, mirroring and rescaling. I notice that rotating an image (and thus its bounding box)…
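One effect of rotating a bounding box together with its image can be shown with a small sketch. It assumes the common approach of rotating the four box corners about a pivot and taking the enclosing axis-aligned box; this is not necessarily what the asker's pipeline does, and the function name and numbers are hypothetical.

```python
import math


def rotated_box_aabb(x_min, y_min, x_max, y_max, angle_deg, pivot_x, pivot_y):
    """Rotate a box's corners about a pivot and return the enclosing axis-aligned box."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    rotated = [
        (pivot_x + (x - pivot_x) * cos_t - (y - pivot_y) * sin_t,
         pivot_y + (x - pivot_x) * sin_t + (y - pivot_y) * cos_t)
        for x, y in corners
    ]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return min(xs), min(ys), max(xs), max(ys)


# A 20x20 box rotated by 45 degrees about its own center (hypothetical values).
new_box = rotated_box_aabb(40, 40, 60, 60, 45, 50, 50)
new_width = new_box[2] - new_box[0]
new_height = new_box[3] - new_box[1]
```

In this example the enclosing box grows from 20 to about 28.3 pixels per side, so it covers roughly twice the area and includes more background than the original label did.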
4
votes
0 answers

What are the ways to calculate the error rate of a deep Convolutional Neural Network, when the network produces different results using the same data?

I am new to the object recognition community. Here I am asking about the broadly accepted ways to calculate the error rate of a deep CNN when the network produces different results using the same data. 1. Problem introduction Recently I was trying…
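One commonly used option when repeated training runs give different results on the same data is to report the mean error rate together with its spread across runs. The sketch below assumes the per-run error rates have already been measured; the function name and the five values are hypothetical, not results from the question.

```python
import statistics


def summarize_error_rates(error_rates):
    """Return (mean, sample standard deviation) of error rates from repeated runs."""
    mean = statistics.mean(error_rates)
    # The sample standard deviation needs at least two runs.
    std = statistics.stdev(error_rates) if len(error_rates) > 1 else 0.0
    return mean, std


# Hypothetical error rates from five runs with different random seeds.
runs = [0.12, 0.10, 0.11, 0.13, 0.09]
mean_error, std_error = summarize_error_rates(runs)
```

Reporting both numbers, along with the number of runs, lets a reader judge whether a difference between two models is larger than the run-to-run variation.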
4
votes
2 answers

Alternative to sliding window neural network (was: Object detect (or) image classification at specific locations in the frame)

Recent advances in deep learning and dedicated hardware have made it possible to detect images with much better accuracy than ever. Neural networks are the gold standard for computer vision applications and are used widely in the industry, for…
4
votes
1 answer

What is the difference between pixel-based object recognition and feature-based object recognition?

From my understanding and text I found in research papers online: Pixel-based object recognition: neural networks are trained to locate individual objects based directly on pixel data. Feature-based object recognition: contents of a window are…