Questions tagged [object-detection]

For questions related to object detection (where objects can be, e.g., humans, dogs, or houses), a term whose meaning can vary depending on the context. Object detection can refer to the task of locating (i.e. finding the coordinates of) an object in an image, in which case it is a synonym for object localization, or to the task of locating the object and classifying it (i.e. object localization + object classification).

224 questions
11
votes
3 answers

Is it difficult to learn the rotated bounding box for a (rotated) object?

I have checked out many methods and papers, like YOLO, SSD, etc., with good results in detecting a rectangular box around an object. However, I could not find any paper that shows a method that learns a rotated bounding box. Is it difficult to learn…
11
votes
0 answers

Extending FaceNet’s triplet loss to object recognition

FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…
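As a rough illustration of the triplet loss this excerpt refers to, here is a minimal sketch in plain Python. The function names, the toy 2-D embeddings, and the margin value of 0.2 are assumptions made for the example; it is not the FaceNet implementation, only the general form of the loss on squared Euclidean distances.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter here).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, real embeddings would be 128-D).
anchor = [0.0, 0.0]
positive = [0.1, 0.0]       # same identity or class, close to the anchor
easy_negative = [1.0, 0.0]  # different class, already far away
hard_negative = [0.2, 0.0]  # different class, still close to the anchor

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
```

Nothing in this form is specific to faces: the same loss could in principle be applied to embeddings of object crops, which is what the question asks about.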
7
votes
2 answers

What's the role of bounding boxes in object detection?

I'm quite new to the field of computer vision and was wondering what the purpose of bounding boxes is in object detection. Obviously, they show where the detected object is, and using a classifier can only classify one object per image,…
6
votes
1 answer

Formal definition of the Object Detection problem

For many problems in computer science, there is a formal, mathematical problem definition. Something like: Given ..., the problem is to ... How can the Object Detection problem (i.e. detecting objects in an image) be formally defined? Given a set of…
6
votes
1 answer

How does the region proposal method work in Fast R-CNN?

I read so many articles and the Fast R-CNN paper, but I'm still confused about how the region proposal method works in Fast R-CNN. As you can see in the image below, they say they used a proposal method, but it is not specified how it works. What…
6
votes
0 answers

Are there any easy ways to create annotated training images for object detection?

For the purposes of object detection, are there any easy ways to create annotated training images? For example, if we have $10,000$ images and want to draw bounding boxes on 2 objects for each image, do we have to physically draw those boxes? Is…
6
votes
0 answers

What are the differences between Yolo v1 and CenterNet?

I recently read a new paper (late 2019) about a one-shot object detector called CenterNet. Apart from this, I'm using Yolo (V3) one-shot detector, and what surprised me is the close similarity between Yolo V1 and CenterNet. First, both frameworks…
5
votes
1 answer

Why are object detection algorithms poor at optical character recognition?

OCR is still a very hard problem. We don't have universal powerful solutions. We use the CTC loss function (see "An Intuitive Explanation of Connectionist Temporal Classification", Towards Data Science, and "Sequence Modeling With CTC", Distill), which is very…
5
votes
1 answer

Should I train different models for detecting subsets of objects?

Suppose we have $1000$ products that we want to detect. For each of these products, we have $500$ training images/annotations. Thus we have $500,000$ training images/associated annotations. If we want to train a good object detection algorithm to…
5
votes
1 answer

Do models train better if the labelling information is more specific (or dense)?

I'm working on a project where there is a limited dataset of videos (about 200). We want to train a model that can detect a single class in the videos. That class can be of multiple different types of shapes (thin wire, a huge area of the screen,…
5
votes
1 answer

Which neural network can count the number of objects in an image?

I'm looking for a neural network architecture that excels at counting objects. For example, a CNN that can output the number of balls (or any other object) in a given image. I already found articles about crowd counting. I'm looking for articles about…
4
votes
1 answer

What loss function should one use for object detection, knowing that the input image contains exactly one target object?

What loss function should one use, knowing that the input image contains exactly one target object? I am currently using MSE to predict the center of ROI coordinates and its width and height. All values are relative to image size. I think that such…
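To make the setup in this excerpt concrete, here is a small sketch of an MSE loss over a single box given as (center x, center y, width, height) relative to the image size, next to intersection over union (IoU), a common overlap measure for boxes. The function names and the sample boxes are assumptions for illustration; the excerpt does not say which alternative the asker had in mind.

```python
def mse_box_loss(pred, target):
    """Mean squared error over the four box values (cx, cy, w, h)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground truth, both relative to image size.
target = (0.5, 0.5, 0.2, 0.2)
pred = (0.5, 0.5, 0.4, 0.4)  # correct center, but twice as wide and tall

loss = mse_box_loss(pred, target)
overlap = iou(pred, target)
```

In this toy case the MSE looks small because all values lie in [0, 1], while the IoU shows that the predicted box covers four times the target area. This is one reason overlap-based measures are often reported alongside coordinate regression losses.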
4
votes
1 answer

How to add negative samples for object detection?

My question is: how can I add certain negative samples to the training dataset to suppress false detections of the object? For example, suppose I want to train a car detector. All my training images are outdoor images with at least one car.…
4
votes
1 answer

How does Mask R-CNN automatically output a different number of objects on the image?

Recently, I was reading PyTorch's official tutorial about Mask R-CNN. When I ran the code on Colab, it turned out that it automatically outputs a different number of channels during prediction. If the image has 2 people on it, it would output a mask…
4
votes
2 answers

What is the meaning of "easy negatives" in the context of machine learning?

What exactly does the term "easy negatives" mean in the context of machine learning, for a classification problem or any problem in general? From a quick Google search, I think it just means negative examples in the training set. Can someone please…