Highest Voted 'computer-vision' Questions - Artificial Intelligence Stack Exchange

85

votes

9 answers

How is it possible that deep neural networks are so easily fooled?

The following page/study demonstrates that deep neural networks are easily fooled into giving high-confidence predictions for unrecognisable images. How is this possible? Can you please explain, ideally in plain English?

asked Aug 02 '16 at 17:05

kenorb

10,423
3
43
91

17

votes

1 answer

Are information processing rules from Gestalt psychology still used in computer vision today?

Decades ago there were (and still are) books on machine vision which, by implementing various information-processing rules from Gestalt psychology, got impressive results with little code or special hardware in image identification and visual…

machine-learning algorithm computer-vision

asked Oct 29 '16 at 06:49

Gottfried William

343
1
11

17

votes

1 answer

What is a fully convolutional network?

I was surveying some literature related to Fully Convolutional Networks and came across the following phrase: "A fully convolutional network is achieved by replacing the parameter-rich fully connected layers in standard CNN architectures by…

machine-learning convolutional-neural-networks computer-vision image-segmentation fully-convolutional-networks

asked Jun 12 '20 at 01:35

r4bb1t

305
1
2
8

13

votes

3 answers

Is it possible to train a neural network to estimate a vehicle's length?

I have a large dataset (over 100k samples) of vehicles with the ground truth of their lengths. Is it possible to train a deep network to measure/estimate vehicle length? I haven't seen any papers related to estimating object size using a deep neural…

machine-learning deep-learning computer-vision training reference-request

asked Oct 16 '17 at 18:10

Naji

139
1
1
3

11

votes

3 answers

Is it difficult to learn the rotated bounding box for a (rotated) object?

I have checked out many methods and papers, like YOLO, SSD, etc., with good results in detecting a rectangular box around an object. However, I could not find any paper that shows a method that learns a rotated bounding box. Is it difficult to learn…

convolutional-neural-networks computer-vision object-detection yolo

asked Jan 11 '19 at 15:00

Ankish Bansal

253
1
2
8

11

votes

1 answer

In Computer Vision, what is the difference between a transformer and attention?

Having studied computer vision for a while, I still cannot understand what the difference between a transformer and attention is.

computer-vision comparison transformer attention

asked Jul 25 '21 at 04:01

novice

113
1
4

11

votes

2 answers

Do deep learning algorithms represent ensemble-based methods?

According to the Wikipedia article on deep learning: Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using a deep graph with multiple processing layers, composed of…

neural-networks deep-learning convolutional-neural-networks computer-vision ensemble-learning

asked Sep 15 '16 at 15:34

Erba Aitbayev

357
1
10

11

votes

0 answers

Extending FaceNet’s triplet loss to object recognition

FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…

deep-learning computer-vision object-recognition object-detection facial-recognition

asked Aug 22 '19 at 22:10

Benedict Aaron Tjandra

111
4
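
The triplet-loss idea described in the excerpt above can be illustrated with a minimal sketch. It assumes the commonly cited formulation (squared Euclidean distances between L2-normalised embeddings, with a margin); the function names and the margin value of 0.2 are illustrative choices here, not taken from the question, and the code uses plain Python lists in place of a real embedding model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (in squared distance); otherwise it is the
    size of the violation. The margin value is an illustrative assumption.
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    d_pos = squared_distance(a, p)  # same identity / same object class
    d_neg = squared_distance(a, n)  # different identity / different class
    return max(0.0, d_pos - d_neg + margin)


# A well-separated triplet incurs no loss; a violated one is penalised.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(easy, hard)
```

Nothing in this sketch is specific to faces: the loss only sees embedding vectors, so the same function applies whether the triplets are built from identities or from object classes. What changes in practice is how the triplets are sampled, which the sketch does not address.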

9

votes

1 answer

Why does nobody use decision trees for visual question answering?

I'm starting a project that will involve computer vision, visual question answering, and explainability. I am currently choosing what type of algorithm to use for my classifier - a neural network or a decision tree. It would seem to me that, because…

neural-networks computer-vision decision-trees explainable-ai question-answering

asked Apr 26 '18 at 11:41

The Impossible Squish

231
1
5

9

votes

1 answer

In YOLO, what exactly do the values associated with each anchor box represent?

I'm going through Andrew Ng's course, which talks about YOLO, but he doesn't go into the implementation details of anchor boxes. Having looked through the code, I see that each anchor box is represented by two values, but what exactly are these values…

neural-networks convolutional-neural-networks computer-vision yolo

asked Jan 24 '18 at 01:46

moondra

209
2
4

9

votes

1 answer

What are sim2sim, sim2real and real2real?

Recently, I keep hearing the terms sim2sim, sim2real and real2real. Could anyone explain the meaning/motivation of these terms (in the DL/RL research community)? What are the challenges in this research area? Anything intuitive would be appreciated!

deep-learning reinforcement-learning computer-vision terminology robotics

asked Oct 14 '19 at 09:27

wcc

91
1
4

8

votes

3 answers

What are the state-of-the-art approaches for detecting the most important "visual attention" area of an image?

I'm trying to detect the visual attention area in a given image and crop the image to that area. For instance, given an image of any size and a rectangle of, say, $L \times W$ dimensions as input, I would like to crop the image to the most…

machine-learning deep-learning computer-vision reference-request state-of-the-art

asked Jun 15 '18 at 14:32

Tina J

973
6
13

8

votes

3 answers

Is it okay to use publicly available Instagram videos to train an AI?

Since I haven't found any good training data for my university project, I want to use pictures and videos from public Instagram profiles. Am I allowed to do that?

computer-vision training datasets research image-processing

asked Sep 22 '21 at 12:04

Bert Gayus

545
3
12

8

votes

2 answers

What are the main algorithms used in computer vision?

Nowadays, CV has really achieved great performance in many different areas. However, it is not clear what a CV algorithm is. What are some examples of CV algorithms that are commonly used nowadays and have achieved state-of-the-art performance?

computer-vision terminology algorithm definitions image-processing

asked Jun 17 '20 at 15:12

Pluviophile

1,223
5
17
37

7

votes

2 answers

Term for algorithms that are not trained

Before the advent of neural architectures, many AI domains (e.g. speech recognition and computer vision) used algorithms that consisted of a series of hand-crafted transformations for feature extraction. In speech recognition everything to do with…

computer-vision terminology algorithm speech-recognition

asked Mar 22 '23 at 11:09

Mew

181
2
