Highest Voted 'architecture' Questions - Artificial Intelligence Stack Exchange

19

votes

2 answers

Are Modular Neural Networks more effective than large, monolithic networks at any tasks?

Modular/Multiple Neural networks (MNNs) revolve around training smaller, independent networks that can feed into each other or another higher network. In principle, the hierarchical organization could allow us to make sense of more complex problem…

asked Dec 02 '18 at 21:09

Harsh Sikka

191
2

16

votes

2 answers

How can I automate the choice of the architecture of a neural network for an arbitrary problem?

Assume that I want to solve a problem with a neural network that I either can't fit to existing architectures (perceptron, Kohonen, etc.), or I'm simply not aware of the existence of those, or I'm unable to understand their mechanics and I rely on my…

neural-networks reference-request hyperparameter-optimization architecture neuroevolution

asked Aug 05 '16 at 21:29

Zoltán Schmidt

623
7
14

11

votes

1 answer

Why is the merged neural network of AlphaGo Zero more efficient than two separate neural networks?

AlphaGo Zero contains several improvements compared to its predecessors. Architectural details of AlphaGo Zero can be seen in this cheat sheet. One of those improvements is using a single neural network that calculates move probabilities and the…

neural-networks comparison architecture alphago-zero efficiency

asked Oct 30 '17 at 23:01

Demento

1,684
1
7
26

9

votes

4 answers

Should neural nets be deeper the more complex the learning problem is?

I know it's not an exact science. But would you say that generally for more complicated tasks, deeper nets are required?

neural-networks deep-learning architecture

asked Apr 27 '20 at 13:21

Gilad Deutsch

629
5
12

7

votes

2 answers

Why do very deep non-ResNet architectures perform worse than shallower ones for the same iteration? Shouldn't they just train slower?

My understanding of the vanishing gradient problem in deep networks is that, as backprop progresses through the layers, the gradients become small, and thus training progresses more slowly. I'm having a hard time reconciling this understanding with images…

deep-learning training backpropagation architecture

asked Sep 28 '19 at 18:25

Intent Filters

71
1
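The vanishing-gradient intuition in the excerpt above can be illustrated with a toy calculation. This is a minimal sketch, not taken from the question: it assumes a chain of scalar layers with sigmoid activations, and the names `sigmoid_derivative` and `gradient_scale` are hypothetical. Each layer multiplies the backpropagated gradient by `weight * sigmoid'(pre_activation)`, and since the sigmoid derivative never exceeds 0.25, the product shrinks geometrically with depth.

```python
import math


def sigmoid_derivative(x: float) -> float:
    """Derivative of the logistic sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def gradient_scale(depth: int, pre_activation: float = 0.0, weight: float = 1.0) -> float:
    """Factor by which a gradient is scaled after passing back through
    `depth` identical scalar layers (toy assumption: same weight and
    pre-activation in every layer)."""
    return (weight * sigmoid_derivative(pre_activation)) ** depth


if __name__ == "__main__":
    for depth in (2, 8, 32):
        print(depth, gradient_scale(depth))
```

Real networks have weight matrices rather than a single scalar, so this only shows the direction of the effect, not its actual magnitude in any given architecture.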

7

votes

1 answer

How do neural network topologies affect GPU/TPU acceleration?

I was thinking about different neural network topologies for some applications. However, I am not sure how this would affect the efficiency of hardware acceleration using GPU/TPU/some other chip. If, instead of layers that would be fully connected,…

neural-networks convolutional-neural-networks training architecture hardware

asked Sep 23 '19 at 14:05

user2316602

173
4

6

votes

1 answer

Are there well-established ways of mixing different inputs (e.g. image and numbers)?

I am interested in the possibility of having extra input along with the main data. For instance, a medical application that would rely mostly on an image: how could one also account for sex, age, etc.? It is certainly possible to put the output of…

convolutional-neural-networks architecture

asked Oct 30 '19 at 09:22

Mathieu Bouville

241
1
7

5

votes

2 answers

Why do Transformers have a sequence limit at inference time?

As far as I understand, a Transformer's time complexity increases quadratically with respect to the sequence length. As a result, to make training feasible, a maximum sequence length is set, and to allow batching, all sequences smaller…

machine-learning natural-language-processing transformer architecture sequence-modeling

asked Nov 26 '21 at 15:32

chessprogrammer

2,215
2
12
23
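The two ideas in the excerpt above, padding to a fixed maximum length and quadratic growth of self-attention cost, can be sketched in a few lines. This is an illustrative sketch only: the padding id `PAD = 0` and the helper names `pad_batch` and `attention_score_entries` are assumptions, not part of the question or of any particular library.

```python
PAD = 0  # hypothetical padding token id


def pad_batch(sequences, max_len):
    """Truncate or right-pad each token-id list to exactly max_len.

    Returns (padded, mask), where mask holds 1 for real tokens and 0 for padding.
    """
    padded, mask = [], []
    for seq in sequences:
        kept = seq[:max_len]
        n_pad = max_len - len(kept)
        padded.append(kept + [PAD] * n_pad)
        mask.append([1] * len(kept) + [0] * n_pad)
    return padded, mask


def attention_score_entries(seq_len):
    """Number of pairwise scores in one self-attention head: seq_len squared."""
    return seq_len * seq_len


if __name__ == "__main__":
    padded, mask = pad_batch([[5, 6, 7], [8]], max_len=4)
    print(padded)
    print(mask)
    # Doubling the sequence length quadruples the score matrix.
    print(attention_score_entries(512), attention_score_entries(1024))
```

The mask is what lets a model ignore the padded positions, so batching does not change the result for the real tokens.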

5

votes

2 answers

What's the difference between architectures and backbones?

In the paper "ForestNet: Classifying Drivers of Deforestation in Indonesia using Deep Learning on Satellite Imagery", the authors talk about using Feature Pyramid Networks (as the architecture) and EfficientNet-B2 (as the backbone). Performance…

deep-learning convolutional-neural-networks comparison terminology architecture

asked Dec 18 '20 at 13:57

codinggirl123

51
1

5

votes

1 answer

How to create an AI to solve a word search?

This at first sounds ridiculous. Of course there is an easy way to write a program to solve a word search. But what I would like to do is write a program that solves a word search like a human. That is, use or invent different strategies. e.g.…

deep-learning search architecture reasoning

asked Nov 10 '19 at 20:00

zooby

2,196
1
11
21

4

votes

2 answers

Are Neural Net architectures accidental discoveries?

Recently, I have been learning about new neural networks, which are used for specialized purposes, like speech recognition, image recognition, etc. The more I discover, the more amazed I am by the cleverness behind models such as RNNs and CNNs.…

neural-networks ai-design research architecture

asked Jun 21 '18 at 12:00

user9947

4

votes

2 answers

Which neural network can I use to solve this constrained optimisation problem?

Let $\mathcal{S}$ be the training data set, where each input $u^i \in \mathcal{S}$ has $d$ features. I want to design an ANN so that the cost function below is minimized (the sum of the square of pairwise differences between model outputs) and the…

neural-networks machine-learning architecture model-request constrained-optimization

asked May 01 '21 at 20:57

user3489173

179
6
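The cost term named in the excerpt above, the sum of the squares of pairwise differences between model outputs, can be written out directly. This is a hedged sketch: it assumes the sum runs over unordered pairs (summing over ordered pairs would double the value), it leaves out the constraint because the excerpt is truncated before stating it, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_sq_diff(outputs):
    """Sum of (y_i - y_j)^2 over all unordered pairs i < j (assumed convention)."""
    return sum((a - b) ** 2 for a, b in combinations(outputs, 2))


def pairwise_sq_diff_closed_form(outputs):
    """Same quantity in O(n) via the identity n * sum(y^2) - (sum(y))^2."""
    n = len(outputs)
    total = sum(outputs)
    return n * sum(y * y for y in outputs) - total * total


if __name__ == "__main__":
    ys = [1.0, 2.0, 4.0]
    print(pairwise_sq_diff(ys), pairwise_sq_diff_closed_form(ys))
```

The closed form shows that this cost is proportional to the variance of the outputs, so minimizing it alone pushes all outputs toward a common value.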

4

votes

1 answer

What is a unified neural network model?

In many articles (for example, in the YOLO paper, this paper or this one), I see the term "unified" being used. I was wondering what the meaning of "unified" in this case is.

neural-networks deep-learning terminology architecture yolo

asked Dec 27 '20 at 20:13

Reactionic

63
3

4

votes

2 answers

Is a basic neural network architecture better with small datasets?

I'm currently trying to predict 1 output value with 52 input values. The problem is that I only have around 100 rows of data that I can use. Will I get more accurate results when I use a small architecture than when I use multiple layers with a…

neural-networks ai-design datasets regression architecture

asked Mar 05 '20 at 13:04

Yari Nowicki

73
3

4

votes

1 answer

Get the position of an object, out of an image

I have some images with a fixed background and a single object on them which is placed, in each image, at a different position on that background. I want to find a way to extract, in an unsupervised way, the positions of that object. For example,…

neural-networks convolutional-neural-networks image-recognition architecture

asked Sep 17 '19 at 04:13

Silviu-Marian Udrescu

41
2
