6

I have implemented an MLP. Now, I want to train it to solve simple tasks.

Are there any data sets to train the MLP on simple tasks, that is, tasks with a small number of inputs and outputs?

I would like to train it to solve problems which are slightly more complex than the XOR problem.

nbro
  • Questions that ask for datasets are generally off-topic here. The most appropriate site to ask this type of question is probably Open Data SE. – nbro Jan 13 '21 at 14:21

4 Answers

4

There are a ton of sample datasets out there you can play with. A bunch of good ones ship with R in the datasets package. Luckily, you can download them independently if you're not an R user. Try https://vincentarelbundock.github.io/Rdatasets/datasets.html

You might also be interested in the MNIST database of handwritten digits, which is one of the canonical datasets used in handwriting recognition research.

Beyond that, you can look at / ask on http://datasets.reddit.com and/or http://opendata.reddit.com and you'll find all sorts of useful datasets.

And finally, don't overlook the UCI Machine Learning Repository.

mindcrime
2

A popular dataset is Fisher's iris dataset. It consists of 150 samples, each with 4 features. You can find it at http://archive.ics.uci.edu/ml/datasets/Iris
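If you download the raw data file from that page, it is a plain CSV with four numeric columns followed by the class name. Here is a minimal sketch of turning such rows into inputs and one-hot targets for a hand-written MLP. The function name `parse_iris` and the local file name `iris.data` are my assumptions, and the three sample rows are only there as a smoke test.

```python
import csv
import io

# Class names as they appear in the last column of the UCI data file.
CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]


def parse_iris(lines):
    """Parse rows of the form `f1,f2,f3,f4,class` into inputs and one-hot targets."""
    inputs, targets = [], []
    for row in csv.reader(lines):
        if len(row) != 5:  # skip blank or trailing lines
            continue
        inputs.append([float(value) for value in row[:4]])
        one_hot = [0.0] * len(CLASSES)
        one_hot[CLASSES.index(row[4])] = 1.0
        targets.append(one_hot)
    return inputs, targets


# Three rows in the file's format, one per class (smoke test only).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
X, Y = parse_iris(sample)
```

With the real file saved locally (assumed here to be `iris.data`), `with open("iris.data", newline="") as f: X, Y = parse_iris(f)` should give 150 input vectors of length 4 and 150 target vectors of length 3.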

hh32
1

If you want to solve a multi-class classification problem, you could use the famous iris flower dataset, which was introduced by Fisher in 1936. In this dataset, each flower has (only) $4$ features (the inputs), namely

  • petal length,
  • petal width,
  • sepal length, and
  • sepal width

There are $3$ classes (the outputs):

  • iris setosa,
  • iris virginica, and
  • iris versicolor

And there are a total of $150$ observations (or records).

The iris flower dataset is available in scikit-learn (sklearn). See, for example, the "Iris plants dataset" section of the scikit-learn documentation.
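As a rough sketch of how you could get the data out of scikit-learn and into a shape your own MLP can consume (4 inputs, 3 one-hot outputs), something like the following should work. The helper name `load_iris_for_mlp`, the 80/20 split, and the seed are my own choices, not anything prescribed by the library.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def load_iris_for_mlp(test_fraction=0.2, seed=0):
    """Return X_train, X_test, Y_train, Y_test with one-hot encoded targets."""
    # X has shape (150, 4); y holds integer class labels 0, 1, 2.
    X, y = load_iris(return_X_y=True)
    # One-hot encode the labels so each target matches a 3-unit output layer.
    Y = np.eye(3)[y]
    # Stratify on the integer labels so every class appears in both splits.
    return train_test_split(
        X, Y, test_size=test_fraction, stratify=y, random_state=seed
    )


X_train, X_test, Y_train, Y_test = load_iris_for_mlp()
```

The features are on different scales (centimetres, roughly 0.1 to 8), so you may want to standardize them using statistics computed on the training split before feeding them to the network.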

To search for other datasets, you can also use https://toolbox.google.com/datasetsearch.

nbro
0

After almost three years the question is still relevant.

Let me add a few more:

Deep Learning Datasets

The datasets from the above link can be used for benchmarking deep learning algorithms.

STL-10 dataset

An image dataset inspired by the CIFAR-10 dataset.

naive