I would like to build a neural network that takes a natural number and generates a one-hot encoding vector corresponding to that number.

Example: $2 \rightarrow (0,0,1,0,\dots)$

More formally, I want it to take an input $i \in \{0, \dots, K\}$ and produce an output vector $(o_0, \dots, o_j, \dots, o_K)$, where every $o_j$ is 0 except $o_i$, which is 1 (i.e. a one-hot encoding).
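To make the target concrete, here is a minimal sketch of the mapping itself (not a trained network), using PyTorch's `torch.nn.functional.one_hot`. The helper name `one_hot_target` and the value `K = 24` are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F

K = 24  # assumed upper bound on the input (illustrative)


def one_hot_target(i: int, k: int = K) -> torch.Tensor:
    """Return the (k + 1)-dimensional one-hot vector with a 1 at index i."""
    # F.one_hot expects an integer (int64) tensor of class indices.
    return F.one_hot(torch.tensor(i), num_classes=k + 1)


print(one_hot_target(2)[:4].tolist())  # [0, 0, 1, 0]
```

This is the output the network should reproduce for every $i$.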

However, I am not sure what the specific architecture should be. I believe it should have more than one layer, since there is no linear relationship between the input and the output.
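One way to probe that belief is a small sketch with hand-picked (not trained) weights, under the assumption that only the argmax of the output matters. A single linear layer with $w_j = j$ and $b_j = -j^2/2$ gives logits $j \cdot i - j^2/2 = i^2/2 - (j - i)^2/2$, which are largest at $j = i$. The value `K = 24` and the variable names are illustrative, and this says nothing about whether gradient descent would find such weights.

```python
import torch
import torch.nn as nn

K = 24  # assumed upper bound on the input (illustrative)

# One linear layer: scalar in, K + 1 logits out, with hand-set weights.
layer = nn.Linear(1, K + 1)
j = torch.arange(K + 1, dtype=torch.float)
with torch.no_grad():
    layer.weight.copy_(j.unsqueeze(1))  # w_j = j, shape (K + 1, 1)
    layer.bias.copy_(-0.5 * j ** 2)     # b_j = -j^2 / 2

# Evaluate the layer on every input 0..K at once, shape (K + 1, 1).
x = torch.arange(K + 1, dtype=torch.float).unsqueeze(1)
with torch.no_grad():
    logits = layer(x)  # logit_j(i) = j * i - j^2 / 2

predicted = logits.argmax(dim=1)
print(torch.equal(predicted, torch.arange(K + 1)))  # True
```

The gap between the largest and second-largest logit is only 0.5 here, so a softmax over these logits is far from an exact one-hot vector; only the argmax matches.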

I have tried several architectures, but the best test accuracy I have been able to get is 33%. I am using the following code:

```python
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
import random
random.seed(0)
torch.manual_seed(0)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1, 55)
        self.fc2 = nn.Linear(55, 30)
        self.fc3 = nn.Linear(30, 25)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.2)
        self.softmax = nn.Softmax(dim=0)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc3(x)
        x = self.softmax(x)
        return x

net = Net()

# A list of 10000 random numbers between 0 and 24
dataset = pd.DataFrame([random.randint(0, 24) for _ in range(10000)])
dataset['label'] = dataset[0] # The label is the same as the input

train, test = train_test_split(dataset, test_size=0.2)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(5):
    # Train
    running_loss = 0.0
    for i in range(len(train)):
        inputs = train.iloc[i, 1:].tolist()
        inputs = torch.tensor(inputs, dtype=torch.float)
        labels = train.iloc[i, 0]
        labels = torch.tensor(labels, dtype=torch.long)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    # Evaluate
    correct = 0
    total = 0
    with torch.no_grad():
        for i in range(len(test)):
            inputs = test.iloc[i, 1:].tolist()
            inputs = torch.tensor(inputs, dtype=torch.float)
            labels = test.iloc[i, 0]
            labels = torch.tensor(labels, dtype=torch.long)
            outputs = net(inputs)
            predicted = outputs.argmax()
            total += 1
            correct += (predicted == labels)

    print('Accuracy of the network on the test set: {:.2f}%'.format(100 * correct / total))
```
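A side check on the loss setup above, offered as a sketch rather than a diagnosis: `nn.CrossEntropyLoss` applies log-softmax internally and is documented to expect raw logits. If it is fed values that are already probabilities in $[0, 1]$, the loss for 25 classes cannot fall below $\log(1 + 24/e) \approx 2.285$, even for a perfect prediction. The variable names below are illustrative.

```python
import math

import torch
import torch.nn as nn

num_classes = 25
criterion = nn.CrossEntropyLoss()
target = torch.tensor([2])  # batch of one example whose true class is 2

# Best case for a softmax output: a perfect probability vector.
probs = torch.zeros(1, num_classes)
probs[0, 2] = 1.0
loss_on_probs = criterion(probs, target).item()

# Raw logits that strongly favour class 2.
logits = torch.full((1, num_classes), -20.0)
logits[0, 2] = 20.0
loss_on_logits = criterion(logits, target).item()

# Lower bound on the loss when the inputs are confined to [0, 1].
floor = math.log(1 + (num_classes - 1) / math.e)
print(f"probabilities as input: {loss_on_probs:.4f} (floor {floor:.4f})")
print(f"logits as input:        {loss_on_logits:.4f}")
```

The first loss sits at the floor while the second is essentially zero, which suggests trying the network with its final `Softmax` removed from `forward` as one possible experiment.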

I believe my task is pretty simple, but I just can't think of the right architecture to do it. Do you have any ideas? :)

Aldan Creo