Training a neural network to produce a one-hot encoding vector out of a single feature

Question

I would like to build a neural network that takes a natural number and generates a one-hot encoding vector corresponding to that number.

Example: $2 \rightarrow (0,0,1,0,\dots)$

More formally, I want it to take an input $i \in \{0, \dots, K\}$ and produce an output vector $(o_0, \dots, o_j, \dots, o_K)$, where every $o_j$ is 0 except $o_i$, which is 1 (i.e., a one-hot encoding).
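
To make the target concrete, here is a minimal sketch of the desired input/output pair built with `torch.nn.functional.one_hot`. The value `K = 24` is an assumption chosen to match the 0 to 24 range used in the code further down; it illustrates the mapping only, not a trained network.

```python
import torch
import torch.nn.functional as F

K = 24  # assumed largest input value, matching the 0..24 range in the code below

i = torch.tensor(2)                       # the input number (must be an integer tensor)
target = F.one_hot(i, num_classes=K + 1)  # vector of length K + 1 with a single 1 at index i

print(target)
print(target.argmax().item())  # the position of the 1 recovers the input number
```

Read this way, the task is a classification problem with $K + 1$ classes in which the class index equals the input.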

However, I am not sure what the specific architecture should be. I believe it should have more than one layer, since there is no linear relationship between the input and the output.

I have tried several architectures, but the best test accuracy I have been able to get is 33%. I am using the following code:

import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
import random
random.seed(0)
torch.manual_seed(0)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1, 55)
        self.fc2 = nn.Linear(55, 30)
        self.fc3 = nn.Linear(30, 25)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.2)
        self.softmax = nn.Softmax(dim=0)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc3(x)
        x = self.softmax(x)
        return x
    
net = Net()

# A list of 10000 random numbers between 0 and 24
dataset = pd.DataFrame([random.randint(0, 24) for _ in range(10000)])
dataset['label'] = dataset[0] # The label is the same as the input

train, test = train_test_split(dataset, test_size=0.2)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(5):
    # Train
    running_loss = 0.0
    for i in range(len(train)):
        inputs = train.iloc[i, 1:].tolist()
        inputs = torch.tensor(inputs, dtype=torch.float)
        labels = train.iloc[i, 0]
        labels = torch.tensor(labels, dtype=torch.long)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    # Evaluate
    correct = 0
    total = 0
    with torch.no_grad():
        for i in range(len(test)):
            inputs = test.iloc[i, 1:].tolist()
            inputs = torch.tensor(inputs, dtype=torch.float)
            labels = test.iloc[i, 0]
            labels = torch.tensor(labels, dtype=torch.long)
            outputs = net(inputs)
            predicted = outputs.argmax()
            total += 1
            correct += (predicted == labels)
    
    print('Accuracy of the network on the test set: {:.2f}%'.format(100 * correct / total))
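
One detail in the code above that may be worth checking: `nn.CrossEntropyLoss` applies log-softmax to its input itself, so it expects raw logits, whereas `forward` already ends in `nn.Softmax`. The sketch below is only an illustration of that interaction, not a diagnosis of the 33% figure. It compares the loss for a perfect prediction expressed as probabilities with the same prediction expressed as logits; the label `3` and the logit values of ±20 are arbitrary choices.

```python
import math
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
num_classes = 25
label = torch.tensor(3)  # arbitrary class index, for illustration only

# Best case for a network that ends in Softmax: a perfect one-hot probability vector.
perfect_probs = torch.zeros(num_classes)
perfect_probs[label] = 1.0
loss_on_probs = criterion(perfect_probs, label).item()

# The same prediction expressed as raw logits with a large margin.
confident_logits = torch.full((num_classes,), -20.0)
confident_logits[label] = 20.0
loss_on_logits = criterion(confident_logits, label).item()

# Closed form of the first case: -1 + log(e^1 + 24 * e^0).
floor = math.log(math.e + (num_classes - 1)) - 1.0

print(f"loss on softmax output: {loss_on_probs:.4f} (closed form {floor:.4f})")
print(f"loss on raw logits:     {loss_on_logits:.6f}")
```

Because softmax outputs are confined to $[0, 1]$, the loss stays at roughly 2.29 for 25 classes even when the prediction is perfect, which leaves little gradient signal. Returning the output of `fc3` directly from `forward` would avoid this.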

I believe my task is pretty simple, but I just can't think of the right architecture to do it. Do you have any ideas? :)

I'm not familiar with pandas but how does this dataframe layout work? Is this training with only 1 number per batch? — user253751, Mar 13 '23 at 20:23
I think this generally leads to suboptimal training unless the learning rate is set very low. In each training step the network converges towards whichever number was presented in that step, with nothing making sure the other numbers also keep working. — user253751, Mar 15 '23 at 17:10
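
The point about batch size can be sketched as follows. This is a hedged rewrite of the training loop, not a verified fix: the batch size of 64, the use of `TensorDataset`/`DataLoader` instead of pandas, and the removal of dropout are all assumptions made for illustration. The model also returns raw logits, because `nn.CrossEntropyLoss` applies log-softmax internally.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
NUM_CLASSES = 25  # values 0..24, as in the question

# 10000 random integers; the label is the same integer as the input.
values = torch.randint(0, NUM_CLASSES, (10000,))
inputs = values.float().unsqueeze(1)  # shape (N, 1): one feature per sample

train_set = TensorDataset(inputs[:8000], values[:8000])
test_set = TensorDataset(inputs[8000:], values[8000:])
# Assumed batch size: each step now averages gradients over 64 samples.
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

model = nn.Sequential(
    nn.Linear(1, 55), nn.ReLU(),
    nn.Linear(55, 30), nn.ReLU(),
    nn.Linear(30, NUM_CLASSES),  # raw logits, no softmax layer
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)


def accuracy(net, dataset):
    """Fraction of samples whose highest-scoring class equals the label."""
    net.eval()
    with torch.no_grad():
        x, y = dataset.tensors
        return (net(x).argmax(dim=1) == y).float().mean().item()


for epoch in range(5):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)  # x: (batch, 1), y: (batch,)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: test accuracy {100 * accuracy(model, test_set):.2f}%")
```

With shuffled mini-batches, each update sees a mix of input values rather than a single one, which is the concern raised in the comment. How much this helps on this task would need to be checked by running it.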
