Below are the two tensors, the inputs and the targets:

[[ 73.,  67.,  43.],
 [ 91.,  88.,  64.],
 [ 87., 134.,  58.],
 [102.,  43.,  37.],
 [ 69.,  96.,  70.]]

[[ 56.,  70.],
 [ 81., 101.],
 [119., 133.],
 [ 22.,  37.],
 [103., 119.]]

These are the weights and biases that are added:

 w = torch.randn(2, 3, requires_grad=True)
 b = torch.randn(2, requires_grad=True)

I am not able to understand how the sizes of the tensors for the weights and biases are decided. Is there a common rule that we should follow when adding weights and biases to our model?

ZKS

1 Answer

The size of the parameter tensors depends on the type of layer you want to build. Convolutional, fully-connected, attention, or even custom layers each treat their input differently, so reading the documentation is a good way to start (Stanford's CS231n course describes each layer's properties in detail).

In your case, the layer is a fully-connected layer (also called a dense or linear layer in other documents), which maps an m-dimensional input vector to an n-dimensional output vector. Every one of the m input nodes is connected to every one of the n output nodes through an $n \times m$ weight matrix (that's why it is called fully-connected), and the bias vector, in short, helps the learned function be more flexible.
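As a concrete sketch of those shapes (assuming, as in your tensors, $m = 3$ input features and $n = 2$ outputs):

import torch

x = torch.randn(5, 3)                      # a batch of 5 inputs with m = 3 features each
w = torch.randn(2, 3, requires_grad=True)  # n = 2 output nodes, m = 3 input nodes
b = torch.randn(2, requires_grad=True)     # one bias value per output node

y = x @ w.T + b                            # (5, 3) @ (3, 2) -> (5, 2), bias broadcasts
print(y.shape)                             # torch.Size([5, 2])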

Therefore, the rule for deciding the size of the weight and bias is the size of the input and target vectors. In your case, each input row has 3 features and each target row has 2 values, which is why w has shape (2, 3) and b has shape (2,). Below is a simple example that builds a Linear layer from scratch:

import torch

class Example_Linear(torch.nn.Module):
    '''
    An example of a Linear layer.
    Args:
        :param in_d: number of input dimensions
        :param out_d: number of output dimensions
        :param bias: whether to use a bias with the linear layer
    '''
    def __init__(self, in_d, out_d, bias=True):
        super().__init__()
        # The weight matrix has shape (out_d, in_d): one row per output node.
        self.weight = torch.nn.Parameter(torch.randn(out_d, in_d))
        self.bias = None
        if bias:
            # One bias value per output node.
            self.bias = torch.nn.Parameter(torch.zeros(out_d))

    def forward(self, x):
        # (batch, in_d) @ (in_d, out_d) -> (batch, out_d)
        out = torch.matmul(x, self.weight.T)
        if self.bias is not None:  # only add the bias when it is enabled
            out = out + self.bias
        return out

in_dimension = 10
out_dimension = 1
model = Example_Linear(in_d=in_dimension, out_d=out_dimension, bias=True)
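For instance, you can check the output shape with a batch shaped like the tensors in the question (a minimal sketch; the inputs name below is mine, not from the question):

inputs = torch.tensor([[ 73.,  67.,  43.],
                       [ 91.,  88.,  64.],
                       [ 87., 134.,  58.],
                       [102.,  43.,  37.],
                       [ 69.,  96.,  70.]])

example_model = Example_Linear(in_d=3, out_d=2, bias=True)  # 3 features in, 2 targets out
out = example_model(inputs)
print(out.shape)  # torch.Size([5, 2]): one 2-dimensional prediction per input row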

Or you can simply create a fully-connected layer in one line:

model = torch.nn.Linear(in_dimension, out_dimension, bias=True)
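torch.nn.Linear stores its parameters the same way, which you can verify directly:

linear = torch.nn.Linear(3, 2, bias=True)
print(linear.weight.shape)  # torch.Size([2, 3]): (out_features, in_features)
print(linear.bias.shape)    # torch.Size([2])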
CuCaRot