torch.nn.init¶
- 
torch.nn.init.calculate_gain(nonlinearity, param=None)[source]¶
- Return the recommended gain value for the given nonlinearity function. The values are as follows: - nonlinearity - gain - Linear / Identity - Conv{1,2,3}D - Sigmoid - Tanh - ReLU - Leaky Relu - SELU - Warning - In order to implement Self-Normalizing Neural Networks , you should use - nonlinearity='linear'instead of- nonlinearity='selu'. This gives the initial weights a variance of- 1 / N, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for- SELUsacrifices the normalisation effect for more stable gradient flow in rectangular layers.- Parameters
- nonlinearity – the non-linear function (nn.functional name) 
- param – optional parameter for the non-linear function 
 
 - Examples - >>> gain = nn.init.calculate_gain('leaky_relu', 0.2) # leaky_relu with negative_slope=0.2 
- 
torch.nn.init.uniform_(tensor, a=0.0, b=1.0)[source]¶
- Fills the input Tensor with values drawn from the uniform distribution . - Parameters
- tensor – an n-dimensional torch.Tensor 
- a – the lower bound of the uniform distribution 
- b – the upper bound of the uniform distribution 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.uniform_(w) 
- 
torch.nn.init.normal_(tensor, mean=0.0, std=1.0)[source]¶
- Fills the input Tensor with values drawn from the normal distribution . - Parameters
- tensor – an n-dimensional torch.Tensor 
- mean – the mean of the normal distribution 
- std – the standard deviation of the normal distribution 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.normal_(w) 
- 
torch.nn.init.constant_(tensor, val)[source]¶
- Fills the input Tensor with the value . - Parameters
- tensor – an n-dimensional torch.Tensor 
- val – the value to fill the tensor with 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.constant_(w, 0.3) 
- 
torch.nn.init.ones_(tensor)[source]¶
- Fills the input Tensor with the scalar value 1. - Parameters
- tensor – an n-dimensional torch.Tensor 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.ones_(w) 
- 
torch.nn.init.zeros_(tensor)[source]¶
- Fills the input Tensor with the scalar value 0. - Parameters
- tensor – an n-dimensional torch.Tensor 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.zeros_(w) 
- 
torch.nn.init.eye_(tensor)[source]¶
- Fills the 2-dimensional input Tensor with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible. - Parameters
- tensor – a 2-dimensional torch.Tensor 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.eye_(w) 
- 
torch.nn.init.dirac_(tensor, groups=1)[source]¶
- Fills the {3, 4, 5}-dimensional input Tensor with the Dirac delta function. Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible. In case of groups>1, each group of channels preserves identity - Parameters
- tensor – a {3, 4, 5}-dimensional torch.Tensor 
- groups (optional) – number of groups in the conv layer (default: 1) 
 
 - Examples - >>> w = torch.empty(3, 16, 5, 5) >>> nn.init.dirac_(w) >>> w = torch.empty(3, 24, 5, 5) >>> nn.init.dirac_(w, 3) 
- 
torch.nn.init.xavier_uniform_(tensor, gain=1.0)[source]¶
- Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from where - Also known as Glorot initialization. - Parameters
- tensor – an n-dimensional torch.Tensor 
- gain – an optional scaling factor 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')) 
- 
torch.nn.init.xavier_normal_(tensor, gain=1.0)[source]¶
- Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from where - Also known as Glorot initialization. - Parameters
- tensor – an n-dimensional torch.Tensor 
- gain – an optional scaling factor 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.xavier_normal_(w) 
- 
torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')[source]¶
- Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from where - Also known as He initialization. - Parameters
- tensor – an n-dimensional torch.Tensor 
- a – the negative slope of the rectifier used after this layer (only used with - 'leaky_relu')
- mode – either - 'fan_in'(default) or- 'fan_out'. Choosing- 'fan_in'preserves the magnitude of the variance of the weights in the forward pass. Choosing- 'fan_out'preserves the magnitudes in the backwards pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with - 'relu'or- 'leaky_relu'(default).
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu') 
- 
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')[source]¶
- Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from where - Also known as He initialization. - Parameters
- tensor – an n-dimensional torch.Tensor 
- a – the negative slope of the rectifier used after this layer (only used with - 'leaky_relu')
- mode – either - 'fan_in'(default) or- 'fan_out'. Choosing- 'fan_in'preserves the magnitude of the variance of the weights in the forward pass. Choosing- 'fan_out'preserves the magnitudes in the backwards pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with - 'relu'or- 'leaky_relu'(default).
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu') 
- 
torch.nn.init.orthogonal_(tensor, gain=1)[source]¶
- Fills the input Tensor with a (semi) orthogonal matrix, as described in Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened. - Parameters
- tensor – an n-dimensional torch.Tensor, where 
- gain – optional scaling factor 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.orthogonal_(w) 
- 
torch.nn.init.sparse_(tensor, sparsity, std=0.01)[source]¶
- Fills the 2D input Tensor as a sparse matrix, where the non-zero elements will be drawn from the normal distribution , as described in Deep learning via Hessian-free optimization - Martens, J. (2010). - Parameters
- tensor – an n-dimensional torch.Tensor 
- sparsity – The fraction of elements in each column to be set to zero 
- std – the standard deviation of the normal distribution used to generate the non-zero values 
 
 - Examples - >>> w = torch.empty(3, 5) >>> nn.init.sparse_(w, sparsity=0.1)