nnspike.models.nvidia

Classes

NvidiaModelMultiTask(num_modes)

A neural network model based on the NVIDIA architecture for end-to-end learning of self-driving cars; jointly predicts a behavior mode and a control value.

NvidiaModelRegression()

A neural network model based on the NVIDIA architecture for end-to-end learning of self-driving cars; predicts a single control value.

class nnspike.models.nvidia.NvidiaModelRegression[source]

A neural network model based on the NVIDIA architecture for end-to-end learning of self-driving cars.

This model consists of five convolutional layers followed by four fully connected layers. The ELU activation function is applied after every layer except the final output layer. The relative-position input is concatenated with the flattened output of the convolutional layers before being passed through the fully connected layers.

conv1 (nn.Conv2d)

First convolutional layer with 3 input channels and 24 output channels.

conv2 (nn.Conv2d)

Second convolutional layer with 24 input channels and 36 output channels.

conv3 (nn.Conv2d)

Third convolutional layer with 36 input channels and 48 output channels.

conv4 (nn.Conv2d)

Fourth convolutional layer with 48 input channels and 64 output channels.

conv5 (nn.Conv2d)

Fifth convolutional layer with 64 input channels and 64 output channels.

flatten (nn.Flatten)

Layer that flattens the output of the convolutional layers.

fc1 (nn.Linear)

First fully connected layer, with input size adjusted to include the concatenated sensor input.

fc2 (nn.Linear)

Second fully connected layer.

fc3 (nn.Linear)

Third fully connected layer.

self_driving_head (nn.Linear)

Final output layer producing the control value.

elu (nn.ELU)

Exponential Linear Unit activation function applied after each layer except the final output layer.

forward(x, relative_position)[source]

Defines the forward pass of the model. Takes an image tensor x and the relative-position input, processes them through the network, and returns the control output.

Parameters:
  • x (torch.Tensor) – Input image tensor of shape (batch_size, 3, height, width).

  • relative_position (torch.Tensor) – Relative position tensor of shape (batch_size, 1).

Returns:

Control output tensor of shape (batch_size, 1).

Return type:

torch.Tensor

__init__()[source]

forward(x, relative_position)[source]

Return type: Tensor
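The documentation above gives the channel progression of the convolutional stack (3 → 24 → 36 → 48 → 64 → 64) but not the kernel sizes, strides, or input resolution, so the input size of fc1 is not stated. As a sketch, assuming the kernel/stride choices of the original NVIDIA PilotNet architecture (5×5 with stride 2 for the first three layers, 3×3 with stride 1 for the last two) and its 66×200 input resolution — assumptions, since nnspike may use different values — the flattened feature size can be computed as:

```python
def conv_out(size, kernel, stride, padding=0):
    """Output spatial size of a Conv2d layer (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_size(height, width):
    """Flattened feature count after the five conv layers described above.

    Kernel/stride values are assumptions taken from the original NVIDIA
    PilotNet architecture; the nnspike implementation may differ.
    """
    layers = [(5, 2), (5, 2), (5, 2), (3, 1), (3, 1)]  # (kernel, stride) per conv
    channels = 64  # output channels of conv5
    for k, s in layers:
        height = conv_out(height, k, s)
        width = conv_out(width, k, s)
    return channels * height * width


# With the PilotNet input resolution of 66x200:
print(flattened_size(66, 200))  # 64 * 1 * 18 = 1152
```

Under these assumptions, concatenating the relative-position scalar would give fc1 an input size of 1152 + 1, which is what "input size adjusted to include the concatenated sensor input" refers to.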

class nnspike.models.nvidia.NvidiaModelMultiTask(num_modes)[source]

A neural network model based on the NVIDIA architecture for end-to-end learning of self-driving cars.

This model consists of five convolutional layers followed by four fully connected layers. The ELU activation function is applied after every layer except the output layers. The relative-position input is concatenated with the flattened output of the convolutional layers before being passed through the fully connected layers.

conv1 (nn.Conv2d)

First convolutional layer with 3 input channels and 24 output channels.

conv2 (nn.Conv2d)

Second convolutional layer with 24 input channels and 36 output channels.

conv3 (nn.Conv2d)

Third convolutional layer with 36 input channels and 48 output channels.

conv4 (nn.Conv2d)

Fourth convolutional layer with 48 input channels and 64 output channels.

conv5 (nn.Conv2d)

Fifth convolutional layer with 64 input channels and 64 output channels.

flatten (nn.Flatten)

Layer that flattens the output of the convolutional layers.

fc1 (nn.Linear)

First fully connected layer, with input size adjusted to include the concatenated sensor input.

fc2 (nn.Linear)

Second fully connected layer.

fc3 (nn.Linear)

Third fully connected layer.

mode_classifier (nn.Linear)

Output layer for behavior mode classification (num_modes outputs).

self_driving_head (nn.Linear)

Output layer for self-driving control.

elu (nn.ELU)

Exponential Linear Unit activation function applied after each layer except the output layers.

softmax (nn.Softmax)

Softmax activation for mode classification.

forward(x, relative_position)[source]

Defines the forward pass of the model. Takes an image tensor x and the relative-position input, processes them through the network, and returns two output tensors: mode classification and control.

Parameters:
  • x (torch.Tensor) – Input image tensor of shape (batch_size, 3, height, width).

  • relative_position (torch.Tensor) – Relative position tensor of shape (batch_size, 1).

Returns:

  • mode_output: Softmax probabilities for robot behavior modes, shape (batch_size, num_modes); with four modes: [left_x following, right_x following, obstacle avoidance, self driving]

  • control_output: Control tensor for self-driving mode, shape (batch_size, 1)

Return type:

tuple[torch.Tensor, torch.Tensor]

__init__(num_modes)[source]

forward(x, relative_position)[source]

Return type: tuple[Tensor, Tensor]
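Since the multi-task head returns softmax mode probabilities alongside a single control value, a caller typically picks the most probable mode and consumes control_output only when that mode is self-driving. A minimal, framework-free sketch of that dispatch for one sample — the mode names come from the mode list documented above, but the "control only in self-driving mode" policy is an assumption, not confirmed nnspike behavior:

```python
# Mode labels in the order documented for mode_output (num_modes = 4).
MODES = ["left_x_following", "right_x_following", "obstacle_avoidance", "self_driving"]


def dispatch(mode_probs, control_value):
    """Pick the most probable behavior mode; return (mode_name, control or None).

    mode_probs:    softmax probabilities for one sample (a row of mode_output).
    control_value: the matching scalar from control_output.
    Consuming control only in self-driving mode is an assumed policy.
    """
    mode = MODES[max(range(len(mode_probs)), key=lambda i: mode_probs[i])]
    if mode == "self_driving":
        return mode, control_value
    return mode, None


print(dispatch([0.1, 0.1, 0.1, 0.7], 0.25))    # ('self_driving', 0.25)
print(dispatch([0.8, 0.1, 0.05, 0.05], 0.25))  # ('left_x_following', None)
```

In a real pipeline the same selection would be done per batch row on the tensors returned by forward(), e.g. via torch.argmax over mode_output along dim=1.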