nnspike.models.customized

Classes

SimpleNetClassification25(num_classes)

A simple convolutional neural network for processing image data with additional sensor inputs.

class nnspike.models.customized.SimpleNetClassification25(num_classes)

A simple convolutional neural network for processing image data with additional sensor inputs.

This network consists of two convolutional layers, each followed by ReLU and max pooling, and two fully connected layers. It takes two inputs: an image and a relative position value. The convolutional features are flattened and concatenated with the relative position before the fully connected layers produce the output.

Architecture:

Conv2d (3->8 channels, 5x5 kernel, stride=2) + ReLU + MaxPool2d (2x2)
Conv2d (8->16 channels, 5x5 kernel, stride=1, padding=2) + ReLU + MaxPool2d (2x2)
Flatten + Concatenate with relative position
Linear (2689->64) + ReLU
Linear (64->1) output

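Expected input image size: (61, 197) as (height, width). The convolution and pooling layers reduce this to a feature map of shape (16, 7, 24), i.e. 2688 values when flattened; adding the relative position value gives the 2689 inputs of the first linear layer.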
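The shape figures above can be checked with the standard output-size formula for convolution and pooling. The following is a minimal sketch, not code from nnspike: the helper names (conv2d_out, pool2d_out, feature_map_shape) are hypothetical, and it assumes the first convolution uses no padding, dilation is 1, and pooling uses a stride equal to its kernel size with floor rounding.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Length of one spatial dimension after a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool2d_out(size, kernel=2):
    """Length after max pooling with stride equal to the kernel size."""
    return (size - kernel) // kernel + 1


def feature_map_shape(height=61, width=197):
    """Return the (height, width) after each conv+pool block."""
    dims = (height, width)
    # Block 1: 5x5 convolution, stride 2, no padding (assumed), then 2x2 pooling
    dims = tuple(pool2d_out(conv2d_out(d, kernel=5, stride=2)) for d in dims)
    after_block1 = dims
    # Block 2: 5x5 convolution, stride 1, padding 2, then 2x2 pooling
    dims = tuple(
        pool2d_out(conv2d_out(d, kernel=5, stride=1, padding=2)) for d in dims
    )
    return after_block1, dims


after_block1, after_block2 = feature_map_shape()
flattened = 16 * after_block2[0] * after_block2[1]  # 16 output channels
print(after_block1, after_block2, flattened, flattened + 1)
# prints: (14, 48) (7, 24) 2688 2689
```

Under these assumptions the result matches the documented (16, 7, 24) feature map and the 2689 inputs of the first linear layer.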

__init__(num_classes)

Initialize the SimpleNetClassification25 model.

Sets up all layers, including the convolutional, pooling, and fully connected layers. The input size of the first fully connected layer (2689) assumes input images of size (61, 197).

forward(x, relative_position)

Forward pass through the network.

Parameters:

x (torch.Tensor) – Input image tensor of shape (batch_size, 3, height, width). Expected input size is (batch_size, 3, 61, 197).
relative_position (torch.Tensor) – Relative position sensor data of shape (batch_size,) or (batch_size, 1). This additional sensor input is concatenated with the flattened convolutional features.

Returns:

Output logits of shape (batch_size, 1). These are raw output values that can be used for regression or passed through a sigmoid for binary classification.

Return type:

torch.Tensor
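To illustrate the forward pass described above, here is a minimal sketch rebuilt from the documented layer list. It is not the nnspike source: the class name SimpleNetSketch is hypothetical, the first convolution is assumed to use no padding, and the num_classes argument is omitted because this page does not say how it is used. The reshape step shows one way to accept relative_position as either (batch_size,) or (batch_size, 1).

```python
import torch
from torch import nn


class SimpleNetSketch(nn.Module):
    """Hypothetical reconstruction of the documented layers, for illustration only."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=5, stride=2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=5, stride=1, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # 16 * 7 * 24 = 2688 image features plus 1 relative position value
        self.fc1 = nn.Linear(16 * 7 * 24 + 1, 64)
        self.fc2 = nn.Linear(64, 1)

    def forward(self, x, relative_position):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = torch.flatten(x, start_dim=1)  # (batch_size, 2688)
        # Accept (batch_size,) or (batch_size, 1) and match the feature dtype
        relative_position = relative_position.reshape(x.shape[0], 1).to(x.dtype)
        x = torch.cat([x, relative_position], dim=1)  # (batch_size, 2689)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)  # raw logits, (batch_size, 1)


model = SimpleNetSketch().eval()
images = torch.randn(4, 3, 61, 197)   # random stand-in for real images
positions = torch.rand(4)             # shape (batch_size,)

with torch.no_grad():
    logits = model(images, positions)
    logits_2d = model(images, positions.unsqueeze(1))  # shape (batch_size, 1)
    probabilities = torch.sigmoid(logits)  # binary-classification reading

print(tuple(logits.shape))
```

Applying the sigmoid outside the model keeps the returned values usable as raw regression outputs, which is the dual use the return description mentions.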

Note

The network expects input images of size (61, 197). After the first conv+pool operation, the spatial dimensions become (14, 48), and after the second conv+pool operation, they become (7, 24). Inputs of a different size can produce a different number of flattened features, which would no longer match the 2689 inputs expected by the first linear layer.