Deep learning visualizer

Dropout Layer Visualizer

See how randomly disabling neurons reduces overfitting.

Layer: 8 neurons | Dropout rate: 0.40 | Active: 6 | Mode: Training
Input activations (4 values)
Training Pass

Random Mask Applied

Pass 4 / 12
[Figure: dropout layer network diagram. Input layer (4): x1 = 0.7, x2 = -0.3, x3 = 1.2, x4 = 0.5. Hidden layer (8): 1.84, 1.22, 1.62, 0.94, 1.06, 1.62, 0, 0. Output layer (3): y1 = 0.78, y2 = 0.75, y3 = 0.88.]
Neuron                 1     2     3     4     5     6     7     8
Original activations   1.1   0.73  0.97  0.56  0.64  0.97  0.57  1.03
Dropout mask           1     1     1     1     1     1     0     0
Scaled output          1.84  1.22  1.62  0.94  1.06  1.62  0     0
y = (x × mask) / (1 - p)
neuron 1: 1.1 × 1 / 0.6 ≈ 1.84 (with exactly 1.1 the result is 1.83, so the displayed activation is presumably rounded)
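
The formula can be checked against the activations and mask shown for this pass. The following is a minimal sketch in plain Python, not the visualizer's own code; the function name `inverted_dropout` is an assumption made for illustration, and the inputs are the rounded values displayed on the page, so the results may differ from the displayed outputs in the last digit.

```python
def inverted_dropout(activations, mask, p):
    """Zero out masked neurons and scale the survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - p)
    return [x * m * keep_scale for x, m in zip(activations, mask)]


# Rounded values as displayed by the visualizer (pass 4 of 12).
activations = [1.1, 0.73, 0.97, 0.56, 0.64, 0.97, 0.57, 1.03]
mask = [1, 1, 1, 1, 1, 1, 0, 0]

scaled = inverted_dropout(activations, mask, p=0.4)
print([round(y, 2) for y in scaled])
```

Scaling at training time is what lets inference use the raw activations with no correction factor.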
Training
[Figure: dropout layer network diagram, training mode. Input layer (4): x1 = 0.7, x2 = -0.3, x3 = 1.2, x4 = 0.5. Hidden layer (8): 1.84, 1.22, 1.62, 0.94, 1.06, 1.62, 0, 0. Output layer (3): y1 = 0.78, y2 = 0.75, y3 = 0.88.]
Different mask each pass. Same expected activation.
Inference
[Figure: dropout layer network diagram, inference mode. Input layer (4): x1 = 0.7, x2 = -0.3, x3 = 1.2, x4 = 0.5. Hidden layer (8): 1.1, 0.73, 0.97, 0.56, 0.64, 0.97, 0.57, 1.03. Output layer (3): y1 = 0.74, y2 = 0.71, y3 = 0.76.]
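
The "same expected activation" claim can be illustrated numerically: averaged over many training passes, each scaled output should approach the unscaled activation used at inference. This is a rough sketch in plain Python. It assumes each neuron is dropped independently with probability p, which may not be how the visualizer draws its masks, and the helper names `sample_mask` and `dropout_pass` are hypothetical.

```python
import random


def sample_mask(n, p, rng):
    """Draw a 0/1 mask; each neuron is dropped independently with probability p."""
    return [0 if rng.random() < p else 1 for _ in range(n)]


def dropout_pass(activations, p, rng):
    """One training pass: fresh random mask, survivors scaled by 1 / (1 - p)."""
    mask = sample_mask(len(activations), p, rng)
    return [x * m / (1.0 - p) for x, m in zip(activations, mask)]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.1, 0.73, 0.97, 0.56, 0.64, 0.97, 0.57, 1.03]
p = 0.4
passes = 20_000

# Accumulate the scaled outputs over many training passes.
totals = [0.0] * len(activations)
for _ in range(passes):
    for i, y in enumerate(dropout_pass(activations, p, rng)):
        totals[i] += y

mean_training = [t / passes for t in totals]
inference = list(activations)  # inference: no mask, no scaling

for x, m in zip(inference, mean_training):
    print(f"inference {x:.2f}   mean over training passes {m:.2f}")
```

Any single pass can be far from the inference value (a dropped neuron outputs 0, a kept one is inflated), but the average converges because each neuron is kept with probability 1 - p and scaled by 1 / (1 - p).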

Why dropout works

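  1. Randomly dropping neurons during training discourages co-adaptation, where a neuron only works in combination with specific other neurons.
  2. Because any neuron may be missing on a given pass, the network learns redundant, more robust features.
  3. At inference, all neurons contribute to the prediction, and no rescaling is needed because training outputs were already scaled by 1 / (1 - p).
  4. Together, these effects reduce overfitting and improve generalization.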

Regularization effect: dropout approximates training an ensemble of many thinned networks and averaging their predictions.
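
For a single linear output unit, the ensemble view can be made exact: weighting every possible thinned network by the probability of its mask and averaging gives the same value as the full network without dropout. The sketch below enumerates all masks for four hidden neurons. The weights and bias are hypothetical values chosen only for illustration, and with nonlinear layers after the dropout layer the equality holds only approximately.

```python
from itertools import product


def linear_output(hidden, weights, bias):
    """A single linear output unit: weighted sum of hidden activations plus bias."""
    return sum(h * w for h, w in zip(hidden, weights)) + bias


hidden = [1.1, 0.73, 0.97, 0.56]   # first four activations shown on the page
weights = [0.5, -0.2, 0.3, 0.8]    # hypothetical weights, for illustration only
bias = 0.1                         # hypothetical bias
p = 0.4

ensemble_average = 0.0
for mask in product([0, 1], repeat=len(hidden)):
    kept = sum(mask)
    # Probability of drawing this particular thinned network.
    prob = (1 - p) ** kept * p ** (len(hidden) - kept)
    thinned = [h * m / (1 - p) for h, m in zip(hidden, mask)]
    ensemble_average += prob * linear_output(thinned, weights, bias)

full_network = linear_output(hidden, weights, bias)
print(ensemble_average, full_network)
```

With n hidden neurons there are 2^n thinned networks, so explicit enumeration is only feasible for toy sizes; using the full network at inference stands in for that average.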

Implementation

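    import torch
    import torch.nn as nn

    dropout = nn.Dropout(p=0.4)
    hidden = torch.tensor([1.1, 0.73, 0.97, 0.56, 0.64, 0.97, 0.57, 1.03])

    dropout.train()            # dropout is active; model.train() does this for every submodule
    y_train = dropout(hidden)  # random mask, survivors scaled by 1 / (1 - p)
    dropout.eval()             # dropout is disabled; model.eval() does this for every submodule
    y_eval = dropout(hidden)   # identity: equals hidden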
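
In a full model, the dropout layer is usually registered as a submodule so that `model.train()` and `model.eval()` toggle it along with everything else. The following sketch assumes a 4-8-3 network matching the layer sizes in the diagram; the ReLU and the randomly initialized weights are assumptions and do not reproduce the visualizer's numbers.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # repeatable weights and masks for this demo

# Hypothetical 4-8-3 network mirroring the layer sizes in the diagram.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Dropout(p=0.4),
    nn.Linear(8, 3),
)
x = torch.tensor([[0.7, -0.3, 1.2, 0.5]])

model.train()                      # sets training=True on every submodule
assert model[2].training           # the Dropout layer is active
with torch.no_grad():
    train_a, train_b = model(x), model(x)   # a fresh mask is drawn on each call

model.eval()                       # sets training=False on every submodule
assert not model[2].training       # the Dropout layer now passes values through
with torch.no_grad():
    eval_a, eval_b = model(x), model(x)

print("eval passes identical:", torch.equal(eval_a, eval_b))
```

A dropout module that is created separately and never attached to the model is not affected by `model.train()` or `model.eval()`; it has to be switched with its own `train()` and `eval()` calls.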