I've been trying to learn more about convolutional neural networks (coming from an SVM background), and I've been struggling to understand how decisions were made when designing some of the leading architectures like VGG, ResNet, and so on. Decisions like dropout rates, the depth of the network, the sizes of the kernels, strides, using overlapping pooling, etc. I know this is a loaded question, so maybe we can restrict it to the fire-starter: AlexNet. Section 3.4 of the linked paper gives the following reason for using overlapping pooling:
"We generally observe during training that models with overlapping pooling find it slightly more difficult to overfit."
Is there some mathematical justification for this, or is it just throw-stuff-at-the-wall-and-see-what-sticks? I feel like I'm missing something obvious.
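For concreteness, here's my understanding of what "overlapping" means: the pooling stride s is smaller than the window size z, so adjacent windows share inputs (AlexNet uses z = 3, s = 2; non-overlapping pooling has s = z). A minimal 1-D sketch with NumPy (the helper function is just mine for illustration):

```python
import numpy as np

def max_pool_1d(x, z, s):
    """Max-pool a 1-D signal with window size z and stride s."""
    return np.array([x[i:i + z].max() for i in range(0, len(x) - z + 1, s)])

x = np.arange(8)  # [0 1 2 3 4 5 6 7]

# Non-overlapping: stride equals window size (s = z = 2)
print(max_pool_1d(x, z=2, s=2))  # [1 3 5 7]

# Overlapping, as in AlexNet: window larger than stride (z = 3, s = 2)
print(max_pool_1d(x, z=3, s=2))  # [2 4 6]
```

In the overlapping case each input can contribute to more than one output window, which is the setup the quoted claim about overfitting refers to.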