This project is our implementation of Weight Standardization for ImageNet classification with ResNet and ResNeXt. The project is forked from pytorch-classification. Their original README.md is appended at the end.
Weight Standardization is a simple reparameterization method for convolutional layers. It enables micro-batch training with Group Normalization (GN) to match the performance of Batch Normalization (BN) trained with large batch sizes. Please see our arXiv report for details. If you find this project helpful, please consider citing our paper:
```
@article{weightstandardization,
  author  = {Siyuan Qiao and Huiyu Wang and Chenxi Liu and Wei Shen and Alan Yuille},
  title   = {Weight Standardization},
  journal = {arXiv preprint arXiv:1903.10520},
  year    = {2019},
}
```
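For intuition, here is a minimal PyTorch sketch of the reparameterization (the class name `WSConv2d` and the epsilon value are illustrative; see the repository's model code for the exact implementation): each convolution kernel is standardized to zero mean and unit variance over its output channel before the convolution is applied.

```python
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d whose kernel is standardized per output channel
    (zero mean, unit variance) before the convolution is applied."""

    def forward(self, x):
        w = self.weight
        # Statistics over the (in_channels, kH, kW) axes of each filter.
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5  # epsilon is illustrative
        return F.conv2d(x, (w - mean) / std, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```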
Architecture | Method | Top-1 Error (%) | Top-5 Error (%) | Pretrained |
---|---|---|---|---|
ResNet-50 | GN + WS | 23.72 | 6.99 | Link |
ResNet-101 | GN + WS | 22.10 | 6.07 | Link |
ResNeXt-50 | GN + WS | 22.71 | 6.38 | Link |
ResNeXt-101 | GN + WS | 21.80 | 6.03 | Link |
NOTE: In practice we do not train with batch size 1 per GPU because it is too slow. Since GN+WS uses no batch statistics, setting the batch size to 256 with an iteration size of 1 (for example) is equivalent to setting the batch size to 1 with an iteration size of 256. To speed up training, we therefore use large batch sizes and rely on the iteration size to simulate micro-batch training.
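The iteration size is ordinary gradient accumulation; here is a minimal sketch under toy assumptions (the model, data, and `iter_size` value below are illustrative, not the repository's code):

```python
import torch
import torch.nn as nn

# Toy setup; the real script builds these from the command-line flags below.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(8)]

iter_size = 4  # one optimizer step per 4 mini-batches (value is illustrative)
optimizer.zero_grad()
for i, (inputs, targets) in enumerate(loader):
    loss = criterion(model(inputs), targets) / iter_size  # scale so the accumulated
    loss.backward()                                       # gradient matches one large batch
    if (i + 1) % iter_size == 0:
        optimizer.step()
        optimizer.zero_grad()
```

We provide the following training scripts to reproduce the reported results; 4 GPUs with 12GB of memory each are assumed.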
ResNet-50:
```bash
python -W ignore imagenet.py -a l_resnet50 --data ~/dataset/ILSVRC2012/ --epochs 90 --schedule 30 60 --gamma 0.1 -c checkpoints/imagenet/resnet50 --gpu-id 0,1,2,3
```
ResNet-101:
```bash
python -W ignore imagenet.py -a l_resnet101 --data ~/dataset/ILSVRC2012/ --epochs 100 --schedule 30 60 90 --gamma 0.1 -c checkpoints/imagenet/resnet101 --gpu-id 0,1,2,3 --train-batch 128 --test-batch 128
```
ResNeXt-50 (32x4d):
```bash
python -W ignore imagenet.py -a l_resnext50 --base-width 4 --cardinality 32 --data ~/dataset/ILSVRC2012/ --epochs 100 --schedule 30 60 90 --gamma 0.1 -c checkpoints/imagenet/resnext50-32x4d --gpu-id 0,1,2,3 --train-batch 128 --test-batch 128
```
ResNeXt-101 (32x4d):
```bash
python -W ignore imagenet.py -a l_resnext101 --base-width 4 --cardinality 32 --data ~/dataset/ILSVRC2012/ --epochs 100 --schedule 30 60 90 --gamma 0.1 -c checkpoints/imagenet/resnext101-32x4d --gpu-id 0,1,2,3 --train-batch 128 --test-batch 128
```
Classification on CIFAR-10/100 and ImageNet with PyTorch.
- Unified interface for different network architectures
- Multi-GPU support
- Training progress bar with rich info
- Training log and training curve visualization code (see `./utils/logger.py`)
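A hypothetical usage sketch of the logger (a `Logger` class with `set_names`/`append`/`plot` methods is assumed from `./utils/logger.py`; check that file for the exact API):

```python
from utils import Logger  # assumed import path within this repository

logger = Logger('log.txt', title='cifar-10')  # file name and title are illustrative
logger.set_names(['Train Loss', 'Valid Loss', 'Train Acc.', 'Valid Acc.'])
for epoch in range(3):
    # In the real training loop these values come from the train/test passes.
    logger.append([1.0 / (epoch + 1), 1.2 / (epoch + 1), 80.0 + epoch, 78.0 + epoch])
logger.plot()   # draws the logged curves
logger.close()
```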
- Install PyTorch
- Clone recursively:
```bash
git clone --recursive https://github.com/bearpaw/pytorch-classification.git
```
Please see the Training recipes for how to train the models.
Top-1 error rates on the CIFAR-10/100 benchmarks are reported. You may get different results when training with a different random seed. Note that the number of parameters is computed on the CIFAR-10 dataset.
Model | Params (M) | CIFAR-10 Error (%) | CIFAR-100 Error (%) |
---|---|---|---|
alexnet | 2.47 | 22.78 | 56.13 |
vgg19_bn | 20.04 | 6.66 | 28.05 |
ResNet-110 | 1.70 | 6.11 | 28.86 |
PreResNet-110 | 1.70 | 4.94 | 23.65 |
WRN-28-10 (drop 0.3) | 36.48 | 3.79 | 18.14 |
ResNeXt-29, 8x64 | 34.43 | 3.69 | 17.38 |
ResNeXt-29, 16x64 | 68.16 | 3.53 | 17.30 |
DenseNet-BC (L=100, k=12) | 0.77 | 4.54 | 22.88 |
DenseNet-BC (L=190, k=40) | 25.62 | 3.32 | 17.17 |
Single-crop (224x224) validation error rates are reported.
Model | Params (M) | Top-1 Error (%) | Top-5 Error (%) |
---|---|---|---|
ResNet-18 | 11.69 | 30.09 | 10.78 |
ResNeXt-50 (32x4d) | 25.03 | 22.6 | 6.29 |
Our trained models and training logs are downloadable at OneDrive.
Since the images in the CIFAR datasets are 32x32, popular network architectures for ImageNet need some modification to adapt to this input size (a sketch of a typical change follows the list below). The modified models are in the package `models.cifar`:
- AlexNet
- VGG (Imported from pytorch-cifar)
- ResNet
- Pre-act-ResNet
- ResNeXt (Imported from ResNeXt.pytorch)
- Wide Residual Networks (Imported from WideResNet-pytorch)
- DenseNet
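As an illustration of the kind of modification involved (this is a sketch, not the exact code in `models.cifar`): ImageNet ResNets downsample aggressively in their stem, which CIFAR variants typically replace with a resolution-preserving convolution.

```python
import torch.nn as nn

# ImageNet-style ResNet stem: a 7x7/stride-2 conv plus a 3x3/stride-2 max-pool
# would shrink a 32x32 CIFAR image to 8x8 before any residual block runs.
imagenet_stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

# Typical CIFAR stem: a single 3x3/stride-1 conv keeps the full 32x32 resolution.
cifar_stem = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False)
```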
For ImageNet:
- All models in `torchvision.models` (alexnet, vgg, resnet, densenet, inception_v3, squeezenet)
- ResNeXt
- Wide Residual Networks
Feel free to create a pull request if you find any bugs or want to contribute (e.g., more datasets and more network structures).