Enlighten Publications

Normalization Layers Are All That Sharpness-Aware Minimization Needs

Müller, M., Vlaar, T., Rolnick, D. and Hein, M. (2024) Normalization Layers Are All That Sharpness-Aware Minimization Needs. In: 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, Louisiana, USA, 10-16 December 2023.

Text
310174.pdf - Published Version
739kB

Publisher's URL: https://proceedings.neurips.cc/paper_files/paper/2023/hash/da909fc3893d272f26fd9db82e09d954-Abstract-Conference.html

Abstract

Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of SAM can outperform perturbing all of the parameters. This finding generalizes to different SAM variants and both ResNet (Batch Normalization) and Vision Transformer (Layer Normalization) architectures. We consider alternative sparse perturbation approaches and find that these do not achieve similar performance enhancement at such extreme sparsity levels, showing that this behaviour is unique to the normalization layers. Although our findings reaffirm the effectiveness of SAM in improving generalization performance, they cast doubt on whether this is solely caused by reduced sharpness.
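
The idea in the abstract, restricting the adversarial (ascent) step of SAM to a small subset of parameters, can be illustrated with a minimal sketch. This is not the authors' implementation: the function `sam_step`, the toy gradient `toy_grad`, the parameter names `w`, `gamma`, `beta`, and the values of `rho` and `lr` are all hypothetical, and the scalar "model" merely stands in for a network whose `gamma` and `beta` play the role of affine normalization parameters. The sketch assumes the standard SAM perturbation (the gradient rescaled to L2 norm `rho`) and assumes that all parameters are still updated in the descent step.

```python
import math


def sam_step(params, grad_fn, perturb_keys, rho=0.05, lr=0.1):
    """One SAM-style update in which only `perturb_keys` are perturbed.

    params: dict mapping parameter name -> float (toy scalar parameters).
    grad_fn: function returning a dict of gradients for a params dict.
    """
    grads = grad_fn(params)
    # L2 norm of the gradient restricted to the perturbed subset.
    norm = math.sqrt(sum(grads[k] ** 2 for k in perturb_keys))
    perturbed = dict(params)
    if norm > 0.0:
        for k in perturb_keys:
            # Ascent step: move the selected parameters along the gradient.
            perturbed[k] = params[k] + rho * grads[k] / norm
    # Assumption: the gradient at the perturbed point updates ALL parameters.
    sam_grads = grad_fn(perturbed)
    return {k: params[k] - lr * sam_grads[k] for k in params}


def toy_grad(p):
    # Gradient of 0.5 * (gamma * w * x + beta - y)^2 for one sample (x=2, y=1).
    x, y = 2.0, 1.0
    err = p["gamma"] * p["w"] * x + p["beta"] - y
    return {"w": err * p["gamma"] * x, "gamma": err * p["w"] * x, "beta": err}


params = {"w": 1.0, "gamma": 1.0, "beta": 0.0}
# Perturb only the stand-ins for the affine normalization parameters.
norm_only = sam_step(params, toy_grad, perturb_keys=["gamma", "beta"])
# Perturb every parameter, as in ordinary SAM.
full = sam_step(params, toy_grad, perturb_keys=list(params))
```

With `rho=0.0` the perturbation vanishes and the step reduces to plain gradient descent, which is a convenient sanity check for a sketch like this.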

Item Type:	Conference Proceedings
Additional Information:	We acknowledge support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy (EXC number 2064/1, Project number 390727645), as well as from the Carl Zeiss Foundation in the project "Certification and Foundations of Safe Machine Learning Systems in Healthcare". We also thank the European Laboratory for Learning and Intelligent Systems (ELLIS) for supporting Maximilian Müller. We are grateful for support from the Canada CIFAR AI Chairs Program and US National Science Foundation award 1910864.
Status:	Published
Refereed:	Yes
Glasgow Author(s) Enlighten ID:	Vlaar, Dr Tiffany
Authors:	Müller, M., Vlaar, T., Rolnick, D., and Hein, M.
College/School:	College of Science and Engineering > School of Mathematics and Statistics > Mathematics
Copyright Holders:	Copyright © 2023 The Author(s)
First Published:	First published in Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Publisher Policy:	Reproduced in accordance with the publisher copyright policy
Related URLs:	Pre-print, Organisation, Publisher

Deposit and Record Details

ID Code:	310174
Depositing User:	Dr Tiffany Vlaar
Datestamp:	13 Dec 2023 12:10
Last Modified:	24 Apr 2024 10:09
Date of acceptance:	22 September 2023
Date of first online publication:	2024
Date Deposited:	13 December 2023
Data Availability Statement:	No