Publications of SPCL

B. A. Plummer, N. Dryden, J. Frost, T. Hoefler, K. Saenko:

 Neural Parameter Allocation Search

(Jun. 2021)


Fitting a model into GPU memory during training is an increasing concern as models continue to grow. Parameter sharing can reduce memory requirements, but existing methods only share parameters between identical layers, limiting their impact. This paper removes these restrictions with a novel task called Neural Parameter Allocation Search (NPAS), where the goal is to generate weights for a network using a given parameter budget. NPAS requires new techniques to morph available parameters to fit any architecture. To address this new task, we introduce Shapeshifter Networks (SSNs), which automatically learn where and how to share parameters between all layers in a network, even between layers of varying sizes and operations. SSNs do not require any loss function or architecture modifications, making them easy to use. We evaluate SSNs in key NPAS settings using seven network architectures across diverse tasks including image classification, bidirectional image-sentence retrieval, and phrase grounding, creating high-performing models even when using as little as 1% of the parameters.
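As a rough illustration of what "generating weights for a network using a given parameter budget" can mean, the sketch below fills layers of different shapes from one fixed-size parameter bank, reusing entries with wraparound when a layer needs more values than the bank holds. This is not the SSN method from the paper, which learns where and how to share. The names `make_bank`, `allocate`, and `layer_weights`, the round-robin layout, and the example layer shapes are all hypothetical and chosen only for illustration.

```python
import random
from math import prod


def make_bank(budget, seed=0):
    """Create a flat bank of `budget` trainable values (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(budget)]


def allocate(bank, layer_shapes):
    """Assign each layer a start offset in the bank, laid out round-robin.

    Assumption: a fixed, hand-written layout stands in for the learned
    allocation described in the abstract.
    """
    offsets, cursor = {}, 0
    for name, shape in layer_shapes.items():
        offsets[name] = cursor
        cursor = (cursor + prod(shape)) % len(bank)
    return offsets


def layer_weights(bank, offset, shape):
    """Read a layer's flat weights from the bank, wrapping around at the end."""
    n = prod(shape)
    return [bank[(offset + i) % len(bank)] for i in range(n)]


# Illustrative layers of different sizes and operations (conv and linear).
layer_shapes = {"conv1": (16, 3, 3, 3), "fc1": (64, 32), "fc2": (10, 64)}

bank = make_bank(budget=500)
offsets = allocate(bank, layer_shapes)
weights = {
    name: layer_weights(bank, offsets[name], shape)
    for name, shape in layer_shapes.items()
}

# Number of weights the unshared network would need, versus the bank size.
total_needed = sum(prod(shape) for shape in layer_shapes.values())
```

Here the three layers would need far more values than the bank holds, so every bank entry is reused several times across and within layers. A real system would also have to decide which layers share which entries, which is the search problem the abstract describes.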


Access preprint on arXiv:


  author={Bryan A Plummer and Nikoli Dryden and Julius Frost and Torsten Hoefler and Kate Saenko},
  title={{Neural Parameter Allocation Search}},