Parameter Prediction for Unseen Deep Architectures (w/ First Author Boris Knyazev)

First published at 04:54 UTC on November 25th, 2021.

Deep neural networks are usually trained from a given parameter initialization using SGD until they converge to a local optimum. This paper takes a different route: given a novel network architectu…

