NeurIPS 2021 Lead Author Spotlight
Guan-Horng Liu, Machine Learning PhD student
We propose a novel second-order optimization framework for training the emerging deep continuous-time models, specifically the Neural Ordinary Differential Equations (Neural ODEs). Since their training already involves expensive gradient computation by solving a backward ODE, deriving efficient second-order methods becomes highly nontrivial. Nevertheless, inspired by the recent Optimal Control (OC) interpretation of training deep networks, we show that a specific continuous-time OC methodology, called Differential Programming, can be adopted to derive backward ODEs for higher-order derivatives at the same O(1) memory cost. Our code is available at https://github.com/ghliu/snopt.
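To make the "backward ODE" idea concrete, below is a minimal sketch of the first-order adjoint method the abstract builds on (not the paper's second-order optimizer itself). It uses a toy scalar dynamics `dz/dt = theta * z` with loss `L = z(T)^2 / 2`, and recovers `dL/dtheta` by integrating an augmented ODE backward in time; the dynamics, loss, and variable names are illustrative assumptions, and `scipy` stands in for a real ODE-solver stack.

```python
import numpy as np
from scipy.integrate import solve_ivp

theta, z0, T = 0.5, 1.0, 1.0   # toy parameter, initial state, horizon

def f(z, theta):
    # toy dynamics dz/dt = theta * z
    return theta * z

# forward pass: solve the ODE from t=0 to t=T
fwd = solve_ivp(lambda t, z: f(z[0], theta), (0.0, T), [z0],
                rtol=1e-10, atol=1e-10)
zT = fwd.y[0, -1]
loss = 0.5 * zT**2

# backward pass: augmented state s = [z, a, g], where a is the adjoint
# (a = dL/dz) and g accumulates dL/dtheta; only this fixed-size state is
# stored, which is the source of the O(1) memory cost.
def aug(t, s):
    z, a, _ = s
    return [theta * z,      # re-solve the state backward
            -theta * a,     # adjoint ODE: da/dt = -a * df/dz
            -a * z]         # gradient ODE: dg/dt = -a * df/dtheta

bwd = solve_ivp(aug, (T, 0.0), [zT, zT, 0.0],  # a(T) = dL/dz(T) = zT
                rtol=1e-10, atol=1e-10)
grad_adjoint = bwd.y[2, -1]

# for this linear toy problem the gradient is known in closed form
grad_analytic = T * zT**2
```

The paper's contribution is that the same backward-ODE machinery can be extended, via the Optimal Control viewpoint, to carry second-order information at the same memory cost.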
Q&A with Guan-Horng Liu
What motivated your work on this paper?
Although the adjoint-based training of Neural ODEs originates from the Optimal Control Principle, few attempts have been made to exploit this connection. We were interested in seeing the practical improvements it could bring to the table.
If readers remember one takeaway from the paper, what should it be and why?
That deep learning optimization is fundamentally intertwined with optimal control. This connection paves an elegant path toward principled training improvements and new applications that would otherwise remain unexplored in the current framework.
Were there any “aha” moments or lessons that you’ll use to inform your future work?
Math won’t lie 🙂
What are you most excited for at NeurIPS and what do you hope to take away from the experience?
I am definitely excited to learn more at NeurIPS.