Deep (Convolution) Networks from First Principles

In this talk, we will offer an entirely white-box interpretation of deep (convolutional) networks from the perspective of data compression and group invariance. We’ll show how modern deep-layered architectures, including their linear (convolutional) operators, nonlinear activations, and even all parameters, can be derived from the principle of maximizing rate reduction with group invariance. We’ll cover how all layers, operators, and parameters of the network are explicitly constructed through forward propagation rather than learned through back propagation. We’ll also explain how all components of the resulting network, called ReduNet, have precise optimization, geometric, and statistical interpretations. You’ll learn how this principled approach reveals a fundamental tradeoff between invariance and sparsity for class separability; how it reveals a fundamental connection between deep networks and the Fourier transform for group invariance, namely the computational advantage of working in the spectral domain; and how it clarifies the mathematical role of forward and backward propagation. Finally, you’ll discover how the resulting ReduNet is amenable to fine-tuning through both forward and backward propagation to optimize the same objective.
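The rate-reduction principle behind ReduNet is the maximal coding rate reduction (MCR²) objective: the coding rate of all features together minus the class-proportion-weighted rates of each class. A minimal NumPy sketch of that objective is below; the function names and the precision parameter `eps` are illustrative choices, not the talk's actual code.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Estimated bits needed to encode the columns of Z (d x m, one
    feature vector per column) up to distortion eps, via log-det of
    the regularized covariance."""
    d, m = Z.shape
    sign, logdet = np.linalg.slogdet(np.eye(d) + (d / (m * eps**2)) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """MCR^2 objective: rate of the whole feature set minus the sum of
    per-class rates, each weighted by its class proportion."""
    d, m = Z.shape
    per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        per_class += (Zc.shape[1] / m) * coding_rate(Zc, eps)
    return coding_rate(Z, eps) - per_class

# Two classes lying on orthogonal axes are far more compressible
# class-by-class than jointly, so the rate reduction is large.
Z = np.hstack([np.tile([[1.0], [0.0]], 50), np.tile([[0.0], [1.0]], 50)])
labels = np.array([0] * 50 + [1] * 50)
print(rate_reduction(Z, labels))  # ≈ 0.51 for these settings
```

Maximizing this quantity pushes features of different classes toward orthogonal subspaces while expanding each class's own span, which is the objective each ReduNet layer is constructed (not trained) to increase.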
- English: Harvard University, Center of Mathematical Sciences and Applications, Math-Science Literature Lecture Series | YouTube video
- Chinese: Tsinghua University, Institute for AI Industrial Research | Tencent video
Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit