About this Event
5200 N Lake Rd, Merced, CA 95343
Electrical Engineering and Computer Science (EECS)
Ph.D. Dissertation Defense
"Low Rank Compression of Neural Networks: LC Algorithms and Open-source Implementation"
Electrical Engineering and Computer Science
University of California, Merced
Neural networks have gained widespread use in many machine learning tasks due to their state-of-the-art performance. However, this progress comes at the cost of ever-increasing model sizes and computational demands. As a result, neural network compression has become an important practical step when deploying trained models for inference.
In this dissertation, we explore a particular compression mechanism, the low-rank decomposition, and its extensions for neural network compression. We study important aspects of low-rank compression: how to select the decomposition ranks across the layers, how to choose the best decomposition shapes for non-matrix weights among a number of options, and how to adapt the low-rank scheme to target inference speed. Computationally, these are hard problems involving integer variables (ranks, decomposition shapes), continuous variables (weights), and a nonlinear loss.
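As a rough illustration of the basic mechanism, the sketch below compresses a single weight matrix with a truncated SVD, which gives the best approximation of a given rank in the Frobenius norm. The function name `low_rank_factors`, the matrix size, and the fixed rank of 8 are assumptions made for this example only; the dissertation concerns how to choose such ranks and shapes automatically, which this sketch does not attempt.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Factor W (m x n) into U (m x rank) and V (rank x n) via truncated SVD.

    The product U @ V is the best rank-`rank` approximation of W
    in the Frobenius norm.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Fold the leading singular values into the left factor.
    return U[:, :rank] * s[:rank], Vt[:rank, :]

# Hypothetical 64 x 32 layer weight, compressed to rank 8.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
U, V = low_rank_factors(W, rank=8)

params_full = W.size                # 64 * 32 stored values
params_low_rank = U.size + V.size   # 64 * 8 + 8 * 32 stored values
rel_error = np.linalg.norm(W - U @ V) / np.linalg.norm(W)
```

Storing the two factors instead of the full matrix reduces both the parameter count and the cost of a matrix-vector product whenever the rank is small relative to the matrix dimensions.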
As we show over the course of this dissertation, all these problems admit suitable formulations that can be efficiently solved using the recently proposed learning-compression (LC) algorithm. The algorithm alternates two optimization steps: the L step, over the neural network parameters, and the C step, over the compression parameters. Once we formulate the compression problems, we show how to derive the corresponding L and C steps. We demonstrate the effectiveness of the proposed compression schemes and the corresponding algorithms on multiple networks and datasets.
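The alternation of the two steps can be sketched on a toy problem. The example below is a minimal sketch under several assumptions that are not stated in the abstract: the "network" is a single linear map fit by least squares, the rank is fixed in advance, and the two steps are coupled through a quadratic penalty whose weight `mu` grows geometrically. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def c_step(W, rank):
    """C step: project the weights onto rank-`rank` matrices (truncated SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def l_step(X, Y, Theta, mu):
    """L step: minimize 0.5*||X W - Y||^2 + 0.5*mu*||W - Theta||^2 over W.

    For a linear model this has the closed form
    (X^T X + mu I) W = X^T Y + mu Theta.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + mu * np.eye(d), X.T @ Y + mu * Theta)

def lc_low_rank(X, Y, rank, mu0=1e-3, growth=1.5, iters=40):
    # Start from the uncompressed least-squares solution.
    W = np.linalg.lstsq(X, Y, rcond=None)[0]
    Theta = c_step(W, rank)
    mu = mu0
    for _ in range(iters):
        W = l_step(X, Y, Theta, mu)   # learn, pulled toward the compressed weights
        Theta = c_step(W, rank)       # compress the current weights
        mu *= growth                  # tighten the coupling between W and Theta
    return W, Theta

# Synthetic data generated by a rank-3 linear map plus small noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
W_true = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 10))
Y = X @ W_true + 0.01 * rng.standard_normal((200, 10))

W, Theta = lc_low_rank(X, Y, rank=3)
```

As `mu` grows, the learned weights `W` and the compressed weights `Theta` are driven together, so the final `Theta` is a low-rank model that has been fit to the data rather than a one-shot truncation of the trained weights.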
Yerlan is a Ph.D. candidate in the EECS department at UC Merced. He holds a Master of Science degree from the University of California San Diego and a Bachelor of Science degree from International Information Technology University, Almaty, Kazakhstan. Yerlan's primary research interests are in neural network compression, including pruning, quantization, low-rank decompositions, and other forms of compression.