Wednesday, August 11, 2021 10am to 12pm
About this Event
Electrical Engineering and Computer Science (EECS)
Ph.D. Dissertation Defense
"Gradient Descent Optimization on Heterogeneous Architectures"
Yujing Ma
Electrical Engineering and Computer Science
University of California, Merced
Abstract:
There is increased interest, in both industry and academia, in building machine learning frameworks with advanced algebraic capabilities. The widely adopted practice is to train deep learning models on specialized hardware accelerators, e.g., GPUs or TPUs, due to their superior performance on linear algebra operations. This practice, however, does not effectively employ the extensive CPU and memory resources available. Moreover, in multi-GPU systems, clock and memory speeds can vary considerably even among GPUs of the same model from the same vendor, so the heterogeneity of GPUs must be carefully considered. In addition, the optimization algorithm used by the deep learning framework plays a pivotal role in training performance. Stochastic gradient descent (SGD) is the most popular optimization method for model training on modern machine learning platforms. However, its convergence and its adaptation to heterogeneous systems remain open research directions.
In this dissertation, we perform a comprehensive experimental study of parallel SGD for training machine learning models. We introduce a generic heterogeneous CPU+GPU framework that exploits the differences in computational power and memory hierarchy between CPU and GPU through synchronous message passing in order to maximize performance and resource utilization. Based on insights gained through experimentation with this framework, we design two heterogeneous asynchronous SGD algorithms and build a heterogeneity-aware multi-GPU framework that reduces the synchronization and data transfer costs of training. We also present a novel synchronous SGD algorithm that tackles the heterogeneity challenge in multi-GPU systems. We show that the implementations of these algorithms in the proposed frameworks greatly accelerate convergence and achieve significantly higher resource utilization than state-of-the-art machine learning systems on real datasets.
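The abstract above centers on stochastic gradient descent. As a minimal illustration only (this is not the dissertation's heterogeneous CPU+GPU implementation, just the textbook serial algorithm the work builds on), an SGD loop for a one-parameter least-squares model y = w·x might look like:

```python
import random

def sgd(data, lr=0.05, steps=100, seed=0):
    """Plain SGD for the model y = w * x.

    Each step samples one (x, y) example and takes a gradient
    step on the squared error (w * x - y) ** 2.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)
        grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
        w -= lr * grad
    return w

# Synthetic data generated with true weight 3.0
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0]]
w = sgd(data)  # converges close to 3.0
```

The parallel and heterogeneity-aware variants studied in the dissertation distribute such update steps across CPU threads and GPUs, which is where the synchronization and data transfer costs discussed above arise.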
Biography:
Yujing Ma is a Ph.D. candidate working under the supervision of Professor Florin Rusu in Electrical Engineering and Computer Science at the University of California, Merced. She received her B.Sc. degree in Software Engineering from East China Normal University in 2014. Her research focuses on designing high-performance machine learning algorithms and frameworks.