5200 Lake Rd, Merced, CA 95343


   Memory-Centric System Optimization for Large-Scale Neural Network Workloads

   Large-scale neural networks, including Large Language Models (LLMs) and Graph Neural Networks (GNNs), have transformed AI applications but impose heavy computational and memory demands. This thesis introduces a suite of memory-centric optimization frameworks that address these challenges in resource-constrained, heterogeneous, and distributed systems. The Betty framework tackles memory bottlenecks in GNN training with Redundancy-Embedded Graph (REG) Partitioning and Memory-Aware Partitioning, enabling faster training and deeper aggregation on large graphs. Building on Betty, Buffalo uses a memory-efficient bucketing strategy to cut overhead and raise throughput on graphs with billions of edges, supporting applications such as knowledge graph completion. For LLMs, BloomBee provides decentralized inference and fine-tuning through advanced parallelism and dynamic offloading, making billion-parameter models usable on consumer hardware. Together, these contributions enable efficient large-scale neural network execution and broaden the applicability of advanced AI models.
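
   For readers unfamiliar with dynamic offloading, the sketch below illustrates the general idea in a minimal form: model blocks are kept in CPU memory and staged onto the GPU one at a time during the forward pass, so a model larger than GPU memory can still run. The class name OffloadedStack, the layer sizes, and the one-block-at-a-time policy are illustrative assumptions for this example only and are not taken from BloomBee's actual implementation.

    # Illustrative sketch only: a minimal form of dynamic layer offloading.
    # Blocks live in CPU memory and are moved to the GPU just long enough to run,
    # so peak GPU memory stays near the footprint of a single block plus activations.
    # Names and sizes here are hypothetical and do not reflect BloomBee itself.
    import torch
    import torch.nn as nn

    class OffloadedStack(nn.Module):
        """Keeps every block on the CPU and stages each one onto the GPU only while it runs."""
        def __init__(self, blocks, device="cuda"):
            super().__init__()
            self.blocks = nn.ModuleList(blocks)   # parameters stay resident on the CPU
            self.device = device

        def forward(self, x):
            x = x.to(self.device)
            for block in self.blocks:
                block.to(self.device)             # stage the current block onto the GPU
                x = block(x)
                block.to("cpu")                   # evict it before staging the next block
            return x

    if __name__ == "__main__":
        hidden = 256
        blocks = [nn.Sequential(nn.Linear(hidden, hidden), nn.GELU()) for _ in range(8)]
        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = OffloadedStack(blocks, device=device)
        print(model(torch.randn(4, hidden)).shape)   # torch.Size([4, 256])

   A production system would overlap transfers with computation and choose per-layer placement based on available memory; the point of the sketch is only to make the memory/compute trade-off behind offloading concrete.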

   Shuangyan Yang is a PhD candidate in the Department of Electrical Engineering and Computer Science (EECS), specializing in systems, high-performance computing, and deep learning. Her current research centers on decentralized inference for AI models, particularly large language models, and on training large-scale graph neural networks. Her methods enhance parallelism in LLM serving and improve memory efficiency in GNN frameworks.
