About this Event
5200 N Lake Rd, Merced, CA 95343
Electrical Engineering and Computer Science (EECS)
Ph.D. Dissertation Defense
“Some Approaches to Interpret Deep Neural Networks”
Suryabhan Singh Hada
Electrical Engineering and Computer Science
University of California, Merced
Practical deployment of deep neural networks has become widespread in the last decade due to their ability to provide simple, intelligent, and automated processing of tasks that were previously hard for other machine learning models. At the same time, concerns about the safety and ethical use of these models have arisen. One of the main concerns is interpretability, i.e., explaining how the model arrives at a decision for a given input. Interpretability is one of the most important problems to address in building trust and accountability in deep neural networks.
This dissertation proposes two novel approaches to interpreting deep neural networks. The first approach focuses on understanding what information is retained by the neurons of a deep net. We propose a method to characterize the region of input space that excites a given neuron to a certain level. Inspection of these regions by a human can reveal regularities that help to understand the neuron.
In the second approach, we provide a systematic way to understand which groups of neurons in a deep net are responsible for a particular class. This allows us to study the relation between deep net features (neuron activations) and output classes, and how different classes are distributed in the feature space. We also show that, out of thousands of neurons in the deep net, only a small subset is associated with a specific class. Finally, we demonstrate that the second approach can also be used to interpret large datasets.
Suryabhan Singh Hada is a Ph.D. candidate in the EECS department at UC Merced. He received his Integrated Masters of Technology (Undergraduate and Masters) in Mathematics and Computing from the Indian Institute of Technology (Banaras Hindu University), Varanasi, India. His research focuses on interpreting complex machine learning models and their visualization.