Atish Agarwala

I am an AI Resident at Google. Previously, I was a physics PhD student at Stanford University advised by Daniel S. Fisher. Before that, I studied math and physics as an undergraduate at Swarthmore College.

My PhD research area was, broadly speaking, theoretical biology. During my PhD, I primarily studied evolution, trying to understand how different dynamical processes combine to give evolutionary dynamics, and to characterize things like the speed and predictability of evolution. I worked on theoretical models of evolution on random fitness landscapes to understand the complexities of evolution in various biologically plausible scenarios. Towards the end of my PhD, I worked on the intersection of ecology and evolution, using modeling and simulation to show that diversity can be maintained dynamically through interactions between hosts and pathogens.

In addition to my theoretical work, I’ve worked on analyzing data from experimental evolution, and developed robust methods for inferring fitness from abundance data (code here). I used my code to understand the nature of fitness gains in glucose-limited yeast (in collaboration with experimentalists from the Petrov and Sherlock labs at Stanford).

For more on my PhD work, check out this interview.

Currently I’m working as an AI Resident at Google, where I’ve focused on two research areas: understanding machine learning using tools from statistical physics, and applying machine learning methods to problems in biology. I’m particularly interested in understanding how dynamics affects learning, in both the classical supervised and active learning settings. I’ve done theoretical work on proving the learnability of certain function classes with wide neural networks trained via gradient descent. Using a combination of theory and experiment, I’ve also studied the learning dynamics of networks trained with cross-entropy loss, showing that the softmax temperature can be tuned to improve network performance.

More recently, I’ve been applying my knowledge of evolution to understand and improve protein design. High-throughput sequencing-based assays can be combined with modern machine learning methods to explore the space of amino acid sequences and efficiently improve the function of proteins of interest. However, not much is understood about what sorts of data should be collected or how sequences should be selected from models. I’m currently using in silico fitness landscapes to experiment on the design process itself, and to understand how landscape properties translate to effective design strategies.


Google scholar