At Netflix, I am exploring principled approaches to neural network architecture and learning
mechanisms, using reinforcement learning to maximize users’ long-term satisfaction.
Developed large-scale ranking systems for Netflix, integrating games and videos into a unified
recommendation model, serving over 250 million users worldwide.
Implemented deep learning models to enhance autonomous vehicle predictions, improving safety-critical event
handling and production accuracy.
Led machine learning teams that deployed scalable solutions for real-time bidding,
advertising, and user engagement systems.
Academic Highlights
I received my PhD in Computer Science, specializing in reinforcement learning, under the supervision of Richard S. Sutton. Here are a few highlights of my academic
accomplishments:
Developed the GTD family of gradient-based temporal-difference learning algorithms, solving critical
convergence challenges.
Introduced the GQ(λ) and TDC algorithms, providing new convergence guarantees for temporal-difference learning.
Published extensively in ICML and NIPS, focusing on reinforcement learning and off-policy methods, with
significant impact in the field.