Hamid Maei

Building experiential AI systems that can plan and learn about the consequences of their actions

Research and Development Interests

  • Generative AI
  • Reinforcement Learning
  • AI Agent Systems with Planning
  • Recommendation Systems
  • Neural Architecture Design
  • Large-Scale AI Systems

My AI Journey: From Learning to Building

2022 – Present
AI Researcher & Engineer
Netflix, Los Gatos, CA
• Built incremental training for a recommender system based on a generative AI paradigm and architecture
• Built the first unified personalized recommender system integrating games and video content for Netflix, serving 300M+ users worldwide
2020 – 2021
Staff Applied Research Scientist
Cruise Automation, San Francisco, CA
• Deep learning models for safety-critical vehicle, pedestrian, and bike prediction
• Multi-modal path prediction and interaction modeling for autonomous vehicles
2017 – 2020
Staff Research Scientist
Criteo, Palo Alto, CA
• Deep learning for advertising technology: text vs image for category predictions
• Scaling up neural networks for prediction tasks such as click-through rate (CTR) and conversions
• Research on reinforcement learning for bidding optimization
2015 – 2017
Senior Staff Engineer
Samsung Research America, Mountain View, CA
Led a team of research engineers developing CNN compression techniques for mobile device deployment
2013 – 2015
Lead Machine Learning Scientist/Engineer
Various startups, Toronto, Canada
Machine learning infrastructure, real-time bidding systems, and customer propensity prediction models
2011 – 2013
Postdoctoral Fellow
Stanford University, CA
Postdoctoral research in Reinforcement Learning with Benjamin Van Roy
2007 – 2011
PhD in Computer Science
CS Department, University of Alberta, Canada
• PhD Advisor: Richard Sutton (Turing Award Recipient 2024)
• Developed Gradient Temporal-Difference (GTD) algorithms, solving the fundamental "deadly triad" problem in reinforcement learning
2005 – 2007
Research Grad Student
University of Toronto, Canada
→ Admitted to Computer Science PhD program and transitioned to University of Alberta to work with Rich Sutton

Research in machine learning and memory mechanisms in neuronal systems
• Completed Deep Learning course (neural networks) with Geoff Hinton
• Advanced course project in deep learning
2003 – 2005
M.Phil. Machine Learning
Gatsby Unit, UCL, London, UK (founded by Geoff Hinton)
Advanced coursework and research in machine learning
• Temporal memory research in randomly connected recurrent neural networks
• Unsupervised learning

Industry Experience

📺

Recommender Systems

Netflix: Advanced recommender system for Netflix homepage serving 300M+ global users

🚗

Autonomous Driving

Cruise Automation: Deep learning models for safety-critical vehicle, pedestrian, and bike prediction

🎯

Advertising Technology

Criteo: Deep learning for category predictions and reinforcement learning frameworks for real-time bidding optimization

📱

Mobile AI

Samsung: CNN compression for object detection and classification, Samsung Pay project

Academic Breakthrough in RL

Gradient-TD (GTD) Algorithms

TDC • GTD(λ) • GQ(λ) • Greedy-GQ

Solved the "deadly triad" — a 16-year-old fundamental problem in reinforcement learning. Provided the first theoretical convergence guarantees for combining off-policy learning, function approximation, and bootstrapping.

Developed in close collaboration with Richard Sutton (PhD advisor), and with Csaba Szepesvári on the stochastic convergence proofs for TDC and Greedy-GQ.

My work has been acknowledged in the preface and featured in Chapter 11 of "Reinforcement Learning: An Introduction" by Sutton and Barto.

Get In Touch

Interested in collaborating on AI missions or discussing ideas? I'd love to connect.
