B.S. Computer Science · University of Washington

Eric Bae

I research offline reinforcement learning and the benchmarks we judge it by — headed toward research, grad school, and robotics.

  • UW — Seattle, WA
  • Offline RL & benchmarks
  • Grad Dec 2026
fig. 1: reward_curve.log (y-axis: reward, x-axis: effort)

// about

About

I've been taking things apart since I was a kid: toys first, to see their inner workings, and eventually the algorithms behind them. That same itch is what pulled me into research. These days I'm a CS undergrad at the University of Washington, and I spend most of my time in the RAIVN Lab on offline reinforcement learning, specifically on how the way we report results quietly shapes the conclusions we draw from a benchmark. Our current work trains 145,000+ policies across 114 datasets to pull that apart.

Before that I designed a Sequential VAE for a generative TSP model, classified T-cell DNA sequences at Fred Hutch, and led a TA team building grading and testing tooling for 600+ students a quarter. What ties it all together is the part I actually love: building the thing, measuring it honestly, and leaving behind infrastructure other people can reproduce and build on.

// research

Research

  1. Undergraduate Researcher · RAIVN Lab

    Mentored by Nabil Omi · Advised by Prof. Ali Farhadi

    • Co-authoring a paper (under review) training 145,000+ policies across 114 datasets to study how reporting and hyperparameter choices affect offline-RL benchmark conclusions.
    • Implemented and adapted offline RL algorithms (CQL, BCQ) in PyTorch and d4rl; integrated ViZDoom environments into the MuJoCo-based benchmark suite.
    • Built data pipelines for JumpStart — the project's open-source resource suite of trained models, per-model hyperparameter and reward data, and strong baselines across 114 datasets.
    • PyTorch
    • d4rl
    • MuJoCo
    • ViZDoom
    • CQL
    • BCQ
  2. Research Assistant · Social Reinforcement Learning Lab

    Software Developer

    • Co-authored a paper (arXiv, 2025) on generative modeling for robust deep RL on the traveling salesman problem, introducing COGS — Combinatorial Optimization with Generative Sampling.
    • Designed a Sequential VAE architecture, reducing reconstruction loss by 75%.
    • Improved model accuracy by 10% using TensorFlow and biologically inspired heuristics.
    • TensorFlow
    • VAE
    • Generative models
  3. Machine Learning Intern · Fred Hutchinson Cancer Research Center

    Herbold Computational Biology Program · mentored by Dr. Phil Bradley

    • Examined patient cell responses to different T-cell receptors toward engineering an immunotherapy cancer treatment.
    • Analyzed >5,000 T-cell DNA sequences (16 features) with Pandas/NumPy to classify molecular structures.
    • Designed a regression model with scikit-learn and presented findings to 75+ researchers at a conference.
    • Pandas
    • NumPy
    • scikit-learn
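
As a toy illustration of the reporting question in the RAIVN Lab entry above, the sketch below ranks two algorithms by the best configuration in a hyperparameter sweep and then by the average over the sweep. The algorithm names and every score are hypothetical and invented for this example. They are not results, code, or methodology from the paper.

```python
from statistics import mean

# Hypothetical normalized returns, one value per hyperparameter configuration.
# These numbers are invented for illustration only.
scores = {
    "algo_a": [0.82, 0.35, 0.30, 0.28],  # one strong config, sensitive to tuning
    "algo_b": [0.70, 0.68, 0.66, 0.64],  # lower peak, stable across configs
}


def rank(scores, summarize):
    """Return algorithm names ordered best to worst under `summarize`."""
    return sorted(scores, key=lambda name: summarize(scores[name]), reverse=True)


best_of_sweep = rank(scores, max)      # report only the best configuration
average_of_sweep = rank(scores, mean)  # report the average over the sweep

print("best-of-sweep ranking:", best_of_sweep)
print("average-of-sweep ranking:", average_of_sweep)
```

With these made-up numbers the two summaries disagree about which algorithm comes first, which is the kind of effect a reporting choice can have on a benchmark conclusion.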

// publications

Publications

  1. Co-author (details withheld during review). Jump Start Your Policy Learning with Lessons from 145,000 Training Runs.

    under review

  2. M. Li, E. Bae, C. Haberland, N. Jaques. Generative Modeling for Robust Deep Reinforcement Learning on the Traveling Salesman Problem.

    arXiv, 2025

// honors

Awards & Honors

  • 2nd place, Spokane Cyber Cup V · Security capture-the-flag (CTF) · 2024
  • Wayne & Christy Timberlake Family Endowed Fund Scholarship · University of Washington · 2024
  • Su Family Endowed Scholarship in Computer Science & Engineering · University of Washington · 2023
  • National Coca-Cola Scholars Program, Semifinalist · 2022

// projects

Projects

96%

real-time accuracy

Sign Language Detector

Jun 2024 — Aug 2024

End-to-end real-time gesture-recognition pipeline built with TensorFlow and computer-vision techniques, with minimal inference delay; released as a baseline for open-source contributors.

1st

UW Datathon · 250+ entrants

Covid-19 ML Analysis

Feb 2023

Best ML model in the Health Track over a 1.04M-row, 128-feature dataset. A bootstrap aggregation approach tripled prediction accuracy to 90%.
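
For readers unfamiliar with bootstrap aggregation (bagging), here is a minimal standard-library sketch of the general technique: fit one simple model per bootstrap resample, then combine predictions by majority vote. The toy data, the one-dimensional threshold "stump" model, and all function names are assumptions made for illustration. This is not the datathon model or its dataset.

```python
import random
from collections import Counter


def fit_stump(points):
    """Pick the threshold t (from the sampled x values) that classifies the
    most points correctly under the rule: predict 1 if x > t, else 0."""
    best_t, best_correct = None, -1
    for t, _ in points:
        correct = sum((x > t) == bool(y) for x, y in points)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def bagged_stumps(points, n_models, rng):
    """Fit one stump per bootstrap sample (drawn with replacement,
    same size as the original data)."""
    return [fit_stump(rng.choices(points, k=len(points))) for _ in range(n_models)]


def predict(thresholds, x):
    """Majority vote over the ensemble of stumps."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical toy data: the label is 1 exactly when x > 5.
points = [(x, int(x > 5)) for x in range(11)]
ensemble = bagged_stumps(points, n_models=25, rng=rng)

print(predict(ensemble, 9), predict(ensemble, 1))
```

Each stump sees a slightly different resample, so individual thresholds vary, and the vote averages out that variation. On real tabular data a library implementation such as scikit-learn's `BaggingClassifier` would be the more usual choice.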

// experience

Experience

Lead TA (CSE 123) · Paul G. Allen School of CSE

  • Built an automated grading tool (Java, Maven, JS) that doubled grading speed for 150+ TAs.
  • Designed an exam-testing pipeline serving 600+ students each quarter.
  • Developed an open-access learning portal used by 100+ high-school teachers nationally (Jupyter, Java, HTML/CSS).

Software Engineer Intern · Dabble Startup

  • Architected a responsive front end (React, Ionic, MUI), improving page-load times by 30%.
  • Built a Node.js/MongoDB backend with sockets, auth, and Redux state management.

Controls Team Member · Washington Hyperloop

  • Collaborated with the team to build a robotic navigation system for the Not-a-Boring Competition.
  • Assembled Raspberry Pis and Arduinos to process and analyze 25+ streams of sensor data.

Coding Instructor · Advance Academy

  • Taught HTML/CSS and Java fundamentals to students as a part-time instructor.

// skills

Skills

Languages
Python, Java, C, C++, C#, JavaScript, TypeScript, R, Go, HTML/CSS
ML & RL
PyTorch, TensorFlow, gymnasium, MuJoCo, d4rl, scikit-learn
Data & Viz
Pandas, NumPy, Matplotlib, plotnine, Seaborn
Frameworks & Web
React, Next.js, Node.js, Ionic, MUI, RealityKit, ARKit
Systems & Tooling
SQL, MongoDB, Docker, Jupyter, Maven, Gradle