// notes
Working notes on offline RL, benchmarks, and things I'm building.
A short note on why the same set of runs can support opposite conclusions depending on how you report them — and what we're doing about it.