Blog

Machine learning, mathematics, and research

2025-08-15

Is your LLM performing @k?

2025-07-29

Is your LLM performing @k? A discussion of common metrics.

2025-01-30

Lessons from a DeepSeek R1 mini eval on AIMO2 Ref-10: Are Thinking LLMs A New Paradigm for Evaluation and Inference?

2024-08-09

Researchers are science accelerators (or: The citation-impact paradox)