Blog

Thoughts on machine learning, mathematics, and research

Is your LLM performing @k? A discussion of common metrics.

Lessons from a DeepSeek R1 mini eval on AIMO2 Ref-10: Are Thinking LLMs A New Paradigm for Evaluation and Inference?

Researchers are science accelerators (or: The citation-impact paradox)
