{ ken ashe }
  • Building
  • Blog

// tag

evals

6 posts tagged evals.

  • Multimodal models still change answers when you shuffle the evidence (Jun 25, 2026)
  • Self-distillation can make models better on the first try and worse on the fifth (Jun 25, 2026)
  • Agent Success Rate is the only number that matters when a new model drops (May 31, 2026)
  • Marketers are still vibe-checking prompts. Frontier devs run evals before lunch. (May 30, 2026)
  • Stop Vibe-Checking New Models. Build a 50-Prompt Eval Set Instead. (May 28, 2026)
  • The Frustration Index: A Cheap Eval Most Teams Skip (May 20, 2026)

© 2026 Ken Ashe