REER, or Reverse-Engineered Reasoning, is a method for teaching AI models to reason deeply and step by step on open-ended tasks such as writing stories or essays. Instead of building a reasoning process forward from scratch, as traditional methods do, REER starts from a high-quality final answer and works backward to recover a thinking process that could plausibly have led to it. The result is a set of "reasoning trajectories" (detailed paths of thought) that can be used to train AI on creative, unstructured problems.
REER treats finding a good reasoning trajectory as a search problem over a very large space of possible thoughts. The goal is to discover a step-by-step path (called z) that best "explains" a high-quality output (y) for a given input (x). Quality is measured by perplexity: how surprised the model is by y once it has been given x and z. Lower perplexity means the path makes y more probable, so the path is a better explanation of y.
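To make the perplexity criterion concrete, here is a minimal Python sketch. It assumes you can already obtain the natural-log probability a model assigns to each token of y when conditioned on x and a candidate z. The log-probability values and the names z1 and z2 are hypothetical, chosen only for illustration, and the function is not taken from any REER implementation.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of a sequence from its per-token natural-log probabilities.

    Defined as exp of the negative mean log-probability, so a sequence whose
    tokens are all assigned probability 0.5 has perplexity 2.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical log-probabilities of the tokens of y, conditioned on x and on
# each of two candidate trajectories (made-up numbers, not model output).
logprobs_given_z1 = [-0.2, -0.1, -0.3, -0.2]
logprobs_given_z2 = [-1.5, -0.9, -2.0, -1.2]

scores = {
    "z1": perplexity(logprobs_given_z1),
    "z2": perplexity(logprobs_given_z2),
}

# The trajectory with the lowest perplexity is the one that best explains y.
best = min(scores, key=scores.get)
print(best, scores)
```

In this toy comparison z1 would be preferred, because y is less surprising to the model after reading z1 than after reading z2.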
Key components:
Formally, REER solves an optimization problem: find the z that minimizes the perplexity of y given x and z.
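The objective can be written out as follows. This is a sketch using the standard definition of perplexity. The symbols \mathcal{Z} (the space of candidate trajectories), p_\theta (the model's token distribution), and |y| (the number of tokens in y) are notational assumptions introduced here, not taken from the text above.

```latex
z^{*} = \arg\min_{z \in \mathcal{Z}} \; \mathrm{PPL}(y \mid x, z),
\qquad
\mathrm{PPL}(y \mid x, z)
  = \exp\!\left( -\frac{1}{|y|} \sum_{t=1}^{|y|}
      \log p_{\theta}\!\left( y_{t} \mid x, z, y_{<t} \right) \right)
```

Because the exponential is monotonic, minimizing this perplexity is equivalent to maximizing the average log-probability of the tokens of y given x and z.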