REER Reverse Reasoning Guide

REER, or Reverse-Engineered Reasoning, is an approach for teaching AI models to reason deeply and step by step on open-ended tasks such as writing stories or essays. Instead of building a reasoning process forward from scratch, REER starts from a high-quality final answer and works backward to recover a plausible thinking process that could have led to it. The result is a set of "reasoning trajectories" (detailed paths of thought) that can be used to train models on creative, unstructured problems.

REER as a Search Problem

REER treats finding a good reasoning trajectory as a search problem over a very large space of possible thoughts. The goal is to discover a step-by-step path (called z) that best "explains" a high-quality output (y) for a given input (x). Quality is measured by perplexity: how surprised the model is by y once it has read x and followed z. Lower perplexity means the path makes y more probable under the model, which is taken as a sign that z explains y well.
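To make the perplexity measure concrete, here is a minimal sketch of how it is commonly computed from per-token log-probabilities. The function name `perplexity` and the example probabilities are illustrative assumptions, not values from the REER work; in practice the log-probabilities would come from a language model scoring y after reading x and z.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities.

    This is exp of the average negative log-likelihood per token.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical numbers: the model gives each of the 4 tokens of y
# probability 0.1 without a reasoning path, and 0.5 after reading z.
without_z = [math.log(0.1)] * 4
with_z = [math.log(0.5)] * 4

print(perplexity(without_z))  # about 10: y is surprising
print(perplexity(with_z))     # about 2: z makes y much less surprising
```

A path z that moves the score from roughly 10 to roughly 2 would count as a much better explanation of y under this measure.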

Key components:

  • Iterative Local Search: Start with a basic initial thought path. Then break it into segments and refine each one locally (tweak the wording or add steps) while checking whether the overall perplexity improves. Repeat until the perplexity stops improving meaningfully. Keeping each change small and targeted helps the search avoid getting stuck.
  • Perplexity-Guided Refinement: Perplexity acts as a compass. For each tweak, compute how well the updated path predicts the final answer. Keep changes that lower perplexity (better explanation) and discard those that raise it.
  • Data Curation: Collect real-world question-answer pairs (e.g., writing prompts and responses). Run the search on them to generate reasoning trajectories. Filter for quality using techniques like context setup and end-result checks, resulting in a diverse dataset focused on creative areas like literature and arts.
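The search loop described in the first two components can be sketched as follows. This is a toy illustration under loud assumptions: `toy_perplexity` is a word-overlap stand-in for a real language-model score, and `propose_candidates` appends a random word where a real system would ask a model to rewrite the segment. All names, the prompt, and the vocabulary are hypothetical.

```python
import math
import random

def toy_perplexity(x, z_segments, y):
    """Stand-in for PPL(y | x, z). NOT a real model score.

    The more words of y already appear in x or z, the lower the value.
    """
    context = set((x + " " + " ".join(z_segments)).lower().split())
    y_words = y.lower().split()
    missing = sum(1 for w in y_words if w not in context)
    return math.exp(missing / len(y_words))

def propose_candidates(segment, vocabulary, rng, n=4):
    """Hypothetical proposer: extend the segment with one random word."""
    return [segment + " " + rng.choice(vocabulary) for _ in range(n)]

def local_search(x, y, z_init, vocabulary, rounds=5, seed=0):
    """Refine z one segment at a time, keeping only changes that lower the score."""
    rng = random.Random(seed)
    z = list(z_init)
    best = toy_perplexity(x, z, y)
    history = [best]  # best score before the search and after each round
    for _ in range(rounds):
        for i in range(len(z)):
            for candidate in propose_candidates(z[i], vocabulary, rng):
                trial = z[:i] + [candidate] + z[i + 1:]
                score = toy_perplexity(x, trial, y)
                if score < best:  # keep the tweak only if it explains y better
                    z, best = trial, score
        history.append(best)
    return z, best, history

x = "Write a story about a lighthouse keeper"
y = "The keeper lit the lamp as the storm rolled over the sea"
z_init = ["Think about the setting", "Decide how the story ends"]
vocabulary = y.lower().split() + ["plot", "tone"]

z, best, history = local_search(x, y, z_init, vocabulary)
print(history)  # scores never increase from one round to the next
print(z)
```

Because a change is accepted only when it lowers the score, the recorded history can never go up. Swapping the two stand-in functions for calls to an actual language model would keep the loop structure unchanged.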

Formally, REER solves an optimization problem: find the trajectory z that minimizes the perplexity of y given x and z.
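Written out, the objective can be stated as below. This assumes the standard token-level definition of perplexity; the symbols $\mathcal{Z}$ (the space of candidate trajectories), $p$ (the model's next-token distribution), and $|y|$ (the number of tokens in y) are notation introduced here for illustration.

```latex
z^{*} = \arg\min_{z \in \mathcal{Z}} \; \mathrm{PPL}(y \mid x, z),
\qquad
\mathrm{PPL}(y \mid x, z)
  = \exp\!\left( -\frac{1}{|y|} \sum_{t=1}^{|y|} \log p\left(y_t \mid x, z, y_{<t}\right) \right)
```

The sum runs over the tokens of y only, so z is judged purely by how much it helps the model predict the final answer.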