
Recursive Micro-Networks for Efficient Reasoning

Engineer lightweight neural architectures that iterate on their own outputs to rival larger models in structured reasoning tasks.


3. Training Strategies for Stable Recursion

Training recursive micro-networks requires specialized curricula, high-quality teacher signals, and explicit stabilization of the recursion itself.

Curriculum Learning

  • Start with shallow recursion (1-3 steps) using simple tasks.
  • Gradually extend depth and complexity, ensuring the model learns to leverage extra steps effectively.
  • Interleave tasks requiring different reasoning styles (pattern completion, arithmetic, symbolic planning).
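The sketch below illustrates one way such a schedule could look. It is a minimal PyTorch sketch under assumed interfaces: the MicroNetwork class, depth_range schedule, and placeholder task generator are illustrative names, not details from the original text.

```python
# A minimal curriculum sketch: recursion depth grows over training while
# different reasoning styles are interleaved each epoch.
import random

import torch
import torch.nn as nn


class MicroNetwork(nn.Module):
    """Tiny recursive reasoner: re-applies one shared block to its own state."""

    def __init__(self, dim=64):
        super().__init__()
        self.step = nn.GRUCell(dim, dim)      # weights shared across recursion steps
        self.readout = nn.Linear(dim, dim)

    def forward(self, x, num_steps):
        h = torch.zeros_like(x)
        for _ in range(num_steps):            # iterate on its own hidden state
            h = self.step(x, h)
        return self.readout(h)


def depth_range(epoch, total_epochs, max_depth=8):
    """Start with 1-3 recursion steps, then widen the upper bound over training."""
    progress = epoch / max(total_epochs - 1, 1)
    return 1, max(3, 1 + round(progress * (max_depth - 1)))


TASK_TYPES = ["pattern_completion", "arithmetic", "symbolic_planning"]


def sample_batch(task_type, dim=64, batch=32):
    """Placeholder task generator; real reasoning datasets would plug in here."""
    x = torch.randn(batch, dim)
    return x, torch.flip(x, dims=[-1])        # dummy target for illustration


model = MicroNetwork()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(20):
    lo, hi = depth_range(epoch, total_epochs=20)
    for task in TASK_TYPES:                   # interleave reasoning styles each epoch
        x, y = sample_batch(task)
        steps = random.randint(lo, hi)        # curriculum-controlled recursion depth
        loss = loss_fn(model(x, steps), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The key design choice is that depth is sampled from a range rather than fixed, so the network must remain useful at shallow depths even as the curriculum allows deeper recursion.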

Teacher Guidance

  • Use larger models or symbolic solvers to produce high-quality reasoning traces.
  • Distill these traces into the micro-network through supervised learning or policy gradients.
  • Emphasize the process, not just the final answer; include self-critique tokens and decision points.
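A process-level distillation objective could be sketched as follows. This assumes a hypothetical teacher that exposes one target state per reasoning step, and it reuses the MicroNetwork from the previous sketch; it covers per-step supervision over continuous states only, while token-level traces with self-critique markers would need a sequence-model variant.

```python
# A minimal process-distillation sketch: supervise every intermediate step of the
# student's recursion against the teacher's trace, not just the final answer.
import torch
import torch.nn as nn


def unrolled_states(model, x, num_steps):
    """Run the micro-network and keep every intermediate output, not just the last."""
    h = torch.zeros_like(x)
    states = []
    for _ in range(num_steps):
        h = model.step(x, h)
        states.append(model.readout(h))
    return states


def process_distillation_loss(student_states, teacher_trace,
                              step_weight=0.5, final_weight=1.0):
    """Match the whole reasoning trace, with extra weight on the final answer."""
    mse = nn.MSELoss()
    step_loss = sum(mse(s, t) for s, t in zip(student_states, teacher_trace))
    final_loss = mse(student_states[-1], teacher_trace[-1])
    return step_weight * step_loss + final_weight * final_loss


# Usage (teacher_trace would come from a larger model or symbolic solver):
# states = unrolled_states(model, x, num_steps=len(teacher_trace))
# loss = process_distillation_loss(states, teacher_trace)
```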

Stability Techniques

  • Penalize non-convergent loops via loss functions that reward timely termination.
  • Apply entropy regularization to avoid deterministic loops when exploration is needed.
  • Introduce “recursion dropout” where certain steps are skipped during training to build robustness.
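The sketch below combines the three ideas in one illustrative module. The learned halting head, the ACT-style ponder penalty standing in for "reward timely termination", and the step-skipping mask are assumptions layered onto the earlier MicroNetwork sketch, not a specification from the original text.

```python
# A minimal stability sketch: termination penalty, entropy regularization on the
# halting decisions, and recursion dropout during training.
import torch
import torch.nn as nn


class HaltingMicroNetwork(nn.Module):
    """MicroNetwork variant with a learned halting head and recursion dropout."""

    def __init__(self, dim=64, max_steps=8, skip_prob=0.1):
        super().__init__()
        self.step = nn.GRUCell(dim, dim)
        self.halt = nn.Linear(dim, 1)          # predicts "stop now" probability per step
        self.readout = nn.Linear(dim, dim)
        self.max_steps = max_steps
        self.skip_prob = skip_prob

    def forward(self, x):
        h = torch.zeros_like(x)
        halt_probs = []
        for t in range(self.max_steps):
            # Recursion dropout: occasionally skip a step during training so the
            # model cannot rely on one exact unrolled depth.
            if t > 0 and self.training and torch.rand(()) < self.skip_prob:
                continue
            h = self.step(x, h)
            halt_probs.append(torch.sigmoid(self.halt(h)))
        return self.readout(h), torch.cat(halt_probs, dim=-1)


def stability_penalty(halt_probs, ponder_weight=0.01, entropy_weight=0.001):
    """Reward timely termination and keep halting decisions from collapsing."""
    # Rough proxy for how long the network keeps iterating: total "continue" mass.
    ponder_cost = (1.0 - halt_probs).sum(dim=-1).mean()
    # Bernoulli entropy of each halting decision, as an exploration term.
    p = halt_probs.clamp(1e-6, 1 - 1e-6)
    entropy = -(p * p.log() + (1 - p) * (1 - p).log()).sum(dim=-1).mean()
    return ponder_weight * ponder_cost - entropy_weight * entropy


# Usage: out, halts = model(x); loss = task_loss + stability_penalty(halts)
```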