
Scalable Transformer Flows (STARFlow)

Deep dive into the STARFlow architecture, combining the expressiveness of autoregressive transformers with the efficiency of normalizing flows.

Level: advanced (section 1 of 5)

Introduction

Generative modeling is currently dominated by two families: Autoregressive Models (like GPT) and Diffusion Models (like Stable Diffusion).

Autoregressive

Strong at modeling structure, but slow to sample: output is generated one token at a time.

Diffusion

Strong on sample quality, but slow to sample: each output requires many iterative denoising steps.

STARFlow (Scalable Transformer Auto-Regressive Flow) is a hybrid architecture that aims to combine the expressiveness of autoregressive transformers with the efficiency of normalizing flows.
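To make the normalizing-flow half of this hybrid concrete, here is a minimal NumPy sketch of a generic affine autoregressive flow. It is not STARFlow's implementation: the `conditioner` function is a hypothetical toy stand-in for a transformer, and its constants are arbitrary. The sketch only illustrates two general properties of such flows: the map from data to latent is exactly invertible, and the log-likelihood is available in closed form through the change-of-variables formula.

```python
import numpy as np

def conditioner(ctx):
    # Hypothetical toy conditioner (assumption): a real model would use a
    # causal transformer here. Returns a shift and a bounded log-scale.
    mu = 0.5 * ctx
    log_scale = np.tanh(0.3 * ctx)
    return mu, log_scale

def forward(x):
    # Data -> latent. All of x is known, so every dimension is transformed
    # in one vectorized pass. ctx[i] depends only on x[:i].
    ctx = np.concatenate([[0.0], np.cumsum(x)[:-1]])
    mu, log_scale = conditioner(ctx)
    z = (x - mu) * np.exp(-log_scale)
    log_det = -log_scale.sum()  # log |det dz/dx| of a triangular Jacobian
    return z, log_det

def inverse(z):
    # Latent -> data. x[i] needs x[:i], so this direction is sequential.
    x = np.zeros_like(z)
    ctx = 0.0
    for i in range(len(z)):
        mu, log_scale = conditioner(ctx)
        x[i] = z[i] * np.exp(log_scale) + mu
        ctx += x[i]
    return x

def log_prob(x):
    # Exact log-likelihood: standard normal base density plus log-determinant.
    z, log_det = forward(x)
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_base + log_det

rng = np.random.default_rng(0)
z = rng.standard_normal(6)
x = inverse(z)                 # sample: sequential
z_rec, _ = forward(x)          # encode: single pass
print(np.allclose(z, z_rec))   # round trip recovers the latent
print(log_prob(x))
```

In this toy setup, evaluating the likelihood takes a single pass over the input, while sampling still proceeds one dimension at a time, which is the same trade-off the autoregressive family above faces.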
