Deep dive into the STARFlow architecture, combining the expressiveness of autoregressive transformers with the efficiency of normalizing flows.
Generative modeling is currently dominated by two families: Autoregressive Models (like GPT) and Diffusion Models (like Stable Diffusion).
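Autoregressive models: great at capturing structure, but slow because they generate one token at a time.
Diffusion models: great at sample quality, but slow because they rely on iterative denoising.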
STARFlow (Scalable Transformer Auto-Regressive Flow) is a hybrid architecture that aims to combine the best of both worlds: the expressiveness of autoregressive transformers with the efficiency of normalizing flows.
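To make the "autoregressive flow" idea concrete, here is a minimal sketch of a generic affine autoregressive flow in NumPy. It is not STARFlow's implementation: the `shift_and_log_scale` function is a hypothetical toy stand-in for the causal transformer, and the function names and constants are assumptions chosen for illustration. The sketch shows the two properties the hybrid relies on: the transform is exactly invertible, and the log-likelihood comes from the change-of-variables formula.

```python
import numpy as np

def shift_and_log_scale(prefix):
    # Toy conditioner (assumption: stands in for a causal transformer).
    # It may only look at the elements before the current position.
    if prefix.size == 0:
        return 0.0, 0.0
    return 0.5 * prefix.mean(), 0.1 * np.tanh(prefix.sum())

def forward(x):
    """Data -> noise. Returns z and log|det(dz/dx)|."""
    z = np.empty_like(x)
    log_det = 0.0
    for i in range(len(x)):
        mu, log_s = shift_and_log_scale(x[:i])  # depends on known data only
        z[i] = (x[i] - mu) * np.exp(-log_s)
        log_det -= log_s  # Jacobian is triangular: sum the log-diagonal
    return z, log_det

def inverse(z):
    """Noise -> data. Sequential: x[i] needs the already generated x[:i]."""
    x = np.empty_like(z)
    for i in range(len(z)):
        mu, log_s = shift_and_log_scale(x[:i])
        x[i] = z[i] * np.exp(log_s) + mu
    return x

def log_likelihood(x):
    """Exact log p(x) under a standard normal prior on z."""
    z, log_det = forward(x)
    log_prior = -0.5 * np.sum(z ** 2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_prior + log_det

x = np.array([0.3, -1.2, 0.8, 2.0])
z, log_det = forward(x)
x_rec = inverse(z)
print(np.allclose(x, x_rec), log_likelihood(x))
```

In `forward`, no iteration depends on the result of another, so a real model can evaluate every position in a single parallel pass, and the loop here is only for readability. In `inverse`, each step needs the values generated before it, which is where the autoregressive cost shows up at sampling time.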