
Scalable Transformer Flows (STARFlow)

A deep dive into the STARFlow architecture, which combines the expressiveness of autoregressive transformers with the efficiency of normalizing flows.

Level: advanced | Section 2 of 5

The Architecture

STARFlow integrates a Transformer backbone into a Normalizing Flow framework.

What is a Normalizing Flow?#

A Normalizing Flow is a sequence of invertible transformations that maps a simple distribution (like a Gaussian) to a complex data distribution (like images).

Invertible#

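The same learned transformation runs in both directions: Data -> Noise during training (to evaluate the likelihood of real data) and Noise -> Data during generation (by applying the inverse of that transformation). No separate encoder or decoder network is needed.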
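As a minimal sketch of what "invertible" means here, the snippet below uses a single one-dimensional affine transformation. The parameter values `shift` and `log_scale` are hypothetical stand-ins for learned weights, and the function names are chosen for illustration only; STARFlow itself stacks many such transformations over image data.

```python
import math

def forward(x, shift, log_scale):
    """Data -> noise: z = (x - shift) * exp(-log_scale)."""
    return (x - shift) * math.exp(-log_scale)

def inverse(z, shift, log_scale):
    """Noise -> data: x = z * exp(log_scale) + shift."""
    return z * math.exp(log_scale) + shift

# Hypothetical "learned" parameters, fixed by hand for this sketch.
shift, log_scale = 2.0, 0.5

x = 3.7
z = forward(x, shift, log_scale)       # training direction
x_rec = inverse(z, shift, log_scale)   # generation direction

# The round trip recovers the input up to floating-point error.
assert abs(x - x_rec) < 1e-12
```

Both directions share the same two parameters, which is why one set of weights serves training and generation alike.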

Exact Likelihood#

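Unlike GANs, which provide no tractable density, a flow lets you compute the exact probability density (likelihood) of a data point using the change-of-variables formula.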
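To make the likelihood computation concrete: for an invertible map z = f(x) with a simple base density p_Z, the change-of-variables formula gives log p(x) = log p_Z(f(x)) + log |df/dx|. The sketch below applies this to a one-dimensional affine flow with a standard normal base and checks the result against the closed-form Gaussian density it should match. All names and parameter values are illustrative assumptions, not part of STARFlow.

```python
import math

def standard_normal_logpdf(z):
    """Log-density of N(0, 1) at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def flow_log_likelihood(x, shift, log_scale):
    """Exact log p(x) for the affine flow z = (x - shift) * exp(-log_scale)."""
    z = (x - shift) * math.exp(-log_scale)
    log_det = -log_scale  # log |dz/dx| for this affine map
    return standard_normal_logpdf(z) + log_det

def gaussian_logpdf(x, mean, std):
    """Reference: closed-form log-density of N(mean, std^2) at x."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))

# Hypothetical parameters, fixed by hand for this sketch.
shift, log_scale = 2.0, 0.5
x = 3.7

ll_flow = flow_log_likelihood(x, shift, log_scale)
ll_ref = gaussian_logpdf(x, mean=shift, std=math.exp(log_scale))

# The flow likelihood agrees with the Gaussian it implicitly defines.
assert abs(ll_flow - ll_ref) < 1e-12
```

Training a flow amounts to maximizing this log-likelihood over the data, with the log-determinant term accounting for how the transformation stretches or compresses space.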

The STARFlow Innovation#

Standard flows struggle to model long-range dependencies, such as the global structure of an image. STARFlow addresses this by using a Transformer to parameterize the flow's transformations, so each transformation can attend to context from across the image.

Autoregressive Prior#

The model predicts the distribution of the next "patch" of the image based on previous patches.
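Written out, this is the standard autoregressive factorization. The notation is an assumption made for illustration: the image is split into an ordered sequence of patches x_1, ..., x_T, and x_{<t} denotes all patches before position t.

```latex
p(x) = \prod_{t=1}^{T} p\left(x_t \mid x_{<t}\right),
\qquad
\log p(x) = \sum_{t=1}^{T} \log p\left(x_t \mid x_{<t}\right)
```

Each factor conditions only on earlier patches, which is what a causally masked Transformer provides.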

Flow Refinement#

Instead of predicting a pixel value directly, the model predicts the parameters of an invertible flow transformation (for example, the shift and scale of an affine map) that turns a noise sample into that value.
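The two ideas above can be sketched together as a small autoregressive affine flow. This is a toy illustration under stated assumptions, not the STARFlow implementation: `conditioner` is a hypothetical hand-written function standing in for the causal Transformer, and each "patch" is a single number rather than a vector.

```python
import math

def conditioner(prefix):
    """Toy stand-in for a causal Transformer (an assumption of this sketch).

    Maps the previous patches to the (shift, log_scale) of the next
    patch's affine transformation.
    """
    if not prefix:
        return 0.0, 0.0
    mean = sum(prefix) / len(prefix)
    return 0.5 * mean, 0.1 * math.tanh(mean)

def forward(x):
    """Data -> noise. Step t depends only on the known data x[:t]."""
    z, log_det = [], 0.0
    for t in range(len(x)):
        shift, log_scale = conditioner(x[:t])
        z.append((x[t] - shift) * math.exp(-log_scale))
        log_det -= log_scale  # accumulate log |dz_t / dx_t|
    return z, log_det

def inverse(z):
    """Noise -> data. Sequential: x[t] needs the already generated x[:t]."""
    x = []
    for t in range(len(z)):
        shift, log_scale = conditioner(x)
        x.append(z[t] * math.exp(log_scale) + shift)
    return x

def log_likelihood(x):
    """Exact log p(x) under a standard normal base distribution."""
    z, log_det = forward(x)
    log_base = sum(-0.5 * (v * v + math.log(2.0 * math.pi)) for v in z)
    return log_base + log_det

# Scalars for readability; real patches would be vectors.
patches = [0.3, -1.2, 0.8, 2.0]
z, _ = forward(patches)
recovered = inverse(z)

# Generation inverts the training-direction pass exactly (up to rounding).
assert all(abs(a - b) < 1e-9 for a, b in zip(patches, recovered))
```

Note the asymmetry: in the forward direction every step sees only ground-truth data, so a real model can compute all positions in one parallel pass, while the inverse direction has to generate patches one at a time.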
