The Coupling Within: Flow Matching via Distilled Normalizing Flows
Flow models have rapidly become the go-to method for training and deploying large-scale generators, owing their success to inference-time flexibility via adjustable integration steps. A crucial ingredient in flow training is the choice of coupling measure for sampling noise/data pairs that define the flow matching (FM) regression loss. While FM training usually defaults to independent coupling, recent works show that adaptive couplings informed by noise/data distributions (e.g., via optimal transport, OT) improve both model training and inference. We radicalize this insight by shifting the paradigm: rather than computing adaptive couplings directly, we use distilled couplings from a different, pretrained model capable of placing noise and data spaces in bijection -- a property intrinsic to normalizing flows (NF) through their maximum likelihood and invertibility requirements. Leveraging recent advances in NF image generation via auto-regressive (AR) blocks, we propose Normalized Flow Matching (NFM), a new method that distills the quasi-deterministic coupling of pretrained NF models to train student flow models. These students achieve the best of both worlds: significantly outperforming flow models trained with independent or even OT couplings, while also improving on the teacher AR-NF model.
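
To make the role of the coupling concrete, here is a minimal sketch of how noise/data pairs turn into FM regression targets. It assumes the common linear interpolant x_t = (1 - t) x0 + t x1 with velocity target x1 - x0, which the abstract does not specify. The function names and the 1-D toy data are illustrative and not taken from the paper. In one dimension with squared cost, sorting both sample sets gives the exact OT matching, so the two couplings can be compared without an OT solver.

```python
import numpy as np

def fm_pairs_to_targets(x0, x1, t):
    """Linear interpolant x_t and its velocity target for paired samples.

    Assumption: the standard straight-line FM path; other paths are possible.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

def independent_coupling(noise, data, rng):
    """Pair each noise sample with a uniformly shuffled data sample."""
    perm = rng.permutation(len(data))
    return noise, data[perm]

def ot_coupling_1d(noise, data):
    """Exact OT matching for 1-D samples under squared cost: sort both sides."""
    return np.sort(noise), np.sort(data)

rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
data = 3.0 + 0.5 * rng.standard_normal(512)  # toy 1-D stand-in for real data
t = rng.uniform(size=512)

x0_ind, x1_ind = independent_coupling(noise, data, rng)
x0_ot, x1_ot = ot_coupling_1d(noise, data)

x_t_ind, v_ind = fm_pairs_to_targets(x0_ind, x1_ind, t)
x_t_ot, v_ot = fm_pairs_to_targets(x0_ot, x1_ot, t)

# Mean squared displacement of the pairs: the quantity an OT coupling minimizes.
cost_ind = np.mean(v_ind ** 2)
cost_ot = np.mean(v_ot ** 2)
print(f"independent: {cost_ind:.3f}  OT: {cost_ot:.3f}")
```

A velocity network v(x_t, t) would be regressed onto `v_target` with a squared error. The coupling only changes which pairs are fed in, and with them how far the regression targets have to move each noise sample.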

Overview figure from the paper — see the linked paper for full details.
A paradigm shift: rather than computing adaptive couplings directly, distill couplings from a pretrained model that places noise and data in bijection — a property intrinsic to normalizing flows.
Normalized Flow Matching (NFM): distilling the quasi-deterministic coupling of pretrained autoregressive NF models into student flow models.
Students that beat flow models trained with independent or optimal-transport couplings, while also improving on the teacher AR-NF model.
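
The idea in the points above can be illustrated with a toy sketch in which an invertible teacher supplies the noise/data pairs used to train a student. Everything here is an assumption made for illustration: the affine map stands in for a pretrained AR-NF, the pairs are built by encoding data with the teacher (the paper may instead decode sampled noise, or add randomness, as "quasi-deterministic" suggests), and the linear interpolant is the usual FM default rather than a confirmed detail of NFM.

```python
import numpy as np

# Hypothetical stand-in for a pretrained normalizing flow: an invertible
# affine map between data space and noise space.
MU, SIGMA = 3.0, 0.5

def teacher_encode(x):
    """Data -> noise."""
    return (x - MU) / SIGMA

def teacher_decode(z):
    """Noise -> data (exact inverse of teacher_encode)."""
    return MU + SIGMA * z

def distilled_pairs(data):
    """Pair every data point with the noise the teacher maps it to."""
    return teacher_encode(data), data

def student_fm_batch(data, rng):
    """Build one FM training batch from teacher-distilled pairs."""
    x0, x1 = distilled_pairs(data)
    t = rng.uniform(size=data.shape)
    x_t = (1.0 - t) * x0 + t * x1   # assumed linear interpolant
    v_target = x1 - x0              # regression target for the student
    return x_t, t, v_target

rng = np.random.default_rng(0)
data = teacher_decode(rng.standard_normal(256))  # toy samples from the teacher
noise, _ = distilled_pairs(data)
x_t, t, v_target = student_fm_batch(data, rng)
```

In this toy setting each data point is tied to exactly one noise sample through the bijection, so no coupling has to be computed on the fly; the student simply regresses onto `v_target` for pairs the teacher already provides.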
@article{berthelot2026coupling,
title = {The Coupling Within: Flow Matching via Distilled Normalizing Flows},
author = {Berthelot, David and Chen, Tianrong and Gu, Jiatao and Cuturi, Marco and Dinh, Laurent and Chandna, Bhavik and Klein, Michal and Susskind, Joshua and Zhai, Shuangfei},
journal = {arXiv preprint arXiv:2603.09014},
year = {2026}
}