Scalable Diffusion Models with Transformers (DiT)

This repository contains:
🪐 A simple PyTorch implementation of DiT
⚡️ Pre-trained class-conditional DiT models trained on ImageNet (512×512 and 256×256)
💥 A self-contained Hugging Face Space and Colab notebook for running pre-trained DiT-XL/2 models
🛸 A DiT training script using PyTorch DDP
[2212.09748] Scalable Diffusion Models with Transformers
In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512×512 and 256×256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.