Stable Diffusion's checkpoints were made available for academic research upon request.
In case you missed it, the model was trained on 512x512 images from a subset of the LAION-5B database and uses a frozen CLIP ViT-L/14 text encoder to condition generation on text prompts. With its 860M-parameter UNet and 123M-parameter text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB of VRAM.
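To put the parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. It assumes the weights are stored as 32-bit or 16-bit floats, which the article does not specify, and it covers only the two components quoted above. Activations, framework overhead, and any other model components are not counted, which is why the result sits well below the 10GB figure.

```python
# Parameter counts quoted in the article.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


total_params = UNET_PARAMS + TEXT_ENCODER_PARAMS

# Assumption: 4 bytes per parameter (fp32) or 2 bytes per parameter (fp16).
fp32_gib = weight_memory_gib(total_params, 4)
fp16_gib = weight_memory_gib(total_params, 2)

print(f"total parameters: {total_params:,}")
print(f"fp32 weights: {fp32_gib:.2f} GiB")
print(f"fp16 weights: {fp16_gib:.2f} GiB")
```

Under these assumptions the weights take up only a fraction of the stated VRAM requirement. The rest of the budget would go to intermediate activations during sampling and to whatever else the runtime keeps on the GPU.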
At the moment, Stable Diffusion's checkpoints are only available for academic research purposes upon request. According to the team, this precaution was taken to prevent misuse and harm. In the future, however, the team plans to share a public release "with a more permissive license that also incorporates ethical considerations."