Resources
This document contains all the external resources, links, and references mentioned in the “Exploring Generative AI Models (Part 2)” lecture.
Foundational Papers
- Adding Conditional Control to Text-to-Image Diffusion Models (2023) - Zhang, Rao & Agrawala
- ControlNet paper from Stanford University
- https://arxiv.org/abs/2302.05543
Google Colab Notebooks
All demo notebooks from the presentation:
- Diffusion Text-to-Image
- Diffusion Image-to-Image
- ControlNet Human Pose
Image Generation Models & Tools
Stable Diffusion
- Stability AI: https://stability.ai
- Open-source text-to-image diffusion models
- Timeline: v1.4 (Aug 2022) → v1.5 → v2.0/2.1 → SDXL (Jul 2023) → SD3 Medium (Jun 2024) → v3.5 (Oct 2024)
Midjourney
- Website: https://midjourney.com
- Discord-based image generation service
- Known for exceptional artistic quality
- v5 launched March 2023
FLUX.1
- Black Forest Labs: https://blackforestlabs.ai
- Text-to-image model family released in August 2024; the [dev] and [schnell] variants have openly available weights
DALL-E / Sora (OpenAI)
- DALL-E: Text-to-image generation
- Sora (announced Feb 2024): Text-to-video, clips up to 60 seconds
Imagen 3 (Google DeepMind)
- Photorealistic image generation
- 2024 release
Image Model Platforms & Resources
Replicate
- Website: https://replicate.com
- Cloud platform for running AI models via API
- Extensive library of image generation models
ComfyUI
- Website: https://comfy.org
- Node-based UI for Stable Diffusion workflows
- Powerful tool for complex image generation pipelines
Hugging Face Diffusers
- Docs: https://huggingface.co/docs/diffusers
- Python library for diffusion models
- Easy access to text-to-image, image-to-image, and ControlNet
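A minimal text-to-image sketch with Diffusers, offered as an illustration and not as the lecture's notebook code. The function name `generate_image`, the default model ID, and the step and guidance values are assumptions chosen for the example; any Stable Diffusion checkpoint compatible with `StableDiffusionPipeline` could be substituted.

```python
def generate_image(
    prompt,
    model_id="stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed checkpoint
    steps=30,              # assumed number of denoising steps
    guidance_scale=7.5,    # assumed prompt-adherence strength
    seed=0,
):
    """Generate one image from a text prompt (hypothetical helper)."""
    # Heavy dependencies are imported inside the function so that merely
    # defining it does not require torch or diffusers to be installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Use half precision on a GPU; fall back to full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Downloads the weights on first use, then loads them from the local cache.
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # A seeded generator makes the starting noise reproducible.
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    )
    return result.images[0]  # a PIL image


# Example call (requires torch, diffusers, and a model download):
# generate_image("a watercolor lighthouse at dusk").save("lighthouse.png")
```

Running this on CPU works but is slow; a GPU runtime such as the one Colab provides is the more practical setting.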
Key Concepts & Techniques
ControlNet
- Add-on models for Stable Diffusion providing precise spatial control
- Conditioning types: pose, edges, depth, normal maps, segmentation, scribbles
- Published February 2023 by Stanford researchers
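A hedged sketch of pose-conditioned generation with the Diffusers ControlNet pipeline. It is not taken from the lecture notebook. The helper name `generate_from_pose` and both default model IDs are assumptions, and the input file is assumed to already be an OpenPose-style skeleton image (extracting a pose from a photo is a separate step not shown here).

```python
def generate_from_pose(
    prompt,
    pose_image_path,
    base_model_id="stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed base model
    controlnet_id="lllyasviel/sd-controlnet-openpose",            # assumed pose ControlNet
    steps=30,
):
    """Generate an image that follows the pose in a skeleton image (hypothetical helper)."""
    # Imported lazily so defining the function needs no heavy dependencies.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # The ControlNet is an add-on network loaded next to the base model.
    controlnet = ControlNetModel.from_pretrained(controlnet_id, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet, torch_dtype=dtype
    ).to(device)

    # The conditioning image supplies the spatial layout; the prompt supplies content and style.
    pose_image = load_image(pose_image_path)
    result = pipe(prompt, image=pose_image, num_inference_steps=steps)
    return result.images[0]  # a PIL image
```

Swapping `controlnet_id` for an edge, depth, or scribble ControlNet changes the conditioning type, provided the conditioning image is prepared in the matching format.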
Diffusion Models
- Two-stage process (forward noising, reverse denoising) inspired by non-equilibrium thermodynamics
- Training: Learn to predict noise added to images
- Inference: Start with random noise, iteratively denoise guided by text prompt
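The training and inference steps above can be sketched on a toy four-number "image" using only the Python standard library. Several things here are assumptions made for illustration: the linear noise schedule, the DDIM-style deterministic update, and above all `oracle_noise_predictor`, a hypothetical stand-in for a trained network that is simply told the clean data. Text-prompt guidance is not modeled.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: mix clean data with Gaussian noise in one jump."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
          for x, n in zip(x0, noise)]
    return xt, noise


def noise_prediction_loss(predicted, actual):
    """Training objective: mean squared error between predicted and true noise."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def oracle_noise_predictor(xt, alpha_bar, x0):
    """Hypothetical 'perfect model': solves for the noise because it knows x0."""
    scale = math.sqrt(1.0 - alpha_bar)
    return [(x - math.sqrt(alpha_bar) * c) / scale for x, c in zip(xt, x0)]


def denoise(x_start, alpha_bars, predict_noise):
    """Inference: walk from the noisiest step to the cleanest, one step at a time."""
    x = list(x_start)
    for t in range(len(alpha_bars) - 1, -1, -1):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = predict_noise(x, ab_t)
        # Estimate the clean data implied by the predicted noise.
        x0_hat = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
                  for xi, e in zip(x, eps)]
        # Re-noise that estimate to the slightly less noisy previous step.
        x = [math.sqrt(ab_prev) * c + math.sqrt(1.0 - ab_prev) * e
             for c, e in zip(x0_hat, eps)]
    return x


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=1000)
x0 = [0.5, -0.25, 1.0, 0.0]  # stand-in for image pixels

# Training view: noise the data, then score a prediction against the true noise.
xt, noise = add_noise(x0, alpha_bars[500], rng)
perfect_loss = noise_prediction_loss(
    oracle_noise_predictor(xt, alpha_bars[500], x0), noise)

# Inference view: start from pure random noise and iteratively denoise.
x_T = [rng.gauss(0.0, 1.0) for _ in x0]
sample = denoise(x_T, alpha_bars,
                 lambda x, ab: oracle_noise_predictor(x, ab, x0))
```

Because the oracle is exact, `perfect_loss` is close to zero and `sample` lands back on `x0`. A real model only approximates the noise, and the text prompt steers which image the denoising converges to.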
Model Hubs
- Hugging Face Model Hub: Browse and download thousands of models
- CivitAI: https://civitai.com - Community for Stable Diffusion models
Communities
- r/StableDiffusion: Reddit community for image generation