Resources

Diffusion Models

Stable Diffusion

FLUX Models

ControlNet

Replicate

Depth Estimation

  • Depth Anything - Monocular depth estimation model trained on large-scale unlabeled data
  • MiDaS - Intel’s robust monocular depth estimation model
  • ZoeDepth Paper - “ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth” (2023)

Inpainting and Outpainting

Vision Transformers

  • ViT Paper - “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale” (2020)
  • CLIP Paper - “Learning Transferable Visual Models From Natural Language Supervision” (2021)
  • DINOv2 - Meta’s self-supervised vision transformer

Vision Language Models (VLMs)

Gradio for Image Applications

Prompt Engineering for Image Models

Citations