The Age of Digital Da Vinci: All About Image Generation
A journey through the evolution of image generation technologies, from encoder-decoders to GANs and the revolutionary Stable Diffusion models.
Overview
AI-driven image generation has advanced rapidly in recent years, moving from basic image manipulation to photorealistic imagery created from text descriptions. This talk traces the evolution of image generation techniques, from early Encoder-Decoder architectures through Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to the latest breakthroughs with Diffusion Models.
Talk Details
- Duration: 45 minutes (including Q&A)
- Expertise Level: Intermediate
- Ideal For: AI Conferences, Tech Meetups, Creative Technology Events
- Audience Size: Any
- Previously Presented: DevFest 2023 (November 2023)
- Resources: Slides | Code
Talk Content
1. Evolution of Image Generation Models (10 minutes)
- Early encoder-decoder architectures and their limitations
- Introduction of Variational Autoencoders (VAEs) and the concept of latent spaces
- The GAN revolution, started by Ian Goodfellow and colleagues in 2014
- Progressive GANs and StyleGANs that improved image quality
- The emergence of diffusion-based models such as DALL-E 2 and Stable Diffusion (the original DALL-E was an autoregressive transformer, not a diffusion model)
2. Understanding Diffusion Models (15 minutes)
- The forward diffusion process: gradually adding noise to images
- The reverse denoising process: learning to remove noise step by step
- Latent diffusion models and their computational efficiency
- The role of text conditioning with CLIP embeddings
3. Stable Diffusion Deep Dive (10 minutes)
- Architecture breakdown: UNet, VAE, text encoder components
- Training methodology and dataset considerations
- The importance of prompt engineering
- Techniques like inpainting, outpainting, and img2img transformations
4. Real-World Applications (5 minutes)
- Content creation for digital media and advertising
- Product visualization in e-commerce
- Assisting artists and designers in ideation
- Medical imaging and scientific visualization
5. Q&A (5 minutes)
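To give a flavor of the material in section 2, the forward diffusion process ("gradually adding noise to images") can be sketched in a few lines of plain Python. This is a minimal illustration and not code from the talk: the function names are hypothetical, and the linear schedule with 1000 steps and variances from 1e-4 to 0.02 is assumed from the original DDPM setup rather than taken from Stable Diffusion's own configuration. The toy "image" is just a list of four pixel values.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, evenly spaced from beta_start to beta_end.

    The default values are an assumption borrowed from the DDPM setup.
    """
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) over steps 0..t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Jump straight from the clean image x_0 to the noisy image x_t.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    Returns both x_t and the noise that was mixed in.
    """
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * e
        for x, e in zip(pixels, noise)
    ]
    return noisy, noise


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    image = [0.5, -0.25, 1.0, 0.0]  # toy "image": pixel values in [-1, 1]
    for t in (0, 499, 999):
        noisy, _ = add_noise(image, t, alpha_bars, rng)
        print(t, round(alpha_bars[t], 6), [round(v, 3) for v in noisy])
```

At small t the output stays close to the original pixels; by the last step almost nothing of the image remains. The noise returned by `add_noise` is what a denoising network is typically trained to predict, which is the starting point for the reverse process covered in the same section.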
Target Audience
This talk is suitable for ML/AI practitioners, researchers, artists, and technologists interested in generative AI. While some technical concepts are covered, the presentation is designed to be accessible to anyone with a basic understanding of machine learning.
Prerequisites
- Basic understanding of machine learning concepts
- Familiarity with neural networks is helpful but not required
- Interest in generative AI and computer vision
Key Takeaways
Attendees will learn:
- The evolution and current state of image generation technologies
- How diffusion models work and why they've revolutionized the field
- The architecture of Stable Diffusion and similar models
- Practical applications of image generation in various industries
- Ethical considerations and limitations of current technologies
Customization Options
This talk can be tailored for different audiences and time constraints:
- More technical deep dive for ML engineers and researchers
- More demo-focused for creative audiences
- Extended workshop format with hands-on prompt engineering
- Focus on specific applications (art, product design, medical, etc.)
Technical Requirements
- Standard projector setup
- Internet connection for live demos
- Audio capability for video demonstrations
About the Speaker
Abhik Sarkar is an AI researcher and engineer with extensive experience in computer vision and generative AI. He has worked on implementing and fine-tuning various image generation models and enjoys making these complex technologies accessible to diverse audiences.
Audience Feedback
“Abhik's talk on this topic was enlightening and practical. The audience was engaged throughout and left with actionable insights they could apply immediately.”
Interested in booking this talk?
I'd love to bring this topic to your event! Get in touch to discuss logistics, timing, and any specific areas you'd like me to focus on.

