OpenAI DALL·E: Creating Images from Text (Blog Post Explained)

First published at 19:19 UTC on January 6th, 2021.

#openai #science #gpt3

OpenAI's newest model, DALL·E, shows amazing abilities in generating high-quality images from arbitrary text descriptions. As with GPT-3, the range of applications and the diversity of outputs are astonishing, given that this is a single model trained on a purely autoregressive task. This model is a significant step towards the combination of text and images in future AI applications.

OUTLINE:
0:00 - Introduction
2:45 - Overview
4:20 - Dataset
5:35 - Comparison to GPT-3
7:00 - Model Architecture
13:20 - VQ-VAE
21:00 - Combining VQ-VAE with GPT-3
27:30 - Pre-Training with Relaxation
32:15 - Experimental Results
33:00 - My Hypothesis about DALL·E's inner workings
36:15 - Sparse Attention Patterns
38:00 - DALL·E can't count
39:35 - DALL·E can't do global ordering
40:10 - DALL·E renders different views
41:10 - DALL·E is very good at texture
41:40 - DALL·E can complete a bust
43:30 - DALL·E can do some reflections, but not others
44:15 - DALL·E can do cross-sections of some objects
45:50 - DALL·E is amazing at style
46:30 - DALL·E can generate logos
47:40 - DALL·E can generate bedrooms
48:35 - DALL·E can combine unusual concepts
49:25 - DALL·E can generate illustrations
50:15 - DALL·E sometimes understands complicated prompts
50:55 - DALL·E can pass part of an IQ test
51:40 - DALL·E probably does not have geographical / temporal knowledge
53:10 - Reranking dramatically improves quality
53:50 - Conclusions & Comments

Blog: https://openai.com/blog/dall-e/

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Category: Science & Technology
Sensitivity: Normal (content that is suitable for ages 16 and over)
