this post was submitted on 22 Nov 2023

Machine Learning

What are your thoughts on the DALL-E 3 “paper”, which doesn’t cover technical or architectural details? The only useful takeaways seem to be “higher quality data is better” and “image captioning models that provide a great amount of detail can create good datasets.”

[–] GorillaWithAKeyboard@alien.top 1 points 10 months ago

All these models are built on top of one another, and they cite the previous works they build on. T5 encoder (as in Imagen) + data captioned with GPT-V. An improved SD VAE that they also open-sourced.

I wish they had published their hyperparameters, but alas.

What else did you want to see from the paper?