Even_Adder

joined 1 year ago
[–] Even_Adder@lemmy.dbzer0.com 6 points 6 days ago* (last edited 6 days ago) (2 children)

If you want to mess with OmniGen, it was designed for this kind of thing. The code and model were released a few days ago.

[–] Even_Adder@lemmy.dbzer0.com 4 points 1 week ago (1 children)

You're killing it with these gens.

[–] Even_Adder@lemmy.dbzer0.com 2 points 1 week ago

Here's a video explaining how diffusion models work, and this article by Kit Walsh, a senior staff attorney at the EFF.

[–] Even_Adder@lemmy.dbzer0.com 2 points 1 week ago

Your comment made my day. Thanks.

[–] Even_Adder@lemmy.dbzer0.com 0 points 1 week ago (9 children)

Anyone spreading this misinformation and trying to gatekeep being an artist after the avant-garde movement doesn't have an ounce of education in art history. Generative art, warts and all, is a vital new form of art that's shaking things up, challenging preconceptions, and getting people angry - just like art should.

[–] Even_Adder@lemmy.dbzer0.com 5 points 2 weeks ago

Entertainment.

[–] Even_Adder@lemmy.dbzer0.com 15 points 2 weeks ago

It's not a computer playing; a person plans out the run and then executes the plan with the help of slow motion, save states, and frame-by-frame play. Seeing things that no human could possibly pull off unassisted is entertaining too.

[–] Even_Adder@lemmy.dbzer0.com 4 points 3 weeks ago

Their policy could never stop anyone in the first place.

[–] Even_Adder@lemmy.dbzer0.com 7 points 3 weeks ago

Thanks for digging up that lede.

[–] Even_Adder@lemmy.dbzer0.com 6 points 4 weeks ago

Using copyrighted works without permission isn't illegal and shouldn't be. You should check out this article by Kit Walsh, a senior staff attorney at the EFF, and this open letter by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries.

[–] Even_Adder@lemmy.dbzer0.com 2 points 4 weeks ago

You're able to earn currency. I don't know how much it costs per generation.

 

(diegocr) (2023)

Image Caption: A digital painting of an anthropomorphic giraffe standing in the savanna wearing a cloak. The cloak is blue with gold trim and there is a golden chain holding it together. The background consists of short trees on yellow grass, with a mountain range in the distance. The sky is a bright blue with fluffy clouds.

Full Generation Parameters:

a highly detailed portrait of a humanoid giraffe in a blue cloak,adventurer,professional,unreal engine 5,octane render art by greg rutkowski,loish,rhads,ferdinand knab,makoto shinkai and lois van baarle,ilya k

Steps: 96, Size: 1024x1024, Seed: 775134154, Model: morphxl_v10, Version: v1.6.0-263-g464fbcd9, Sampler: UniPC, CFG scale: 4.5, Model hash: 61e137b575, "add-detail-xl: 9c783c8ce46c"
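
A rough sketch of how these parameters might be reproduced with the diffusers library, assuming the morphxl_v10 checkpoint and the add-detail-xl LoRA are available as local .safetensors files (hypothetical filenames below). The original image came from the AUTOMATIC1111 web UI (v1.6.0-263-g464fbcd9), so prompt weighting and sampler internals won't match exactly:

```python
# Approximate reproduction of the generation parameters above using diffusers.
import torch
from diffusers import StableDiffusionXLPipeline, UniPCMultistepScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "morphxl_v10.safetensors",          # Model: morphxl_v10 (hash 61e137b575), assumed local path
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)  # Sampler: UniPC
pipe.load_lora_weights("add-detail-xl.safetensors")  # add-detail-xl: 9c783c8ce46c, assumed local path

prompt = ("a highly detailed portrait of a humanoid giraffe in a blue cloak, adventurer, "
          "professional, unreal engine 5, octane render art by greg rutkowski, loish, rhads, "
          "ferdinand knab, makoto shinkai and lois van baarle, ilya k")

image = pipe(
    prompt,
    num_inference_steps=96,                                     # Steps: 96
    guidance_scale=4.5,                                         # CFG scale: 4.5
    width=1024, height=1024,                                    # Size: 1024x1024
    generator=torch.Generator("cuda").manual_seed(775134154),   # Seed: 775134154
).images[0]
image.save("giraffe_adventurer.png")
```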

 

Abstract:

Significant advancements have been achieved in the realm of large-scale pre-trained text-to-video Diffusion Models (VDMs). However, previous methods either rely solely on pixel-based VDMs, which come with high computational costs, or on latent-based VDMs, which often struggle with precise text-video alignment. In this paper, we are the first to propose a hybrid model, dubbed Show-1, which marries pixel-based and latent-based VDMs for text-to-video generation. Our model first uses pixel-based VDMs to produce a low-resolution video with strong text-video correlation. After that, we propose a novel expert translation method that employs the latent-based VDMs to further upsample the low-resolution video to high resolution. Compared to latent VDMs, Show-1 can produce high-quality videos with precise text-video alignment; compared to pixel VDMs, Show-1 is much more efficient (GPU memory usage during inference is 15 GB vs. 72 GB). We also validate our model on standard video generation benchmarks. Our code and model weights are publicly available at https://github.com/showlab/Show-1.
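
At a high level, the two-stage design the abstract describes can be summarized with a schematic sketch. The classes below are illustrative stand-ins (random tensors and bilinear upsampling), not the actual Show-1 models from the linked repository; they only show the data flow: a cheap pixel-space stage for text alignment, then a latent-space stage for resolution.

```python
# Schematic sketch of a Show-1-style hybrid pipeline: pixel-space VDM for a
# low-resolution, text-aligned clip, then a latent-space VDM as the "expert
# translator" that upsamples it. All model internals here are placeholders.
import torch
import torch.nn.functional as F

class PixelVDM:
    """Stand-in for the pixel-based stage (low resolution, strong text-video alignment)."""
    def generate(self, prompt: str, frames: int = 16, size: int = 64) -> torch.Tensor:
        # The real model runs iterative diffusion in pixel space conditioned on the prompt;
        # here we just return a random clip with the right shape.
        return torch.rand(frames, 3, size, size)  # (T, C, H, W) low-res video

class LatentVDM:
    """Stand-in for the latent-based super-resolution ("expert translation") stage."""
    def upsample(self, video: torch.Tensor, size: int = 576) -> torch.Tensor:
        # The real model encodes to latents, runs diffusion conditioned on the low-res
        # video, then decodes; a bilinear resize stands in for that step.
        return F.interpolate(video, size=(size, size), mode="bilinear", align_corners=False)

def show1_like_pipeline(prompt: str) -> torch.Tensor:
    low_res = PixelVDM().generate(prompt)     # cheap, text-aligned
    high_res = LatentVDM().upsample(low_res)  # memory-efficient upsampling
    return high_res

video = show1_like_pipeline("a panda surfing a wave at sunset")
print(video.shape)  # torch.Size([16, 3, 576, 576])
```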

 

