
An animated AI scene created from static images, using audio as the primary input.

I like to push the envelope a little. This video was created with Hedra and Invoke AI and edited in DaVinci Resolve.


I really wanted to add some action to it, but so far I haven't had much luck. Mostly just simple things like showing the actors walking in and out of the room or sitting down in a chair. Viggle looked promising, but I had too many quality issues to keep messing with it.
It would have been nice if it had worked out, since it would have made creating actors much simpler: I would have only needed a full-body 2D image of each one.
I did some testing using FreeMoCap with MetaHumans in UE5, and I think that may partially work for me in the future once better facial tracking becomes available that doesn't require an iPhone. For now, Hedra and Invoke let me make moving comics like this. I'm still trying to develop something better for removing the backgrounds from the Hedra videos; they have a lot of color spill, which makes chroma keying difficult.
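For anyone curious, the basic despill idea can be sketched outside of Resolve too. This is just a rough OpenCV illustration of the usual green-clamp trick, not my actual pipeline, and the file names are placeholders:

```python
# Minimal green-despill sketch (assumption: frames already extracted as PNGs).
# This clamps the green channel to the average of red and blue wherever green
# dominates, which suppresses green spill on edges before keying.
import cv2
import numpy as np

def despill_green(frame: np.ndarray) -> np.ndarray:
    b, g, r = cv2.split(frame.astype(np.float32))  # OpenCV loads BGR
    limit = (r + b) / 2.0
    g = np.minimum(g, limit)  # only reduce green, never boost it
    return cv2.merge([b, g, r]).clip(0, 255).astype(np.uint8)

frame = cv2.imread("hedra_frame_0001.png")  # placeholder file name
cv2.imwrite("hedra_frame_0001_despilled.png", despill_green(frame))
```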
 
Looks pretty good!

You might want to hit the despill a little harder in your Resolve key settings. I've also had good luck with a setting of about -50 on In/Out under the mask refinement controls.

Action and interaction are the biggest obstacles right now, and the way I've solved it is by building a significant part of the pipeline as video-to-video rather than text-to-video or image-to-video. It all depends on how much time and effort you want to spend getting it to work (and ten other external factors). I've got it fully operational at a lower level of quality right now, and as of this month, several years of work in, I'm just one inch away from being able to animate any interaction or action at broadcast quality.

Looking forward to seeing more of your videos!
 
Thanks, the background removal from the Hedra videos was really challenging. I tried using a blue-and-green checker background when making the videos, since any solid color always ended up with random artifacts that were giving me problems. I actually maxed out the despill to get it to this point. I later figured out that DaVinci has an AI rotoscoping tool, which made the removal much easier. I think if I had gone back and recreated the Hedra videos using a different background, it would have turned out much better. My first approach was the chroma key. I even tried training my own background removal model using frames extracted from some of the videos, but it did much worse.

I don't think the expressions from Hedra are too bad, but some of them are a bit repetitive, and the eyes aren't locked together, so the characters ended up with a case of lazy eye.

Since I uploaded the video I've gone back and messed around with it some more. I tried upscaling it in DaVinci, and that helped a bit. I also tried using FaceFusion to help restore some of the details from the original images; it helps some, but it's still not great. The characters don't have to be super realistic, but I would like them to be a little sharper. I could then add a lens blur effect to the background to create a bit more depth.
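For what it's worth, instead of training a model from scratch, off-the-shelf matting libraries can do the per-frame removal without any training. Here's a rough sketch using rembg, which I haven't actually tried on these videos, and the folder names are placeholders:

```python
# Hypothetical sketch: per-frame background matting with the rembg library
# as an alternative to chroma keying or training a custom model.
from pathlib import Path
from PIL import Image
from rembg import remove

src = Path("frames")          # placeholder: frames extracted from the video
dst = Path("frames_matted")
dst.mkdir(exist_ok=True)

for frame_path in sorted(src.glob("*.png")):
    frame = Image.open(frame_path)
    matted = remove(frame)    # returns an RGBA image with the background removed
    matted.save(dst / frame_path.name)
```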
 
Magic Mask in Resolve is pretty good most of the time. I've found that it takes a lot less marking than you would expect; often just a straight line from face to leg is enough for it to get a good key. Despill at 100, In/Out at -50, and blur at 5 to 7 usually gives a pretty good result.

The thing I do to get seamless composites is to refabricate the frames with very mild stylization after the composite: hand the AI layer the composited frame and tell it to draw it again with only the slightest variance. The AI just doesn't draw the compositing errors when it recreates the frame, and suddenly it's flawless. I call this the "polish" layer.
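As a rough illustration of what that polish pass looks like in code, here's a minimal img2img sketch with diffusers. The model, prompt, and strength are just placeholders, not my exact setup; the key idea is a very low strength so the model redraws the composited frame almost verbatim:

```python
# Minimal "polish" pass sketch: redraw a composited frame with img2img at
# very low strength so compositing seams get painted over but the content
# barely changes. All names below are assumptions, not the actual pipeline.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint; any SD model works
    torch_dtype=torch.float16,
).to("cuda")

frame = Image.open("composited_frame.png").convert("RGB")  # placeholder path
polished = pipe(
    prompt="cinematic still, seamless lighting",  # hypothetical prompt
    image=frame,
    strength=0.15,        # low strength = only the slightest variance
    guidance_scale=5.0,
).images[0]
polished.save("polished_frame.png")
```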
 