ControlNet Application
Task: controlnet_animation
It is difficult to avoid frame-to-frame flickering when using Stable Diffusion to generate a video frame by frame. Here we reproduce a method that effectively reduces this flickering: combining ControlNet with multi-frame rendering. ControlNet is a neural network structure that controls diffusion models by adding extra conditions, and multi-frame rendering is a community method for reducing flicker. Specifically, we use ControlNet with the HED condition together with Stable Diffusion img2img for multi-frame rendering.
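Conceptually, multi-frame rendering stylizes the first frame on its own and then renders every later frame together with the previously stylized frame, so the style stays consistent over time. The sketch below only illustrates this loop; `stylize_frame` and the file name `input.mp4` are hypothetical placeholders, not part of the MMagic API.

```python
# A minimal, illustrative sketch of multi-frame rendering (not the actual
# MMagic implementation). `stylize_frame` is a hypothetical placeholder for
# a HED-conditioned ControlNet + Stable Diffusion img2img call.
import cv2


def stylize_frame(frame, reference=None):
    """Placeholder: run ControlNet (HED condition) + SD img2img on `frame`.

    `reference` is the previously stylized frame; reusing it as guidance is
    what keeps the style consistent across frames and reduces flicker.
    """
    return frame  # stub for illustration only


cap = cv2.VideoCapture('input.mp4')
prev_stylized = None
stylized_frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # The first frame is stylized on its own; every later frame is rendered
    # together with the previous stylized result.
    prev_stylized = stylize_frame(frame, reference=prev_stylized)
    stylized_frames.append(prev_stylized)
cap.release()
```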
Prompt key words: a handsome man, silver hair, smiling, play basketball

[video: caixukun_dancing_begin_fps10_frames_cat.mp4]

Prompt key words: a handsome man

[video: zhou_woyangni_fps10_frames_resized_cat.mp4]

Change the prompt to get a different result.

Prompt key words: a girl, black hair, white pants, smiling, play basketball

[video: caixukun_dancing_begin_fps10_frames_girl2.mp4]
We use the pretrained model from Hugging Face.

| Model             | Dataset | Download               |
| ----------------- | ------- | ---------------------- |
| anythingv3 config | -       | stable diffusion model |
There are two ways to try ControlNet animation.

1. Use the Python API. Running the following code, you will get a generated animation video.
```python
from mmagic.apis import MMagicInferencer

# Create a MMagicInferencer instance for the controlnet_animation model
editor = MMagicInferencer(model_name='controlnet_animation')

prompt = 'a girl, black hair, T-shirt, smoking, best quality, extremely detailed'
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, ' + \
                  'extra digit, fewer digits, cropped, worst quality, low quality'

# You can download the example video with this link:
# https://user-images.githubusercontent.com/12782558/227418400-80ad9123-7f8e-4c1a-8e19-0892ebad2a4f.mp4
video = '/path/to/your/input/video.mp4'
save_path = '/path/to/your/output/video.mp4'

# Do the inference to get the result
editor.infer(video=video, prompt=prompt, negative_prompt=negative_prompt, save_path=save_path)
```
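If you do not have a clip at hand, the example video linked in the comment above can be fetched with the Python standard library before running the snippet; the local file path below is just a placeholder.

```python
import urllib.request

# Download the example clip referenced in the comment above
url = ('https://user-images.githubusercontent.com/12782558/'
       '227418400-80ad9123-7f8e-4c1a-8e19-0892ebad2a4f.mp4')
urllib.request.urlretrieve(url, '/path/to/your/input/video.mp4')
```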
2. Run the Gradio demo:

```shell
python demo/gradio_controlnet_animation.py
```
We also provide a demo to play ControlNet animation with SAM. For details, please see OpenMMLab PlayGround.
```bibtex
@misc{zhang2023adding,
    title={Adding Conditional Control to Text-to-Image Diffusion Models},
    author={Lvmin Zhang and Maneesh Agrawala},
    year={2023},
    eprint={2302.05543},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```