DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
🔆 Introduction
🤗 Welcome to DynamiCrafter, which brings open-domain still images to life from text prompts using pre-trained video diffusion priors. Make sure to check out our project page and paper for more details.
😀 Stay tuned for updates as we continue to improve the model, including higher resolution, watermark removal, and better overall stability.
1. Showcases
Sample prompts (see the project page for the corresponding animated results):
- “bear playing guitar happily, snowing”
- “boy walking on the street”
- “two people dancing”
- “girl talking and blinking”
- “zoom-in, a landscape, springtime”
- “A blonde woman rides on top of a moving washing machine into the sunset.”
- “explode colorful smoke coming out”
- “a bird on the tree branch”
2. Applications
2.1 Storytelling video generation (see project page for more details)
2.2 Looping video generation
2.3 Generative frame interpolation: given an input starting frame and an input ending frame, the model generates the in-between video.
📝 Changelog
- [2023.12.02]: 🔥🔥 Launch the local Gradio demo.
- [2023.11.29]: 🔥🔥 Release the main model at a resolution of 256×256.
- [2023.11.27]: 🔥🔥 Launch the project page and update the arXiv preprint.
🧰 Models
| Model | Resolution | Checkpoint |
| --- | --- | --- |
| DynamiCrafter256 | 256×256 | Hugging Face |
Animating one image takes approximately 10 seconds and requires a peak GPU memory of 20 GB on a single NVIDIA A100 (40 GB) GPU.
⚙️ Setup
Install Environment via Anaconda (Recommended)
    conda create -n dynamicrafter python=3.8.5
    conda activate dynamicrafter
    pip install -r requirements.txt
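Since the environment above pins Python 3.8.5, a quick interpreter check can catch an unactivated conda environment before installing anything. This is an illustrative helper, not part of the repository; `min_version` is a hypothetical parameter:

```python
import sys

def interpreter_ok(min_version=(3, 8)):
    """Return True if the running interpreter meets the minimum version.

    Illustrative sketch: the repo pins Python 3.8.5, so (3, 8) is used
    as the floor here.
    """
    return sys.version_info[:2] >= min_version
```

If this returns False, the `dynamicrafter` environment is likely not the active one.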
💫 Inference
1. Command line
- Download the pretrained model via Hugging Face and place it at checkpoints/dynamicrafter_256_v1/model.ckpt.
- Run the appropriate command in your terminal, depending on your device and needs:
    # Run on a single GPU:
    sh scripts/run.sh
    # Run on multiple GPUs for parallel inference:
    sh scripts/run_mp.sh
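Before launching either script, it can help to confirm the checkpoint from step 1 actually landed at the expected path. This is a hypothetical helper sketch, not part of the repository; only the path itself comes from the instructions above:

```python
import os

# Path the download step above tells you to use.
CKPT_PATH = "checkpoints/dynamicrafter_256_v1/model.ckpt"

def checkpoint_ready(path=CKPT_PATH):
    """Return True if the checkpoint file exists and is non-empty,
    catching a misplaced or partially downloaded model.ckpt."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

A False result usually means the file was saved under a different directory name or the download was interrupted.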
2. Local Gradio demo
- Download the pretrained model and place it in the corresponding directory as described above.
- Launch the Gradio demo from your terminal.
👨‍👩‍👧‍👦 Crafter Family
VideoCrafter1: Framework for high-quality video generation.
ScaleCrafter: Tuning-free method for high-resolution image/video generation.
TaleCrafter: An interactive story visualization tool that supports multiple characters.
LongerCrafter: Tuning-free method for longer high-quality video generation.
MakeYourVideo, might be a Crafter:): Video generation/editing with textual and structural guidance.
StyleCrafter: Stylized-image-guided text-to-image and text-to-video generation.
😉 Citation
@article{xing2023dynamicrafter,
title={DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors},
author={Xing, Jinbo and Xia, Menghan and Zhang, Yong and Chen, Haoxin and Yu, Wangbo and Liu, Hanyuan and Wang, Xintao and Wong, Tien-Tsin and Shan, Ying},
journal={arXiv preprint arXiv:2310.12190},
year={2023}
}
🙏 Acknowledgements
We would like to thank AK (@_akhaliq) for helping to set up the Hugging Face online demo, and camenduru for providing the Replicate and Colab online demos.
📢 Disclaimer
We develop this repository for RESEARCH purposes,