Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Shenhao Zhu*1, Junming Leo Chen*2, Zuozhuo Dai3, Yinghui Xu2, Xun Cao1, Yao Yao1, Hao Zhu+1, Siyu Zhu+2
1Nanjing University  2Fudan University  3Alibaba Group
*Equal Contribution  +Corresponding Author
head.mp4

Framework

framework

Installation

  • System requirement: Ubuntu 20.04
  • Tested GPUs: A100, RTX 3090

Create conda environment:

  conda create -n champ python=3.10
  conda activate champ

Install packages with pip:

  pip install -r requirements.txt
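
After installing, a quick sanity check (our own one-liner, not part of the repo) confirms that the environment's PyTorch build can see your GPU:

  python -c "import torch; print(torch.__version__, torch.cuda.is_available())"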

Download pretrained models

  1. Download the pretrained weights of the base models: stable-diffusion-v1-5, sd-vae-ft-mse, and the image encoder (the expected files are listed in the layout below).

  2. Download our checkpoints:

Our checkpoints consist of denoising UNet, guidance encoders, Reference UNet, and motion module.

Finally, these pretrained models should be organized as follows:

./pretrained_models/
|-- champ
|   |-- denoising_unet.pth
|   |-- guidance_encoder_depth.pth
|   |-- guidance_encoder_dwpose.pth
|   |-- guidance_encoder_normal.pth
|   |-- guidance_encoder_semantic_map.pth
|   |-- reference_unet.pth
|   `-- motion_module.pth
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
`-- stable-diffusion-v1-5
    |-- feature_extractor
    |   `-- preprocessor_config.json
    |-- model_index.json
    |-- unet
    |   |-- config.json
    |   `-- diffusion_pytorch_model.bin
    `-- v1-inference.yaml
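
Before running inference, a small script like the one below (a sketch of ours, not shipped with the repo) can confirm that every file in this layout is in place:

  # check_models.py - verify the pretrained_models layout described above
  from pathlib import Path

  EXPECTED = [
      "champ/denoising_unet.pth",
      "champ/guidance_encoder_depth.pth",
      "champ/guidance_encoder_dwpose.pth",
      "champ/guidance_encoder_normal.pth",
      "champ/guidance_encoder_semantic_map.pth",
      "champ/reference_unet.pth",
      "champ/motion_module.pth",
      "image_encoder/config.json",
      "image_encoder/pytorch_model.bin",
      "sd-vae-ft-mse/config.json",
      "sd-vae-ft-mse/diffusion_pytorch_model.bin",
      "sd-vae-ft-mse/diffusion_pytorch_model.safetensors",
      "stable-diffusion-v1-5/feature_extractor/preprocessor_config.json",
      "stable-diffusion-v1-5/model_index.json",
      "stable-diffusion-v1-5/unet/config.json",
      "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
      "stable-diffusion-v1-5/v1-inference.yaml",
  ]

  root = Path("./pretrained_models")
  missing = [p for p in EXPECTED if not (root / p).exists()]
  for p in missing:
      print(f"missing: {root / p}")
  print("all pretrained models in place" if not missing else f"{len(missing)} file(s) missing")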

Inference

We have provided several sets of example data for inference. Please first download and place them in the example_data folder.

Here is the command for inference:

  python inference.py --config configs/inference.yaml

Animation results will be saved in the results folder. You can change the reference image or the guidance motion by modifying inference.yaml.
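
If you are unsure which fields to change, a generic way to inspect the config (a sketch using PyYAML; the actual field names are defined by the file itself) is:

  # inspect_config.py - print the top-level fields of the inference config
  import yaml  # PyYAML; `pip install pyyaml` if it is not already installed

  with open("configs/inference.yaml") as f:
      cfg = yaml.safe_load(f)

  # Show each top-level key with a short preview of its value, so you can
  # spot where the reference image and guidance motion paths live.
  for key, value in cfg.items():
      print(f"{key}: {str(value)[:80]}")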

You can also extract the driving motion from any video and then render it with Blender. We will provide the instructions and scripts for this later.

Note: The default motion-01 in inference.yaml has more than 500 frames and takes about 36GB of VRAM. If you encounter VRAM issues, consider switching to other example data with fewer frames.
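
To gauge a sequence's length before committing VRAM to it, something like the following sketch can help (it assumes guidance frames are stored as individual files in subfolders of example_data; adjust the path to your checkout's actual layout):

  # count_frames.py - rough frame count per motion directory
  from pathlib import Path

  motion_root = Path("example_data")  # hypothetical entry point; adjust to the real layout

  for sub in sorted(p for p in motion_root.rglob("*") if p.is_dir()):
      frames = sum(1 for f in sub.iterdir() if f.is_file())
      if frames:
          # Crude linear extrapolation from the note above (500+ frames ~ 36GB).
          print(f"{sub}: {frames} files (~{frames * 36 / 500:.0f}GB VRAM by that rule of thumb)")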

Acknowledgements

We thank the authors of MagicAnimate, Animate Anyone, and AnimateDiff for their excellent work. Our project is built upon Moore-AnimateAnyone, and we are grateful for their open-source contributions.

Roadmap

Visit our roadmap to preview the future of Champ.

Citation

If you find our work useful for your research, please consider citing the paper:

@misc{zhu2024champ,
      title={Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance},
      author={Shenhao Zhu and Junming Leo Chen and Zuozhuo Dai and Yinghui Xu and Xun Cao and Yao Yao and Hao Zhu and Siyu Zhu},
      year={2024},
      eprint={2403.14781},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Opportunities available

Multiple research positions are open at the Generative Vision Lab, Fudan University! These include:

  • Research assistant
  • Postdoctoral researcher
  • PhD candidate
  • Master's students

Interested individuals are encouraged to contact us at siyuzhu@fudan.edu.cn for further information.