Realistic Dynamic Facial Textures from a Single Image using GANs (ICCV 2017)

Abstract

  • Purpose

  1. We present a novel method to realistically puppeteer and animate a face from a single RGB image using a source video sequence.
  • Procedures

  1. Fit a multilinear PCA model to obtain the 3D geometry and a single texture of the target face.
  2. Generate dynamic per-frame textures that capture subtle wrinkles and deformations corresponding to the animated facial expressions.
  • Problems

  1. Dynamic textures cannot be obtained directly from a single image.
  2. It is not possible to obtain actual images of the mouth interior.
  • Solution

  1. A deep generative network that can infer realistic per-frame texture deformations of the target identity.
  • Conclusion

By retargeting the PCA expression geometry from the source, as well as using the newly inferred texture, we can both animate the face and perform video face replacement on the source video using the target appearance.

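The goal of this paper is to transplant a single static 2D face image onto a video of another, moving face. First, a PCA model is used to build a 3D model of the face; second, dynamic details of the face such as wrinkles are captured. Because dynamic skin features cannot be obtained directly from a single image, and features such as the teeth inside the mouth are almost entirely absent, the paper introduces a conditional GAN to generate these missing features.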

1. Introduction

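Prior work mainly covers video rewriting, face replacement, and real-time video reenactment. Their limitation is that they all require processed high-definition video as input. The method in this paper instead generates a facial expression video from a single image.
One option is to fit a 3D model to the face, but that loses details such as wrinkles. The paper aims to make these facial details change with the expression, appearing when they should and disappearing when they should not. It also aims to generate the details of the mouth interior.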

2. Related Work

2.1 Facial retargeting and Enactment

The best current methods use two or more images to generate the result.

2.2 Capturing and Retargeting Photorealistic Mouth Interior

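This part is interesting: some work reconstructs the shape of the mouth from audio. The most recent approach restores the mouth interior by searching for the closest-matching mouth model.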

2.3 Deep Generative Model for Texture Synthesis

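Earlier work used Markov networks to generate high-resolution faces, but the results had artifacts. Statistical models have been used to generate wrinkles, but not at high resolution. Deep learning frameworks produce high-resolution output but fail to generate wrinkles.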

3. Overview

Our pipeline consists of the following steps (illustrated in Fig. 1):

[Figure 1: overview]
  1. Fit a 3D model to extract static albedo textures from each frame in the source video sequence and the single RGB target image (Section 4).
  2. Infer dynamic textures and retarget the per-frame texture expressions from the source video frames onto the target image texture using a generative adversarial framework (Section 5).
  3. Composite the target mesh with the generated dynamic textures into each frame in the source video (Section 6).

4. Fitting the Face Model

PCA modeling
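
To make the idea of fitting and retargeting with a PCA face model concrete, here is a minimal sketch. It assumes a simple additive linear model (mean shape plus identity and expression bases) rather than the multilinear model the paper uses, and it fits directly to 3D vertex positions instead of to an image. All names, sizes, and the random bases are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 50 vertices, 5 identity and 4 expression components.
n_vertices, n_id, n_exp = 50, 5, 4
mean_shape = rng.normal(size=3 * n_vertices)
id_basis = rng.normal(size=(3 * n_vertices, n_id))
exp_basis = rng.normal(size=(3 * n_vertices, n_exp))


def synthesize(alpha, beta):
    """Linear face model: mean shape plus identity and expression offsets."""
    return mean_shape + id_basis @ alpha + exp_basis @ beta


def fit(observed, reg=1e-6):
    """Recover (alpha, beta) from an observed shape by ridge-regularized least squares."""
    basis = np.hstack([id_basis, exp_basis])
    lhs = basis.T @ basis + reg * np.eye(n_id + n_exp)
    rhs = basis.T @ (observed - mean_shape)
    coeffs = np.linalg.solve(lhs, rhs)
    return coeffs[:n_id], coeffs[n_id:]


# Synthetic "target" (neutral face) and "source" (another identity, with expression).
true_target_alpha = rng.normal(size=n_id)
true_source_beta = rng.normal(size=n_exp)
target_shape = synthesize(true_target_alpha, np.zeros(n_exp))
source_shape = synthesize(rng.normal(size=n_id), true_source_beta)

target_alpha, _ = fit(target_shape)
_, source_beta = fit(source_shape)

# Retargeting: keep the target identity, borrow the source expression.
retargeted = synthesize(target_alpha, source_beta)
```

The last line is the retargeting step in miniature: identity coefficients come from the target, expression coefficients from the source. A real fit would also estimate camera pose and work from 2D image evidence, which this sketch leaves out.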

5. Dynamic Texture Synthesis

Seeing this figure feels like running into someone from my hometown.

5.1 Deep Learning Framework

5.2 Loss Function
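
For orientation, the standard conditional GAN objective is written out below. This is the generic formulation only; the paper's own loss may add terms or weight them differently. The symbols are assumptions made for illustration: x is the conditioning input (for example, the target's neutral texture together with the source expression), y is a real expression texture, z is noise, G is the generator, and D is the discriminator.

```latex
\min_G \max_D \;
\mathbb{E}_{x,y}\big[\log D(x, y)\big]
+ \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]
```

The discriminator sees the condition x together with either a real or a generated texture, so the generator is pushed to produce output that is not only realistic but also consistent with the condition.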

5.3 Network Architecture

wiki: UV mapping is the 3D modeling process of projecting a 2D image to a 3D model's surface for texture mapping.
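
As a small illustration of what UV mapping means in practice, the sketch below looks up a color for each mesh vertex by bilinearly sampling a 2D texture at that vertex's UV coordinate. The function name, the axis convention (u left to right, v top to bottom), and the tiny texture are assumptions for the example, not something taken from the paper.

```python
import numpy as np


def sample_texture(texture, uv):
    """Bilinearly sample an (H, W, C) texture at (N, 2) UV coordinates in [0, 1]."""
    h, w = texture.shape[:2]
    # Assumed convention: u runs left to right, v runs top to bottom.
    x = np.clip(uv[:, 0], 0.0, 1.0) * (w - 1)
    y = np.clip(uv[:, 1], 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    # Interpolate along x on the two neighbouring rows, then along y.
    top = texture[y0, x0] * (1 - fx) + texture[y0, x1] * fx
    bottom = texture[y1, x0] * (1 - fx) + texture[y1, x1] * fx
    return top * (1 - fy) + bottom * fy


# A 2x2 single-channel texture and three per-vertex UV coordinates.
texture = np.array([[[0.0], [1.0]],
                    [[2.0], [3.0]]])
uv = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
colors = sample_texture(texture, uv)  # values 0.0, 3.0, 1.5
```

Working in UV space is convenient for a texture-generating network because every face is laid out on the same 2D grid regardless of head pose.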

5.4 Mouth Synthesis

Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene.

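First, whether the mouth is closed or open, its UV texture is projected onto the 3D model. If the mouth is closed, the region is just a patch of pink with no teeth. Then the open-mouth video is transferred onto the static image through the deep learning framework, in order to infer what the mouth and its interior look like. Because training data for the mouth interior is scarce, the generated mouth region has low resolution. The authors therefore use SIFT-Flow, which extracts features from each frame of the source video and merges them into the target image through a matching formulation.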
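
The sketch below shows how a dense correspondence field, once available, can be used to pull source pixels into the target frame. It does not implement SIFT-Flow itself (which estimates the field by matching dense SIFT descriptors); the flow is simply given as input, and the function name and the nearest-neighbour lookup are simplifying assumptions.

```python
import numpy as np


def warp_with_flow(source, flow):
    """Backward-warp `source` (H, W, C) using a dense flow field (H, W, 2).

    flow[y, x] = (dx, dy) means that target pixel (x, y) corresponds to
    source pixel (x + dx, y + dy). Nearest-neighbour lookup keeps this short.
    """
    h, w = source.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return source[src_y, src_x]


# Toy example: a 4x4 image with a single bright pixel at row 1, column 2.
source = np.zeros((4, 4, 1))
source[1, 2, 0] = 1.0

# Constant flow of (+1, 0): every target pixel reads from one column to its right.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0

warped = warp_with_flow(source, flow)  # bright pixel now at row 1, column 1
```

In a mouth-synthesis setting, the same operation would let high-resolution source mouth pixels be placed where the low-resolution generated mouth says they belong.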

6. Video Face Replacement via Blending

6.1 Graph-Cut

7. Experiments

7.1 Data Collection

7.2 Data Augmentation

7.3 Training and Testing

8. Results

8.1 Comparisons

Note that the method shown on the right generates detailed wrinkles and fills in the inner mouth cavity.

8.2 Quantitative Evaluation

9. Discussion

9.1 Limitations

9.2 Future Work
