【The Krusty Krab Is Open Today】Paper Study Notes, 10-05

Today I want to try out some previous ideas, but the paper reading can't stop~ Let's start with the orals!

DensePose: Multi-Person Dense Human Pose Estimation In The Wild [project page and dataset] [paper]

Abstract

In this work, we establish dense correspondences between an RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation. We first gather dense correspondences for 50K persons appearing in the COCO dataset by introducing an efficient annotation pipeline. We then use our dataset to train CNN-based systems that deliver dense correspondence ‘in the wild’, namely in the presence of background, occlusions and scale variations. We improve our training set’s effectiveness by training an ‘inpainting’ network that can fill in missing ground truth values, and report clear improvements with respect to the best results that would be achievable in the past. We experiment with fully-convolutional networks and region-based models and observe a superiority of the latter; we further improve accuracy through cascading, obtaining a system that delivers highly-accurate results in real time. Supplementary materials and videos are provided on the project page http://densepose.org.
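The dense correspondence the abstract describes is usually encoded per pixel as a body-part index I plus (U, V) surface coordinates within that part. As a minimal sketch of how such a head's outputs could be decoded (the function name, tensor shapes, and 2-part setup in the test are my own assumptions, not the paper's actual implementation):

```python
import numpy as np

def decode_densepose(part_logits, uv_maps):
    """Decode per-pixel (I, U, V) from hypothetical network outputs.

    part_logits: (P+1, H, W) scores for background (channel 0) and P body parts.
    uv_maps:     (P, 2, H, W) regressed U, V surface coordinates per part.
    Returns I (H, W), the winning part index, and UV (2, H, W), with UV left
    at zero on background pixels.
    """
    I = part_logits.argmax(axis=0)          # winning class per pixel
    UV = np.zeros((2,) + I.shape)
    ys, xs = np.nonzero(I > 0)              # foreground pixels only
    # gather (U, V) from each pixel's winning part channel
    UV[:, ys, xs] = uv_maps[I[ys, xs] - 1, :, ys, xs].T
    return I, UV
```

Selecting the UV channel by the argmax part is the key step: each body part regresses its own chart of the surface, so the coordinates are only meaningful for the part the pixel was assigned to.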

Detailed reading notes:

Semi-parametric Image Synthesis [paper]

Abstract

We present a semi-parametric approach to photographic image synthesis from semantic layouts. The approach combines the complementary strengths of parametric and nonparametric techniques. The nonparametric component is a memory bank of image segments constructed from a training set of images. Given a novel semantic layout at test time, the memory bank is used to retrieve photographic references that are provided as source material to a deep network. The synthesis is performed by a deep network that draws on the provided photographic material. Experiments on multiple semantic segmentation datasets show that the presented approach yields considerably more realistic images than recent purely parametric techniques.

Introduction (excerpt)

Photographic image synthesis by deep networks can open a new route to photorealism: a problem that has traditionally been approached via explicit manual modeling of three-dimensional surface layout and reflectance distributions [24]. A deep network that is capable of synthesizing photorealistic images given a rough specification could become a new tool in the arsenal of digital artists. It could also prove useful in the creation of AI systems, by endowing them with a form of visual imagination [19].

Recent progress in photographic image synthesis has been driven by parametric models – deep networks that represent all data concerning photographic appearance in their weights [11, 2]. This is in contrast to the practices of human photorealistic painters, who do not draw purely on memory but use external references as source material for reproducing detailed object appearance [17]. It is also in contrast to earlier work on image synthesis, which was based on nonparametric techniques that could draw on large datasets of images at test time [7, 15, 3, 13, 10]. In switching from nonparametric approaches to parametric ones, the research community gained the advantages of end-to-end training of highly expressive models. But it relinquished the ability to draw on large databases of original photographic content at test time: a strength of earlier nonparametric techniques.

In this paper, we present a semi-parametric approach to photographic image synthesis from semantic layouts. The presented approach exemplifies a general family of methods that we call semi-parametric image synthesis (SIMS). Semi-parametric synthesis combines the complementary strengths of parametric and nonparametric techniques. In the presented approach, the nonparametric component is a database of segments drawn from a training set of photographs with corresponding semantic layouts. At test time, given a novel semantic layout, the system retrieves compatible segments from the database. These segments are used as raw material for synthesis. They are composited onto a canvas with the aid of deep networks that align the segments to the input layout and resolve occlusion relationships. The canvas is then processed by a deep network that produces a photographic image as output. We conduct experiments on the Cityscapes, NYU, and ADE20K datasets. The experimental results indicate that images produced by SIMS are considerably more realistic than the output of purely parametric models for photographic image synthesis from semantic layouts.
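The nonparametric half of the pipeline (retrieve compatible segments for a query layout, then composite them onto a canvas) can be sketched very roughly as follows. This is my own toy illustration with IoU as the compatibility score and no alignment or occlusion reasoning; the function names and the dict-based memory bank are assumptions, and SIMS's real retrieval, alignment, and ordering networks are learned:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def retrieve_segment(region, bank, cls):
    """Return the bank segment of class `cls` whose mask best matches `region`."""
    candidates = [s for s in bank if s["cls"] == cls]
    if not candidates:
        return None
    return max(candidates, key=lambda s: iou(s["mask"], region))

def composite_canvas(layout, bank):
    """Paste one retrieved segment per class region onto a blank RGB canvas."""
    canvas = np.zeros(layout.shape + (3,))
    for cls in np.unique(layout):
        region = layout == cls
        seg = retrieve_segment(region, bank, cls)
        if seg is None:
            continue
        paste = region & seg["mask"]    # only cover pixels inside the region
        canvas[paste] = seg["rgb"][paste]
    return canvas
```

The resulting canvas is deliberately imperfect (gaps, seams, missing classes); in the paper it is exactly this kind of rough composite that the final parametric network refines into a photographic image.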

Reading notes:

Taskonomy: Disentangling Task Transfer Learning [website]

CVPR 2018 Best Paper: Taskonomy, explained by its authors
An evolved pix2pix using conditional GANs: high-resolution image synthesis and semantic manipulation | PaperDaily #23


Deep Semantic Face Deblurring [paper]

Abstract

In this paper, we present an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks (CNNs). As face images are highly structured and share several key semantic components (e.g., eyes and mouths), the semantic information of a face provides a strong prior for restoration. As such, we propose to incorporate global semantic priors as input and impose local structure losses to regularize the output within a multi-scale deep CNN. We train the network with perceptual and adversarial losses to generate photo-realistic results and develop an incremental training strategy to handle random blur kernels in the wild. Quantitative and qualitative evaluations demonstrate that the proposed face deblurring algorithm restores sharp images with more facial details and performs favorably against state-of-the-art methods in terms of restoration quality, face recognition and execution speed.
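The "local structure losses" idea, extra penalty on semantically important face regions (eyes, mouth) on top of a global content loss, can be sketched as below. This is a simplified stand-in under my own assumptions (plain L2 terms, a single weight `w_struct`); the paper's full objective also includes perceptual and adversarial losses, which are omitted here:

```python
import numpy as np

def deblur_loss(pred, target, part_masks, w_struct=0.5):
    """Global pixel loss plus extra weight on semantic face parts.

    pred, target: (H, W, 3) images in [0, 1].
    part_masks:   list of (H, W) boolean masks, e.g. eyes and mouth,
                  obtained from a face parsing network.
    """
    content = np.mean((pred - target) ** 2)       # global restoration loss
    struct = 0.0
    for m in part_masks:                          # local structure losses
        if m.any():
            struct += np.mean((pred[m] - target[m]) ** 2)
    return content + w_struct * struct
```

The effect is that an error of the same magnitude costs more inside a key facial component than elsewhere, pushing the network to recover those details first.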

【Key points, placeholder】StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation


Reprinted from blog.csdn.net/weixin_39284803/article/details/82942245