Towards Unsupervised Deep Image Enhancement With Generative Adversarial Network

Abstract
Improving the aesthetic quality of images is challenging and highly desired by the public. To address this problem, most existing algorithms use supervised learning to train an automatic photo enhancer from paired data, which consists of low-quality photos and corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the needs or preferences of general users. In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning from a large number of paired images. The proposed model is based on a single deep GAN that embeds modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to deal with unsupervised image enhancement: (1) a fidelity loss, which is defined as an L2 regularization in the feature domain of a pretrained VGG network to ensure that the enhanced image preserves the content of the input image, and (2) a quality loss, which is formulated as a relativistic hinge adversarial loss to endow the input image with the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images. Our code is available at: https://github.com/eezkni/UEGAN.
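The two losses named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code below assumes the commonly used relativistic average hinge formulation for the quality loss and a mean squared difference for the fidelity loss. The function names are hypothetical, plain Python lists stand in for pretrained-VGG features and discriminator scores, and the networks themselves are omitted.

```python
from statistics import fmean


def fidelity_loss(feat_in, feat_out):
    """Mean squared difference between two feature vectors.

    Assumption: feat_in and feat_out stand in for pretrained-VGG features
    of the input image and the enhanced image.
    """
    if len(feat_in) != len(feat_out):
        raise ValueError("feature vectors must have the same length")
    return fmean([(a - b) ** 2 for a, b in zip(feat_in, feat_out)])


def _hinge(x):
    """Hinge function max(0, x)."""
    return max(0.0, x)


def ra_hinge_d_loss(real_scores, fake_scores):
    """Discriminator side (assumed relativistic average hinge form).

    Real images should score at least 1 above the average fake score,
    and fake images at least 1 below the average real score.
    """
    mean_real = fmean(real_scores)
    mean_fake = fmean(fake_scores)
    loss_real = fmean([_hinge(1.0 - (r - mean_fake)) for r in real_scores])
    loss_fake = fmean([_hinge(1.0 + (f - mean_real)) for f in fake_scores])
    return loss_real + loss_fake


def ra_hinge_g_loss(real_scores, fake_scores):
    """Generator side: the same terms with the roles of real and fake swapped."""
    mean_real = fmean(real_scores)
    mean_fake = fmean(fake_scores)
    loss_fake = fmean([_hinge(1.0 - (f - mean_real)) for f in fake_scores])
    loss_real = fmean([_hinge(1.0 + (r - mean_fake)) for r in real_scores])
    return loss_fake + loss_real
```

In this sketch the discriminator loss vanishes once real and enhanced images are separated by the margin, while the generator loss is large in the same situation, which is what pushes the enhanced images toward the desired characteristics.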
I. INTRODUCTION
With the rapid development of mobile Internet, smart electronic devices, and social networks, it is becoming more and more popular to record and share people's lives through social media and online sharing communities. However, due to the high cost of high-quality hardware devices and the lack of professional photography skills, the aesthetic quality of photos taken by the general public is often unsatisfactory. Professional image editing is expensive, and it is hard to provide such services in an automated manner because aesthetic feelings and preferences are usually personal. Therefore, automatic image enhancement techniques that provide user-oriented image beautification are preferred.

Manuscript received March 16, 2020; revised August 6, 2020; accepted September 1, 2020. Date of publication September 22, 2020; date of current version September 29, 2020. This work was supported in part by the Natural Science Foundation of China under Grant 61772344 and Grant 61672443, in part by the * Research Grants Council (RGC) General Research Funds under Grant 9042816 (CityU 11209819) and Grant 9042957 (CityU 11203220), in part by the * Research Grants Council (RGC) Early Career Scheme under Grant 9048122 (CityU 21211018), and in part by the Key Project of Science and Technology Innovation 2030 supported by the Ministry of Science and Technology of China under Grant 2018AAA0101301. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Nikolaos Mitianoudis. (Corresponding authors: Shiqi Wang; Sam Kwong.) Zhangkai Ni, Wenhan Yang, and Shiqi Wang are with the Department of Computer Science, City University of *, * (e-mail: eezkni@gmail.com; yangwenhan@pku.edu.cn; shiqwang@cityu.edu.hk). Lin Ma is with the Meituan-Dianping Group, Beijing 100102, China (e-mail: forest.linma@gmail.com). Sam Kwong is with the Department of Computer Science, City University of *, *, and also with the City University of *, Shenzhen Research Institute, Shenzhen 518057, China (e-mail: cssamk@cityu.edu.hk). This article has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. Digital Object Identifier 10.1109/TIP.2020.3023615.

Compared with high-quality images, low-quality images usually suffer from multiple degradations in visual quality, such as poor colors, low contrast, and intensive noise. Therefore, the image enhancement process needs to address these degradations with a series of enhancement operations, such as contrast enhancement, color correction, and detail adjustment. The earliest conventional image enhancement approaches mainly focused on contrast enhancement of low-quality images. The most common approach, histogram adjustment, transfers the luminance histogram of a low-quality image to a given distribution (which may be provided by other reference images) to stretch the contrast of the low-quality image. According to the transformation scope, this kind of method can be further classified into two categories: global histogram equalization (GHE) and local histogram equalization (LHE). The former uses a single histogram transformation function to adjust all pixels of the entire image. It may lead to improper enhancement results in some local regions, such as under-exposure, over-exposure, and color distortion. To address this issue, LHE derives content-adaptive transform functions based on the statistical information in a local region and applies these transforms locally. However, LHE is computationally complex and not always powerful, because the extracted transformation depends on the dominating information in the local region. Therefore, these methods also tend to generate visually unsatisfactory texture details and dull or over-saturated colors.
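As an illustration of the global histogram equalization described above, here is a minimal sketch in plain Python. It assumes the image is given as a flat list of integer luminance values in [0, levels - 1] and uses the uniform distribution as the target; the function name and the input representation are choices made for this example, not something specified in the paper.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization with a single transform for all pixels."""
    if not pixels:
        return []

    # Luminance histogram of the whole image.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return list(pixels)

    # One lookup table shared by every pixel (the "global" in GHE).
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [lut[p] for p in pixels]


print(equalize_histogram([50, 60, 70, 80]))  # prints [0, 85, 170, 255]
```

A local variant in the spirit of LHE would build a separate lookup table from the histogram of each local region instead of one table for the whole image, which is where the extra computational cost comes from.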
Over the past few years, deep convolutional neural networks (CNNs) have made significant progress in low-level vision tasks. To improve modeling capacity and adaptivity, deep learning based models have been built to bring the expressive power of deep networks to image enhancement with the knowledge of big data. Ignatov et al. designed an end-to-end deep learning network that improves photos from mobile devices to the quality of digital single-lens reflex (DSLR) photos. Ren et al. present a hybrid loss to optimize the framework from three aspects (i.e., color, texture, and content) to produce more visually pleasing results. Inspired by bilateral grid processing, Gharbi et al. made real-time image enhancement possible with a model that dynamically generates the image transformation based on local and global information. To deal with low-light image enhancement, Wang et al. established a large-scale under-exposed image dataset and learned an image-to-illumination mapping based on the Retinex model to enhance extremely low-light images.
However, these methods follow the route of fully supervised learning and rely on large-scale datasets with paired low/high-quality images. First, paired data is usually expensive, and it sometimes takes professional photographers a lot of effort and resources to build the dataset. Second, the judgment of image quality is usually closely related to the personality, aesthetics, taste, and experience of a person. “There are a thousand Hamlets in a thousand people’s eyes.” In other words, everyone has a different attitude towards the quality of photography. To demonstrate this, a typical low-quality photo in the MIT-Adobe FiveK Dataset and its corresponding five high-quality versions, retouched by five different experts in photo beautification, are shown in Fig. 1. It can be observed that the image processed by one expert is very different from the image retouched by another expert. Consequently, it is impractical to create a large-scale dataset with paired low- and high-quality images that meets the preferences of everyone. On the contrary, a more feasible way is to express the personal preferences of a user by providing a collection of images that he/she loves. Therefore, there is an urgent demand for an enhancement model that learns the enhancement mapping from a low-quality dataset to a high-quality one even without specific paired images. In this way, we can get rid of the burden of creating one-to-one paired data and rely only on a target dataset with the characteristics preferred by the user.

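Introduction vocabulary:
Manuscript: noun: a handwritten or unpublished draft of a work; adjective: handwritten
revised: amended, corrected
Research Grants Council: a body that awards research funding
associate: verb: to connect, to link; noun: partner, colleague; adjective: deputy (as in "associate editor")
coordinating: organizing, overseeing
supplementary: adjective: additional, supplemental
downloadable material: material that can be downloaded
communities: groups of people with shared interests
hardware: noun: physical computing equipment; also metal tools and fittings
the lack of: the absence or shortage of
expensive: adjective: costly, high-priced
multiple degradations: several kinds of quality loss
correction: noun: the act of fixing an error; adjustment
The earliest conventional: the earliest traditional
luminance: noun: brightness
categories: classes, types
equalization: making equal or uniform
improper: adjective: unsuitable, inappropriate
under-exposure: insufficient exposure to light
distortion: noun: deformation, warping, loss of fidelity
dominating information: the prevailing information
visually unsatisfactory texture: texture that looks unsatisfactory
dull or over-saturated color: color that is flat or excessively saturated
the excellent expressive: the outstanding expressive (power)
facilitate: verb: to make easier, to promote
hybrid loss: a loss that combines several terms
Inspired: motivated, prompted
dynamically: in a way that changes with the input
bilateral: adjective: two-sided
illumination: noun: lighting
extremely: adverb: very, to a very high degree
the route of: the path or approach of
effort: noun: exertion, hard work
professional photographers: people who take photographs for a living
judgment: noun: assessment, opinion; ruling
personality, aesthetics, taste, and experience: character, sense of beauty, preferences, and experience
To demonstrate this: to show this
retouched: edited to improve appearance
Consequently: adverb: as a result, therefore
On the contrary: adverb: conversely, in contrast
feasible: adjective: practicable, achievable
urgent demand: pressing need
get rid of the burden: to be freed from the burden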

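Abstract vocabulary:
aesthetic: adjective: relating to beauty or artistic taste
corresponding: adjective: matching, equivalent
retouched: edited to improve appearance
characteristics: distinguishing features
preferences: likings, favored choices
adversarial: opposing, competing
desired: wanted, intended
embeds: verb: builds in, incorporates
the modulation and attention mechanisms: the mechanisms that modulate features and focus on the relevant ones
fidelity: noun: faithfulness; the degree to which a copy matches the original
defined: specified
content: noun: what something contains, subject matter; verb: to satisfy; adjective: satisfied
is formulated as: is expressed as
relativistic: relative; here, comparing real and generated samples against each other
endow: verb: to give, to provide with
quantitative and qualitative: measured numerically and judged by inspection
available: adjective: obtainable, ready for use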
