Generative Adversarial Networks


OUTLINE
Generative Adversarial Nets (GANs)
Deep Convolutional Generative Adversarial Networks (DCGAN)
Conditional Generative Adversarial Nets (CGAN)

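Generative Adversarial Nets (GANs)

Supervised learning usually trains better than unsupervised learning. In the real world, however, the labeled data that supervised learning needs is relatively scarce. Researchers have therefore never stopped looking for better unsupervised learning strategies, hoping to learn representations of the real world, and even knowledge about it, from massive amounts of unlabeled data, and so to understand the real world better.

There are many ways to judge how good an unsupervised learner is, and generation is the most direct one: only when we can generate (create) the real world can we claim to understand it fully. However, the generative models that generation tasks rely on usually run into two major difficulties. First, modeling the real world requires a great deal of prior knowledge, including which prior and which distribution to choose, and the quality of that modeling directly determines how well the generative model performs. Second, real-world data is often very complex, so the computation needed to fit the model is often enormous, sometimes prohibitively so.

The Generative Adversarial Networks (GANs) proposed by Ian Goodfellow sidestep both difficulties. Every GAN framework contains a pair of models: a generative model (G) and a discriminative model (D). Because of D, G no longer needs prior knowledge of the real data or complex modeling. It can still learn to approximate the real data, until the data it generates is realistic enough that D cannot tell it from the real thing.

The optimization objective in the paper:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

[Algorithm figure: sample a minibatch of m examples x_1, x_2, ..., x_m; sample a minibatch of m noise samples z_1, z_2, ..., z_m; Generator; Discriminator.]

Code walkthrough and experimental results:

    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API

    # X, the weight variables (D_W1, D_b1, D_W2, D_b2, G_W1, G_b1, G_W2, G_b2),
    # theta_D, theta_G and G_sample are defined elsewhere in the original script.

    # Discriminator
    def discriminator(x):
        # D_h1 = ReLU(x * D_W1 + D_b1); the input to this layer is a 784-element vector
        D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
        # Output layer. The sigmoid maps the logit to a scalar between 0 and 1,
        # i.e. whether the input image is judged real (=1) or fake (=0)
        D_logit = tf.matmul(D_h1, D_W2) + D_b2
        D_prob = tf.nn.sigmoid(D_logit)
        # Return the probability of "real" and the logit; D_logit is returned so it can be
        # passed to tf.nn.sigmoid_cross_entropy_with_logits() to build a loss
        return D_prob, D_logit

    # Return an m-by-n random matrix with uniformly distributed entries;
    # the sampled z is the generator's input
    def sample_Z(m, n):
        return np.random.uniform(-1., 1., size=[m, n])

    # Generator
    def generator(z):
        # First layer: y = z * G_W1 + G_b1, then G_h1 = ReLU(y)
        G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
        # Output layer: a 784-element vector that can be reshaped to 28x28 to form an image
        G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
        G_prob = tf.nn.sigmoid(G_log_prob)
        return G_prob

    # Feed the real images and the generated images to the discriminator
    D_real, D_logit_real = discriminator(X)
    D_fake, D_logit_fake = discriminator(G_sample)

    # Discriminator and generator losses from the original paper
    D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
    G_loss = -tf.reduce_mean(tf.log(D_fake))

    # Optimize both with Adam; var_list names the weights that each loss may update
    D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=theta_D)
    G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=theta_G)

Advantages of GANs:
1. In practice, they appear to produce better samples than other models (sharper, clearer images).
2. The GAN framework can train any kind of generator network. Most other frameworks require the generator to have some particular functional form, such as a Gaussian output layer. Importantly, all other frameworks require the generator to put non-zero mass everywhere, whereas a GAN can learn to generate points only on a thin manifold close to the data.
3. There is no need to design a model that obeys any kind of factorization; any generator network and any discriminator will work.
4. No repeated sampling from a Markov chain is needed, and no inference is needed during learning, which avoids the problem of approximating intractable probabilities.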
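As a numeric check on the D_loss and G_loss expressions used in the code earlier in this section, the sketch below evaluates the same two formulas in plain Python. The discriminator outputs are made-up values chosen for illustration, and the function names are hypothetical. It suggests that when the discriminator can no longer separate real from generated samples (every output is 0.5), the discriminator loss sits at log 4.

```python
import math

def d_loss(d_real, d_fake):
    """Discriminator loss: -mean(log D(x) + log(1 - D(G(z))))."""
    terms = [math.log(r) + math.log(1.0 - f) for r, f in zip(d_real, d_fake)]
    return -sum(terms) / len(terms)

def g_loss(d_fake):
    """Generator loss used in the code above: -mean(log D(G(z)))."""
    return -sum(math.log(f) for f in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (assumed values, not measured results).
confident = d_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])  # D separates the two well
fooled = d_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])     # D cannot tell them apart

print("D loss, confident discriminator:", confident)
print("D loss, fooled discriminator:   ", fooled)
print("G loss when D outputs 0.5:      ", g_loss([0.5, 0.5]))
```

A lower discriminator loss means D is winning; the generator's goal is to push D back toward the 0.5 case.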

Disadvantages of GANs:
1. Non-convergence. The basic problem at present is that all the theory says a GAN should perform superbly at its Nash equilibrium, but gradient descent is guaranteed to reach a Nash equilibrium only in the convex case. When both players are represented by neural networks, they can keep adjusting their strategies forever without ever actually reaching the equilibrium [Ian Goodfellow (OpenAI), Quora].
2. Hard to train: the collapse problem. GAN training can collapse: the generator degenerates, keeps producing the same sample points, and cannot learn any further [Improved Techniques for Training GANs].
3. No explicit modeling is required, so the model is too free and hard to control. Unlike other generative models, the adversarial setup no longer requires an assumed data distribution. It samples directly from a distribution, so in theory it can approximate the real data arbitrarily closely, and this is the greatest advantage of GANs. The drawback of needing no explicit model is too much freedom: for larger images with more pixels, a plain GAN becomes hard to control. In the original GAN paper [Goodfellow Ian, Pouget-Abadie J], each round of parameter updates is set so that D is updated k times for every single update of G, out of a similar consideration.
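The non-convergence point in item 1 can be illustrated with a toy two-player game. The sketch below is an assumption-laden illustration, not something taken from the slides: it runs simultaneous gradient descent (player x) and ascent (player y) on f(x, y) = x * y, whose equilibrium is (0, 0). The game, the step size, and the function name are all hypothetical choices. Each step multiplies the distance from the equilibrium by sqrt(1 + lr^2), so the players spiral away instead of settling.

```python
import math

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Simultaneous gradient descent on x and ascent on y for f(x, y) = x * y.

    Returns the distance from the equilibrium (0, 0) after every step.
    """
    norms = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        # Both players move at once, each using the other's previous position.
        x, y = x - lr * grad_x, y + lr * grad_y
        norms.append(math.hypot(x, y))
    return norms

norms = simultaneous_gda(1.0, 1.0)
print("distance at start:", norms[0])
print("distance at end:  ", norms[-1])
```

Neither player ever stops adjusting, which is the behavior the quoted remark warns about for games that are not convex.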

Conditional Generative Adversarial Nets (CGAN)

In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data, y, we wish to condition on to both the generator and discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate how this model could be used to learn a multi-modal model, and provide preliminary examples of an application to image tagging in which we demonstrate how this approach can generate descriptive tags which are not part of training labels.

Generative adversarial nets can be extended to a conditional model if both the generator and discriminator are conditioned on some extra information y. y could be any kind of auxiliary information, such as class labels or data from other modalities. We can perform the conditioning by feeding y into both the discriminator and generator as an additional input layer.

In the generator, the prior input noise p_z(z) and y are combined in a joint hidden representation, and the adversarial training framework allows for considerable flexibility in how this hidden representation is composed. In the discriminator, x and y are presented as inputs to a discriminative function (embodied again by an MLP in this case).
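One simple way to feed y to both networks is to concatenate a one-hot label vector onto each network's input. The sketch below shows only that input construction, in plain Python. The noise dimension of 100 and the helper names are assumptions made for illustration, and plain concatenation is just one of the compositions the text allows; it is not claimed to be the exact arrangement used in the CGAN paper.

```python
import random

def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def generator_input(z, y):
    """Join noise z and condition y into one input vector for G."""
    return z + y

def discriminator_input(x, y):
    """Join a (real or generated) sample x and condition y for D."""
    return x + y

NOISE_DIM = 100    # assumed noise size, not given on the slides
NUM_CLASSES = 10   # MNIST digit classes
IMAGE_DIM = 784    # 28x28 image flattened to a vector

rng = random.Random(0)
z = [rng.uniform(-1.0, 1.0) for _ in range(NOISE_DIM)]
y = one_hot(7, NUM_CLASSES)   # condition on the digit "7"
x = [0.0] * IMAGE_DIM         # placeholder image

g_in = generator_input(z, y)
d_in = discriminator_input(x, y)
print(len(g_in), len(d_in))
```

Because the label travels with every input, the discriminator can penalize a sample that looks realistic but does not match its label.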

Deep Convolutional Generative Adversarial Networks (DCGAN)

In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks, demonstrating their applicability as general image representations.

In this paper, we make the following contributions:
1. We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN).
2. We use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms.
3. We visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects.
4. We show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.

Background:
Historical attempts to scale up GANs using CNNs to model images have been unsuccessful. We also encountered difficulties attempting to scale GANs using CNN architectures commonly used in the supervised literature. However, after extensive model exploration we identified a family of architectures that resulted in stable training across a range of datasets and allowed for training higher resolution and deeper generative models. Core to our approach is adopting and modifying three recently demonstrated changes to CNN architectures.

Architecture guidelines for stable Deep Convolutional GANs:
1. Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
2. Use batchnorm in both the generator and the discriminator.
3. Remove fully connected hidden layers for deeper architectures.
4. Use ReLU activation in generator for all layers except for the output, which uses Tanh.
5. Use LeakyReLU activation in the discriminator for all layers.
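The first guideline replaces pooling with strided convolutions, so each layer changes the spatial size itself. The sketch below works through that size arithmetic with the standard output-size formulas for a convolution and a transposed (fractionally-strided) convolution. The kernel size 4, stride 2, padding 1, and the 64-pixel starting size are assumptions chosen so that each layer halves or doubles the size; the slides do not state these values.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a transposed (fractionally-strided) convolution."""
    return (size - 1) * stride - 2 * padding + kernel

KERNEL, STRIDE, PADDING = 4, 2, 1  # assumed values for illustration

# Discriminator side: four strided convolutions shrink a 64x64 input.
down = [64]
for _ in range(4):
    down.append(conv_out(down[-1], KERNEL, STRIDE, PADDING))

# Generator side: four transposed convolutions grow a 4x4 feature map.
up = [4]
for _ in range(4):
    up.append(conv_transpose_out(up[-1], KERNEL, STRIDE, PADDING))

print("discriminator sizes:", down)
print("generator sizes:    ", up)
```

With these settings the two stacks mirror each other, which is why no pooling or fixed upsampling layer is needed.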

APPROACH AND MODEL ARCHITECTURE

The first is the all convolutional net, which replaces deterministic spatial pooling functions (such as maxpooling) with strided convolutions. We use this approach in our generator, allowing it to learn its own spatial upsampling, and discriminator.

Second is the trend towards eliminating fully connected layers on top of convolutional features. The strongest example of this is global average pooling, which has been utilized in state of the art image classification models (Mordvintsev et al.). We found global average pooling increased model stability but hurt convergence speed. A middle ground of directly connecting the highest convolutional features to the input and output respectively of the generator and discriminator worked well. The first layer of the GAN, which takes a uniform noise distribution Z as input, could be called fully connected as it is just a matrix multiplication, but the result is reshaped into a 4-dimensional tensor and used as the start of the convolution stack. For the discriminator, the last convolution layer is flattened and then fed into a single sigmoid output. See Fig. 1 for a visualization of an example model architecture.
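The "matrix multiplication, then reshape" step for the generator's first layer can be sketched in plain Python as follows. The tiny dimensions (noise size 8, a 2x2 map with 3 channels, batch of 4), the random weights, and the function name are assumptions made to keep the example readable; they are not the sizes used in the paper.

```python
import random

def project_and_reshape(z, weights, height, width, channels):
    """Multiply noise vector z by a weight matrix, then reshape the flat
    result into a [height][width][channels] block."""
    n_out = height * width * channels
    flat = [sum(z[i] * weights[i][j] for i in range(len(z))) for j in range(n_out)]
    return [
        [
            [flat[(h * width + w) * channels + c] for c in range(channels)]
            for w in range(width)
        ]
        for h in range(height)
    ]

# Assumed toy sizes, far smaller than a real model.
NOISE_DIM, HEIGHT, WIDTH, CHANNELS, BATCH = 8, 2, 2, 3, 4

rng = random.Random(0)
weights = [
    [rng.gauss(0.0, 0.02) for _ in range(HEIGHT * WIDTH * CHANNELS)]
    for _ in range(NOISE_DIM)
]
z_batch = [[rng.uniform(-1.0, 1.0) for _ in range(NOISE_DIM)] for _ in range(BATCH)]

# Stacking the per-sample blocks gives a 4-D tensor: [batch][height][width][channels].
tensor = [project_and_reshape(z, weights, HEIGHT, WIDTH, CHANNELS) for z in z_batch]
print(len(tensor), len(tensor[0]), len(tensor[0][0]), len(tensor[0][0][0]))
```

The reshaped block is what the first fractionally-strided convolution would then consume.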

Generator model: [figure]

Discriminator model:

    # Excerpt from the discriminator method. lrelu, conv2d and linear are helper
    # functions, and self.d_bn1 to self.d_bn3 are batch-norm layers defined elsewhere.
    h0 = lrelu(conv2d(image, self.df_dim, name='d_h0_conv'))
    h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim*2, name='d_h1_conv')))
    h2 = lrelu(self.d_bn2(conv2d(h1, self.df_dim*4, name='d_h2_conv')))
    h3 = lrelu(self.d_bn3(conv2d(h2, self.df_dim*8, name='d_h3_conv')))
    # Flatten the last convolutional output and map it to a single logit
    h4 = linear(tf.reshape(h3, [self.batch_size, -1]), 1, 'd_h4_lin')

Third is Batch Normalization (Ioffe & Szegedy, 2015), which stabilizes learning by normalizing the input to each unit to have zero mean and unit variance. This helps deal with training problems that arise due to poor initialization and helps gradient flow in deeper models. Directly applying batchnorm to all layers, however, resulted in sample oscillation and model instability. This was avoided by not applying batchnorm to the generator output layer and the discriminator input layer.

The ReLU activation (Nair & Hinton, 2010) is used in the generator with the exception of the output layer, which uses the Tanh function. Within the discriminator we found the leaky rectified activation (Maas et al., 2013) (Xu et al., 2015) to work well, especially for higher resolution modeling. This is in contrast to the original GAN paper, which used the maxout activation (Goodfellow et al., 2013).

Training details:
1. Mini-batch training with a batch size of 128.

2. All weights are initialized from a normal distribution with mean 0 and standard deviation 0.02.
3. The LeakyReLU slope is 0.2.
4. Whereas earlier GAN work used momentum to accelerate training, DCGAN uses the Adam optimizer with tuned hyperparameters.
5. Learning rate = 0.0002.
6. The momentum term beta (Adam's β1) is reduced from 0.9 to 0.5 to prevent oscillation and instability.

4.1 LSUN

As visual quality of samples from generative image models has improved, concerns of over-fitting and memorization of training samples have risen. To demonstrate how our model scales with more data and higher resolution generation, we train a model on the LSUN bedrooms dataset containing a little over 3 million training examples. Recent analysis has shown that there is a direct link between how fast models learn and their generalization performance (Hardt et al., 2015). We show samples from one epoch of training (Fig. 2), mimicking online learning, in addition to samples after convergence (Fig. 3), as an opportunity to demonstrate that our model is not producing high quality samples via simply overfitting/memorizing training examples. No data augmentation was applied to the images.

4.1.1 DEDUPLICATION

To further decrease the likelihood of the generator memorizing input examples (Fig. 2) we perform a simple image de-duplication process. We fit a 3072-128-3072 de-noising dropout regularized RELU autoencoder on 32x32 downsampled center-crops of training examples. The resulting code layer activations are then binarized via thresholding the ReLU activation, which has been shown to be an effective information preserving technique (Srivastava et al., 2014) and provides a convenient form of semantic-hashing, allowing for linear time de-duplication. Visual inspection of hash collisions showed high precision with an estimated false positive rate of less than 1 in 100. Additionally, the technique detected and removed approximately 275,000 near duplicates, suggesting a high recall.

4.2 FACES

We scraped images containing human faces from random web image queries of people's names. The people's names were acquired from dbpedia, with a criterion that they were born in the modern era. This dataset has 3M images from 10K people. We run an OpenCV face detector on these images, keeping the detections that are sufficiently high resolution, which gives us approximately 350,000 face boxes. We use these face boxes for training. No data augmentation was applied to the images.

4.3 IMAGENET-1K

We use Imagenet-1k (Deng et al., 2009) as a source of natural images for unsupervised training. We train on 32 × 32 min-resized center crops. No data augmentation was applied to the images.
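The linear-time de-duplication described in Section 4.1.1 can be sketched as a single pass over binarized codes. The example below assumes the code-layer activations have already been computed by an autoencoder; the four short activation vectors, the threshold of 0, and the function names are hypothetical and chosen only to illustrate the hashing step.

```python
from collections import defaultdict

def binarize(code, threshold=0.0):
    """Turn ReLU activations into a hashable binary code by thresholding."""
    return tuple(1 if a > threshold else 0 for a in code)

def deduplicate(codes, threshold=0.0):
    """Keep the first item in each hash bucket; one pass over the data."""
    buckets = defaultdict(list)
    for idx, code in enumerate(codes):
        buckets[binarize(code, threshold)].append(idx)
    kept = sorted(ids[0] for ids in buckets.values())
    removed = sorted(i for ids in buckets.values() for i in ids[1:])
    return kept, removed

# Hypothetical code-layer activations (4 units instead of 128).
codes = [
    [0.0, 1.3, 0.7, 0.0],
    [0.0, 1.1, 0.9, 0.0],  # near duplicate of item 0
    [2.0, 0.0, 0.0, 0.4],
    [0.0, 1.2, 0.8, 0.0],  # near duplicate of item 0
]

kept, removed = deduplicate(codes)
print("kept:", kept, "removed:", removed)
```

Items that land in the same bucket are treated as near duplicates, so no pairwise image comparison is required.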

We show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.

Q&A

Thank you for listening!
