Diary

Start from GAN
If all data are fed in, after enough iterations every output of the generator becomes a 1 (on the MNIST dataset), which is the simplest digit --> "the generator fools the discriminator with garbage"
Training a GAN for each class individually --> 1. the GAN structure suits some classes, but training on other classes leads to mode collapse; 2. it is not easy to select a model for each class

then to conditional GAN
similar structure, but with the one-hot label concatenated to the inputs of G and D (see the sketch below)
Advantage: no need to train a model for each class individually
Note: the learning rate is set to 0.001; 0.0001 leads to bad results
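A minimal sketch of this conditioning in Keras: the one-hot label is concatenated to the noise for G and to the flattened image for D, and D is compiled with the 0.001 learning rate noted above. The 256-unit hidden layers and flattened-image input are placeholders, not the exact model used here.

```python
from tensorflow.keras.layers import Input, Dense, Concatenate, LeakyReLU
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

latent_dim, num_classes, img_dim = 100, 10, 28 * 28  # MNIST, placeholder sizes

# Generator: noise + one-hot label -> flattened image
z = Input(shape=(latent_dim,))
g_label = Input(shape=(num_classes,))
g_h = Dense(256)(Concatenate()([z, g_label]))
g_h = LeakyReLU(0.2)(g_h)
g_out = Dense(img_dim, activation='tanh')(g_h)
G = Model([z, g_label], g_out)

# Discriminator: flattened image + one-hot label -> real/fake probability
x = Input(shape=(img_dim,))
d_label = Input(shape=(num_classes,))
d_h = Dense(256)(Concatenate()([x, d_label]))
d_h = LeakyReLU(0.2)(d_h)
d_out = Dense(1, activation='sigmoid')(d_h)
D = Model([x, d_label], d_out)

# learning rate 0.001 as noted above
D.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy')
```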

then ACGAN
Current tests show that ACGAN does not work well with two dense layers; the reason might be that ACGAN only works with convolutional D and G (see the sketch below)
TODO: pretrain D
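For reference, a hedged sketch of what a convolutional ACGAN discriminator would look like (shared conv trunk, one real/fake head, one auxiliary class head); the filter counts and dropout rate are placeholders, not a tested configuration.

```python
from tensorflow.keras.layers import Input, Conv2D, LeakyReLU, Flatten, Dense, Dropout
from tensorflow.keras.models import Model

num_classes = 10  # MNIST

img = Input(shape=(28, 28, 1))
h = Conv2D(32, 3, strides=2, padding='same')(img)
h = LeakyReLU(0.2)(h)
h = Conv2D(64, 3, strides=2, padding='same')(h)
h = LeakyReLU(0.2)(h)
h = Flatten()(h)
h = Dropout(0.4)(h)

validity = Dense(1, activation='sigmoid', name='validity')(h)          # real / fake head
aux_class = Dense(num_classes, activation='softmax', name='class')(h)  # auxiliary classifier head

D = Model(img, [validity, aux_class])
D.compile(optimizer='adam',
          loss=['binary_crossentropy', 'categorical_crossentropy'])
```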

then Wasserstein GAN


  1. January
    refine the proposal

10-12. January

  • implement a DC classifier in preparation for implementing the discriminator
  • read Improved GAN; focus on this paper in the following days
  1. January
  • the DC classifier has no bugs, but performs awfully
  • install Theano and Lasagne to run the improvedGAN code
  1. - 19. January
  • finally installed Theano and its GPU backend correctly and fixed a lot of deprecation issues
  1. January
  • try to translate it to Keras; find a way to implement the loss function (see the sketch below)
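A possible starting point for the Keras translation of the loss, assuming the discriminator ends in K real-class logits (the Improved GAN formulation, where the K+1 "fake" class is implicit and D(x) = Z(x)/(Z(x)+1), with Z(x) the sum of the exponentiated logits). This is only a sketch with the Keras backend, not the original Theano code.

```python
from tensorflow.keras import backend as K

def d_loss_improved_gan(logits_lab, y_lab, logits_unl, logits_fake):
    """Discriminator loss in the style of Salimans et al. (2016).
    logits_lab / y_lab: labeled real batch, logits_unl: unlabeled real batch,
    logits_fake: batch of generated samples; all logits have shape (batch, K)."""
    # supervised part: cross-entropy over the K real classes
    loss_lab = K.mean(
        K.sparse_categorical_crossentropy(y_lab, logits_lab, from_logits=True))
    # unsupervised part: D(x) = Z(x) / (Z(x) + 1), log Z(x) = logsumexp(logits)
    log_z_unl = K.logsumexp(logits_unl, axis=1)
    log_z_fake = K.logsumexp(logits_fake, axis=1)
    loss_unl = -K.mean(log_z_unl) + K.mean(K.softplus(log_z_unl))  # -log D(x)
    loss_fake = K.mean(K.softplus(log_z_fake))                     # -log(1 - D(G(z)))
    return loss_lab + 0.5 * (loss_unl + loss_fake)
```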
  1. January
  • the translation to Keras is too complicated; first try PaviaU with the original Theano code
  • the 1D improved GAN performs too poorly when trained on PaviaU (maybe because of the training data; check the training and testing data and re-save them)
  1. January
  • prepare questions for tomorrow's meeting:
  • the loss function in the code does not match the loss in the paper, and the former has a very strange form
  • l_lab and train_err are the same thing
  • there is no implementation of the K+1 class
  1. February
  • for the 3D convolution, an idea: set stride=(1,1,2), which manipulates only the spectral dimension (see the sketch after this list)
  • try a semi-supervised GAN on MNIST: the discriminator classifies labeled samples into their classes and generated samples as class K+1; for unlabeled training data, set the label to [0.1, 0.1, 0.1, ..., 0] (uniform over the real classes, 0 for the fake class)
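A minimal sketch of the spectral-only stride idea, assuming a PaviaU-like patch input of shape (rows, cols, bands, 1); the 5x5 patch size, kernel size, and filter count are placeholders.

```python
from tensorflow.keras.layers import Input, Conv3D
from tensorflow.keras.models import Model

# PaviaU-like patch: 5x5 spatial window, 103 spectral bands, 1 input channel
patch = Input(shape=(5, 5, 103, 1))

# stride (1, 1, 2): keep the spatial size, halve only the spectral dimension
h = Conv3D(filters=16, kernel_size=(3, 3, 7), strides=(1, 1, 2),
           padding='same', activation='relu')(patch)

m = Model(patch, h)
m.summary()  # spatial dims stay 5x5, spectral dim drops from 103 to 52
```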
  1. Feb. - 9. Feb.
  • 1D tryout seems good; needs more tests
  1. March
    ready to test:
  • (replace conv3d with conv2d)
  • different training data sizes (sample count)
  • different patch sizes
  • different numbers of channels
  • (different batch sizes)
  • (different depthwise conv channel counts)
  1. March
    found a case: randomly choosing 200 samples from the whole image as the training set gives much better results than randomly choosing 200 samples from the designated training set

  2. April

  • email the cluster team
  • try cross-validation
  • ask Amir how to determine the final result
  • read the "discr_loss" blog and try their code
  • read the GAN paper
  1. April
  • Adam vs. SGD
    the validation curve with Adam oscillates up and down --> not suitable for a standard early-stopping algorithm
    possible fix: use a smaller learning rate

  • alternative progress measure for early stopping (see the sketch after this list):
    do not use the ratio of the average training loss to the minimum training loss within a training strip, but rather the ratio of the current average training loss to the past average training loss

  • learning rate decay strategy

  • optimizer for G and optimizer for D

  • use the cross-entropy loss of only the first 9 labels to decide when to early stop

  • double-check the Dataloader in demoGAN (Zhu et al.) (PyTorch)
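A rough sketch of the alternative progress measure mentioned in this list (plain Python; strip length and stopping threshold are placeholder values): compare the current strip's average training loss with the previous strip's average instead of with the strip minimum.

```python
def strip_progress(losses, strip_len=5):
    """Relative improvement of the average training loss of the current strip
    over the average of the previous strip.  `losses` is the per-epoch
    training-loss history; values near 0 mean training has stalled."""
    if len(losses) < 2 * strip_len:
        return None  # not enough history yet
    current = sum(losses[-strip_len:]) / strip_len
    previous = sum(losses[-2 * strip_len:-strip_len]) / strip_len
    return (previous - current) / previous

# usage: stop training once the relative improvement falls below a threshold
# p = strip_progress(train_losses)
# if p is not None and p < 1e-3:
#     stop_training = True
```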

  1. April
  • test feature matching, starting from a one-layer model (ssgan_improved_pytorch); see the sketch below
  • try to implement a custom loss function, as in Keras
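A hedged sketch of the feature-matching objective for the generator (PyTorch); `D.features` in the usage comments is an assumed way to read an intermediate discriminator layer, not an existing function of ssgan_improved_pytorch.

```python
import torch

def feature_matching_loss(feat_real, feat_fake):
    """Generator objective from Improved GAN (Salimans et al., 2016): match the
    batch mean of an intermediate discriminator feature layer on real data
    with the batch mean on generated data."""
    return torch.mean((feat_real.mean(dim=0) - feat_fake.mean(dim=0)) ** 2)

# usage sketch (D.features is an assumed hook on an intermediate layer):
# f_real = D.features(x_real).detach()   # no gradient into D during the G step
# f_fake = D.features(G(z))
# g_loss = feature_matching_loss(f_real, f_fake)
```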