New papers (published within the last 6 months):
- Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models, S. Ioffe
- Wasserstein GAN, M. Arjovsky et al.
- Understanding deep learning requires rethinking generalization, C. Zhang et al.

Classic papers (published before 2012):
- An analysis of single-layer networks in unsupervised feature learning (2011), A. Coates et al.
- Deep sparse rectifier neural networks (2011), X. Glorot et al.
- Natural language processing (almost) from scratch (2011), R. Collobert et al.
- Recurrent neural network based language model (2010), T. Mikolov et al.
- Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion (2010), P. Vincent et al.
- Learning mid-level features for recognition (2010), Y. Boureau
- A practical guide to training restricted Boltzmann machines (2010), G. Hinton
- Understanding the difficulty of training deep feedforward neural networks (2010), X. Glorot and Y. Bengio
- Why does unsupervised pre-training help deep learning (2010), D. Erhan et al.
- Learning deep architectures for AI (2009), Y. Bengio
- Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations (2009), H. Lee et al.
- Greedy layer-wise training of deep networks (2007), Y. Bengio et al.
- Reducing the dimensionality of data with neural networks (2006), G. Hinton and R. Salakhutdinov
- A fast learning algorithm for deep belief nets (2006), G. Hinton et al.
- Gradient-based learning applied to document recognition (1998), Y. LeCun et al.
- Long short-term memory (1997), S. Hochreiter and J. Schmidhuber