Reading list: 4

2023-07-11 Tue 22:36 2025-02-21 Fri 22:25

General Deep Learning

Why deep networks?

Uses the RBM model (shows that in some settings deep networks are not necessarily better, possibly due to overfitting): Hugo Larochelle and Yoshua Bengio. Classification using discriminative restricted Boltzmann machines. In Andrew McCallum and Sam Roweis, editors, Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pages 536–543. Omnipress, 2008.

Is the following the paper showing that deep networks learn complex data better? How is that analyzed? Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Zoubin Ghahramani, editor, Twenty-fourth International Conference on Machine Learning (ICML 2007), pages 473–480. Omnipress, 2007. URL http://www.machinelearning.org/proceedings/icml2007/papers/331.pdf

Training strategies [0/7]

  • The curse of highly variable functions for local kernel machines.

    Yoshua Bengio, Olivier Delalleau, and Nicolas Le Roux.

  • ☐ Scaling learning algorithms towards AI.
  • Greedy Layer-Wise Training of Deep Networks (this one is from 2006)
  • Exploring Strategies for Training Deep Neural Networks

    A 2009 paper from Yoshua Bengio's group; arguably a survey of the training methods of the time?

    Hinton et al. recently proposed a greedy layer-wise unsupervised learning procedure relying on the training algorithm of restricted Boltzmann machines (RBM) to initialize the parameters of a deep belief network (DBN), a generative model with many layers of hidden causal variables.
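
    As a hedged illustration of that greedy layer-wise procedure, here is a minimal numpy sketch of stacking binary RBMs trained with CD-1 contrastive divergence; the layer sizes, hyperparameters, and toy data are hypothetical, not taken from the paper:

        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def train_rbm(data, n_hidden, lr=0.1, epochs=5):
            """Train one binary RBM with CD-1; returns (W, b_vis, b_hid)."""
            n_vis = data.shape[1]
            W = 0.01 * rng.standard_normal((n_vis, n_hidden))
            b_v, b_h = np.zeros(n_vis), np.zeros(n_hidden)
            for _ in range(epochs):
                for v0 in data:                      # one sample at a time for clarity
                    p_h0 = sigmoid(v0 @ W + b_h)     # positive phase
                    h0 = (rng.random(n_hidden) < p_h0).astype(float)
                    p_v1 = sigmoid(h0 @ W.T + b_v)   # one Gibbs step back to visibles
                    p_h1 = sigmoid(p_v1 @ W + b_h)   # negative phase
                    W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
                    b_v += lr * (v0 - p_v1)
                    b_h += lr * (p_h0 - p_h1)
            return W, b_v, b_h

        # Greedy stacking: each RBM's hidden activations become the next RBM's
        # training data, and the learned weights initialize one layer of the DBN.
        X = (rng.random((200, 64)) < 0.3).astype(float)   # toy binary inputs
        reps, params = X, []
        for n_hid in [32, 16]:                            # hypothetical layer sizes
            W, b_v, b_h = train_rbm(reps, n_hid)
            params.append((W, b_h))
            reps = sigmoid(reps @ W + b_h)

    The learned (W, b_h) pairs would then initialize the layers of a deep network, to be fine-tuned afterwards (for example with supervised backpropagation).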

    Before this, training multi-layer networks was such a hard problem that in practice only networks of up to two layers could be trained; the motivation for adding depth, however, comes from the complexity theory of circuits.

    (Salakhutdinov and Murray, 2008; Larochelle and Bengio, 2008) show that deep architectures are not necessarily better than shallow kernel models,

    whereas (Larochelle et al., 2007) shows that "there has been evidence of a benefit when the task is complex enough, and there is enough data to capture that complexity".

    Here the authors describe what they consider a good representation: Each layer in a multi-layer neural network can be seen as a representation of the input obtained through a learned transformation. What makes a good internal representation of the data? We believe that it should disentangle the factors of variation that inherently explain the structure of the distribution

    • When such a representation is going to be used for unsupervised learning, we would like it to preserve information about the input while being easier to model than the input itself. ("Easier to model" means that after the internal representation the data has simpler structure and clearer patterns: the complexity and noise of the raw input are reduced, so downstream models can more easily capture the data's essential features.)
    • When a representation is going to be used in a supervised prediction or classification task, we would like it to be such that there exists a “simple” (i.e., somehow easy to learn) mapping from the representation to a good prediction

    (Fahlman and Lebiere, 1990; Lengellé and Denoeux, 1996) construct such internal representations in a supervised fashion. However, as we discuss here, the use of a supervised criterion at each stage may be too greedy and does not yield as good generalization as using an unsupervised criterion. Note the wording "too greedy": here greedy means committing too early to predicting a specific target, thereby discarding some features of the input:

    Aspects of the input may be ignored in a representation tuned to be immediately useful (with a linear classifier) but these aspects might turn out to be important when more layers are available.

    Combining unsupervised (e.g., learning about p(x)) and supervised components (e.g., learning about p(y|x)) can be helpful when both functions p(x) and p(y|x) share some structure.
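
    As a rough sketch of one way to combine the two, assume a shared encoder whose output feeds both a reconstruction head (learning about p(x)) and a classification head (learning about p(y|x)); the sizes, the weight lam, and the toy batch below are all hypothetical (PyTorch):

        import torch
        import torch.nn as nn

        enc = nn.Sequential(nn.Linear(64, 32), nn.Sigmoid())  # shared representation
        dec = nn.Linear(32, 64)                               # unsupervised head: reconstruct x
        clf = nn.Linear(32, 10)                               # supervised head: predict y
        opt = torch.optim.SGD([*enc.parameters(), *dec.parameters(),
                               *clf.parameters()], lr=0.1)

        x = torch.rand(8, 64)                  # toy batch
        y = torch.randint(0, 10, (8,))
        lam = 0.5                              # weight of the unsupervised term

        h = enc(x)
        loss = (nn.functional.cross_entropy(clf(h), y)      # learn about p(y|x)
                + lam * nn.functional.mse_loss(dec(h), x))  # learn about p(x)
        opt.zero_grad(); loss.backward(); opt.step()

    If p(x) and p(y|x) share structure, the reconstruction term shapes the shared encoder in a way that also helps the classifier.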

    The training method Hinton proposed in 2006 for the deep belief network (DBN, built from stacked RBMs) opened the door to training deeper networks. In a deep network:

    • Upper layers of a DBN are supposed to represent more “abstract” concepts that explain the input observation x,
    • Lower layers extract “low-level features” from x.

    In other words, this model first learns simple concepts, on which it builds more abstract concepts.

Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing robust features with denoising autoencoders. In Andrew McCallum and Sam Roweis, editors, Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pages 1096–1103. Omnipress, 2008. URL http://icml2008.cs.helsinki.fi/papers/592.pdf. Prevents the autoencoder from learning a trivial copying network by masking the input.
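
A minimal numpy sketch of that masking idea, assuming a one-hidden-layer autoencoder with tied weights and squared-error loss (the sizes, corruption rate, and toy data are hypothetical): the input is corrupted by zeroing a random fraction of its components, but the reconstruction target is the clean input, so the trivial identity map no longer minimizes the loss.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    n_vis, n_hid, lr, mask_frac = 64, 32, 0.1, 0.3
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b_h, b_v = np.zeros(n_hid), np.zeros(n_vis)

    def dae_step(x):
        """One SGD step of a denoising autoencoder on a single example."""
        global W, b_h, b_v
        x_tilde = x * (rng.random(n_vis) >= mask_frac)  # zero-mask corruption
        h = sigmoid(x_tilde @ W + b_h)                  # encode the corrupted input
        x_hat = sigmoid(h @ W.T + b_v)                  # decode with tied weights
        d_out = (x_hat - x) * x_hat * (1 - x_hat)       # grad of 0.5*||x_hat - x||^2
        d_hid = (d_out @ W) * h * (1 - h)               # backprop into the hidden layer
        W -= lr * (np.outer(x_tilde, d_hid) + np.outer(d_out, h))
        b_v -= lr * d_out
        b_h -= lr * d_hid

    for x in (rng.random((200, n_vis)) < 0.3).astype(float):
        dae_step(x)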

Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527–1554, 2006.

Geoffrey E. Hinton. To recognize shapes, first learn to generate images. Technical Report UTML TR 2006-003, University of Toronto, 2006.

Efficient Training

If you have any questions about this post, feel free to send feedback via GitHub issue or by email to metaescape at foxmail dot com.