Mathematical Foundations of Deep Learning: A Theoretical Framework for Generalization

Babaie, Anahita | 2018

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 51814 (02)
  4. University: Sharif University of Technology
  5. Department: Mathematical Sciences
  6. Advisor(s): Alishahi, Kasra; Hadji Mirsadeghi, Mir Omid
  7. Abstract:
  8. Deep neural networks are predictive models in machine learning that have achieved great success over the last decade. However, because they operate in an over-parametrized and highly non-convex regime, analyzing these models is quite a challenging task. The empirical development of neural networks and their distinguished performance on prediction problems have motivated researchers to formalize theoretical foundations for these models and to provide a framework in which one can explain and justify their behavior and properties. This framework is of great importance because it would lead to a better understanding of how these models work and enable us to improve them by overcoming their weaknesses. The broad application of these models across a wide spectrum of problems has made this research a hot topic for researchers from different backgrounds. One of the key problems is the generalization of neural networks: despite being over-parametrized, they can often avoid overfitting and consequently show good generalization behavior. This raises the question of what the notion of capacity is in neural networks and how it relates to the generalization behavior of these models. In this thesis, we review recent work on the generalization of neural networks which, based on statistical learning theory, forms a theoretical framework to formalize generalization. We also see how regularization acts as a capacity controller and affects generalization. Furthermore, we glance at work on the structure and geometry of the neural network loss surface and see how this structure affects the optimization algorithms used for learning.
  9. Keywords:
  10. Learning Theory ; Deep Neural Networks ; Generalization ; Regularization ; Random Matrix Theory
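The abstract's theme of regularization as a capacity controller can be illustrated with a minimal sketch (not taken from the thesis) in the simplest over-parametrized setting: linear regression with more parameters than samples. Infinitely many weight vectors interpolate the training data; an ℓ2 (ridge) penalty with strength λ selects shrunken solutions, and increasing λ monotonically reduces the norm of the learned weights — the norm acting as a crude proxy for capacity. All variable names and the data setup below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # over-parametrized: d parameters, n samples
X = rng.standard_normal((n, d))     # synthetic design matrix (illustrative)
y = rng.standard_normal(n)          # synthetic targets

# Minimum-norm interpolating solution (the lambda -> 0 limit of ridge).
w_min = np.linalg.pinv(X) @ y
train_err = float(np.mean((X @ w_min - y) ** 2))  # essentially zero: the model interpolates

def ridge(lam: float) -> np.ndarray:
    # Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Larger lambda -> smaller-norm (lower-"capacity") solutions.
norms = [float(np.linalg.norm(ridge(lam))) for lam in (0.01, 1.0, 100.0)]
```

Each singular component of the ridge solution is scaled by s / (s² + λ) rather than 1 / s, so the solution norm strictly decreases as λ grows — a concrete instance of regularization trading fit against capacity.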
