0000002209 16S 4SWS VO Statistical Modeling and Machine Learning (IN2332)

Course - Detail View

General Information
Statistical Modeling and Machine Learning (IN2332) 
Summer semester 2016
Informatics 12 - Chair of Bioinformatics (Prof. Rost)
Allocations: 1
Course Details
0. Univariate and simple multivariate calculus and summary of linear algebra with intuitive explanations
1. Concepts in machine learning: supervised vs. unsupervised learning, classification vs. regression, overfitting, curse of dimensionality
2. Probability theory, Bayes' theorem, conditional independence, distributions (multinomial, Poisson, Gaussian, gamma, beta, ...), central limit theorem, entropy, mutual information
3. Generative models for discrete data: likelihood, prior, posterior, Dirichlet-multinomial model, naive Bayes classifiers
4. Gaussian models: maximum likelihood estimation, linear discriminant analysis, linear Gaussian systems
5. Bayesian statistics: maximum a posteriori (MAP) estimation, model selection, uninformative and robust priors, hierarchical and empirical Bayes, Bayesian decision theory
6. Frequentist statistics: bootstrap, statistical testing
7. Linear regression: ordinary least squares, robust linear regression, ridge regression, Bayesian linear regression
8. Logistic regression and optimization: (Bayesian) logistic regression, optimization, L2-regularization, Laplace approximation, Bayesian information criterion
9. Generalized Linear Models: the exponential family, Probit regression
10. Expectation Maximization (EM) algorithm with applications
11. Latent linear models: Principal Component Analysis, Bayesian PCA
Linear algebra and multivariate calculus
Further requirements listed at http://gagneurlab.in.tum.de/statistical-modeling-and-machine-learning/
At the end of the module students are able to:
- 1. recall the concepts of supervised and unsupervised learning and implement cross-validation procedures
- 2. recall the concepts of Bayesian probability and of conditional and unconditional dependence
- 3. mathematically derive the models and inference procedures of Bayesian linear regression, generalized linear models, Bayesian Principal Component Analysis, and k-means
- 4. identify use cases of the above-mentioned models
- 5. apply the above-mentioned models using the R programming language
- 6. assess the performance and significance of their results
- 7. develop simple novel Bayesian models, and inference procedures for them, for situations to which the above-mentioned models do not apply
These achievements are assessed by a final exam and a statistical modeling competition. The final exam is a two-hour written exam. It includes knowledge questions (achievements 1, 2, 4), statistical modeling questions (derivation of the likelihood and of the inference procedure for a model not seen in class; achievements 3, 7), and a small amount of R programming (achievement 5). The statistical modeling competition addresses an open, unsolved problem. The work is done at home and presented in class in a 10 min presentation. It assesses the ability to put the acquired knowledge into practice (achievements 4, 5), to develop novel models (achievement 7), and to implement them and evaluate their performance (achievement 6). The final mark is the mark of the final exam plus bonus points for the modeling competition.

The class will be based on Christopher Bishop's book "Pattern Recognition and Machine Learning". The lecture will be held in inverted-classroom style: each week, we will give a ~30 min overview of the next reading assignment, a section of the book, pointing out the essential messages to make the reading at home easier. Exercises to be solved by the next lecture will be assigned, including mathematical derivations of some results from the book. In the next lecture, the exercises will be discussed (~30 min) and questions and difficulties with the material will be addressed (~20 min). Then, practical exercises using the newly acquired material will be solved in teams, using the R statistics framework (100 min). Further exercises will be done during the Friday classes (3 hours) in smaller groups. In our experience, the inverted-classroom style is better suited than conventional lecturing to quantitative topics that require students to think through or retrace mathematical derivations at their own pace.
To register for participation, you must identify yourself as a student in TUMonline.
Note: We expect master's students from bioinformatics, (bio)physics, and computer science.
Pattern Recognition and Machine Learning by Christopher Bishop
Online materials
E-learning course (Moodle)