# Module Description - Detail View

Module Details
Data Analysis and Visualization in R
Department of Informatics
TUINFIN
6
1
6
IN2339
General Data (Module Handbook)
 Bachelor/Master
 One semester
 Winter semester
 English
 180
 90
 90
Coursework and Examinations
 Written exam and project work: The achievements listed under Intended Learning Outcomes are assessed in one 90-minute written exam. In addition, there are two case studies in which students must submit the source code that generates a report analyzing a given dataset. Together, the case studies cover all topics listed under Intended Learning Outcomes: the first covers topics 1-7, the second covers topics 8-16. The final grade is the exam grade plus bonus points for the two case studies.
 No
 Yes
Description
 At the end of the module, students are able to:
 1. produce scripts that automatically generate data analysis reports
 2. import data from various sources into R
 3. apply the concepts of tidy data to clean and organize a dataset
 4. decide which plot is appropriate for a given question about the data
 5. generate such plots
 6. know the methods of hierarchical clustering, k-means, and PCA
 7. apply the above methods to real-life datasets and interpret their outcomes
 8. know the concept of statistical testing
 9. devise and implement resampling procedures to assess statistical significance
 10. know the conditions of application of the following statistical tests and how to perform them in R: Fisher test, Wilcoxon test, t-test
 11. know the concepts of regression and classification
 12. apply regression and classification algorithms in R
 13. know the concepts of generalization error and cross-validation
 14. implement a cross-validation scheme in R
 15. know the concepts of sensitivity, specificity, and ROC curves
 16. assess the latter in R
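 As an illustration of learning outcome 9 (resampling procedures to assess statistical significance), the following is a minimal sketch of a permutation test in base R. It is not taken from the course material: the two groups of measurements, the names `group_a`, `group_b` and `mean_diff`, and the choice of 2000 permutations are all assumptions made for the example.

```r
# Hypothetical example data: two small groups of measurements (not course data).
set.seed(1)
group_a <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
group_b <- c(4.2, 4.8, 5.0, 4.4, 4.9, 4.6)

values <- c(group_a, group_b)
labels <- rep(c("a", "b"), times = c(length(group_a), length(group_b)))

# Test statistic: difference between the two group means.
mean_diff <- function(v, l) mean(v[l == "a"]) - mean(v[l == "b"])

observed <- mean_diff(values, labels)

# Null distribution: recompute the statistic after randomly shuffling the labels.
n_perm <- 2000
null_stats <- replicate(n_perm, mean_diff(values, sample(labels)))

# Two-sided empirical p-value; the +1 correction keeps it strictly above 0.
p_value <- (sum(abs(null_stats) >= abs(observed)) + 1) / (n_perm + 1)
print(c(observed = observed, p_value = p_value))
```

 Shuffling the labels simulates the null hypothesis that group membership is unrelated to the measured value, so the fraction of shuffled statistics at least as extreme as the observed one estimates the p-value.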
 - R programming basics 1
 - R programming basics 2 (including report generation with R Markdown)
 - Data importing
 - Cleaning and organizing data: Tidy data 1
 - Cleaning and organizing data: Tidy data 2
 - Base plot
 - Grammar of graphics 1
 - Grammar of graphics 2
 - Unsupervised learning (hierarchical clustering, k-means, PCA)
 - Case study I
 - Drawing robust interpretations 1: empirical testing by sampling
 - Drawing robust interpretations 2: classical statistical tests
 - Supervised learning 1: regression, cross-validation
 - Supervised learning 2: classification, ROC curve, precision, recall
 - Case study II
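 The topic "Supervised learning 1: regression, cross-validation" can be illustrated by a small k-fold cross-validation scheme. The sketch below is an assumption-laden example, not course material: it uses the `mtcars` dataset that ships with R, a linear model `mpg ~ wt + hp` chosen only for illustration, and `k = 5` folds.

```r
# Minimal k-fold cross-validation sketch in base R (illustrative assumptions only).
set.seed(42)
data(mtcars)
k <- 5

# Randomly assign each row to one of k folds of near-equal size.
fold_id <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

# For each fold: fit on the other k-1 folds, then measure error on the held-out fold.
fold_mse <- sapply(seq_len(k), function(i) {
  train <- mtcars[fold_id != i, ]
  test  <- mtcars[fold_id == i, ]
  fit   <- lm(mpg ~ wt + hp, data = train)
  pred  <- predict(fit, newdata = test)
  mean((test$mpg - pred)^2)
})

# The cross-validated estimate of the generalization error is the mean over folds.
cv_mse <- mean(fold_mse)
print(cv_mse)
```

 Because every observation is held out exactly once, the averaged error is computed only on data the model did not see during fitting.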
 Lectures introduce the concepts, followed by programming exercises in which these concepts are applied to data. The goal of each exercise is to generate a report document.
 Weekly exercises posted online, slides, live demos
 An Introduction to Statistical Learning with Applications in R, http://www-bcf.usc.edu/~gareth/ISL/
 R for Data Science, by Garrett Grolemund and Hadley Wickham
Module Coordinator
 Prof. Dr. Julien Gagneur