Introduction to Data Science
Syllabus, Master's level, 1MS041
- Code: 1MS041
- Education cycle: Second cycle
- Main field(s) of study and in-depth level: Computer Science A1N, Data Science A1N, Mathematics A1N
- Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
- Finalised by: The Faculty Board of Science and Technology, 22 October 2021
- Responsible department: Department of Mathematics
Entry requirements
120 credits including 80 credits in computer science and mathematics, of which at least 15 credits in computer science including programming, and at least 30 credits in mathematics including probability and statistics, linear algebra and analysis. Proficiency in English equivalent to the Swedish upper secondary course English 6.
Learning outcomes
On completion of the course the student shall be able to:
- find publicly available data sets and evaluate their usefulness for given purposes;
- process data and transform it for analysis;
- use common clustering and dimension reduction methods to explore data sets and argue on mathematical grounds for the relevance of the methods to the data set and the purpose in question;
- choose among common probabilistic models for analysis of a data set;
- when choosing a model, take into account limitations in computational capacity;
- evaluate the reliability of a solution by applying appropriate theoretical principles;
- consistently take into account aspects of ethics, law and integrity;
- present the conclusions of an analysis / end product of an application.
Content
Common probability theory models and related inference principles, such as regression/classification, hypothesis testing and matrix filling; typical applications in data science, such as prediction, recommendation and A/B testing, as well as common algorithms for their implementation; modeling of dependence in underlying probability distributions arising from temporal, spatial and network structure; elementary data processing (ELT/Extract-Load-Transform): transformation/cleaning of data for later processing by combining data from different sources and using dimension reduction and clustering methods; use of visualization for exploratory data analysis and for communicating results; legal and ethical aspects of collecting, processing and storing data; case studies using real data involving relevant data processing, modeling and inference.
Instruction
Lectures and labs.
Assessment
Written examination (7.5 credits) and written presentation of labs and assignments (2.5 credits).
If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.