Introduction to Data Science

10 credits

Syllabus, Master's level, 1MS041

A revised version of the syllabus is available.
Code: 1MS041
Education cycle: Second cycle
Main field(s) of study and in-depth level: Computer Science A1N, Data Science A1N, Mathematics A1N
Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
Finalised by: The Faculty Board of Science and Technology, 27 February 2020
Responsible department: Department of Mathematics

Entry requirements

120 credits including 80 credits in computer science and mathematics, of which at least 15 credits in computer science including programming, and at least 30 credits in mathematics including probability and statistics, linear algebra and analysis. Proficiency in English equivalent to the Swedish upper secondary course English 6.

Learning outcomes

On completion of the course the student shall be able to:

  • find publicly available data sets and evaluate their usefulness for given purposes;
  • process data and transform it for analysis;
  • use common clustering and dimension reduction methods to explore data sets and argue on mathematical grounds for the relevance of the methods to the data set and the purpose in question;
  • choose among common probabilistic models for analysis of a data set;
  • when choosing a model, take into account limitations in computational capacity;
  • evaluate the reliability of a solution by applying appropriate theoretical principles;
  • consistently take into account aspects of ethics, law and integrity;
  • present the conclusions of an analysis / end product of an application.

Content

Common probability theory models and related inference principles, such as regression/classification, hypothesis testing, matrix filling; typical applications in data science, such as prediction, recommendation, A/B testing, as well as common algorithms for their implementation; modeling of dependence in underlying probability distributions arising from temporal, spatial and network structure; elementary data processing (ELT/Extract-Load-Transform): transformation/cleaning of data for later processing by combining data from different sources and using dimension reduction and clustering methods; use of visualization for exploratory data analysis and communicating results; legal and ethical aspects regarding collecting, processing and storing of data; case studies using real data involving relevant data processing, modeling and inference.

Instruction

Lectures and labs.

Assessment

Written examination (7.5 credits) and written presentation of labs and assignments (2.5 credits).

If there are special reasons for doing so, an examiner may make an exception from the method of assessment indicated and allow a student to be assessed by another method. An example of special reasons might be a certificate regarding special pedagogical support from the disability coordinator of the university.
