Knowledge-Based Systems in Bioinformatics

5 credits

Syllabus, Master's level, 1MB416

A revised version of the syllabus is available.

Code: 1MB416
Education cycle: Second cycle
Main field(s) of study and in-depth level: Bioinformatics A1N, Technology A1N
Grading system: Fail (U), Pass (3), Pass with credit (4), Pass with distinction (5)
Finalised by: The Faculty Board of Science and Technology, 27 April 2012
Responsible department: Biology Education Centre

Entry requirements

120 credits, including Discrete Computational Biology or equivalent, and Basic Statistics

Learning outcomes

The course aims to give knowledge and understanding of how logic-based methods can be used to support the design of knowledge-based systems within the life sciences, where large amounts of data are produced, such as gene expression, molecular interaction and annotation data, and combinations of clinical and genomic data. The course will provide a deeper understanding of how advanced learning systems can be used to solve bioinformatics problems.

On completion of the course, the student should be able to

use and describe definitions and mathematical notation for information and decision systems, rough sets and regulatory systems
use other machine learning methods, such as clustering and decision trees, and relate these to rule-based methods
apply knowledge within regulatory systems and Monte Carlo-based feature selection to formulate and solve classification problems within the life sciences

Content

Introduction to Boolean functions. Transformation and simplification of Boolean expressions. Information, decision systems and rough sets. Features and their synthesis and selection. Training and validation of models. Static properties of models. Examples of applications within the life sciences include: classification by means of expression data, prediction of gene function from expression time profiles and genomic databases, modelling of transcriptional mechanisms and ligand-receptor binding, drug resistance, prediction of protein function from structure, and modelling by means of clinical and genomic data. Lectures alternate with computer exercises using real and synthetic data. Ontologies. PubMed. Machine learning: clustering, rough sets and decision trees, Monte Carlo-based feature selection, statistical model validity and significance.

Instruction

Lectures, lab-based computer exercises, and a project.

Assessment

Written exam at the end of the course: 3 credits. Successful (pass/fail) completion of 66% of the problem sets: 1 credit. Successful (pass/fail) completion of the project: 1 credit.