Course
Information Retrieval and Text Mining (DAT640)
The course offers an introduction to techniques and methods for processing, mining, and searching in massive text collections. The course considers a broad variety of applications and provides an opportunity for hands-on experimentation with state-of-the-art algorithms using existing software tools and data collections.
Course description for study year 2018-2019. Please note that changes may occur.
Facts
Course code
DAT640
Credits (ECTS)
10
Semester tuition start
Autumn
Language of instruction
English
Number of semesters
1
Exam semester
Autumn
Content
- Search engine architecture
- Text preprocessing and indexing
- Retrieval models (vector-space model, probabilistic models, learning to rank, neural models)
- Search engine evaluation
- Query modeling, relevance feedback
- Web search (crawling, indexing, link analysis)
- Semantic search (knowledge bases, entity retrieval, entity linking)
- Text clustering
- Text categorization
- Topic analysis (PLSA, LSA)
Learning outcome
Knowledge:
- Theory and practice of concepts, methods, and techniques for managing and analyzing large amounts of text data.
Skills:
- Process and prepare large-scale textual data collections for retrieval and mining.
- Apply retrieval, classification, and clustering methods to a range of information access problems.
- Conduct performance evaluation and error analysis.
General competencies:
- Understanding of the strengths and limitations of modern information retrieval and text mining techniques; ability to identify promising business applications and to participate in and lead such projects.
Required prerequisite knowledge
None
Exam
Course teacher(s)
Head of Department:
Tom Ryen
Course coordinator:
Krisztian Balog
Course teacher:
Krisztian Balog
Course teacher:
Petra Galuscakova
Method of work
6 hours of lectures/lab exercises each week.
Overlapping courses
| Course | Reduction (credits) |
|---|---|
| Web Search and Data Mining (DAT630_1), Information Retrieval and Text Mining (DAT640_1) | 5 |
Open for
Master students at the Faculty of Science and Technology.
Course assessment
Form and/or discussion.
Literature
Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining (Zhai and Massung), ACM and Morgan & Claypool Publishers, 2016.
The course description is retrieved from FS (Felles studentsystem). Version 1