Course
Information Retrieval and Text Mining (DAT640)
The course offers an introduction to techniques and methods for processing, mining, and searching in massive text collections. The course considers a broad variety of applications and provides an opportunity for hands-on experimentation with state-of-the-art algorithms using existing software tools and data collections.
This is the course description for the academic year 2020-2021. Note that changes may occur.
Semesters
Facts
Course code
DAT640
Credits (ECTS)
10
Semester tuition start
Autumn
Language of instruction
English
Number of semesters
1
Exam semester
Autumn
Content
- Search engine architecture
- Text preprocessing and indexing
- Retrieval models (vector-space model, probabilistic models, learning to rank, neural models)
- Search engine evaluation
- Query modeling, relevance feedback
- Web search (crawling, indexing, link analysis)
- Semantic search (knowledge bases, entity retrieval, entity linking)
- Text clustering
- Text categorization
- Topic analysis
- Opinion mining and sentiment analysis
Learning outcome
Knowledge:
- Theory and practice of concepts, methods, and techniques for managing and analyzing large amounts of text data.
Skills:
- Process and prepare large-scale textual data collections for retrieval and mining.
- Apply retrieval, classification, and clustering methods to a range of information access problems.
- Conduct performance evaluation and error analysis.
General competencies:
- Understanding of the strengths and limitations of modern information retrieval and text mining techniques. Ability to identify promising business applications, and to participate in and lead such projects.
Required prerequisite knowledge
Exam
Form of assessment | Weight | Duration | Marks | Aid | Exam system | Withdrawal deadline | Exam date |
---|---|---|---|---|---|---|---|
Home exam | 3/5 | 4 hours | Letter grades | | Inspera assessment | 17.11.2020 | — |
Project work | 2/5 | | Letter grades | | — | 25.08.2020 | — |
The project is carried out individually or in groups of two or three; groups are set up by the course instructor. A student who fails the project must retake this part the next time the course is taught.
Permitted aid: all written and printed material, and a basic calculator.
Course staff
Head of Department:
Tom Ryen
Course coordinator:
Krisztian Balog
Course teacher:
Krisztian Balog
Course teacher:
Petra Galuscakova
Method of work
Overlapping
Course | Reduction (ECTS credits) |
---|---|
Web Search and Data Mining (DAT630_1), Information Retrieval and Text Mining (DAT640_1) | 5 |