👷 Introduction to Information Retrieval
doc. RNDr. Petr Sojka, Ph.D.

Dear students,

Welcome to the 2024 run of the FI:PV211 Introduction to Information Retrieval course. Because the main teacher will take a month off for health recovery in Spring 2024, some of this year's lectures will be replaced by the previous year's recordings and by invited lectures. Enrollment is therefore limited (APPROVAL needed), with preference given to UMI students.

The course is based on the textbook Introduction to Information Retrieval by Manning, Raghavan, and Schütze (hard copies are available in MU libraries), which is taught at Stanford, Munich, and other places. Among other things, you will learn how a web-scale search engine can satisfy seekers' information needs at a rate of 10,000+ queries per second, answering each within milliseconds. Since 2023, transformers and deep learning approaches have been part of the syllabus.

Students will be motivated to try active/flipped learning approaches wherever possible.

The course moved from its  to IS MU in 2011. Please look if you would like to take a sneak peek at the and the topics we will discuss in the course. However, this interactive syllabus is this course's primary source of information.

Course trailer (in Czech)
A trailer for the PV211 Introduction to Information Retrieval course by Tomáš Effenberger
Second project assignment
CQADupStack Collection and the ARQMath Collection
Second project assignment (CQADupStack Collection)
Google Colaboratory code for the second project
Second project leaderboard (CQADupStack Collection)
Google Spreadsheet leaderboard for the second project
Alternative second project assignment (ARQMath Collection)
Google Colaboratory code for the alternative second project
Alternative second project leaderboard (ARQMath Collection)
Google Spreadsheet leaderboard for the alternative second project
Projects' Jupyter Hub
Dedicated computational resources for your projects

Chapter contains: 2 Discussion Forum, 4 PDF, 1 Folder, 1 Study text, 4 Web
The teacher recommends studying from 19/2/2024 to 25/2/2024.
Chapter contains: 5 PDF, 1 Folder, 1 Study text, 5 Web
The teacher recommends studying from 26/2/2024 to 3/3/2024.
Chapter contains: 3 PDF, 1 Folder, 1 Study text, 1 Web
The teacher recommends studying from 4/3/2024 to 10/3/2024.
Chapter contains: 1 Homework Vault, 7 PDF, 1 Folder, 1 Study text, 6 Web
The teacher recommends studying from 11/3/2024 to 17/3/2024.

2024-03-19: Submissions due for the first project

Chapter contains: 2 Peer assessment, 5 PDF, 1 Folder, 1 Study text, 1 Web
The teacher recommends studying from 18/3/2024 to 24/3/2024.

2024-03-26: Peer reviews due for the first project

This week, there will be a summary of the first part of the course: building an inverted index and querying it on local and global scales, plus the basics of the new generation of indexing based on embeddings.
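
For orientation, the inverted index mentioned in this summary can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions (whitespace tokenization, lowercasing, an in-memory dictionary, a toy three-document collection); the function names `build_index`, `intersect`, and `boolean_and` are made up for this sketch and do not come from the course materials.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to a sorted postings list of document IDs."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():  # assumption: whitespace tokenization
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping IDs present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, query):
    """Answer a conjunctive query, processing the shortest postings first."""
    terms = query.lower().split()
    if not terms:
        return []
    lists = sorted((index.get(term, []) for term in terms), key=len)
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


# Toy collection, chosen only for illustration.
docs = [
    "new home sales top forecasts",
    "home sales rise in july",
    "increase in home sales in july",
]
index = build_index(docs)
print(boolean_and(index, "home july"))
```

Processing the shortest postings list first keeps the intermediate result small, which is the usual reason for ordering the terms of a conjunctive query by document frequency.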

Chapter contains: 1 Image, 6 PDF, 1 Study text, 13 Web
The teacher recommends studying from 20/3/2024 to 31/3/2024.
Chapter contains: 11 PDF, 1 Folder, 1 Study text, 7 Web
The teacher recommends studying from 28/3/2024 to 7/4/2024.

Question Answering, Extractive Question Answering, Abstractive Question Answering, Maximum Marginal Likelihood, LLMs vs QA

Chapter contains: 1 PDF, 1 Study Materials, 1 Study text, 3 Web
The teacher recommends studying from 8/4/2024 to 14/4/2024.
Chapter contains: 2 Study Materials, 1 Study text
The teacher recommends studying from 15/4/2024 to 21/4/2024.
Chapter contains: 13 PDF, 1 Study text, 3 Web
The teacher recommends studying from 22/4/2024 to 28/4/2024.
Chapter contains: 2 Homework Vault, 6 PDF, 1 Folder, 1 Study text, 4 Web
The teacher recommends studying from 25/4/2024 to 5/5/2024.

2024-05-12: Submissions due for the second project

Chapter contains: 1 Image, 2 Homework Vault, 4 PDF, 1 Folder, 1 Video, 1 Study text, 1 Web
The teacher recommends studying from 6/5/2024 to 12/5/2024.

2024-05-19: Peer reviews due for the second project

Chapter contains: 5 PDF, 1 Video, 1 Study text, 2 Web
The teacher recommends studying from 13/5/2024 to 19/5/2024.

    Here are materials from the previous runs of the course: spring 2019, spring 2020, spring 2021, spring 2022, and spring 2023.

    I will be glad if the course topics engage you and you decide to gain deeper insight into them by solving [mini]projects. Activities in this direction will be rewarded with several premium points that count toward your grade. The number of stars below estimates project difficulty, from a mini project [(*), 10 points] to a big project [(*****), 30+ points]. I am also open to assigning or extending a project as a Bachelor's, Master's, or Dissertation thesis.

    • (*)+ Pointing out any errors (factual or typographical) in the course materials.
    • (**)+ Preparation of Deepnote instructions, documentation, and support for solving the course projects.
    • (**)+ Preparation of hot-topic slides, a motivating Khan Academy-style video, or other course materials in LaTeX.
    • (**)+ A presentation or teaching video on topics relevant to the course. Possible topics: Sketch Engine, search with linguistic attributes, random walks in texts, topic search and corpora, time-constrained search, topic modeling with gensim, LDA, Wolfram Alpha, specifics of searching structured data (chemical and mathematical formulae, syntactic or dependency linguistic trees), etc.
    • (***) Participation in an IR competition at Kaggle.com.
    • (***)+ Participation in IR research in our Math Information Retrieval group on its research agendas, the ARQMath task, the EuDML project, or the DML project.
    • (***)+ Evaluation of Math Information Retrieval in the MIaS system, possible as a Dean project or a Bachelor's, Master's, or Dissertation thesis.

    To a pupil who was in danger, Master said, “Those who do not make mistakes, they are most mistaken for all – they do not try anything new.” Anthony de Mello
