👷 Introduction to Information Retrieval
doc. RNDr. Petr Sojka, Ph.D.

Dear students,

Welcome to the 2025 run of the FI:PV211 Introduction to Information Retrieval course.

The course starts with an introduction based on the Introduction to Information Retrieval textbook by Manning, Raghavan, and Schütze (hard copies are available in MU libraries), taught at Stanford, Munich, and other places. In the course you will, among other things, learn how seekers' information needs can be fulfilled at global web scale, at a rate of 10,000+ queries per second, within milliseconds. Since 2023, the syllabus has also covered the use of transformers and large language models.

Students will be encouraged to try active/flipped learning approaches wherever possible.

If you would like a sneak peek, please look at the topics we will discuss in the course. In the 2025 run, the topics covering the use of large language models for information seeking will be extended (RAG, multilingual and multimodal retrieval, etc.). This interactive syllabus is the course's primary source of information.

Course trailer (in Czech)
A trailer for the PV211 Introduction to Information Retrieval course by Tomáš Effenberger
Projects' Jupyter Hub
Dedicated computational resources for your projects

Recommended study periods for the opening chapters: 10/2/2025 to 20/2/2025, 19/2/2025 to 27/2/2025, 26/2/2025 to 6/3/2025, and 5/3/2025 to 13/3/2025.

2025-03-19: Submissions due for the first project

Computing scores in a complete search system, and evaluation in information retrieval 17/3/2025
The teacher recommends studying from 12/3/2025 to 20/3/2025.

2025-03-26: Peer reviews due for the first project

Anatomy of a web-scale IR system and the embedding revolution 24/3/2025
The teacher recommends studying from 19/3/2025 to 27/3/2025.
Latent semantic representations: introduction to LLMs, matrix decompositions, LSI, and distributed word representations 31/3/2025
The teacher recommends studying from 26/3/2025 to 3/4/2025.
Question answering with LLMs 7/4/2025
The teacher recommends studying from 2/4/2025 to 10/4/2025.
Modern Neural Information Retrieval Techniques 14/4/2025
The teacher recommends studying from 9/4/2025 to 17/4/2025.
Relevance feedback, query expansion, text classification (and a lot more) 21/4/2025
The teacher recommends studying from 16/4/2025 to 27/4/2025.
Information retrieval by question answering with large language models, and clustering 28/4/2025
The teacher recommends studying from 24/4/2025 to 4/5/2025.

2025-05-07: Submissions due for the second project

Web search basics 5/5/2025
The teacher recommends studying from 1/5/2025 to 11/5/2025.

2025-05-14: Peer reviews due for the second project

Link analysis and Web crawling 12/5/2025
The teacher recommends studying from 12/5/2025 to 18/5/2025.


I will be glad if the course topics encourage you to gain deeper insight into them by solving [mini]projects. Activities in this direction will be rewarded with several premium points toward a successful grade. The number of stars below is an estimate of project difficulty, from a mini project [(*), 10 points] to a big project [(*****), 30+ points]. I am also open to assigning or extending a project as a Bachelor/Master/Dissertation thesis.

• (*)+ Pointing out any (factual or typographical) errors in the course materials.
• (**)+ Preparation of Deepnote instructions, documentation, and support for the solution of course projects.
• (**)+ Preparation of hot-topic slides, production or preparation of a motivating Khan Academy-style video, or other course materials in LaTeX.
• (**)+ A presentation or teaching video on topics relevant to the course. Possible topics: Sketch Engine, search with linguistic attributes, random walks in texts, topic search and corpora, time-constrained search, topic modeling with gensim, LDA (a small sketch follows this list), Wolfram Alpha, an LLM assistant for a proof of P=NP!, etc.
• (***) Participation in an IR competition at Kaggle.com.
• (***)+ Participation in IR research on my group's research agendas (cf. the PV212 research seminar on current topics).
• (***)+ Evaluation and updates of the Math Information Retrieval system and its comparison with new RAG/LLM approaches; possible as a Dean's project or a Bachelor/Master/Dissertation thesis.
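If the gensim/LDA topic above catches your interest, here is a minimal sketch of LDA topic modeling with gensim, assuming gensim is installed and using a tiny made-up corpus purely for illustration:

    # Minimal LDA topic-modeling sketch with gensim (toy corpus, illustrative only).
    from gensim import corpora, models

    # A tiny hand-made corpus; a real project would start from a tokenized document collection.
    docs = [
        ["index", "query", "ranking", "retrieval"],
        ["boolean", "retrieval", "index", "postings"],
        ["neural", "embedding", "transformer", "retrieval"],
        ["transformer", "language", "model", "embedding"],
    ]

    dictionary = corpora.Dictionary(docs)               # map tokens to integer ids
    corpus = [dictionary.doc2bow(doc) for doc in docs]  # bag-of-words vectors

    # Train a two-topic LDA model; num_topics and passes are arbitrary toy settings.
    lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2,
                          passes=10, random_state=0)

    # Print the top words of each learned topic.
    for topic_id, topic in lda.print_topics():
        print(topic_id, topic)

Each learned topic is a probability distribution over the vocabulary, so the printout shows the most probable words per topic; with a real corpus this is a quick way to explore what a collection is about.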

To a pupil who was in danger, the Master said, "Those who do not make mistakes are the most mistaken of all – they do not try anything new." (Anthony de Mello)
