👷 Introduction to Information Retrieval

Anatomy of the web-scale IR system 22. 3. 2022

Lecture

Second term project assignments
Slides introducing the second term project by Vít Novotný
https://is.muni.cz/el/fi/jaro2022/PV211/um/readings/Jeff-Dean-Stanford-CS276-April-2015.pdf

Readings

Google Crash Course (in Czech)
A web page by Dušan "Yuhů" Janovský
https://is.muni.cz/el/fi/jaro2022/PV211/um/readings/334.pdf
The Google File System
A 2003 paper by Ghemawat et al.
The Anatomy of The Google Architecture
Slides for a lecture from 2009 by Ed Austin
Building Software Systems At Google and Lessons Learned
A lecture from 2010-11-10 by Jeff Dean
Lessons Learned While Building Infrastructure Software at Google
Slides for a lecture from 2013 by Jeff Dean
How Google Works (in Czech)
A tutorial from 2014 by Tomáš Effenberger
Sketch Engine
A unique search engine for lexicographers built by Lexical Computing at the Faculty of Informatics, Masaryk University

Seminar

Second term project assignments
Slides introducing the second term project by Vít Novotný
Second term project assignments (GitHub)
Google Colaboratory code for the second term project
Second term project assignments (JupyterHub)
A JupyterHub cluster kindly provided to the course by ICT MU. You can use JupyterHub to work on your first term project assignment. Compared to Google Colaboratory, JupyterHub offers up to 32 CPUs, 2 NVIDIA A40 GPUs, and 64 GB of RAM. Notebooks will be closed after 3 days of inactivity, so make sure you download your work!
Second term project leaderboard (TREC collection)
Google Spreadsheet leaderboard for the second term project
Alternative second term project leaderboard (ARQMath collection)
Google Spreadsheet leaderboard for the alternative second term project