👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization

doc. RNDr. Petr Sojka, Ph.D.

News

The course is a [i]regular [re]search seminar, mentored in a "family" manner. In this term, the seminar [sub]topic is Tame the complexity. Each enrolled student must give one presentation during the term on an agreed-upon topic: a topic of interest, their thesis, or a research paper or area. Presentations focus primarily (but not necessarily) on machine learning, representation learning, and scientific visualization, or on our subtopic (tackling complexity by LLM agents, proving P=NP, etc. ;-).
There is a discussion group with official course information and a communication channel in addition to the course outline below: watch both frequently!

Topics and Course Outline

Week 1 – 19. 9. canceled due to floods

Readings: Motivating video: DEK's advice to young students.

Week 2 – Introduction, research strategy, evaluation, and course schedule planning

Join us at A502, Faculty of Informatics MU, on September 26th at 10 AM (CET) [or on Zoom, on-demand only].

09:50 Catering preparation (from the autumn fruits you bring)

10:00 Class introduction, warm-up check-in discussion (expectations, topics/expertise/background, suggested presentations, and readings). We will "Start with why" (Simon Sinek). Why do you go to college? Bring your research presentation offers (one slide) and ideas to present, read, study, and discuss! Take inspiration from last term's presentations.

10:25 The golden circle of research communication and scientific work.

Why? Put readers in your place! Specifics of CS research and doctoral studies and their evaluation at FI MU: CS conference rankings.
How? How to write.
What (and where)? It is important to "sell the ideas and work," pick the right topics and questions, research "big issues," and pick the proper publication forums (in CS and NLP). The h-index as a measure of impact. The danger of the tyranny of metrics.
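
For readers who have not met the h-index mentioned above: it is the largest number h such that an author has h papers with at least h citations each. The following is only a minimal sketch of that definition; the function name `h_index` and the citation counts are made up for illustration and do not come from the seminar materials.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for two authors with five papers each.
print(h_index([10, 8, 5, 4, 3]))   # prints 4
print(h_index([500, 8, 5, 4, 3]))  # prints 4 as well
```

Both hypothetical authors get the same score although one of them wrote a paper cited 500 times, which is one small example of how much a single metric can hide.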

11:20 Class round-up discussion (suggested presentations, readings, and presentation course schedule).

11:30 Varia, socializing (team builder wanted!), lunch.

Week 3 – Taming the complexity in/of your projects

Join us at A502, Faculty of Informatics MU, on Oct 3rd at 9:50 AM (catering preparation) and 10 AM (Invitation of newcomers, questions on readings, and a summary of Week 2). To join via Zoom, ask for a password in advance.

[All]: Taming the Complexity of Research Projects 3. 10. 2024

We will present the research projects we are working on in an elevator-pitch style. Presenting complex projects under such time constraints is challenging: it demands a compact style in which every word and diagram matters.

Chapter contains: image, PDF, study materials, video, study text.

The teacher recommends studying from 22. 9. 2024 to 5. 10. 2024.

Week 4 – 10. 10. canceled

Week 5 – 17. 10. Ondřej Sojka

Join us at A502, Faculty of Informatics MU, on Oct 17th at 9:50 AM (catering preparation) and 10 AM (Invitation of newcomers, questions on readings, and speaker introduction). To join via Zoom, ask for a password in advance.

[Ondřej Sojka]: Transfer Learning of Slavic Syllabification for Hyphenation Patterns

Hyphenation patterns play a vital role in enhancing the readability and aesthetics of text, especially for Slavic languages. Current hyphenation systems for many Slavic languages are outdated, sometimes relying on manually created patterns with limited effectiveness. We explore the transfer learning of syllabic hyphenation patterns across multiple Slavic languages to develop improved, data-driven hyphenation systems. By using the International Phonetic Alphabet (IPA) as an intermediary, this research transfers hyphenation patterns between related Slavic languages, creating a unified set of IPA-based rules. These IPA patterns are then used to generate language-specific hyphenation patterns for each target language. The proposed approach aims to develop reliable hyphenation patterns using machine learning methods, improving syllabification across multiple languages. Although the work is ongoing, early results indicate promising improvements, particularly for Ukrainian. The new patterns are intended to be practical and easy to reproduce, ultimately contributing to better text layout quality for Slavic languages.

Chapter contains: study text.

The teacher recommends studying from 12. 10. 2024 to 20. 10. 2024.

Week 6 – 24. 10. No lecture

Week 7 – 31. 10. No lecture

Week 8 – 7. 11. Tereza Vrabcová and Marek Kadlčík

Join us at A502, Faculty of Informatics MU, on Nov 7th at 9:50 AM (catering preparation) and 10 AM (lectures). To join via Zoom, ask for a password in advance.

[Tereza Vrabcová and Marek Kadlčík]: Research plans of our Ph.D. talents 7. 11. 2024

Human communication is complex. With its many rules and components, and the implicit and explicit meanings of words and sentences, it has long been studied within Computer Science by the field of Natural Language Processing (NLP). Though we have made strides in making "computers" understand us, one of the key elements of communication remains unresolved: the problem of negation. In this presentation, we will delve into the role of negation in human communication, the ability (or rather inability) of large language models to tackle negation, current approaches to this problem, and possible research directions for solving it.

Chapter contains: image, PDF, video, study text.

The teacher recommends studying from 31. 10. 2024 to 8. 11. 2024.

Week 9 – 14. 11. Tomáš Gregor

[Tomáš Gregor]: Introduction to Quantum Neural Networks 14. 11. 2024

In recent years, enormous progress has been made in studying and developing artificial neural networks and machine learning models that can approximate any well-behaved function to arbitrary precision. These models can perform even superhuman tasks, such as predicting the spatial structure of a protein just from its sequence of amino acids. Another cutting-edge research area is quantum computing, which studies the use of quantum properties of polarized light or supercooled materials for computation. Researchers hope to use the properties of these quantum computations to solve problems that would take longer than the universe's lifetime to compute on a classical computer. At the intersection of these two research areas lie quantum neural networks, a rapidly growing research topic with much promise but few concrete results thus far. In my presentation, I will lay out the theory behind the components of quantum computing: qubits, quantum circuits, and quantum algorithms. Armed with these foundations, I will discuss the architecture of quantum neural networks, the methods used to train these models, results, problems, and advantages over classical neural networks.
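
To give a flavor of the building blocks named in the abstract, here is a minimal sketch of a single qubit passing through a one-gate circuit, simulated classically. It is not taken from the talk; the helper names `apply_gate` and `measurement_probabilities` are illustrative assumptions, and only the Python standard library is used.

```python
import math


def apply_gate(gate, state):
    """Multiply a 2x2 gate matrix by a 2-component state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def measurement_probabilities(state):
    """Probabilities of measuring 0 and 1: squared magnitudes of the amplitudes."""
    return [abs(amplitude) ** 2 for amplitude in state]


# The basis state |0> as a vector of two amplitudes.
ZERO = [1.0, 0.0]

# The Hadamard gate, which turns |0> into an equal superposition of |0> and |1>.
s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]

superposition = apply_gate(HADAMARD, ZERO)
print(measurement_probabilities(superposition))  # approximately [0.5, 0.5]

# Applying the Hadamard gate a second time returns the qubit to |0>.
restored = apply_gate(HADAMARD, superposition)
print(measurement_probabilities(restored))  # approximately [1.0, 0.0]
```

A state of n qubits needs 2^n amplitudes, so this kind of classical simulation quickly becomes infeasible as circuits grow.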

Chapter contains: image, video, study text.

The teacher recommends studying from 8. 11. 2024 to 15. 11. 2024.

Week 10 – 21. 11. Martin Kňažovič

[Martin Kňažovič]: Preprocessing of Requests for Proposals by Large Language Models 21. 11. 2024

As the potential of large language models (LLMs) grows, we can help businesses automate processes that previously depended on humans. One such process, crucial for many businesses, is the processing of RFPs (requests for proposals). It usually involves reading clients' emails, searching for the necessary details, and generating proposals for potential clients. This project aims to reduce the manual work involved in the RFP process. We believe that with the help of a clever AI system, only a fraction of the man-hours will be needed to accomplish what teams of people now spend many hours on every week. Specifically, our solution uses an LLM to process emails and their attachments and to extract product-related information that humans only need to verify and price.

Chapter contains: image, study text.

The teacher recommends studying from 13. 11. 2024 to 22. 11. 2024.

Week 11 – 28. 11. Michal Štefánik

Join us at A502, Faculty of Informatics MU, on November 28th at 10 AM (CET) [and on Zoom].

[Michal Štefánik]: Scientific Report from EMNLP 2024, with emphasis on robustness and reasoning by LLMs 28. 11. 2024

EMNLP is a top-tier NLP conference where leading experts in NLP and AI publish and meet. Michal will report on the main take-home messages he brought back from Miami.

Chapter contains: study text.

The teacher recommends studying from 21. 11. 2024 to 29. 11. 2024.

Week 12 – 5. 12. Frank Mittelbach

Join us at A502, Faculty of Informatics MU, on December 5th at 10 AM (CET) [or on Zoom].

[Frank Mittelbach]: General Framework for Globally Optimized Pagination 5. 12. 2024

An overview presentation of a general framework for globally optimized pagination of linear text, as well as for text plus floating objects, such as figures and tables. The framework uses a flexible constraint model that allows for the implementation of typical typographic rules that can be weighted against each other to support different application scenarios. In this context, "flexible" means that the rules of the typographic presentation of a document are not fixed but can be (to some extent) adjusted to different typographic requirements. It is easy to see that without restrictions, the float placement possibilities grow exponentially if the number of floats is linearly related to the document size. It is, therefore, important to restrict the objective function used for optimization in a way that the algorithm does not have to evaluate all theoretically possible placements while still being guaranteed to find an optimal solution. The goal is, therefore, to define a framework that is both rich in the expressiveness of modeling a large class of pagination applications and, at the same time, is capable of solving the optimization problem in an acceptable time for realistic input data.

Chapter contains: study text.

The teacher recommends studying from 26. 11. 2024 to 12. 12. 2024.

Week 13 – 12. 12. Jakub Pekár (and Merry Christmas)

Join us at A502, Faculty of Informatics MU, on December 12th at 10 AM (CET) [or on Zoom].

[Jakub Pekár]: TBA 12. 12. 2024

The teacher recommends studying from 5. 12. 2024 to 20. 12. 2024.

Tips for readings, discussions, and presentation preparations:

Žákovi, který se hrozil chyb, Mistr řekl: "Ti, kdo nedělají chyby, chybují nejvíc ze všech – nepokoušejí se o nic nového." Anthony de Mello: O cestě

To a student who dreaded making mistakes, the Master said: "Those who make no mistakes are making the biggest mistake of all: they are attempting nothing new." Anthony de Mello
