👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization
doc. RNDr. Petr Sojka, Ph.D.

News

  • The course is a regular research seminar in the stated research areas. Each enrolled student must give a presentation during the term on a research topic (a topic of interest, their thesis, or a research paper or area). Presentation topics focus primarily (but not exclusively) on those related to the group: machine learning, information retrieval, representation learning, and scientific visualization.
  • There is a discussion group with official course information and a communication channel in addition to the course outline below: watch both frequently!

Topics and Course Outline

Week 1  Introduction

Join us at A502 Faculty of Informatics MU on February 22nd at 10 AM (CET) [or on Zoom].

  1. [9:50 Catering preparation (tea, cakes, ...)]
  2. 10:00 Class introduction, warm-up round-up discussion (expectation, topics/expertise/background, suggested presentations, and readings). Bring your research presentation offers (one slide) and ideas to present, read, study, and discuss! Take inspiration from last term's presentations.
  3. 10:30 The importance of "selling the ideas and work." Picking the right topics and questions, researching "big issues." Picking the right publication forums (in CS and NLP). The danger of Tyranny of metrics.
  4. 10:50 Motivating video: DEK's advice to young students.
  5. 10:55 Preparation of schedule of talks for this term, topics to cover.
  6. 11:20 Varia, socializing (team builder wanted!), lunch.

Week 2  Research strategy, evaluation, and course schedule planning

Join us at A502, Faculty of Informatics MU, on February 29th at 10 AM (CET) [or on Zoom].

  • [9:50 Catering preparation (tea, cakes, ...)]
  • 10:00 Principles of research communication and scientific work: Put readers in your place!
    Specifics of CS research and doctoral studies and their evaluation at FI MU: CS conference rankings
  • 10:30 The importance of "selling the ideas and work," picking the right topics and questions, researching "big issues," and picking the right publication forums (in CS and NLP). The h-index as a measure of impact. The danger of Tyranny of metrics.
  • 10:45 Class round-up discussion (suggested presentations, readings, and presentation course schedule).

    Week 3 – Martin Čermák: Spatial Sharpening of Land Surface Temperature Data

    Join us at A502, Faculty of Informatics MU, on March 7th at 10 AM (CET) [or on Zoom].

    Remote sensing is an essential tool in efficiently gathering information about the Earth's surface on a global scale. This information can be used in various areas ranging from agriculture and forest management to battling the effects of heat islands in urban environments. There are many satellite missions with various types of sensors measuring different metrics. All come with their advantages and disadvantages. One of the most common trade-offs in remote sensing is spatial vs. temporal resolution due to physical constraints. This thesis explores the existing ways of enhancing the spatial resolution of land surface temperature (LST) images using environmental variables obtained at finer resolution. Additionally, we form and evaluate a hypothesis that chosen predictors that model anthropogenic heat flux will aid in improving the accuracy of the overall sharpening model.

    Chapter contains: 1 image, 1 study materials, 1 video, 1 study text.
    The teacher recommends studying from 25/2/2024 to 9/3/2024.

    Week 4 – Martin Kňažovič, Jan Franěk: Interactive Learning Tool Utilizing AI

    Join us at A502, Faculty of Informatics MU, on March 14th at 10 AM (CET) [or on Zoom].

    With recent improvements in large language models (LLMs), their use has become relevant for most daily tasks. Yet many areas underutilize the full potential of LLMs. One such area is learning, which our project focuses on. Our main goal is to create a platform that helps students by integrating LLMs to speed up their learning. However, there are obstacles in this area; for example, study materials come in different formats (videos, PDFs, and slides), which we want to address. In this presentation, we will analyze available tools for learning enhancement, propose a possible solution, and evaluate its results.

    Chapter contains: 1 image, 1 PDF, 1 video, 1 study text.
    The teacher recommends studying from 7/3/2024 to 22/3/2024.

    Week 5 – 21. 3. No contact meeting (EACL)

    Week 6 – 28. 3. No contact meeting (spa)

    Week 7 – 4. 4. No contact meeting (spa) 

    Week 8 – Katarína Hudcovicová

    Join us at A502, Faculty of Informatics MU, on April 11th at 10 AM (CET) and on Zoom.

    Previous work on natural language inference has examined how well transformer models can reason with text. What was still lacking was an answer to whether they can understand the logical semantics of natural language. This is mainly because the previously studied logical problems can be, depending on their structure, more or less computationally complex. It is therefore unclear whether lower performance is due to the difference in computational complexity or to an inability to comprehend the logical semantics of natural language. The authors chose the model-checking problem, as its computational complexity is always PTIME. The results suggest that the form and type of language used significantly affect how well transformer models perform. They can grasp some logical meanings in natural language but still fall short when learning the underlying algorithm of model-checking problems.

    Chapter contains: 1 image, 1 study materials, 1 video, 1 study text.
    The teacher recommends studying from 5/4/2024 to 12/4/2024.

    Week 9 – Jiří Žák 

    Join us at A502, Faculty of Informatics MU, on April 18th at 10 AM (CET) (the teacher will connect via Zoom).

    This research centers on automated invoice processing, entailing an analysis of existing methods and systems to construct a comprehensive overview. The objective is to develop a pipeline based on established software and to construct a corresponding testing framework. Leveraging this foundation, the aim is to refine the pipeline and evaluate its efficacy within the established testing framework. The subsequent findings shed light on the performance and potential enhancements of the automated invoice processing pipeline.

    Chapter contains: 1 image, 1 study materials, 1 video, 1 study text.
    The teacher recommends studying from 12/4/2024 to 19/4/2024.

    Week 10 – Zuzana Pitsmausová

    Join us at A502, Faculty of Informatics MU, on April 25th at 10 AM (CET).

    Feature construction (FC) is a crucial step in the machine learning pipeline, as the quality of features can significantly impact the model's performance. This presentation aims to acquaint listeners with feature construction and briefly overview the state-of-the-art FC methods. The primary focus of the presentation will be an experiment that was conducted using two FC frameworks based on genetic programming (GP) – Evolutionary Forest and M3GP.

    Chapter contains: 1 image, 1 PDF, 1 study text.
    The teacher recommends studying from 17/4/2024 to 26/4/2024.

    Week 11 – David Čechák

    Join us at A502, Faculty of Informatics MU, on May 2nd at 10 AM (CET) [or on Zoom].

    Messenger RNA (mRNA) decay is a crucial process in regulating gene expression, influencing cellular functions and organismal phenotypes. Precise identification of mRNA decay sites will help to understand the post-transcriptional control mechanisms that affect mRNA stability. In this presentation, we will discuss the prediction of mRNA decay sites using a DeBERTa transformer model. Our model is trained on sequences derived from direct RNA sequencing of polyadenylated RNA from HeLa cells, focusing on subsequences within mRNA coding regions to determine the presence of decay sites. Based on our model, we also analyze the role of single nucleotide variants in mRNA decay.

    Chapter contains: 1 video, 1 study text.
    The teacher recommends studying from 30/4/2024 to 10/5/2024.

    Week 12 – no lecture, individual consultations only

    Join us at A502, Faculty of Informatics MU, on May 9th at 10 AM (CET) [or on Zoom].

    Week 13 – Marek Kadlčík

    Join us at A502, Faculty of Informatics MU, on May 16th at 10 AM (CET) [or on Zoom].

    ICLR is among the most impactful ML conferences (A* in CORE ranking). In this presentation, we will report on our paper, which was accepted and presented in the main track. We will comment on the main research directions in empirical NLP and show you the highlights of the research that caught our attention during the event.

    Chapter contains: 10 images, 1 video, 1 study text.
    The teacher recommends studying from 9/5/2024 to 24/5/2024.

    Tips for readings, discussions, and presentation preparations:

    1. Top2Vec towardsdatascience.com/top2vec-new-way-of-topic-modelling 
    2. How to Speak by Patrick Winston (YouTube video)

    Žákovi, který se hrozil chyb, Mistr řekl: "Ti, kdo nedělají chyby, chybují nejvíc ze všech – nepokoušejí se o nic nového." Anthony de Mello: O cestě

    To a student who dreaded making mistakes, the Master said: "Those who make no mistakes are making the biggest mistake of all: they are not trying anything new." Anthony de Mello
