MUNI FI

Introduction
PA154 Language Modeling (1.1)
Pavel Rychlý
pary@fi.muni.cz
February 21, 2024

PA154 - Technical Information
■ Slides in IS: https://is.muni.cz/autln/el/fi/jaro2025/PA154/
■ Final written exam (open book, without interactive apps): 60 points, 30 points for grade E
■ Optional individual projects: up to 30 points

Individual projects
■ presentation on new research in language modeling
■ small project as part of bigger collaborative projects
  ■ neural machine translation
  ■ lexical acquisition
■ small task
  ■ describe errors in ChatGPT
  ■ annotation of a language resource

Language model
■ model
  ■ (mathematical) abstractions
  ■ similar/same behavior as the modeled object
■ language model
  ■ model a natural language

Language models - what are they good for?
■ assigning scores to sequences of words
■ predicting words
■ generating text
■ machine translation
■ automatic speech recognition
■ optical character recognition
■ chat, question answering

Predicting words
■ Do you speak ...
■ Would you be so ...
■ Statistical machine ...
■ Faculty of Informatics, Masaryk ...
■ WWII has ended in ...
■ In the town where I was ...
■ Lord of the ...
ChatGPT and other LLMs are language models.

Generating text

MT + OCR

Language models - probability of a sentence
■ An LM is a probability distribution over all possible word sequences.
■ What is the probability of an utterance $s$?

Probability of a sentence
$p_{LM}(\text{Catalonia President urges protests})$
$p_{LM}(\text{President Catalonia urges protests})$
$p_{LM}(\text{urges Catalonia protests President})$

Ideally, the probability should strongly correlate with the fluency and intelligibility of a word sequence.
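
The idea that a language model assigns a probability to a word sequence, and can also predict the next word, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the slides: the four-sentence toy corpus, the choice of a bigram model with add-one smoothing, and the function names `train_bigram`, `bigram_prob`, `sentence_logprob` and `predict_next`.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration only).
CORPUS = [
    "Catalonia President urges protests",
    "President urges calm",
    "Catalonia President visits Madrid",
    "protests continue in Catalonia",
]

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def tokenize(sentence):
    """Split on whitespace and add sentence boundary markers."""
    return [BOS] + sentence.split() + [EOS]


def train_bigram(corpus):
    """Count bigrams (prev -> next word) and collect the vocabulary."""
    bigrams = defaultdict(Counter)
    vocab = set()
    for sentence in corpus:
        tokens = tokenize(sentence)
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[prev][word] += 1
    return bigrams, vocab


def bigram_prob(bigrams, vocab, prev, word):
    """Add-one (Laplace) smoothed estimate of p(word | prev)."""
    context = bigrams.get(prev, Counter())
    return (context[word] + 1) / (sum(context.values()) + len(vocab))


def sentence_logprob(bigrams, vocab, sentence):
    """Log-probability of a sentence as a sum of bigram log-probabilities."""
    tokens = tokenize(sentence)
    return sum(
        math.log(bigram_prob(bigrams, vocab, prev, word))
        for prev, word in zip(tokens, tokens[1:])
    )


def predict_next(bigrams, prev):
    """Most frequent continuation of `prev` seen in training, or None."""
    context = bigrams.get(prev)
    if not context:
        return None
    return context.most_common(1)[0][0]


if __name__ == "__main__":
    bigrams, vocab = train_bigram(CORPUS)
    for s in (
        "Catalonia President urges protests",
        "President Catalonia urges protests",
        "urges Catalonia protests President",
    ):
        print(f"{sentence_logprob(bigrams, vocab, s):8.3f}  {s}")
    print("President ->", predict_next(bigrams, "President"))
```

On this toy corpus the fluent word order receives the highest score and the most scrambled order the lowest, which mirrors the ranking the three example sentences are meant to suggest. A real language model would be trained on far more data and would typically use a stronger estimator than add-one smoothing.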