MUNI FI

PA154 Language Modeling (1.1)
Introduction
Pavel Rychlý
pary@fi.muni.cz
February 16, 2023

Technical Information
■ Slides in IS: https://is.muni.cz/auth/el/fi/jaro2023/PA154/
■ Final written exam (online): 50 points, 25 points needed for grade E
■ Optional individual projects: up to 25 points

Individual projects
■ presentation on new research in language modeling
■ small project as a part of bigger collaborative projects
  ■ neural machine translation
  ■ lexical acquisition
■ small task
  ■ describe errors in ChatGPT
  ■ annotation of a language resource

Language model
■ model
  ■ (mathematical) abstraction
  ■ similar/same behavior as the modeled object
■ language model
  ■ a model of a natural language

Language models - what are they good for?
■ assigning scores to sequences of words
■ predicting words
■ generating text
■ statistical machine translation
■ automatic speech recognition
■ optical character recognition

Predicting words
■ Do you speak ...
■ Would you be so ...
■ Statistical machine ...
■ Faculty of Informatics, Masaryk ...
■ WWII has ended in ...
■ In the town where I was ...
■ Lord of the ...

Generating text
[figure]

MT + OCR
[figure]

Language models - probability of a sentence
■ An LM is a probability distribution over all possible word sequences.
■ What is the probability of utterance of s?

Probability of a sentence
■ p_LM(Catalonia President urges protests)
■ p_LM(President Catalonia urges protests)
■ p_LM(urges Catalonia protests President)

Ideally, the probability should strongly correlate with the fluency and intelligibility of a word sequence.
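
To make the idea of scoring word sequences concrete, here is a minimal sketch of a bigram language model with add-one smoothing. It is not taken from the slides: the toy corpus, the function names (train_bigram, log_prob, predict_next) and the choice of a bigram model are all assumptions made for illustration, and a real language model would be trained on far more text. It shows the two uses named above, assigning a score to a sentence and predicting the next word.

```python
import math
from collections import Counter

# Hypothetical toy corpus (an assumption for illustration only).
CORPUS = [
    "catalonia president urges protests",
    "president urges calm",
    "catalonia president urges calm",
    "students join protests",
]

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def train_bigram(sentences):
    """Count histories and bigrams; collect the set of predictable tokens."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = [BOS] + sentence.lower().split() + [EOS]
        vocab.update(tokens[1:])  # everything except BOS can be predicted
        for prev, cur in zip(tokens, tokens[1:]):
            history[prev] += 1
            bigrams[(prev, cur)] += 1
    return history, bigrams, vocab


def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one smoothed bigram model.

    Assumes every word of the sentence occurs in the training vocabulary.
    """
    history, bigrams, vocab = model
    tokens = [BOS] + sentence.lower().split() + [EOS]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from getting probability 0.
        total += math.log((bigrams[(prev, cur)] + 1) / (history[prev] + len(vocab)))
    return total


def predict_next(prev, model):
    """Most frequent word observed after `prev`, or None if `prev` is unseen."""
    _, bigrams, _ = model
    followers = {cur: n for (p, cur), n in bigrams.items() if p == prev}
    return max(followers, key=followers.get) if followers else None


if __name__ == "__main__":
    model = train_bigram(CORPUS)
    for s in (
        "Catalonia President urges protests",
        "President Catalonia urges protests",
        "urges Catalonia protests President",
    ):
        print(f"{log_prob(s, model):8.3f}  {s}")
    print("after 'urges':", predict_next("urges", model))
```

On this toy corpus the fluent word order receives the highest score and the most scrambled order the lowest, which is the behavior the last slide asks for. The scores are kept as log-probabilities because products of many small probabilities underflow quickly for longer sentences.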