Large Language Models (LLM)
PA154 Language Modeling (12.2)
Pavel Rychly
pary@fi.muni.cz
May 14, 2024

From LM to Chat
■ an LM generates a continuation of the prompt
■ prompt engineering
■ fine-tuning on chat texts
■ some Q&A/chat texts are already in the training data

Prompt engineering
■ simple prompt (Q&A):
  Q: {question}
  A:
■ general knowledge:
  Generate some knowledge about the concepts in the input.
  Input: {question}
  Knowledge:
■ task specific:
  If {premise} is true, is it also true that {hypothesis}? ||| {entailed}

Chain-of-thought
■ a direct answer is often not correct for complex questions
■ solve the problem in steps
■ Q: {question}
  A: Let's think step by step.

Generated training data
■ there are never enough texts with the right Q&As
■ generate data from a pattern, especially for chain-of-thought
■ variation of variables (numbers in math, ...)
■ generated using an LLM

RLHF
Reinforcement Learning from Human Feedback
[Diagram: prompt data and ranking data (pairs of prompts and responses) are used to train the supervised model into an aligned model, the learned RL policy after applying RLHF]

Alignment
■ data annotated by humans
■ training using RLHF
■ eliminates toxicity and bias

Foundation models
■ an LLM without fine-tuning
■ can be used to adapt to a new domain/language
■ fine-tuned for a specific task:
  ■ chat
  ■ question answering
  ■ summarization

LoRA
Low-Rank Adaptation of Large Language Models
■ fine-tune only a small fraction of parameters
■ usually only the attention matrices
[Diagram: pretrained weights W ∈ R^(d×d) are kept frozen; the input x also passes through a low-rank update]
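
The LoRA update can be written as W' = W + (alpha/r) · B A with A ∈ R^(r×d), B ∈ R^(d×r) and r ≪ d, so only the small factors A and B are trained. A minimal PyTorch sketch of this idea follows; the class name LoRALinear and the hyperparameters r and alpha are illustrative choices, not part of the slides.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch: wrap a pretrained linear layer with a trainable low-rank update."""
    def __init__(self, pretrained: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = pretrained
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained weights
        d_out, d_in = pretrained.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # low-rank factor A
        self.B = nn.Parameter(torch.zeros(d_out, r))        # B starts at zero, so the update is zero at first
        self.scale = alpha / r

    def forward(self, x):
        # frozen path + scaled low-rank correction (B @ A has rank at most r)
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# usage: wrap e.g. an attention projection of a pretrained model
layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))

Only A and B (2·r·d parameters) receive gradients, which is the small fraction of parameters the slide refers to; in practice the wrapper is applied to the attention matrices.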