Large Language Models (LLM)
PA154 Language Modeling (12.2)
Pavel Rychly
pary@fi.muni.cz
May 14, 2024

From LM to Chat
■ an LM generates a continuation of the prompt
■ prompt engineering
■ fine-tuning on chat texts
■ some Q&A/chat texts are already in the training data

Prompt engineering
■ simple prompt (Q&A):
  Q: {question}
  A:
■ general knowledge:
  Generate some knowledge about the concepts in the input.
  Input: {question}
  Knowledge:
■ task specific:
  If {premise} is true, is it also true that {hypothesis}? ||| {entailed}

Chain-of-thought
■ a direct answer is often not correct for complex questions
■ solve the problem in steps
■ Q: {question}
  A: Let's think step by step.

Generated training data
■ there are never enough texts with the right Q&As
■ generate data from a pattern, especially for chain-of-thought
■ variation of variables (numbers in math, ...)
■ generated using an LLM

RLHF
Reinforcement Learning from Human Feedback
[Diagram: prompt data and ranking data (pairs of prompts and responses) are used to train the supervised model into an aligned model, the learned RL policy after applying RLHF]

Alignment
■ data annotated by humans
■ training using RLHF
■ eliminates toxicity and bias

Foundation models
■ an LLM without fine-tuning
■ can be used to adapt to a new domain/language
■ fine-tuned for a specific task:
  ■ chat
  ■ question answering
  ■ summarization

LoRA
Low-Rank Adaptation of Large Language Models
■ fine-tune only a small fraction of parameters
■ usually only the attention matrices
[Diagram: pretrained weights W ∈ R^(d×d) are kept frozen; the input x also passes through a low-rank update]
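
The LoRA update can be written as W' = W + (alpha/r) · B A with A ∈ R^(r×d), B ∈ R^(d×r) and r ≪ d, so only the small factors A and B are trained. A minimal PyTorch sketch of this idea follows; the class name LoRALinear and the hyperparameters r and alpha are illustrative choices, not part of the slides.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch: wrap a pretrained linear layer with a trainable low-rank update."""
    def __init__(self, pretrained: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = pretrained
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained weights
        d_out, d_in = pretrained.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # low-rank factor A
        self.B = nn.Parameter(torch.zeros(d_out, r))        # B starts at zero, so the update is zero at first
        self.scale = alpha / r

    def forward(self, x):
        # frozen path + scaled low-rank correction (B @ A has rank at most r)
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# usage: wrap e.g. an attention projection of a pretrained model
layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))

Only A and B (2·r·d parameters) receive gradients, which is the small fraction of parameters the slide refers to; in practice the wrapper is applied to the attention matrices.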