Seminar group 02 of the course Laboratory of Electronic and Multimedia Applications

Michal Štefánik: Unsupervised Estimation of Out-of-Distribution Performance, 1 April 2021, 10:00

Abstract

Neural language models consistently advance the state of the art (SOTA) on a wide range of NLP tasks, but they do not perform consistently well under domain shift, i.e. when applied to samples from a different language domain. This prevents their deployment in some critical applications. The questionable comparability of models evaluated only in-domain also slows further research progress in this direction.

We propose a set of simple evaluation methods that estimate the expected performance of a system on out-of-distribution (OOD) samples. We show how well each of these methods corresponds to the true measured performance on OOD data and demonstrate the practical implications of our work in zero-shot evaluation.
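
The abstract does not detail the proposed estimators, so the following is only a minimal sketch of one generic, label-free baseline for estimating OOD performance: the average maximum softmax confidence over unlabeled OOD samples, which for a reasonably calibrated classifier tracks its accuracy. The function names and the synthetic ood_logits array are illustrative assumptions, not the methods from the talk.

import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Row-wise softmax with the usual max-shift for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def average_confidence(logits: np.ndarray) -> float:
    # Mean maximum softmax probability over unlabeled samples.
    # For a reasonably calibrated classifier, this is a crude, label-free
    # proxy for its accuracy on those same samples.
    probs = softmax(logits)
    return float(probs.max(axis=1).mean())

# Hypothetical usage: ood_logits stands in for a model's raw outputs on
# unlabeled out-of-distribution text, with shape (n_samples, n_classes).
rng = np.random.default_rng(0)
ood_logits = rng.normal(size=(1000, 3))
print(f"Estimated OOD accuracy: {average_confidence(ood_logits):.3f}")

The quality of such a proxy depends on calibration: an overconfident model overestimates its OOD accuracy, which is why calibration steps such as temperature scaling are commonly applied before using confidence as a performance estimate.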

Finally, we present a set of observations that refine our understanding of neural language models, based on the novel insights these evaluation methods provide.



Readings

Unsupervised Estimation of Out-of-Domain Performance of Language Models
Talk by Michal Štefánik at the PV173 NLP seminar
Increasing Data Efficiency: Hugging Face Quantifies the Benefits of Prompts for Pretrained Language Models
A research team from Hugging Face shows that prompting is indeed beneficial for fine-tuning pre-trained language models and that this benefit can be quantified as worth hundreds of data points on average across classification tasks.