[Marek Petrovič]: One Bit at a Time: Impact of Quantisation on NMT Robustness 17. 3. 2022
Abstract
Quantization is one of the methods for making neural networks faster and smaller, and it has been studied thoroughly for a variety of models. It has already been shown that the BERT model can be quantized for integer-only inference with a 3x speed-up, and several papers evaluate the quantization of Transformer models for NMT. The drawback of quantization is a possible decrease in accuracy. Quantization Aware Training addresses this by preparing the model for quantization, simulating quantization effects already during training; this may also act as a regularizer for NMT models. In our work, we want to explore the available quantization modes and their effect on NMT models' inference speed and memory efficiency, with a special focus on domain robustness (regularization effects).
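To make the quantization noise discussed above concrete, below is a minimal sketch (an illustration, not the method used in the talk) of symmetric per-tensor int8 quantization of a weight matrix. The round-trip error it prints is the same effect that Quantization Aware Training exposes to the model during training via "fake quantization"; the function names and the symmetric scheme are assumptions for illustration only.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 with a symmetric per-tensor scale."""
    scale = np.abs(w).max() / 127.0  # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights; the rounding error is the
    quantization noise a model must tolerate at inference time."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max quantization error:", np.abs(w - w_hat).max())
```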