Laboratory of Electronic and Multimedia Applications (Research Section)

[Marek Petrovič]: One Bit at a Time: Impact of Quantisation on NMT Robustness 16. 3. 2023

Abstract

Quantization is one of the methods for making neural networks faster and smaller, and the quantization of various models has been studied thoroughly. It has already been shown that the BERT model can be quantized for integer-only inference with a 3x speed-up, and several papers evaluate the quantization of Transformer models for NMT. The drawback of quantization is a possible decrease in accuracy. Quantization Aware Training tries to counter this by preparing the model for quantization, estimating the quantization effects already during training. This behavior might also act as a regularizer for NMT models. In our work, we want to explore the available quantization modes and their effect on the inference speed and memory efficiency of NMT models, with a special focus on domain robustness (regularization effects).
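A minimal sketch of the "fake quantization" idea behind Quantization Aware Training, for illustration only: weights are rounded to a low-bit grid in the forward pass while gradients flow through unchanged (straight-through estimator). The function name, bit width, and tensors below are assumptions, not taken from the talk.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate uniform quantization of `x` during training (hypothetical helper).

    Rounding is applied in the forward pass only; the straight-through
    estimator passes gradients as if the rounding were the identity.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = qmin - torch.round(x.min() / scale)
    # Quantize and immediately dequantize so the tensor stays in float.
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    dq = (q - zero_point) * scale
    # Straight-through estimator: forward uses dq, backward sees the identity.
    return x + (dq - x).detach()

# Usage: wrap a layer's weights during training so the model "feels" the
# quantization error it will face at integer-only inference time.
w = torch.randn(4, 4, requires_grad=True)
loss = fake_quantize(w).sum()
loss.backward()            # gradients still reach w thanks to the STE
print(w.grad is not None)  # True
```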

Readings

2022-03-17-petrovic.mp4
Recording of Marek Petrovič's lecture