19. What Does BERT Look At? An Analysis of BERT’s Attention (2019)

Large pre-trained neural networks such as BERT have had great recent success in NLP.

Proposes methods for analyzing the attention mechanisms of pre-trained models and applies them to BERT.
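
To give a feel for what "analyzing attention" can mean in practice, here is a minimal sketch. It assumes a head's attention map is already available as a list of rows, where row i holds the weights token i places on every token. The two statistics (entropy and average weight at a fixed offset) and all function names are illustrative choices for these notes, not a reproduction of the paper's code.

```python
import math

def attention_entropy(row):
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in row if p > 0)

def mean_entropy(attn):
    """Average entropy over all query positions of one head.

    Low values suggest a focused head; high values suggest broad attention.
    """
    return sum(attention_entropy(row) for row in attn) / len(attn)

def mean_attention_at_offset(attn, offset):
    """Average weight that position i places on position i + offset.

    offset=-1 measures attention to the previous token, offset=+1 to the next.
    """
    n = len(attn)
    weights = [attn[i][i + offset] for i in range(n) if 0 <= i + offset < n]
    if not weights:
        raise ValueError("offset is out of range for this sequence length")
    return sum(weights) / len(weights)

# Toy 3-token attention map for one hypothetical head (each row sums to 1).
toy_head = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
]

print("mean entropy:", mean_entropy(toy_head))
print("attention to previous token:", mean_attention_at_offset(toy_head, -1))
```

With a real model, the toy matrix would be replaced by the per-head attention weights returned by whatever library is used to run BERT.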