For the NLP community, last week brought a new addition to the BERT family, this time from Google itself: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. (ALBERT came out of Google Research, while the original BERT was the fruit of the labor of Google AI Language. On a side note, I do wonder how Google structures its AI teams.)

Blessed be the fruit!

ALBERT sets a new state of the art on multiple NLP tasks, with the added benefit of being smaller than BERT-large. (Considering that I had to use mixed precision to get even BERT-base fine-tuned on a dedicated NVIDIA GPU with 16 GB of memory, any further downsizing will certainly be appreciated.)


And, sure enough, the arXiv (actually, it is Google yet again, now in the guise of Google AI) does not disappoint. Let me just quote the authors of Extreme Language Model Compression with Optimal Subwords and Shared Projections:

Our method is able to compress the BERT-base model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB.

Kind of a big deal!


In another effort to decrease the size of the state-of-the-art Transformer architecture, Facebook AI Research pulled off a cool Dropout trick. To be fair, Dropout was pretty cool to begin with: who would have guessed that randomly switching units in a network off during training would lead to a [sometimes very significant] increase in performance? Reducing Transformer Depth on Demand with Structured Dropout takes this idea one step further, dropping entire layers during training. This makes it possible to extract much smaller subnetworks for inference, essentially through some strategic pruning.
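To make the layer-dropping idea a bit more concrete, here is a toy sketch in plain Python. It is not the authors' implementation: ToyLayer, forward and prune_every_other are hypothetical names made up for illustration, the "layers" are trivial residual blocks rather than real Transformer layers, and keeping every other layer is just one simple pruning rule assumed here.

```python
import random


class ToyLayer:
    """Hypothetical stand-in for a Transformer layer: a residual block x + f(x)."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, x):
        # Residual form: skipping this layer is the same as applying the identity.
        return [v + self.scale * v for v in x]


def forward(layers, x, drop_prob=0.0, training=False, rng=random):
    """Run x through the stack; in training, skip whole layers at random."""
    for layer in layers:
        if training and rng.random() < drop_prob:
            continue  # drop the entire layer for this forward pass
        x = layer(x)
    return x


def prune_every_other(layers):
    """Assumed pruning rule: keep layers 0, 2, 4, ... for a shallower network."""
    return layers[::2]


layers = [ToyLayer(0.1) for _ in range(12)]
rng = random.Random(0)  # seeded so the random layer drops are reproducible

# Training-style pass: each layer is skipped with probability 0.5.
train_out = forward(layers, [1.0, 2.0], drop_prob=0.5, training=True, rng=rng)

# Inference-style pass on the pruned, half-depth stack: no randomness involved.
small = prune_every_other(layers)
infer_out = forward(small, [1.0, 2.0])
print(len(layers), "layers during training,", len(small), "at inference")
```

The intuition is that a network trained this way has already seen many forward passes with layers missing, so removing some of them for good at inference time should hurt far less than pruning a network trained without layer dropping.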


And, because not everything in life is about NLP and/or Transformers, here is a review paper on what will only become a bigger problem at the rate that deep learning is advancing: Deep Learning for Deepfakes Creation and Detection. Conveniently, it covers approaches to both creating and detecting deepfakes (photos and videos of people that have been generated with AI yet are indistinguishable from the real thing to the human eye). So, pick your side.