Sam Sucik

Compressing BERT for faster prediction

Let's look at compression methods for neural networks, such as quantization and pruning. Then we'll apply one of them to BERT using TensorFlow Lite.…