Machine Learning

Pruning BERT to accelerate inference

Having previously discussed various ways of accelerating models like BERT, in this blog post we empirically evaluate the pruning approach. You can read about the implementation…

Sam Sucik

Compressing BERT for faster prediction

Let's look at compression methods for neural networks, such as quantization and pruning. Then, we apply one to BERT using TensorFlow Lite…

Sam Sucik

Algorithms alone won’t solve conversational AI – Introducing Rasa X

We're excited to announce Rasa X, our new product for developers, now in early access. Also, our open source framework Rasa is now available in 1.0…

Alan Nichol

Rasa NLU in Depth: Part 3 – Hyperparameter Tuning

Part 3 of our Rasa NLU in Depth series covers hyperparameter tuning. We will explain how to use Docker containers to run a Rasa NLU hyperparameter search for the best NLU pipeline at scale.…

Tobias Wochinger

Rasa NLU in Depth: Part 1 – Intent Classification

Attention, Dialogue, and Learning Reusable Patterns

Our latest research paper introduces the new embedding policy (REDP), which is much better at dealing with uncooperative users than our standard LSTM.…

Alan Nichol

How to handle multiple intents per input using Rasa NLU TensorFlow pipeline

In this post we take a comprehensive look at how to use the Rasa NLU TensorFlow pipeline to build chatbots that can understand multiple intents per input.…

Justina Petraityte

Supervised Word Vectors from Scratch in Rasa NLU

We’ve released a new pipeline which is totally different from the standard Rasa NLU approach. It uses very little memory, handles hierarchical intents, messages containing…

Alan Nichol