Integrating Rasa with knowledge bases

Knowledge bases: Encode domain knowledge and resolve entities in Rasa

In this tutorial, you will learn about:

  • How you can use knowledge bases to bring your assistant’s understanding to the next level.
  • The kinds of knowledge bases and what challenges you can solve using them.
  • And, find out how to query knowledge bases from within your custom actions.

In a separate tutorial, you can also learn how to set up your own knowledge base.

UPDATE:
We added Knowledge Base Actions to Rasa. Try it out and share your feedback on the Rasa Community Forum.

Example
A contextual assistant that can shine with informed responses about, for example, any bank in Germany, goes beyond predefined answers. Certain knowledge and detail are needed to satisfy the user’s request.

Let’s take a look at an example:

A lot of the above questions require domain knowledge to be correctly answered. Users want to not only ask questions about a certain entity, they also want to compare entities, or get information about entities mentioned earlier in the conversation. Hard-coding the information would not help. Recent transactions change  quickly, and up-to-date information is important. Additionally, your bot should be able to resolve references to previously mentioned entities like “that account” or “the first one,” so entities need to be recognised and reused at a later point in the conversation.

What is a knowledge base?

A knowledge base can be used to store complex data structures. The data stored in your knowledge base represent your domain knowledge. For example, you can store information about several banks, or the relationship between a bank and its customers. Different technologies exist that help you store those data in an efficient way. To learn more about knowledge bases, please read the follow-up tutorial “Set up a knowledge base to encode domain knowledge for Rasa”.

How to use knowledge bases alongside Rasa?

Before stepping into the details, let’s take a look at the general flow of how messages are processed in Rasa. Rasa interprets messages and uses custom actions to query the knowledge base and integrate the domain knowledge into the responses:

You can find the example implementation, referred to as banking bot in this tutorial, here. The banking bot is able to answer questions about your bank accounts.

Note: As every bot is different and your knowledge base is likely to look different, this tutorial provides you with an example and encourages you to implement something on your own.

Let’s take a look at the steps you need to do to integrate a knowledge base into Rasa:

Step 1: Building the knowledge base

As a first step, you need to set up your knowledge base. In the tutorial “Set up a knowledge base to encode domain knowledge for Rasa” you will learn step-by-step how to achieve this.

Step 2: Adding new training data

To be able to understand what the customer actually wants, you need to generate new training data. Keeping in mind the example dialogue from the beginning: banking bot has four intents and a couple of stories related to domain knowledge. In the following steps, you will find just a few examples per intent and just a single example story. You can find the complete training data here.

To generate new training data, you need to do three things:

Define new intents

You will notice that the new intents of banking bot are kept generic. However, the entities are specific. If you keep the intents generic, your model is more flexible, and you do not need to create a new intent for every new entity you add to your knowledge base. For example, if you just have one generic intent, called query_attribute, for querying a specific attribute of all kind of entities, you just need to add more examples to the intent when adding a new entity. You don’t need to touch your stories at all.

Let’s take a look at the new intents of banking bot:

query_entities
The user asks to list some entities of a specific type.

query_attribute
Users want to know more details about a certain entity. They are asking about a specific attribute of that entity.

compare_entities
The user was confronted with a list of entities. Now the user wants to compare those entities by a certain attribute.

resolve_entity
Users were given a choice of entity options and they have to state what entity they meant earlier.


Define new stories

Using the identified intents, you can convert the dialogue from the beginning of this tutorial into a Rasa story:

## happy path
* greet
  - utter_greet
* query_entities
  - action_query_entities
  - slot{"entity_type": "bank"}
* compare_entities
  - action_compare_entities
* query_entities
  - action_query_entities
  - slot{"entity_type": "transaction"}
* query_attribute
  - action_query_attribute
* query_attribute
  - action_query_attribute
* resolve_entity
  - action_resolve_entity
* bye
  - utter_goodbye

As you can see, banking bot defines a custom action per intent as every intent requires you to query the knowledge base in a different way. In step 3 you will define those custom actions. Be sure to also add some unhappy paths to your stories. You can do this using interactive learning or the share your bot feature.

Update your domain file

After defining your new intents and stories, you need to add the intents, actions, and entities to your domain file. You can find the domain file of the banking bot here.

Step 3: Adding custom actions

To query your knowledge base, you will use custom actions. Custom actions allow you to react to a user’s utterance in a flexible way. As an example, you will see how to write the custom action action_query_attribute, and how certain edge cases are handled.

Write the custom action action_query_attribute

To write a custom action you need to write a new class that inherits from Action and override the methods name and run. The method name should return the name of the action. The method run contains the logic. So, let’s start with the following skeleton:

class ActionQueryAttribute(Action):
    def name(self):
        return "action_query_attribute"

    def run(self, dispatcher, tracker, domain):
        pass

Banking bot wraps the code for actually querying the knowledge base in a class called GraphDatabase, as banking bot uses a graph database to store the domain knowledge internally. So the first thing you need to do in the method run is to initialize your GraphDatabase.

graph_database = GraphDatabase()

Let’s assume the user asks the following:

The query to retrieve that information from a graph database might look like this:

match $bank isa bank, has name ‘N26’, has headquarters $x; get $x;

If you would use a relational database, you query might look like this:

SELECT headquarters FROM bank WHERE name = ‘N26’

In order to construct the above query, you need to have four things:

  1. entity of interest (e.g. N26)
  2. entity type of the entity of interest (e.g. bank)
  3. attribute of interest (e.g. headquarters)
  4. key attribute of the entity type (e.g. name)

Banking bot treats the first three of the above things as entities. In general, banking bot makes heavy use of slots and depends on the NER to detect them. Banking bot has a slot for each entity type and attribute defined in the knowledge base. Additionally, the following slots exist:

  • listed_items: list of entities, such as “[N26, DKB, Deutsche Bank]”
  • entity_type: entity type the user is asking about, such as “bank”
  • mention: reference to an entity that was mentioned before, such as “first”
  • attribute: name of the attribute the user is looking for, such as “headquarters”

Using these slots and the intent that was detected, you can build your queries to obtain the needed information from your knowledge base.

Let’s go back to the example: If your NER correctly extracted all the entities, Rasa will have set the slots for entity type, attribute, and bank during its NLU pipeline or previous conversation steps. In a custom action, you can get the value of a slot using the tracker’s get_slot method:

entity_type = tracker.get_slot("entity_type")
attribute = tracker.get_slot("attribute")
entity_name = tracker.get_slot(entity_type) # entity_type = "bank"

Let’s assume all the slots have been set. To query your knowledge base one thing is missing: the key attribute of the entity type (e.g. name). To get the key attribute, you need to define the database schema as a dict in the code. All entities with their corresponding attributes should be listed. Additionally, each entity should have a key attribute and a list of attributes that should be used to “print” that entity. For example, the schema for bank would look like the following:

schema = {
...
"bank": {
   "attributes": [
       “name",
       “headquarters”,
       "country",
       "english-website",
       "english-mobile-app",
       "allowed-residents",
       "official-name",
       "free-accounts",
       "free-worldwide-withdrawals",
       "english-customer-service",
   ],
   "key": "name",
   "representation": ["name"],
},
...
}

Using this dictionary you are able to retrieve the key attribute of the entity type bank. Now, you have all information to actually query the knowledge base:

key_attribute = schema[entity_type]["key"]

value = graph_database.get_attribute_of(entity_type, key_attribute, entity_name, attribute)

graph_database.get_attribute_of() just constructs and executes the query you saw above and returns the result. As a final step you can now utter a response:

if value is not None:
    dispatcher.utter_message(
        f"{entity_name} has the value '{value}' for attribute     '{attribute}'."
    )

The complete code of the action can be found here.

Resolve synonyms
Take a look at the following questions:

  • “Where is the main office of N26?”
  • “Where can I find the HQ of DKB?”

The NER would detect “HQ” and “main office” as entity attribute. In order to fetch the requested knowledge from your knowledge base, you need to resolve that “main office” and “HQ” are the same attribute as “headquarters”. For that purpose banking bot uses mapping tables stored in the knowledge base. Basically, mapping tables are just a mapping as the name already implies. So “HQ”, for example, is mapped to “headquarters”. Banking bot defines a mapping table for attributes and entity types to resolve synonyms:

  • attribute_mapping: maps attribute synonyms to the attributes used in the knowledge base, e.g. HQ is mapped to headquarters
  • entity_type_mapping: maps entity types of all kind to the entity types used in the knowledge base, e.g. people is mapped to person

So, what do you need to change now? Instead of just calling:

attribute = tracker.get_slot("attribute")

You need to add the following line of code:

attribute = graph_database.map("attribute-mapping", attribute)

This way you can make sure that the attribute you are using in your query exists. However, you have to make sure to add all kinds of variations of your attributes to that mapping table, otherwise the query will fail.

The same needs to be done for the entity type:

entity_type = tracker.get_slot("entity_type")
entity_type = graph_database.map("entity-type-mapping", entity_type)
Resolve mentions like “the first one”

Let’s take a look at the conversation:

The user refers to the first bank, e.g. N26. In order to query for the headquarters, you need to resolve this reference. How can we do that? The following line of code does not work, as the bank isn’t explicitly mentioned, and so NER can not detect a bank in the question “What is the headquarters of the first one?”.

entity_name = tracker.get_slot(“bank”)

However, the NER hopefully detected “first one” as entity mention. Additionally, banking bot has a mapping table called mention_mapping that maps certain mentions to an index position. So, you can use the found mention together with the mention_mapping to get the index position of the required entity.

mention = tracker.get_slot("mention")
if mention is not None:
   idx = int(graph_database.map("mention-mapping", mention))

Now you have the index. Whenever banking bot lists a couple of entities in the action query_entities the entities are stored in the slot called listed_items as list. You can reuse that list to retrieve the actual entity:

mention = tracker.get_slot("mention")
listed_items = tracker.get_slot("listed_items")

if mention is not None and listed_items is not None:
   idx = int(graph_database.map("mention-mapping", mention))

   if type(idx) is int and idx < len(listed_items):
       return listed_items[idx]

Now you’ve successfully resolved “first one” to N26.

Code of all custom actions

The other actions work similarly. You extract the entities the NER found, and if needed, use mapping tables to resolve entities. If all necessary parts are there, a query is constructed and the knowledge base is queried. The result is then used to formulate a response and to fill slots if required. Take a look at the code to see how these steps are followed to implement the other actions.

Feedback

If you have any questions about this tutorial or this repository, feel free to share them on Rasa Community Forum.