My Step-by-step Process for Updating to Rasa Open Source 2.0

This blog post documents everything I did to update the wellness check bot for Rasa Open Source 2.0. If you'd like an overview of what changed in 2.0, I recommend checking out our release announcement, deep dive webinar, and migration guide.

Before I began, there were three main areas of the bot that I knew I'd need to pay special attention to.

Training data. The 1.x bot's training data was in markdown format. To bring it up to speed with 2.0, I'd need to convert it to YAML.
Forms. In Rasa Open Source 2.0, Forms moved out of the Rasa SDK and into the main library, and the FormPolicy was replaced by the RulePolicy. I would need to make some adjustments to my config, domain, stories, and custom action code to take advantage of the new implementation.
Rules. The 2.0 release introduced the concept of rules, which enforce a fixed action or series of actions in response to a condition or intent. I knew I'd be able to simplify some of my existing stories by moving some of the functionality into a rule instead.

Setting up the environment and updating the Rasa Open Source version

The first thing I did was create a new branch on my repository, to keep my changes separate from the main branch while I worked on them. I created a new virtual environment in my project directory and installed Rasa Open Source 2.1.0, which was the latest version at the time I made these updates.

Shell

pip install rasa==2.1.0

Converting training data from markdown to YAML

One of the most visible changes between Rasa Open Source 1.x and 2.x is the move from markdown to YAML for training data. The next thing I wanted to tackle was updating my training data files to the new format.

Throughout the process, I kept the migration guide from the docs open in my browser. The migration guide includes helpful resources for making the update, including scripts you can run to automatically convert your training data files to YAML.

The first thing I did was rename my existing `data` directory to `old_data` and create a new, empty `data` directory to hold the converted training data files. When I ran the conversion scripts, `old_data` would be the source directory and `data` would be the target directory. I started by running the script for my NLU training data:

Shell

rasa data convert nlu -f yaml --data=old_data --out=data

This created a new file in my (previously empty) data directory called nlu_converted.yml. When I opened this file, I could see that my markdown NLU training data was now in the 2.x YAML format.
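
The directory shuffle and the conversion commands in this post can also be scripted. The following is a minimal sketch, not part of the original project: the helper names `prepare_dirs` and `convert_command` are made up for illustration, and the only Rasa-specific details are the `rasa data convert` arguments shown in this post (`nlu` or `core`, `-f yaml`, `--data`, `--out`).

```python
from pathlib import Path
from typing import List


def prepare_dirs(project: Path) -> None:
    """Rename data/ to old_data/ and create a fresh, empty data/ directory."""
    (project / "data").rename(project / "old_data")
    (project / "data").mkdir()


def convert_command(kind: str, source: str = "old_data", target: str = "data") -> List[str]:
    """Build the `rasa data convert` call for kind 'nlu' or 'core'."""
    if kind not in ("nlu", "core"):
        raise ValueError(f"unsupported training data kind: {kind}")
    return ["rasa", "data", "convert", kind, "-f", "yaml",
            f"--data={source}", f"--out={target}"]


if __name__ == "__main__":
    # Printing instead of executing keeps the sketch free of side effects.
    # To actually run a command, pass the list to subprocess.run(..., check=True).
    print(" ".join(convert_command("nlu")))
```

Building the command as a list rather than a single string avoids shell quoting problems if a directory name ever contains spaces.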

So far so good! The next step was to do the same thing for my stories.md file:

Shell

rasa data convert core -f yaml --data=old_data --out=data

Again, I had a new file in my data directory, called stories_converted.yml. But this time, the conversion script printed a warning:

Training data file 'old_data/stories.md' contains forms. Any 'form' events will be converted to 'active_loop' events. Please note that in order for these stories to work you still need the 'FormPolicy' to be active. However the 'FormPolicy' is deprecated, please consider switching to the new 'RulePolicy', for which you can find the documentation here: https://rasa.com/docs/rasa/rules.

This warning just made sure I didn't overlook what I mentioned earlier: that the functionality previously controlled by the FormPolicy was now part of the RulePolicy. To resolve the warning, I would need to update my config.yml file, replacing FormPolicy with RulePolicy. The warning also noted that form events were now called active_loop events. Later, when I added rules and updated my stories, I would need to make sure I used the new terminology.

Updating the config and domain

The next thing I needed to do was update my config.yml file, replacing any deprecated policies (MappingPolicy, FallbackPolicy, and FormPolicy) with the RulePolicy, which contains the logic of all three.

Here's what my updated policies looked like:

YAML

policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy
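
A quick sanity check on the policy list can catch a leftover 1.x policy. The sketch below assumes the `policies` section has already been loaded into a list of dictionaries (for example with a YAML parser). The function names and the sample "old" configuration are illustrative and not taken from the wellness check bot.

```python
from typing import Any, Dict, List

# The three 1.x policies that this post replaces with the RulePolicy.
DEPRECATED_POLICIES = {"MappingPolicy", "FallbackPolicy", "FormPolicy"}


def find_deprecated_policies(policies: List[Dict[str, Any]]) -> List[str]:
    """Return the names of deprecated policies, in the order they appear."""
    return [p["name"] for p in policies if p.get("name") in DEPRECATED_POLICIES]


def has_rule_policy(policies: List[Dict[str, Any]]) -> bool:
    """Check that the RulePolicy is part of the policy list."""
    return any(p.get("name") == "RulePolicy" for p in policies)


# Hypothetical 1.x-style configuration, used only to exercise the check.
old_policies = [
    {"name": "MemoizationPolicy"},
    {"name": "FormPolicy"},
    {"name": "MappingPolicy"},
]

# The updated policies from this post.
new_policies = [
    {"name": "MemoizationPolicy"},
    {"name": "TEDPolicy", "max_history": 5, "epochs": 100},
    {"name": "RulePolicy"},
]

print(find_deprecated_policies(old_policies))  # ['FormPolicy', 'MappingPolicy']
print(find_deprecated_policies(new_policies))  # []
print(has_rule_policy(new_policies))           # True
```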

Next, I opened the domain.yml file and added version: "2.0" to the very top. This allows the domain file to be read as 2.0-compatible, which means you can actually split your domain file into smaller, more modular files if you like. In my case, the wellness check bot is still pretty small, so I left my domain in a single file.

Migrating forms to Rasa Open Source 2.0

The wellness check bot uses a form to ask the user six questions about their health habits. When the form is complete, the bot posts the data to a spreadsheet using the Airtable API.

In Rasa Open Source 2.x, a few notable changes were made to forms:

The FormPolicy was deprecated and bundled into the RulePolicy.
The Forms functionality was moved out of the Rasa SDK and into Rasa Open Source itself. That means it's possible to create a simple form, including defining the required slots and slot mappings, solely in the domain file. If you need to customize the form with custom slot mappings, validation, or conditional logic, you can still do so with a custom action by importing FormValidationAction from the Rasa SDK.
The submit() method used in version 1.x forms becomes an action, which should be invoked by a rule that defines what happens when the form is complete.

Since I'd already added the RulePolicy to my config file, I started by building out my form in the domain file. Before, the form section of my domain file looked like this:

YAML

forms:
  - health_form

This registered the name of the form, and all of the other form logic resided in my actions.py file. After I updated the forms section of domain.yml, it looked like this:

YAML

forms:
  health_form:
    confirm_exercise:
      - type: from_intent
        intent: affirm
        value: True
      - type: from_intent
        intent: deny
        value: False
      - type: from_intent
        intent: inform
        value: True
    exercise:
      - type: from_entity
        entity: exercise
    sleep:
      - type: from_entity
        entity: sleep
      - type: from_intent
        intent: deny
        value: "None"
    stress:
      - type: from_entity
        entity: stress
    diet:
      - type: from_text
        intent:
          - inform
          - affirm
          - deny
    goal:
      - type: from_text
        intent: inform
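
To make the mapping semantics concrete, here is a simplified model of how a list of slot mappings could be resolved, with the first applicable mapping winning. This is only an illustration of the idea and not Rasa's implementation: the function `resolve_slot` is hypothetical, and it ignores other options that Rasa supports.

```python
from typing import Any, Dict, List, Optional


def resolve_slot(
    mappings: List[Dict[str, Any]],
    intent: str,
    entities: Dict[str, Any],
    text: str,
) -> Optional[Any]:
    """Return the value from the first mapping that applies, else None."""
    for mapping in mappings:
        allowed = mapping.get("intent")
        if isinstance(allowed, str):
            allowed = [allowed]
        # A mapping with an intent restriction only applies to those intents.
        if allowed is not None and intent not in allowed:
            continue
        kind = mapping["type"]
        if kind == "from_intent":
            return mapping["value"]
        if kind == "from_entity" and mapping["entity"] in entities:
            return entities[mapping["entity"]]
        if kind == "from_text":
            return text
    return None


# Mappings mirroring the "sleep" and "diet" slots of health_form.
SLEEP_MAPPINGS = [
    {"type": "from_entity", "entity": "sleep"},
    {"type": "from_intent", "intent": "deny", "value": "None"},
]
DIET_MAPPINGS = [{"type": "from_text", "intent": ["inform", "affirm", "deny"]}]

print(resolve_slot(SLEEP_MAPPINGS, "inform", {"sleep": "8 hours"}, "I slept 8 hours"))
print(resolve_slot(SLEEP_MAPPINGS, "deny", {}, "no"))
print(resolve_slot(DIET_MAPPINGS, "inform", {}, "mostly veggies"))
```

In this model a message with the deny intent and no sleep entity fills the sleep slot with the string "None", and the diet slot takes the whole message text as long as the intent is one of the three listed.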

In the new version, I defined the form's required slots and slot mappings directly in the domain file. That meant that I could remove all of this Python code from my actions.py file, because the YAML above did the same thing:

Python

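from typing import Dict, List, Text, Union

from rasa_sdk.forms import FormAction  # 1.x-style form action


class HealthForm(FormAction):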

    def name(self):
        return "health_form"

    @staticmethod
    def required_slots(tracker):

        if tracker.get_slot('confirm_exercise') == True:
            return ["confirm_exercise", "exercise", "sleep",
             "diet", "stress", "goal"]
        else:
            return ["confirm_exercise", "sleep",
             "diet", "stress", "goal"]

    def slot_mappings(self) -> Dict[Text, Union[Dict, List[Dict]]]:
        """A dictionary to map required slots to
            - an extracted entity
            - intent: value pairs
            - a whole message
            or a list of them, where a first match will be picked"""

        return {
            "confirm_exercise": [
                self.from_intent(intent="affirm", value=True),
                self.from_intent(intent="deny", value=False),
                self.from_intent(intent="inform", value=True),
            ],
            "sleep": [
                self.from_entity(entity="sleep"),
                self.from_intent(intent="deny", value="None"),
            ],
            "diet": [
                self.from_text(intent="inform"),
                self.from_text(intent="affirm"),
                self.from_text(intent="deny"),
            ],
            "goal": [
                self.from_text(intent="inform"),
            ],
        }

Weeeell, almost. You see, the wellness check bot uses conditional logic to control which slots are required depending on how the user answers the first question ("Did you exercise yesterday?"). If the user says no, we skip the next slot, which asks what type of exercise they did, and move on to the third question. A cool thing about forms in 2.x is that if you don't need this kind of customization, you can pretty much define your entire form in the domain file. But in my case, I needed to write a form validation action that would replicate the conditional slot logic I had before.

Here's what that looked like in my actions.py file:

Python

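from typing import Any, Dict, Text

from rasa_sdk import Tracker
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.forms import FormValidationAction


class ValidateHealthForm(FormValidationAction):
    def name(self) -> Text:
        return "validate_health_form"

    async def validate_confirm_exercise(
        self,
        value: Any,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> Dict[Text, Any]:
        # The slot mappings fill confirm_exercise with True or False.
        if value:
            return {"confirm_exercise": True}
        # No exercise yesterday: pre-fill "exercise" so the form skips that question.
        return {"exercise": "None", "confirm_exercise": False}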
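
The branching inside `validate_confirm_exercise` can also be pulled into a plain function, which makes it testable without installing the Rasa SDK. This is a sketch of that idea; the name `exercise_slot_updates` is hypothetical and does not appear in the original project.

```python
from typing import Any, Dict


def exercise_slot_updates(confirmed: Any) -> Dict[str, Any]:
    """Return the slot values to set after the first form question.

    When the user did not exercise, the "exercise" slot is filled with the
    placeholder string "None" so that the form has nothing left to ask for it.
    """
    if confirmed:
        return {"confirm_exercise": True}
    return {"confirm_exercise": False, "exercise": "None"}


print(exercise_slot_updates(True))   # {'confirm_exercise': True}
print(exercise_slot_updates(False))  # {'confirm_exercise': False, 'exercise': 'None'}
```

The validation method would then only need to return the result of this function for the incoming slot value.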

The last change I needed to make was to make the form submission a custom action, instead of calling it using the submit() method in the FormAction class the way I did in 1.x. In this bot, the submit function passes all of the slot values to another function (which posts the data to the Airtable API) and then utters the message, "Thanks, your answers have been recorded!" In a later step, I would create a rule to call this custom action when the form is complete.

Here's what my updated submit function looked like in actions.py:

Python

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionSubmitResults(Action):
    def name(self) -> Text:
        return "action_submit_results"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        # Read the values collected by health_form.
        confirm_exercise = tracker.get_slot("confirm_exercise")
        exercise = tracker.get_slot("exercise")
        sleep = tracker.get_slot("sleep")
        stress = tracker.get_slot("stress")
        diet = tracker.get_slot("diet")
        goal = tracker.get_slot("goal")

        # create_health_log is the project's helper that posts to the Airtable API.
        create_health_log(
            confirm_exercise=confirm_exercise,
            exercise=exercise,
            sleep=sleep,
            stress=stress,
            diet=diet,
            goal=goal,
        )

        dispatcher.utter_message(text="Thanks, your answers have been recorded!")
        return []
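
If the list of slots grows, reading them one by one gets repetitive. One possible alternative, sketched here under the assumption that the helper's keyword arguments match the slot names, is to collect the values in a loop. `collect_slots` and `FakeTracker` are hypothetical names used only for this illustration; `FakeTracker` stands in for the SDK's tracker so the example runs on its own.

```python
from typing import Any, Callable, Dict, Iterable

# The six slots filled by health_form.
HEALTH_SLOTS = ("confirm_exercise", "exercise", "sleep", "stress", "diet", "goal")


def collect_slots(
    get_slot: Callable[[str], Any],
    names: Iterable[str] = HEALTH_SLOTS,
) -> Dict[str, Any]:
    """Look up each slot by name and return the values as a dictionary."""
    return {name: get_slot(name) for name in names}


class FakeTracker:
    """Stand-in for the SDK tracker that exposes only get_slot()."""

    def __init__(self, slots: Dict[str, Any]) -> None:
        self._slots = slots

    def get_slot(self, name: str) -> Any:
        return self._slots.get(name)


tracker = FakeTracker({"confirm_exercise": False, "exercise": "None", "sleep": "7 hours"})
values = collect_slots(tracker.get_slot)
print(values["sleep"])  # 7 hours
print(values["goal"])   # None
```

Inside the action, the result could then be unpacked into the helper call, for example `create_health_log(**collect_slots(tracker.get_slot))`.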

And then finally, I made sure to list the names of the custom actions I just created in my domain file:

YAML

actions:
- action_submit_results
- validate_health_form

Updating rules and stories

I was getting close, but there was still one major part of my assistant I needed to update: the rules and stories. Rules are a new concept introduced with Rasa Open Source 2.0. They're designed to control single-turn interactions that should always happen the same way, regardless of what happened before. They consist of a condition or intent that invokes the rule, followed by one or more actions that the bot should take. Stories are different from rules because they map out multi-turn dialogues. Typically, stories should be used when the bot's next action depends on a sequence of events instead of a single condition.

To start with, I had the stories.yml file that had been generated by the conversion script. The next thing I did was create a new file called rules.yml in my data directory.

Since I'd just been working with forms, creating rules and stories to submit and activate the form seemed like a good place to start. First, I wrote a rule to call the submit action when the form was complete.

YAML

version: "2.0"
rules:

  - rule: Submit form
    condition:
    # Condition that form is active.
    - active_loop: health_form
    steps:
    - action: health_form
    - active_loop: null
    # The action we want to run when the form is submitted.
    - action: action_submit_results
    - action: utter_slots_values

Let's walk through what's happening in the rule. Earlier, we noted that form events are now called active_loop events. The condition that activates this rule is that the health_form loop is active. The form keeps running until all of the slots have been filled, at which point the form is deactivated and active_loop is set to null. The next action I listed is my submit action, action_submit_results, followed by another action, utter_slots_values, which is a response template that repeats the user's answers back to them.
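
The loop behavior described above can be pictured with a toy model: keep requesting the first empty required slot, and stop once nothing is left. This is a simplified illustration of the idea and not Rasa's code, and `next_slot_to_request` is a hypothetical name.

```python
from typing import Any, Dict, List, Optional

REQUIRED_SLOTS = ["confirm_exercise", "exercise", "sleep", "diet", "stress", "goal"]


def next_slot_to_request(required: List[str], slots: Dict[str, Any]) -> Optional[str]:
    """Return the first required slot that is still empty, or None when done."""
    for name in required:
        if slots.get(name) is None:
            return name
    return None


# The user said they did not exercise, so validation pre-filled "exercise".
slots = {"confirm_exercise": False, "exercise": "None"}
print(next_slot_to_request(REQUIRED_SLOTS, slots))  # sleep

# Once every slot has a value there is nothing left to ask. In the rule this
# corresponds to `active_loop: null`, followed by action_submit_results.
slots.update({"sleep": "7 hours", "diet": "veggies", "stress": "low", "goal": "run more"})
print(next_slot_to_request(REQUIRED_SLOTS, slots))  # None
```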

I used a rule to call the submit action because the follow-up steps hinge on a single event (the condition of a form being in progress), regardless of context.

That covered submitting the form, but what about activating it? For that, I decided to use a story. I chose a story because the conversation flow that starts my form relies on some contextual back-and-forth:

YAML

version: "2.0"
stories:
- story: survey happy path
  steps:
  - intent: greet
  - action: utter_greet
  - intent: affirm
  - action: health_form
  - active_loop: health_form

When the user says "hi", the assistant responds with a greeting. The same utter_greet response template also asks the user if they want to log their health data for the day. The affirm intent is technically the intent that triggers the form, but I only want the form to trigger when affirm is preceded by an exchange of greetings; I don't want the form to activate any time the user says "yes." Because context mattered, I went with a story instead of a rule. However, there are some cases where you might use a rule to trigger a form instead, for example, if a specific intent (like subscribe_newsletter) always triggers the form, regardless of context.

The next few rules I created were for single-turn exchanges. The wellness check bot handles a few FAQs about healthy diet and exercise, and I want the bot to answer the question and keep going, no matter when in the conversation the user asks the question. Those rules looked like this:

YAML

  - rule: Ask exercise question
    steps:
    - intent: ask_exercise
    - action: utter_exercise_info

I also did the same for conversation snippets like greetings, goodbyes and thank yous:

YAML

  - rule: Thanks
    steps:
    - intent: thankyou
    - action: utter_no_worries
    - action: utter_goodbye

Each time I created a rule, I deleted the story that had defined the same behavior in version 1.x. This allowed me to go from 10 stories to 4. Separating some of the dialogue snippets into rules helped me see much more clearly which conversation turns followed a fixed, single-turn behavior, and which conversations relied on things that had happened earlier in the flow.

However, the goal isn't to convert as many stories as possible to rules. One important difference between stories and rules is that stories teach the model to generalize, so it can make predictions about dialogues that don't exactly match the data it's seen before. Rules, on the other hand, don't generalize. If you have an assistant built entirely out of rules and the user does something that doesn't match any of them, the assistant will simply fail. So we want to make sure we still have stories, to give the model a chance to recover when it's presented with something new.
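
A deliberately crude analogy shows why rules alone are brittle. If rules are pictured as an exact lookup from an intent to a fixed list of actions, anything outside the table has no answer. The RulePolicy does more than a dictionary lookup, so treat this only as a mental model; the `ask_weather` intent is made up for the example.

```python
from typing import Dict, List, Optional

# Single-turn rules from this post, pictured as an exact lookup table.
RULES: Dict[str, List[str]] = {
    "ask_exercise": ["utter_exercise_info"],
    "thankyou": ["utter_no_worries", "utter_goodbye"],
}


def apply_rules(intent: str) -> Optional[List[str]]:
    """Return the fixed actions for an intent, or None if no rule matches."""
    return RULES.get(intent)


print(apply_rules("thankyou"))     # ['utter_no_worries', 'utter_goodbye']
print(apply_rules("ask_weather"))  # None
```

Stories fill that gap: they give the model examples to generalize from when a conversation doesn't match any rule.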

Conclusion

After making these updates, I was ready to train a new model and confirm the migration was successful. If you'd like to try the bot yourself or take a closer look at the code, you can find the full project on GitHub.

While the wellness check bot is much simpler than many of the assistants you'll see running in production, that's what makes it a good teaching tool. It allows us to focus on just a few features without distractions. If you'd like to take a look at more examples of Rasa assistants that have been updated for 2.0 (including a few that are more complex), you can check out the projects listed here:

We're standing by in the Rasa forum if you have any feedback or questions about Rasa Open Source 2.0. Try it out and let us know what you think!