Visualise Word-Embeddings with Whatlies

Today I am happy to announce that we're open-sourcing a new tool: whatlies.

It is a tool that creates visualisations of word embeddings to help you figure out "what lies" in word embeddings. It invites play, and it can create interactive visualisations that are shareable on the internet. Here are two examples of what it can produce.

These two charts show where words sit in embedding space after the dimensions have been reduced by PCA and UMAP respectively. But the library supports many more operations, and you can also visualise embeddings along custom axes. Below is an example where we try to remove gender bias.
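To make the two ideas above concrete, here is a small numpy sketch (this is not the whatlies API itself, and the vectors are made-up toys rather than real word embeddings): projecting embeddings down to two dimensions with PCA, and "debiasing" a vector by removing its component along a gender direction.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 50))   # ten made-up 50-d "word vectors"

# Dimensionality reduction with PCA via SVD: centre the data, then
# project onto the top two right singular vectors.
centered = emb - emb.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T      # (10, 2) points, ready to scatter-plot

# Debiasing sketch: a "gender direction" is the difference between two
# vectors, and we remove a word's component along that direction.
man, woman = emb[0], emb[1]       # stand-ins for real word vectors
gender = man - woman

def remove_direction(v, d):
    d = d / np.linalg.norm(d)
    return v - (v @ d) * d        # subtract the projection onto d

debiased = remove_direction(emb[2], gender)
print(coords.shape)                          # (10, 2)
print(np.isclose(debiased @ gender, 0.0))    # True: orthogonal to gender
```

The key property is that the debiased vector is exactly orthogonal to the gender direction, so that direction can no longer separate the word from others along it.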

Overview

To get an overview of what you can do with the package, we've also created a video that is hosted on our Algorithm Whiteboard on YouTube.

Note that this video only demonstrates a subset of the features. The library offers strong support for spaCy language models (including the Hugging Face ones) as well as sense2vec. We even provide a special syntax for contextual embeddings from BERT-style models. You can also find this package featured in the spaCy universe.
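As a rough sketch of what usage looks like (written from memory of the docs, so treat names and axis labels as assumptions; `en_core_web_md` is a spaCy model you would need to download first, and the final line requires the transformers backend), the spaCy-backed workflow is roughly:

```python
from whatlies.language import SpacyLanguage
from whatlies.transformers import Pca

lang = SpacyLanguage("en_core_web_md")   # wraps a pretrained spaCy model
words = ["man", "woman", "king", "queen", "cat", "dog"]
emb = lang[words]                        # an EmbeddingSet of word vectors
emb.transform(Pca(2)).plot_interactive("pca_0", "pca_1")

# For BERT-style models, the bracket syntax (if I recall the docs
# correctly) marks which token's contextual embedding you want:
#   lang["going to the [bank] of the river"]
```

This snippet is not run here because it downloads model weights; consult the whatlies documentation for the authoritative API.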

Getting Started

You can install the package via pip:

pip install whatlies

The documentation is full of getting-started guides as well as a full overview of the API. If you're interested in contributing, you can find the project on GitHub.