SearchWidget

This Not That (TNT) provides a search widget that can search through the data in a dataframe for matches using a variety of search methods. We will outline the core functionality of the SearchWidget, show how to connect it with a data map plot, and look at some of the optional customization available for the SearchWidget.

The first step is to load thisnotthat and panel.

[1]:
import thisnotthat as tnt
import panel as pn

To make Panel based objects interactive within a notebook we need to load the panel extension.

[2]:
pn.extension()

Now we need some data to use as an example. In this case we’ll use the Palmer Penguins dataset, which is easily available through seaborn. We will also drop rows with missing values and rename the columns for ease of use.

[3]:
import seaborn as sns

penguins = (
    sns.load_dataset('penguins')
    .dropna()
    .rename(
        columns={
            "bill_length_mm": "bill-length",
            "bill_depth_mm": "bill-depth",
            "flipper_length_mm": "flipper-length",
            "body_mass_g": "body-mass"
        }
    )
)

The penguins dataset consists of a series of measurements for three species of penguins (Adelie, Chinstrap, and Gentoo) found on three different islands (Torgersen, Biscoe, and Dream) in the Antarctic. We can glance at the first few rows to get a sense of the data.

[4]:
penguins.head()
[4]:
   species     island  bill-length  bill-depth  flipper-length  body-mass     sex
0   Adelie  Torgersen         39.1        18.7           181.0     3750.0    Male
1   Adelie  Torgersen         39.5        17.4           186.0     3800.0  Female
2   Adelie  Torgersen         40.3        18.0           195.0     3250.0  Female
4   Adelie  Torgersen         36.7        19.3           193.0     3450.0  Female
5   Adelie  Torgersen         39.3        20.6           190.0     3650.0    Male

We can instantiate a SearchWidget by handing it a dataframe of data to search through; here we pass it the penguins dataframe. The object itself renders directly in a notebook (if pn.extension() has been run). Full interactivity requires an active Python kernel, so you will need to execute this in a notebook yourself to see the next steps in action.

[5]:
search = tnt.SearchWidget(penguins)

search

The primary result of search is the selected attribute. Initially it is an empty list. However, the value of the attribute is dynamic and will change when the widget is used in an interactive notebook session. If you execute the cell below you will initially get an empty list. If you then type a query into the search widget above (for example, search for “Dream”), hit the search button, and re-run the cell below, you’ll see that it holds a list of indices matching the search.

[6]:
search.selected
[6]:
[]
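To make the idea of "a list of indices matching the search" concrete, here is a minimal sketch of a keyword match over a small hand-made dataframe. It is not TNT's implementation: the helper name keyword_positions and the toy data are hypothetical, and the sketch assumes a plain substring match across all columns.

```python
import pandas as pd

# Small hand-made stand-in for the penguins dataframe (illustrative values only).
toy = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Chinstrap", "Gentoo"],
        "island": ["Torgersen", "Dream", "Dream", "Biscoe"],
    },
    index=[0, 1, 4, 5],  # gaps mimic rows removed by dropna()
)


def keyword_positions(df, keyword):
    """Hypothetical helper: positions of rows where any column contains `keyword`."""
    as_text = df.astype(str)
    # One boolean column per original column, then "any match in the row".
    mask = as_text.apply(lambda col: col.str.contains(keyword, regex=False)).any(axis=1)
    return [pos for pos, hit in enumerate(mask) if hit]


matches = keyword_positions(toy, "Dream")
print(matches)
```

The result is a plain Python list of row positions, which is the same shape of value that the selected attribute holds after a search.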

If you want the actual data associated with a search, the search.data attribute contains the dataframe being searched. You can use its iloc indexer to access rows by numerical position and retrieve the selected items that way. At first, when the selected list is empty, this will be an empty dataframe, but after running a search you can re-evaluate this cell to see the dataframe of search matches.

[7]:
search.data.iloc[search.selected]
[7]:
index species island bill-length bill-depth flipper-length body-mass sex
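Because dropna() leaves gaps in the index labels (note the missing label 3 in the head() output earlier), positional and label-based lookups are not interchangeable. The sketch below uses a small hand-made dataframe and a hard-coded list standing in for search.selected; both are assumptions for illustration only.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Adelie", "Adelie"],
        "island": ["Torgersen", "Torgersen", "Dream", "Dream"],
    },
    index=[0, 1, 2, 4],  # label 3 is missing, as after dropna()
)

selected = [2, 3]  # stand-in for search.selected after a search

# Positional lookup: third and fourth rows, whatever their labels are.
by_position = toy.iloc[selected]
print(by_position.index.tolist())

# An empty selection gives an empty dataframe with the same columns.
nothing_selected = toy.iloc[[]]
print(nothing_selected.empty)

# Label-based lookup is not equivalent: label 3 does not exist here.
try:
    toy.loc[selected]
    loc_failed = False
except KeyError:
    loc_failed = True
print(loc_failed)
```

This is why iloc, rather than loc, is the appropriate indexer when the selection is a list of row positions.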

In practice we likely want to attach a search widget to a data map. Let’s make a data map of the penguins data. For that we’ll need some sklearn preprocessing (to get our numeric data all on the same scale) and UMAP.

[8]:
from sklearn.preprocessing import RobustScaler
import umap

Now we apply UMAP to the rescaled numeric data from our penguins dataframe. We can pass the result directly into a BokehPlotPane to get a data map up and running.

[9]:
data_for_umap = RobustScaler().fit_transform(penguins.select_dtypes(include="number"))
penguin_datamap = umap.UMAP(random_state=42).fit_transform(data_for_umap)
plot = tnt.BokehPlotPane(
    penguin_datamap,
    labels=penguins.species,
    hover_text=penguins.island,
    width=400,
    height=400,
    legend_location="top_right",
    title="Penguins data map",
)
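To see why the rescaling step matters, consider what RobustScaler does with its default settings: it subtracts each column's median and divides by its interquartile range. The following sketch reproduces that arithmetic with NumPy on the bill-length and body-mass values from the first five rows shown earlier; it is an illustration of the idea, not a replacement for the scikit-learn class.

```python
import numpy as np

# bill-length (mm) and body-mass (g) from the first five rows of the dataframe.
X = np.array(
    [
        [39.1, 3750.0],
        [39.5, 3800.0],
        [40.3, 3250.0],
        [36.7, 3450.0],
        [39.3, 3650.0],
    ]
)

# Centre on the per-column median and scale by the per-column interquartile range.
median = np.median(X, axis=0)
q25, q75 = np.percentile(X, [25, 75], axis=0)
iqr = q75 - q25
X_scaled = (X - median) / iqr

print(X_scaled)
```

After this transformation every column has median 0 and interquartile range 1, so a column measured in grams no longer dominates the distances that UMAP computes over a column measured in millimetres.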

A quick visual check shows that our PlotPane data map looks like the sort of thing we want.

[10]:
plot.pane

Now we need to link our SearchWidget to the PlotPane. We could use the link method to explicitly link the selected Params of each, but it is simpler to use the link_to_plot method of the SearchWidget, which only requires us to specify the PlotPane we wish to link with. With this done we can create a simple Row layout of the PlotPane and our SearchWidget.

[11]:
search.link_to_plot(plot)
pn.Row(plot, search)

If you are running this interactively in a notebook you can now search and have the results of the search highlighted in the plot. It is worth exploring the different search options, including the pandas query, which accepts the syntax of the pandas query method (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html), allowing queries that involve numeric columns etc.
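One detail of the pandas query syntax is worth knowing for this dataset: column names that are not valid Python identifiers, such as the hyphenated bill-length, must be wrapped in backticks. The sketch below calls DataFrame.query directly on a small hand-made dataframe with illustrative values; that the widget forwards the typed text unchanged to this method is an assumption here, not something verified against the TNT source.

```python
import pandas as pd

# Illustrative values only, using the renamed (hyphenated) column names.
toy = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Chinstrap", "Gentoo"],
        "island": ["Torgersen", "Dream", "Dream", "Biscoe"],
        "bill-length": [39.1, 36.0, 46.5, 50.0],
        "body-mass": [3750.0, 3450.0, 3500.0, 5700.0],
    }
)

# Backticks let query() refer to a column whose name contains a hyphen.
expression = "island == 'Dream' and `bill-length` > 40"
result = toy.query(expression)
print(result)
```

Without the backticks, pandas would parse bill-length as the subtraction bill - length and the query would fail because no such columns exist.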

The current search widget offers limited options for customization: you can specify a title (in Markdown), a size (width and height), and a name. For example, we could adjust things as below.

[12]:
search = tnt.SearchWidget(
    penguins,
    title="# Penguins Search",
    width=250,
    height=500,
    name="TNT Search Widget",
)

search