DataPane

This Not That (TNT) provides a tabular data viewer, the DataPane, that can be linked to selections in a PlotPane (and vice versa). We will outline the core functionality of the DataPane, show how to connect it with a data map plot, and look at some of the optional customization available for it.

The first step is to load thisnotthat and panel.

[1]:
import thisnotthat as tnt
import panel as pn

To make Panel-based objects interactive within a notebook we need to load the panel extension; for the DataPane we also need to ensure the tabulator extension is loaded.

[2]:
pn.extension('tabulator')

Now we need some data to use as an example. In this case we’ll use the Palmer Penguins dataset, which is easily available via seaborn; we will also drop rows with missing values and rename the columns for ease of use.

[3]:
import seaborn as sns

penguins = (
    sns.load_dataset('penguins')
    .dropna()
    .rename(
        columns={
            "bill_length_mm": "bill-length",
            "bill_depth_mm": "bill-depth",
            "flipper_length_mm": "flipper-length",
            "body_mass_g": "body-mass"
        }
    )
)

The penguins dataset consists of a series of measurements relating to three species of penguins (Adelie, Chinstrap, and Gentoo) found on three different islands (Torgersen, Biscoe, and Dream) in the Antarctic. We can glance at the first few rows to get a sense of the data.

[4]:
penguins.head()
[4]:
   species     island  bill-length  bill-depth  flipper-length  body-mass     sex
0   Adelie  Torgersen         39.1        18.7           181.0     3750.0    Male
1   Adelie  Torgersen         39.5        17.4           186.0     3800.0  Female
2   Adelie  Torgersen         40.3        18.0           195.0     3250.0  Female
4   Adelie  Torgersen         36.7        19.3           193.0     3450.0  Female
5   Adelie  Torgersen         39.3        20.6           190.0     3650.0    Male

We can instantiate a DataPane by handing it a dataframe to display; here we pass it the penguins dataframe. The object itself renders directly in a notebook (provided pn.extension("tabulator") has been run). By default we get a paginated table of the raw data. If running in a notebook you can interactively sort by columns, and the Download button at the bottom downloads the displayed data as a CSV file, which will make more sense once we look at the selected Param.

[5]:
data_view = tnt.DataPane(penguins)

data_view
[5]:

The primary Param of the DataPane is the selected attribute. Initially it is an empty list, in which case the full dataframe is displayed. However, the value of the attribute is dynamic and can be changed. In an interactive notebook session, setting the selected attribute to a list of numeric indices will select those numbered rows from the dataframe, reducing the displayed set to just the selected items. If you execute the cell below you will get an empty list.

[6]:
data_view.selected
[6]:
[]

We can, however, set the selected Param to [1,3,5,7,9]. The table view will then update to display only those five records, and the Download button will download the smaller dataframe containing just those five records.

[7]:
data_view.selected = [1,3,5,7,9]

Let’s reset the selected attribute so we have the full data table back again.

[8]:
data_view.selected = []
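
The behaviour of selected walked through above can be summarised in a small, library-free sketch. This is not TNT's implementation: rows_to_display is a hypothetical helper, and treating the indices as 0-based row positions (rather than dataframe index labels) is an assumption made for illustration.

```python
def rows_to_display(rows, selected):
    """Return the rows a table would show for the given selection."""
    if not selected:
        # An empty selection means "no filter": show every row.
        return list(rows)
    # Assumption: indices are 0-based row positions, not index labels.
    return [rows[i] for i in selected]


# Ten stand-in records in place of the penguins dataframe.
records = [f"penguin-{i}" for i in range(10)]

shown = rows_to_display(records, [1, 3, 5, 7, 9])  # five records
everything = rows_to_display(records, [])          # all ten records
```

The point to take away is the special case: an empty list does not mean "show nothing", it means the full table is shown.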

The point of the selected attribute is that we can link it to the selected items in a data map. This lets the user select interesting subsets or regions of the data map, immediately see the associated data records, and download them for further analysis. To see how this works we’ll need a data map. For that we’ll need some preprocessing for the numeric columns of the penguins data, and UMAP.

[9]:
from sklearn.preprocessing import RobustScaler
import umap

We can now build a data map out of the rescaled numeric penguins data, and create a PlotPane for it.

[10]:
data_for_umap = RobustScaler().fit_transform(penguins.select_dtypes(include="number"))
penguin_datamap = umap.UMAP(random_state=42).fit_transform(data_for_umap)
plot = tnt.BokehPlotPane(
    penguin_datamap,
    labels=penguins.species,
    hover_text=penguins.island,
    width=600,
    height=600,
    legend_location="top_right",
    title="Penguins data map",
)
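
For intuition about the rescaling step: with its default settings, scikit-learn's RobustScaler subtracts each column's median and divides by its interquartile range (the 25th to 75th percentile spread). Below is a minimal standard-library sketch of that calculation for a single column. The helper name robust_scale is hypothetical, the sample values are the five flipper lengths from the table above, and the percentile interpolation used here is an assumption rather than a guaranteed match for scikit-learn's output.

```python
import statistics


def robust_scale(values):
    """Centre on the median and scale by the interquartile range."""
    median = statistics.median(values)
    # Quartiles with linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]


flipper_lengths = [181.0, 186.0, 195.0, 193.0, 190.0]
scaled = robust_scale(flipper_lengths)
print(scaled)
```

Because the median and interquartile range are barely affected by a few extreme values, this kind of scaling puts columns with very different units (millimetres versus grams) on a comparable footing before UMAP computes distances.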

A quick visual check shows that our PlotPane data map looks like the sort of thing we want.

[11]:
plot.pane
[11]:

Now we need to link our DataPane and the PlotPane together. We could use the link method to explicitly link the selected Params of each, but it is simpler to use the link_to_plot method of the DataPane, which only requires us to specify the PlotPane we wish to link with. With this done we can create a simple Column layout of the PlotPane and our DataPane.

[12]:
data_view.link_to_plot(plot)
pn.Column(plot, data_view)
[12]:

Now, if running in a notebook, selecting items in the plot with the lasso selection tool will reduce the data table view to just the selected items. We can also go the other way, and select records in the table (using the check boxes on the left side), and see them reflected in the plot.
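
To make the two-way behaviour concrete, here is a toy, library-free sketch of two objects whose selected lists are kept in sync. It is only an illustration of the idea: Selectable and link_selected are hypothetical names, and this is not how Panel, Param, or TNT implement linking.

```python
class Selectable:
    """Toy stand-in for a pane with a `selected` list (not a TNT class)."""

    def __init__(self):
        self._selected = []
        self._watchers = []

    @property
    def selected(self):
        return self._selected

    @selected.setter
    def selected(self, indices):
        indices = list(indices)
        if indices == self._selected:
            return  # unchanged: stops updates bouncing back and forth
        self._selected = indices
        for callback in self._watchers:
            callback(indices)

    def watch(self, callback):
        """Register a function to call whenever `selected` changes."""
        self._watchers.append(callback)


def link_selected(a, b):
    """Keep a.selected and b.selected in sync in both directions."""
    a.watch(lambda indices: setattr(b, "selected", indices))
    b.watch(lambda indices: setattr(a, "selected", indices))


toy_plot = Selectable()
toy_table = Selectable()
link_selected(toy_plot, toy_table)

toy_plot.selected = [1, 3, 5]   # like a lasso selection in the plot
print(toy_table.selected)       # the table follows the plot
toy_table.selected = [2, 4]     # like ticking check boxes in the table
print(toy_plot.selected)        # the plot follows the table
```

The equality check in the setter is the important design detail: without it, each update would trigger the other object's watcher indefinitely.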

It is possible to style the data tables; for details we refer you to the Panel documentation on Tabulator. A large number of the style and configuration options for the Tabulator widget are available in the DataPane using the same argument names.