SimpleDataPane
This Not That (TNT) provides a tabular data viewer, the SimpleDataPane, that can link to selections in a PlotPane. For more complicated interactions, such as viewing selections made in the table back in the plot, see the DataPane. We will outline the core functionality of the SimpleDataPane and how to connect it with a data map plot, and look at some of the optional customization available for the SimpleDataPane.
The first step is to load thisnotthat and panel.
[1]:
import thisnotthat as tnt
import panel as pn
To make Panel-based objects interactive within a notebook we need to load the panel extension; for the SimpleDataPane, unlike the more featureful DataPane, we do not need the tabulator extension, which can be useful if internet connectivity is limited.
[2]:
pn.extension()
Now we need some data to use as an example. In this case we’ll use the Palmer Penguins dataset, which we can easily load via seaborn; we will also drop rows with missing values and rename the columns for ease of use.
[3]:
import seaborn as sns
penguins = (
    sns.load_dataset('penguins')
    .dropna()
    .rename(
        columns={
            "bill_length_mm": "bill-length",
            "bill_depth_mm": "bill-depth",
            "flipper_length_mm": "flipper-length",
            "body_mass_g": "body-mass"
        }
    )
)
The penguins dataset consists of a series of measurements relating to three species of penguins (Adelie, Chinstrap, and Gentoo) found on three different islands (Torgersen, Biscoe and Dream) in the Antarctic. We can glance at the first few rows to get a sense of the data.
[4]:
penguins.head()
[4]:
|   | species | island | bill-length | bill-depth | flipper-length | body-mass | sex |
|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | Male |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | Female |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | Female |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |
| 5 | Adelie | Torgersen | 39.3 | 20.6 | 190.0 | 3650.0 | Male |
We can instantiate a SimpleDataPane by simply handing it a dataframe to display; in this case we pass it the penguins dataframe. The object itself renders directly in a notebook. By default we get a table of the raw data, restricted to a maximum number of rows and columns displayed. The Download button at the bottom downloads the displayed data as a CSV file, which will make more sense once we look at the selected Param.
[5]:
data_view = tnt.SimpleDataPane(penguins)
data_view
[5]:
We can set the maximum number of rows and columns to display at creation time, using the max_rows and max_cols arguments.
[14]:
data_view = tnt.SimpleDataPane(penguins, max_rows=10, max_cols=10)
data_view
[14]:
The primary Param of the SimpleDataPane is the selected attribute. Initially it is an empty list, in which case the full dataframe is displayed. However, the value of the attribute is dynamic and can be changed. In an interactive notebook session, setting the selected attribute to a list of numeric indices selects those numbered rows from the dataframe, reducing the displayed set to just the selected items. If you execute the cell below you will get an empty list.
[15]:
data_view.selected
[15]:
[]
We can, however, set the selected Param to [1,3,5,7,9]. The table view then updates to display only those five records, and the Download button now downloads the smaller dataframe containing only those five records.
[16]:
data_view.selected = [1,3,5,7,9]
Let’s reset the selected attribute so we have the full data table back again.
[17]:
data_view.selected = []
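The behaviour of selected described above can be approximated with plain pandas. The following is only a sketch: rows_for_selection is a hypothetical helper, not part of TNT, and it assumes that the indices are row positions rather than index labels (the two differ here because dropna leaves gaps in the index).

```python
import pandas as pd

# Toy frame whose index has gaps, as the penguins frame does after dropna().
df = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
        "body-mass": [3750.0, 3800.0, 4925.0, 5400.0],
    },
    index=[0, 1, 4, 5],
)


def rows_for_selection(frame, selected):
    """Hypothetical helper: an empty selection means 'show everything'."""
    if len(selected) == 0:
        return frame
    # Assumption: the indices are row positions, not index labels.
    return frame.iloc[selected]


full_view = rows_for_selection(df, [])
subset = rows_for_selection(df, [1, 3])

# A download of the current view would then contain only the subset.
csv_text = subset.to_csv(index=False)
print(subset)
print(csv_text)
```

Under that assumption, selecting [1, 3] returns the second and fourth rows, whose index labels are 1 and 5.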
The purpose of the selected attribute is that we can link it to selected items in a data map, allowing the user to select interesting subsets or regions of the data map, immediately see the associated data records, and download them for further analysis if the set is interesting. To see how this works we’ll need a data map. For that we’ll need some preprocessing for the numeric columns of the penguins data, and UMAP.
[9]:
from sklearn.preprocessing import RobustScaler
import umap
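The preprocessing matters because the numeric columns are on very different scales (body mass in the thousands, bill depth in the tens). With default settings, RobustScaler centres each column on its median and divides by its interquartile range. A minimal standard-library sketch of that formula follows; robust_scale is a hypothetical helper, and exact agreement with scikit-learn's percentile interpolation should be treated as an assumption.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Centre on the median and divide by the interquartile range (IQR)."""
    # The "inclusive" method interpolates linearly between data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    centre = median(values)
    iqr = q3 - q1
    return [(v - centre) / iqr for v in values]


# One extreme value (100) does not change how the other points are scaled.
column = [1, 2, 3, 4, 100]
scaled = robust_scale(column)
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

Because the median and the quartiles ignore the extreme value, the bulk of the data keeps a comparable spread in every column, which is what a distance-based method such as UMAP needs.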
We can now build a data map out of the rescaled numeric penguins data, and create a PlotPane for it.
[10]:
data_for_umap = RobustScaler().fit_transform(penguins.select_dtypes(include="number"))
penguin_datamap = umap.UMAP(random_state=42).fit_transform(data_for_umap)
plot = tnt.BokehPlotPane(
    penguin_datamap,
    labels=penguins.species,
    hover_text=penguins.island,
    width=600,
    height=600,
    legend_location="top_right",
    title="Penguins data map",
)
A quick visual check shows that our PlotPane data map looks like the sort of thing we want.
[11]:
plot.pane
[11]:
Now we need to link together our SimpleDataPane and the PlotPane. We could use the link method to explicitly link together the selected Params of each, but we can do this more simply by using the link_to_plot method of the SimpleDataPane, which requires us only to specify the PlotPane we wish to link with. With this done we can create a simple Column layout of the PlotPane and our SimpleDataPane.
[12]:
data_view.link_to_plot(plot)
pn.Column(plot, data_view)
[12]:
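Conceptually, linking the plot to the table means that any change to the plot's selected value is copied to the table's selected value. The sketch below illustrates that one-way propagation with a pure-Python stand-in. The Selectable class and its link_selected_to method are hypothetical and are not how Panel or TNT implement linking. With the real objects, an explicit link would presumably follow Panel's convention, something like plot.link(data_view, selected="selected"), but that call is an assumption here.

```python
class Selectable:
    """Hypothetical stand-in for an object with a watchable `selected` list."""

    def __init__(self):
        self._selected = []
        self._watchers = []

    @property
    def selected(self):
        return self._selected

    @selected.setter
    def selected(self, value):
        self._selected = list(value)
        # Notify every linked object of the new selection.
        for callback in self._watchers:
            callback(self._selected)

    def link_selected_to(self, target):
        """One-way link: changes to self.selected are copied to target.selected."""
        self._watchers.append(lambda new: setattr(target, "selected", new))


plot_stub = Selectable()   # plays the role of the PlotPane
table_stub = Selectable()  # plays the role of the SimpleDataPane
plot_stub.link_selected_to(table_stub)

plot_stub.selected = [2, 5, 8]  # e.g. a lasso selection in the plot
print(table_stub.selected)

table_stub.selected = []        # changing the table does not affect the plot
print(plot_stub.selected)
```

The link in this sketch runs in one direction only, from plot to table, mirroring the plot-to-table direction described for the SimpleDataPane.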
Now, if running in a notebook, selecting items in the plot with the lasso selection tool will reduce the data table view to just the selected items. We can also retrieve the dataframe currently shown in the table (here the full data, since nothing is selected) via the selected_dataframe property. In the output below, rows are numbered by row_num and the original index is kept in an original_index column:
[13]:
data_view.selected_dataframe
[13]:
| row_num | original_index | species | island | bill-length | bill-depth | flipper-length | body-mass | sex |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | Male |
| 1 | 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | Female |
| 2 | 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | Female |
| 3 | 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |
| 4 | 5 | Adelie | Torgersen | 39.3 | 20.6 | 190.0 | 3650.0 | Male |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 328 | 338 | Gentoo | Biscoe | 47.2 | 13.7 | 214.0 | 4925.0 | Female |
| 329 | 340 | Gentoo | Biscoe | 46.8 | 14.3 | 215.0 | 4850.0 | Female |
| 330 | 341 | Gentoo | Biscoe | 50.4 | 15.7 | 222.0 | 5750.0 | Male |
| 331 | 342 | Gentoo | Biscoe | 45.2 | 14.8 | 212.0 | 5200.0 | Female |
| 332 | 343 | Gentoo | Biscoe | 49.9 | 16.1 | 213.0 | 5400.0 | Male |
333 rows × 8 columns