Since its inception, the Tableau mission has been to help people make better decisions by allowing them to see and understand their data. This includes helping organizations solve one of the most pressing challenges in data science today: getting the value of advanced analytical insights into the hands of business decision makers. Industry analysts report that many data science efforts fail to deliver return on investment, a key reason being the communication gap between data science teams and business decision makers. Communicating the process and results of data science work in Tableau can close the gap with accessible interactivity and exploration for business stakeholders.
In this post, I’m excited to share the latest Python integration features we’ve built to support data science and diverse analytics environments in Tableau.
TabPy 1.0
Python is ubiquitous among analysts and data scientists across industries, with applications ranging from cleaning and shaping data to implementing cutting-edge machine learning algorithms. To better support data science integrations in Tableau at scale, our team has been working to expand the features and security of our Python server, TabPy. Tableau has supported dynamic integration with Python via TabPy in Desktop and Server since version 10.3 and in Prep since version 2019.3. You can find great use cases for integrating Python in Tableau in:
• Sessions from TC18 and TC19
• Blog post on scripting in Tableau Prep
• How-to guide for understanding and using table calculations
Since the initial release, we have made improvements to TabPy based on your requests and feedback. Now, with the release of Tableau 2020.1, we are happy to designate TabPy as a 1.0 release, indicating that it is an officially supported Tableau product and is ready for scaled-up use. Read on to learn more about the features available in TabPy 1.0.
Streamlined install
In its initial iteration, TabPy was installed as a Python pip package called tabpy-server and, by default, required an installation of the Anaconda data science framework. TabPy also required a second package for deploying functions called tabpy-client. To make this process easier and more streamlined for our users, we’ve combined the functionality of both packages into a single package called tabpy and removed the dependence on Anaconda. TabPy running in an Anaconda virtual environment is still a great solution, but it can be easily installed in other Python setups as well. To install TabPy today on any machine with a Python 3.6+ environment, simply run:
pip install tabpy
To start the TabPy server, from the command line run:
tabpy
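Before connecting Tableau, it can help to confirm that the server is actually responding. The snippet below is a minimal sketch rather than an official tool: it assumes TabPy is listening on its default port, 9004, over plain HTTP without authentication, and that the server's `/info` endpoint is reachable. The helper names `info_url` and `fetch_info` are my own.

```python
import json
from urllib.request import urlopen


def info_url(host="localhost", port=9004):
    """Build the URL of TabPy's /info endpoint (9004 is assumed as the default port)."""
    return f"http://{host}:{port}/info"


def fetch_info(host="localhost", port=9004, timeout=5):
    """Return the server's self-description as a dict; raises if the server is unreachable."""
    with urlopen(info_url(host, port), timeout=timeout) as response:
        return json.load(response)
```

With the server running locally, calling `fetch_info()` should return a dictionary describing the server. If you have enabled HTTPS or authentication, this unauthenticated check will need to be adapted.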
Once TabPy is running, you can connect Tableau Desktop by navigating to Help > Settings and Performance > Manage External Service Connections and entering your connection information.
In Tableau Server, a connection can be configured by running the TSM security command.
Pre-built statistical functions
Once TabPy is installed and the server is running, you can install a library of pre-built statistical functions from the same machine by running this command:
tabpy-deploy-models
These functions include analysis features like Principal Component Analysis (PCA), Sentiment Analysis, a t-test, and ANOVA. Once installed, any of these functions can be called by name from any Tableau Desktop or Server connected to TabPy. For example, the t-test function can be used for web A/B testing.
The tabpy_tools library that ships with TabPy allows you to define and deploy your own Python functions, including scoring with machine learning models. To try it yourself, simply use these instructions.
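To give a feel for what deploying your own function looks like, here is a small sketch. The `welch_t` function and the `deploy_welch_t` helper are hypothetical names chosen for illustration, and the endpoint URL assumes a local server on the default port. The `Client` import path and the `deploy` call follow the tabpy_tools documentation as I understand it, so check them against the version you have installed.

```python
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (illustrative only)."""
    standard_error = (variance(group_a) / len(group_a)
                      + variance(group_b) / len(group_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error


def deploy_welch_t(endpoint="http://localhost:9004/"):
    """Publish welch_t to a running TabPy server under the name 'welch_t'."""
    # Imported lazily so the statistic can be tried out without tabpy installed.
    from tabpy.tabpy_tools.client import Client

    client = Client(endpoint)
    # override=True replaces any previously deployed function with the same name.
    client.deploy("welch_t", welch_t,
                  "Welch's t statistic for two samples", override=True)
```

Running `deploy_welch_t()` against a live server should make the function callable by name from Tableau, in the same way as the pre-built functions.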
Secured connections and authentication
TabPy supports secure connections over HTTPS using SSL, as well as username and password authentication using HTTP basic authentication. Secured connections can be configured in the TabPy configuration file as shown here. Starting with Tableau 2020.1, Tableau Desktop and Server read SSL certificates from the OS keystore and do not require a certificate to be specified in Tableau. Authentication is configured through a utility included in the tabpy package and is documented here.
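If you want to test a secured server from outside Tableau, the sketch below shows how a basic-authentication request could be put together with the Python standard library. It only builds the request and does not send it. The helper name `build_evaluate_request` is hypothetical, and the `/evaluate` endpoint with its `data` and `script` fields reflects my reading of the TabPy REST API, so treat the payload shape as an assumption to verify.

```python
import base64
import json
from urllib.request import Request


def build_evaluate_request(base_url, username, password, script, args):
    """Build (but do not send) a POST to /evaluate with a basic-auth header."""
    # Basic authentication sends "username:password", base64-encoded.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Assumed payload shape: named arguments under "data", code under "script".
    body = json.dumps({"data": args, "script": script}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/evaluate",
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The resulting object can be passed to `urllib.request.urlopen`. Because basic authentication only encodes the credentials and does not encrypt them, it makes sense to use it together with HTTPS.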
Quick configuration
TabPy can be started with custom configuration settings that are defined in a configuration file that is specified on starting the server. Find the specifications for the configuration file and a sample here. Configurable features include SSL, Authentication, Logging, Max Data Size, and Timeout. To start TabPy using a custom configuration, add the config startup parameter as in this example:
tabpy --config=path/to/my/config/file.conf
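As a rough illustration of what such a file might contain, the sketch below generates one with Python's `configparser`. The `[TabPy]` section and the `TABPY_*` key names are written from my recollection of the sample configuration, and every path and value is a placeholder, so verify them against the specification linked above before relying on them. The helper name `render_tabpy_config` is hypothetical.

```python
import configparser
import io


def render_tabpy_config(settings):
    """Render a dict of settings as the text of a [TabPy] configuration section."""
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep the upper-case names intact.
    parser.optionxform = str
    parser["TabPy"] = settings
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values only; key names are assumptions to check against the docs.
example_settings = {
    "TABPY_PORT": "9004",
    "TABPY_TRANSFER_PROTOCOL": "https",
    "TABPY_CERTIFICATE_FILE": "path/to/certificate/file.crt",
    "TABPY_KEY_FILE": "path/to/key/file.key",
    "TABPY_PWD_FILE": "path/to/password/file.txt",
    "TABPY_LOG_DETAILS": "true",
    "TABPY_MAX_REQUEST_SIZE_MB": "100",
    "TABPY_EVALUATE_TIMEOUT": "30",
}

config_text = render_tabpy_config(example_settings)
```

Writing `config_text` to a file and passing that file's path to the `--config` parameter would start the server with those settings. Generating the file from code like this is just one option; editing the sample file by hand works equally well.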
Enhanced logging
We’ve expanded TabPy’s logging features to support auditing of Python code run against the server and tracking which users ran what code. When connected to Tableau Server, this can be set to record the Server user’s Tableau username. Find instructions for configuring logging here.
With all of these features, we’ve made dynamic Python in Tableau more flexible and powerful than ever before. We’re always looking at what’s next, though, so please reach out with your questions and feedback.
For more information on Tableau features for developers, and to show off your skills in DataDev Hackathons, check out the Tableau Developer Program.
Nathan Mannheimer is Product Manager at Tableau.