Tableau’s agile, self-service nature comes with some data management concerns. How do you make sure data does not get out of hand? How do you ensure that Tableau processes do not break the existing Personally Identifiable Information (PII) control process?
In the old days, dashboards were developed only by a small group of developers, so education and control processes were much easier. With Tableau self-service, source data is accessed by hundreds of business analysts who can develop dashboards, publish them to the server, and decide the access policy (permissions) by themselves. Business analysts love the flexibility, but it can be a data privacy or data security nightmare if Tableau data security is not managed closely.
Detect and delete Personally Identifiable Information (PII) on Tableau Server
Tableau’s Encryption at Rest feature encrypts extracts sitting in the FileStore but does not encrypt data in memory or in transit over the network. Most organizations have a policy requiring certain types of PII data (such as SSN, payment card numbers, and date of birth) to be encrypted both at rest and in transit. Unless you use a special AWS configuration, your on-premises Tableau server will not meet this requirement. In other words, regular Tableau servers can’t hold those types of PII data. The question is: what if some publishers bring such PII data to the Tableau server anyway? Is there a way for the Tableau platform to detect such data and even delete it?
The answer is yes. Here is one example of a PII detection/deletion notification, which can be fully automated.
The how-to involves four steps.
- PII Reference Repository, or a PII Taxonomy as a minimum: Most likely this comes from your organization’s privacy team. It is the PII definition: a list of database names, schema names, table names and column names, along with the kind of PII each column holds. This is the starting point.
- Enable the Tableau lineage tables (if not done yet): run ‘tsm maintenance metadata-services enable’. The Tableau lineage tables will be populated with lineage data even without the Data Management Add-on. The tables can be accessed by the PostgreSQL ‘readonly’ user, although they are not available to any Tableau server users without the Data Management Add-on.
- Create a workbook that identifies all workbooks connected to any column in the PII Reference Repository or PII Taxonomy list.
- Act on the identified workbooks. Depending on your org policy, you can send alerts to content owners, remove the data directly, or both.
  - If your org only has a PII Taxonomy, you don’t know for sure that the data is PII, so alerting content owners and asking them to confirm is the recommended process.
  - If your org has a clearly defined PII Reference Repository with database name, schema name, table name, column name and PII classification, you can delete certain workbooks directly and then send an email to the data owner as an enforced governance approach.
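The matching and decision logic in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tableau’s API: the `PiiColumn` and `LineageRow` classes, their field names, the `confirmed` flag and the action strings are all hypothetical, and the lineage rows are assumed to have already been exported from the lineage tables by a query that is not shown here. Sending the email or deleting the workbook is also left out.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class PiiColumn:
    """One entry of the PII definition (hypothetical layout)."""
    database: str
    schema: str
    table: str
    column: str
    classification: str
    confirmed: bool  # True: from a PII Reference Repository; False: taxonomy guess only


@dataclass(frozen=True)
class LineageRow:
    """One workbook-to-column link exported from the lineage tables (hypothetical layout)."""
    workbook: str
    owner_email: str
    database: str
    schema: str
    table: str
    column: str


def _key(database: str, schema: str, table: str, column: str) -> tuple:
    # Compare names case-insensitively and ignore stray whitespace.
    return tuple(part.strip().lower() for part in (database, schema, table, column))


def find_pii_workbooks(pii_columns, lineage_rows) -> dict:
    """Map (workbook, owner_email) to the PII entries that workbook touches."""
    index = {_key(p.database, p.schema, p.table, p.column): p for p in pii_columns}
    hits: dict = {}
    for row in lineage_rows:
        match = index.get(_key(row.database, row.schema, row.table, row.column))
        if match is not None:
            hits.setdefault((row.workbook, row.owner_email), []).append(match)
    return hits


def decide_action(matches) -> str:
    """Delete only when at least one match is confirmed PII; otherwise just alert."""
    return "delete_and_notify" if any(m.confirmed for m in matches) else "alert_owner"


# Made-up sample data for illustration only.
pii_columns = [
    PiiColumn("HR_DB", "public", "employees", "ssn", "SSN", True),
    PiiColumn("CRM_DB", "sales", "contacts", "notes", "possible PII", False),
]
lineage_rows = [
    LineageRow("Payroll Overview", "alice@example.com", "hr_db", "public", "employees", "SSN"),
    LineageRow("Payroll Overview", "alice@example.com", "hr_db", "public", "employees", "salary"),
    LineageRow("Contact Notes", "bob@example.com", "crm_db", "sales", "contacts", "notes"),
    LineageRow("Sales Trends", "carol@example.com", "crm_db", "sales", "orders", "amount"),
]

hits = find_pii_workbooks(pii_columns, lineage_rows)
for (workbook, owner), matches in hits.items():
    print(workbook, owner, decide_action(matches))
```

Keying the lookup on the full database/schema/table/column path keeps a column named “notes” in an unrelated table from being flagged. The `confirmed` flag mirrors the two policies above: taxonomy-only matches lead to an alert, while Reference Repository matches can lead to deletion.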
RE-CAP: It is important to implement a PII detection and deletion process while ramping up Tableau self-service to hundreds or thousands of self-service citizen publishers, to ensure PII data does not get out of hand.
The process mainly involves a PII Reference Repository or PII Taxonomy (built by working together with your data privacy team) and Tableau’s lineage tables, which can be enabled with a single tsm command.