GOVERNED SELF-SERVICE ANALYTICS: Maturity Model (10/10)

My last nine blogs covered all aspects of governed self-service and how to scale from department self-service to enterprise self-service. I received some very positive feedback, and I am glad that my blogs inspired some readers:

Devdutta Bhosale says: “I read your article governed self-service analytics and as a Tableau server professional could instantly relate with some of challenges of implementing Enterprise BI with Tableau. Compared to Legacy BI tools such as BO, Micro-strategy, etc. enterprise BI is not the strength of Tableau especially compared to “the art of possible” with visualizations. I am so glad that you are writing so much in this space …. The knowledge you have shared has helped me follow some of the best practices with my recent Enterprise BI implementation at employer. I just wanted to say ‘thank you’.”

Other readers have asked me how to measure governed self-service maturity. There are BI maturity models from TDWI, Gartner, and others; however, I have not seen a practical self-service analytics model. Here is my first attempt at one. I spent a lot of time thinking through this model, and read widely, before putting this blog together.

I will describe the self-service analytics maturity model in four levels:

  • Level 1: Ad-hoc
  • Level 2: Department Adoption
  • Level 3: Enterprise Adoption
  • Level 4: Culture of Analytics

Level 1 (ad-hoc) is where one or a few teams start to use Tableau for quick visualizations and insights; in other words, this is where Tableau initially lands. Once Tableau's initial value is recognized, adoption grows to the business unit or department level (level 2), which is where most Tableau implementations are today. Scaling further to enterprise adoption (level 3) needs business strategy alignment, bigger investment, and a governed self-service model, which is what this series of blogs is about. The ultimate goal is to drive a culture of analytics and enable data-driven decision-making, which is level 4.

What are the characteristics of each maturity level? I will look at each level from the data, technology, governance, and business outcome perspectives:


Level 1: Ad-hoc

  • Data
    • Heroic, ad-hoc data discovery
    • Inconsistent data
    • Poor data quality
  • Technology
    • Team-based technology choices
    • Shadow IT tools
    • Exploration
  • Governance
    • No governance
    • Overlapping projects
  • Outcome
    • Focuses on what happened
    • Analytics does not reflect business strategy
    • Business process monitoring metrics

Level 2: Department Adoption

  • Data
    • Useful data
    • Some data definitions
    • Siloed data management
    • Limited data policies
  • Technology
    • IT-supported architecture in practice
    • Immature data preparation tools
    • Data-mart-like solutions
    • Early-stage big data technology
    • Scalability challenges
  • Governance
    • Function- and business-line governance
    • Immature metadata governance
    • Islands of information
    • Unclear roles and responsibilities
    • Multiple versions of KPIs
  • Outcome
    • Some business functions recognize analytics value and ROI
    • Analytics is used to inform decision-making
    • More root-cause analysis, with some resistance to adopting insights
    • Data governance is managed in a piecemeal fashion

Level 3: Enterprise Adoption

  • Data
    • Data quality certification
    • Process & data measurement
    • Data policies measured & enforced
    • Data exception management
    • Data accuracy & consistency
    • Data protection
  • Technology
    • Enterprise analytics architecture
    • Managed analytics sandboxes
    • Enterprise data warehouse
    • Content catalog
    • Enterprise tools for various power users
    • Advanced technology
    • Exploration
  • Governance
    • Executive steering committee
    • Governed self-service
    • CoE with continuous improvement
    • Data and report governance
    • Enterprise data security
    • Business and IT partnership
  • Outcome
    • Analytics insight as a competitive advantage
    • Relevant information as a differentiator
    • Predictive analytics to optimize decision-making
    • Enterprise information architecture defined
    • Mature governed self-service
    • Tiered information content

Level 4: Culture of Analytics

  • Data
    • Information life-cycle management
    • Data lineage & data flow impact documented
    • Data risk management and compliance
    • Value creation & monetization
    • Business innovation
  • Technology
    • Event detection
    • Correlation
    • Complex event processing & streaming
    • Content search
    • Data lake
    • Machine learning
    • Coherent architecture
    • Predictive analytics
  • Governance
    • Data quality certification
    • Process & data measurement
    • Data policies measured & enforced
    • Data exception management
    • Data accuracy & consistency
    • Data protection
    • Organizational process performance
  • Outcome
    • Data drives continuous business model innovation
    • Analytical insight optimizes business process
    • Insight in line with strategic business objectives
    • Information architecture underpins business strategies
    • Information governance as part of business processes

This concludes the governed self-service analytics blogs. Here are the key takeaways for governed self-service analytics:

  1. An enterprise self-service analytics deployment needs a strong governance process
  2. A business and IT partnership is the foundation of good governance
  3. If you are in IT, extend more trust to your business partners
  4. If you are in business, be a good citizen and follow the rules
  5. Community participation and neighborhood watch are important parts of successful governance
  6. The governance process evolves as your adoption grows

Thank you for reading.

Governed Self-Service Analytics: Content Management (9/10)

When executives get reports from an IT-driven BI system, they trust the numbers. But if the reports come from a spreadsheet, which can change at any time, the trust level drops. If that same spreadsheet is used to create a Tableau visualization that is shared with executives for decision-making, does the trust level increase? Can important business decisions be made based on the Tableau reports?

I am not against Tableau or visualization at all; I am a huge Tableau fan, and I love Tableau's mission to help people see and understand their data. On the other hand, as we all know, any dashboard is only as good as its data. How do we provide trustworthy content to end consumers? How do we avoid the situation where numbers land in a 10-K report while the team is still baking the data definitions?

The answer is to create a framework of content trust level indicators for end consumers. We do not want to slow down innovation or discovery by the self-service business analysts who create their own analytics and publish workbooks. After a dashboard is published, IT tracks its usage, identifies the most valuable contents per defined criteria, and certifies the data & contents so end consumers can use the certified reports the same way as reports from IT-driven BI. The overall flow is a track, identify, and certify cycle.


When you have data to explore or a new business question to answer, hopefully you have a report catalog to search for a similar report to leverage. If one exists, you do not have to develop it again, although you may need to request access to it if you do not have it. If the visualization is not exactly what you are looking for but the data attributes are there, you can always modify it to create your own version.

If no existing report is available, you can also search the published data source catalog to see whether there is a published data source to leverage. If one exists, you can create new workbooks on top of the existing published data connections.

You may still need to bring your own data for discovery. The early stages of discovery and analysis go through multiple iterations, and initial user feedback helps reduce the overall time to market for your dashboards. At some point, when your dashboard is good enough and is moved to a production folder to be shared with many more users, it falls into the track, identify, and certify cycle.


What to track? Different organizations will have different answers. Here are examples (a query sketch for pulling these numbers follows the list):

  • Data sources with high hit counts
  • Reports accessed most frequently
  • Most active users
  • Least-used reports, as candidates for retirement
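If your self-service platform is Tableau Server, one practical way to pull these numbers is to query the server's own PostgreSQL repository (the "workgroup" database) with its read-only user. Below is a minimal sketch; the host, password, and the exact view and column names are assumptions that vary by Tableau Server version, so verify them against your own repository:

```python
# Minimal sketch: top-used reports from Tableau Server's PostgreSQL
# repository (the "workgroup" database). Assumes the read-only repository
# user is enabled; host, password, and exact view/column names vary by
# Tableau Server version, so treat them as placeholders to verify.
import psycopg2

conn = psycopg2.connect(
    host="your-tableau-server",  # placeholder host
    port=8060,                   # default repository port, if exposed
    dbname="workgroup",
    user="readonly",
    password="...",
)

TOP_VIEWS = """
    SELECT v.name AS view_name, SUM(vs.nviews) AS total_hits
    FROM _views_stats vs
    JOIN _views v ON v.id = vs.view_id
    GROUP BY v.name
    ORDER BY total_hits DESC
    LIMIT 20;
"""

with conn, conn.cursor() as cur:
    cur.execute(TOP_VIEWS)
    for view_name, total_hits in cur.fetchall():
        print(f"{view_name}: {total_hits} hits")
```

The same query sorted ascending surfaces the least-used reports, the retirement candidates.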

How to identify the most critical reports?

  • Prioritize based on usage (number of users, use cases, purpose, cross-functional reach, benefits)
  • Prioritize based on data sources and contents (does the data exist in a certified environment, etc.)
  • Prioritize based on users: if the CEO uses a report, it must be critical for the organization

How to certify the critical reports? It is an ongoing process:

  • Incrementally add self-service data to the source of truth so the data governance process covers the new data sets (data definitions, data stewardship, data quality monitoring, etc.)
  • Recreate dashboards (if needed) for better performance, add-on functionality, etc.
  • Label the report with a report trustworthiness indicator

The intent of the track, identify, and certify cycle is to certify the most valuable reports in your organization. The output of the process is a report trustworthiness indicator that helps end consumers understand how much to trust the data and reports.

End information consumers continue to use your visualizations, which are replaced with certified reports step by step; this is an ongoing process. The certified reports carry trustworthiness indicators.

What does a report trustworthiness indicator look like? You can design multiple levels of indicators. For example:

  • SOX certified:
    • Data Source Certified
    • Report Certified
    • Release Process Controlled
    • Key Controls Documented
    • Periodic Reviews
  • Certified reports:
    • Data Source Certified
    • Report Certified
    • Follow IT Standard Release Process
  • Certified data only:
    • Data Source Partially Certified
    • Business Self-Service Releases
    • Follow Tableau Release Best Practices
  • Ad-hoc:
    • Business Self-Service Releases
    • Follow Tableau Release Best Practices

In summary, content management reduces duplication of contents and data sources, and gives end information consumers a clear trustworthiness level for each report so proper decisions can be made from the reports and data. The content management process outlined above shows how to create enterprise governance without slowing down innovation.

Please read the next blog, about governance maturity levels.

Governed Self-Service Analytics: Data Governance (8/10)

I was on a panel discussion about self-service analytics at Tableau Conference 2015, in front of a group of executives. Guess what the number one most frequently asked question was: data governance. How do you make sure the data does not get out of hand? How do you make sure self-service analytics does not break the organization's existing processes and policies around data protection and data governance?

Data governance is a big topic. This blog focuses on the following three things:

  • Data governance for self-service analytics
  • How to enforce data governance in a self-service environment
  • How to audit the self-service environment

1. Data governance for self-service analytics

First of all, what is data governance?

Data governance is a business discipline that brings together data quality, data management, data policies, business process management, and risk management surrounding the handling of data.

The intent is to put people in charge of fixing and preventing issues with data so that the enterprise can become more efficient.

The value of enterprise data governance is as follows:

  • Visibility & effective decisions: Consistent and accurate data visibility enables more accurate and timely business decisions
  • Compliance, security and privacy: Enables the business to efficiently and accurately meet growing global compliance requirements

What data should be governed?

Data is any information in any of our systems. It is a valuable corporate asset that indirectly contributes to the organization's performance. Data in a self-service analytics platform (like Tableau) is definitely part of the data governance scope. All the following data should be governed:

  • Master Data: Data that is shared commonly across the company in multiple systems, applications, and/or processes. Master data should be controlled, cleansed, and standardized at a single source. Examples: customer master, product item master. Master data enables information optimization across systems, enables data enrichment and data cleansing, and increases accuracy in reporting.
  • Reference Data: Structured data used in an application, system, or process. Often these are common lists set once a fiscal year or updated periodically. Examples include currency codes, country codes, chart of accounts, sales regions, etc.
  • Transactional Data: The information recorded from transactions, such as user clicks, user registrations, sales transactions, and shipments. The majority of enterprise data is transactional data. It can be financial, logistical, or work-related, involving everything from a purchase order to shipping status to employee hours worked to insurance costs and claims. As part of a transactional record, transactional data is grouped with its associated master data and reference data: it records a time and the relevant reference data needed for a particular transaction record.

What are data governance activities?

  • Data ownership and definition: The data owner decides and approves the use of data, like data sharing or usage requests by other functions. Typically data owners are the executives of the business areas. Each data owner is supported by data stewards, who are the operational points of accountability for data, data relationships, and process definitions; the steward represents the executive owners and stakeholders. Data definition is the data steward's responsibility, although many people can contribute to the definitions. In a self-service environment where data is in many analysts' hands, it is a business advantage to leverage those analysts' knowledge of the data by allowing each self-service analyst to comment on and tag the data, and then aggregating those comments and tags. This is again the community concept.
  • Monitoring and corrective actions: This is an ongoing process to define process flows, data flows, quality requirements, business rules, etc. In a self-service environment where more and more self-service developers can change metadata and create calculated fields to transform data, this can be an advantage, but it can also become chaos if data sources and processes are not defined within a business group.
  • Data process and policy: This is about exception handling.
  • Data accuracy and consistency: Commonly known as data quality. This is where most of the time and effort is spent.
  • Data privacy and protection: There are too many examples where data leakage damages a brand and costs an organization millions. Some fundamental rules have to be defined and enforced for a self-service enterprise to have peace of mind.

2. How to enforce privacy and protection in a self-service environment?

The concept here is to think through the most sensitive data before making data available for self-service consumption. To avoid potential chaos and costly mistakes, define the most sensitive datasets for your organization, then have IT create enforcement at the database layer so self-service users can't mess things up. Here is a list of examples of what should be enforced for peace of mind:

  • No private or privacy-sensitive data is allowed on the self-service server: SSNs, federal customer data, credit cards, etc. Most self-service platforms (like Tableau) are designed for ease of use and do not have sophisticated data encryption technologies.
  • Remove sensitive data fields (like addresses and contacts) at the database level before making the data available for self-service consumption, because it is really hard to control those data attributes once you open them to business analytics super users (see the sketch after this list).
  • Use sites as partitions to separate data, users, and contents for better data security. For example, finance gets a separate site with finance users only; sales people have no visibility into the finance site.
  • Create a separate server instance for external users if possible, and put it in the DMZ. A different level of network security is then applied as an additional layer of security.
  • Create a site for each partner or vendor to avoid potential problems. When multiple partners or vendors access your Tableau server, never put two vendors in the same site; create one site per vendor to avoid potential surprises.
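To make the second point concrete, here is a minimal sketch of database-level enforcement: expose a sanitized view (without address or contact columns) to the self-service platform instead of the raw table. All schema, table, column, and role names here are hypothetical:

```python
# Minimal sketch: strip sensitive fields at the database layer by granting
# the self-service platform access to a sanitized view only. All object
# names (schemas, tables, columns, roles) are hypothetical.
import psycopg2

DDL = """
    CREATE OR REPLACE VIEW selfservice.customer_orders AS
    SELECT order_id, customer_id, region, product, order_date, amount
    FROM raw.customer_orders;  -- deliberately omits address, phone, email

    -- The Tableau service account can only see the sanitized view:
    GRANT SELECT ON selfservice.customer_orders TO tableau_role;
    REVOKE ALL ON raw.customer_orders FROM tableau_role;
"""

with psycopg2.connect("dbname=warehouse user=dba") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```

Because the self-service account never sees the raw table, no workbook, extract, or calculated field can leak the removed attributes.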

3. How to audit the self-service environment?

You can’t enforce everything. You do not want to enforce everything either. Enforcement comes with disadvantages too, like inflexibility. You want to choose the most critical things to enforce, and then you leave the remaining as best practices for people to follow. Knowing the self-service analytics community always tries to find the boundary, you should have audit in your toolbox. And most importantly let community know that you have the auditing process.

  • What to audit:
    • All enforced contents should be in the audit scope, to make sure your enforcement works as intended
    • Everything covered by the policies your BU or organization agreed upon
    • Any other ad-hoc needs
  • Who should review the audit results:
    • The self-service governance body should review the results
    • BU data executive owners are the main audience for audit reports. It is possible that executives granted approvals in advance for self-service analysts to work on datasets they would not normally have access to; when there are too many exceptions, it is an indication of a potential problem.
  • Roles and responsibilities: normally IT provides the audit results, while business evaluates the risks and makes decisions about process changes.
  • How to audit: unfortunately Tableau does not have many server audit features, which is where a lot of creativity comes into play. VizAlert can be used, and often creating workbooks directly against the Tableau repository database is the only way to audit.

Please read the next blog, about content management.

Governed Self-Service Analytics: Performance Management (7/10)

Performance management is everyone's concern when it comes to a shared self-service environment, since nobody wants to be impacted by others. This is especially true when each business unit decides its own publishing criteria and the central IT team does not gate the publishing process.

How do you protect the shared self-service environment? How do you prevent one badly designed query from bringing the whole server to its knees?

  • First, set server parameters to enforce policy.
  • Second, create daily alerts for slow dashboards.
  • Third, make performance metrics public to your internal community, so everyone has visibility into the worst-performing dashboards; this creates some well-intentioned peer pressure.
  • Fourth, hold site admins or business leads accountable for self-service dashboard performance.

You will be in good shape if you do these four things. Let me explain each in detail.


  1. Server policy enforcement

The server policy settings are for enforced policies. Anything that can be enforced is better enforced, so everyone has peace of mind. The enforced parameters should be agreed upon by business and IT, ideally in the governance council, and they can always be reviewed and revised when the situation changes.

Examples of commonly enforced parameters include the overall size allocation for a site, extract timeouts, etc.

  2. Exception alerts

Only a few parameters can be controlled through enforcement. Everything else has to be governed by process, and alerts are the most common approach, serving as exception management:

  • Performance alerts: alert when dashboard render time exceeds the agreed threshold.
  • Extract size alerts: alert when extract size exceeds the defined threshold (extract timeouts can be enforced on the server, but size cannot).
  • Extract failure alerts: alert owners when extracts fail. Very often stakeholders will not know an extract failed; it is essential to let owners know so action can be taken in time.
  • You can create many more alerts: CPU usage, overall storage, memory, etc.

How to implement the alerts? There are multiple choices. My favorite is VizAlert for Tableau (https://community.tableau.com/groups/tableau-server-email-alert-testing-feedbac), though a home-grown script works too; see the sketch below.
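For illustration, here is a minimal sketch of such an alert, assuming the repository's read-only user is enabled. The http_requests table and its columns vary by Tableau Server version, and the SMTP host and addresses are placeholders; the 10-second cutoff matches the red threshold discussed below:

```python
# Minimal sketch: daily red alert for dashboard loads slower than 10s,
# based on the repository's http_requests table. Column names vary by
# Tableau Server version; SMTP host and addresses are placeholders.
import smtplib
from email.message import EmailMessage
import psycopg2

SLOW_VIEWS = """
    SELECT currentsitename, http_referer,
           EXTRACT(EPOCH FROM (completed_at - created_at)) AS seconds
    FROM http_requests
    WHERE action = 'show'                                  -- view renders
      AND created_at > NOW() - INTERVAL '1 day'
      AND completed_at - created_at > INTERVAL '10 seconds'
    ORDER BY seconds DESC;
"""

conn = psycopg2.connect(host="your-tableau-server", port=8060,
                        dbname="workgroup", user="readonly", password="...")
with conn, conn.cursor() as cur:
    cur.execute(SLOW_VIEWS)
    rows = cur.fetchall()

if rows:
    body = "\n".join(f"{site}: {ref} took {sec:.1f}s" for site, ref, sec in rows)
    msg = EmailMessage()
    msg["Subject"] = f"Red alert: {len(rows)} dashboard loads over 10s"
    msg["From"] = "tableau-alerts@example.com"
    msg["To"] = "site-admins@example.com"  # include site admins, as noted below
    msg.set_content(body)
    with smtplib.SMTP("smtp.example.com") as smtp:
        smtp.send_message(msg)
```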

Who should receive the alerts? It depends. Many alerts are for the server admin team only, like CPU usage, memory, and storage. However, most extract and performance alerts are for the content owners. One best practice for content alerts is to always include site admins and/or project owners. Why? Workbook owners change jobs, so the original owner may no longer be responsible for a workbook. I was talking with a well-known Silicon Valley company recently; many workbook owners had changed in the last two years, and they had a hard time figuring out whom to go after for workbook issues. The site admin should be able to help identify the new owners. If site admins are not close enough to the workbook level in your implementation, choose project leaders instead.

What should the threshold be? There is no universal answer, but nobody wants to wait more than 10 seconds. The rule of thumb is that anything under 5 seconds is good, while anything over 10 seconds is not. I got a question when I presented this at a local Tableau event: what if a specific query used to take 30 minutes, and the team made great progress reducing it to 3 minutes? Should that query be published and run on the server? It depends. If the view is critical to the business, it is of course worth waiting 3 minutes for the results to render. Everything has exceptions. However, if the 3-minute query chokes everything else on the server and users click the view often, you may want to rethink the architecture. The right answer may be to spin off another server for that mission-critical 3-minute application alone, so the rest of the users are not impacted.

Yellow and red warnings: It is good practice to create multiple warning levels, like yellow and red, with different thresholds. Yellow alerts are warnings, while red alerts require action.

You may say: hi Mark, this all sounds great, but what if people do not take action?

This is exactly where some self-service deployments go wrong, and where governance comes into play. In short, you need strong, agreed-upon process enforcement:

  • Some organizations use a charge-back process to motivate good behavior. Charge-back influences people's behavior but cannot enforce anything.
  • The key process enforcement is a penalty system for when red-alert actions are not taken in time.

If the owner does not take corrective action within the agreed period for a red warning, a meeting should be arranged to discuss the situation. If the site admin refuses to take action, the governance body has to decide on the agreed-upon penalty actions. The penalty can lead to site suspension. Once a site is suspended, nobody except server admins can access any of its contents. The site owners have to work on the improvement actions and show compliance before the site is re-activated. The good news is that all the contents are still there when a site is suspended, and it takes a server admin less than 10 seconds to suspend or re-activate a site (a scripted example follows).
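For illustration, suspension can even be scripted through the Tableau Server REST API. This is a hedged sketch, not a drop-in tool: the API version, admin credentials, and site LUID are placeholders, so check the Update Site method against your server's REST API documentation before using it.

```python
# Minimal sketch: suspend a site via the Tableau Server REST API.
# API version, credentials, and the site LUID are placeholders; verify
# the Update Site method against your server's REST API documentation.
import requests

SERVER = "https://your-tableau-server"
API = "3.4"  # placeholder API version

# Sign in as a server administrator.
signin = requests.post(
    f"{SERVER}/api/{API}/auth/signin",
    json={"credentials": {"name": "admin", "password": "...",
                          "site": {"contentUrl": ""}}},
    headers={"Accept": "application/json"},
)
signin.raise_for_status()
token = signin.json()["credentials"]["token"]

# Suspend the target site; state="Active" would re-activate it.
resp = requests.put(
    f"{SERVER}/api/{API}/sites/TARGET-SITE-LUID",  # placeholder LUID
    json={"site": {"state": "Suspended"}},
    headers={"X-Tableau-Auth": token, "Accept": "application/json"},
)
resp.raise_for_status()
print("Site suspended")
```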

I had this policy agreed with the governance body, and I communicated it to as many self-service developers as I could. I never got push-back on it. It is clear to me that the self-service community likes a strong, clearly defined governance process that ensures everyone's success. I have suspended a site for other reasons, but never had to suspend one due to performance alerts. Why not? That is my third trick: worst-performing dashboard visibility.

  3. Make performance metrics public

It takes some effort to make your server's dashboard performance metrics public to your whole internal community, but it turns out to be one of the best things a server team can do. It has a few benefits:

  • It serves as a benchmark for the community to understand what is good and good enough, since the metric compares your site's overall performance with the other sites on the server
  • It shows all the long-rendering dashboards, which provides peer pressure
  • It shows patterns that help people focus on the problematic areas
  • It creates a great opportunity for the community to help each other. This is the most important success factor. It turns out that the problematic areas often belong to teams newly on-boarded to the server, and the community always has ideas to make dashboards perform much better. This is why we never had to suspend any site: when a site racks up red alerts that the community is aware of, it is the whole community that makes things happen, which is awesome.
  4. Hold site admins accountable

I managed a Hewlett Packard product assembly line early in my career. Hewlett Packard has some well-known quality control processes. One thing I learned was that each assembler is responsible for his or her own quality: although there is QA at the end of the line, each workstation has a checklist before passing work to the next station. This simple philosophy applies to today's software development and self-service analytics environments. The site admin is responsible for the performance of the workbooks in the site, and can in turn hold workbook owners accountable for their shared workbooks. Flexibility comes with accountability.

I believe in Theory Y (people have good intent and want to perform better), and I have practiced it for years. The whole intent of server dashboard performance management is to give the community and content owners visibility into where the issues are, so the owners can take action.

What I often see is that a well-performing dashboard becomes slow over time due to data changes and many other factors. The alerts catch all of those exceptions whether your dashboard was released yesterday, last week, last month, or last year. This approach is much better than a gated release process, which is the common IT practice.

During a recent Run-IT-as-business meet-up, the audience was skeptical when I said that IT did not gate any workbook publishing process and that it was completely self-service. They began to see that it made sense when I talked about the performance alerts that catch everything. What business likes most about this approach is the freedom to push urgent workbooks to the server even when they do not perform well; owners can always come back later to tune them, both for a better user experience and to be good citizens.

Please continue to the next blog, about data governance.

GOVERNED SELF-SERVICE ANALYTICS: PUBLISHING (6/10)

The publishing process & policy covers the following areas: engagement process, publisher roles, publishing process, and dashboard permissions.

The first step is to get a space on the shared enterprise self-service server for your group's data and self-service dashboards; this is called the engagement process. The main questions are:

  • From the requester's perspective: how do I request a space on the shared enterprise self-service server for my group?
  • From the governance perspective: who decides, and how, whether self-service is the right fit?

Once a business group has a space on the shared enterprise self-service server, it has to ask the following questions:

  • Who can publish dashboards from your group?
  • Who oversees or manages all publishers in my group?

After you have given publishing permission to some super users from your business group, those publishers need to know the rules, guidance, server constraints, and best practices for effective dashboard publishing. Later on you will also want to make sure your publishers are not creating islands of information or multiple versions of KPIs.

  • What are the publishing rules?
  • How do you avoid duplications?

The purpose of publishing is to share your reports, dashboards, stories, and insights with others who can make data-driven decisions. The audiences are normally defined before you publish the dashboards, although from a workflow perspective dashboard permissions are assigned after publishing. The questions are:

  • Who can access the published dashboards?
  • What is the approval process?

Engagement Process

Self-service analytics does not replace traditional BI tools; it co-exists with them. It is very rare to find a self-service analytics platform as the only reporting platform in a corporation. Very likely you have at least one IT-controlled enterprise reporting platform designed for standard reporting, answering known questions with data from the enterprise data warehouse. In addition to that traditional BI reporting platform, your organization has decided to implement a new self-service analytics platform to answer unknown questions and do ad-hoc analysis using all the available data sources.

This realization that traditional BI and self-service BI co-exist is important for the engagement process, because guidance has to be defined for which platform does which kind of reporting. After this guidance is defined and agreed upon, continuous communication and education are needed to make sure all self-service super users are on the same page about this strategic guidance.

Whenever there is a request for a new self-service analytics application, a fitness assessment has to be done before proceeding. The following checklist serves this purpose:

  • Does your larger team already have a site on the self-service analytics server? If yes, you can use the existing site.
  • Who is the primary business / application contact?
  • What business process / group does this application represent (like sales, finance, etc.)?
  • Briefly describe the purpose and value of the application.
  • Do you have an IT contact for your group for this application? Who is it?
  • What are the data sources?
  • Is there any sensitive data to be reported on (like federal data, customer or client data)? If yes, describe the source data in detail.
  • Is there any private data in the source data (like HR data or sensitive finance data)?
  • Who are the audiences of the reports? How many audiences do you anticipate? Will any partners access the data?
  • Does the source data include more than one enterprise data set? If yes, what is the plan for data-level security?
  • What are the primary data elements / measures to be reported on (e.g. bookings, revenue, customer cases, expenses)?
  • What will be the dimensions by which the measures will be shown (e.g. product, period, region)?
  • How often does the source data need to be refreshed?
  • What is the anticipated volume of source data? How many quarters of data? Roughly how many rows and columns?
  • Is the data available in the enterprise data warehouse?
  • How many self-service report developers will this application have?
  • Do you agree with the organization's Self-Service Analytics Server Governance policy (URL ….)?
  • Do you agree with the organization's Self-Service Analytics Data Governance policy (URL ….)?

The above questionnaire also covers your organization's high-level policies on data governance, data privacy, service level agreements, etc., since most existing self-service tools have constraints in those areas. On one side, we want to encourage business teams to leverage the enterprise investment in the self-service analytics platform. On the other side, we want to make sure every new application is set up for success and does not create chaos that can be very expensive to fix later on.

Publisher Roles

I have heard many exciting stories about how easily people get new insights with visualization tools (like Tableau), and I have experienced a few of those insightful moments myself. However, I also heard a story about a new Tableau Desktop user who, fresh out of fundamentals training, quickly published something and shared it with the team, causing a lot of confusion about the KPIs. What went wrong? It is not about the tool, and it is not about the training; it is about publisher roles and the related process.

The questions are:

  • Who can publish dashboards from my group?
  • Who oversees or manages all publishers in my group?

Sometimes you will have easy answers to those questions, but in many other cases you will not. One common approach is to use projects or folders to set boundaries for the various publishers. Each project has a project leader who oversees all publishers within the project.

You can also define a common innovation zone where publishers can share their new insights with others. Just be aware that dashboards in the innovation zone are early discoveries only, not officially agreed KPIs. Most dashboards go through multiple iterations of feedback and improvement before they become useful insights. We do encourage people to share their innovations as soon as possible for feedback and improvement; it is better to distinguish official KPIs from innovations by using different color templates to avoid confusing the end audiences.

Publishing Process

To protect the shared self-service environment, you need a clearly defined publishing process:

  • Does IT have to be involved before a dashboard is published to the server?
  • Do you have to go from a non-production instance or folder to a production folder?
  • What is the performance guidance?
  • Should you use live connections or extracts?
  • How often should you schedule your extracts? Can you use full refreshes?
  • What are the data security requirements?
  • Do your dashboards introduce new business glossary terms? If yes, did you spell out their definitions?
  • Do the new glossary definitions need approval from data stewardship? Did you get it?
  • Who supports the new dashboards?
  • Does this new dashboard duplicate existing ones?

Each organization or business group will have different answers to these questions. The answers form the basic publishing process, which is essential for scaling and avoiding chaos.

Here is a summary of what most companies do, the common best practices:

  1. IT is normally not involved in the release or publishing process for dashboards designed by a business group; this is the concept of self-service.
  2. IT and business agree on performance and extract guidance in advance. IT enforces some of the guidance in server policy settings (like extract timeout thresholds). For the parameters that can't be systematically enforced, business and IT agree on an alert process to detect exceptions, for example a performance alert sent to the dashboard owner and project owner (or site admin) if dashboard render time exceeds 10 seconds.
  3. Business terms and glossary definitions are an important part of the dashboards.
  4. A business support process is defined so end information consumers know how to get help when they have questions about a dashboard or its data.
  5. Dashboards are classified as certified or non-certified. Non-certified dashboards are for feedback purposes, while certified ones are officially approved and supported.

Dashboard Permissions

When you design a dashboard, you most likely have the audiences defined already. The audiences have business questions, and your dashboard answers those questions. The audiences should be classified into groups, and your dashboard can be assigned to one or multiple groups.

If your dashboard has row-level security requirements, its complexity increases many times over. It is advisable for business to work with IT on the row-level security design. Many self-service tools have limitations around row-level security, although they all claim the capability.

The best practice is to let the database handle row-level security, which ensures consistent data access when you have multiple reporting tools against the same database. There are two challenges to figure out (a sketch follows the list):

  • The self-service visualization tool has to be able to pass a session user variable dynamically to the database. Tableau supports this for some databases (like the query banding feature for Teradata, or initial SQL for Oracle).
  • The database has to have user/group role tables implemented.
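To make the pattern concrete, here is a minimal sketch of database-enforced row-level security: an entitlement table plus a secure view keyed to the session user. All object names are hypothetical, and the way the session user gets set varies by database (query banding on Teradata, initial SQL on Oracle, where Tableau can inject the [TableauServerUser] parameter):

```python
# Minimal sketch: database-enforced row-level security. An entitlement
# table maps users to the rows (regions) they may see; a secure view
# joins facts to entitlements for the session user. All names are
# hypothetical.
import psycopg2

DDL = """
    -- 1. Entitlement table: which user may see which region.
    CREATE TABLE IF NOT EXISTS security.user_region (
        username TEXT NOT NULL,
        region   TEXT NOT NULL
    );

    -- 2. Secure view: current_user works when each person has their own
    --    database account. With Tableau you would instead set a session
    --    variable via initial SQL (e.g. the [TableauServerUser] parameter)
    --    and filter on that variable here.
    CREATE OR REPLACE VIEW secure.sales AS
    SELECT s.*
    FROM raw.sales AS s
    JOIN security.user_region AS ur
      ON ur.region = s.region
     AND ur.username = current_user;
"""

with psycopg2.connect("dbname=warehouse user=dba") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```

Point the published data source at the secure view rather than the raw table, and every tool that queries through it gets the same row-level filtering.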

In summary, publishing involves a set of controls, processes, policies, and best practices. While supporting self-service and self-publishing, rules and processes have to be defined to avoid potentially expensive mistakes later on.

Please read the next blog, about performance management.

Governed Self-Service Analytics: Multi-tenancy (5/10)

Tableau's multi-tenancy mechanism is called sites. I have heard many people ask whether and when they should use sites. For some large Tableau deployments, people also ask whether to create separate Tableau instances. These are all Tableau architecture questions, or multi-tenancy strategy questions.

 

How do you approach this? I use the following goal / strategy / tactics framework to guide the decision-making process.

It starts with goals. The self-service analytics system has to meet the following expectations, which are the ultimate goals: fast, easy, cost effective, secure, self-service, and able to handle structured and unstructured data.

Now keep those goals in mind while scaling Tableau out from individual teams to a department, and then from department to enterprise.

 

How do we stay self-service, fast, and easy, with solid data security and cost effectiveness, while dealing with thousands of users? This is where you need well-defined strategies to avoid chaos.

First of all, each organization has its own culture, operating principles, and business environment. Strategies that work very well in one company may not work for others; you have to figure out the best approach for your business requirements. Here is some food for thought:

  1. Do you have to maintain only one Tableau instance in your organization? The answer is no. For an SMB the answer may be yes, but I have seen many large organizations run multiple Tableau instances for better data security and better agility. I am not saying that Tableau Server can't scale out or scale up; I have read the Tableau architecture white paper on how many cores one server can scale to. But there are many other considerations, and you just do not want to put every application on one instance.
  2. What are the common use cases where you may want to create a separate instance? Here are some examples:
    • You have both internal employees and external partners accessing your Tableau server. Tableau allows both on the same instance. However, if you would have to create many data security constraints to let external partners in, the same constraints would apply to all internal Tableau users, adding complexity. Depending on the constraints, if the fast and easy goals are compromised, you may want a separate instance that completely separates internal and external users; this way you have complete peace of mind.
    • Network separation. It is increasingly common for corporations to separate the engineering network from the rest of the corporate network for better IP protection. When this is the case, creating a separate Tableau instance within the engineering network is an easy and simple strategy.
    • Network latency. If your data source is in APAC while your Tableau server is in the US, you will likely have challenges with dashboard performance. Either sync your database to the US, or stand up a separate Tableau server instance in APAC to achieve your fast goal.
    • Enterprise mission-critical applications. Although Tableau often starts with ad-hoc exploration, some Tableau dashboards become mission-critical business applications. If you have any of those, congratulations: you have a good problem to deal with. Once some apps become mission critical, you have no choice but to tighten change control and related processes, which unfortunately are killers of self-service and exploration. The best way to resolve this conflict is to spin off a separate instance with more rigor for the mission-critical apps while keeping the rest of Tableau fast and easy self-service.

What about Tableau Server licenses? Tableau Server has a seat-based license model and a core-based license model. The seat-based model goes by users, so separating instances should not have much impact on the total number of licenses.

Now let's say you have 8 core-based licenses for existing internal users and plan to add some external users. If you would have to add 8 more cores for the external users anyway, a separate instance has no impact on licensing. What if you only want a handful of external users? Then you have a trade-off decision to make. Alternatively, you can keep your 8 cores for internal users and get a handful of seat-based licenses for the external users only.

What about platform cost and additional maintenance cost when adding a separate instance? VMs and hardware are relatively cheap today. I agree there is some extra work initially to set up a separate instance, but server admin work does not double just because there is another instance. On the other side, when your server gets too big, maintenance, upgrades, and everything else require a lot more coordination with all the business functions. I have seen some large corporations quite happy with multiple instances versus one huge instance.

How about sites? I have a blog about how to use sites. In summary, sites are useful for better data security, easier governance, empowering self-service, and distributing administrative work. Here are cases where sites should not be used:

  • Do not create a new site if the requested site will use the same data sets as an existing site; create a project within the existing site instead, to avoid duplicate extracts (or live connections) running against the same source database.
  • Do not create a new site if the requested site's end users overlap heavily with an existing site; create a project within the existing site instead, to avoid duplicating user maintenance work.

In summary, as you plan to scale Tableau from department to enterprise, you do not have to put all of your enterprise users on one huge Tableau instance. Keep the goals in mind while deciding the best strategy for your business: easy, fast, simple, self-service, data security, and cost effectiveness. The strategies are separate instances and sites.

 

Please read the next blog, about the release process.

Governed Self-Service Analytics: Community (4/10)

A self-service analytics community is a group of people who share a common interest in self-service analytics and a common value: a data-driven decision-making culture.

Why are people motivated to join an internal self-service community?

The self-service community motivations are as follows:

  • Empowerment: Self-service stems from – and affects – a wider macro trend of DIY on one hand and collaboration on the other: content builders take the lead for the services they require, and often collaborate with others. The key is to offer members empowerment and control over the process, so they can choose the level of service they would like to engage in, thus shaping the overall experience.
  • Convenience: The benefit of community self-service is obvious: people get fast access to the information they need without having to email or call IT or a contact center. According to Forrester, 78% of people prefer to get answers via a company's website versus telephone or email.
  • Engagement: It is shared ideas, interests, and professions that bring people together to form a community. Members join because they wish to share, contribute, and learn from one another. Some members contribute, while others benefit from the collective knowledge shared within the community. This engagement is amplified when members spark discussion and debate about the tools, features, processes, and services provided and any new products being introduced. The discussions within the community inform and familiarize people with new and better ways of getting things done, the best practices.

How to start creating an internal user community?

When you start creating an internal user community, keep in mind that many community activities depend entirely on the intranet, so make sure the community can be easily accessed by the maximum number of people. Below is a checklist:

  • Determine a purpose or goal for it. One example: the place to find anything and everything about self-service analytics. The community is the place for sharing, learning, and collaborating.
  • Decide who your target audience will be. Most likely the audience is content developers and future content developers, not the server's end users.
  • Design the site, keeping in mind the tools for interaction and the structure of your community.
  • Decide how you will host the community.
  • Create the community using tools available within your organization.
  • Create interesting content for the community.
  • Invite or attract members to join your community. Try to find out who has developer licenses and send invitations to all of them.
  • Administer it properly so that the community flourishes and expands. It is good practice to have at least two volunteer moderators who answer users' questions in a timely manner and close out open questions where possible.

Who are the community members?

The audiences are all the content builders or content developers from business and IT across the organization. The governing body or council members are the core of the community, and it is good practice for council members to lead most if not all community activities. The community also includes future potential content builders, and the council should focus some effort on reaching out to them. The end information consumers, those who receive dashboards or reports, are normally not part of the community, as they do not care much about the tools, technology, or processes associated with self-service. All end information consumers care about is the data, insights, and actions.

What are the community activities?

A quick summary follows; more detail is discussed later on:

  • Intranet: Your community home, the place for anything and everything about your self-service analytics: the tool, processes, policies, best practices, system configuration, usage, data governance policies, server policies, publishing process, license purchasing process, tips, FAQ, etc.
  • Training: The knowledge base on the community intranet is good, but not good enough. Although most new self-service tools are designed for ease of use, they do have learning curves; training has to be organized to better leverage the investment.
  • User Meetings: A user summit or regular best-practice sharing session is a must-have community activity.
  • License Model: When many business super users have dashboard development tools, what is the most cost-effective license model for those tools? Do you want to charge back for server usage?
  • Support Process: Who supports the dashboards developed by business super users? What are IT's and the business's roles in supporting end users?
  • External Community: Most self-service software vendors have very active local, virtual, or industry communities. How do you leverage the external community and learn its best practices?

Key takeaway: building a strong community is a critical piece of a successful self-service analytics deployment in the enterprise.

Please read the next blog, about multi-tenancy strategy.

Governed Self-Service Analytics: Roles & Responsibilities (3/10)

When business super users are empowered to do discovery, data exploration, analysis, and dashboard building, and to share dashboards with business teams for feedback, the business takes on far more responsibility than it did in the traditional BI & analytics environment. One critical component of self-service analytics governance is a clear roles-and-responsibilities framework between business and IT. This is one reason the governing body must have stakeholders from both business and IT departments. The governing body should think holistically about analytics capabilities throughout the organization; for example, it could use business analysts to explore the value and quality of a new data source and define data transformations before establishing broader governance rules.

A practical framework for the roles and responsibilities of self-service analytics follows.

Business owns

  • Departmental data sources and any new data sources not available in the IT-managed enterprise data warehouse
  • Simple data preparation: data joining, data blending, and simple data transformations without heavy-lifting ETL, data cleansing, etc.
  • Content building: exploration, analysis, and report and dashboard building using departmental data or by blending multiple data sources
  • Release or publishing: sharing analyses, reports, or dashboards with information end consumers for feedback, business reviews, metrics, etc.
  • User training and the business process changes associated with new report & dashboard releases

IT owns

  • Server and platform management, licensing, vendor management, etc.
  • Enterprise data management: delivering certified, trustworthy data to the business, building and managing the data warehouse, etc.
  • Creating and maintaining a data dictionary that helps business super users navigate the data warehouse
  • Supporting business unit report developers by collaborating on robust departmental dashboards and scorecards, and converting ad-hoc reports into production reports when necessary
  • Training the business to use self-service analytics tools

This is a power shift from IT to business. Both IT and business leaders have to recognize the shift and be ready to support the new roles and responsibilities. What are the leaders' roles in supporting it?

  • Create a BI/analytics Center of Excellence: identify the players, create a shared vision, and facilitate hand-offs between IT and business
  • Evangelize the value of self-service analytics: create a brand for self-service analytics and market it to drive the culture of analytics and data-driven decision-making; run an internal data/analytics summit or conference to promote analytics
  • Create a federated BI organization: run the steering committee or BI council, leverage the BI & data gurus in each organization, and encourage IT people to move from order takers to consultants

Please read my next blog, about community.

Governed Self-Service Analytics: Governance (2/10)

How do you govern enterprise self-service analytics? Who makes the decisions about processes and policies? Who enforces those decisions?

In the traditional model, governance is done centrally by IT, since IT handles all the data access, ETL, and dashboard development activities. In the new self-service model, many business super users are involved in data access, data preparation, and development. The traditional top-down governance model no longer works, yet no governance creates chaos. What a self-service environment needs is a new, bottom-up governance approach.

In the new self-service analytics model, since business super users do most of the dashboard development, the more effective governance structure includes representatives of those super users.

The self-service analytics governing body for the enterprise consists of both business and IT team members. Its members are self-service analytics experts & stakeholders selected by each business unit; you can think of them as representatives of their business units, or of the entire self-service analytics content builder community. The charter of this governing body is as follows:

  • Define roles and responsibilities between business & IT
  • Develop and share self-service best practices
  • Define the content release or publishing process
  • Define the analytics support process
  • Define data access, data connection, and data governance processes
  • Define the self-moderating model
  • Define dashboard performance best practices
  • Help with hiring and training for new self-service analytics skills
  • Communicate the self-service process to the entire self-service content builder community and management teams
  • Enforce self-service analytics policies to protect the shared enterprise self-service environment
  • Make sure self-service processes and policies align with enterprise processes and policies around data governance, architecture, business objectives, etc.

Should business or IT lead the governing body? While there are times when a business-led governing body can be more effective, do not discount an IT-led one. There are many good reasons to consider an IT-led governing body:

  • IT understands how to safely and accurately expose an organization's data and can standardize how data is exposed to self-service super users.
  • IT has a centralized view of the analytics needs of all functions of the organization, which helps the enterprise develop streamlined, reusable processes and leading practices that make business groups more efficient with the tool.
  • IT can also centralize functions such as infrastructure, licensing, administration, and deeper-level development, all of which further cuts costs and mitigates risk.

What are the key skills and expectations for the head of the governing body, or the leader of the center of excellence team? Different organizations use very different titles for this person, but whoever is at the helm should have the following skills:

  • Passion for self-service analytics and related technologies
  • The ability to lead, set strategy, and prioritize objectives based on needs and impact
  • An in-depth understanding of self-service tool, the business analytics space, and the analytics needs of the business
  • The ability to align self-service analytics objectives with corporate strategy and direction
  • Comfort in partnering and negotiating with both business and IT stakeholders
  • A talent for navigating the organization to get things done

Please read my next blog, about roles and responsibilities.

Governed Self-Service Analytics (1/10)

Organizations committed to improving data-driven decision-making processes are increasingly formulating an enterprise analytics strategy to guide the effort of finding new patterns and relationships in data, understanding why certain results occurred, and forecasting future results. Self-service analytics has become the new norm thanks to the availability and simplicity of newer data visualization tools (like Tableau) and data preparation technologies (like Alteryx).

However, many organizations struggle to scale self-service analytics to the enterprise level, or even the business unit level, beyond a proof of concept. Then they blame the tools and start to try different ones. There is nothing wrong with trying something else; what many analytics practitioners do not realize, however, is that technology alone was never enough to improve data-driven decision-making. Self-service tools alone do not resolve organizational challenges, data governance issues, and process inefficiencies. Organizations that are most successful with self-service analytics deployment tend to have a strong business and IT partnership around self-service, a strategy around data governance, and defined self-service processes and best practices. The business understands its current and future analytics needs, as well as the pain points of existing processes, while IT knows how to support the organization's technology needs and plays a critical role in how data is made available to the enterprise. Formalizing this partnership between business and IT in the form of a Center of Excellence (COE) is one of the best ways to maximize the value of a self-service analytics investment.

What are the key questions that the Center of Excellence will answer?

  1. Who is your governing body?
  2. Where do you draw the line between business and IT?
  3. What are the checks and balances for self-service releases?
  4. How do you manage server performance?
  5. How do you avoid multiple versions of KPIs?
  6. How do you handle data security?
  7. How do you provide trustworthy data & contents to end consumers?

The ultimate goal of the center of excellence is governed self-service across the enterprise. The governance can be classified into six areas, with 30 processes in total:


Governing body

  • Governing structure
  • Multi-tenant strategy
  • Roles & responsibilities
  • Direction alignment
  • Vendor management

Community

  • Intranet Space
  • Training strategy
  • Tableau User CoE meeting
  • Tableau licensing model
  • Support process

Publishing

  • Engagement process
  • Publishing permissions
  • Publishing process
  • Dashboard permission

Performance

  • Workbook management
  • Data extracts
  • Performance alerts
  • Server checkups for tuning & performance

Data Governance

  • Data protection
  • Data privacy
  • Data access consistency
  • Row-level security
  • Data sources and structure

Content Certification

  • Content governance cycle
  • Report catalog
  • Report category
  • Data certification
  • Report certification

Please read my next blogs for each of these areas.