Self-Service Analytics Is a Business State

A Holistics colleague of mine, Dave, has this thing where he says 'No data tool can ever help you achieve data literacy in your company. But we sure can make sure we don't get in the way.'

I think about that a lot.

As a tool maker, it's easy to introspect too much and think about all the ways Holistics can solve all of our customers' problems, and make their lives better, and make them write happy emails to us, and so on. But the truth is that business intelligence problems are socio-technical problems, and you usually have to fix some combination of people (read: culture) and process and tooling, all at the same time.

Which brings us to today's topic.

What Exactly Is Self-Service Analytics?

It shouldn't be a surprise to anyone that self-service in the data analytics space is hard to define. Benn Stancil has a whole piece where he argues that 'self-service is a feeling', which I largely agree with. What self-service analytics looks like, Stancil says, depends on how the org feels about self-serving data from its tools. Do they trust it? Do they feel comfortable getting what they need, without emailing an analyst?

This, Stancil continues, depends on the context of the organization (do they trust the numbers in their data systems?), their data maturity (do they feel comfortable with their BI tool?), and the needs of business users (does the CEO set the tone for metrics consumption?).

So, yes, organizational context matters when you're talking about self-service analytics. A self-service setup that works in one company might not be equivalently self-service in another.

But I think we can get more specific than 'Self-service is a feeling'.

Instead, I'm going to invert the question and define self-service analytics by what it's not. Because I think this is more useful.

In a sentence: I think self-service can be thought of as a business outcome that successfully avoids a common organizational failed state. To put this more concretely, self-service analytics is a state where the business is sufficiently data-driven, but the data org does not look like an army of English-to-SQL translators.

This should become more useful in a minute. Let's walk through this together:

You are a small company.

You realize you need a data analytics team, so you hire your first analyst and you use Google Data Studio or Tableau or some other analytics platform. Your analyst churns out reports for management, and all is well for a few months. But eventually your analyst can't keep up with all the requests she's getting from end users, so you hire another. And another. And another. And then your company grows up, creates departments that report to different leaders, and each department hires their own analysts, and now you have an army of analysts in various parts of the company all writing queries or tuning Excel spreadsheets, just trying to keep up with the business requests your company throws at them.

These analysts are mostly English-to-SQL translators, or Excel jockeys. They're all relatively junior. Some are senior, sure. But there's not much career progression for them overall. And many of them are suitably displeased with their jobs, and a reliable percentage of them churns (read: quits your company) every six months or so. You keep hiring new analysts to keep up with business demand, and grit your teeth at the management challenge of constantly churning employees.

This is the failed state.

(Note that in this scenario, your company is data-driven. This isn't always a thing! It's more common to be in a company that isn't data-driven, which doesn't have this problem, and will instead have a different set of problems and a different set of failed states. Anyway.)

This is the failed state that self service analytics is supposed to solve. It is a failed state because it's rather painful to maintain an army of English-to-SQL translators. Ideally you want a smaller group of data folks that can service a much larger number of data consumers. And the only way you can hit that scale is to have some form of 'self service' — that is, some way that business users can get the data they need, without going through an analyst on Slack or email.

In other words, self service analytics is valuable as a goal because it increases the operating leverage of your data team. You can serve many more people with fewer analysts. This is an ideal business outcome.

Now: notice that I have not defined what features a self service analytics platform should have in this context. Notice that I have not talked about tools, or processes, or even org structure. All of these depend on the nature of the company.

Instead, I'm describing self service analytics by telling you what it is not — it is not this failed state where the company is data driven but they've gotten there by just throwing bodies at the problem, and have 100 data analysts spread across six departments writing 100-line SQL queries. Self service, when seen through the lens of my inverted definition, is how far away you are from that failed state.

Of course, smart readers will recognize that this is simply another way of saying 'in a data-driven company with high demand for data, bad data organizations tend to look the same, but working data organizations look very different from each other'. And indeed, data-driven companies with good self-service capabilities all look very different. For instance, in one consumer software company I know, many people in the company's reporting structure are fluent in SQL, so they are able to solve their self-service problems with a combination of a SQL-oriented BI tool, a well-curated data warehouse, and one or two visualization tools. This would not work in a cosmetics company where the majority of the staff aren't SQL-savvy and prefer to have dashboards built for them. Self-service in the first company looks different from self-service in the second. (Holistics, by the way, works better for achieving self-service analytics goals at that second company than at the first.)

In other words, self service business intelligence is most usefully described as a business outcome — a place that you get to through a combination of tools and processes and org structure. And the way you get to it is by asking yourself, each step of the way: "does this move bring us closer or further away from the failed state?"

In such a scenario, the best thing a tool can do is to not get in your way. The best thing a Business Intelligence tool can do is to give you handles when you want to evolve your org away from the failed state.

That being said, knowing what self-service analytics is and actually achieving it are two different things.

Why True Self-Service Analytics Is Hard To Achieve

(This part is an excerpt from our Analytics Setup Guidebook. Get it here.)

To understand why true self service analytics is hard to come by, let's talk about the arc of adoption.

What do we mean by this?

Well, most companies go through a very similar arc of data adoption. They do this because data usage is determined by organizational culture, and organizations go through similar cultural changes when given access to data. Understanding what that process looks like will help you understand why so many tools advertise an ability to provide ‘true self-service’. It will also help you prepare for future growth.

One: Ad-hoc Queries

Business users inevitably have ad-hoc data requests. That’s just a fact of life.

How you serve these queries depends heavily on the tools you have available to you. If you have access to a centralized data warehouse, it is likely that you would write some ad-hoc SQL query to generate the numbers you need.
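To make ‘ad-hoc’ concrete: a typical request might be “how many orders came from new customers last month, broken down by country?” Against a centralized warehouse, the analyst would answer it with a one-off query along these lines (a hypothetical sketch; the table and column names are invented for illustration):

```sql
-- Hypothetical ad-hoc request: orders from new customers last month, by country.
-- Table and column names are invented for illustration.
SELECT
    c.country,
    COUNT(*) AS orders_from_new_customers
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.created_at >= DATE '2024-05-01'
  AND o.created_at <  DATE '2024-06-01'
  AND c.first_order_at >= DATE '2024-05-01'  -- "new customer" = first order this month
GROUP BY c.country
ORDER BY orders_from_new_customers DESC;
```

The number gets pasted into Slack or a slide, and the query itself is usually thrown away.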

If you operate in a more ‘decentralized’ data environment, you would have to find the right data sources, grab the subset of data that you need, and then analyze it in whatever tool you have sitting on your machine.

Two: Static Reports and Dashboards

Eventually, as more business people catch on to the idea of getting data to bolster their arguments (and as the company expands in complexity), the data team begins to feel overwhelmed by the sheer number of requests they receive. The head of data then reaches for an obvious next step: a business intelligence solution to get some of the requests off his team’s back.

One head of data we spoke to did exactly this: he went looking for a BI tool to create dashboards for the predictable metrics, in order to free up his team for the more ad-hoc requests they received from other parts of the company. Once he had created those reports, his data team immediately began to feel less overwhelmed.

“We’re very happy,” he told us, “The product team and the marketing team each got their own dashboard, and once we set everything up, the number of requests we got from those two teams went down. We now try and give them a new report every time they ask for something, instead of running ad-hoc queries all the time for them.”

Many companies realize the importance of having good reporting functions fairly quickly. If they don’t adopt a dashboarding solution, they find some other way of delivering predictable data to their decision-makers. For instance, a small company we know uses email notifications and Slack notifications to deliver timely metrics to their business users. The point is that the numbers reach them on a repeatable, automated basis.

Eventually, new hires and existing operators alike learn to lean on their ‘dashboards’. This leads us to the next stage.

Three: Self-Service

More dashboard usage leads to more data-driven thinking … which in turn leads to more ad-hoc requests! As time passes, business operators who lean on their dashboards begin to adopt more sophisticated forms of thinking. They learn to rely less on their gut to make calls like “let’s target Japanese businessmen golfers in Ho Chi Minh City!”, or “let’s invest in fish instead of dogs!” This leads to an increase in ad-hoc, exploratory data requests.

The data team finds itself overwhelmed yet again. Some companies have experimented with SQL training for their business people. Others buy into the self-service narrative sold by the second wave of BI tools. This includes things like PowerBI’s usage paradigm and Tableau Desktop’s drag-and-drop interface. “Give them such tools,” they think, “And they’ll be able to help themselves to the answers they need, without bottlenecking on the data team.”

Both approaches have problems, but the biggest is that they often lead to the metrics knife fight: different business users may accidentally introduce subtly different metric definitions into their analyses.
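To make that concrete, here is a hedged illustration (the event table and both definitions are invented) of how two teams can each report ‘monthly active users’ and get different numbers:

```sql
-- Illustration only: two teams both report "monthly active users",
-- but with subtly different definitions. Names are invented.

-- Marketing's version: anyone who logged in during the calendar month.
SELECT COUNT(DISTINCT user_id) AS monthly_active_users
FROM events
WHERE event_type = 'login'
  AND event_at >= DATE '2024-05-01'
  AND event_at <  DATE '2024-06-01';

-- Product's version: anyone who performed a core action in a rolling
-- 30-day window, excluding internal test accounts.
SELECT COUNT(DISTINCT user_id) AS monthly_active_users
FROM events
WHERE event_type IN ('create_report', 'share_dashboard')
  AND event_at >= CURRENT_DATE - INTERVAL '30' DAY
  AND user_id NOT IN (SELECT user_id FROM internal_accounts);
```

Both queries are ‘correct’ by their own lights; they just don’t agree.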

These inconsistencies often lead to miscommunication, or — worse — errors of judgment at the executive level.

The Arc: Then and Now

The point we want to make here is that this arc is universal.

The arc occurs because data-driven thinking is a learned organizational behavior. It spreads slowly throughout a company’s culture.

Most people are not data-driven by nature. They have to learn it, like they learned reading or writing. In a sufficiently large company, however, you will find certain people who are naturally data-driven in their thinking; others that seem data-driven from the beginning may have come from more data-mature organizations and therefore seek to continue the practices that they were used to.

When viewed through this lens, the data capabilities that you build out in your team will have an impact on the spread of data-driven thinking in your organization. The more data capabilities you have, the more people will be exposed to the potential of using data to advance their arguments. The more data capabilities you build up, the more leverage you give to data-driven people in your company’s culture to spread their way of thinking.

As a result, the amount of work that your data team has to do increases linearly with the spread of data-driven thinking in your company! The cycle looks something like this: more data capabilities lead to more data-driven thinking, more data-driven thinking leads to more ad-hoc requests, and keeping up with those requests pushes you to build out yet more data capabilities.

The upshot is that if all goes well, your data team will find itself buried under a wave of ad-hoc requests. You will seek solutions to this problem. You will discover that dashboards and automated reports will buy you some time.

But eventually, as your organization moves from reporting to insights to predictions, you will have to tackle this problem head-on.

This arc shouldn’t surprise us. Spend even a small amount of time looking at industry conferences, data thought leadership articles, or marketing materials from vendors, and you’ll find an industry obsessed with self-service analytics as an ultimate goal. “Listen to us!” the thought leaders cry. “We will show you a way out of this mess!” To be clear, this is an understandable desire, because data-driven decision-making so often bottlenecks on the data team. Also to be clear: a majority of companies do not succeed in this effort. True self-service analytics is a difficult challenge.

Solving the Self-Service Analytics Problem Today

How are things different today? Is it possible to do better than the previous generations of BI tools?

If you’ve read our Analytics Setup Guidebook, you can probably guess where we stand on this. Unlike ‘second wave’ BI tools, we think that:

  • Data modeling at a central data warehouse is part of the solution space: define your business definitions once, in a modeling layer, and then parcel these models out for self-service. This way, you get all the benefits of self-service analytics without the problems of ill-defined, inconsistent metric definitions.
  • Another part of the solution space is analytics-as-code. We believe you should apply software engineering best practices to business intelligence: define your analytics and reporting logic as code, put it through a Git-based review process, make sure the automated tests pass, then deploy to production. That way, you know exactly when, where, and by whom a dashboard was changed. Your analytics output is tightly controlled, and business users can consume the data with confidence. (A rough sketch follows below.)
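As a rough sketch of what ‘define it once, as code’ might look like in practice (a generic illustration with invented names, not Holistics’ or Looker’s actual modeling syntax), the canonical metric lives in one version-controlled file, any change to it goes through review and automated tests, and every report reads from the resulting model:

```sql
-- models/product/monthly_active_users.sql
-- One canonical definition of "monthly active user", versioned in Git.
-- A change to this file goes through pull-request review and automated
-- tests before it reaches any dashboard. (Generic sketch: the file layout
-- and names are invented, not tied to any specific BI tool.)
CREATE OR REPLACE VIEW analytics.monthly_active_users AS
SELECT
    DATE_TRUNC('month', event_at) AS activity_month,
    COUNT(DISTINCT user_id)       AS monthly_active_users
FROM analytics.events
WHERE event_type IN ('create_report', 'share_dashboard')
GROUP BY 1;

-- Every self-service report then consumes the same definition:
SELECT activity_month, monthly_active_users
FROM analytics.monthly_active_users
ORDER BY activity_month;
```

The point is not the particular syntax; it is that the business definition exists in exactly one reviewed, tested place, and everything downstream reads from it.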

As far as we can tell, the only self-service BI tools to adopt this approach are Looker and Holistics. We expect more tools to adapt accordingly, however, especially if these ideas prove their worth in practice.

Will this approach win out in the end?

We’d like to think so; there are many advantages to this approach. However, we cannot know what problems will fall out of this new paradigm. We will have to wait a couple of years to see.