Optimizing Embedded Analytics Dashboard Performance by Pre-transforming Data

Your embedded analytics dashboard in Holistics takes a long time to load, thus impacting your customers' experience. This post shows you how to optimize the performance of embedded dashboard by preaggregating the data.

Suppose you run an ad network, and are using Holistics Embedded Analytics to display analytics to your existing customers via your internal portal. In your dashboard, you have a widget that shows number of ad impressions that each of your customers achieve during a certain period of time.

Your widget's SQL would look something like:

select count(*)
from ad_impressions A
where A.customer_id = {{customer_id}} -- CIV
  and A.ts::date >= {{start_date}}
  and A.ts::date <= {{end_date}}

Here, start_date and end_date are 2 date filter variables on your dashboard, while customer_id is the client identifier variable you set on the dashboard.

This query usually takes a long time to run since it needs to scan through a large amount of ad impressions data (millions of rows). Thus the impacts are:

  • Bad Customer Experience: It delivers a bad experience to your clients (they need to wait a long time for the data to load)
  • Resources Hogging on Database: Since each query rescans lots of rows, it takes up resources on your database every time.
  • Wasted Embedded Workers: Since each long-running query takes a Holistics embedded worker, this will cost you a lot more embedded workers than needed.

Among the negative impacts above, the "bad customer experience" is among the worst things that you don't want to happen, since it directly affects your image as a B2B business.

The Solution: Pre-transforming Your Data

The solution is simple yet very effective: build a summarized table. In the above example, the data is refreshed daily, thus we just need to build a daily table that summarizes the impressions by each customer.

To do this without asking (and waiting) for your engineers to write code and deploy, you can use Holistics Data Transforms functionality. It will automatically take care of:

  • Creating and filling the table with data
  • Scheduling the daily runs (usually after midnight)
  • Error handling and other minor stuff.
  • Note: you can even set up the transform to run incrementally, e.g. each day it will only process data from yesterday. For simplicity we don't mention it here.

You can read more on Data Transform docs to learn the step-by-step how to. After that, read more below to see how we apply it in this context.

We build an ad_impressions_daily table that stores how many impressions each customer has each day. The SQL query to build the table would look like:

select
  ts::date as date_d,
  customer_id,
  count(1) as impressions
from ad_impressions A
group by 1, 2

Once we have the daily table built, simply change the widget's SQL to:

select sum(impressions) as total_impressions
from ad_impressions_daily A
where A.customer_id = {{customer_id}} -- CIV
  and A.date_d >= {{start_date}}
  and A.date_d <= {{end_date}}

This query scans a significantly smaller amount of data, thus would run almost instantly. This would thus deliver a faster experience to your customers, spend less database resources and optimize your embedded workers.

Conclusion

The above post shows how you can improve performance of your embedded analytics dashboard by pre-aggregating your data (and done so easily with Holistics Data Transforms feature). Feel free to let us know if you have any questions or ideas.