
Push-down Semantic Layer

Last updated: 2026-04-15

A push-down semantic layer translates metric definitions into SQL and sends the computation to the data warehouse. The warehouse executes the query, handles the aggregation, and returns results. No data moves out of the warehouse into an intermediate engine.

This architecture contrasts with extract-based (in-memory) approaches where data is pulled into a separate engine for processing. The distinction shapes performance characteristics, scalability limits, and operational complexity.

Push-down vs. extract architectures

Push-down means the semantic layer acts as a query compiler. It takes a user's analytical question – "revenue by region, filtered to Q1, compared to last year" – translates it into optimized SQL, and sends that SQL to the warehouse. BigQuery, Snowflake, Redshift, or Databricks does the actual computation. The semantic layer never holds the data.
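The "query compiler" idea can be sketched in a few lines. This is a deliberately minimal illustration, not any real tool's implementation; the metric definition format, table, and column names are all invented:

```python
# Minimal sketch of a semantic layer as a query compiler: a metric
# definition plus a user's question becomes one SQL string that is
# sent to the warehouse. All names here are hypothetical.

def compile_metric(metric, dimension, filters):
    """Translate a metric request into a single SQL string."""
    where = " AND ".join(f"{col} = '{val}'" for col, val in filters.items())
    return (
        f"SELECT {dimension}, {metric['agg']}({metric['column']}) AS {metric['name']}\n"
        f"FROM {metric['table']}\n"
        f"WHERE {where}\n"
        f"GROUP BY {dimension}"
    )

revenue = {"name": "revenue", "agg": "SUM", "column": "amount", "table": "orders"}
sql = compile_metric(revenue, "region", {"quarter": "Q1"})
print(sql)  # the warehouse runs this; the semantic layer never sees the data
```

The point of the sketch is that the output is text, not data: the semantic layer's product is SQL, and everything after that happens inside the warehouse.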

Extract (in-memory) means the tool pulls data from the warehouse into its own processing engine before running calculations. Power BI's VertiPaq engine imports data into compressed columnar storage. Tableau's Hyper engine does something similar. The analytical computation runs on the tool's own infrastructure rather than the warehouse.
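The extract model can be sketched the same way, here using an in-memory SQLite database to stand in for both the warehouse and the tool's local engine (the schema and data are invented for illustration):

```python
import sqlite3

# Hypothetical sketch of the extract model: rows are pulled out of the
# "warehouse" once, and every analytical query afterwards runs on the
# local copy rather than the warehouse.
warehouse = sqlite3.connect(":memory:")  # stands in for the remote warehouse
warehouse.execute("CREATE TABLE orders (region TEXT, amount REAL)")
warehouse.executemany("INSERT INTO orders VALUES (?, ?)",
                      [("EU", 100.0), ("EU", 50.0), ("US", 75.0)])

# Extract step: one bulk pull moves the data into the tool's own engine.
local_engine = sqlite3.connect(":memory:")
local_engine.execute("CREATE TABLE orders (region TEXT, amount REAL)")
local_engine.executemany("INSERT INTO orders VALUES (?, ?)",
                         warehouse.execute("SELECT region, amount FROM orders"))

# Analytical queries now hit the local copy, not the warehouse.
rows = local_engine.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 150.0), ('US', 75.0)]
```

Everything that happens after the extract step is invisible to the warehouse, which is exactly where the freshness gap described below comes from: the local copy only changes when the import runs again.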

These are fundamentally different bets about where computation should live.

Why push-down matters

No data movement. Data stays in the warehouse. This eliminates an entire class of problems – data freshness gaps between the warehouse and the BI tool's cache, import failures, storage limits on the BI tier, and the security implications of copying sensitive data to another system.

Warehouse-scale processing. Modern cloud warehouses handle petabyte-scale datasets with elastic compute. A push-down semantic layer inherits that scale automatically. The tool doesn't need its own scaling infrastructure because it offloads the heavy work to a system built specifically for it.

Single compute bill. All analytical computation runs on the warehouse, which means compute costs consolidate into one line item. Extract-based tools split computation between the warehouse (for initial data pulls) and the tool's own engine (for analytical queries), complicating cost management.

Consistent data. Every query hits the warehouse directly, so results always reflect the current state of the data. Extract-based tools introduce a lag between the warehouse update and the next import cycle.

Tradeoffs

Push-down has clear advantages, but it comes with real constraints:

Latency depends on the warehouse. A dashboard backed by a push-down semantic layer is only as fast as the warehouse can return results. For exploratory, click-heavy analysis where users expect sub-second response times, a cold warehouse query might take several seconds. In-memory engines like VertiPaq deliver near-instant response for imported datasets because the data is already loaded and indexed locally.

Warehouse concurrency limits. Heavy dashboard usage generates many concurrent queries. If 200 users open dashboards simultaneously, the warehouse must handle 200 concurrent analytical queries. This can strain warehouse concurrency limits or spike compute costs. In-memory tools absorb that load on their own infrastructure.
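One common mitigation is to queue queries at the semantic-layer tier rather than letting every dashboard open hit the warehouse at once. A minimal sketch with a semaphore (the slot count and the stand-in query function are assumptions, not any specific tool's behavior):

```python
import threading

# Hypothetical sketch: cap concurrent warehouse queries with a semaphore
# so a dashboard stampede queues instead of overwhelming the warehouse.
MAX_CONCURRENT = 8
warehouse_slots = threading.BoundedSemaphore(MAX_CONCURRENT)
completed = []

def run_dashboard_query(user_id):
    with warehouse_slots:          # blocks when all slots are in use
        completed.append(user_id)  # stand-in for the actual warehouse call

# 200 users open dashboards at once; at most 8 queries run concurrently.
threads = [threading.Thread(target=run_dashboard_query, args=(i,))
           for i in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(completed))  # 200 — all served, just not all at once
```

Queueing trades latency for stability: users waiting behind the semaphore see slower dashboards, but the warehouse sees a bounded load instead of a spike.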

Query optimization burden. The SQL generated by a push-down semantic layer must be efficient. Poorly optimized generated queries – unnecessary joins, missing predicate pushdown, redundant subqueries – hit the warehouse hard. The analytics query language powering the semantic layer determines how well this translation works.
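The predicate-pushdown point is easiest to see side by side. In this illustration (invented schema, SQLite standing in for the warehouse), both approaches return identical results, but the first pulls every row out of the table before filtering, while the second runs the filter and aggregation inside the database:

```python
import sqlite3

# Same metric, computed with and without predicate pushdown.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, quarter TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("EU", "Q1", 100.0), ("EU", "Q2", 40.0), ("US", "Q1", 75.0),
])

# Naive generated query: fetch everything, filter and aggregate client-side.
all_rows = db.execute("SELECT region, quarter, amount FROM orders").fetchall()
naive = {}
for region, quarter, amount in all_rows:
    if quarter == "Q1":
        naive[region] = naive.get(region, 0.0) + amount

# Pushed-down query: the predicate and aggregation run inside the database,
# so only the final aggregated rows cross the wire.
pushed = dict(db.execute(
    "SELECT region, SUM(amount) FROM orders WHERE quarter = 'Q1' GROUP BY region"
))

assert naive == pushed  # identical results; very different data movement
```

At warehouse scale the naive version is not just slow, it is expensive: every row scanned and shipped shows up on the compute bill.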

Which tools use which approach

Push-down: Holistics compiles AML definitions into warehouse-native SQL. Looker sends LookML-compiled queries to the warehouse. Lightdash and MetricFlow (dbt) generate SQL for warehouse execution. These tools intentionally avoid maintaining their own data storage.

In-memory / extract: Power BI imports data into VertiPaq and runs DAX calculations locally. Tableau uses its Hyper engine for extracts. Both tools offer a "live connection" mode that pushes queries to the warehouse, but their richest analytical features – complex DAX measures, Tableau calculations – often work best with imported data.

Hybrid: Some tools use caching layers or pre-aggregation tables on top of a push-down architecture. Cube pre-aggregates data in a cache to accelerate common queries while still querying the warehouse for cache misses.
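The hybrid pattern is essentially cache-aside on top of push-down. A minimal sketch (the cache keying and the warehouse stub are simplifications; real tools like Cube key on pre-aggregation definitions, not raw SQL strings):

```python
# Hypothetical sketch of the hybrid pattern: serve repeated queries from
# a cache, fall through to the warehouse on a miss.
cache = {}

def run_on_warehouse(sql):
    # Stand-in for real warehouse execution.
    return [("EU", 150.0), ("US", 75.0)]

def query(sql):
    if sql in cache:                # cache hit: no warehouse round trip
        return cache[sql]
    result = run_on_warehouse(sql)  # cache miss: push down to the warehouse
    cache[sql] = result
    return result

sql = "SELECT region, SUM(amount) FROM orders GROUP BY region"
first = query(sql)   # hits the warehouse
second = query(sql)  # served from the cache
assert first == second
```

The hard part in practice is not the lookup but invalidation: deciding when cached or pre-aggregated results are stale relative to the warehouse, which is the freshness problem push-down was meant to avoid, reintroduced in a smaller, more controllable form.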

The trend in modern data stacks favors push-down. As warehouses get faster and cheaper, the performance gap narrows. The architectural simplicity of keeping data in one place becomes increasingly attractive.

The Holistics Perspective

Holistics uses a push-down architecture. AQL compiles to optimized SQL that runs directly on the customer's data warehouse. There is no intermediate computation engine, no data extraction, and no performance bottleneck outside the warehouse itself.
