If you find that you are often querying the same aggregation query of ADX data, it may be useful to create a Materialized View. A Materialized view performs the aggregation in advance as data is added to the table. We can then query the Materialized View, rather than the table, eliminating the need for our query to perform aggregation.
Setup
For the examples in this article, we will use a table created with the following ADX commands:
.drop table customers .create table customers ( FullName:string, LastOrderDate:datetime, YtdSales:decimal, City:string, PostalCode:string ) .ingest inline into table customers <| 'Bill Gates', datetime(2022-01-10 11:00:00), 1000000, 'Redmond', '98052' 'Steve Ballmer', datetime(2022-01-06 10:30:00), 150000, 'Los Angeles', '90305' 'Satya Nadella', datetime(2022-01-09 17:25:00), 100000, 'Redmond', '98052' 'Steve Jobs', datetime(2022-01-04 13:00:00), 100000, 'Cupertino', '95014' 'Larry Ellison', datetime(2022-01-04 13:00:00), 90000, 'Redwood Shores', '94065' 'Jeff Bezos', datetime(2022-01-05 08:00:00), 750000, 'Seattle', '98109' 'David Giard', datetime(2022-01-02 09:01:00), 1.50, 'Chicago', '60605'
You can run the examples in this article in either the ADX Data Explorer web page or in Kusto.Explorer - a rich client Windows application that you can download for free from here.
Creating a Materialized View
Here is some of the syntax for creating a Materialized View, using the features that we found most useful:
.create async materialized-view
with (backfill=backfill_status, docString='description' )
materialized_view_name
on table source {
aggregation_query
}
where:
- backfill_status is true, if you want to calculate the aggregation for all existing rows in the table. For large tables, this can take a long time. Set this to false if you only want to aggregate data inserted after the view is created.
- description is a brief description of the view, making it easier for others to identify its purpose.
- materialized_view_name is the name of the view to create
source is the name of the source table (or another Materialized View) - aggregation_query is a KQL query and/or set of commands that returns a dataset. This query must include aggregated data, such as avg, min, or max.
You must add the async keyword if you set backfill=true.
You can find the full syntax here.
Here is an example:
.create async materialized-view with (backfill=true, docString='Summary customer sales' ) myMaterializedView on table customers { customers | summarize numCustomers=count(), minSales=min(YtdSales), maxSales=max(YtdSales), avgSales=avg(YtdSales) by PostalCode }
myMaterializedView | where avgSales > 100000
Fig. 1
Altering a Materialized View
Use the .alter materialized-view to modify an existing View. The syntax is nearly identical to the .create materialized-view command.
For example, the following command will remove the minSales aggregated column in the view I created above.
.alter materialized-view myMaterializedView on table customers { customers | summarize numCustomers=count(), maxSales=max(YtdSales), avgSales=avg(YtdSales) by PostalCode }
Removing a Materialized View
You can remove a materialized view with the .drop materialized-view command, as in the following example:
drop materialized-view myMaterializedView
Conclusion
Although there is an initial performance hit when rows are inserted, using a Materialized View can speed up your queries considerably.