The Kusto Query Language (KQL) is ideal for analyzing time series data stored in Azure Data Explorer (ADX).
For the examples in this article, we will use a table created with the following ADX commands:
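A minimal stand-in for such a table could be created as sketched below. The table name DeviceTelemetry, the column name BatteryLife, and the sample values are assumptions made for illustration; only a TimeStamp column and a battery life value are referred to later in this article.

```kusto
.set-or-replace DeviceTelemetry <|
    // Hypothetical sample rows: one battery reading every 15 minutes.
    // Table name, column names, and values are illustrative assumptions.
    datatable(TimeStamp: datetime, BatteryLife: real)
    [
        datetime(2024-01-01 00:00:00), 100.0,
        datetime(2024-01-01 00:15:00), 98.5,
        datetime(2024-01-01 00:30:00), 97.0,
        datetime(2024-01-01 00:45:00), 96.0,
        datetime(2024-01-01 01:00:00), 94.5,
        datetime(2024-01-01 01:15:00), 93.0,
        datetime(2024-01-01 01:30:00), 92.0,
        datetime(2024-01-01 01:45:00), 90.5
    ]
```

The .set-or-replace command creates the table if it does not exist and replaces its data if it does, so the sketch can be rerun safely.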
To use the time series functionality, it is important to:
- Have a column storing a time value
- Sort your data by that time value column
Getting the Previous Value
After you have sorted the data, you can use the KQL prev function to retrieve the value of any column from the previous row in the sorted order. This function works only on a serialized row set, such as one that has been sorted using the order by clause.
The syntax is
prev(column)
where column is the name of the column from which to retrieve the previous row's value.
The following example retrieves the battery life from the previous row, then calculates the delta between the current row and the previous row.
The results of this query are shown in Fig. 1.
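A query along these lines might look like the following sketch. The inline datatable, the name DeviceTelemetry, and the columns BatteryLife, PreviousBatteryLife, and BatteryLifeDelta are assumptions made so that the query can run on its own; substitute your own table and column names.

```kusto
// Hypothetical inline sample data standing in for the article's table.
let DeviceTelemetry = datatable(TimeStamp: datetime, BatteryLife: real)
[
    datetime(2024-01-01 00:00:00), 100.0,
    datetime(2024-01-01 00:15:00), 98.5,
    datetime(2024-01-01 00:30:00), 97.0,
    datetime(2024-01-01 00:45:00), 96.0
];
DeviceTelemetry
// order by sorts descending unless told otherwise, so request ascending explicitly.
// Sorting also serializes the rows, which prev() requires.
| order by TimeStamp asc
// Value of BatteryLife on the preceding row; null on the first row.
| extend PreviousBatteryLife = prev(BatteryLife)
// Change since the preceding reading; null on the first row.
| extend BatteryLifeDelta = BatteryLife - PreviousBatteryLife
```

With this sample data, BatteryLifeDelta is null for the first row and then -1.5, -1.5, and -1.0 for the rows that follow.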
By default, the prev function returns the value from 1 row before the current row. However, you can go back any number of rows by providing the optional offset argument. The syntax is:
prev(column, offset)
By default, the prev function returns null if there is no previous value (for example, for the first row in the dataset). However, you can provide a different default for these cases with the optional default_value parameter. The syntax is:
prev(column, offset, default_value)
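As an illustration of the two optional arguments, the sketch below looks two rows back, first with the null default and then with an explicit default. The table, the column names, and the choice of -1.0 as a sentinel default are assumptions made for illustration.

```kusto
// Hypothetical inline sample data standing in for the article's table.
let DeviceTelemetry = datatable(TimeStamp: datetime, BatteryLife: real)
[
    datetime(2024-01-01 00:00:00), 100.0,
    datetime(2024-01-01 00:15:00), 98.5,
    datetime(2024-01-01 00:30:00), 97.0,
    datetime(2024-01-01 00:45:00), 96.0
];
DeviceTelemetry
| order by TimeStamp asc
// Two rows back; null on the first two rows, where no such row exists.
| extend TwoRowsBack = prev(BatteryLife, 2)
// Two rows back; -1.0 instead of null on the first two rows.
| extend TwoRowsBackOrDefault = prev(BatteryLife, 2, -1.0)
```

The default value is written as a real literal so that it matches the type of the BatteryLife column.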
As you may have guessed, there is also a next function that works exactly the same way, except that it returns a value from the next row in the series, rather than the previous one.
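A short sketch of next, again using an assumed inline table and assumed column names, shows the mirror image of the prev example:

```kusto
// Hypothetical inline sample data standing in for the article's table.
let DeviceTelemetry = datatable(TimeStamp: datetime, BatteryLife: real)
[
    datetime(2024-01-01 00:00:00), 100.0,
    datetime(2024-01-01 00:15:00), 98.5,
    datetime(2024-01-01 00:30:00), 97.0,
    datetime(2024-01-01 00:45:00), 96.0
];
DeviceTelemetry
| order by TimeStamp asc
// Value of BatteryLife on the following row; null on the last row.
| extend NextBatteryLife = next(BatteryLife)
// Change between this reading and the following one; null on the last row.
| extend DeltaToNext = NextBatteryLife - BatteryLife
```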
Summarizing Data Into Bins
KQL provides the bin function to use when aggregating data. Typically, when you aggregate data, you use the by clause to group by a field or fields in the table. The bin function allows you to group time series data by time increments. If you have data points for every 15 minutes, you can return one result for each 1-hour interval. The syntax is:
bin(value, roundTo)
- value is a column containing datetime values
- roundTo is a timespan indicating how far apart each grouping should occur
An example will help. The following query returns one row for each 1-hour interval, even though our sample data contains values every 15 minutes.
The results of this query are shown in Fig. 2.
If you have multiple rows within your specified interval, it is reasonable to ask which row's values will be returned. The arg_max aggregation function in the example above takes care of this. It tells Kusto to return the row with the maximum TimeStamp value in that interval. The first argument in arg_max specifies which column to consider when determining the maximum, and the other arg_max arguments determine which other column values to return.
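The combination of bin and arg_max described above might be sketched as follows. The inline table, its values, and the output column name Hour are assumptions made for illustration.

```kusto
// Hypothetical inline sample data: one reading every 15 minutes over two hours.
let DeviceTelemetry = datatable(TimeStamp: datetime, BatteryLife: real)
[
    datetime(2024-01-01 00:00:00), 100.0,
    datetime(2024-01-01 00:15:00), 98.5,
    datetime(2024-01-01 00:30:00), 97.0,
    datetime(2024-01-01 00:45:00), 96.0,
    datetime(2024-01-01 01:00:00), 94.5,
    datetime(2024-01-01 01:15:00), 93.0,
    datetime(2024-01-01 01:30:00), 92.0,
    datetime(2024-01-01 01:45:00), 90.5
];
DeviceTelemetry
// Group rows into 1-hour bins and, within each bin, keep the row with the
// latest TimeStamp, returning its BatteryLife alongside it.
| summarize arg_max(TimeStamp, BatteryLife) by Hour = bin(TimeStamp, 1h)
| order by Hour asc
```

With this sample data, the query returns two rows: the 00:00 bin carries the 00:45 reading (96.0) and the 01:00 bin carries the 01:45 reading (90.5). Naming the bin column Hour avoids a name clash with the TimeStamp column that arg_max returns.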
The bin function can also be used to group numeric data, for example to return one row for each range of 100 values.
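A small sketch of numeric binning, using an assumed table named Readings with an assumed ItemId column, counts how many values fall into each range of 100:

```kusto
// Hypothetical numeric sample data.
let Readings = datatable(ItemId: long)
[
    3, 57, 120, 199, 250
];
Readings
// bin(ItemId, 100) rounds each value down to a multiple of 100,
// so 3 and 57 land in bin 0, 120 and 199 in bin 100, and 250 in bin 200.
| summarize ItemCount = count() by ItemRange = bin(ItemId, 100)
| order by ItemRange asc
```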
There are many other KQL features to help you work with time series data, but this article covered the ones that my team has found most useful.