Geo Functions in KQL

Comments [0]

Kusto Query Language (KQL) contains many built-in functions to work with Geographic data. In this article, I will describe some of those that I found most useful.

Setup

For the examples in this article, we will use a table created with the following ADX commands:

.drop table vehicleLocations

.create-merge table vehicleLocations
(
VehicleId:int,
TimeStamp:datetime,
Longitude:decimal,
Latitude:decimal
)

.ingest inline into table vehicleLocations <| 
1, datetime(2022-01-10 10:45:00), -73.99446487426758, 40.73857555787898
1, datetime(2022-01-10 11:00:00), -73.97476673126219, 40.73857555787898
1, datetime(2022-01-10 11:15:00), -73.97476673126219, 40.74091672247485
1, datetime(2022-01-10 11:30:00), -73.99446487426758, 40.74091672247485
2, datetime(2022-01-10 11:01:00), -73.92, 40.72
2, datetime(2022-01-10 11:02:00), -73.99, 40.61
2, datetime(2022-01-10 11:03:00), -73.90, 40.55
2, datetime(2022-01-10 11:04:00), -73.85, 40.49
2, datetime(2022-01-10 11:04:00), -73.79, 40.48

Useful Functions

geo_distance_2points()

This function calculates the distance in meters between 2 points, given the latitude and longitude of each. The syntax is:

geo_distance_2points(p1_longitude, p1_latitude, p2_longitude, p2_latitude)

where:

  • p1_longitude, p1_latitude are the latitude and longitude of the first point
  • p2_longitude, p2_latitude are the latitude and longitude of the second point

The path between the points is assumed to be a straight line, so it may or may not be the path taken to move from one point to the other. For example, you may be tracking a car's movements and that car is likely be restricted to driving on roads and not driving through buildings.

Here is a sample call:

vehicleLocations
| where VehicleId == 1
| order by VehicleId asc, TimeStamp asc
| project  VehicleId, TimeStamp, Longitude, Latitude,
distance=geo_distance_2points(Longitude, Latitude, prev(Longitude), prev(Latitude))

The query above yields the results shown in Fig. 1

Results of geo_distance_2points function
Fig. 1

geo_point_in_polygon()

This function returns true, if a given point is inside a given polygon; otherwise, it returns false.

The syntax is:

geo_point_in_polygon(longitude, latitude, polygon)

where:

  • longitude is the longitude of the point to test
  • latitude is the longitude of the point to test
  • polygon is the polygon in question

There are 2 ways to create a polygon.

One is to assign an array of longitude/latitude pairs to an object of type polygon. The other is to use the pack function and pass in an object containing an array of longitude/latitude pairs.

Below are examples of each type:

let polygon = 
```
{
"type": "Polygon",
"coordinates": 
[[
[-73.96219253540039, 40.782816128657224],
[-73.96682739257812, 40.77631678827737],
[-73.96176338195801, 40.77202687527417],
[-73.95579814910889, 40.77053184050704],
[-73.95034790039062, 40.77254687948199],
[-73.94682884216309, 40.77732422768091],
[-73.94918918609619, 40.78125634496216],
[-73.96047592163086, 40.78470081841747],
[-73.96219253540039, 40.782816128657224]
]]
}
```;
let p1Lon = -73.99892807006836;
let p1Lat = 40.72924259684576;
let p2Lon = -73.98395061492919;
let p2Lat = 40.72924259684576;
let p3Lon = -73.98395061492919;
let p3Lat = 40.74976037842817;
let p4Lon = -73.99892807006836;
let p4Lat = 40.74976037842817;
let p1 = pack_array(p1Lon, p1Lat);
let p2 = pack_array(p2Lon, p2Lat);
let p3 = pack_array(p3Lon, p3Lat);
let p4 = pack_array(p4Lon, p4Lat);
let polygon = pack("type","Polygon","coordinates", pack_array(pack_array(p1, p2, p3, p4, p1)));

The polygons above each define a rectangle in New York City, as shown in Fig. 2

Polygon shown on map of New York City
Fig. 2

Here is a sample call to geo_point_in_polygon, using either of the polygon objects created above:

vehicleLocations
| where geo_point_in_polygon(Longitude, Latitude, polygon)
| order by VehicleId asc, TimeStamp asc

The query above will return only those rows that are inside the polygon. The results are shown in Fig. 3.

Results of geo_point_in_polygon function
Fig. 3

geo_point_to_s2cell()

The S2 system divides the Earth into rectangles of approximately equal size. The rectangles take into account the curvature of the Earth's globe, eliminating gaps and distortions caused by models that attempt to flatten the Earth's surface. The size of each rectangle is dependent upon the level (00 through 30). Lower level values divide the earth into larger rectangles, so it takes fewer rectangles to cover the globe. For a given S2 level, if two points share the same S2 value, they exist in the same rectangle.

You can read more about S2 at s2geometry.io

The syntax of geo_point_to_s2cell() is:

geo_point_to_s2cell(longitude, latitude, level)

where:

  • longitude and latitude represent a point on the globe
  • level is the level that defines the S2 rectangle in which to place this point

The following example places calculates the S2 Level for every data point of Vehicle 2:

vehicleLocations
| extend s2_13 = geo_point_to_s2cell(Longitude, Latitude, 30)
| where VehicleId == 2 

The query above yields the results shown in Fig. 4

Results of geo_point_to_s2cell function
Fig. 4

geo_polygon_to_s2cells()

This function returns an array of s2 cell values that cover a polygon. The syntax is:

geo_polygon_to_s2cells(polygon, S2_level)

where:

  • polygon is a polygon, as described above
  • S2_level is the level (0-30) defining the size of the S2 rectangles, as described above.

For example, the following code will return an array of 15-level S2 cells that completely covers a polygon named myPolygon. Every S2 cell in the area will overlap myPolygon.

geo_polygon_to_s2cells(myPolygon, 15)

You can then use this array to further refine searches, as in the example below:

.drop table vehicleLocations
let p1Lon = -73.99892807006836;
let p1Lat = 40.72924259684576;
let p2Lon = -73.98395061492919;
let p2Lat = 40.72924259684576;
let p3Lon = -73.98395061492919;
let p3Lat = 40.74976037842817;
let p4Lon = -73.99892807006836;
let p4Lat = 40.74976037842817;
let p1 = pack_array(p1Lon, p1Lat);
let p2 = pack_array(p2Lon, p2Lat);
let p3 = pack_array(p3Lon, p3Lat);
let p4 = pack_array(p4Lon, p4Lat);
let polygon = pack("type","Polygon","coordinates", pack_array(pack_array(p1, p2, p3, p4, p1)));
let s2Array =geo_polygon_to_s2cells(polygon, 12);
let devicesInS2Boxes = 
    vehicleLocations
    | extend s2_12 = geo_point_to_s2cell(Longitude, Latitude, 12) 
    | where s2_12 has_any (s2Array);
devicesInS2Boxes
| where geo_point_in_polygon(Longitude, Latitude, polygon)

This code returns the same results as in Fig. 3 above; however, it can be much faster if our table contains many points all over the world, far from the polygon. By first finding the overlapping S2 cells, we speed our query by only considering points near the polygon. This will run even faster if we pre-populate the S2 cell value.

Conclusion

In this article, I described the Geo features of KQL that we found most useful. You can view more KQL Geo functions here