Is Big Geospatial Data Analytics the vision of the future?
Geospatial data by its very nature is voluminous. It’s relatively easy to collect. However, it’s hard to efficiently store, manage, process and display, especially as improved, updated and additional data is continuously collected over time. Read this blog to find out more about the future of Big Geospatial Data Analytics.
I recently attended a Big Geospatial Data Analytics workshop organized and hosted by the Canadian federal government. The event focused on geospatial data and analysis, covering optical, radar and non-Earth Observation data. It was noted that processing just a few Earth Observation scenes does not constitute big data, but over the years, as more scenes and more sensors are added, the archive of raster data becomes enormous.
One of the obvious differentiators between raster and vector datasets is data volume.
For example, if one considers only the most recent data for a large geographic area like Canada, the satellite imagery data volume would be significantly larger than the vector map data volume. However, the satellite imagery that has been collected over several decades is still useful, so keeping this temporal data compounds the data volume problem because multiple dates of imagery for the same area are often used in the analysis.
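To make the arithmetic concrete, here's a minimal NumPy sketch (the scene size and date count are hypothetical) of a multi-date stack for one area; every date you keep adds another full copy of the scene:

```python
import numpy as np

# Hypothetical example: five acquisition dates of the same 2,000 x 2,000
# pixel scene, stored as a (dates, rows, cols) stack of float32 values.
dates, rows, cols = 5, 2000, 2000
stack = np.random.rand(dates, rows, cols).astype(np.float32)  # stand-in data

# A simple per-pixel temporal signal: the range of values across dates --
# the kind of multi-date analysis that makes keeping history worthwhile.
change = stack.max(axis=0) - stack.min(axis=0)

# Each float32 date adds rows * cols * 4 bytes, so storage grows roughly
# linearly with the number of dates retained.
print(f"one date: {rows * cols * 4 / 1e6:.0f} MB, "
      f"stack: {stack.nbytes / 1e6:.0f} MB")
```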
Then add the non-optical imagery, such as radar and LiDAR, plus the imagery collected not by satellites but by aerial survey programs using aircraft and drones.
Finally, add in mapping or GIS vector data, and now you're talking really big data for Canada-wide coverage.
By comparison, data volumes for smaller regional analyses are less of an issue, but when you're processing data for an area as large as Canada, the problem is significant. With the advent of relatively cheap cloud storage and processing environments, however, it becomes manageable.
To build a big geospatial data analytics system from scratch would be a daunting, incredibly risky and expensive task, even for the federal government. But, with so much reasonably priced capacity available in the cloud these days, the development of such a system using off-the-shelf technology is quite feasible. In fact, Esri has already developed a free online service in the cloud for viewing and analyzing open Landsat satellite image data that has been collected over a period of nearly five decades.
This Landsat image of Winnipeg, taken in September 2016, has been processed in real time to show agricultural areas in green. Check out your own favorite geographic area at Esri’s Unlock Earth’s Secrets website.
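That "agriculture in green" rendering is the kind of product simple band math makes possible. As a rough sketch (not Esri's actual processing chain), here's how a normalized difference vegetation index (NDVI) could be computed from Landsat red and near-infrared bands; the input arrays are assumed:

```python
import numpy as np

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red).

    Healthy vegetation reflects strongly in the near infrared and absorbs
    red light, so values near +1 suggest dense, growing crops.
    """
    red = red.astype(np.float32)
    nir = nir.astype(np.float32)
    denom = nir + red
    # Avoid division by zero over water or no-data pixels.
    return np.where(denom > 0, (nir - red) / denom, 0.0)

# Hypothetical reflectance arrays for the same scene footprint.
red_band = np.array([[0.10, 0.08], [0.25, 0.30]], dtype=np.float32)
nir_band = np.array([[0.45, 0.50], [0.26, 0.31]], dtype=np.float32)
print(ndvi(red_band, nir_band))  # high values = likely vegetation
```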
Optical imagery is relatively easy to store, manage and process, mainly because the data is well understood and can be pre-processed into manageable chunks. Spectral band data for the same area and collection date is generally kept together, so the orderly, well-understood nature of optical pixels lends itself well to automated processing.
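Those "manageable chunks" map directly onto tiled raster formats. As a hedged sketch, assuming a large internally tiled GeoTIFF at a hypothetical path, the open-source rasterio library can walk the file one native block at a time, so a Canada-scale mosaic never has to fit in memory:

```python
import rasterio  # pip install rasterio

# Hypothetical path to a large, internally tiled multi-band GeoTIFF.
with rasterio.open("landsat_mosaic.tif") as src:
    total = 0.0
    count = 0
    # Iterate over the file's native blocks (tiles) for band 1, so only
    # one small window of pixels is held in memory at a time.
    for _, window in src.block_windows(1):
        chunk = src.read(1, window=window)
        total += chunk.sum()
        count += chunk.size
    print(f"mean of band 1 over {count} pixels: {total / count:.3f}")
```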
The biggest issue with optical data is Earth’s atmosphere: haze, clouds (and their resulting shadows) and sometimes smoke obscure or modify the surface reflectance recorded by the imaging sensor.
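Handling those atmospheric effects usually starts with a per-pixel quality mask. Here's a minimal sketch assuming a Landsat Collection 2 QA_PIXEL band, where bit 3 flags cloud and bit 4 flags cloud shadow (verify the bit layout against the product guide for your own data):

```python
import numpy as np

# Bit positions as documented for Landsat Collection 2 QA_PIXEL;
# confirm against the product guide for the data you actually use.
CLOUD_BIT = 3
SHADOW_BIT = 4

def clear_sky_mask(qa_pixel: np.ndarray) -> np.ndarray:
    """Return True where a pixel is flagged neither cloud nor cloud shadow."""
    cloud = (qa_pixel >> CLOUD_BIT) & 1
    shadow = (qa_pixel >> SHADOW_BIT) & 1
    return (cloud == 0) & (shadow == 0)

# Hypothetical QA values: 0 = clear, bit 3 set = cloud, bit 4 set = shadow.
qa = np.array([[0, 1 << 3], [1 << 4, 0]], dtype=np.uint16)
reflectance = np.array([[0.21, 0.65], [0.05, 0.18]], dtype=np.float32)

# Mask out contaminated pixels before any analysis.
clean = np.where(clear_sky_mask(qa), reflectance, np.nan)
print(clean)
```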
Much of the non-imaging sensor data and map data is also relatively well understood and can easily be stored and analyzed in the cloud. Take, for example, Esri Canada’s Community Map of Canada, which contains vector map data from authoritative municipal, provincial and federal government sources. Esri Canada knits this data into seamless Canada-wide coverage at a number of map scales, allowing users to zoom from a continental view of the whole country down to street furniture in Toronto or Vancouver. These multiscale maps are free to view and include some analytical functions.
Examples of image and map basemaps of the area near the entrance to Gatineau Park in Québec, built from data collected and processed through the Esri Canada Community Maps Program.
It is more difficult to create a suite of online products for viewing and analyzing imaging radar data (as opposed to weather Doppler radar data). Radar is a very useful and flexible sensor that does not suffer the atmospheric problems of optical sensors, which makes it especially valuable for cloudy areas such as Canada’s coastline and the North.
However, that same flexibility makes creating analysis-ready radar data much more difficult: microwave frequency, pulse frequency, beam mode, look angle, wave polarization and a host of other factors all need to be considered. Selecting the processing steps for a visualization or analysis-ready radar product can take time and thought.
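One way to keep those choices manageable is to carry the acquisition parameters alongside each scene and let the pipeline reason over them. Here's a hedged sketch; the field names and the heuristic are illustrative, not any agency's actual schema:

```python
from dataclasses import dataclass

@dataclass
class RadarScene:
    """Acquisition parameters that drive radar processing choices.

    Illustrative only; real SAR metadata schemas are far richer.
    """
    frequency_band: str        # e.g. "C" (~5.4 GHz) or "L" (~1.3 GHz)
    beam_mode: str             # e.g. "IW" (interferometric wide swath)
    polarizations: tuple       # e.g. ("VV", "VH")
    incidence_angle_deg: float

def needs_terrain_flattening(scene: RadarScene, rugged_terrain: bool) -> bool:
    # A rough heuristic: steep look angles over rugged terrain distort
    # backscatter most, so radiometric terrain flattening matters there.
    return rugged_terrain and scene.incidence_angle_deg < 35.0

scene = RadarScene("C", "IW", ("VV", "VH"), 31.2)
print(needs_terrain_flattening(scene, rugged_terrain=True))  # True
```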
Sentinel-2 images showing colour infrared and natural colour composites.
So, is Canada ready for big geospatial data analytics?
I think Canada is well positioned to take advantage of big data technology, moving the country to the next level of sophistication in providing scientists, engineers and interested citizens with the information and capabilities to better understand Canadian geography.
It’s clear from the Big Geospatial Data Analytics workshop that the federal government is beginning to look beyond traditional methods of creating and providing Earth Observation products and services to their clients. By developing and implementing a relatively modest cloud strategy, they’ll be able to meet their complex and often-changing requirements without breaking the bank. If the federal geospatial community can work with the new federal IT community under a sound strategy, I’m sure they can develop leading-edge geospatial capabilities that we can all be proud of.