What We Do

The Center focuses on issues relating to extreme-scale spatial data. Spatial data account for the vast majority of data currently being generated. Spatial data encode location information along with the data and observations of interest. The Center’s activities have been funded through grants from the NSF, NIFA, and DHS. The overarching goal of the Center is to enable cutting-edge artificial intelligence, machine learning, and deep learning methods to run at scale over high-dimensional spatial datasets.

Our methodological innovations are data-format agnostic, and our reference implementations can cope with data stored in over 20 different formats, including CSV, netCDF, HDF, XML, GRIB, BUFR, DMSP, NEXRAD, and SIGMET. These systems have been deployed in urban sustainability, epidemiology, ecological monitoring, methane leak detection, and atmospheric sciences.

Key Features

  • Reconcile spatial observations encoded as multivariate vectors, shape files, data sketches, and hyperspectral imagery
  • Management of trillions of small files containing a quadrillion observations
  • Support for building deep learning models at scale. The deep networks we work with include foundation models, capsule networks, and LSTM/GRU-based recurrent networks
  • Interactive visualizations over spatiotemporal datasets
  • Support for over 20 scientific data formats, including netCDF, HDF, XML, CSV, GRIB, BUFR, DMSP, NEXRAD, and SIGMET
  • Approximate queries, fuzzy queries, and probabilistic queries
  • Hypothesis testing, significance evaluation, and kernel density estimation

The Center performs foundational algorithmic work in spatiotemporal imputation, sketching, outlier detection, and trajectory analysis.