Trillion Pixel 2019

Oak Ridge National Laboratory, September 26-27, 2019

AI-driven monitoring of the entire planet, every day, with unprecedented clarity and fidelity

The emergence of virtually ubiquitous global imagery, astonishing progress in AI, and transformational advances in high-performance computing bring into view the possibility of mapping and interpreting the surface of our planet in unprecedented detail. Furthermore, rapid innovations in satellite and airborne remote sensing will soon make it possible to collect high-resolution imagery with daily, even hourly, return rates across the entire planet. This brings forward the possibility of detailed change detection and new insights into how humans occupy, interact with, and alter the surface of the planet over time. The implications for science, policy, and security are extraordinary.

However, from a science and technology standpoint, this is a modern-day moonshot. Depending on resolution, the number of pixels required to cover the surface of the Earth is easily in the trillions. It is difficult to imagine processing, exploiting, and interpreting this amount of data without a massive scaling of AI workflows and serious advances in generalizing those workflows to accommodate highly heterogeneous conditions. As a prerequisite to future global imagery processing, we pose the Trillion Pixel GeoAI Challenge: What are the barriers, opportunities, and way forward in exploiting high-resolution planetary imagery involving hundreds of trillions of pixels?

We invite you to join us at ORNL for a two-day workshop focused on the trillion pixel challenge: an interdisciplinary gathering of experts from image science, computer vision, HPC, architecture, machine learning, advanced workflows, and societal AI challenges.

  • The objective of this workshop will be to further imagine, shape, and propose how we as a community might approach this challenge while remaining mindful of the societal impacts and challenges surrounding such advanced technology.
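As a rough check on the pixel counts quoted in the overview, the sketch below estimates how many pixels are needed to tile the Earth once at a few ground sample distances (GSD). It assumes rounded, commonly cited areas of about 510 million km² for the whole surface and about 149 million km² for land, square pixels, no scene overlap, and a single band; none of these figures come from the workshop text, and the function name is purely illustrative.

```python
# Rough pixel-count estimate for one non-overlapping global coverage.
# Assumptions (not from the workshop text): rounded surface areas below,
# square pixels, no overlap between scenes, a single spectral band.

EARTH_SURFACE_KM2 = 510e6  # total surface area, approximate
EARTH_LAND_KM2 = 149e6     # land area only, approximate


def pixels_to_cover(area_km2: float, gsd_m: float) -> float:
    """Number of square pixels with side gsd_m (metres) needed to tile area_km2."""
    if gsd_m <= 0:
        raise ValueError("gsd_m must be positive")
    area_m2 = area_km2 * 1e6  # 1 km^2 = 1e6 m^2
    return area_m2 / (gsd_m ** 2)


if __name__ == "__main__":
    for gsd in (10.0, 1.0, 0.5):
        total = pixels_to_cover(EARTH_SURFACE_KM2, gsd) / 1e12
        land = pixels_to_cover(EARTH_LAND_KM2, gsd) / 1e12
        print(f"GSD {gsd} m: ~{total:,.0f} trillion px (surface), "
              f"~{land:,.0f} trillion px (land)")
```

Under these assumptions, a 1 m GSD gives roughly 510 trillion pixels for the whole surface and about 149 trillion for land alone, which is in line with the "hundreds of trillions" figure. Daily revisits multiply the count by the number of collection days.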

Schedule

Day 1

08:45–09:00 Welcome Remarks: Dr. Michelle Buchanan, Deputy for Science and Technology – ORNL

09:00–10:30 Session 1: GeoAI in Global challenges

Scope: With a growing number of global challenges, it is imperative that the societal impact of AI in the context of geographic knowledge discovery is fully explored and understood. Designing GeoAI systems that scale with the greatest challenges will require guidance from those on the front lines and from visionary societal perspectives. Join us in this session for a forward framing of the GeoAI initiative as a bridge toward uncovering new possibilities for impacting global sustainable development goals and challenges for society’s benefit.

Key questions:

  • Why Geospatial and AI?
  • Context on the trillion pixel challenge
  • Global challenges that can benefit from GeoAI

Session moderator: Dr. Budhu Bhaduri, Division Director – National Security Emerging Technologies – ORNL

Speakers:

  • Speaker 1: Mr. Frank Cooper, National Geospatial-Intelligence Agency (NGA)
  • Speaker 2: Mr. Pete Doucette, U.S. Geological Survey (USGS)
  • Speaker 3: Ms. Raffianne Doyle, U.S. Navy
  • Speaker 4: Dr. Rahul Ramachandran, National Aeronautics and Space Administration (NASA)
  • Speaker 5: Mr. Chris Simi, Defense Advanced Research Projects Agency (DARPA)
  • Speaker 6: Mr. Chris Vaughan, Department of Homeland Security (DHS)

10:30–11:00 Networking Break

11:00–12:30 Session 2: AI scalability and generalization

Scope: The volume, velocity, and variety of geospatial data are growing at an unprecedented pace. To enable transformative and disruptive geo-knowledge extraction from such rich and diverse data, scalability and robust generalization are critical to the design of a new generation of AI systems. In this session, we will discuss AI scalability constraints and opportunities motivated by various trillion pixel challenges, with a view toward extracting decision-critical and space-time relations from large-scale geospatial data. We will further discuss scalability and robustness in the context of exploiting self-supervised learning methods, which can uncover patterns from unlabeled data without being limited by the availability of labels.

Key questions:

  • What are some of the key challenges in scaling GeoAI systems to the trillion pixel scenario? Are these purely computational?
  • How can we ensure that future GeoAI systems are robust and reliable and why is this a core issue in their design? How do we evaluate them at scale?
  • Generalization is a fundamental challenge due to how varied the Earth is. What opportunities are there to address this challenge? Are there other aspects of generalization that need addressing?
  • GeoAI systems, particularly those based on deep learning, face the same challenges as in other domains: large collections of labeled training data, model architecture search, etc. Are there unique opportunities in the geo domain to overcome them?

Session moderator: Professor Shawn Newsam – University of California at Merced

Speakers:

  • Speaker 1: Dr. Bill Pike, Division Director, Computing & Analytics Division, Pacific Northwest National Laboratory
  • Speaker 2: Professor Henry Medeiros, Marquette University
  • Speaker 3: Dr. Kimberly Scott, Co-Founder & VP of Data Science, Astraea
  • Speaker 4: Professor Vipin Kumar, William Norris Chair in Large Scale Computing, University of Minnesota
  • Speaker 5: Dr. Dalton Lunga, Lead Scientist, ORNL
  • Speaker 6: Mr. Todd Myers, National Geospatial-Intelligence Agency (NGA)

12:30–14:00 Lunch (Keynote 1: ORNL’s 10-year Vision for Computing and Data, Dr. Jeff Nichols, Associate Laboratory Director – ORNL)

14:00–15:30 Session 3: Geospatial Data Infrastructure

Scope: Massive geospatial datasets collected from surveys, sensing, and social media provide rich data sources and geospatial context for GeoAI. Currently, these datasets remain largely unexplored by the machine learning community, and AI research in the geospatial data sciences community is still at an early stage of directly adopting existing machine learning frameworks. This session will discuss major big geospatial data challenges and opportunities for broad and specialized GeoAI R&D, including the difficulty and practicality of labeling massive geospatial datasets, the potential of leveraging existing rich data sources for GeoAI, and HPC-based data-intensive AI computations. It will also consider the resulting imminent and future needs for geospatial data infrastructure solutions that couple geospatial processes, data analytics, and AI as a scalable platform for empowering large-scale GeoAI applications. Planet-scale GeoAI will require extensive ground referencing. Join us to discuss the limitations of generating such information from imagery and to explore future data infrastructure for capturing these labels and generating large-scale training datasets.

Key questions:

  • How is existing geospatial data infrastructure being used to support GeoAI?
  • What are key research and technological challenges that existing geospatial data infrastructure or new design must address in order to facilitate broader and large-scale GeoAI R&D in our community?
  • AI for geospatial data integration: how can machine learning help automate, streamline, or assist data integration in building geospatial data infrastructure?

Session moderators: Dr. Yan Liu, Staff Scientist, ORNL; Dr. Fabio Pacifici, Principal Scientist, Maxar Technologies

Speakers:

  • Speaker 1: Mr. Christopher Brown, Software Engineer, Google Earth Engine
  • Speaker 2: Professor Begüm Demir, Professor and Head of the Remote Sensing Image Analysis (RSiM) Group, Technische Universität Berlin
  • Speaker 3: Dr. Lexie Yang, Lead Scientist, ORNL
  • Speaker 4: Dr. Caitlin Kontgis, Applied Science Lead, Descartes Labs
  • Speaker 5: Dr. Sud Menon, Director, Software Product Development, Esri

15:30–16:00 Networking Break

16:00–17:00 Session 4: Edge Computing for GeoAI

Scope: The cloud has become the ideal powerhouse for processing and storing GIS information because it outperforms the capabilities of at-the-edge devices. However, geospatial datasets are growing at an exponential rate, and there is an increased demand for real-time processing and storage that the bandwidth of current communication networks cannot match. Edge computing is therefore emerging as a solution for GIS processing needs in the not-so-distant future. It offers an evolving, energy-efficient, distributed processing and storage network that can facilitate real-time, customizable information sharing and AI-based decision making. In this session, we will discuss the state of edge computing, identify enabling technologies and grand challenges, and define future research directions to make edge computing a disruptive force for GeoAI.

Key questions:

  • Why is edge computing key for GeoAI?
  • When does edge computing make sense, and when does it not?
  • What are the edge security and reliability concerns, and how can they be addressed?
  • How can existing devices and embedded AI technology be adapted for GIS use?
  • Is 5G enough for GIS edge computing?
  • What are the tradeoffs between computing and power efficiency?
  • How do we envision edge computing disrupting the GIS field?
  • What are the most successful platforms for taking deep learning to the edge?

Session moderators: Dr. Sophie Voisin, Lead Scientist, and Dr. Hector Santos-Villalobos, Group Lead, ORNL

Speakers:

  • Speaker 1: Mr. Jay Theodore, Chief Technology Officer, Esri
  • Speaker 2: Professor Himanshu Thapliyal, University of Kentucky
  • Speaker 3: Dr. Jonathan Howe, Senior Solutions Architect, NVIDIA
  • Speaker 4: Professor Raju Vatsavai, NC State University
  • Speaker 5: Dr. Nicola Ferrier, Senior Computer Scientist, Argonne National Laboratory

17:00 ORNL Tours: OLCF Summit Supercomputer

18:00 Workshop Dinner (Keynote 2: The High Stakes History of Oak Ridge, David Keim, Director of Communications – ORNL)

Day 2

09:00–10:30 Session 5: Hardware Design and High-Performance Computing for GeoAI

Scope: Addressing the growing number of global challenges using GeoAI will require a complex, high-performance computing (HPC) infrastructure to handle the processing, storage, and transfer of massive amounts of geospatial information. Whether in the cloud, in an HPC facility, or at the edge, these computing capabilities must adapt to meet the rapidly growing needs of GeoAI. This session will explore the computational challenges facing GeoAI and discuss potential, holistic architectural solutions to overcome them. Planet-scale GeoAI architectures capable of making sense of trillions of pixels streaming in daily will require tight integration of network, storage, and computing resources that span CPUs, GPUs, accelerators, and potentially domain-specific architectures such as a GeoAI spatial processor. We will discuss promising architectural solutions and highlight key challenges.

Key questions:

  • What are the current architectural solutions, and will they scale to meet growing needs?
  • Is there an “optimal architecture” that balances edge computing, network bandwidth, storage, and high-performance and cloud computing? If so, what are its characteristics?
  • How can accelerators and domain-specific architectures such as a GeoAI spatial processor address current and future challenges, especially those related to performance and energy efficiency?
  • What is the biggest challenge in hardware design and high-performance computing for GeoAI?

Session moderator: Professor Eric Shook, University of Minnesota

Speakers:

  • Speaker 1: Mr. Tom Reed, Director, Solution Architecture, NVIDIA
  • Speaker 2: Dr. Mallikarjun Shankar, Advanced Data and Workflows Group Lead, ORNL
  • Speaker 3: Dr. Rangan Sukumar, Analytic Architect, CTO office, Cray Inc.
  • Speaker 4: Dr. Frank Liu, Distinguished R&D Staff Member, ORNL
  • Speaker 5: Dr. Ahmed Eldawy, Assistant Professor, University of California, Riverside

10:30–11:00 Networking Break

11:00–12:00 Session 6: GeoAI Opportunities – Collaboration and Partnerships

Scope: Earth Observations (EO) provide invaluable big datasets over large spatio-temporal scales for monitoring the Earth and its changing environment. Throughout the last decade, along with the advancements in data-driven techniques, many AI-based algorithms have been developed for EO. However, these data are still underutilized and have great potential to impact global development. In this session, we will discuss the research opportunities to advance applications of EO for positive global impact. We will also review the tools and resources that are required to enable the broader geospatial and data science communities to collaborate and build innovative solutions using EO.

Key questions:

  • What are the existing barriers to scaling the scope of geospatial research problems?
  • How can we attract more talent (e.g. students) from the data science community to work on geospatial problems?
  • What are the opportunities for generating new data or models from existing EO?
  • Are there high-priority areas (applications or tools) that are under-pursued? If so, how can we raise awareness of their importance across the community?

Session moderators: Dr. Robert Stewart, Group Lead, Geographic Data Sciences Group, ORNL; Dr. Dalton Lunga, Lead Scientist, ORNL

Speakers:

  • Speaker 1: Dr. Matej Batic, EO Research team leader, Sinergise
  • Speaker 2: Dr. Nick Weir, Data Scientist, In-Q-Tel
  • Speaker 3: Mr. Daniel Getman, Director, Solutions and Market Development, Maxar
  • Speaker 4: Mr. Neil Gaikwad, Researcher, MIT Media Lab
  • Speaker 5: Mr. Mark Korver, Geospatial Lead, Amazon Web Services

12:05 Closing Remarks

13:00–13:30 ORNL Tours: The Graphite Reactor (13:00) or Advanced Manufacturing Demonstration Facility – 3D Printing (13:30)