Trillion Pixel 2023

June 21, 2023

The Way Forward in SCALING multimodal GeoAI

Update – 2023 Workshop report is now available

Virtually ubiquitous geospatial data, astonishing progress in artificial intelligence, and transformational advances in cloud infrastructure and high-performance computing are converging to enable mapping and interpretation of the Earth’s surface in unprecedented detail. Rapid innovations in satellite and airborne remote sensing capabilities will soon collect geospatial data with high throughput and cadence. End-to-end GeoAI systems present tremendous opportunities and challenges, and they can be brought to bear to provide insights into how humans occupy, interact with, and alter the planet’s surface over time, for society’s benefit. The implications for science, policy, and national security are extraordinary.

In 2019, we held the first workshop in the series, the “Trillion-Pixel GeoAI Challenge.” With overwhelming agreement, the community established this as a modern-day moonshot for science and technology. In 2021, we built on the outcomes of the first workshop and presented “The Frontiers of Trillion-Pixel GeoAI End-to-End Systems.” In 2023, we set our sights on “Multimodal Trillion-Pixels: GeoAI Challenges.” Join us for an interdisciplinary gathering of experts from image science, computer vision, high-performance computing, architecture, machine learning, advanced workflows, and end-user communities to discuss multimodal GeoAI challenges beyond a trillion pixels.

Day 1: June 21, 2023
Venue: ORNL Building 5200, Tennessee Conference Rooms

Welcome

8:45-8:50 EDT 


Introductory Remarks

8:50-9:00 Dr. Moe Khaleel, Associate Laboratory Director, National Security Sciences Directorate, Oak Ridge National Laboratory

Session 1
09:00 - 10:30 Agency Grand Challenges and Use Cases

Moderators - Dr. Carter Christopher, Distinguished R&D Staff Member and Section Head, Oak Ridge National Laboratory and Dr. Dawn King, Senior AI/ML Geophysical Earth Scientist, National Geospatial-Intelligence Agency

A decade of evolution in sensing instruments and computing hardware is behind the astounding rise of AI and its impact on society. AI technology is now at the frontline of diagnosing diseases, translating languages, and transcribing speech. Yet in the context of geographic knowledge discovery, a growing number of emerging national security and open science challenges are revealing glaring weaknesses in current GeoAI systems. In this third workshop in the series, we seek to understand both existing and emerging grand geospatial use cases and the factors that limit progress toward advancing the well-being and security of our society.

Key Questions

  • Revisiting grand challenges and use cases with GeoAI: What are the emerging GeoAI successes and impacts? Can we do more?
  • What are the gaps and limitations of GeoAI in addressing agency use cases in 2023?
  • The next GeoAI: Toward interdisciplinary GeoAI. Given these emerging successes and recognition of those limitations, what do we expect new GeoAI systems to look like for solving grand challenges and engaging with multimodal use cases across disciplines?

Panelists

  • Mr. Daniel Cotter, Senior Advisor, US Department of Homeland Security
  • Mr. Mark Munsell, Deputy Director, Data and Digital Innovation Directorate, National Geospatial-Intelligence Agency
  • Mr. Kevin Murphy, Chief Science Data Officer, National Aeronautics and Space Administration 

10:30–11:00 Networking Break

Session 2
11:00 - 12:30 AI Foundations

Moderators - Dr. Dalton Lunga, GeoAI Group Lead & Sr. R&D Scientist, Oak Ridge National Laboratory and Dr. Rahul Ramachandran, Senior Research Scientist, National Aeronautics and Space Administration

AI-enabled analytic tools promise to synthesize available Earth observation data more effectively and provide unexpected, impactful solutions relevant to various national security and climate science challenges, such as energy grid resilience, nuclear nonproliferation, and risk-based predictive systems for complex socio-economic and physical environments. Currently, however, developing application-specific GeoAI models is resource intensive and costly: it requires gathering large amounts of high-quality data and relearning common representations, and it underutilizes other unlabeled data modalities. This puts the community at a significant disadvantage in advancing cutting-edge AI-enabled geospatial workflows to maintain decision superiority for national security and open science challenges. Join us in this forward-looking session to discuss the identification and production of relevant foundation GeoAI models for decision support. Our charge is to probe the emerging trends in large AI models and formulate building blocks pertinent to the trillion-pixel GeoAI challenges emerging from natural and man-made events.

Key Questions

  • What are the pathways to developing foundation models that will efficiently admit a broad class of derived models relevant to trillion-pixel GeoAI challenges?
  • What are the new research and development thrusts needed to advance the foundation models in GeoAI?
  • How do these thrusts build upon existing strengths (especially large-scale high-performance computing facilities and cloud computing across the science and security complex)?
  • What are the best ways to define the drivers and measures of success of the GeoAI foundation models across several downstream application challenges?
  • What are the building blocks for developing robust, transparent, and responsible GeoAI systems? What are the metrics for setting boundaries for the safe and appropriate use of GeoAI products?

    Panelists


    • Dr. David Alexander, Senior Science Advisor for Resilience, U.S. Department of Homeland Security 
    • Dr. Raghu Ganti, Principal Research Scientist, IBM
    • Dr. Philipe Dias, Research Scientist in Computer Vision and Machine Learning, Oak Ridge National Laboratory
    • Dr. Kaleb Smith, Senior Data Scientist, NVIDIA AI Technology Center
    • Mr. Kumar Ramasubramanian, Computer Scientist, National Aeronautics and Space Administration 

    12:30-14:00 Lunch

    Session 3

    14:00–15:30 Geospatial Data Infrastructure

    Moderators - Dr. Yan Liu, Computational Scientist, Oak Ridge National Laboratory and Dr. Manil Maskey, Senior Research Scientist, National Aeronautics and Space Administration

    Massive geospatial datasets collected from surveys, sensing, and social media provide rich data sources and geospatial context for GeoAI. These datasets are then curated to fit the needs of a specific application. However, the GeoAI community still faces enormous challenges integrating large, spatially heterogeneous, multi-modal geospatial datasets into existing machine learning frameworks. This session will discuss major geospatial data challenges and opportunities related to Analysis Ready Data (ARD) and interoperability of multimodal data products, including difficulties in labeling massive geospatial datasets, the development of geospatial data schemas for machine learning models, data curation and validation, the potential to leverage existing rich data sources for GeoAI, HPC-based data-intensive AI computations, and, consequently, future needs for geospatial data infrastructure solutions that couple geospatial processes, data analytics, and AI as a scalable platform for empowering large-scale applications.

    Key Questions

    • How is existing geospatial data being collected, managed, and used to support GeoAI?
    • What are non-trivial geospatial data-specific challenges in GeoAI? How do geospatial aspects of data — such as data sources, coordinate system, resolution, and spatial extent — affect data pipeline components in existing machine learning models, including data loader, sampler, and augmentation?
    • What are key research and technological challenges that existing geospatial data infrastructure or new designs must address to facilitate broader and large-scale GeoAI R&D in our community?
    • AI for geospatial data integration: How can machine learning help automate, streamline, or assist data integration in building geospatial data infrastructure?
    • What are the data challenges and opportunities related to ARD and interoperability of multimodal data products?
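As one illustration of how resolution and spatial extent shape a data pipeline, the sketch below converts a ground extent and pixel size into a raster grid and enumerates the tile windows a sampler would visit. It is a minimal sketch under stated assumptions: the function names, the 512-pixel tile size, and the 10 km scene are hypothetical choices for illustration, not part of any workshop material or specific library.

```python
import math
from typing import Iterator, Tuple

# (col_off, row_off, width, height), all in pixels
Window = Tuple[int, int, int, int]


def raster_shape(extent_m: Tuple[float, float], resolution_m: float) -> Tuple[int, int]:
    """Convert a ground extent (width, height in meters) to (cols, rows) in pixels."""
    width_m, height_m = extent_m
    return math.ceil(width_m / resolution_m), math.ceil(height_m / resolution_m)


def tile_windows(cols: int, rows: int, tile: int = 512, stride: int = 0) -> Iterator[Window]:
    """Yield tile windows covering the raster; edge tiles are clipped to the raster.

    A stride of 0 means non-overlapping tiles (stride equal to the tile size).
    """
    step = stride if stride > 0 else tile
    for row_off in range(0, rows, step):
        for col_off in range(0, cols, step):
            yield (col_off, row_off,
                   min(tile, cols - col_off),
                   min(tile, rows - row_off))


# Hypothetical scene: the same 10 km x 10 km footprint at two resolutions.
for res in (0.5, 10.0):
    cols, rows = raster_shape((10_000.0, 10_000.0), res)
    n_tiles = sum(1 for _ in tile_windows(cols, rows))
    print(f"{res} m/pixel: {cols} x {rows} pixels, {n_tiles} tiles")
```

The same footprint yields very different tile counts at different resolutions, which is one reason samplers and loaders written for fixed-size natural images tend to need geospatial-aware logic.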

    Panelists

    • Dr. Samantha Arundel, Acting Research Director, Center of Excellence for Geospatial Information Science, U.S.G.S.
    • Dr. Rasmus Houborg, Principal Geospatial Fusion Scientist, Planet Labs
    • Dr. Brian Freitag, Research Scientist, National Aeronautics and Space Administration
    • Mr. John Wegrzyn, Senior Research and Development Engineer, UMBRA

    15:30–16:00 Networking Break

    Session 4

    16:00–17:00 Edge Computing

    Moderators - Professor Katie Schuman, University of Tennessee and Dr. David Page, Distinguished R&D Staff Member and Section Head, Oak Ridge National Laboratory

    Geospatial edge devices, which include satellites, drones, and other systems that are typically limited in size, weight, and power (SWaP), play an important role in the emergence of trillion-pixel challenges. These edge devices are the front line of the emerging global geospatial enterprise, which collects more than 100 trillion pixels over the surface of the Earth every day. The current trend of connecting these edge systems to large, federated “cloud” computing demands ever-increasing bandwidth to handle the volume and velocity of raw data, with a single satellite generating more than 100 terabytes per day. The primary mode of connection is through wireless networks that push this volume of data from the edge to centralized processing. In contrast to this high-bandwidth, network-based approach, edge computing offers an opportunity to reduce bandwidth requirements and to improve latency and security by distributing processing and computing into the edge devices and condensing raw data into more manageable information. GeoAI, neuromorphic computing, quantum computing, and other advances offer exciting research avenues for edge computing to address SWaP limitations while simultaneously reducing downstream bandwidth requirements and enhancing decision-making loops. In this session, we will discuss the state of edge computing, identify enabling technologies and grand challenges, and define future research directions to make edge computing a disruptive force for GeoAI-based decision-making.
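To put the figure of 100 terabytes per day in perspective, the short calculation below converts a daily data volume into the average link rate needed to move it. It is a rough sketch that assumes decimal terabytes and a perfectly continuous downlink; the 100x on-board reduction factor is a hypothetical value chosen only to show the effect of condensing data at the edge.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_gbps(bytes_per_day: float) -> float:
    """Average link rate, in gigabits per second, needed to move a daily volume."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e9


raw_bytes = 100e12                      # 100 TB/day, decimal terabytes (assumption)
raw_rate = sustained_gbps(raw_bytes)    # roughly 9.26 Gbit/s, around the clock

reduction = 100                         # hypothetical on-board reduction factor
edge_rate = sustained_gbps(raw_bytes / reduction)

print(f"raw downlink:       {raw_rate:.2f} Gbit/s")
print(f"after edge compute: {edge_rate * 1000:.1f} Mbit/s")
```

Real links are intermittent (for example, limited ground-station contact windows), so the peak rate required in practice would be higher than this average.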

    Key Questions

    • Why is edge computing key for GeoAI?
    • Billion-parameter foundation models are enabling new capabilities but present memory limitation challenges for edge computing devices. What new research thrusts (quantization, pruning algorithms, etc.) are needed to extend foundation model capability and unlock the benefits for edge computing?
    • What security and reliability concerns exist, and how should they be addressed?
    • How can advanced computing methods (e.g., neuromorphic and quantum computing) impact GeoAI edge devices?
    • How can GeoAI reduce bandwidth requirements in edge systems (e.g., drone video networks)?
    • What are some of the best edge deployment practices for obtaining model reliability and monitoring?
    • Achieving the best speed/accuracy in neural architecture search for edge deployment requires optimal trade-offs. What are the trade-offs and how best do we achieve optimal architectures?
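The memory question raised above can be made concrete with a small sketch of quantization, one of the thrusts named in the list. The code estimates weight storage for a billion-parameter model at two precisions and applies simple symmetric per-tensor int8 quantization to a toy weight list. It is an illustrative sketch only: the function names and the toy weights are assumptions, and production toolchains use more elaborate schemes (per-channel scales, calibration, quantization-aware training).

```python
from typing import List, Tuple


def model_size_gb(n_params: int, bytes_per_param: int) -> float:
    """Storage for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of a non-empty list to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


print(model_size_gb(1_000_000_000, 4))  # float32 weights: 4.0 GB
print(model_size_gb(1_000_000_000, 1))  # int8 weights:    1.0 GB

weights = [0.42, -1.0, 0.1, 0.0]        # toy values, not from a real model
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
worst = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"max reconstruction error = {worst:.5f}")
```

The fourfold reduction in weight storage comes at the cost of a bounded rounding error per weight (at most half the scale), which is the kind of speed, memory, and accuracy trade-off the questions above ask about.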

    Panelists

    • Professor Katie Schuman, University of Tennessee, Knoxville
    • Dr. Angel Yanguas-Gil, Principal Materials Scientist, Argonne National Laboratory
    • Dr. Erica Montbach, Manager, NASA Planetary Exploration Science
    • Dr. Hamed Alemohammad, Director, Center for Geospatial Analytics, Clark University

    17:30 Dinner on Campus 

    Keynote: Better (geospatial) data, greater (societal) relevance: challenges and opportunities on the path to a net-zero world, Dr. David McCollum, Sr. R&D Staff, Oak Ridge National Laboratory

    Day 2: June 22, 2023
    Venue: ORNL Building 5200, Tennessee Conference Rooms

    Session 5

    09:00–10:30 HPC Hardware and Software Architectures

    Moderators - Professor Eric Shook, University of Minnesota and Dr. Jitendra Kumar, Research Staff Member, Oak Ridge National Laboratory

    Applying GeoAI to growing volumes of geospatial information will require efficient high-performance computing (HPC) hardware and network infrastructure, along with scalable software frameworks to transfer, store, and process massive amounts of data. Planetary-scale GeoAI architectures capable of learning and inferring from trillions of pixels streaming daily will require tight integration of network, storage, and computing resources that span CPUs, GPUs, accelerators, and potentially domain-specific architectures such as a GeoAI spatial processor. Also needed are scalable open-source software ecosystems to analyze these datasets using state-of-the-art computational resources. This session will explore the computational challenges facing GeoAI and discuss potential holistic architectural solutions to overcome these challenges.

    Key Questions

    • What are current architectural solutions, and will they scale to meet growing needs?
    • Is there an “optimal architecture” that balances edge computing, network bandwidth, storage, and high-performance and cloud computing? If so, what are its characteristics?
    • How can accelerators and domain-specific architectures such as a GeoAI spatial processor address current and future challenges related to performance and energy efficiency?
    • What is the biggest challenge in hardware design and high-performance computing for GeoAI?

    Panelists

    • Dr. Prasanna Balaprakash, Director of AI Programs, Oak Ridge National Laboratory
    • Dr. Erwin Gilmore, Senior AI/ML Specialist, Amazon Web Services
    • Professor Vijay Gadepally, Massachusetts Institute of Technology
    • Dr. Mo Sarwat, CEO, Wherobots Inc.
    • Dr. Forrest Hoffman, Distinguished Computational Earth System Scientist, Oak Ridge National Laboratory

    10:30-11:00 Networking Break

    Session 6

    11:00–12:00 Workforce Requirements and Partnerships

    Moderators - Professor Shawn Newsam, University of California, Merced and Dr. Lexie Yang, Research Scientist, Oak Ridge National Laboratory

    Effective partnerships will be fundamental for addressing the challenges and opportunities arising in the multimodal GeoAI frontier. Researchers and practitioners with expertise in different GeoAI data modalities will need to come together to make progress on societal-scale challenges. Enabling AI model-based solutions to solve GeoAI grand challenges will require even more interdisciplinary collaboration due to the scale of resources and complexities of policy-making needed to sustain the ecosystem. What types of talent do we need to bring together now to build such an ecosystem? What types of talent do we need to train to sustain and extend it? What are the key elements needed to foster a workplace or collaborative environment that will support generations to carry on the mission and solve the grand challenges?

    Key Questions

    • How do we attract and train the next generation of GeoAI talent, both short-term and long-term?
      • How can we make GeoAI more attractive in the short term, especially for undergraduate and graduate students?
      • We probably can’t compete salary-wise with Silicon Valley, so how can we improve our messaging, for example by emphasizing GeoAI’s central role in solving societal-scale challenges such as climate change?
      • How do we plant the seeds for the long term so that K-12 students become interested in GeoAI or similar new educational, workforce, and research frontiers?
      • What does a competitive Spatial Data Science / GeoAI curriculum look like? What is the mix of GIS, geography, remote sensing, physics, computer science, and statistics?
    • What institutions, agencies, sectors, and communities are not sufficiently represented or engaged in GeoAI?
    • What are the potential pathways or key elements to engage with multiple sectors (potentially diverse data sources) more effectively?
    • Certain GeoAI efforts will likely follow the trend toward large models similar to the large language models driving services like ChatGPT. How do we ensure, especially through collaborations and partnerships, that such models remain open and documented to the research community?

    Panelists

    • Dr. Dawn King, Senior AI/ML Geophysical Applied Scientist, National Geospatial-Intelligence Agency
    • Professor Orhun Aydin, St. Louis University
    • Dr. Subit Chakrabarti, Vice President of Technology, Floodbase
    • Dr. Matej Batic, Research Team Lead, Sinergise

    12:00 Closing Remarks, followed by a Guided Tour of the Frontier Supercomputer (the World's Fastest) and the Summit Supercomputer (the World's 5th Fastest)