Open & Democratize Methods for Geospatial Data Analysis

Supporting development of new open source tools to vastly improve collaboration, efficiency, accessibility and reproducibility in geospatial data analysis

The Problem

Geospatial data forms the foundation of environmental research, but accessing and effectively using the software, skills, and datasets it requires is often a challenge. Furthermore, the way geospatial data is processed and shared is changing significantly, with cloud-native analysis workflows becoming increasingly common and sometimes necessary. On top of this, the high costs of software licenses, cloud computing, and data access create substantial barriers that limit the impact and effectiveness of this work. The future of geospatial data analysis demands removing these barriers to improve access and inclusion in the environmental and data science fields more broadly.

 

The Opportunity

Imagine a future in which a climate scientist, an ecologist, and a city planner can seamlessly collaborate through the cloud to explore land management strategies that combine diverse data. This data could include climate projections, species evolution models, and land-use boundaries, from Nairobi to New York City. DSE believes this future is possible and has co-launched GeoJupyter to meet this unique opportunity.
 

We are leveraging the Jupyter ecosystem, which encompasses open-source software maintained by a global community that empowers collaboration, accessibility, and innovation in data science and research. Its interactive tools, such as Jupyter Notebooks, enable users to explore, visualize, and share insights while promoting data reproducibility and transparency. Nearly 10 million Jupyter notebooks have been made public by GitHub users, and Nature deemed Jupyter one of 10 computer codes that transformed science. We are also grateful to DSE’s Co-Director Fernando Pérez (who co-founded Project Jupyter) and our collaborators at the UC Berkeley Geospatial Innovation Facility for lending their relevant expertise in teaching and research.

 

Key Highlights

  • A collaborative effort, building on a close partnership with European open source developers (QuantStack in Paris, Simula in Oslo) and funded by the European Space Agency.
  • Prototyping new open source tools that leverage the Jupyter ecosystem to vastly improve collaboration, efficiency, accessibility and reproducibility in geospatial data analysis.
  • Conducting user interviews to better understand the current pain points, needs, and future vision of a variety of users in education, research, industry, and other sectors.

 

Our Impact

 

Our collaborators at QuantStack, with seed funding separately provided by the European Space Agency (ESA), have already developed a prototype, and DSE will support its continued development. This prototype leverages both the Jupyter ecosystem and the open source QGIS software, and is being openly developed in the JupyterGIS repository. It currently provides essential capabilities including:

 

  • Interactive Geospatial Mapping: Visualizes geospatial data in JupyterLab, which enables more detailed, interactive, and confident geospatial analysis.
  • Collaborative Editing: Allows multiple users to interact with data simultaneously, while ensuring that users retain full control over their data.

 

We are convening members of the geospatial research community and gathering feedback on our work. To date, we have brought together over 20 leading geospatial researchers and open source software engineers who are eager to help build our vision. We have met regularly over the past year to share insights and develop strategy going forward, and have begun interviewing users to identify which features to prioritize. Finally, we are developing roadmaps for contributions and collaboration from the open source community. For more information, please visit the GeoJupyter website.

 

Future Vision

In 2025, DSE will continue refining and enhancing the prototypes to ensure they meet the specific needs of geospatial programmers and scientists. Upcoming community events in the works include a workshop at the Community Surface Dynamics Modeling System (CSDMS) annual meeting and a multi-day, hybrid hackathon in May 2025.

 

We are excited to be among the first users of GeoJupyter and look forward to integrating it into DSE projects. For example, GeoJupyter will help inform our forthcoming research on how urban development affects wildlife migratory patterns in the greater Yellowstone area of Wyoming, and how geospatial mapping can support urban planning efforts.

DSE Contributors

  • Matt Fisher
    Research Software Engineer
    Eric and Wendy Schmidt Center for Data Science and Environment at Berkeley
  • Fernando Pérez
    Faculty Director & Associate Professor
    Statistics at UC Berkeley
  • Kristin Davis
    Postdoctoral Researcher
    Stone Center for Environmental Stewardship
    Eric and Wendy Schmidt Center for Data Science and Environment at Berkeley
  • Ciera Martinez
    Senior Program Manager
    Eric and Wendy Schmidt Center for Data Science and Environment at Berkeley