Modernising Earth Science Data for High Performance Computing
This project, supported by the National Collaborative Research Infrastructure Strategy (NCRIS), aims to future-proof Australia’s valuable geophysical datasets by transforming them into formats compatible with high-performance computing (HPC).
Overview
This project, in collaboration with the Australian National University (ANU), the Geoscience Research Community, and the Geological Surveys, aims to harmonise existing datasets into modern HPC-compatible formats suitable for in-situ processing on AuScope’s NCI high-performance data and computing platform.
There has been significant investment to date in acquiring continental-scale geophysical datasets. This project will ensure that existing datasets can be accessed in coherent formats to enable new types of innovative, data-intensive research at a continental scale. The project will also collaborate with leading international research infrastructure projects such as Geo-INQUIRE, ChEESE, EPOS and EarthScope, which are migrating some of their datasets to new HPC-compatible formats. This will enable Australian datasets and Australian researchers to be part of leading-edge global HPC experiments.
This work is also part of a broader national effort to enhance data accessibility, interoperability, and reusability in line with FAIR, CARE, and TRUST principles.
The Challenge
This project aims to address a critical challenge: many of Australia’s geophysical datasets, such as those from AusLAMP and AusARRAY, were collected with legacy instruments and stored in outdated formats. These datasets have been difficult to integrate with newer acquisitions, which limits the potential for advanced computational analysis.
This project aims to enable seamless data integration, support AI-driven research, and position Australian scientists to participate in global research and experimentation by remastering these collections into HPC-ready formats and enriching them with machine-actionable metadata.
Expected Outcomes
Publish a technical report every six months detailing progress against project milestones.
Generate and publish South Australian AusLAMP Level 0/1 products.
Onboard and publish the identified priority data/metadata.
Publish outputs associated with investigations of methods for automatically generating rich metadata.
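As a rough illustration of what automatically generated, machine-actionable metadata could look like, the sketch below derives a small JSON record from a time series. It is not the project's method or schema: the function name `build_metadata`, the field names, the station identifier and the sample values are all hypothetical, chosen only to show the idea of computing descriptive fields and a checksum directly from the data.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def build_metadata(station_id, start_time, sample_rate_hz, samples, units):
    """Derive a JSON-serialisable metadata record from a time series.

    Hypothetical helper: the field names below are illustrative only
    and do not follow any published schema.
    """
    if not samples:
        raise ValueError("samples must not be empty")

    # End time follows from the start time, sample count and sample rate.
    duration_s = (len(samples) - 1) / sample_rate_hz
    end_time = start_time + timedelta(seconds=duration_s)

    # Checksum over a canonical text form of the samples, so the record
    # can be tied back to the exact values it describes.
    payload = ",".join(repr(float(s)) for s in samples).encode("utf-8")

    return {
        "station_id": station_id,
        "start_time": start_time.isoformat(),
        "end_time": end_time.isoformat(),
        "sample_rate_hz": sample_rate_hz,
        "n_samples": len(samples),
        "units": units,
        "min": min(samples),
        "max": max(samples),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


record = build_metadata(
    station_id="EXAMPLE01",  # hypothetical station identifier
    start_time=datetime(2025, 7, 1, tzinfo=timezone.utc),
    sample_rate_hz=1.0,
    samples=[0.12, 0.15, 0.11, 0.18],  # made-up values
    units="nT",
)
print(json.dumps(record, indent=2))
```

Because every field is computed from the data rather than typed by hand, a record like this can be regenerated whenever a dataset is reprocessed, and the checksum lets a later user confirm that the metadata still describes the file in front of them.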
What are the benefits?
National Dataset Integration: This project will harmonise legacy and new geophysical datasets (e.g. AusLAMP, AusARRAY) into consistent, HPC-compatible standards such as HDF for use in research and policy.
HPC and AI Readiness: This project will convert data into modern formats (e.g., HDF/ASDF) for scalable, in-situ processing on high-performance computing platforms, enabling advanced analytics and machine learning.
Enhanced Transparent Metadata: This project will help embed FAIR, CARE, and TRUST principles to ensure data is findable, accessible, ethically traceable, and reproducible across research workflows.
Enable Global Collaboration: This project will align Australian datasets with international standards and repositories, enabling their use in international HPC experiments (e.g. Geo-INQUIRE, EPOS, EarthScope).
Assist Open Access and Discoverability: This project will help publish datasets and tools via NCI and Research Data Australia with open-source software and documentation to support community reuse and innovation.
Who will benefit?
This project will benefit a wide range of stakeholders and partners across the geoscience and research communities. By modernising legacy datasets into HPC-compatible formats, it empowers researchers, data scientists and government agencies, such as Geoscience Australia, State and Territory Geological Surveys, and university teams, to conduct high-resolution, data-intensive analyses at continental scales.
Access
Data Access: All project data hosted and published by NCI will be discoverable through the NCI data catalogue and Research Data Australia. Data will also be made available through open data services on NCI (e.g. THREDDS) and directly accessible for analysis via NCI’s other computing services.
Tool Access: Project software outputs will be published via GitHub, and a Geophysics Specialised Research Environment will be maintained to support access via HPC.
Computing Time: Access is available either via AuScope’s partner share or via other accessible schemes at NCI (e.g. NCMAS, institutional shares, contracted services).
Additional project documentation will be published and maintained via NCI’s documentation website (OPUS).
Acknowledging AuScope
Acknowledging AuScope and NCRIS helps us demonstrate the value of our research infrastructure, ensuring continued support and resources for the research community. Please add this sentence to your research, publications, presentations, and events where possible.
This project was made possible by support from the National Collaborative Research Infrastructure Strategy (NCRIS) through AuScope.
For more examples of acknowledgment, please visit our How to Acknowledge AuScope page.
We’d love to see your work—please tag us on your social media using:
@auscope | #AuScopeImpact | #NCRISimpact
Project Name
Modernising Earth Science Data for High Performance Computing
Project Lead
Timeframe
July 2025 to June 2027
Status
Active
Funding
Pilot 4
Host
Australian National University
NCRIS Collaborators
Pawsey
National Computational Infrastructure (NCI)
Other Collaborators
The University of Adelaide (UoA)
Geoscience Australia
State & Territory Geological Surveys
AuScope Programs