Data has been described as the world’s new oil, as a resource with immense potential to inform and transform daily life as well as science and engineering. The volume of newly generated scientific data is projected to exceed 40,000 exabytes by 2020. At present, however, only 20% of the world’s data are preserved online. Field, lab, and computing datasets all contribute to an improved understanding of subsurface systems. Within the United States (U.S.) Department of Energy (DOE), research to constrain and predict subsurface behavior spans oil and gas, carbon storage, groundwater, geothermal and underground waste disposal portfolios. Beyond DOE, studies of U.S. subsurface systems have contributed to a growing but disparate knowledge base of the in situ geology and pore filling media at a range of scales and resolutions.
To support energy research and development, we are using a combination of open-source and custom-developed advanced computing capabilities to facilitate development of a virtual subsurface data framework for the U.S. This framework requires the integration of research, scientific, and engineering data resources, including subsurface characterization, modeling, and analytical datasets. Subsurface datasets are still challenging to access, discontinuous in scale, and variable in resolution. However, with the proliferation of online data and improved curation of DOE Fossil Energy (FE) research products, there are significant opportunities to advance access to and knowledge of subsurface systems. NETL’s Energy Data eXchange (EDX) is an online platform designed to address research data needs by improving access to energy R&D products through advanced search and access capabilities. It also hosts private, virtualized trust communities to support more efficient and effective multi-organizational R&D.
Through the private and public sides of EDX, NETL researchers are assembling and hosting structured and unstructured data and resources related to the U.S. subsurface. These resources are being integrated with those curated from DOE and open sources. The EDX team has also implemented a custom big data, machine learning SmartSearch tool to help locate and source data from across the World Wide Web. This tool is currently in beta testing, but has been used, along with custom data integration and database generation scripts, to produce a database of open, global oil and gas infrastructure resources that is being paired with the virtual subsurface data catalog. All of these resources will be accessible in EDX through its standard search capabilities, and are also being implemented in EDX’s web-mapping tool, Geocube. Geocube allows users to visualize surface and subsurface spatial datasets, add their own datasets to the visualization, and download resources if desired. Ultimately, EDX’s public-private capabilities seek to facilitate more effective research for subsurface scientists. EDX increasingly pairs its hosted resources with other online capabilities, data, and custom machine learning algorithms to enhance user experience and to provide research teams with the resources needed to make subsurface energy research more efficient, reduce redundancy, and drive innovation.