The following are excerpts from the “Decadal Survey” relevant to the HPDE. Even if the DRIVE initiative is not funded beyond current levels of effort, these recommendations can provide a guide for future efforts.
Solar and Space Physics: A Science for a Technological Society (2013)
Committee on a Decadal Strategy for Solar and Space Physics (Heliophysics); Space Studies Board; Aeronautics and Space Engineering Board; Division of Earth and Physical Sciences; National Research Council
A relatively small, low-cost initiative, DRIVE provides high leverage to current and future space science research investments with a diverse set of science-enabling capabilities. The five DRIVE components are as follows:
• Diversify observing platforms with microsatellites and midscale ground-based assets.
• Realize scientific potential by sufficiently funding operations and data analysis.
• Integrate observing platforms and strengthen ties between agency disciplines.
• Venture forward with science centers and instrument and technology development.
• Educate, empower, and inspire the next generation of space researchers.
R1.0 Implement the DRIVE Initiative
The survey committee recommends implementation of a new, integrated, multiagency initiative (DRIVE: Diversify, Realize, Integrate, Venture, Educate) that will develop more fully and employ more effectively the many experimental and theoretical assets at NASA, NSF, and other agencies.
DRIVE, part 2:
Realize: Realize Scientific Potential by Sufficiently Funding Operations and Data Analysis
The value of a mission or ground-based investigation is only fully realized, and science goals only achieved, if the right measurements are performed over the mission’s lifetime and new data are analyzed fully (see Figure 4.3). Realizing the full scientific potential of solar and space physics assets therefore requires investment in their continuing operation and in effective exploitation of data (see Box 4.1). Furthermore, a successful investigation should also include a focused data analysis program (see Box 4.2) that supports science goals that may span platforms or change throughout a mission. The following program augmentations expand the potential for new discoveries from data.
Significant progress has been made over the last decade in establishing the essential components of the solar and space physics data environment. However, to achieve key national research and applications goals, a data environment that draws together new and archived satellite and ground-based solar and space physics data sets and computational results from the research and operations communities is needed. As discussed in more detail in Appendix B: Instrumentation, Data Systems, and Technology, such an environment would include:
• Coordinated development of a data systems infrastructure that includes data systems software, data analysis tools, and training of personnel;
• Community oversight of emerging, integrated data systems and inter-agency coordination of data policies;
• Exploitation of emerging information technologies without investment in their initial development;
• Virtual observatories as a specific component of the solar and space physics research-supporting infrastructure, rather than as a direct competitor for research funds;
• Community-based development of software tools, including data mining and assimilation;
• Semantic technologies to enable cross-discipline data access.
End Box 4.1
DRIVE: Integrate: Integrate Observing Platforms and Strengthen Ties Between Agency Disciplines
Data from diverse space- and ground-based instruments need to be routinely combined in order to maximize their multiscale potential. In fact, such coordinated investigations are likely to be a crucial element of future breakthrough science and to provide new pathways for translating scientific knowledge into societal value. The idea of coordinating multiscale observations resonates both with the types of system-science questions identified by the survey’s disciplinary panels and with the Heliophysics Science Centers described in the next section (Venture). Examples might include extending “World Day” coordination of NSF radars to other ground-based and mission data collections, combining data from CubeSat arrays and larger spacecraft, GPS receiver hosting, development of distributed arrays of ground-based instruments (potentially funded by an NSF mid-scale program), and ground-based and space mission solar observational support for the ATST (see Appendix C: Suborbital Platforms and Small Explorers).
Recommendation: NASA, NSF, and other agencies should coordinate ground- and space-based solar-terrestrial observational and technology programs and expand efforts to take advantage of the synergy gained by multiscale observations.
(Appendix) B.3 DATA SYSTEMS
Data from NASA’s heliophysics missions and many ground-based observatories can be obtained currently through the web, either directly from individual sites or through central archives such as the Solar Data Analysis Center (SDAC), Space Physics Data Facility (SPDF), or National Space Science Data Center (NSSDC). These data archives are also accessible through Virtual Observatories (VxOs), whose goals are to provide one-stop access to validated science data from many observatories, along with the necessary tools for cross-mission analysis and visualization. Access to sophisticated modeling tools is provided by repositories, such as the Community Coordinated Modeling Center (CCMC). Such agency-sponsored facilities host physics-based or empirical models developed by the user community and allow users to perform their own simulations.
B.3.1 Current Status
Significant progress has been made over the past decade in defining the fundamental components of the data environment (Virtual Observatories, Archives, etc.) and in starting to build and integrate them. However, there continues to be a dearth of tools for using and analyzing data. On the other hand, projected data requirements for new projects are not as demanding as the leap from SOHO to SDO. New requirements can probably be met with existing technologies and software. For instance, daily generation of ATST data in 2018 is estimated to be ~4 TB, about the same as the current SDO export rate. It is also noted that some segments of the research community still suffer from the lack of effective data policies enforced by sponsoring agencies.
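To put the ~4 TB daily estimate quoted above in perspective, the short sketch below converts a daily data volume into the average sustained transfer rate needed to keep up with it. It is only an illustration: the function name `sustained_rate` is hypothetical, and the use of decimal units (1 TB = 10^12 bytes) is an assumption, since the survey does not state which convention it uses.

```python
# Convert a daily data volume into the average sustained rate needed to move it.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def sustained_rate(terabytes_per_day: float) -> dict:
    """Return the average throughput for a given daily volume.

    Assumes decimal units (1 TB = 10**12 bytes); this is an assumption,
    not something stated in the survey text.
    """
    bytes_per_second = terabytes_per_day * 10**12 / SECONDS_PER_DAY
    return {
        "MB_per_s": bytes_per_second / 10**6,        # megabytes per second
        "Mbit_per_s": bytes_per_second * 8 / 10**6,  # megabits per second
    }


rate = sustained_rate(4.0)  # the ~4 TB/day ATST estimate from the text
print(f"{rate['MB_per_s']:.1f} MB/s, {rate['Mbit_per_s']:.0f} Mbit/s")
```

Under these assumptions, 4 TB per day works out to an average of roughly 46 MB/s. Peak rates and storage overheads are not modeled here.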
Data systems supporting heliophysics research over the past decade have evolved from stand-alone, custom-built “stove-pipes” to distributed, interacting systems that leverage software and technologies developed by the community. Much of this welcome development has come through NASA’s Heliophysics Data Environment (HPDE) Enhancement and NSF’s CISE and Cyberinfrastructure. Many heliophysics datasets and models are hosted at multiple data archives and modeling centers, each with different architectures and formats. And much of the work on data systems infrastructure is funded through individual PI teams. This results in uncoordinated software development, unpredictable support lifecycle, and data analysis tools with limited scope. Such activity also draws funds and focus away from scientific research and analysis activities, since investigators are obliged to provide data sets and analysis tools as deliverables. Unfortunately, many of the existing archives, modeling centers, and VxOs are not inter-compatible, despite significant overlap in content or access.
The current lack of coordination among data and modeling centers stems mainly from their different philosophies, emphases, formats, architectures, and purposes. One can obtain similar datasets from various nationally funded data archives, as well as from VxOs. The existence of duplicative capabilities, each with significantly different purpose and implementation philosophy, provides greater, more flexible access at the cost of generating confusion about which path to follow to the data. National and international agencies have not identified a common goal nor have they adopted a standard approach for funding and implementing data facilities and archives.
Current modeling centers, such as the CCMC, have multiple sponsors and allow researchers to run simulations using community-provided models that cover vastly different domains such as the solar corona, the solar wind, the radiation environment in the heliosphere and Earth’s radiation belts, and the magnetic and electric field environments of the magnetosphere and ionosphere. Although some space weather modeling groups have developed end-to-end models, often the component modules employ controversial techniques and are based on assumptions with inherent strengths and weaknesses. Only a small fraction of all models can be run interactively, and even fewer can be coupled. This makes it difficult to validate different models and to model interesting space weather events.
B.3.2 Future Goals and Directions
Heliophysics is poised to make a natural transition from being driven predominantly by the pursuit of basic scientific understanding of physical processes towards one that must also address more operational, application-specific needs, much like terrestrial weather forecasting. This transition requires (1) instant unfettered access to a wide array of datasets from distributed sources in a uniform, standardized format, (2) incorporation of the results of community-developed models, and (3) the ability to perform simulations interactively and to couple different models to track ongoing space-weather events.
NASA has already taken the important first step in integrating many of these datasets and tools to form the Heliophysics Data Environment (HPDE). The main objective of the HPDE is to implement a distributed, integrated, flexible data environment. HPDE modeling centers should serve as a sound foundation for a future, fully integrated heliophysics data and modeling center.
The key ingredients necessary for any successful centralized data and modeling environment are (1) full involvement of data providers, (2) rapid, open access to scientifically validated data, (3) peer-reviewed data systems driven by community needs and standards, (4) coordinated, user-friendly analysis tools, (5) reliable high-performance computing facilities and data storage, (6) uniform terminology and adequate documentation describing data products and sources, (7) flexible, interoperable, and inter-connected data archives, modeling centers, and VxOs, and (8) effective communication among data providers, national and international partners, and data users.
The tremendous quantity of heliophysics data that will become available in the next decade will strain the financial, personnel, hardware, and software resources available to individual scientists, teams, and even national agencies. The dramatic advances in computing and data storage technology over the last decade are likely to continue, so the cost of future data systems and modeling centers will be dominated by personnel and software development rather than securing ultra-fast computing or data storage. To achieve these goals efficiently, the national agencies will need to develop a common approach for funding data facilities, archives, modeling centers, and VxOs and coordinate the development of data systems infrastructure that includes the development of data systems software, data analysis tools, and training personnel.
B.3.3 Opportunities in New Data Systems
B.3.3.1 Community Input to and Control of the Integrated Data Environment
A number of virtual observatory and other data identification and access tools have appeared or are under development. These efforts could be strengthened, better focused, and more efficiently managed if more user feedback were incorporated into their governance, perhaps by formalizing community oversight of such emerging, integrated data systems in an ad hoc group such as the NASA Heliophysics Data and Computing Working Group. Interagency coordination of the data environment as a whole would benefit researchers whose efforts are funded by multiple agencies.
B.3.3.2 Emerging Technologies
The IT industry continues to generate novel technologies and capabilities faster than any federally funded, competitively sourced research program can hope to match. Agencies must be agile enough to exploit emerging technologies without investing in their original development. The best approach is to (1) focus on commercially viable technologies for which there is a demonstrated need, such as high performance computing clusters, and (2) otherwise invest modestly in the evaluation of emerging commercial technologies through existing mission and small-scale data center activities.
B.3.3.3 Virtual Observatories
NASA has funded virtual observatories and related “middleware” development. Some of these have led to useful targeted data identification and access technologies, and some are still under development. Mature capabilities should not continue to compete with research proposals for funding. A more effective approach would be for NASA and its agency partners to establish a heliophysics-wide data infrastructure, selecting the most useful efforts for stable funding and bringing other efforts to a close. Future developments can be managed through the supplemental funding mechanisms discussed in sections B.3.3.2 or B.3.3.4.
B.3.3.4 Community-Based Software Tools
In a few sub-disciplines, such as solar physics, availability of integrated open-source data reduction and analysis tools makes a significant difference in the ability of researchers to access and manipulate data. In areas where such tools are not available, immediate agency investment in community-based development would be highly productive. Where tools are already available, support to maintain and evolve them as new data sets and capabilities emerge should continue. Capabilities should expand to include data mining and assimilation in order to enable full exploitation of the large new heliophysics datasets.
B.3.3.5 Semantic Technologies
The astrophysics and geophysics communities have taken the lead in adopting modern, “semantic” technologies, where machines “understand” the context and meaning of data, to enable cross-discipline data access. Promoting the development of semantic technology would enable the emerging data access capability in heliophysics to share data and knowledge with other fields.
B.3.3.6 A National Approach to Data Policies
The heliophysics data policies of the funding agencies differ, or are in some cases lacking. The NSF, for instance, now requires a data management plan in all research proposals, but geosciences does not yet have a uniform data access and preservation policy. NASA Heliophysics has a well-developed data management policy, but long-term preservation of data is in a state of flux. It would be wise for the agencies to formulate a national policy for curation of data from taxpayer-funded scientific research. For heliophysics, the Committee on Space Weather could review and monitor agency data policies.