VOs in Space and Solar Physics:  Draft Workshop Agenda (2004-10-19)

27-29 October 2004, Greenbelt, MD Marriott Hotel

 

Wednesday October 27

 

(8 am  Focus group leaders meet)

 

8:30 am Welcome from LWS

 

The Historical Setting: Overview of the history and importance of the VO effort (aka "Space Science Data System")  (Ray Walker)

 

The Larger Setting: VOs and the electronic Geophysical Year (Dan Baker or Volodya Papitashvili)

 

Some thoughts from NSF (Kile Baker et al.)

 

What NASA needs from us  (Chuck Holmes/Joe Bredekamp)

 

9:30 am A Framework for Discussion and the Science Context (Aaron Roberts)

 

The SPASE Data Model for Space and Solar Physics  (Space Physics Archive Search and Extract)

 

10:30 am Break & Posters

 

11 am What's inside the "small box"?  (Virtual Solar Observatory) 

 

A first "large box" and a plan for higher-order queries  (Virtual Space Physics Observatory)

 

Noon  Lunch & Posters

 

1:30 pm VxOs as subfield and repository-connection organizers  (Virtual Heliospheric Observatory)

 

Registries and metadata-driven searches in a solar-terrestrial grid context  (European Grid of Solar Observations)

 

Connecting data and services with SOAP (NSSDC/SPDF CDAWeb/SSCWeb)

 

3 pm Break & Posters

 

3:30 pm Creating and Integrating Space Science Services (Collaborative Sun-Earth Connector)

 

4 pm Focus group questions; charge to the group (Aaron Roberts)

Focus Groups meet; see below for discussion guidance. 

 

(5 pm Focus group leaders meet)

 

Thursday October 28

 

8:30 am Focus Group continuation

 

10:30 am  Break & Posters

 

11 am First presentation of focus group results and discussion.  Decisions on time needed for focus group vs plenary group work. 

 

Noon  Lunch & Posters

 

1:30 pm  Brief plenary demo time for people with web sites to show everyone; possible continuation of the plenary discussion.  

 

Focus group continuation

 

3:30 pm Presentations and plenary discussion

 

 

Friday October 29

 

9 am  Focus group presentations/meetings as needed.  Wrap-up and conclusions.  Action items.  Report outline.

 

Noon Lunch; end of main meeting. 

 

(1:30 pm  Focus group leaders meet, as airline schedules etc. allow and with as much agency attendance as possible, to firm up outlines and assignments.)


 

Focus Group Questions:

 

Registries and Data Models (R. Bentley, MSSL; J. Mukherjee, SWRI)

 

(Possible specific questions: How do we describe products in a universal language?  Where and how should these products be registered?  What is the right level of detail?)

 

Possible conclusions:

(1) All products should be registered in a central registry according to a common data model.  A web-based tool should be available to make this easy.  

(2) Products should be registered in a common way at the repository sites, with a known URL for queries.

(3) Products should be documented, whenever possible, down to the level of the physical quantities contained, e.g., in the columns of a representative file. 

(4) SPASE provides the best starting point for a Data Model, and other efforts should be integrated with theirs.

(5) EGSO provides the best starting point for a Data Model, and other efforts should be integrated with theirs. 

(6) The community needs an international standards body, possibly as a subset of an existing body, to provide a unified approach.  [This could be SPASE; this could also be overkill.]
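
For discussion purposes only, conclusions (1) and (3) can be pictured with a minimal sketch of a registry whose entries are documented down to the physical quantities they contain. Every class name, field name, product ID, and URL below is hypothetical; none is taken from SPASE, EGSO, or any existing registry.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PhysicalQuantity:
    """One physical quantity, e.g. one column of a representative file."""
    name: str
    units: str


@dataclass
class ProductRecord:
    """A registered data product (all field names are hypothetical)."""
    product_id: str
    repository_url: str  # known URL for queries at the repository site
    quantities: List[PhysicalQuantity] = field(default_factory=list)


class Registry:
    """A toy central registry keyed by product ID."""

    def __init__(self) -> None:
        self._records: Dict[str, ProductRecord] = {}

    def register(self, record: ProductRecord) -> None:
        # Refuse duplicates so that each ID names exactly one product.
        if record.product_id in self._records:
            raise ValueError(f"already registered: {record.product_id}")
        self._records[record.product_id] = record

    def find_by_quantity(self, name: str) -> List[str]:
        # Return the IDs of all products containing the named quantity.
        return sorted(
            pid
            for pid, rec in self._records.items()
            if any(q.name == name for q in rec.quantities)
        )


# Hypothetical usage: one product with two documented quantities.
registry = Registry()
registry.register(
    ProductRecord(
        product_id="example/mag/1min",
        repository_url="http://repository.example/query",
        quantities=[
            PhysicalQuantity("magnetic_field", "nT"),
            PhysicalQuantity("time", "s"),
        ],
    )
)
matches = registry.find_by_quantity("magnetic_field")
```

The right level of detail for the quantity descriptions is exactly the open question posed above; the sketch fixes one arbitrary choice so that the trade-off can be discussed.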

 

Gateway/Brokers and front ends (R. Bogart, Stanford; V. Rezapkin, Aquilent)

 

(e.g., What are the best ways to use product registries?  What types of standards should we have for product access APIs? How do we avoid excessive duplication of effort in designing "middleware"? Can we make a "Broker in a Box"?  Is this desirable for linking particular datasets?)

 

 

Possible conclusions:

(1) The role of a "broker" is to connect user and/or VO-designed software ("front-ends") to repositories and services; as such, data-model-based API standards are needed.

(2) A broker should generally become a service, appearing to the outside, at least in part, as an integrated repository.

(3) There is no need for API standards, since translators will be easily constructed. 

(4) Front-end applications (including web pages) should be encouraged to proliferate; the best will survive, and even low-use applications may be very useful. 

(5) At least some front-end applications should be developed by broad consortia that involve much community input. 

 

Repositories and "VxOs" (A. Szabo, GSFC; A. Davis, Caltech)

 

(e.g., How do data repositories "join" the VO environment?  Is a minimum service level required (e.g., is ftp for files too little)?  How can subfields best organize their data and make it available to a larger community? How do we ensure that analysis tools are available for all the data?)

 

Possible conclusions: 

(1) We need simple procedures ("cookbooks") for repositories to make their data/services machine accessible.  (VSO, SSCWeb, and CDAWeb can provide SOAP examples.) 

(2) VxOs should ensure that the data in a subfield are as comprehensive and accurate as possible, and should direct funding (within a grant) to needed improvements.

(3) VxOs should unify access to data where appropriate (e.g., all ground-based magnetometer chains), and provide machine access to these data as one entity.  This could be done, for example, by reformatting data and making a common database, or through a web services approach with a broker and translators. 

(4) VxOs should not necessarily make new brokers, but should take advantage of existing services as much as possible.
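
As a starting point for discussion, the "broker and translators" option in conclusion (3) might be sketched as follows: one common query is handed to a broker, and a per-repository translator turns it into that repository's native request. The repository names, query keys, and native request formats are all invented for illustration and do not describe any existing magnetometer chain or service.

```python
from typing import Callable, Dict

# A common query is a plain mapping; the keys used here are hypothetical.
Query = Dict[str, str]
Translator = Callable[[Query], str]


def chain_a_translator(query: Query) -> str:
    """Translate to an invented key=value request format."""
    return (
        f"station={query['station']}"
        f"&start={query['start']}&stop={query['stop']}"
    )


def chain_b_translator(query: Query) -> str:
    """Translate to an invented path-style request format."""
    return f"{query['station']}/{query['start']}/{query['stop']}"


class Broker:
    """Presents several repositories to the caller as one entity."""

    def __init__(self) -> None:
        self._translators: Dict[str, Translator] = {}

    def add_repository(self, name: str, translator: Translator) -> None:
        self._translators[name] = translator

    def translate(self, query: Query) -> Dict[str, str]:
        # Build one native request per registered repository.
        return {name: t(query) for name, t in self._translators.items()}


broker = Broker()
broker.add_repository("chain_a", chain_a_translator)
broker.add_repository("chain_b", chain_b_translator)

requests = broker.translate(
    {"station": "XYZ", "start": "2004-10-27", "stop": "2004-10-29"}
)
```

The alternative named in the same conclusion, reformatting the data into a common database, would remove the translators at the cost of maintaining a second copy of the data.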

 

 

Services (N. Hurlburt, LMSAL; M. Weiss, APL)

 

(e.g., What services beyond data discovery and access, such as format translation, coordinate conversion, and visualization, are most important?  How are these services best coupled to the data access mechanisms?)

 

Possible priority services:

(1) Software for basic reading and analysis of datasets

(2) Event-driven searches for data

(3) Complex criteria ("higher-order") searches for data

(4) Format translation (or translation to ASCII)

(5) 2-D Graphics and 3-D visualization

(6) Coordinate transformations

(7) Various links to models: which, and how?

(8) Guides to the significance and typical uses of datasets

 

Possible connection and orchestration mechanisms ("science workflows"):

(1) Add all services through CoSEC.

(2) Integrate specific services with some VOs.

(3) Provide services through repositories (as in the CDAWeb SOAP service).

(4) Provide web service interfaces to services to allow them to be used by all VOs. 

What standards are needed?  What do we have to decide about the architecture?

 

Science Benefits (all)

 

Why are VOs worthwhile?  What new science do we expect?  Will the same tasks be a lot more efficient?  Would that be enough? 

 

Management (all)

 

How should the VOs be organized?  To what extent is a central authority needed, and to what extent should it just be a standards group?  How useful will it be to have a central location for broker software, etc., or will this all be effectively handled through distributed services? 

 

Visionary Ideas (all)

 

What should the data environment look like ten years from now?  How do we build the current system to make the transition to this easier?  What tools and methods are on the horizon that we should be looking to use?