
DATA MANAGEMENT FOR MARINE GEOLOGY AND GEOPHYSICS
Tools for Archiving, Analysis, and Visualization

WORKSHOP REPORT
LA JOLLA, CALIFORNIA
MAY 14-16, 2001
TABLE OF CONTENTS

Executive Summary

1. Overview
1.1 Motivation
1.2 Objectives
1.3 Agenda
1.4 Evaluation
1.5 Relevant URLs

2. Working Group Summaries and Recommendations
2.1 Working Group 1 - Structure of a Data Management System
2.2 Working Group 2 - Data Archiving and Access
2.3 Working Group 3 - Data Documentation

Appendices
Appendix 1: List of Attendees
Appendix 2: Final Agenda
Appendix 3: Workshop Evaluation
Appendix 4: Relevant URLs

Support for this workshop was provided by the National Science Foundation and Office of Naval Research.
ORGANIZING COMMITTEE

Steve Cande
SIO/UCSD
9500 Gilman Dr.
La Jolla, CA 92093-0220
cande@gauss.ucsd.edu
Tel: 858-534-1552
Fax: 858-534-0784

William Ryan
LDEO
61 Rte 9W
Palisades, NY 10964
billr@ldeo.columbia.edu
Tel: 845-365-8312
Fax: 845-365-8156

Suzanne Carbotte, Co-Chair
LDEO
61 Rte 9W
Palisades, NY 10964
carbotte@ldeo.columbia.edu
Tel: 845-365-8895
Fax: 845-365-8168

Deborah Smith, Co-Chair
Department of Geology and Geophysics, MS #22
WHOI
Woods Hole, MA 02543
dsmith@whoi.edu
Tel: 508-289-2472
Fax: 508-457-2187

Stephen Miller
SIO/UCSD
9500 Gilman Dr.
La Jolla, CA 92093-0220
spmiller@ucsd.edu
Tel: 858-534-1898
Fax: 858-534-0784

Dawn Wright
Department of Geosciences
Oregon State University
Corvallis, OR 97331-5506
dawn@dusk.geo.orst.edu
Tel: 541-737-1229
Fax: 541-737-1200
EXECUTIVE SUMMARY

On May 14-16, 2001, the National Science Foundation and the Office of Naval Research sponsored a workshop on Data Management for Marine Geology and Geophysics: Tools for Archiving, Analysis, and Visualization. The workshop was held at the Sea Lodge Hotel in La Jolla, CA. The workshop's objective was to bring together researchers, data collectors, data users, engineers, and computer scientists to assess the state of existing data management efforts in the marine geology and geophysics (MG&G) community, share experiences in developing data management projects, and help determine the direction of future efforts in data management.

The workshop agenda was organized around presentations, plenary discussions, and working group discussions. The presentations provided examples of the needs of data users, the needs of large, multidisciplinary MG&G projects, existing data management projects in the community, tools that have been developed for data access and analysis, examples of organizations with centralized databases, and current topics in information technology.

Working groups addressed questions concerning three different themes: (1) the structure of a data management system, (2) data archiving and access, and (3) data documentation. The working groups were also asked to recommend strategies to permit MG&G data management to move forward in each of these areas.

The Working Groups came up with 20 recommendations:

OVERARCHING RECOMMENDATIONS
1. Create permanent, active archives for all MG&G data.
2. Create a centralized and searchable on-line metadata catalog.

STRUCTURE OF A DATA MANAGEMENT SYSTEM
3. Manage data using a distributed system with a central coordination center.
4. Manage different data types with user-defined centers.
5. Support area- or problem-specific databases if scientifically justified, but these databases should link to rather than duplicate data holdings within discipline-specific data centers.
6. Fund core operating costs of the distributed data centers as 3-5 year facility cooperative agreements.
7. Evaluate the data management system using oversight and advisory committees, in-depth peer reviews at renewal intervals, and ad hoc panels to assess each data center's contribution to science.

DATA ARCHIVING AND ACCESS
8. Always archive raw data. Archive derived data for high-demand products.
9. Store data in open formats.


10. Develop standardized tools and procedures to ensure quality control at all steps from acquisition through archiving.
11. Improve access to common tools for data analysis and interpretation for the benefit of the community.
12. Build data centers to address the needs of a diverse user community, which will be primarily scientists.
13. Enforce timely data distribution through funding agency actions.
14. Promote interactions among federal agencies and organizations, and international agencies to define data and metadata exchange standards and policies.

DATA DOCUMENTATION
15. Require ship operators and principal investigators (P.I.s) to submit level 1 metadata* and cruise navigation to the centralized metadata catalog at the end of each cruise as part of the cruise-reporting process.
16. Generate a standard digital cruise report form and make it available to all chief scientists for cruise reporting (level 2 metadata*).
17. Require individual P.I.s to complete and submit standard forms for level 1 and 2 metadata* for field programs carried out aboard vessels not in the University-National Oceanographic Laboratory System (UNOLS) fleet (e.g., foreign, commercial, other academic platforms).
18. Generate a standardized suite of level 1 and level 2 metadata* during operation of seafloor observatories and other national facilities (e.g., the Deep Submergence Laboratory, Ocean Bottom Seismograph (OBS) Instrument Pool), and submit to the central metadata catalog.
19. Require level 3 metadata* within each discipline-specific data center. Archiving of publications related to the data should also be included (level 4 metadata*).
20. Follow nationally accepted metadata standards (particularly for levels 1 and 3 metadata*).

A clear top priority of the workshop participants is to immediately define and establish a centralized metadata catalog. The metadata catalog should be broad, containing information on as many data types as possible. It should support geospatial, temporal, keyword, and expert-level searches of each data type. By definition, metadata are information about data that can evolve. The catalog should be a circular system that allows feedback from the user/originator. The metadata catalog should serve as the central link to the distributed network of data centers where the actual data reside.

To move forward, funding agencies must establish a small working group or advisory board to develop the structure and implementation of a metadata catalog. Additional working groups for each of the high-priority, discipline-specific data centers also need to be assembled. It is critical to obtain the active involvement of scientists in all aspects of this process through all operational phases, including data collection, processing, archiving, and distribution.

Section 2 of this report discusses these recommendations further.

*Metadata levels: Level 1. Basic description of the field program including location, program dates, data types, collecting institutions, collecting vessel, and P.I.s. Level 2. A digital cruise report and data inventory. Level 3. Data object and access information including data formats, quality, processing, etc. Level 4. Publications derived from the data.
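The four metadata levels lend themselves to a simple structured record. The sketch below is a hypothetical illustration only: the `CruiseMetadata` fields and the `geospatial_search` helper are invented here to show how level 1 fields could drive a bounding-box search of the catalog; they are not part of any system proposed at the workshop.

```python
from dataclasses import dataclass, field

@dataclass
class CruiseMetadata:
    # Level 1: basic description of the field program
    program: str
    dates: tuple        # (start, end) as ISO date strings
    bounds: tuple       # (min_lon, min_lat, max_lon, max_lat)
    data_types: list
    institution: str
    vessel: str
    pis: list
    # Level 2: digital cruise report and data inventory
    cruise_report_url: str = ""
    inventory: list = field(default_factory=list)
    # Level 3: data object and access information (type -> format)
    formats: dict = field(default_factory=dict)
    # Level 4: publications derived from the data
    publications: list = field(default_factory=list)

def geospatial_search(catalog, box):
    """Return records whose bounding boxes overlap the query box."""
    qw, qs, qe, qn = box
    hits = []
    for rec in catalog:
        w, s, e, n = rec.bounds
        # Two boxes overlap when neither lies entirely outside the other.
        if w <= qe and e >= qw and s <= qn and n >= qs:
            hits.append(rec)
    return hits
```

A temporal or keyword search would filter on the `dates` or `data_types` fields in the same way, and the "circular" feedback loop would amount to letting users append corrections to these records over time.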
1. OVERVIEW

1.1 MOTIVATION
MG&G scientific data collections are growing at a rapid rate (Figures 1 and 2). Processed and analyzed data made available to a broad community of scientists, educators, and the general public can be used for discovering and distributing new knowledge. A significant problem is how to provide data users with the means to effectively access these data and the tools to analyze and interpret them.

Advances in data storage technology have eliminated practical constraints on storing large data volumes and have permitted data collection at increasingly finer sample rates. New high-resolution systems provide digital images of the seafloor at sub-meter pixel resolution and generate data at rates on the order of Gigabytes per day (Figures 3 and 4). Seismic acquisition capabilities have greatly expanded with long-term deployment of bottom sensors. High-resolution multichannel seismic reflection (MCS) systems are now available for shallow-water problems, and digital acquisition of 480-channel data is currently routine for deep-ocean work (Figure 5).

With these new technologies it is becoming increasingly difficult for individual investigators to synthesize and organize data sets collected on single cruises, let alone manage them in a manner that allows data to be accessed efficiently by a larger user pool. National archiving of some marine geoscience data is carried out. There have been several attempts by individual P.I.s to establish geographic- or data-specific databases. However, access to many data types remains difficult and incomplete (e.g., MCS, multibeam bathymetry, sidescan sonar, camera, and video imagery). Large quantities of data are under-utilized by primary users and gaining access to these data is virtually impossible for secondary users. At the same time, our scientific interests are increasingly interdisciplinary and require easy access to the broad spectrum of data collected. Throughout the marine geoscience community, scientists want access to data, the ability to compare data of different types, and tools to manipulate and interpret these data (Figures 6, 7, 8).

Fig. 1. Growth in the Incorporated Research Institutions for Seismology (IRIS) data archive since 1992. Data holdings from a variety of different seismic networks are shown (GSN: Global Seismographic Network; FDSN: Federation of Digital Broad-Band Seismograph Networks; PASSCAL: Program for the Array Seismic Studies of the Continental Lithosphere). Figure provided by Tim Ahern (IRIS).

Fig. 2. Growth in the digital data holdings of marine geophysical data at the National Geophysical Data Center (NGDC) since 1990. Dashed lines show data doubling times of 9, 17 and 33 months. Figure provided by George Sharman (NGDC, NOAA).

Figure 3. Sun-illuminated perspective view of the Eel River margin of Northern California. Multibeam bathymetry (EM1000, Hydrosweep and Seabeam data) are merged with the USGS 30 m DEM for the adjacent land. Three-dimensional visualization of the merged topography is carried out using Fledermaus from Interactive Visualization Systems and Analysis. (Fonseca, L., Mayer, L. and Paton, M., ArcView Objects in the Fledermaus Interactive 3-D Visualization System: An example from the STRATAFORM GIS, in Wright, D.J. (ed.), Undersea With GIS, Redlands, CA: ESRI Press, in press, 2001).

Figure 4. Three-dimensional shaded relief map of the East Pacific Rise near 9°-10°N. This is currently the best-studied section of fast-spreading mid-ocean ridge. Figure courtesy of Dawn Wright (OSU).

Figure 5. Example of a multichannel seismic reflection record from the northwest shelf of Australia. Seismic interpretation of various reflectors is superimposed with various colors. Data are stored and displayed within the GEOQUEST IESX seismic-well log integrator/data browser. The power engine of IESX is an Oracle-based database system that organizes seismic, well log and geographical data in a local environment for interpretation. Figure courtesy of Garry Karner (LDEO).

Figure 6. Example of capability provided by MapAp, a web-based, map-driven database interface developed for the RIDGE Multibeam Synthesis project (see Appendix 4). Figure shows a multibeam bathymetry map of Axial Seamount, NE Pacific, with a user-defined profile location and corresponding bathymetry profile displayed. Figure provided by Bill Haxby (LDEO).
With these concerns in mind, the National Science Foundation (NSF) and Office of Naval Research (ONR) sponsored a workshop on MG&G data management on May 14-16, 2001 in La Jolla, California. The coordinating committee advertised the workshop; participation was open. Approximately 80 representatives from science, engineering, computer science, government, and industry attended (Appendix 1). The workshop provided a forum for a focused interchange of ideas among data users, data providers, and technical experts in information technology.

Figure 7. The Virtual Jason Control Van is a web-based application that takes real-time snapshots of information occurring inside the control van during vehicle operations and makes this information immediately available for shipboard scientists and for collaboration and post-cruise analysis on shore. Features include monitoring real-time operations, searching for events, dates, etc. Figure courtesy of Steven Lerner (WHOI).
1.2 OBJECTIVES
The overall goal of the workshop was to develop a strategy for MG&G data management that would meet scientists' needs for improved data access and improved tools for data analysis and interpretation. Accomplishing this goal will lead to greater use of data by the MG&G community and the education and outreach community.

Another workshop objective was to provide NSF and ONR with recommendations on how to implement this data management strategy. The organizing committee thus created three thematic working groups: (1) structure of a data management system, (2) data archiving and access, and (3) data documentation. Each group discussed key problems within their theme and provided a list of recommendations on how to solve them. Critical to implementing these recommendations is active involvement of scientists in the entire process, including collecting, processing, archiving, and distributing data.

Figure 8. Schematic diagram of scientific workflow. There are three stages that modelers go through when developing a computational result: (1) experimental data processing; (2) parameter adjustment; and (3) publication of the result. It is important that the modeler be able to link tools easily to the output of computational applications (e.g., to visualize the data). Figure courtesy of Dawn Wright (OSU).


1.3 AGENDA
The first day of the workshop was devoted to short talks, each followed by a brief discussion. In addition, a longer discussion followed each group of subject-specific talks. The longer discussion was led by a pre-assigned discussion leader. Our intent was to engage the participants in the meeting right from the start through the discussion.

The workshop began with presentations from data users. The talks focused on problems that P.I.s have had in the past with gaining access to data, and possible solutions to these problems. Representatives from large, multidisciplinary MG&G programs gave overviews on anticipated database needs for new program initiatives. Individual P.I.s made presentations on database projects which they initiated, providing working examples of data access and functionality over a range of disciplines. In the late afternoon of the first day, workshop participants presented models for data access. The format of this session was somewhat different as the talks served as introductions for demonstrations that were part of the evening poster session. In addition to these invited demonstrations, the evening session included posters and demonstrations contributed by workshop participants.

The second day of the workshop began with presentations by representatives of organizations with large central databases. Talks focused on anticipated future directions in data access and database design as well as insights on successes and major obstacles encountered during their efforts to date. The final set of talks focused on current developments in information technology, including data mining issues and designing databases to serve real-time data.

The presentations on days 1 and 2 served as catalysts for discussions that were held within the theme working groups, each of which consisted of an interdisciplinary group of scientists, engineers, and computer scientists. These working groups addressed a number of questions that formed the basis for presentations in the morning of the third day of the workshop.

The full agenda is given in Appendix 2. Presentations and poster abstracts can be obtained through the workshop conveners.

1.4 EVALUATION
An evaluation form was included in the workshop packet that participants received. The forms were collected at the conclusion of the workshop. The responses to the questions have been compiled and are presented in Appendix 3.

1.5 RELEVANT URLS
During the meeting, participants were asked to provide links to web sites that are relevant to MG&G database efforts. This URL list is provided in Appendix 4.
2. WORKING GROUP SUMMARIES AND RECOMMENDATIONS

2.1 WORKING GROUP 1
Structure of a Data Management System

Working Group 1 considered how to structure an MG&G data management system. Currently, some data are archived at the National Geophysical Data Center (NGDC), such as the suite of underway geophysical data (navigation, gravity, magnetics, topography) collected on most large UNOLS vessels. However, it is not standard practice to submit to NGDC all data collected by the MG&G community.

Several ship-operating institutions have archived data at some level. However, no standardization across institutions exists and these efforts have been carried out at the discretion of the individual institutions. The need for a sound data management system is recognized, and a few workshops have been held to address this problem for specific data types (e.g., MCS Workshop, La Jolla, CA, 1999). In addition, individual P.I.s have made specific data types available to the broader community (see Appendix 4). It is evident that there is no community-wide strategy in place to solve MG&G data management problems.

QUESTIONS CONSIDERED
Working Group 1 addressed the following questions:
- What model is appropriate for a data management system (e.g., distributed versus centralized)?
- How do we fund the data management system?
- How do we evaluate the system?
- Do we need a permanent archive?

There was clear agreement within the group that the MG&G community needs a distributed data management system with a coordination center to facilitate communication among different data centers. The working group session started with a discussion of metadata, indicating the importance of metadata to attendees and the MG&G community in general. The consensus of Working Group 1 is that the community must begin taking small, concrete steps towards establishing a metadata catalog. From there the community should move towards a discipline-oriented, distributed data management system that will improve data use by a broad community. Development of the discipline-oriented data centers should be handled through the normal competitive proposal process. Although participants agreed that significant resources are needed for new database efforts, exact details of the level of government agency funds for the management system were not determined.

RECOMMENDATIONS

WG1_1. Create permanent, active archives for all MG&G data.
It is very important that the funding agencies maintain and strengthen their commitment to long-term data archiving. As noted at the meeting, data collected by the HMS Challenger are still being used. Permanent archives for all types of MG&G data must be established. The community must continually add to and update these permanent archives.

WG1_2. Manage data using a distributed system with a central coordinating center.
The management system should operate as close to the data generators as possible. Scientists must be actively involved in data management, placing the responsibility for and authority over the data as close as possible to where the expertise resides. Data quality control should be provided by those generating the data. Mechanisms should be developed to enable users to easily provide feedback on data quality.


A coordination center is necessary to facilitate communications among the distributed data centers, and to ensure that everyone works together. A good example of central coordination is the OBS Instrument Pool. The individual instrument centers provide quality control and write standard format data. The Incorporated Research Institutions for Seismology (IRIS) then archives the data and provides community access to them.

WG1_3. Manage different data types with user-defined centers.
Examples of different data types and their management status are given below. The list is not all inclusive.
1. Ocean bottom seismograph/hydrophone (OBS/OBH) data. Quality control is provided by the three OBS instrument centers and archival and community access is provided by IRIS.
2. Rock petrology/geochemistry data. A web-served database is being developed to provide metadata and processed results for rock samples. This effort is ready for migration to permanent support.
3. Core/rock collections. Sample curation appears to be in good shape. NGDC maintains a central catalog of the existence of physical samples and some sample metadata.
4. Ocean Drilling Program (ODP) data. ODP developed the JANUS database based on community recommendations that came out of several workshops. It appears to be in good shape. There are plans in place to transition the database from ODP to IODP in 2003.
5. Single channel and multichannel seismic data. A workshop was held in 1999 to determine the needs of the community for database management. Recommendations were made from that workshop. An interested subgroup of the MCS community needs to define the model details and submit a proposal to NSF.
6. Multibeam sonar data (bathymetry, sidescan, backscatter, LIDAR, etc.). This community needs a user/generator workshop or working group to define the problems and solutions to their database needs. There is a critical need for one-stop access quality-control and processing centers with tools to generate higher level products. There does not seem to be a quality-control process in place although the MB-System software provides tools for reading a broad suite of multibeam data.
7. Deep submergence data collected by near-bottom instruments (submarines, remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), etc.). In principle, these data should be managed in the same way that other shipboard data are managed. A data management plan must be defined and overseen, perhaps through the Deep Submergence Science Committee (DESSC), the existing operators and user group.
8. Gravity/magnetic data. NGDC maintains archives of these data, but there are major quality-control and user-interface problems. The community concerned about these data needs to be defined. Value-added products (derived products) should be archived and made available to the broad community.
9. Sedimentology, paleontology data. Although it was noted that problems exist, there were too few representatives from these communities at the workshop to define the issues and possible solutions. NSF should encourage mini-workshops or working groups for these data.

WG1_4. Support area- or problem-specific databases if scientifically justified, but these databases should link to rather than duplicate data holdings within discipline-specific data centers.
Working Group 1 recognizes that there might be a future need to set up databases for specific oceanic regions or for specific scientific goals. Examples of area-specific databases are those for the 9°N area of the East Pacific Rise and the Juan de Fuca region. Examples of problem-specific databases are those that will develop from the MARGINS and RIDGE programs. These databases should be supported, but they should serve as links to discipline-specific databases and should not duplicate data holdings within these databases.
WG1_5. Evaluate the data management system using oversight and advisory committees, in-depth peer reviews at renewal intervals, and ad hoc panels to assess each data center's contribution to science.
The data management system should undergo regularly scheduled peer review. A new set of advisory groups representing the broad spectrum of the MG&G research community should be established. This will ensure that the recommendations regarding data sets and models will be responsive to the community's needs. Selection of data centers should be determined through competition, and a data center should not expect to be funded permanently.

WG1_6. Fund core operating costs of the distributed data centers as 3-5 year facility cooperative agreements.
This is a corollary to recommendation WG1_5 in that funds should cover a finite number of years after which each of the data centers should be evaluated for effectiveness and responsiveness to users' needs.

2.2 WORKING GROUP 2
Data Archiving and Access
Working Group 2 focused primarily on data archiving issues. Problems associated with current MG&G archiving efforts range from complete absence of an archive for many important data types, to lack of quality control and inadequate data delivery to archives. While ship operators deliver underway geophysical data to the NGDC (Figure 9), there are no standards for data quality and it can be difficult to obtain a uniform data set for a region. Multibeam bathymetric systems are currently operated on most deep-ocean vessels within the academic fleet, but standards do not exist for navigation editing, beam editing, or even the nature of the final data product (corrected or uncorrected meters). Some multibeam data are archived with the NGDC, some with the ship operating institutions, and some within problem-specific archives (e.g., the RIDGE Multibeam Synthesis Web Site). However, delivery to these data archives is largely at the discretion of the P.I., and access to these data and many other data types is often difficult.

Data ownership issues continue to be significant obstacles to archiving efforts. Although NSF policy permits a two-year proprietary hold on data collected by a P.I., data are commonly held well past this time period.

Recognition for contribution to data archives is an important issue that has not been well addressed by any existing archives.

Figure 9. World map showing over 15 million miles of ship tracks with underway geophysical data inventoried within NGDC's Marine Trackline Geophysics Database. Bathymetry, magnetics, gravity, and seismic reflection data along these tracks from 4600 cruises were collected from 1939 to 2000. Figure provided by John Campagnoli, NGDC.

Figure 10. Predicted future growth in data holdings at NGDC compared with data storage capability predicted by Moore's law for the next 10 years. Figure demonstrates that expected data storage capability should be more than adequate to handle expected data volumes. Figure courtesy of Herbert Kroehl, NGDC.

Figure 11. Arbitrary Digital Objects (ADO) are produced when contributed data are uploaded to the data repository. The ADO is assigned a persistent and unique name within the repository. These and other metadata are passed to the digital library function where they can be searched using a catalogue database. Key elements of this process are the continuing involvement of the authors of the data and the maintenance of a dialogue between data users and authors or their successors. Another important consideration is the separation of the metadata catalogue search function from the data repository and delivery function. Both become more portable and reliable when functionally separated. Figure from J. Helly, T. T. Elvins, D. Sutton, D. Martinez, S. Miller, S. Pickett and A. M. Ellison, Controlled Publication of Digital Scientific Data, Communications of the ACM (accepted October 3, 2000).

QUESTIONS CONSIDERED
Working Group 2 considered the following questions:
- What data need to be archived?
- Should we archive raw data, processed data and/or interpretations?
- How can we ensure quality control?
- How do we provide broad data access for scientists and the public?
- What tools are needed for data interpretation and analysis?
- How do we enforce timely data distribution?
- How do we reward data contribution to archives?

Overall priorities were defined as: (1) save the data, (2) provide a catalog of all of the data, and (3) provide easy access to the data. With advances in storage technology and web access, the bulk archiving of data is feasible, but our community will be challenged in the areas of quality control and metadata generation (Figures 10 and 11). A centralized metadata catalog of all data-collection activities was viewed by this group as a very high priority. To build this catalog, the working group recommended the development of easy-to-use tools for automatic metadata generation aboard UNOLS vessels. The data types to be archived include, but are not limited to:
- Underway geophysical data, including time, position, magnetics and gravity, multibeam bathymetry, sidescan sonar, single and multichannel seismics.
Standard supporting data, including sea state, to enter archives, common sanity and geographic-
XBT, CTD, sea surface temperature and salinity, bounds checks need to be applied. Circular archives
and derived sound velocity profiles, as well as are needed to permit content to be updated as errors
calibration data for each sensor. are found by users, with appropriate notations in
Station information for dredging, coring, trawl- metadata. The peer-review process in electronic jour-
ing, and other over-the-side operations, with nals can provide broad-based quality assessment, and
complete data or metadata, as appropriate. the publication of data briefs in electronic journals is
Some individual-investigator instrumental data encouraged.
need not be saved, but metadata with time, posi-
tion, and contact need to be archived. For ex- WG2_4. Improve access to common tools for data
ample, it may not be appropriate to archive test analysis and interpretation for the benefit of the
data collected from a prototype sensor, but the community.
existence of these data should be documented A combination of public domain and commercial
and preserved. tools are used widely, including GMT, MB-System,
ArcView, Fledermaus, and Matlab. A data center should
RECOMMENDATIONS
maintain a list of suitable software for viewing and
WG2_1. Always archive raw data. Archive derived analyzing each data type, instructions on installation,
data for high-demand products. data exchange and usage for our community, and con-
Current data storage capability is adequate for tacts for further assistance. Custom, open-source de-
on-the-fly generation of some types of derived data velopment may be needed for special tools and inter-
products. However, some types of derived data should faces for community-wide use.
be archived for high-demand products such as
multibeam bathymetry grids or maps, or when non- WG2_5. Build data centers with the goal of address-
trivial processing steps are required (e.g., MCS data). ing the needs of a diverse user community, which
Easy retrieval of these derived, value-added will be primarily scientists.
products (e.g., images of reflection profiles, grids, Platform-independent, browser-based data ex-
bathymetric maps, graphs) must be developed. Ev- traction tools are needed. The authoritative metadata
eryone benefits from maximum use of the data includ- catalog should support geospatial, temporal, keyword,
ing scientists, the government, and the public. and expert-level searches of each data type. A feder-
ated system of distributed data centers, easily updated
WG2_2. Store data in open formats. and synchronized, will provide reliable and efficient
Tested, portable, and noncommercial software for delivery for each data type. Data should be archived
data translation must be freely available for all users. in a form easily used by other disciplines. Experience
has shown that well-organized, image-rich, search-
WG2_3. Develop standardized tools and procedures able databases will serve the needs of both research-
to ensure quality control at all steps from acquisi- ers and the public (Figure 12).
tion through archiving.
Standardized shipboard tools are the first step to WG2_6. Enforce timely data distribution through
ensuring quality control. Easy-to-use, real-time data funding agency actions.
quality monitoring tools are needed for UNOLS ves- Raw data should be delivered to the designated
sels, as well as cost-effective ship-to-shore commu- data center immediately following each cruise. The
nication of sufficient compressed data for quality as- designated center will restrict data access to the P.I.
sessment and troubleshooting. Before data are allowed and identified collaborators for an initial proprietary

11
12

hold period. This lock is released when the period ex- WG2_7. Promote interactions among federal agen-
pires. The standard period is two years, although some cies and organizations, and international agencies
circumstances may warrant an extension to be granted to define data and metadata exchange standards and
by the cognizant funding agency. policies.
Auditing access to data will provide usage statis- The community would benefit from the standard-
tics and facilitate interdisciplinary collaboration as well ization of forms, such as an end-of-cruise digital data
as the communication of future updates, within the form, as well as from metadata content and exchange
restrictions of privacy requirements. standards. We encourage collaboration among the fed-
The NSF Final Report could include a field to de- erally mandated agencies (NSF, ONR, USGS, NOAA,
scribe how the P.I. complied with NSF data-distribu- NAVO, etc.) to review marine database standards.
tion policies. Noncompliance might have a negative International discussions should be encouraged
effect on future proposals. Data publication in citable to define exchange standards and policies. At a mini-
journals and in technical briefs such as USGS open- mum, exchange of cruise tracks and sample locations
file reports should be encouraged. would be a major benefit for cruise planning.
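As a concrete illustration of the sanity and geographic-bounds checks called for in WG2_3, the sketch below flags out-of-range positions and physically implausible ship speeds in an underway navigation stream. The `Fix` record and the 20-knot default threshold are illustrative assumptions for this sketch, not part of any archive's actual ingest procedure.

```python
import math
from dataclasses import dataclass

@dataclass
class Fix:
    """One underway navigation fix (an illustrative record, not a real format)."""
    t: float    # seconds since start of cruise
    lat: float  # decimal degrees, positive north
    lon: float  # decimal degrees, positive east

def sanity_check(fixes, max_speed_kn=20.0):
    """Return (index, reason) pairs for fixes failing range or speed checks."""
    problems = []
    prev = None
    for i, f in enumerate(fixes):
        if not (-90.0 <= f.lat <= 90.0 and -180.0 <= f.lon <= 180.0):
            problems.append((i, "position out of geographic bounds"))
            continue
        if prev is not None and f.t > prev.t:
            # Crude equirectangular distance in nautical miles (1 deg lat ~ 60 nm)
            dlat = (f.lat - prev.lat) * 60.0
            dlon = (f.lon - prev.lon) * 60.0 * math.cos(math.radians(prev.lat))
            nm = math.hypot(dlat, dlon)
            if nm / ((f.t - prev.t) / 3600.0) > max_speed_kn:
                problems.append((i, "implied speed exceeds maximum"))
                continue
        prev = f
    return problems
```

Applied at upload time, a check like this would catch gross navigation errors before they enter the archive; flagged fixes would be reported back to the contributor rather than silently dropped, consistent with the circular-archive idea above.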

2.3 WORKING GROUP 3


Data Documentation

Working Group 3 focused on metadata issues. The development of appropriate metadata and metadata standards for ocean floor and other types of oceanographic data is an extremely important issue. The growth in information technology has led to an explosion in the amount of information that is available to researchers in many fields. This is the case in the marine environment, where a state-of-the-art visual presence (e.g., through long-term monitoring by cameras and other instruments) may result in the acquisition of data that quickly overtakes the speed at which the data can be interpreted. The paradox is that as the amount of potentially useful and important data grows, it becomes increasingly difficult to know what data exist, the exact location where the data were collected (particularly when navigating at sea with no landmarks), and how the data can be accessed. In striving to manage this ever-increasing amount of data, and to facilitate their effective and efficient use, compiling metadata becomes an urgent issue.

Although metadata are contained within some of the digital data file formats commonly used to store MG&G data (e.g., MGD77 and SEGY formats), no uniform metadata are collected during federally funded MG&G field programs. Basic information regarding cruise location, date, project P.I.s, and data types collected can be difficult to obtain, and no central and comprehensive catalog is available. Cruise reports often contain detailed information regarding general experiment configuration, data calibration, and data quality, all of which are of great importance for subsequent data analysis. In many instances, the cruise report may be the only record of this information, but no easily accessible digital archive of these reports exists.

QUESTIONS CONSIDERED

Working Group 3 addressed the following questions:
How do we move toward metadata standards?
How do we standardize data collection procedures?
What is the role of the ship operating institutions in the archiving of data and generation of metadata?
What existing software and structures should we take advantage of?
How should we deal with real-time data acquisition?

To aid the development of a standardized procedure for generating metadata during MG&G studies, four levels of metadata were defined, each defining a particular stage of the data-acquisition to publication process:
Level 1. Basic description of the field program including location, program dates, data types, collecting institutions, collecting vessel, and P.I.s.
Level 2. A digital cruise report and data inventory.
Level 3. Data object and access information including data formats, quality, processing, etc.
Level 4. Publications derived from the data.

Responsibility for each metadata level could reside with different groups (e.g., ship-operating institution or P.I.), and some metadata generation could be automated and standardized across UNOLS vessels. The construction of a central metadata catalog for levels 1 and 2 metadata was viewed as the highest priority. The group consensus is that level 1 metadata should be generated during data acquisition and should be submitted to the central metadata archive immediately following a field program. Level 2 metadata should also be archived within the central metadata catalog, whereas level 3 metadata would reside with the actual data themselves. The appropriate archive for level 4 metadata may be both the central metadata catalog and the individual data centers. The requirements for levels 1 and 2 metadata should be standardized, whereas level 3 requirements will vary by data type.

The group's consensus is that a first step toward a central metadata catalog is to develop and implement procedures for metadata collection for all future MG&G data-acquisition efforts. Archiving and rescue efforts for legacy data should be handled as a parallel but secondary priority and should begin with cataloging existing data.

Figure 12. Example of an on-line database where the interface permits users to search for data available from seamounts (Tokelau Seamounts; NSF OCE97-30394, Institute of Geophysics & Planetary Physics, Scripps Institution of Oceanography, UCSD). Bottom figure shows a contoured bathymetry map for Howland Seamount (contour interval 125 m, grid size 180 m, sun azimuth 340). Figure courtesy of Anthony Koppers and Stephen Miller (SIO).
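The level 1 description above maps naturally onto a simple structured record. The sketch below shows how such a record might be validated before submission to the central catalog; the field names are hypothetical, chosen to mirror the list of required items, and this is an illustration rather than a proposed schema.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Level1Metadata:
    """Basic field-program description (level 1); field names are hypothetical."""
    cruise_id: str
    ship: str
    institution: str
    pis: list                 # project P.I.(s)
    start_date: str           # ISO 8601 dates, e.g. "2001-05-14"
    end_date: str
    west: float               # bounding box of the field area, decimal degrees
    east: float
    south: float
    north: float
    data_types: list = field(default_factory=list)

    def validate(self):
        """Minimal completeness and bounds checks before catalog submission."""
        assert self.start_date <= self.end_date, "program dates out of order"
        assert -180.0 <= self.west <= 180.0 and -180.0 <= self.east <= 180.0
        assert -90.0 <= self.south <= self.north <= 90.0
        assert self.pis, "at least one P.I. is required"
        return asdict(self)  # plain dict, ready to serialize for upload
```

Because every field is either a string, number, or list, a validated record serializes directly to common exchange formats, which is the property a smart web form or automated shipboard tool would rely on.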


RECOMMENDATIONS

WG3_1. Create a centralized and searchable on-line metadata catalog.
The metadata catalog should be broad, containing information on as many data types as possible. It should support geospatial, temporal, keyword, and expert-level searches of each data type. By definition, metadata are information about data that can evolve. The catalog should be a circular system that allows feedback from the user/originator. The metadata catalog should serve as the central link to the distributed network of data centers where the actual data reside.
Selection of an organization to develop and maintain this metadata catalog should be through a competitive process. The organization will oversee the development of metadata entry tools for easy entry into the metadata catalog. A high-performance storage system to archive and serve the catalog to the community should also be implemented.

WG3_2. Require ship operators and P.I.s to submit level 1 metadata and cruise navigation to the centralized metadata catalog at the end of each cruise as part of the cruise reporting process.
This function should be provided by the technical support staff aboard UNOLS vessels, although the ultimate responsibility for generating and delivering these data should lie with the project P.I. Tools need to be developed to facilitate this task, simplifying the process with a smart web form. Standard forms should be used on all UNOLS vessels and for all kinds of data-collection activities (chemical, physical, biological, and geological studies). Level 1 metadata along with cruise navigation should be submitted.
UNOLS may be an appropriate organization to manage the metadata submission process (and possibly the catalog), perhaps through modification of the UNOLS electronic ship request form. Metadata need to be defined, but should include items such as the chief scientist(s), project P.I.(s), institution(s), data types collected, dates of field program, geographic coordinates of the field area, ship name, and cruise leg ID (if appropriate). Metadata standardization is very important. Metadata and data need to be handled separately for maximum efficiency.

WG3_3. Generate a standard digital cruise report form and make it available to all chief scientists for cruise reporting (level 2 metadata).
These digital forms should be uniform across all federal agencies for all future cruises and should be submitted to the centralized metadata catalog.
Old cruise reports should be digitized, perhaps from the NOAA National Oceanographic Data Center (NODC) archive, as a parallel effort. Standard reporting should include the essential fields described above as well as specific details for each data type (e.g., data ranges for each data type, acquisition quality control records, number and location of sample stations). The responsible individual and physical location where each data type will reside following a cruise should be identified.

WG3_4. Require individual P.I.s to complete and submit standard forms for level 1 and 2 metadata for field programs carried out aboard non-UNOLS vessels (e.g., foreign, commercial, other academic platforms).
Not all field programs carried out by MG&G researchers involve UNOLS vessels, and procedures need to be developed that permit the cataloging of data collected during these programs as well.

WG3_5. Generate a standardized suite of level 1 and 2 metadata during operation of seafloor observatories as well as other national facilities (e.g., the Deep Submergence Laboratory, OBS Instrument Pool) and submit them to the central metadata catalog.
The metadata required should parallel those acquired from UNOLS operations, with additional fields as relevant. Navigation from submersibles, ROVs, and AUVs needs to be captured and archived along with support-ship navigation.
WG3_6. Require level 3 metadata within each discipline-specific data center.
Required metadata for a specific data type will likely vary and will be decided through development of individual data centers. These metadata include, for example, descriptions of data formats, retrieval information, data quality, and processing procedures. Archiving of publications related to the data should also be included (level 4 metadata).

WG3_7. Follow nationally accepted metadata standards (particularly for metadata levels 1 and 3).
A national content standard for metadata has already been established by the Federal Geographic Data Committee (FGDC). The standard is being migrated to match international ISO metadata standards, and fully outlines as much vital information as possible pertaining to a data set's source, content, format, accuracy, and lineage (i.e., what processing changes the data set has gone through over time). The content standard was developed by the FGDC primarily for GIS and satellite remote-sensing data as one way of implementing the National Spatial Data Infrastructure (NSDI).
We recommend taking advantage of these efforts. The FGDC standard is extremely complex, but small portions of it will be very useful in the creation of workable metadata standards for the various subdisciplines of MG&G.
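To make the "small portions" idea in WG3_7 concrete, the sketch below assembles a minimal FGDC CSDGM-flavored record covering only identification and spatial-domain elements (`idinfo`, `citation`, and `spdom/bounding` with `westbc`/`eastbc`/`northbc`/`southbc`). It is a hedged illustration, not a conforming CSDGM implementation, which requires many more mandatory elements.

```python
import xml.etree.ElementTree as ET

def fgdc_stub(title, origin, west, east, north, south, abstract=""):
    """Build a minimal FGDC CSDGM-style record (identification and spatial
    domain only); a full, conforming record needs far more content."""
    md = ET.Element("metadata")
    idinfo = ET.SubElement(md, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = origin
    ET.SubElement(citeinfo, "title").text = title
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    bounding = ET.SubElement(ET.SubElement(idinfo, "spdom"), "bounding")
    for tag, val in (("westbc", west), ("eastbc", east),
                     ("northbc", north), ("southbc", south)):
        ET.SubElement(bounding, tag).text = "%.4f" % val
    return ET.tostring(md, encoding="unicode")
```

Starting from a trimmed subset like this, each MG&G subdiscipline could add its own level 3 elements while staying interoperable with FGDC-aware catalog tools.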


APPENDIX 1: LIST OF ATTENDEES


May 14-16, 2001

NAME AFFILIATION E-MAIL


Ahern, Tim -SPEAKER ................................ IRIS ......................................... tim@iris.washington.edu
Arko, Robert ............................................... LDEO ...................................... arko@ldeo.columbia.edu
Barclay, Andrew ......................................... Univ. of Washington ................ abarclay@whoi.edu
Bartling, William ........................................ SciFrame, Inc. .......................... wbartling@sciframe.com
Baru, Chaitan ............................................. UCSD ...................................... baru@sdsc.edu
Batiza, Rodey ............................................. NSF ......................................... rbatiza@nsf.gov
Bemis, Karen .............................................. Rutgers Univ. ........................... bemis@rci.rutgers.edu
Benson, Rick .............................................. IRIS ......................................... rick@iris.washington.edu
Bobbitt, Andra -SPEAKER ........................... OSU ........................................ bobbitt@pmel.noaa.gov
Brovey, Robert ............................................ Exxon/Mobil ........................... rlbrove@upstream.xomcorp.com
Cande, Steve -CONVENER .......................... SIO/UCSD ............................... scande@ucsd.edu
Carbotte, Suzanne -CONVENER ................. LDEO ...................................... carbotte@ldeo.columbia.edu
Caress, David ............................................. MBARI ..................................... caress@mbari.org
Carron, Michael ......................................... Naval Oceanogr. Office ............ carronm@navo.navy.mil
Case, James D. ........................................... SAIC ........................................ jcase@mtg.saic.com
Chayes, Dale .............................................. LDEO ...................................... dale@ldeo.columbia.edu
Childs, Jonathan ......................................... USGS ....................................... jchilds@usgs.gov
Christie, David -SPEAKER ............................ OSU ........................................ dchristie@oce.orst.edu
Cochran, James .......................................... LDEO ...................................... jrc@ldeo.columbia.edu
Cushing, Judy -SPEAKER ............................ Evergreen State ....................... Judith.Cushing@inria.fr
Domenico, Ben -SPEAKER .......................... Unidata/UCAR ........................ ben@unidata.ucar.edu
Eakins, Barry .............................................. SIO/UCSD ............................... beakins@ucsd.edu
Epp, David ................................................. NSF ......................................... depp@nsf.gov
Fornari, Daniel -DISCUSSION LEADER ....... WHOI ...................................... dfornari@whoi.edu
Gahagan, Lisa ............................................ UTIG ....................................... lisa@utig.ig.utexas.edu
Gaudet, Severin -SPEAKER .......................... Canadian Astr. Data Center ..... Severin.Gaudet@nrc.ca
Gourley, Mike -SPEAKER ............................ CARIS ...................................... Gourley@caris.com
Griffin, Nicole ............................................. Univ. of New Orleans .............. ngriffin@uno.edu
Habermann, Ted -SPEAKER ........................ NOAA/NGDC .......................... Ted.Habermann@noaa.gov
Heath, G. Ross ........................................... Univ. of Washington ................ rheath@u.washington.edu
Helly, John -SPEAKER ................................. UCSD ...................................... hellyj@ucsd.edu
Henkart, Paul ............................................. SIO/IGPP ................................. phenkart@ucsd.edu
Hey, Richard ............................................... HIGP/SOEST ........................... hey@soest.hawaii.edu
Howland, Jonathan ..................................... WHOI ...................................... jhowland@whoi.edu
Jenkins, Chris ............................................. Univ. of Sydney ....................... cjenkins@es.usyd.edu.au
Karner, Garry -SPEAKER ............................ LDEO ...................................... garry@ldeo.columbia.edu
Kent, Graham ............................................. SIO/IGPP ................................. gkent@ucsd.edu
Knoop, Peter .............................................. Univ. of Michigan .................... knoop@umich.edu
Koppers, Anthony ...................................... SIO/IGPP ................................. akoppers@ucsd.edu
Kroehl, Herb -SPEAKER .............................. NGDC ..................................... hkroehl@ngdc.noaa.gov
Kurras, Greg ............................................... Univ. of Hawaii/ HMRG ........... gkurras@soest.hawaii.edu
Langmuir, Charlie -SPEAKER ...................... LDEO ...................................... langmuir@ldeo.columbia.edu
Lawrence, Richard -SPEAKER ..................... ESRI ........................................ rlawrence@esri.com
Le Bas, Tim ................................................ SOC ......................................... tlb@soc.soton.ac.uk
Lehnert, Kerstin .......................................... LDEO ...................................... lehnert@ldeo.columbia.edu
Lenhardt, W. Christopher ........................... CIESIN ..................................... clenhardt@ciesin.columbia.edu
Lerner, Steve -SPEAKER .............................. WHOI ...................................... slerner@whoi.edu
Lonsdale, Peter -SPEAKER .......................... SIO/UCSD ............................... plonsdale@ucsd.edu
Mayer, Larry -SPEAKER .............................. Univ. of New Hampshire ......... lmayer@unh.edu
McCann, Mike ............................................ MBARI ..................................... mccann@mbari.org
McDuff, Russell .......................................... Univ. of Washington ................ mcduff@ocean.washington.edu
Miller, Stephen -CONVENER ...................... SIO/UCSD ............................... spmiller@ucsd.edu
Moore, Reagan ........................................... UCSD ...................................... moore@sdsc.edu
Morris, Peter .............................................. British Antarctic Survey ........... pmor@bas.ac.uk
Mountain, Gregory ..................................... LDEO ...................................... mountain@ldeo.columbia.edu
Naar, David ................................................ Univ. of South Florida .............. naar@usf.edu
Orcutt, John -DISCUSSION LEADER ........... SIO/IGPP ................................. jorcutt@ucsd.edu
Rack, Frank -SPEAKER ............................... JOI-ODP .................................. frack@brook.edu
Reagan, Mary ............................................. LDEO ...................................... mreagan@ldeo.columbia.edu
Reid, Jane A. ............................................... USGS ....................................... jareid@usgs.gov
Ryan, William -CONVENER ....................... LDEO ...................................... billr@ldeo.columbia.edu
Sarbas, Barbel ............................................ Max-Planck Inst. ..................... sarbas@mpch-mainz.mpg.de
Scheirer, Daniel .......................................... Brown Univ. ............................ scheirer@emma.geo.brown.edu
Sharman, George -SPEAKER ....................... NGDC ..................................... gsharman@ngdc.noaa.gov
Shaw, Kevin B. ............................................ NRL ......................................... shaw@nrlssc.navy.mil
Shipley, Tom -SPEAKER .............................. UTIG ....................................... tom@ig.utexas.edu
Small, Chris -SPEAKER ............................... LDEO ...................................... small@ldeo.columbia.edu
Smith, Deborah -CONVENER ..................... WHOI ...................................... dsmith@whoi.edu
Smith, Stu .................................................. SIO/UCSD ............................... ssmith@ucsd.edu
Staudigel, Hubert ....................................... SIO/IGPP ................................. hstaudigel@ucsd.edu
Su, Young ................................................... LDEO ...................................... ysu@ldeo.columbia.edu
Syvitski, James ........................................... Univ. of Colorado .................... james.syvitski@colorado.edu
Tivey, Maurice ............................................ WHOI ...................................... mtivey@whoi.edu
Tolstoy, Maya ............................................. LDEO ...................................... tolstoy@ldeo.columbia.edu
Toomey, Douglas ........................................ Univ. of Oregon ....................... drt@newberry.uoregon.edu
Vaquero, Maria .......................................... Spanish Oceanogr. Inst. ........... juan.acosta@md.ieo.es
Wessel, Paul ............................................... Univ. of Hawaii ........................ pwessel@hawaii.edu
Wilcock, William ........................................ Univ. of Washington ................ wilcock@ocean.washington.edu
Wright, Dawn -CONVENER ....................... OSU ........................................ dawn@dusk.geo.orst.edu
Zaslavsky, Ilya ............................................ UCSD ...................................... zaslavsk@sdsc.edu


APPENDIX 2: FINAL AGENDA

DAY 1: MONDAY MAY 14


7:00 am Continental Breakfast
8:00 am - 8:15 am Plenary Session - Introduction to the Workshop
8:15 am - 9:50 am Data user needs (15-minute talks, 10 minutes for questions)
Presentations from individual data users.
Tom Shipley (UTIG)
Peter Lonsdale (SIO)
Chris Small (LDEO)
Discussion 20 minutes, DL: Dan Fornari (WHOI)
9:50 am - 10:20 am Break
10:20 am - 11:55 am Large programs (15-minute talks, 10 minutes for questions)
Presentations from representatives of large programs.
Dave Christie (OSU) - RIDGE
Garry Karner (LDEO) - MARGINS
Severin Gaudet (Canadian Astronomy Data Center) - NEPTUNE
Discussion 20 minutes, DL: John Orcutt (SIO/IGPP)
11:55 am - 12:20 pm Existing projects, P.I. driven (15-minute talks, 10 minutes for questions)
Presentations on individual P.I.-initiated data management projects.
Suzanne Carbotte/Bill Ryan (LDEO)
12:20 pm - 1:30 pm Lunch
1:30 pm - 3:30 pm Existing projects, P.I. driven, continued (15-minute talks, 10 minutes for questions)
Charlie Langmuir (LDEO)
Larry Mayer (UNH)
Andra Bobbitt (NOAA/PMEL)
Dawn Wright (OSU)/Judy Cushing (Evergreen State College)
Discussion 20 minutes, DL: Dawn Wright (OSU)
3:30 pm - 4:00 pm Break
4:00 pm - 5:00 pm Tools for data access and analysis (10-minute talks, 5 minutes for questions)
Presentations on models for data access.
Ted Habermann (NGDC)
Richard Lawrence (ESRI)
Steve Lerner (WHOI)
Mike Gourley (CARIS)
5:00 pm - 5:15 pm Summary of Day 1
6:00 pm Reception/Poster Session at SIO/IGPP Munk Conference Room.
DAY 2: TUESDAY, MAY 15
7:00 am Continental Breakfast
8:00 am - 8:15 am Plenary Session - Introduction to Day 2
8:15 am - 10:15 am Organizations with centralized databases (15-minute talks, 10 minutes for questions)
Presentations on large database efforts.
Frank Rack (ODP)
Tim Ahern (IRIS)
George Sharman (NGDC)
Stephen Miller (SIO)
Discussion 20 minutes, DL: Steve Cande (SIO)
10:15 am - 10:45 am Break
10:45 am - 12:20 pm Database components (15-minute talks, 10 minutes for questions)
Presentations on information technology issues.
Herb Kroehl (NOAA)
John Helly (UCSD)
Ben Domenico (UCAR)
Discussion 20 minutes, DL: Severin Gaudet (Canadian Astronomy Data Center)
12:20 pm - 1:30 pm Lunch
1:30 pm - 1:45 pm Plenary session - Define goals of the Working Groups
1:45 pm - 3:15 pm Working Groups
Break into multidisciplinary groups to address questions.
3:15 pm - 3:45 pm Break
3:45 pm - 5:15 pm Working Groups
5:15 pm - 5:30 pm Summary of Day 2
Evening: Dinner on your own. Tour of San Diego Supercomputer Center.

DAY 3: WEDNESDAY, MAY 16


7:00 am Continental Breakfast
8:15 am - 10:00 am Plenary Session
Working Group summaries; break into Working Groups if needed.
10:00 am - 10:30 am Break
10:30 am - 12:00 pm Plenary Session
Where do we go from here? List recommendations.
12:00 pm End of meeting


APPENDIX 3:
WORKSHOP EVALUATION

The following workshop evaluation consists of answers to two questions and a list of additional comments made by the participants. The evaluation was collected from the participants at the end of the workshop. Following the comments by the participants, pie diagrams of the session evaluation data are presented. There were 35 forms submitted (~45% of the participants). Not all participants answered each question.

QUESTION 1: Was there adequate time for each activity?

Yes - 28    No - 5

If you were of the opinion there was inadequate time, please explain.
- The working groups required much more time to discuss their issues. Also, the size (too large) prohibited focused discussions.
- More time for working groups.
- Too many issues that are unfamiliar to the majority of participants were brought up, and a final recommendation is premature. More meetings with focus groups seem required.
- The size of the meeting was too large; too many people. The time required is proportional to the square of the number of attendees (2x people need 4x time).
- The time for "tools for data access and analysis" was a little bit limited. It is understandable that the time allotted for commercial presentations was less than the other general presentations, but the time allotted did not allow for much interaction with the audience.

QUESTION 2: What single suggestion would you make to improve this workshop?
- Needed an example of a working data information system on the WWW (such as a land use system).
- Follow up with another one in a year or so.
- Would have been nice to have more input from funding agencies.
- Discussion was dominated by data providers. Clear visions of what the long-term goals "should" be often got lost. Long-term goals should have more user input, including the general nonscientific community.
- Mandates to working groups were somewhat vague and overlapping. Need to be more focused and carefully thought out.
- Present proposals prior to the workshop, maybe developed by very small groups.
- Reconvene at least once within 12-18 mo. after the proceedings and recommendations have been disseminated and reviewed by the NSF management and community.
- Ask NSF PMs to talk about NSF commitment to workshop objectives at the end of the workshop.
- Better fit of room to audience size.
- None.
- None.
- A room that would make it easier to see the presentations. However, the surroundings were pleasant and the location at the hotel was convenient.
- Provide a summary of existing workshop recommendations on database management in other fields.
- More UNOLS participation (especially since we generated "unfunded mandates").
- Invite international attendees like the French.
- Organize it so that it focuses on more specific recommendations, less general (?).
- A more focused group of experts from both the scientific and computer science communities should be gathered to improve progress, where domain experts in database management and science plan a detailed proposal to NSF.
- Small item, but a short description of agenda items would be useful.
- Handout of overheads/presentation slides.
- None.
- The sessions probably should have had a mandate to develop some themes or recommendations, and the session leaders could have been given the mandate to develop some consensus or themes as part of the session. These "results" could have been fed into the working groups to make them more productive.
- None.
- Abstracts and titles of talks available before the meeting.
- There should have been "read-ahead" material to inform participants about other database discussions and workshops that have already taken place under NSF sponsorship.
- I thought having the people from "outside" the MG&G community (esp., Cushing, Gaudet, Brovey) was a good idea. Perhaps a bit more input from the oil industry would have been good; they collect very similar data and face similar problems: serving up data, what media to store data on.
- More IT (Information Technology) specifics.
- It was a good balance of researchers and workers associated with DB systems. Job well done.
- Fewer, more select audience/participants, at the risk of compromised broadness, to achieve a higher degree of focus.
- More pre-meeting planning and distribution of material.
- Internet access at the meeting! Posters at the meeting!
- More info provided prior to the meeting.
- More focus on who and how this is going to make this happen.

ADDITIONAL COMMENTS

- Room should be laid out broad and shallow instead of long and deep.
- The follow-up workshop should emphasize more focused groups of users and providers by discipline on data type (e.g., MB, MCS, UW Video, etc.).
- Very well run workshop.
- My background is in C.S. and I enjoyed this conference very much!!
- Very informative; learned a lot of what is done and available.
- I hope that CARIS would be invited back.
- Thanks for the big effort to organize it.
- The meeting was exceptionally well choreographed, and no problems with the time lines.
- This was a useful fact-finding workshop, but the details on how the future will be mapped are not clear at the end of this session.
- Very useful workshop. Good to see consensus building throughout. More productive than many workshops.
- I think the workshop was very successful in gathering the experience and articulating the needs and concerns of the MG&G community. The key will be to craft recommendations that will lead to coherent actions.
- I thought Gaudet's talk on the data he worked with (the amount and flow) was good, as it put the amount of data our group is discussing into perspective. It gives me a sense that we should be able to organize the data that we have.
- An important and refreshing opportunity to rethink and reconsider NGDC/MGG's role and responsibilities to the community.
- Data catalog vs. database distinction is important. I would like to have seen more examples of working solutions such as the one that Peter Knoop presented.
SUMMATIVE EVALUATION DATA

[Pie charts, not reproduced here, show participant ratings of each workshop session on a five-point scale: Very valuable, Valuable, Average Value, Limited Value, Very little value. Sessions rated: Monday Morning: Data User Needs; Monday Morning: Large Programs; Monday Afternoon: Existing Projects; Monday Afternoon: Tools for Data Access and Analysis; Reception at IGPP; Poster Session/Demonstrations; Tuesday Morning: Organizations With Centralized Databases; Tuesday Morning: Database Components; Tuesday Afternoon: Working Groups; Tour of SDSC; Wednesday Morning: Summaries of Working Groups; Wednesday Morning: Wrap Up.]
APPENDIX 4: RELEVANT URLS
PARTICIPANT-PROVIDED WEB SITES

Institution Site Description

CARIS http://www.caris.com Marine software solutions, MB/SB/SSS processing, Hydrographic database
CARIS http://www.spatialcomponents.com Spatial Fusion web mapping
CIESIN http://sedac.ciesin.columbia.edu/gateway Distributed metadata catalog search tool
CIESIN http://sedac.ciesin.columbia.edu/plue/gpw Gridded population of the world
ESRI http://www.esri.com GIS, ArcInfo, ArcView, ArcIMS
IRIS http://iris.washington.edu IRIS Data Management Center
LDEO http://www.ldeo.columbia.edu/adgrav Antarctic digital gravity synthesis
LDEO http://www.ldeo.columbia.edu/cgif Coastal geophysics imaging facility
LDEO http://www.ldeo.columbia.edu/~dale/dataflow Discussion of bits to data
LDEO http://coast.ldeo.columbia.edu Ridge multibeam synthesis
LDEO http://petdb.ldeo.columbia.edu Ridge petrological database
LDEO http://www.ldeo.columbia.edu/SCICEX SCICEX, SCAMP Arctic mapping with US Navy submarines
Max-Planck http://georoc.mpch-mainz.gwdg.de Geochemical database for oceanic islands, island arcs, LIPs
Institut für
Chemie in Mainz
MBARI http://www.mbari.org/data/mapping MBARI multibeam and other data
NOAA/NGDC http://www.ngdc.noaa.gov/mgg/mggd.html National archives of underway, multibeam, and seafloor
sediment/rock data
NOAA http://www.pmel.noaa.gov/vents/data NOAA Vents program data gateway
NOAA http://www.pmel.noaa.gov/vents/acoustics.html Underwater Acoustic Monitoring
NOAA http://newport.pmel.noaa.gov/nemo/realtime/ NeMO Net Real-Time Monitoring & Data
ODP/LDEO http://www.ldeo.columbia.edu/BRG/ODP/DATABASE ODP downhole logging database
ODP http://www-odp.tamu.edu/database Janus database
Oregon St./ http://dusk.geo.orst.edu/vrv Virtual Research Vessel, MOR data access and online
U of Oregon http://www.cs.uoregon.edu/research/vrv-et computational environment
Oregon St http://dusk.geo.orst.edu/djl Davey Jones' Locker seafloor mapping and marine GIS
Oregon St http://buccaneer.geo.orst.edu/dawn/tonga Boomerang 8 cruise database (Tonga trench and forearc)
SAIC http://www.oe.saic.com Ocean Explorers public-domain dataset rapid visualization tool
SIO http://gdcmp1.ucsd.edu Geological data center cruise archives
SIO http://www.earthref.org Geochemical Earth Reference Model (GERM), seamount
catalogue
SIO http://sioseis.ucsd.edu/reflection_archive Seismic reflection data
SIO http://topex.ucsd.edu/marine_topo/mar_topo.html Global Topography
SOEST http://www.soest.hawaii.edu/STAG/data.html Marine data archives
SOEST http://www.soest.hawaii.edu/HMRG HMRG data archives
SOEST http://ahanemo2.whoi.edu/ AHA-NEMO 2 cruise database
UCAR http://www.unidata.ucar.edu Tools for accessing and visualizing real time data
U Michigan http://www.si.umich.edu/SPARC Space Physics and Aeronomy Research Collaboratory web-
based access to distributed databases
U Sydney http://www.es.usyd.edu.au/geology/centres/osi/ Australian seabed database
auseabed/au7_web.html
USGS http://walrus.wr.usgs.gov/infobank Coastal and marine metadatabank
USGS http://edc.cr.usgs.gov Land surface data, DEMs, satellite imagery
UTIG http://www.ig.utexas.edu/srws UTIG Seismic reflection data holdings
UW http://bromide.ocean.washington.edu/gis/ Endeavour Segment GIS
WHOI http://4dgeo.whoi.edu/virtualvan Jason Virtual Control Van
WHOI http://science.whoi.edu/kn16213 Recent cruise database for ROV Jason cruise to the Indian Ocean
WHOI http://drifor.whoi.edu/LuckyStrike96/ MAR Lucky Strike database
WHOI http://www.divediscover.whoi.edu Dive and Discover public cruise outreach
WHOI http://mbdata.whoi.edu/mbdata.html Multibeam data archives
RELATED SITES: ORGANIZATIONS AND DATABASES

Provided by Steve Miller (SIO) and Dawn Wright (OSU)
Acronym Institution Site Description
AGSO http://www.agso.gov.au/databases Multibeam surveys of Australian territorial waters;
http://www.agso.gov.au/marine/marine.html Marine and Coastal Data Directory; GEOMET metadatabase
ADEPT UCSB http://www.alexandria.ucsb.edu/ Alexandria Digital Earth Prototype (ADEPT)
DLESE DLESE http://www.dlese.org Digital Library for Earth System Education
http://www.dlese.org/Metadata
Geological Survey http://www.gsiseabed.ie Database and chart management for surveys of
of Ireland entire coastal zone
GEOMAR http://www.geomar.de/projekte/alle/ Cruise imagery and publications
expedition.html
GOMaP NRL/UW http://www.neptune.washington.edu/pub/ Global Ocean Mapping Program
documents/gomap_pilot.html
IFREMER http://www.ifremer.fr/sismer/sismer/ SISMER Oceanographic Data Center
serveura.htm
NBII NSF http://www.nbii.gov/ National Biological Information Infrastructure
NEEDS http://www.needs.org/ National Engineering Education Delivery System
http://www.synthesis.org/
NIWA/LINZ http://www.niwa.cri.nz/NIWA_research/ New Zealand territorial waters surveys and
coastal.html databases
NOAA http://www.csc.noaa.gov/opis NOAA Ocean GIS (Southeast U.S.)
NPACI NPACI http://www.npaci.edu/ National Partnership for Advanced
http://www.npaci.edu/About_NPACI/ Computational Infrastructure
index.html
NSDL NSF http://www.ehr.nsf.gov/due/programs/nsdl/ National SMETE (Science, Mathematics,
http://www.smete.org/nsdl/ Engineering and Technology Education)
Digital Library
OSU http://buccaneer.geo.orst.edu Oregon Coast Geospatial Clearinghouse
SOPAC SIO http://sopac.ucsd.edu SOPAC geodetic archives and GIS
THREDDS UCAR http://www.unidata.ucar.edu/projects/ Thematic Realtime Earth Data Distributed
THREDDS/Overview/Home.htm Servers
UCGIS UCGIS http://www.ucgis.org University Consortium for Geographic
Information Science
USGS/Microsoft http://www-nmd.usgs.gov/esic/esic.html "Digital Backyard" - USGS & Microsoft TerraServer
http://microsoft.terraserver.com
WOCE http://www-ocean.tamu.edu/WOCE/ World Ocean Circulation Experiment
uswoce.html
http://whpo.ucsd.edu
TOOLS FOR DATA ACCESS AND ANALYSIS
Provided by Stephen Miller (SIO)

Tool Organization Site Description

ArcGMT Oregon St http://dusk.geo.orst.edu/arcgmt GMT <--> GIS converter
ArcInfo ESRI http://www.esri.com GIS, Web GIS
ArcIMS
ArcView
CARIS CARIS http://www.caris.com Marine software solutions, MB/SB/SSS pro-
cessing, Hydrographic database
DICE SDSC www.npaci.edu/DICE/ Data intensive computing, technology
development and software toolkits
ERMapper Earth Resource www.ermapper.com Image mapping and manipulation
Mapping
FGDC FGDC http://www.fgdc.gov Federal Geographic Data Committee
(Metadata, Clearinghouses)
Fledermaus IVS http://www.ivs.unb.ca 3D visualization, QA
Geomedia Intergraph http://www.intergraph.com GIS, format translator, web server
Geoshare Geoshare http://www.geoshare.com Nonprofit corporation for managing exchange
of petroleum related data and software
GMT SOEST http://www.soest.hawaii.edu/gmt/ Generic Mapping Tools
Macromedia Macromedia http://www.macromedia.com Web content development
Dreamweaver
MATLAB MathWorks http://www.mathworks.com Modeling, display, analysis
MB-System MBARI http://www.mbari.org/~caress/MB-System_intro.html Multibeam seafloor mapping toolkit (public domain)
MrSID LizardTech www.lizardtech.com Multi-resolution Seamless Image Database
Open GIS Open GIS http://opengis.opengis.org/wmt/ GIS standards and techniques, Web Mapping
Consortium Testbed
Oracle Oracle http://www.oracle.com Database management, web serving
http://oai.oracle.com/pls/oai_site/
oai_site.home
SRB SDSC http://www.npaci.edu/online/v5.4srb118.html Storage Resource Broker
http://www.npaci.edu/DICE/SRB/
UNB/Ocean http://www.omg.unb.ca/omg/ Software tools for swath bathymetry and
Mapping Group sidescan sonar
UNH/Center for http://www.ccom-jhc.unh.edu/ Multibeam processing, statistical beampoint
Coastal and Ocean editing, GIS operability
Mapping
Photos on the back cover are courtesy of the SIO Archives (top
right) and Emerson Hiller Personal Collection, Woods Hole
Oceanographic Institution, Data Library and Archives (left middle,
bottom center).
September 2001
Editing and design by
Geosciences Professional Services, Inc.

Support for this workshop was provided
by the National Science Foundation
and Office of Naval Research.