1/16/08 Meeting Minutes
1) (10 min) Bill explains the current data handling pipeline
Each cruise produces four sources of data: ADCP, CTD, flowthrough, and water samples. The ADCP, CTD, and flowthrough data are "homogenized" and loaded into a database in semi-real time to support web applications. The "raw" data in their original file formats are uploaded to the CMOP servers for archival. The sample data are recorded on a spreadsheet (see Byron's proposed standard).
2) (30 min) The group comes up with answers to Murray's questions:
Who is keeping the Master List identifying what we consider to be “data”? Who is responsible for reporting to this Master List writer?
Charles Seaton will maintain the master list. The Chief Scientist is responsible for making sure the data is delivered to Charles, though most data will be captured and uploaded automatically by the telemetry system. The "electronic log book" that records sample information is the most important item to deliver.
Who is the Archiver of the data and metadata?
OHSU. Charles and Bill will keep two basic forms of data: "Raw" and "homogenized." Additionally, we will produce official "data releases" for each cruise.
Who has the responsibility of defining what metadata needs to accompany the data?
Metadata requirements come from at least three sources: users, NSF, and applications. Bill will gather requirements from applications and NSF policies. Charles will gather user requirements as people try to use the data.
Who has the responsibility for making sure the data and metadata get to the Archiver?
The Chief Scientist, though usually this will happen automatically.
What processing needs to be done to the data before giving it to the Archiver? Note that real-time data and archived data need to be distinguished, and there should be different processing guidelines for each type.
The archived data will be grouped into "Data Releases," each associated with an explicit and permanent metadata record that includes a description of the processing that was performed.
Who is responsible for ensuring that the data and metadata are consistent with our needs and NSF requirements for archiving in a national data bank?
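See the earlier answer on who defines the metadata requirements: Bill covers applications and NSF policies, and Charles covers user requirements.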
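
For illustration only, here is a minimal sketch of what a "Data Release" metadata record with an attached processing description might look like. Nothing here was decided at the meeting: the class names, field names, cruise identifier, and example values are all hypothetical, and the real record format is still to be defined from the requirements Bill and Charles gather.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class ProcessingStep:
    """One processing action applied before release (hypothetical fields)."""
    name: str
    description: str


@dataclass(frozen=True)
class DataRelease:
    """A permanent metadata record for one cruise release (hypothetical fields).

    frozen=True blocks attribute reassignment, mirroring the idea that the
    record is explicit and permanent once the release is made.
    """
    cruise_id: str
    release_date: date
    sources: tuple
    processing: tuple

    def to_json(self) -> str:
        # asdict() recurses into the nested ProcessingStep entries.
        record = asdict(self)
        # date objects are not JSON serializable, so store an ISO string.
        record["release_date"] = self.release_date.isoformat()
        return json.dumps(record, indent=2, sort_keys=True)


# Example values are made up for illustration.
release = DataRelease(
    cruise_id="example-cruise",
    release_date=date(2008, 1, 16),
    sources=("ADCP", "CTD", "flowthrough", "water samples"),
    processing=(
        ProcessingStep(
            name="homogenize",
            description="Converted raw instrument files to a common format.",
        ),
    ),
)

print(release.to_json())
```

Keeping the processing description inside the record itself means a release can be interpreted later without tracking down whoever produced it.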
