11/1/12 Georgetown Meeting

From SURA Research Data Management

Notes From 11/1 IT Committee Meeting

What Did We Hear (Learn) From Agency Presentations?

  • Data management is still a key issue/challenge
  • Agencies have different requirements and are at different maturity levels
  • Conflicts exist between sharing data and maintaining IP/ownership
  • Tension between use of data within a domain and use across domains
  • Is data sharing a subset of data management and where do we focus first?
  • Data sharing is about further use (beyond original research) – validation, alternate use…
  • Management of data must be flexible enough to accommodate the full data lifecycle (different steps may have different policies and different responsible parties)
  • How do you maintain the integrity of the data being maintained?
  • Standards are key – local, regional, national, international standards are needed to make data sharable
  • What do you do with data sets that are too large to move? (compute in place).
  • What is the role of NSF in funding the working elements of a data management/sharing infrastructure? What is the role of the institutions? (see NSF DIBBs)
  • How can past NSF projects be mined for RDM solutions? (need for a directory of data management and sharing plans?)
  • Need a clear definition of responsibilities for various elements of data lifecycle.
  • Need to define the difference between data management and sharing and where metadata falls in that dichotomy. Most of the presenters this morning focused on the "management" portion of the problem but skirted the metadata and sharing components.
  • Not enough focus on sharing of data (is this where we can add value?)

Next Steps:

Involvement in DataWay Program and Charrette

  • Can SURA offer NSF a subset of the national community to identify mechanisms/models for aggregating community input for DataWay? (Is this a DataWay white paper topic?)
    • Researcher focus?
    • Institutional focus?
  • Implement our pilot (see below) – use DataWay to improve
  • How to sustain? (NASA spends $100M/yr, which represents 10% recurring for the life of the data.) This must be budgeted for new projects.
  • What are the grand challenges for DataWay?

Respond to call for DataWay White Papers – possible RDM Focus Areas:

  • Access and discoverability of distributed data sources including rights management (privacy, copyrights, compliance)
  • Governance and economic models for sustainable curation including distribution of effort between local through global communities
  • Communities, methods and processes for the definition of metadata and ontologies (vocabulary) (link to pilot)
  • Models for building multi-institutional, cross disciplinary infrastructure to improve the management of research data
  • Identification of logical / potential division between institution / regional / national responsibilities for various DataWay (data life cycle) components (link to SURA collaboration)
  • Data Management vs. Data Sharing (Retention vs. Access)
    • What responsibility does the institution have to maintain data over time (how long?) and how is this funded?
    • Who is responsible for making the data sharable and to how large an audience (research collaborators vs. everyone)?
    • Sharing requires management (you can’t share data if you haven’t collected and catalogued it)

Identify a focused project to make progress in a specific area

Create a metadata tool/repository and federate across institutions (check for existing ones)

  • Simple metadata standard and tool (Data version of MARK tool)
  • Review PURR (Purdue University Research Repository), DataVerse and Hopkins models for elements that could be added to SBSG or knitted together into a more comprehensive resource.
  • Auto metadata extraction (see OAI-PMH)
  • Review COAR Document: The Current State of Open Access Repository Interoperability
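
As a rough illustration of the OAI-PMH item above, the sketch below shows how a federated metadata tool might request and read records from an institutional repository. It is only a sketch under stated assumptions: the repository URL and the sample response are hypothetical, the record fields (identifier, titles, creators) are a guess at what a "simple" metadata standard could contain, and the helper names are ours. It relies only on the standard OAI-PMH ListRecords verb with the oai_dc (Dublin Core) metadata prefix.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text, source):
    """Extract a small, flat metadata record from each OAI-PMH <record>.

    `source` is a hypothetical label for the contributing institution,
    so that merged records stay traceable after federation.
    """
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        records.append({
            "source": source,
            "identifier": identifier,
            "titles": [e.text for e in record.iter(f"{{{DC_NS}}}title")],
            "creators": [e.text for e in record.iter(f"{{{DC_NS}}}creator")],
        })
    return records


def federate(responses):
    """Merge parsed records from several institutions into one list.

    `responses` maps an institution label to its raw ListRecords XML.
    """
    merged = []
    for source, xml_text in responses.items():
        merged.extend(parse_records(xml_text, source))
    return merged


# Hypothetical response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.edu:dataset-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Example Researcher</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repo.example.edu/oai"))
    for rec in federate({"Example University": SAMPLE_RESPONSE}):
        print(rec)
```

A real harvester would also need to fetch the URL over HTTP and follow OAI-PMH resumption tokens for large result sets; both are left out here to keep the sketch short.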

Continued development of Best Practices Doc (SBSG)

  • Encourage and document use of SBSG
  • Institutional case studies of use of SBSG
    • Who is using it? How is it being used?
    • How can it be improved? Operational review?
  • How are researchers being engaged in its use?
  • What are the outcomes? DMP is input.
  • Can/should SBSG be tailored for specific funding agency requirements?


UNC Research Data Stewardship Report