Many federal agencies, including the National Institutes of Health (NIH) and, most recently, the National Science Foundation (NSF), now require that grant applications for projects involving data collection contain a data management plan. Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled “Data Management Plan” (DMP). This document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. According to the NSF Grant Proposal Guide, the DMP will be reviewed as an integral part of the proposal. Proposals that do not include a DMP cannot be submitted.
Elements of a Good Data Management Plan include:
- Data Description: Brief, high-level description of the information to be gathered; the nature, scope, and scale of the data that will be generated or collected.
- Content and Format: Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats.
- Access and Sharing: Indicate how you intend to archive and share your data and why you have chosen that particular option. Include a description of, and rationale for, any restrictions on who may access the data and under what conditions, along with a timeline for providing access. Also describe the resources and capabilities (equipment, connections, systems, expertise, repositories, etc.) needed to meet anticipated requests. These resources and capabilities should be appropriate for the projected usage and should address any special requirements, such as those associated with streaming video or audio or the movement of massive data sets.
- Metadata Content and Format: Statement of plans for metadata content and format, including a description of documentation plans and the rationale for the selection of appropriate standards. Existing, accepted standards should be used where possible. Where standards are missing or inadequate, alternate strategies for enabling data re-use and re-purposing should be described.
- Intellectual Property Rights Protection: Statement of plans, where appropriate and necessary, for protection of privacy, confidentiality, security, intellectual property, and other rights.
- Security: A description of technical and procedural protections for information, including confidential information, and of how permissions, restrictions, and embargoes will be enforced.
- Selection and Retention Periods: A description of how data will be selected for archiving, how long the data will be held, and plans for the eventual termination of the data collection.
- Archiving and Preservation: Description of plans for preserving data in accessible form. Plans should include a timeline proposing how long the data are to be preserved, outline any changes in access anticipated during the preservation timeline, and document the resources and capabilities (e.g., equipment, connections, systems, expertise) needed to meet the preservation goals. Where data will be preserved beyond the duration of direct project funding, describe the other funding sources or institutional commitments necessary to achieve the long-term preservation and access goals.
- Storage and Backup: Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data.
- Responsibility: Names of the individuals responsible for data management in the research project. This is particularly important when working with multiple PIs and/or collaborative partners.
- Budget: The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included, depending on the agency (e.g., see NSF guidance).
Marshall University Data Management Information
Marshall University provides a Central Data Center (MU Datacenter) on its main campus in Huntington, WV, in support of administrative, instructional, and research computing. The data center is supplied by a power distribution system with UPS and generator facilities for continuous operation, and it is cooled by a redundant, independent cooling system. Physical security is provided by card access control and video security monitoring, as well as by individually locked cabinets that secure host servers and storage for independent projects.
The Data Center hosts switched gigabit and ten gigabit server connections as part of a dedicated network secured from the campus network with Cisco firewalls. Data transfers can be secured by VPN, SSL, and SSH. The MUNet campus network has over 11,000 switched gigabit network connections and a ten gigabit backbone. MUNet is connected to the commodity Internet by redundant carriers with diverse paths providing 1.6Gb of commodity Internet service to the campus. The campus also has a 1Gb Internet2 connection linked to OARnet and Internet2.
The Data Center currently has a single HPC cluster with over 1 TFLOPS of compute capacity and extends these services through other Internet2-connected resources such as the TeraGrid. Storage is provided by Dell/EMC CLARiiON Fibre Channel SANs and by both 1Gb and 10Gb iSCSI Dell/EqualLogic SANs. Backup is provided by an ADIC tape robot with off-site storage. Backup services are being migrated to remote-site disk-to-disk backup during calendar year 2011.
A research portal for data sharing and collaboration is currently in the pilot phase and is based on HUBzero. Storage and compute services are charged back to all units on campus, including research projects, according to a published IT Rate Schedule. The university Information Technology Council provides a link to the university IT Policies (privacy, confidentiality, security, intellectual property rights (copyright), etc.).
Example Data Management Plans
NSF Data Management Plan Templates and Examples
When preparing the Data Management Plan (DMP) for your NSF grant application, remember to check for additional directorate guidelines. The following templates, examples, and tools may help:
- University of New Mexico’s Data Management Plan Examples
- Rice University’s Data Management Plan Examples for biosciences and social and behavioral sciences.
- Yale University’s Data Management Plan Examples
- DataONE (Data Observation Network for Earth) data management plan outline and example for environmental scientists.
- University of Delaware NIH Example
- University of Delaware NSF Example
- Directorate-specific templates for NSF data management plans from the University of Virginia Library Scientific Data Consulting Group. These are very useful, but remember that they are tailored to the UVa community.
- Integrated Earth Data Applications (IEDA) Data Management Tool is an online form you can fill out to help generate your data management plan. The form is for the earth sciences.
- Data Conservancy recognizes the need for institutional and community solutions to digital research data collection, curation and preservation challenges. DC tools and services incentivize scientists and researchers to participate in these data curation efforts by adding value to existing data and allowing the full potential of data integration and discovery to be realized.