From 839f5c0d1dd669f4108991fc405ba67a3eafc652 Mon Sep 17 00:00:00 2001 From: Collin Schwantes Date: Wed, 7 Dec 2022 16:47:10 -0500 Subject: [PATCH 1/5] added resources to dmp section --- data.Rmd | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/data.Rmd b/data.Rmd index 8eb6671..64deca7 100644 --- a/data.Rmd +++ b/data.Rmd @@ -115,6 +115,11 @@ Data management activities, but not necessarily infrastructure, are an allowable ## Learn - Watch M3 on [Data Management Plans](https://airtable.com/appwlxIzmQx5njRtQ/tbledVCO9MRKkK9MW/viwfFq11zdwCbBT83/recNVSuG2ApgfYkbl?blocks=hide) - Read California Digital Library guidance on [Data Management Plans](https://dmptool.org/general_guidance) +- [Data Management Plan Skill Building](https://dataoneorg.github.io/Education/bp_step/plan/) from DataOne +- [NIH Data Sharing Guidance](https://sharing.nih.gov/data-management-and-sharing-policy) + - [NIH Data Sharing learning Resources](https://sharing.nih.gov/about/learning) + - [Condensed NIH DMSP Guidance Resources](https://osf.io/uadxr/) +- [NSF Bio DMP Guidance](https://www.nsf.gov/bio/biodmp.jsp) - Read Hadley Wickham's [tidy data paper](http://vita.had.co.nz/papers/tidy-data.pdf) for the general concept. Note the *packages* in this paper are out of date, but the structures and From eaada46d083ab45588d732f22545d8d4e184cfc8 Mon Sep 17 00:00:00 2001 From: Collin Schwantes Date: Wed, 7 Dec 2022 17:23:12 -0500 Subject: [PATCH 2/5] added section on navigating expectations --- data.Rmd | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/data.Rmd b/data.Rmd index 64deca7..a609338 100644 --- a/data.Rmd +++ b/data.Rmd @@ -21,11 +21,16 @@ structuring data makes interoperability between tools easier. ## Data Management Plan -*Data Management Plans* , also called *Outputs Management Plans* or *Data Management and Sharing Plans*, are living documents that help structure the creation and management of data throughout the lifecycle of a project. 
DMPs are flexible and do not force researchers to choose a particular technology set but rather ask probing questions about the mechanics and ethics of data use in research projects. Organizing data management in this way provides a common framework to think about data without requiring specific technologies be used in the research workflow. Furthermore, DMPs use reliable identifiers (URIs) to connect components of the research workflow, making long term data access more reliable. +*Data Management Plans* , also called *Outputs Management Plans* or *Data Management and Sharing Plans*, are living documents that help structure the creation and management of data throughout the lifecycle of a project. DMPs are flexible and do not force researchers to choose a particular technology set but rather ask probing questions about the mechanics and ethics of data use in research projects. Organizing data management in this way provides a common framework to think about data without requiring specific technologies be used in the research workflow. Furthermore, DMPs use stable identifiers (URIs) to connect components of the research workflow, making long term data access more reliable. + +The majority of funders require a DMP; however, each funder has specific expectations +about what, when, and how research outputs should be shared. It is important you +and your collaborators understand those expectations. ![](assets/data_mgmg_plan.png) *Data management plan as hub in knowledge management system* +**Important note on budgeting**: Data management activities, but not necessarily infrastructure, are an allowable cost for most funding agencies (NIH, NSF, NASA). Gray areas include paying for hosting services and other infrastructure-like components of the DMP. **Benefits of using a DMP**: @@ -54,12 +59,14 @@ Data management activities, but not necessarily infrastructure, are an allowable 2. Data Management Plans are living documents that change with a project 3. 
DMPs are created collaboratively and stored in DMPTool.org 4. We ensure our DMPs meet EHA best practices for FAIR data and Reproducible Science +5. Collaborators, especially those from outside institutions, are full participants in the DMP process ### DMP Process Overview 0. [Create an account](https://dmptool.org/quick_start_guide) on DMPTool.org associated with EcoHealth Alliance 1. Identify Funder DMP requirements and `r params$data_librarian_appt` with the `r params$data_librarian` -2. Create a DMP using appropriate template given your funder. If no template is available or the funder has no requirements, use the EHA Minimal Data Management Plan. Add collaborators and complete as much of the plan as you can +2. Create a DMP using appropriate template given your funder. If no template is available or the funder has no requirements, use the EHA Minimal Data Management Plan. Add collaborators and complete as much of the plan as you can +3. Principal Investigators and Project Partners explicitly agree to abide by the DMP. All collaborators should fully understand and agree with the data sharing components of the plan before approving it. 3. Request feedback from the `r params$data_librarian` 4. Work with the `r params$data_librarian` to incorporate feedback 5. Export DMP for inclusion in grant @@ -68,13 +75,16 @@ Data management activities, but not necessarily infrastructure, are an allowable **Proposal/Pre-Award Phase** -- Look for and use Funder Requirements for DMPs. If no template exists, use this one or create one based on funder requirements. +- Look for funder requirements and use funder-specific templates for DMPs. If no template exists, use the EHA Minimal Data Management Plan or create one based on funder requirements. - Think about how you might make data Findable, Accessible, Interoperable and Reusable (FAIR) +- Establish expectations for data sharing and outputs with collaborators and PIs. 
- Consider what tools you will use throughout the lifecycle of your data - Consider how data collection, analysis and management tasks will be divided among collaborators - Outline the ethical considerations for properly managing data in your project +- Ensure collaborators and PIs understand the commitments they are making via the DMP. Request and incorporate feedback from collaborators. - `r params$data_librarian_appt` with the Data Librarian, create a timeline for proposal submission, and have a notion of tools and standards to use + **Post-Award/Early Phase** - Review and update proposal DMP From 9090d50bbe6d7d08ecd5d04dfe463708c8c61135 Mon Sep 17 00:00:00 2001 From: Collin Schwantes Date: Thu, 8 Dec 2022 08:59:10 -0500 Subject: [PATCH 3/5] Update data.Rmd Co-authored-by: Noam Ross --- data.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data.Rmd b/data.Rmd index a609338..ee6b890 100644 --- a/data.Rmd +++ b/data.Rmd @@ -77,7 +77,7 @@ Data management activities, but not necessarily infrastructure, are an allowable - Look for funder requirements and use funder-specific templates for DMPs. If no template exists, use the EHA Minimal Data Management Plan or create one based on funder requirements. - Think about how you might make data Findable, Accessible, Interoperable and Reusable (FAIR) -- Establish expectations for data sharing and outputs with collaborators and PIs. +- Establish expectations for data sharing and outputs with collaborators and PIs. These discussions should begin early, at the same time as discussing project responsibilities and budget. 
- Consider what tools you will use throughout the lifecycle of your data  - Consider how data collection, analysis and management tasks will be divided among collaborators - Outline the ethical considerations for properly managing data in your project From 6c291acc7b3360e45da98611c9d305e695775a9a Mon Sep 17 00:00:00 2001 From: Collin Schwantes Date: Mon, 12 Dec 2022 13:41:35 -0600 Subject: [PATCH 4/5] added an intro paragraph and shifted some materials --- data.Rmd | 82 ++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 50 insertions(+), 32 deletions(-) diff --git a/data.Rmd b/data.Rmd index a609338..1004820 100644 --- a/data.Rmd +++ b/data.Rmd @@ -1,23 +1,14 @@ # Data Management -*Can the data be shared and published, and easily re-used in other analyses*? - -- Create and maintain a [data management plan](https://dmptool.org/plans) -- Store data in simple, cross-compatible formats such as CSV files. -- Microsoft Excel can be a useful tool for data entry and organization, but - limit its use to that, and organize your data in a way that can be easily - exported. -- Metadata! Metadata! Document your data. -- For relational datasets you can create linked data on [Airtable](https://airtable.com/). For more information see \@ref(airtable) -- For data sets that cross multiple projects, create data-only project folders - for the master version. When these data sets are finalized, they can be - deposited in public or private data repositories such as - [figshare](https://figshare.com/) and [zenodo](https://zenodo.org/). In some - cases it makes sense for us to create data-only R packages for easily - distributing data internally and externally. +EcoHealth Alliance is committed to producing and promoting reliable and +reproducible research. In order to achieve this, we have to provide data +(and other research outputs) that non-team members can interpret and use; as well +as promote best practices for data management among collaborators. 
Ideally, the +framework for managing data laid out in this chapter will facilitate the creation +of high-quality, shareable research outputs. By focusing on [Data Management +Plans](https://datamanagement.hms.harvard.edu/plan-design/data-management-plans) and the [dmptool](https://dmptool.org/plans), we can build on well +established workflows for producing high-quality research outputs. -We aim to generally work in a **tidy data** framework. This approach to -structuring data makes interoperability between tools easier. ## Data Management Plan @@ -25,7 +16,11 @@ structuring data makes interoperability between tools easier. The majority of funders require a DMP; however, each funder has specific expectations about what, when, and how research outputs should be shared. It is important you -and your collaborators understand those expectations. +and your collaborators understand those expectations before submitting a DMP. It's +equally important that all collaborators understand and agree to the obligations +created when submitting a DMP. Early communication between collaborators +is key to navigating differing expectations about data sharing from researchers +in different contexts. ![](assets/data_mgmg_plan.png) *Data management plan as hub in knowledge management system* @@ -35,14 +30,14 @@ Data management activities, but not necessarily infrastructure, are an allowable **Benefits of using a DMP**: -1. They are a funder requirement and you want funding - - NIH, NSF, NASA, Wellcome Trust, etc. require a DMP be submitted with a proposal. -2. They provide a scaffold for you to conceptualize data management for your project +1. They provide a scaffold for you to conceptualize data management for your project - What data do you need to answer your research question, where will it come from, what resources are needed throughout the project lifecycle, what are the mechanics of managing the data? -3. They make it easier collaborate +2. 
They make it easier to collaborate - Defining responsibilities, Committing to using data standards, Documenting how the project works -4. They make it easier for your data to be reused +3. They make it easier for your data to be reused - You get more citations, your effort contributes to knowledge creation in unexpected ways, your results become more reproducible +4. They are a funder requirement and you want funding + - NIH, NSF, NASA, Wellcome Trust, etc. require a DMP be submitted with a proposal. **Components of a DMP**: @@ -61,15 +56,6 @@ Data management activities, but not necessarily infrastructure, are an allowable 4. We ensure our DMPs meet EHA best practices for FAIR data and Reproducible Science 5. Collaborators, especially those from outside institutions, are full participants in the DMP process -### DMP Process Overview - -0. [Create an account](https://dmptool.org/quick_start_guide) on DMPTool.org associated with EcoHealth Alliance -1. Identify Funder DMP requirements and `r params$data_librarian_appt` with the `r params$data_librarian` -2. Create a DMP using appropriate template given your funder. If no template is available or the funder has no requirements, use the EHA Minimal Data Management Plan. Add collaborators and complete as much of the plan as you can -3. Principal Investigators and Project Partners explicitly agree to abide by the DMP. All collaborators should fully understand and agree with the data sharing components of the plan before approving it. -3. Request feedback from the `r params$data_librarian` -4. Work with the `r params$data_librarian` to incorporate feedback -5. Export DMP for inclusion in grant ### Expectations by project phase @@ -122,6 +108,38 @@ Data management activities, but not necessarily infrastructure, are an allowable - Use EHA institutional tags where possible e.g. 
[Zenodo Community](https://zenodo.org/communities/ecohealthalliance/?page=1&size=20) - `r params$data_librarian_appt` with the `r params$data_librarian` +### DMP Process Overview + +0. [Create an account](https://dmptool.org/quick_start_guide) on DMPTool.org associated with EcoHealth Alliance +1. Identify Funder DMP requirements and `r params$data_librarian_appt` with the `r params$data_librarian` +2. Create a DMP using appropriate template given your funder. If no template is available or the funder has no requirements, use the EHA Minimal Data Management Plan. Add collaborators and complete as much of the plan as you can +3. Principal Investigators and Project Partners explicitly agree to abide by the DMP. All collaborators should fully understand and agree with the data sharing components of the plan before approving it. +3. Request feedback from the `r params$data_librarian` +4. Work with the `r params$data_librarian` to incorporate feedback +5. Export DMP for inclusion in grant + +## Notes on data management +*Can the data be shared and published, and easily re-used in other analyses*? + +- Create and maintain a [data management plan](https://dmptool.org/plans) +- Store data in simple, interoperable formats such as CSV files. +- Microsoft Excel can be a useful tool for data entry and organization, but + limit its use to that, and organize your data in a way that can be easily + exported. +- Metadata! Metadata! Document your data. +- For relational datasets you can create linked data on [Airtable](https://airtable.com/). For more information see \@ref(airtable) +- For data sets that cross multiple projects, create data-only project folders + for the master version. When these data sets are finalized, they can be + deposited in public or private data repositories such as + [figshare](https://figshare.com/) and [zenodo](https://zenodo.org/). 
In some + cases it makes sense for us to create data-only R packages for easily + distributing data internally and externally. + +We aim to generally work in a **tidy data** framework. This approach to +structuring data makes interoperability between tools easier. + + + ## Learn - Watch M3 on [Data Management Plans](https://airtable.com/appwlxIzmQx5njRtQ/tbledVCO9MRKkK9MW/viwfFq11zdwCbBT83/recNVSuG2ApgfYkbl?blocks=hide) - Read California Digital Library guidance on [Data Management Plans](https://dmptool.org/general_guidance) From 19d60180c64a123d25c4bc95a59e844cc916ea12 Mon Sep 17 00:00:00 2001 From: Collin Schwantes Date: Mon, 12 Dec 2022 14:26:57 -0600 Subject: [PATCH 5/5] added link to fair principles and updated section title --- data.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/data.Rmd b/data.Rmd index b81fb34..44dc412 100644 --- a/data.Rmd +++ b/data.Rmd @@ -53,7 +53,7 @@ Data management activities, but not necessarily infrastructure, are an allowable 1. It's never too late to write a DMP 2. Data Management Plans are living documents that change with a project 3. DMPs are created collaboratively and stored in DMPTool.org -4. We ensure our DMPs meet EHA best practices for FAIR data and Reproducible Science +4. We ensure our DMPs meet EHA best practices for [FAIR data](https://www.go-fair.org/fair-principles/) and Reproducible Science 5. Collaborators, especially those from outside institutions, are full participants in the DMP process @@ -108,7 +108,7 @@ Data management activities, but not necessarily infrastructure, are an allowable - Use EHA institutional tags where possible e.g. [Zenodo Community](https://zenodo.org/communities/ecohealthalliance/?page=1&size=20) - `r params$data_librarian_appt` with the `r params$data_librarian` -### DMP Process Overview +### Using DMPTool to prepare your proposal data management plan 0. 
[Create an account](https://dmptool.org/quick_start_guide) on DMPTool.org associated with EcoHealth Alliance 1. Identify Funder DMP requirements and `r params$data_librarian_appt` with the `r params$data_librarian`