Our data stewardship service provides comprehensive and tailored support to FBM UNIL-CHUV researchers, helping you manage, share, and preserve your research data effectively and in accordance with best practices and policies.
Data Management Plan (DMP) preparation and review
Our service assists researchers in creating effective DMPs for SNSF and European grant applications, as well as for data storage purposes. A DMP helps you outline how you will collect, manage, and share your data throughout the research process.
Details
A DMP is a crucial document in research projects that outlines how data will be managed throughout its entire life cycle. The aim of a DMP is to provide a structured approach to ensure that data is effectively collected, processed, stored, shared, and preserved in a way that promotes data quality, accessibility, and long-term usability. By creating and following a well-structured Data Management Plan, researchers can enhance the quality of their research, facilitate collaboration, comply with funding agency requirements, and ensure the long-term value and accessibility of their data.
Key components of a Data Management Plan typically include:
Data Description: A detailed description of the data to be collected or generated, including its format, structure, and potential volume.
Data Collection: Information about how the data will be collected, including methodologies, instruments, and tools.
Data Documentation: Plans for documenting the data, such as metadata standards, data dictionaries, and annotations, to ensure that others can understand and use the data.
Data Organization and Storage: Details about how the data will be organized, named, and stored during the project. This may involve considerations of file formats, folder structures, and storage locations.
Data Sharing and Access: Plans for making the data accessible to others, which might involve repositories, embargo periods, access controls, and licensing arrangements.
Data Preservation and Archiving: Strategies for preserving the data beyond the project’s completion, including considerations of data formats, storage options, and potential repositories or archives.
Data Security and Ethics: Measures to ensure data security and ethical handling, such as anonymization, encryption, and compliance with relevant regulations or standards.
Roles and Responsibilities: Clearly defined roles and responsibilities for individuals involved in data management, including researchers, collaborators, and data stewards.
Budget and Resources: Allocation of resources, both financial and human, needed for effective data management throughout the project.
Data Disposal: Plans for the secure disposal or retention of data, taking into account legal and ethical considerations.
Data Management Training: Details about any training that will be provided to researchers to ensure they understand and follow proper data management practices.
Practical courses on these aspects are provided by our service on request.
Visit our dedicated webpage for more information on our tools
FAIR Data Compatibility
Our service guides researchers on making their data FAIR (Findable, Accessible, Interoperable, Reusable) compatible. This involves ensuring that metadata is comprehensive and standardized, using open formats and standards, and enhancing data accessibility.
Details
FAIR data principles
One of the grand challenges of data-intensive science is to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows. Force11 describes FAIR as a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. The term FAIR was coined at a Lorentz workshop in 2014, and the resulting FAIR principles were published in 2016 (link).
To be Findable:
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.
To be Accessible:
A1 (meta)data are retrievable by their identifier using a standardized communications protocol.
A1.1 the protocol is open, free, and universally implementable.
A1.2 the protocol allows for an authentication and authorization procedure, where necessary.
A2 metadata are accessible, even when the data are no longer available.
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.
To be Re-usable:
R1. (meta)data have a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.
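The principles above put strong emphasis on rich, machine-readable metadata. As a rough illustration only, the sketch below builds a minimal metadata record in Python; the field names loosely follow the DataCite schema and every value (DOI, names, keywords) is a placeholder, not a real dataset.

```python
import json

# Hypothetical minimal metadata record; field names loosely follow the
# DataCite schema. Adapt to your discipline's own metadata standard.
record = {
    "identifier": {"value": "10.5281/zenodo.0000000", "type": "DOI"},  # F1: persistent identifier (placeholder)
    "title": "Example microscopy dataset",                             # F2: rich, descriptive metadata
    "creators": [{"name": "Doe, Jane", "affiliation": "FBM UNIL"}],
    "publicationYear": 2024,
    "subjects": ["cell biology", "fluorescence microscopy"],           # keywords support findability (F3)
    "rights": "CC-BY-4.0",                                             # R1.1: clear usage license
    "relatedIdentifiers": [                                            # I3: qualified references to other (meta)data
        {"relation": "IsSupplementTo", "value": "10.1038/sdata.2016.18"}
    ],
}

# Serializing to JSON gives a formal, shared, machine-readable representation (I1).
print(json.dumps(record, indent=2))
```

Registering such a record in a searchable repository (F3) and keeping it available even if the data itself is withdrawn (A2) completes the picture.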
FAIR Data Sharing
Research data and metadata are made available in a format that adheres to standards, making them both human and machine-readable, in line with principles of good data governance and management, following FAIR principles (Findable, Accessible, Interoperable, and Reusable). It’s important to note that FAIR does not necessarily imply open accessibility, and sharing can occur in restricted or contractual forms if needed. However, metadata should be made as openly available as possible.
SNSF Explanation of the FAIR Data Principles (PDF) (link)
Wilkinson et al. (2016), The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3, doi:10.1038/sdata.2016.18 (link)
Metadata Standards and Readme file preparation
Our service is knowledgeable about metadata standards and README files for datasets, helping you document your data for effective sharing and future use.
Our service will assist you in easily creating documentation for your dataset in the form of a README file, thanks to our tool DataSquid.
Retrospective documentation: If experiments were conducted before information was added to DataSquid, the tool generates a simplified README file to document the past experiments efficiently.
Proactive documentation: README files and metadata automatically created for new acquisitions or analyses are combined into the README file for LTS documentation.
Details
Metadata
Metadata and README files are essential for a complete understanding of the research data content and to allow other researchers to find and reuse your data.
Metadata should be as complete as possible, using the standards and conventions of a discipline, and should be machine readable. Metadata should always accompany a dataset, no matter where it is stored.
Readme file
A README file is a crucial component of dataset documentation, providing essential information to ensure that the data can be accurately interpreted and utilized by others, as well as by yourself in the future. It serves to enhance the usability, reproducibility, and transparency of your dataset.
Key Elements to Include in a Data README File:
General Information:
Dataset Title: Provide a clear and descriptive title for your dataset.
Author Information: List the names, affiliations, and contact details of the principal investigator and any co-investigators.
Date of Data Collection: Specify the dates when the data was collected.
Geographical Location: If applicable, mention where the data was collected.
Keywords: Include relevant keywords to describe the data’s subject matter.
Data and File Overview:
File Descriptions: Provide a brief description of each file, including its format and purpose.
File Structure: Explain the organization of the files and any relationships between them.
File Naming Conventions: Describe the naming conventions used for files and directories.
Methodological Information:
Data Collection Methods: Detail the procedures and instruments used to collect the data.
Data Processing: Explain any processing or transformation applied to the data.
Quality Assurance: Describe steps taken to ensure data quality and integrity.
Data-Specific Information:
Variable Definitions: Define all variables, including units of measurement and possible codes.
Missing Data: Specify how missing data is represented in the dataset.
Data Formats: Indicate any specialized formats or abbreviations used.
Sharing and Access Information:
Licenses or Restrictions: State any licenses or restrictions associated with the data.
Related Publications: Provide references to publications that use or are related to the data.
Citation Information: Offer a recommended citation for the dataset.
Best Practices:
File Format: Write the README as a plain text file (e.g., README.txt) to ensure accessibility and longevity.
Standardized Formatting: Use consistent formatting and terminology throughout the README.
Clarity and Detail: Provide sufficient detail to allow others to understand and use the data without additional assistance.
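The key elements above can be turned into a reusable skeleton. The short Python sketch below writes a plain-text README.txt template covering those sections; the section names mirror the list above, and all values are left as blank placeholders to be filled in by the researcher.

```python
from datetime import date

# Plain-text README skeleton covering the key elements listed above.
# All field labels are prompts to fill in, not real project values.
SECTIONS = {
    "GENERAL INFORMATION": [
        "Dataset title:", "Authors (name, affiliation, contact):",
        "Date(s) of data collection:", "Geographic location:", "Keywords:",
    ],
    "DATA AND FILE OVERVIEW": [
        "File list and descriptions:", "File structure:", "Naming conventions:",
    ],
    "METHODOLOGICAL INFORMATION": [
        "Data collection methods:", "Data processing steps:", "Quality assurance:",
    ],
    "DATA-SPECIFIC INFORMATION": [
        "Variable definitions and units:", "Missing data codes:", "Specialized formats:",
    ],
    "SHARING AND ACCESS INFORMATION": [
        "License / restrictions:", "Related publications:", "Recommended citation:",
    ],
}

def readme_skeleton() -> str:
    """Assemble the skeleton as one plain-text string."""
    lines = [f"README (template generated {date.today().isoformat()})", ""]
    for heading, fields in SECTIONS.items():
        lines.append(heading)
        lines.extend(f"  {field}" for field in fields)
        lines.append("")
    return "\n".join(lines)

# Write as plain text (README.txt) for accessibility and longevity.
with open("README.txt", "w", encoding="utf-8") as fh:
    fh.write(readme_skeleton())
```

Keeping the skeleton in plain text follows the best practice above: it stays readable on any system, now and in the future.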
For comprehensive guidance and templates, consider consulting the DSBU website.
Practical courses on these aspects are provided by our service on request.
Visit our dedicated webpage for more information on our tools
Standard File Formats
To ensure long-term access and reusability of your data, the DSBU encourages you to deposit and share your files using standard preservation and open file formats that are most likely to remain accessible in the future.
Details
As technology evolves, it is important to consider which file formats you will use for preserving files in the long run.
File formats most likely to be accessible in the future have the following characteristics:
- Non-proprietary
- Open, documented standard
- Popular format
- Standard representation
- Unencrypted
- Uncompressed
We can provide you with guidance on which format to use for long-term preservation and sharing of your data.
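As a small, hedged illustration of these characteristics, the Python sketch below saves the same toy table in two open, uncompressed, non-proprietary formats (CSV and JSON) instead of a proprietary spreadsheet format. The column names and values are purely illustrative.

```python
import csv
import json

# Illustrative table; in practice this would be your research data.
rows = [
    {"sample_id": "S1", "concentration_uM": 0.5, "viability_pct": 92},
    {"sample_id": "S2", "concentration_uM": 1.0, "viability_pct": 85},
]

# CSV: open, documented, popular, unencrypted, uncompressed plain text.
with open("results.csv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)

# JSON: an open standard that also preserves basic types and structure.
with open("results.json", "w", encoding="utf-8") as fh:
    json.dump(rows, fh, indent=2)
```

Either file can be opened decades from now with free tools, which is exactly the property the characteristics above aim for.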
Tool
For help with standard long-term preservation formats, have a look at our DSBU Recommended File Formats (link)
Open Research Data Sharing
Our unit actively supports researchers in sharing their data openly on selected FAIR repositories. This helps increase the visibility of your work within the research community. Regarding Open Research Data (ORD), our guidance on preparing and documenting datasets has facilitated the sharing of over 80 FBM-UNIL and CHUV datasets on the Zenodo repository within the FBM community space (link).
Identifying Suitable Repositories for Open Data
Our service assists researchers in finding appropriate FAIR data repositories that align with the requirements of funding agencies and journals. This ensures that research data can be published and accessed according to established policies.
Identifying Suitable Catalogues for Restricted-Access Data
For sensitive research data that cannot be shared openly, we offer guidance, in collaboration with the CHUV IT (DSI) service, on making datasets visible by publishing metadata describing the data’s characteristics in the Horus CHUV dataset catalogue. This allows international researchers to understand the published dataset and request access if necessary, following proper legal procedures.
Details
The externally accessible HORUS CHUV dataset catalog is the product of a close collaboration between the DSBU and the CHUV IT Department (DSI-CHUV). Developed and managed by the DSI-CHUV, this catalog focuses on showcasing metadata (documentation) that describes the content of sensitive clinical datasets generated at CHUV, which cannot be shared due to legal restrictions.
Sensitive CHUV datasets are secured with controlled access through the Datasets Catalog Horus. The CHUV catalog-Zenodo transfer ensures interoperability, where FAIR metadata is automatically transmitted and made publicly accessible via the FBM-Zenodo community.
Process Overview:
- Users upload their datasets and complete the required metadata fields (dataset information).
- A data steward (curator) from the DSBU reviews the metadata to confirm that the dataset has been correctly deposited.
- Once verified, the curator approves the dataset for final submission.
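As a rough sketch of the deposit step, the Python snippet below builds a Zenodo-style metadata payload and runs a curator-style completeness check before submission. The field names follow the public Zenodo deposit metadata (upload_type, title, description, creators), but the community identifier and all values here are placeholders, not real DSBU or FBM settings; the actual upload would go through the Zenodo REST API with an access token.

```python
import json

# Placeholder payload in the shape of Zenodo deposit metadata.
# The community slug below is hypothetical, not the real FBM community.
metadata = {
    "metadata": {
        "upload_type": "dataset",
        "title": "Example FBM dataset",
        "description": "Illustrative deposit; replace with a real abstract.",
        "creators": [{"name": "Doe, Jane", "affiliation": "FBM UNIL"}],
        "license": "cc-by-4.0",
        "communities": [{"identifier": "fbm-community-placeholder"}],
    }
}

# Fields Zenodo requires before a deposit can be published.
REQUIRED = {"upload_type", "title", "description", "creators"}

def missing_fields(payload: dict) -> set:
    """Return the required metadata fields absent from the payload."""
    return REQUIRED - payload["metadata"].keys()

# Curator-style check: refuse submission if any required field is missing.
assert not missing_fields(metadata)
print(json.dumps(metadata, indent=2))
```

A check like this mirrors what the DSBU curator verifies by hand: that the deposit carries enough metadata to be approved for final submission.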
Data Copyright and Licensing
We assist researchers in understanding data copyright, licensing, and self-archiving rules. This ensures that you are aware of your rights and responsibilities when sharing your data.
Data preservation (Long Term Storage)
Our service, in partnership with the Computing and Research Support Division of UNIL (DCSR), has been actively engaged in the implementation of Long-Term Storage (LTS) for FBM-UNIL research data (link). This initiative is of crucial importance in the face of the exponential growth in data storage costs generated by research.
Through data life cycle management, our team provides FBM UNIL researchers with information, advice, and help for the free long-term storage and preservation of their data at the DCSR UNIL.
We can provide you with tools and guidance on how to prepare a README file and reorganize your data in order to preserve your work.
- Informing and guiding you through the process of reorganizing and describing your research data in the form of explanatory documents called “README files”.
- Final validation of your README file before data migration from the DCSR NAS to the LTS platform.
Visit our dedicated webpage for more information: Long Term Storage / Archiving