As part of data life cycle management, our team provides information, advice and help to FBM UNIL researchers for the long-term storage and preservation of their data, free of charge, at the DCSR UNIL.
Contact us for guidance on how to prepare your Readme file and reorganize your data in order to preserve your work.
DSBU will help you by:
- Informing and guiding you through the process of reorganizing and describing your research data in the form of explanatory documents called “readme files”.
- Offering you two tools: DeepScan, to screen your data folders on the NAS-DCSR, and DataSquid, to semi-automatically generate the documentation as a readme file.
- Performing the final validation of your readme file and data reorganization before data migration from the DCSR NAS to the LTS platform.

Securing UNIL Data: Key Practices
- Mandatory Storage: Store all UNIL research data on NAS Recherche DCSR or LTS (two copies required); external disks or servers are prohibited (see Directive UNIL 4.5 – Traitement et gestion des données de recherche).
- Under the responsibility of the PI
- Data Organization: Clean and reorganize data from FBM_Mov into structured project directories on NAS-DCSR.
- Storage Rules:
- No personal, sensitive, or admin data on NAS-DCSR and LTS.
- Use the NAS-UNIL Admin partition for administrative personal data.
- Use OneDrive or the NAS-UNIL Admin partition for research-related files like protocols, presentations, references, figures for paper preparation…
LTS Characteristics
- Under the responsibility of the PI
- Free of charge (2 copies with security and backup)
- Limited storage duration (e.g. for published data, a minimum of 10 years after the article's acceptance date), with one possible extension.
- Restricted access (limited number of accesses to data on LTS, on request only).
- Effective organization of your data by project (matched with a funding source, a specific theme, a publication, etc.).
- Production of a readme file (a document describing the dataset content) for each individual future TAR subdirectory
- Naming rules for TAR (Tape Archive) subdirectories in the LTS directory
Data organization
- You will be informed and guided through the process of reorganizing and describing your research data in the form of an explanatory document called a readme file.
- You will be helped to determine the best organizational strategy for your data.
- During this process, you can grant Cécile Lebrand temporary access to your data stored on the DCSR NAS via the Ci interface on the homepage, using the following username: clebrand.
- DSBU screening and control with our DeepScan tool for automated file screening: the contents of the LTS folder will be systematically checked to ensure compliance with the cleanup and documentation requirements.
- You will need to reorganize your data according to the projects you have defined and clean up or delete obsolete data.
- Research data must be organized in the LTS directory around a given project (matched with a funding source, a specific theme, a publication, etc.).
- Transfer your data to the LTS subfolder of the NAS project using the copy/paste function or the Linux command line.
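
As an alternative to copy/paste, the transfer step can be scripted. The sketch below is only an illustration using the Python standard library: the function names and any paths passed to them are hypothetical, and the size comparison is a basic sanity check, not a checksum verification.

```python
import shutil
from pathlib import Path


def file_sizes(root: Path) -> dict[str, int]:
    """Map each file's path relative to root to its size in bytes."""
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }


def copy_to_lts(project_dir: Path, lts_dir: Path) -> Path:
    """Copy project_dir into lts_dir and check that every file arrived with the same size."""
    target = lts_dir / project_dir.name
    # copytree raises FileExistsError if the target already exists,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(project_dir, target)
    if file_sizes(project_dir) != file_sizes(target):
        raise RuntimeError(f"copy of {project_dir} to {target} is incomplete")
    return target
```

The original directory is left untouched, so it can be removed by hand once the copy has been checked.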

TAR (Tape Archive) subdirectory to be archived

- TAR archive file: each individual subdirectory to be archived in the LTS directory may hold any volume of data up to 4 TB. There is no need to compress your data, since this step is done during the creation of the TAR subdirectory.
- The TAR files to be archived must remain at the root of the LTS directory.
- Each TAR subdirectory must comply with the naming rules (see below) and should be accompanied by an independent readme file. We suggest using our new DSBU tool DataSquid to create an LTS readme file that can be quickly adapted to each individual TAR subdirectory.
- The naming rules apply only to the first-level TAR subfolders in the LTS directory. The TAR archive files created from these directories will have the same names. Within these directories, files and data directories can be named freely.
- Naming rules for subdirectories in the LTS directory (TAR archive file). The folder name must not exceed 40 characters and must follow the rules below:
- Numbers from 0-9
- Letters a-z
- Letters A-Z
- Hyphen ( - ) OK but not at the beginning or end of the directory name
- Underscore ( _ ) OK but not at the beginning or end of the directory name
- No white spaces
- No accented characters or symbols
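
The naming rules above can be checked automatically before a request is submitted. The following is a minimal sketch, not a DSBU or DCSR tool: the function name and the example directory names are hypothetical, and it assumes the rules mean ASCII letters, digits, hyphen and underscore only, with at most 40 characters.

```python
import re

MAX_LENGTH = 40
# Assumption: only unaccented ASCII letters, digits, '-' and '_' are allowed.
_ALLOWED_CHAR = re.compile(r"[A-Za-z0-9_-]")


def check_tar_dirname(name: str) -> list[str]:
    """Return the list of naming-rule violations for a directory name (empty if valid)."""
    if not name:
        return ["name is empty"]
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters ({len(name)})")
    if any(ch.isspace() for ch in name):
        problems.append("contains white space")
    # White space is reported separately above, so it is skipped here.
    bad = sorted({ch for ch in name if not ch.isspace() and not _ALLOWED_CHAR.fullmatch(ch)})
    if bad:
        problems.append("disallowed characters: " + " ".join(bad))
    if name[0] in "-_" or name[-1] in "-_":
        problems.append("starts or ends with '-' or '_'")
    return problems


# Hypothetical directory names, for illustration only.
for candidate in ["ProjectA_2021-imaging", "my project", "_raw-data"]:
    print(candidate, check_tar_dirname(candidate) or "OK")
```

Returning every violation at once, rather than stopping at the first, lets a directory be renamed in a single pass.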
Readme file
Complete the readme file for each distinct subdirectory to be archived as an individual TAR archive file, following the established guidelines. Ask DSBU for a final review of your readme file. If your readme file is not considered complete enough to understand the nature of the dataset, C. Lebrand will send you a request for additional information.
Naming rules for the readme file in each TAR directory: the name and file format of the readme file for LTS must not be changed: “LongTermStorage_Data_Description_EN.docx”.
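
Before asking for validation, it can be useful to confirm that every first-level subdirectory of the LTS directory contains the readme file under its fixed name and stays within the 4 TB limit. The sketch below is only an illustration: the function names are hypothetical, and 4 TB is assumed to mean 4 × 10^12 bytes, which the text does not specify.

```python
import os
from pathlib import Path

README_NAME = "LongTermStorage_Data_Description_EN.docx"
MAX_BYTES = 4 * 10**12  # assumption: 4 TB read as decimal terabytes


def directory_size(path: Path) -> int:
    """Sum the sizes of all regular files below path (symbolic links are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            fpath = Path(root) / fname
            if not fpath.is_symlink():
                total += fpath.stat().st_size
    return total


def check_lts_directory(lts_dir: Path, max_bytes: int = MAX_BYTES) -> dict[str, list[str]]:
    """Map each first-level subdirectory of lts_dir to the list of problems found in it."""
    report = {}
    for sub in sorted(p for p in lts_dir.iterdir() if p.is_dir()):
        problems = []
        if not (sub / README_NAME).is_file():
            problems.append(f"missing {README_NAME} at the root")
        size = directory_size(sub)
        if size > max_bytes:
            problems.append(f"{size} bytes exceeds the {max_bytes}-byte limit")
        report[sub.name] = problems
    return report
```

An empty list for a subdirectory means that neither check found a problem; it does not replace the DSBU review of the readme content.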
1 – UNIL LTS Template
2 – DataSquid tool: DSBU automated readme file
Our unit is deploying a user-friendly, automated methodology for generating readme files for large datasets.
DataSquid is an advanced tool that automates data documentation for experiments by integrating with equipment databases and laboratory protocols, ensuring comprehensive, consistent, and reproducible research documentation.
Once your readme files have been approved by clebrand, each one should be placed at the root of its data subdirectory, and the future TAR subdirectory should be moved from the D2C (or D1C) directory to the LTS directory.
Procedure to request the transfer of “cold” Data to Long-term storage (tape)
Request LTS tape transfer via the NAS DCSR dashboard
- Connect and sign in to your NAS DCSR dashboard via your SWITCH UNIL account using this link. On the interface homepage, click the button “Request for long-term storage (LTS)” in the list of actions to request the LTS tape transfer for part of your data.
- Select «making a long term storage request» in your DCSR dashboard interface to ask for Long Term Storage (LTS).
- To send your request for long-term storage of part of your research data on magnetic tape, click on the submit button. Mention which data you would like to transfer for long-term storage.
- An e-mail will be sent automatically to the DSBU.
- The DCSR will perform the final automatic technical checks.
clebrand will validate the LTS request and send you a report. From that step on, the PI no longer pays for the data.
Validation
The DCSR will create the TAR archive files and transfer your data to magnetic tape, with each readme file included at the root of its TAR archive file.
After completing the magnetic tape transfer process for the TAR directory, the LTS directory will include the following elements:
- The TAR file will be renamed by adding the prefix “ARCHIVED*”, and the underlying information will only be accessible in read-only mode.
- A copy of the readme file “LongTermStorage_Data_Description_EN.docx” will be accessible in read-only mode.
- A file named “INVENTORY_OF_ARCHIVED_FILES.txt” within the TAR directory will list the archived files with their full paths and provide information on the recovery procedure from the tapes.
This procedure ensures that the archived data remains accessible and easily identifiable. The addition of the “ARCHIVED*” prefix allows UNIRIS and the researcher to quickly confirm that archiving has been completed by simply examining the TAR directory names at a given moment.
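
The check by name described above can also be scripted. This is a minimal sketch with a hypothetical function name; the text gives the prefix only as “ARCHIVED*”, so the code assumes that archived entries simply start with the string ARCHIVED.

```python
from pathlib import Path

# Assumption: the exact prefix format is not specified beyond "ARCHIVED*".
ARCHIVED_PREFIX = "ARCHIVED"


def archiving_status(lts_dir: Path) -> dict[str, bool]:
    """Map each entry at the root of lts_dir to True if its name carries the archived prefix."""
    return {
        entry.name: entry.name.startswith(ARCHIVED_PREFIX)
        for entry in sorted(lts_dir.iterdir())
    }
```

Entries mapped to False have not yet been renamed and are therefore, under this assumption, not yet archived.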
Data recovery of a TAR subdirectory from magnetic tape (LTS)
Retrieval cost model (still under discussion)
- Limited Free Retrieval Allowance: Researchers can retrieve a limited number of TB of data within a 12-month period without incurring costs.
- Limit on the number of extraction requests (not TAR requests) made by the DCSR.
- A minimum delay of several months between the submission on LTS and data retrieval.
- Additional Retrieval Cost: Retrievals exceeding the limit in the 12-month window will be charged per additional TB.
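
Since the model is still under discussion, no allowance or price is known yet. The sketch below only illustrates the arithmetic of the rule as described: the function and parameter names are hypothetical, every value must be supplied by the caller, and it assumes that the volume above the allowance is charged proportionally.

```python
def retrieval_cost(retrieved_tb: float, free_allowance_tb: float, price_per_extra_tb: float) -> float:
    """Cost of retrievals within one 12-month window: only the volume above the allowance is charged."""
    if min(retrieved_tb, free_allowance_tb, price_per_extra_tb) < 0:
        raise ValueError("all arguments must be non-negative")
    # Assumption: partial terabytes above the allowance are charged proportionally.
    extra_tb = max(0.0, retrieved_tb - free_allowance_tb)
    return extra_tb * price_per_extra_tb
```

Retrievals at or below the allowance give a cost of zero, whatever the price per additional TB.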
Request TAR retrieval from tape via the NAS DCSR dashboard
Connect and sign in to your NAS DCSR dashboard via your SWITCH UNIL account using this link

- Select «Retrieving data from Long term Storage» in your DCSR dashboard interface to ask for recovery of your data from Long Term Storage (LTS).

- To validate your request to retrieve some or all of your research data stored on magnetic tape (LTS), please click on the “Submit” button below. If you already have a specific request or question, you may, if you wish, leave your comment(s) below.

- An e-mail will be sent automatically to the DCSR, which will process your request. You will then be able to access and work with your data in your DCSR-NAS space.