Storing your research data is important for several reasons. First of all, according to the Netherlands Code of Conduct for Scientific Practice (VSNU, part III), researchers are obliged to store their raw research data for at least ten years (no maximum period) for validation purposes. Secondly, journals or funders may require you to give open access to your research data or at least share your data with other researchers upon request (see Data Sharing).
Where to store
- Raw research data must be stored for at least 10 years, to the extent that this is compatible with the GDPR, which requires that personal data be kept no longer than necessary.
- Data should never be stored solely on personal and/or local drives: data storage on the m-/p-drive of the UT is certified according to the ISO/IEC 27001 and NEN 7510 standards. This is the highest level of protection for your personal and sensitive data.
- (Raw) data will be stored on the central, secured BMS server; privacy-sensitive data of a project can be protected by encryption. Indicate this when you sign up your project at BMS LAB.
- The data files will be stored in the same folder as the EC approval.
- Back-up on an external, secure SSD drive: these drives are specially designed for the safe transportation of research data and of documents containing confidential, privacy-sensitive data.
- BMS Datavault: BMS LAB offers a safe vault for your sensitive research data.
- After the research, data will be stored in a trusted repository (e.g. DANS) or permanently stored on one of the secured servers of the faculty. This concerns at least the raw data.
What to store
- Raw data file: the raw data file contains the originally collected, unprocessed data.
- Derived dataset: the derived dataset is the dataset underlying certain results or publications. You can derive different datasets from your raw data for different purposes.
- Syntaxes: a syntax file contains the code, algorithms or commands used to create your derived dataset from your original, raw dataset. It also contains (stepwise) information about the transformations and analyses performed on the raw dataset.
- Metadata file: a metadata file is a separate file attached to your dataset, which contains information about your dataset for future use (by yourself or others). For example, a metadata file should contain information on the following subjects: creator, access conditions, context, collection methods, time references, structure and organization of data files, variable names, labels and descriptions of variables and values, codes for missing values, file formats, and hard- and software used to process and analyse the data.
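As an illustration, a metadata file could be a small JSON file stored next to the dataset. The sketch below is one possible approach in Python, not a prescribed format: the field names, the example values and the `write_metadata` helper are all assumptions made for this example.

```python
import json
from pathlib import Path

# All values below are hypothetical placeholders; replace them with
# the details of your own project.
metadata = {
    "creator": "J. Doe",
    "access_conditions": "available on request",
    "context": "survey study on study habits",
    "collection_methods": "online questionnaire",
    "time_references": {"collected_from": "2024-01-15", "collected_to": "2024-02-15"},
    "file_structure": ["raw/survey_raw.csv", "derived/survey_clean.csv", "syntax/clean.py"],
    "variables": {
        "age": {"label": "Age in years", "missing_code": -99},
        "q1": {"label": "Hours studied per week", "missing_code": -99},
    },
    "file_formats": ["csv", "json"],
    "hardware_and_software": "Python 3 on a standard laptop",
}


def write_metadata(metadata: dict, dataset_path: Path) -> Path:
    """Write the metadata next to the dataset as <dataset name>.metadata.json."""
    target = dataset_path.with_name(dataset_path.stem + ".metadata.json")
    target.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target
```

Calling `write_metadata(metadata, Path("raw/survey_raw.csv"))` would create `raw/survey_raw.metadata.json`, provided the `raw` folder exists. A plain-text format such as JSON remains readable without special software.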
As common sense dictates, storing and sharing (sensitive) data should be handled with care (see Guidelines Personal Information). The level of precaution that should be taken depends on the sensitivity of the data and can range from ‘simple’ precaution to storage on a secured, isolated and off-line computer or encrypted USB sticks in the IGS data vault.
Preferred file formats
To ensure long-term preservation that is independent of certain specific software, you are encouraged to save your files in commonly used and easily re-usable file formats with open documentation. Please find a list of different preferred and acceptable file formats for different types of data here.
In general, any scientific work should be reproducible. This applies to the social sciences as much as it does to the natural sciences. In practice, this means that the whole process of how you handle data should be documented. Gathering, cleaning, coding, transforming and scaling as well as analyses performed should all be documented. It is good practice to perform the above tasks using syntax and to store the syntax along with the data.
Note that, even though it may be tempting to perform a ‘quick fix’ in the SPSS data view, such a change may be lost or overlooked, which makes reproducing the research more difficult.
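As an illustration of working with syntax instead of manual edits, the Python sketch below derives a cleaned dataset from a raw CSV file and leaves the raw file untouched. It is a minimal example under stated assumptions: the column names (`id`, `age`), the missing-value code `-99`, the correction for participant 3 and the `derive_dataset` function are all hypothetical.

```python
import csv
from pathlib import Path

MISSING_CODE = "-99"  # hypothetical code for missing values


def derive_dataset(raw_path: Path, derived_path: Path) -> int:
    """Create a derived CSV from the raw CSV without modifying the raw file.

    Returns the number of data rows written to the derived file.
    """
    with raw_path.open(newline="", encoding="utf-8") as src, \
            derived_path.open("w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        written = 0
        for row in reader:
            # Step 1: drop rows in which age is recorded as missing.
            if row["age"] == MISSING_CODE:
                continue
            # Step 2: correct a known data-entry error
            # (hypothetical: participant 3 entered 200 instead of 20).
            if row["id"] == "3" and row["age"] == "200":
                row["age"] = "20"
            writer.writerow(row)
            written += 1
    return written
```

Because every transformation is a commented step in the script, the correction for participant 3 is recorded and can be repeated, whereas the same change typed directly into a data view would leave no trace.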