Internal, open access

Research Data Archive FAQs

  1. Do I need to deposit data with my Metadata Record?
  2. What data should I include in my Dataset?
  3. Will my data be assigned a Digital Object Identifier?
  4. Who owns the data?
  5. What file formats does the Archive accept?
  6. What documentation should I provide for my data?
  7. How should I license my data?
  8. Can I monitor who uses my data?
  9. How can I control who uses my data?
  10. How do I generate bounding box coordinates with OpenStreetMap?
  11. Can I delay the release of my data?
  12. Can a Dataset be withdrawn after it has been deposited?
  13. Can I use the Archive to store data that is not for public use?

Do I need to deposit data with my Metadata Record?

No: you can create a Metadata Record that describes and links to a Dataset held in an external data centre or data sharing service, such as a NERC data centre or figshare, or you can create a record describing a non-digital Dataset held in the University.

There are two reasons for doing this:

  • So that the University can maintain a record of research data collected/generated in the course of research based at the University, wherever the data may be held, and in whatever format they are;
  • To comply with your funder's policy: for example, EPSRC requires research organisations it funds to 'ensure that appropriately structured metadata describing the research data they hold is published […] and made freely accessible on the internet' (EPSRC Expectations).

 

What data should I include in my Dataset?

There is not necessarily a single way to constitute a Dataset: one set of research materials could be grouped in various different ways which would each still be valid as Datasets. How you select and organise your data may depend on the relative weight of the main reasons for making your data available:

  • to enable validation/replication of published research findings;
  • to facilitate re-use of data for further research and/or teaching purposes.

You are likely to make a selection from data at various stages of processing, which could be thought of broadly as follows:

  • raw data, i.e. data in the format of initial capture or collection, such as files in specialist software formats generated by experimental and observational instruments; survey responses; audio/video recordings;
  • intermediate processed data, e.g. instrument data calibrated, cleaned and coded in data processing and visualisation software, such as MATLAB; raw data input as values in spreadsheets and databases; transcriptions of recorded interviews; textual data processed in analysis software such as NVivo;
  • 'final' processed or output data: selected illustrative subsets of data, e.g. filtered samples, figures and charts, as may be included in a research paper or submitted to a publisher as supplementary information.

In some cases it may not be suitable to share raw data, e.g. where they contain personal information or where files are not practically usable by nature of their format. But bear in mind that processing of data is likely to limit the usability of the data for various purposes.

Bear in mind that data will also include the information and materials necessary to interpret or regenerate recorded outputs, such as experimental methods, algorithms and research software. The Archive offers a number of standard Open Source licences for software. For guidance on publishing and licensing research software, read our Guide to publishing research software.

The Checklist for appraising research data published by the Digital Curation Centre provides a useful framework for deciding how to constitute your preservation Dataset.

 

Will my data be assigned a Digital Object Identifier?

Yes: normally a Digital Object Identifier (DOI) will be assigned to the Metadata Record for a Dataset under the authority of the University once a submission has been approved by an Archive Administrator and the Metadata Record has been published. DOIs will be assigned to Metadata Records for both digital and non-digital Datasets. DOIs will not be assigned to Metadata Records for Datasets held outside the authority of the University, such as those held in external data centres.

The DOI is a unique, permanent identifier for the Dataset, and once it has been allocated the Dataset is effectively fixed. You will not be able to modify or delete any data files uploaded as part of the Dataset, or any values entered in the Creator(s), Title, Data Publisher, Publication Year, and DOI metadata fields.

Sometimes it may be desirable to deposit a pre-final version of your Dataset, which will undergo changes before a final version is established. This may happen if your Dataset will be subject to review and amendment as part of the editorial and peer review process for a submitted paper.

If this is the case you should deposit your Dataset as normal, but state clearly in the Comments and suggestions field that you do not wish a DOI to be assigned at this stage. You will be able to publish the Dataset with a unique URL as an interim means of reference, and until a DOI is assigned you can make changes to any metadata. If you wish to modify any data files you will need to create a new version of the Dataset: this will be archived separately and made available at a new URL. When all changes have been made and you have a final fixed version of your Dataset, you can then ask the Archive Administrators to allocate a DOI.

 

Who owns the data?

If you plan to deposit any data files, you should always make sure you know who owns and has rights in the data, as this will affect what you can do with them.

Research collaboration and funding agreements usually include intellectual property and publication clauses that govern ownership of data and requirements to notify interested parties of any intended publication.

If you are an employee of the University, data created by you in the course of your employment belongs to the University, unless any contracts or collaboration agreements stipulate otherwise. As an employee of the University you may deposit the data collected by you or your colleagues in the Research Data Archive or another suitable data centre or repository, in accordance with University policies and subject at all times to any third party contract terms or rights underpinning the creation, ownership and publication of the data.

If you are a student at the University, by default you own the IP in any data you have created or generated. But your rights in the data may not be exclusive: if you created the data jointly as part of a research team or collaboration, or were sponsored or employed by another organisation, ownership may be shared or otherwise assigned. You can also assign IP elsewhere, for example to the University, and in some cases this may have been required, e.g. where there has been a significant contribution to the IP from a supervisor or other member of staff, or where you have received significant financial support/material contribution from the University to undertake the research.

Go to the Intellectual Property Management section of the staff website for more information about Intellectual Property and to read the University Code of Practice on Intellectual Property.

If you are unsure about ownership and rights in your data, contact the Research Data Manager for advice.

 

What file formats does the Archive accept?

The Archive will accept any file type that you choose to deposit, but you should wherever possible aim to deposit files that are optimised for long-term preservation and use. For these purposes, the Archive acknowledges three categories of files:

  • Recommended file formats: standard preservation;
  • Acceptable file formats: general purpose;
  • Other file formats: specialist and rarely-used.

Guidance on file formats with examples of recommended and acceptable formats is provided in our Recommended File Formats guide (PDF).

 

What documentation should I provide for my data?

A minimum of one documentation file must be deposited for each set of files uploaded. This is essential to ensure that the data can be understood and effectively used. You should provide sufficient information for the set of downloaded files to be a self-contained meaningful Dataset.

Think of documentation as a user guide to the data. For simple Datasets the documentation requirement may be relatively straightforward, such as a file listing with brief content descriptions. For more complex Datasets you may need to include more detailed documentation. Here are some examples of types of documentation that might be included: information regarding data collection and processing methods, instrument user guides and technical reports, lab notebook records, survey questionnaires, data dictionaries and codebooks, information sheets, sample consent forms and interview instructions.

The recommended minimum information to be included in a documentation file is the name of the research project and a file listing. A template README.txt file is provided for you to use if you wish.
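
If your Dataset contains many files, the file listing can be generated rather than typed by hand. The following is a minimal sketch in Python, not the Archive's own template: the function name build_readme, the project name and the file names are hypothetical, and the layout of the output is only one possible arrangement of the recommended minimum information (project name plus file listing).

```python
import tempfile
from pathlib import Path


def build_readme(project_name: str, dataset_dir: Path) -> str:
    """Return README text containing the project name and a file listing.

    Hypothetical helper for illustration; the layout is an assumption,
    not the format of the Archive's README.txt template.
    """
    lines = [f"Project: {project_name}", "", "File listing:"]
    # Walk the Dataset folder recursively and list every file by relative path.
    for path in sorted(p for p in dataset_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(dataset_dir).as_posix()
        if rel == "README.txt":
            continue  # do not list the documentation file itself
        lines.append(f"  {rel}  ({path.stat().st_size} bytes)")
    return "\n".join(lines) + "\n"


# Usage against a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "data" / "survey_responses.csv").write_text("id,answer\n1,yes\n")
    (root / "methods.txt").write_text("Sampling protocol\n")
    readme = build_readme("Example Project", root)

print(readme)
```

The generated listing is only a starting point: you would still add a brief description of the contents of each file before depositing the documentation file.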

 

How should I license my data?

You will be required to choose a licence option for each file you upload to the Archive. In order to license the data you must be the data owner or authorised to assign a licence on behalf of the data owner.

The terms on which you can license the data may depend on the interests of any third parties who have rights in the data. Third party interests may be present where other organisations were involved in collection of the data, or where data have been derived from existing sources. If you have any concerns about how you can license your data, you should contact us for advice.

In general you should aim to license the data to maximise the possibility for re-use. We provide a number of standard licences for this purpose.

For open data you should use Creative Commons licences. The University recommends that where possible you use Creative Commons Attribution 4.0, which permits re-use of the data provided that proper attribution is made. Other licence options are available, including the Creative Commons Attribution Non-Commercial licence. For more guidance on Creative Commons licences go to the Creative Commons website.

For software code the GNU General Public License 3.0 and the GNU Lesser General Public License 3.0 are recommended. Guidance on licence options for software can be found at Choose a license. For general guidance on publishing and licensing research software, read our Guide to publishing research software.

If you need to restrict access to files to authorised users only, you can select the Restricted access setting for your file or zip bundle and use the University of Reading Licence for Restricted Data, which allows data to be used, subject to authorisation, in confidence, for non-commercial research and learning purposes only. Alternatively, you can upload your own licence. But you are strongly encouraged to use the recommended standard licences unless there is a justification for using an alternative option.

 

Can I monitor who uses my data?

The Archive does not monitor individual data downloads. The only way for you to monitor who downloads your data is to make files Restricted; the person who authorises access to Restricted files can then log any requests for access to the data.

This is not recommended as a means of monitoring usage. You should only give files a Restricted access setting if there is good reason to restrict access to the content, for example, based on the sensitive nature of the data. If you restrict access to files in this way, you will need to provide details of a Contact who can respond to all user requests for access to the data.

There are other ways to collect information about usage of your data:

  • the Archive Statistics provide information about Metadata Record views and item downloads;
  • once a DOI has been assigned to a Dataset it is very easy to identify citations in scholarly publications and other sources. The Altmetric service (look for the 'doughnut' on the right of the Metadata Record) tracks mentions of the Dataset in social media sites, newspapers, policy documents, blogs, Wikipedia, and many other sources.

 

How can I control who uses my data?

You can exert some control over who uses your data, but you should only do this where there is a legitimate need to restrict access, based on either the nature of the data or any third-party interests present in the data:

  • Where the Dataset contains sensitive data, such as personal data obtained from research participants, or business information, which can only be made available on a confidential basis;
  • Where data have been collected in collaboration with other organisations, which may specify particular terms on which the data can be made available;
  • Where data derived from an existing data source may be subject to terms of use associated with the source data.

In any of the circumstances above you can apply the Restricted access setting to any file you upload and license the file using the University of Reading Licence for Restricted Data, which allows data to be used, subject to authorisation, in confidence, for non-commercial research and learning purposes only. Alternatively, you can upload your own licence.

If you restrict access to files in this way, you will need to provide details of a Contact who can respond to all user requests for access to the data.

 

How do I generate bounding box coordinates with OpenStreetMap?

  • Navigate to the OpenStreetMap main page;
  • Adjust your view (drag or use the search box) on the map so that it covers at least the full geographical extent of your data collection;
  • Click the Export button at the top left of the interface to generate four values arranged in a rough diamond shape to the left of the map;
  • These are the latitude and longitude values representing the current view in the map window. Clockwise starting from the top they are: North latitude, East longitude, South latitude, West longitude;
  • Underneath the four values you can click Manually select a different area, then resize the box that appears in the map window to adjust the bounding area highlighted on the map;
  • When the box accurately reflects your data collection, copy and paste the values into their corresponding boxes in the Geographic location field on the Details page of your Dataset.
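
If you already hold the coordinates of your sampling sites, you can also calculate the four values directly and use the map only to check them. The following is a minimal sketch in Python under stated assumptions: coordinates are in decimal degrees, the collection area does not cross the 180 degree meridian, and the function name bounding_box and the example sites are hypothetical. The values are returned in the same clockwise order as above: North latitude, East longitude, South latitude, West longitude.

```python
from typing import Iterable, NamedTuple, Tuple


class BoundingBox(NamedTuple):
    north: float  # maximum latitude
    east: float   # maximum longitude
    south: float  # minimum latitude
    west: float   # minimum longitude


def bounding_box(points: Iterable[Tuple[float, float]]) -> BoundingBox:
    """Return the smallest box containing all (latitude, longitude) points.

    Assumes decimal degrees and an area that does not cross the
    180 degree meridian.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return BoundingBox(north=max(lats), east=max(lons),
                       south=min(lats), west=min(lons))


# Made-up sampling sites as (latitude, longitude), for illustration only.
sites = [(51.4414, -0.9418), (51.4560, -0.9700), (51.4300, -0.9300)]
box = bounding_box(sites)
print(box)
```

Each value in the result corresponds to one box in the Geographic location field. A box calculated this way fits the points exactly, so you may wish to widen it slightly if your data cover the area around each site.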

 

Can I delay the release of my data?

Yes: you can place data files under embargo when you deposit them, so that the Metadata Record and file names are visible, but the file contents cannot be viewed or downloaded.

An embargo enables you to deposit your data when the research is still fresh in your mind, for example while preparing for publication, but to delay release of the data, e.g. until the research findings have been published or any intellectual property has been exploited.

Most public funders allow researchers a period of exclusive use of their data, but data should be made available within a reasonable time after they have been collected, and no later than publication of your research findings, unless there is good reason for longer-term restrictions on access.

Data files should not be embargoed for longer than 12 months from the date of deposit, unless you have good reason to restrict access for longer. If this is the case, the reasons for requiring a longer embargo period should be stated at the time of deposit using the Comments and suggestions field.

 

Can a Dataset be withdrawn after it has been deposited?

As a rule, once the deposit process has been completed and the Metadata Record for a Dataset has been published, it may not be withdrawn for at least ten years, in accordance with the Archive Preservation Policy, except in response to an identified breach of law or policy, or on receipt of a valid and proven complaint, as described in the Takedown policy.

If you wish to modify or update a Dataset, you can link the new version to the original, so that anyone landing on the latter will be directed to the more recent version. To do this, log in to the Archive, find the Dataset you wish to update in your My data section, and click on the View item icon. On the Actions tab click the New version button to create a duplicate of the original Dataset, which you can modify by making changes to the metadata and the fileset. When you have completed your changes, submit the new version. Once approved, this will be published and linked to the previous version of the Dataset.

 

Can I use the Archive to store data that is not for public use?

The purpose of the Archive is to register, manage and provide public access to data. Data deposited in the Archive can be embargoed for up to 12 months from the date of deposit, but after this period some data should be made accessible on some basis.

The Archive is not currently suitable for use as a private data store. For private storage, small amounts of data can be held in your personal network drive, or on a shared drive, if you have the capacity. Larger amounts of data that you need to retain for the long term will require dedicated archive storage. This can be provided by means of your own storage media/devices, such as a set of CDs backed up to an external hard drive. Contact us if you have questions about this.

 
