It is important to organise and document datasets appropriately so that they can be used and reused efficiently, and so that others can discover them if they are shared.
What is metadata?
The terms and statements used to describe data are commonly referred to as metadata. A widely cited definition is presented by the UK Data Archive:
What information is necessary?
There is no single schema defining what metadata elements should be collected to describe research data or how they should be recorded. However, the metadata should normally include:
- all the information needed to cite the data: author(s), title, date of creation, version, where the data can be found and, if possible, a DOI
- any information needed to enable people to verify the data, e.g. details of experiments, hardware, software and how the data was gathered
- details of why the data was created or collected
- subject terms or keywords which will help you or other potential users to identify relevant files
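As a rough illustration, the elements listed above could be gathered into a single structured record. The sketch below is only one possible layout: the field names (`authors`, `date_created`, `location` and so on), the helper `missing_citation_fields` and all the values are hypothetical placeholders, not part of any particular metadata standard.

```python
import json

# Hypothetical field names chosen for this sketch; no metadata standard is implied.
REQUIRED_CITATION_FIELDS = ["authors", "title", "date_created", "version", "location"]


def missing_citation_fields(record):
    """Return the required citation fields that are absent or empty in the record."""
    return [field for field in REQUIRED_CITATION_FIELDS if not record.get(field)]


# Placeholder values for illustration only.
example_record = {
    # Information needed to cite the data
    "authors": ["A. Researcher"],
    "title": "Example dataset title",
    "date_created": "2020-01-31",
    "version": "1.0",
    "location": "https://example.org/datasets/example",
    "doi": None,  # optional: fill in if a DOI has been assigned
    # Information needed to verify the data
    "methods": "Details of experiments, hardware, software and how the data was gathered.",
    # Why the data was created or collected
    "purpose": "Short statement of why the data was created or collected.",
    # Subject terms or keywords
    "keywords": ["example keyword", "another keyword"],
}

# Print the record as JSON, then list any citation fields still missing.
print(json.dumps(example_record, indent=2))
print(missing_citation_fields(example_record))
```

A simple check of this kind can flag an incomplete record before a dataset is deposited or shared. A disciplinary schema, where one exists, would normally determine the actual field names.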
The Digital Curation Centre (DCC) provides a useful resource listing the various types of disciplinary metadata.
How to create metadata
It is possible to provide metadata in different ways:
- embedded documentation, such as that created using the properties facility in Microsoft Office programs. This method suits simple files and allows files to be searched and displayed according to the metadata elements included.
- separate files of documentation with links to the data files. This is more appropriate for complex datasets.
- catalogue metadata, structured according to international standards to facilitate harvesting and searching in a variety of ways.
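The second option above, a separate documentation file linked to a data file, might be sketched as follows. This is a minimal example under stated assumptions: the `.metadata.json` naming convention, the `data_file` link field and the function name `write_sidecar` are all invented for illustration and do not follow any standard.

```python
import json
import tempfile
from pathlib import Path


def write_sidecar(data_file, metadata):
    """Write metadata to '<data file name>.metadata.json' beside the data file.

    The naming convention and the 'data_file' link field are assumptions
    made for this sketch, not an established standard.
    """
    data_path = Path(data_file)
    sidecar_path = data_path.with_name(data_path.name + ".metadata.json")
    record = dict(metadata)  # copy so the caller's dictionary is not modified
    record["data_file"] = data_path.name  # link from the documentation to the data
    sidecar_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar_path


if __name__ == "__main__":
    # Demonstrate in a temporary directory so no real files are touched.
    with tempfile.TemporaryDirectory() as folder:
        data_path = Path(folder) / "survey_results.csv"
        data_path.write_text("id,answer\n1,yes\n", encoding="utf-8")
        sidecar = write_sidecar(data_path, {"title": "Example survey results"})
        print(sidecar.name)
        print(sidecar.read_text(encoding="utf-8"))
```

Keeping the documentation file next to the data file, with a predictable name, makes it harder for the two to become separated when a folder is copied or archived.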
For further detailed information about creating metadata, see: