DataCite Metadata Schema
How to cite this record FAIRsharing.org: DataCite Metadata Schema; DataCite Metadata Schema; DOI: https://doi.org/10.25504/FAIRsharing.me4qwe; Last edited: Dec. 17, 2019, 4:50 p.m.; Last accessed: Aug 13 2020 8:41 a.m.
Record added: April 27, 2015, 10:25 a.m.
Record updated: Dec. 16, 2019, 6:36 p.m. by The FAIRsharing Team.
Edits to 'https://fairsharing.org/FAIRsharing.me4qwe' by 'The FAIRsharing Team' at 18:36, 16 Dec 2019 (approved): 'description' has been modified: Before: The DataCite Metadata Schema is a list of core metadata properties chosen for the accurate and consistent identification of a resource for citation and retrieval purposes, along with recommended use instructions. The resource that is being identified can be of any kind, but it is typically a dataset. The term “dataset” can include not only numerical data, but any other research data outputs. After: The DataCite Metadata Schema is a list of core metadata properties chosen for the accurate and consistent identification of a resource for citation and retrieval purposes, along with recommended use instructions. The resource that is being identified can be of any kind, but it is typically a dataset. The term “dataset” can include not only numerical data, but any other research data outputs.
Edits to 'https://fairsharing.org/FAIRsharing.me4qwe' by 'The FAIRsharing Team' at 22:48, 16 Jul 2018 (approved): 'publications' has been modified: Before: After: DataCite Metadata Schema Documentation for the Publication and Citation of Research Data. Version 4.1. DataCite e.V. Added: DataCite Metadata Schema Documentation for the Publication and Citation of Research Data. Version 4.1. DataCite e.V. Removed:
Edits to 'https://fairsharing.org/FAIRsharing.me4qwe' by 'The FAIRsharing Team' at 18:44, 16 Jul 2018 (approved): 'description' has been modified: Before: The DataCite Metadata Schema is a list of core metadata properties chosen for the accurate and consistent identification of a resource for citation and retrieval purposes|along with recommended use instructions. The resource that is being identified can be of any kind|but it is typically a dataset. The term dataset can include not only numerical data|but any other research data outputs. After: The DataCite Metadata Schema is a list of core metadata properties chosen for the accurate and consistent identification of a resource for citation and retrieval purposes|along with recommended use instructions. The resource that is being identified can be of any kind|but it is typically a dataset. The term dataset can include not only numerical data|but any other research data outputs.
Conditions of Use
Applies to: Data use
REST Web Services
API Access points: https://support.datacite.org/v1.1/reference#introduction
DataCite Metadata Schema Documentation for the Publication and Citation of Research Data. Version 4.1. DataCite e.V.
DataCite Metadata Working Group
Models and Formats
No identifier schema standards defined
No metrics standards defined
figshare is a data publishing platform that is free for all researchers. Some of figshare’s core beliefs are: academic research outputs should be as open as possible, as closed as necessary; academic research outputs should never be behind a paywall; academic research outputs should be human and machine readable/query-able; academic infrastructure should be interchangeable; academic researchers should never have to put the same information into multiple systems at the same institution; identifiers for everything; and the impact of research is independent of where it is published and what type of output it is. figshare supports embargoing and managed access, and will embargo data while undergoing peer review. Metadata in figshare is licensed under CC0. All files and metadata can be accessed from docs.figshare.com. figshare has also partnered with DuraSpace and Chronopolis to offer further assurances that public data will be archived under the stewardship of Chronopolis. In the highly unlikely event of multiple AWS S3 failures, figshare can restore public user content from Chronopolis. figshare is supported through Institutional, Funder, and Governmental service subscriptions.
Dryad is an open-source, community-led data curation, publishing, and preservation platform for CC0 publicly available research data. Dryad has a long-term data preservation strategy, and is a Core Trust Seal Certified Merritt repository with storage in US and EU at the San Diego Supercomputing Center, DANS, and Zenodo. While data is undergoing peer review, it is embargoed if the related journal requires or allows this. Dryad is an independent non-profit that works directly with: researchers to publish datasets utilising best practices for discovery and reuse; publishers to support the integration of data availability statements and data citations into their workflows; and institutions to enable scalable campus support for research data management best practices at low cost. Costs are covered by institutional, publisher, and funder members; otherwise, authors pay a one-time fee of $120 to cover the cost of curation and preservation. Dryad also receives direct funder support through grants.
PANGAEA - Data Publisher for Earth and Environmental Science
The information system PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from earth system research. PANGAEA is a member of the ICSU World Data System (WDS).
Open Science Framework
The Open Science Framework (OSF) is a free and open source project management tool that supports the entire research lifecycle: planning, execution, reporting, archiving, and discovery. Features include automated versioning, logging of all actions, collaboration support, free and unlimited file storage, registrations, and connections to other tools/services (e.g. Dropbox, figshare, Amazon S3, Dataverse, GitHub). It is 100% free to researchers, open source, and intended for use in all domain areas. OSF has an open, public API to support broad indexing, as well as a partnership with Internet Archive for long-term preservation with a $250k preservation fund and an IMLS grant for transfer to Internet Archive (currently in progress). The OSF supports embargoing during peer review via a view-only link with the ability to anonymize the contributor list. It also provides managed access by allowing access requests and private sharing settings. OSF is a non-profit with direct funder support through grants, government contracts, and community memberships.
Sea scientific open data publication
Seanoe (SEA scieNtific Open data Edition) is a publisher of scientific data in the field of marine sciences. It is operated by Sismer within the framework of the Pôle Océan. Data published by SEANOE are available free of charge. They can be used in accordance with the terms of the Creative Commons license selected by the author of the data. SEANOE contributes to the Open Access / Open Science movement, which promotes free access for everyone to all scientific data financed by public funds, for the benefit of research. An embargo of up to 2 years on a data set is possible, for example to restrict access to the data of a publication under scientific review. Each data set published by SEANOE has a DOI which enables it to be cited in a publication in a reliable and sustainable way. The long-term preservation of data filed in SEANOE is ensured by Ifremer infrastructure.
Movebank Data Repository
A free and public archive of animal tracking datasets associated with peer-reviewed publications. The Repository is the long-term archive associated with Movebank (movebank.org), a free, online database created to help animal tracking researchers to manage, share, protect, analyze, and archive their data. Hosted by the Max Planck Institute for Ornithology and the University of Konstanz Library.
The FAIRDOMHub is a publicly available resource built using the SEEK software, which enables collaborations within the scientific community. FAIRDOM will establish a support and service network for European Systems Biology. It will serve projects in standardizing, managing and disseminating data and models in a FAIR manner: Findable, Accessible, Interoperable and Reusable. FAIRDOM is an initiative to develop a community, and establish an internationally sustained Data and Model Management service to the European Systems Biology community. FAIRDOM is a joint action of ERA-Net EraSysAPP and European Research Infrastructure ISBE.
Open Researcher and Contributor ID Registry
ORCID is an open, non-profit, community-driven effort to create and maintain a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers. The ORCID Registry is a repository of unique researcher identifiers which allows researchers to manage a record of their research activities. In addition, there are APIs that support system-to-system communication and authentication. ORCID makes its code available under an open source license, and will post an annual public data file under a CC0 waiver for free download.
The CancerData site is an effort of the Medical Informatics and Knowledge Engineering team (MIKE for short) of Maastro Clinic, Maastricht, The Netherlands. It offers a central, online repository for the sustained storage of clinical protocols, publications and research datasets. The data that are offered can vary from documents, spreadsheets to (bio-)medical images and treatment simulations. CancerData is a registered member of DataCite, which is an international consortium and member of the International DOI Foundation. Via DataCite, we have the ability to offer persistent identifiers to the datasets via the registration of Digital Object Identifiers (DOI).
Zenodo is a generalist research data repository built and developed by OpenAIRE and CERN. It was developed to aid Open Science and is built on open source code. Zenodo helps researchers receive credit by making the research results citable and through OpenAIRE integrates them into existing reporting lines to funding agencies like the European Commission. Citation information is also passed to DataCite and onto the scholarly aggregators. Content is available publicly under any one of 400 open licences (from opendefinition.org and spdx.org). Restricted and Closed content is also supported. Free for researchers below 50 GB/dataset. Content is both online on disk and offline on tape as part of a long-term preservation policy. Zenodo supports managed access (with an access request workflow) as well as embargoing generally and during peer review. The base infrastructure of Zenodo is provided by CERN, a non-profit IGO. Projects are funded through grants.
UK Polar Data Centre Data Archive
The UK Polar Data Centre (UK PDC) is the Natural Environment Research Council's (NERC) Designated Data Centre for polar science. It is the focal point for Arctic and Antarctic environmental data management in the UK.
The National Science Foundation funded OpenTopography facilitates community access to high-resolution, Earth science-oriented, topography data, and related tools and resources.
Harvard Dataverse is a research data repository running on the open source web application Dataverse. Harvard Dataverse is fully open to the public, allows upload and browsing of data from all fields of research, and is free for all researchers worldwide (up to 1 TB). Links to related grants, authors, software and research products are provided. Harvard Dataverse supports managed access (with an access request workflow) as well as embargoing generally and during peer review. Dataverse allows users to share, preserve, cite, explore, and analyse research data. It facilitates making data available to others, and allows you to replicate others' work more easily. Researchers, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. Harvard Dataverse receives support from Harvard University, public and private grants, and an emergent consortium model.
Mendeley Data is a multidisciplinary, free-to-use open repository specialized for research data. Data files of up to 10 GB can be uploaded and shared. Users can search more than 20 million datasets indexed from thousands of data repositories, and collect and share datasets with the research community following the FAIR data principles. Links are available to related authors, software, grants and research. Each version of a dataset is given a unique DOI, and dark archived with DANS (Data Archiving and Networking Services), ensuring that every dataset and citation will be valid in perpetuity. Metadata is licensed CC0, and datasets are and will continue to be free access. Mendeley Data will shortly support managed access, and currently supports embargoing of data both generally and while undergoing peer review. It is funded by a subscription model for Academic & Government entities.
Qualitative Data Repository
The Qualitative Data Repository (QDR) is a dedicated archive for storing and sharing digital data (and accompanying documentation) generated or collected through qualitative and multi-method research in the social sciences. QDR provides search tools to facilitate the discovery of data, and also serves as a portal to material beyond its own holdings, with links to U.S. and international archives. The repository’s initial emphasis is on political science. Four beliefs underpin the repository's mission: data that can be shared and reused should be; evidence-based claims should be made transparently; teaching is enriched by the use of well-documented data; and rigorous social science requires common understandings of its research methods.
4TU.Centre for Research Data
4TU.Centre for Research Data (short: 4TU.ResearchData) was started in 2008 as an initiative of the three technical universities in the Netherlands – Delft University of Technology, Eindhoven University of Technology, and the University of Twente. The ambition was, and still is, to create and maintain a national state-of-the-art facility for storing and preserving science and engineering research data and for making those data openly accessible. The data archive has been fully operational since 2010 and it has evolved to become a trusted and certified repository for science and engineering. By publishing data-sets via 4TU.ResearchData you will make your data FAIR. Every single data-set is assigned a DOI and metadata (F), the archive is accessible 24/7 online worldwide via https protocol (A), the data-files adhere to community and preservation standards (I), and a readme-file and usage license is provided for every data-set (R). This archive is accessible and usable for any researcher from the science and engineering disciplines. Please visit our website for more details.
Scholars Portal Dataverse
Scholars Portal Dataverse is a repository of research data in all fields of research. Researchers can share, publish, archive, find and cite data across all research fields. Researchers from subscribing institutions can use Dataverse to directly deposit data, create metadata, release and share data openly or privately, visualize and explore data, and search for data.
DataCite is a leading global non-profit organisation that provides persistent identifiers (DOIs) for research data. Their goal is to help the research community locate, identify, and cite research data with confidence. They support the creation and allocation of DOIs and accompanying metadata. They provide services that support the enhanced search and discovery of research content. They also promote data citation and advocacy through community-building efforts and responsive communication and outreach materials. DataCite gathers metadata for each DOI assigned to an object. The metadata is used for a large index of research data that can be queried directly to find data, obtain stats and explore connections. All the metadata is free to access and review. To showcase and expose the metadata gathered, DataCite provides an integrated search interface, where it is possible to search, filter and extract all the details from a collection of millions of records.
Ag Data Commons
The Ag Data Commons is a data access system maintained by the US Department of Agriculture's (USDA) National Agricultural Library. It uses a customized version of open-source DKAN software, which is compliant with U.S. Project Open Data standards for federal agencies providing access to publicly-funded data. Ag Data Commons holds data files managed directly by NAL and also links to datasets and resources located on other websites. The Ag Data Commons provides access to a wide variety of open data relevant to agricultural research and related domains. These may include subjects such as agronomy, genomics, hydrology, soils, agro-ecosystems, sustainability science, and economic statistics. Data included in the Ag Data Commons is funded in whole or in part by USDA. Our goal is that Ag Data Commons will foster innovative data re-use, integration, and visualization to support bigger, better science and policy.
Portail Data INRAE
Portail Data INRAE is offered by INRAE as part of its mission to open the results of its research. INRAE is Europe’s top agricultural research institute and the world’s number two centre for the agricultural sciences. Data INRAE will share research data in relation to food, nutrition, agriculture and environment. It includes experimental, simulation and observation data, omics data, survey and text data. Only data produced by or in collaboration with INRAE will be hosted in the repository, but anyone can access the metadata and the open data. Data INRAE is built on software from the Dataverse Project.
CaltechDATA is an institutional data repository for Caltech. Caltech Library runs the repository to preserve the accomplishments of Caltech researchers and share their results with the world. Caltech-associated researchers can upload data, link data with their publications, and assign a permanent DOI so that others can reference the data set. The repository also preserves software and has automatic GitHub integration. All files present in the repository are open access or embargoed, and all metadata is always available to the public.
Environmental Data Portal
EnviDat is the environmental data portal and repository developed by the Swiss Federal Research Institute WSL. The portal provides unified and managed access to environmental monitoring and research data. The portal has the capability to host and publish data sets. While sharing of data is centrally facilitated, data management remains decentralised and the know-how and responsibility to curate research data remains with the original data providers.
CyVerse Data Common Repository
The Data Commons provides services to manage, organize, preserve, publish, discover, and reuse data. Using our pipelines, you can easily publish data to the NCBI or directly to the CyVerse Data Commons. CyVerse Curated Data are stable and have DOIs. Community Released Data are maintained by community members and may not be permanent.
Imperial College Research Data Repository
A lightweight digital repository for data based on the concept of collections of filesets. Both the collection and the fileset are assigned a DOI by the DataCite organisation, which can be quoted in articles.
Vivli Center for Global Clinical Research Data
The Vivli data repository provides a global data-sharing and analytics platform serving all elements of the international research community. It is focused on sharing individual participant-level data from completed clinical trials to serve the international research community. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community. Vivli is a non-profit organization focused on data sharing and analysis. Vivli provides managed access for human subject data. It provides a no-charge period for data only available within their secure research environment. There are costs after the no-charge time period ends. Vivli supports managed access as well as embargoing generally and during peer review. It has data preservation funding and assurances from Microsoft that it will maintain and archive data for the lifetime of its use, up to 20 years. Vivli is funded via grants and member fees.
e-cienciaDatos is a multidisciplinary data repository that houses the scientific datasets of researchers from the public universities of the Community of Madrid and the UNED, members of the Consorcio Madroño, in order to give visibility to these data. The purpose of this repository is to ensure data preservation and to facilitate data access and reuse. e-cienciaDatos collects datasets from each of the member universities. e-cienciaDatos offers the deposit and publication of datasets, assigning a digital object identifier (DOI) to each of them. The association of a dataset with a DOI will facilitate data verification, dissemination, reuse, impact and long-term access. In addition, the repository provides a standardized citation for each dataset, which contains sufficient information so that it can be identified and located, including the DOI.
Federated Research Data Repository
The Federated Research Data Repository (FRDR) is a place for Canadian researchers to deposit and share research data and to facilitate discovery of research data in Canadian repositories.
heiDATA is Heidelberg University’s research data repository. It is managed by the Competence Centre for Research Data, a joint institution of the University Library and the Computing Centre. All researchers affiliated with Heidelberg University can use this service for archiving and publishing their data. heiDATA runs on software from the Dataverse Project.
ZBW Journal Data Archive
The ZBW Journal Data Archive is a service for editors of journals in economics and management. The aim of this newly established web service is to offer scholarly journals an easy to handle infrastructure for managing and storing data sets of published articles and to link the data sets to their corresponding publication. The service is free of charge for academic journals.
INPTDAT – The Data Platform for Plasma Technology
The interdisciplinary data platform INPTDAT provides easy access to research data and information from all fields of applied plasma physics and plasma medicine. It aims to support the findability, accessibility, interoperability and re-use of data for the low-temperature plasma physics community.
The BonaRes Repository stores soil and agricultural research data from research projects and long-term field experiments which contribute significantly to the analysis of changes of soil and soil functions over the long term. Research data are described by metadata following the BonaRes Metadata Schema (DOI: 10.20387/bonares-5pgg-8yrp), which combines internationally recognized standards for the description of geospatial data (INSPIRE Directive) and research data (DataCite 4.0). Metadata includes AGROVOC keywords. Within the BonaRes Repository, research data is provided for free reuse under the CC License and can be discovered through advanced text and map search using a number of criteria.
Iowa State University's DataShare
Iowa State University’s DataShare is an open access repository for sharing, publishing, and archiving research data created by Iowa State University scholars and researchers.
UNC Dataverse is a research data repository hosted by the Odum Institute at the University of North Carolina at Chapel Hill. UNC Dataverse is an open access repository that accepts data deposits from individual researchers, research groups, institutions, journals, and other entities from all disciplinary domains. UNC Dataverse offers value-added features that support archiving, discovery, and sharing of research data that align with FAIR principles for findable, accessible, interoperable, reusable data.
The Tromsø Repository of Language and Linguistics
The Tromsø Repository of Language and Linguistics (TROLLing) is a repository of data, code, and other related materials used in linguistic research. The repository is open access, which means that all information is available to everyone. All postings are accompanied by searchable metadata that identify the researchers, the languages and linguistic phenomena involved, the statistical methods applied, and scholarly publications based on the data (where relevant). DataverseNO is aligned with the FAIR Guiding Principles for scientific data management and stewardship. Being part of DataverseNO, TROLLing is CoreTrustSeal certified.
DataverseNO (https://dataverse.no/) is a national, generic repository for open research data, owned and operated by UiT The Arctic University of Norway. DataverseNO is aligned with the FAIR Guiding Principles for scientific data management and stewardship. The technical infrastructure of the repository is based on the open source application Dataverse, which is developed by an international developer and user community led by Harvard University. DataverseNO is CoreTrustSeal certified.
EBRAINS is a platform for sharing brain research data ranging in type as well as spatial and temporal scale. The EBRAINS data curation service aims to provide maximum impact, visibility, reusability, and longevity. The user interface of the EBRAINS Knowledge Graph allows you to easily find data of interest. EBRAINS hosts a wide range of data types and models from different species. All data are well described and can be accessed immediately for further analysis.
NSF Arctic Data Center
The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs. The Center helps the research community to reproducibly preserve and discover all products of NSF-funded research in the Arctic, including data, metadata, software, documents, and provenance that links these together. The repository is open to contributions from NSF Arctic investigators, and data are released under an open license (CC-BY, CC0, depending on the choice of the contributor). All science, engineering, and education research supported by the NSF Arctic research program are included, such as Natural Sciences (Geoscience, Earth Science, Oceanography, Ecology, Atmospheric Science, Biology, etc.) and Social Sciences (Archeology, Anthropology, Social Science, etc.).
NASA Socioeconomic Data and Applications Center
The Socioeconomic Data and Applications Center (SEDAC) is a regular member of the World Data System and focuses on human interactions in the environment. Its mission is to develop and operate applications that support the integration of socioeconomic and Earth science data and to serve as an "Information Gateway" between the Earth and social sciences. The SEDAC is one of the Earth Observing System Data and Information System (EOSDIS) Distributed Active Archive Centers (DAACs) that archive and distribute earth science data, managed by NASA's Earth Science Data and Information System Project (ESDIS), as part of the Earth Science Data Systems (ESDS) Program.
Center for Open Science (COS), Charlottesville, VA, USA (Government body)
CrossRef, Lynnfield, MA, USA (Company)
Open Data Institute ODI (Research institute)
Research Data Alliance (RDA) (Consortium)
DataCite, Hannover, Germany (Consortium)