DOI: 10.25504/FAIRsharing.9y4cqw
This record is undergoing active curation and therefore the values may change.

Protein Data Bank Format (status: ready)

Abbreviation: PDB

General Information
An exchange format for reporting experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic coordinates, bibliographic citations, primary and secondary structure information, crystallographic structure factors, and NMR experimental data.
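As an illustration of the atomic coordinates mentioned above, the following is a minimal sketch of reading one coordinate record. It assumes the fixed-column ATOM/HETATM layout of the wwPDB PDB file format (version 3.3), which this record does not itself describe. The names `Atom` and `parse_atom_line` are hypothetical, and the sample line contains made-up values rather than data from a real entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Atom:
    serial: int      # atom serial number (columns 7-11)
    name: str        # atom name (columns 13-16)
    res_name: str    # residue name (columns 18-20)
    chain_id: str    # chain identifier (column 22)
    res_seq: int     # residue sequence number (columns 23-26)
    x: float         # orthogonal coordinates in angstroms
    y: float         # (columns 31-38, 39-46, 47-54)
    z: float


def parse_atom_line(line: str) -> Optional[Atom]:
    """Parse one ATOM/HETATM record; return None for any other record type."""
    if line[0:6] not in ("ATOM  ", "HETATM"):
        return None
    # Fields are fixed-width, so slice by column instead of splitting on spaces.
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Illustrative record with made-up values, not taken from a real PDB entry.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67"
```

Calling `parse_atom_line(SAMPLE)` returns an `Atom` for the backbone nitrogen of residue 1 in chain A. Column slicing is used because adjacent fields in this format may touch with no separating whitespace, so splitting on spaces is not reliable.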

How to cite this record: PDB; Protein Data Bank Format; DOI: 10.25504/FAIRsharing.9y4cqw; Last edited: Oct. 23, 2020, 10:35 p.m.; Last accessed: Jan 24 2022 11:12 a.m.

Record updated: Oct. 23, 2020, 10:35 p.m. by The FAIRsharing Team.

Additional Information


    No tools defined


Access / Retrieve Data

Conditions of Use


Announcing the worldwide Protein Data Bank.

Berman H., Henrick K., Nakamura H.
Nat. Struct. Biol. 2003

Related Standards

Terminology Artifacts

No semantic standards defined

Identifier Schemas

No identifier schema standards defined


No metrics standards defined

Related Databases (65)
CAPS-DB : a structural classification of helix-capping motifs
CAPS-DB is a structural classification of helix-cappings or caps compiled from protein structures. Caps extracted from protein structures have been structurally classified based on geometry and conformation and organized in a tree-like hierarchical classification where the different levels correspond to different properties of the caps.

Database of Aligned Ribosomal Complexes
The Database for Aligned Ribosomal Complexes (DARC) site provides a resource for directly comparing the structures of available ribosomal complexes.

EcoliWiki: A Wiki-based community resource for Escherichia coli
EcoliWiki is a community-based resource for the annotation of all non-pathogenic E. coli, its phages, plasmids, and mobile genetic elements.

FunTree: A Resource For Exploring The Functional Evolution Of Structurally Defined Enzyme Superfamilies
A resource for exploring the evolution of protein function through relationships in sequence, structure, phylogeny and function.

Database of interaction hot spots across the proteome. Hot spots are energetically important residues at protein interfaces; they are not randomly distributed across the interface but clustered, and these clusters form hot regions. HotRegion provides information on these interfaces using predicted hot spot residues and structural properties of the interface residues, such as pair potentials, accessible surface area (ASA) and relative ASA values in both the monomer and complex forms of the proteins. A 3D visualization of the interface and of the interactions among hot spot residues is also provided.

InterEvol database : Diving into the structure and evolution of protein complex interfaces
Evolution of protein-protein interfaces. InterEvol is a resource for researchers to investigate the structural interaction of protein molecules and sequences using a variety of tools and resources.

MINAS - A Database of Metal Ions in Nucleic AcidS
MINAS contains the exact geometric information on the first and second-shell coordinating ligands of every metal ion present in nucleic acid structures that are deposited in the PDB and NDB. Containing also the sequence information of the binding pocket-proximal nucleotides, this database allows for a detailed search of all combinations of potential ligands and of coordination environments of metal ions. MINAS is therefore a perfect new tool to classify metal ion binding pockets in nucleic acids by statistics and to draw general conclusions about the different coordination properties of these ions. This record has been marked as Uncertain because the homepage for this resource is no longer active, and we have not been able to get in touch with the owners of the resource. Please contact us if you have any information regarding MINAS.

Major Intrinsic Proteins Modification Database
This is a database of comparative protein structure models of the MIP (Major Intrinsic Protein) family of proteins. The MIPs have been identified from the completed genome sequence of organisms available at NCBI.

Prokaryotic Glycoproteins Database
ProGlycProt (Prokaryotic Glycoproteins) is a manually curated, comprehensive repository of experimentally characterized eubacterial and archaeal glycoproteins, generated from an exhaustive literature search. This is the focused beginning of an effort to provide concise relevant information derived from rapidly expanding literature on prokaryotic glycoproteins, their glycosylating enzyme(s), glycosylation linked genes, and genomic context thereof, in a cross-referenced manner.

Protein-Chemical Structural Interactions
Protein-Chemical Structural Interactions provides information on the 3-dimensional chemical structures of protein interactions with low molecular weight.

Protein Structure Change Database
The Protein Structural Change DataBase (PSCDB) presents the structural changes found in proteins, represented by pairs of ligand-free and ligand-bound structures of identical proteins, and links these changes to ligand-binding.

SNPeffect is a database for phenotyping human single nucleotide polymorphisms (SNPs). SNPeffect primarily focuses on the molecular characterization and annotation of disease and polymorphism variants in the human proteome. Further, SNPeffect holds per-variant annotations on functional sites, structural features and post-translational modification.

Statistical Torsional Angles Potentials of NMR Refinement Database
The STAP database contains refined versions of the NMR structures deposited in the PDB. These refinements were performed using a statistical torsion angle potential and structurally or experimentally derived distance potentials. The refined structures have a significantly improved structural quality compared to their initial NMR structures.

Compilation and Creation of datasets from PDB
ccPDB (Compilation and Creation of datasets from PDB) is a collection of commonly used data sets for structural or functional annotation of proteins. There are numerous datasets from the literature and the Protein Data Bank (PDB), which were used for developing methods to annotate proteins at the sequence (or residue) level. A tool is available for creating a wide range of customized data sets from PDB.

Death Domain Database
Death Domain Database is a manually curated database of protein-protein interactions for Death Domain Superfamily.

Indel Flanking Region Database
Indel Flanking Region Database is an online resource for indels (insertion/deletions) and the flanking regions of proteins in SCOP superfamilies. It aims at providing a comprehensive dataset for analyzing the qualities of amino acid indels, substitutions and the relationship between them.

Validated NMR structures of proteins and nucleic acids.

Pocket Similarity Search using Multiple-Sketches
POcket Similarity Search Using Multiple-Sketches (PoSSuM) includes all the discovered protein-small molecule binding site pairs with annotations of various types (e.g., UniProt, CATH, SCOP, SCOPe, EC number and Gene ontology). PoSSuM enables rapid exploration of similar binding sites among structures with different global folds as well as similar folds. Moreover, PoSSuM is useful for predicting the binding ligand for unbound structures.

SitEx database of eukaryotic protein functional sites
SitEx is a database containing information on eukaryotic protein functional sites. It stores the amino acid sequence positions in the functional site in relation to the exon structure of the encoding gene. This can be used to detect the exons involved in shuffling in protein evolution, or to design protein-engineering experiments.

PASS2 contains alignments of structural motifs of protein superfamilies. PASS2 is an automatic version of the original superfamily alignment database, CAMPASS (CAMbridge database of Protein Alignments organised as Structural Superfamilies). PASS2 contains alignments of protein structures at the superfamily level and is in direct correspondence with SCOPe 2.04 release.

Drug-related information: medical indications, adverse drug effects, drug metabolism and Gene Ontology terms of the target proteins.

Virus Pathogen Database and Analysis Resource
The Virus Pathogen Database and Analysis Resource (ViPR) is an integrated repository of data and analysis tools for multiple virus families, supported by the National Institute of Allergy and Infectious Diseases (NIAID) Bioinformatics Resource Centers (BRC) program. ViPR captures various types of information, including sequence records, gene and protein annotations, 3D protein structures, immune epitope locations, clinical and surveillance metadata and novel data derived from comparative genomics analysis. The database is available without charge as a service to the virology research community to help facilitate the development of diagnostics, prophylactics and therapeutics for priority pathogens and other viruses.

Influenza Research Database
The Influenza Research Database (IRD) is a free, open, publicly-accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization, and comparative genomics analysis, together with personal login-protected ‘workbench’ spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature.

Telomerase Database
The Telomerase Database is a Web-based tool for the study of structure, function, and evolution of the telomerase ribonucleoprotein. The objective of this database is to serve the research community by providing a comprehensive compilation of information known about the telomerase enzyme and its substrate, telomeres.

Functional Coverage of the Proteome
FCP is a publicly accessible web tool dedicated to analysing the current state and trends on the population of available structures along the classification schemes of enzymes and nuclear receptors, offering both graphical and quantitative data on the degree of functional coverage in that portion of the proteome by existing structures, as well as on the bias observed in the distribution of those structures among proteins.

Evolutionary Trace
Relative evolutionary importance of amino acids within a protein sequence.

SURFACE is a database containing the results of a large-scale protein annotation and local structural comparison project. The homepage of the resource has not been updated since 2003 (and the maintainer's website since 2010). Until we have confirmation of the status of this project, we have classified it as uncertain.

Information system for G protein-coupled receptors
The GPCRDB is a molecular-class information system that collects, combines, validates and stores large amounts of heterogeneous data on G protein-coupled receptors (GPCRs). The GPCRDB contains data on sequences, ligand binding constants and mutations. In addition, many different types of computationally derived data are stored, such as multiple sequence alignments and homology models.

Chemical Component Dictionary
The Chemical Component Dictionary is an external reference file describing all residue and small molecule components found in Protein Data Bank entries. It contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands, and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, aromatic bond assignments, idealized coordinates, chemical descriptors (SMILES & InChI), and systematic chemical names.

Catalytic Site Atlas
The Catalytic Site Atlas (CSA) is a database documenting enzyme active sites and catalytic residues in enzymes of 3D structure. It uses a defined classification for catalytic residues which includes only those residues thought to be directly involved in some aspect of the reaction catalysed by an enzyme.

Protein Data Bank in Europe
The Protein Data Bank in Europe (PDBe) is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures. It is a founding member of the worldwide Protein Data Bank which collects, organises and disseminates data on biological macromolecular structures.

PDBsum; at-a-glance overview of macromolecular structures
PDBsum provides an overview of every macromolecular structure deposited in the Protein Data Bank (PDB), giving schematic diagrams of the molecules in each structure and of the interactions between them.

Protein Data Bank: Proteins, Interfaces, Structures and Assemblies
The Protein Quaternary Structure file server (PDBePISA) is an internet resource that makes available coordinates for likely quaternary states for structures contained in the Brookhaven Protein Data Bank (PDB) that were determined by X-ray crystallography.

PROCOGNATE is a database of cognate ligands for the domains of enzyme structures in CATH, SCOP and Pfam. The database contains an assignment of PDB ligands to the domains of structures as classified by the CATH, SCOP and Pfam databases. Cognate ligands have been identified using data from the ENZYME and KEGG databases and compared to the PDB ligand using graph matching to assess chemical similarity. Cognate ligands from the known reactions in ENZYME and KEGG for a particular enzyme are then assigned to enzyme structures which have EC numbers.

Protein Classification Benchmark Collection
The Protein Classification Benchmark Collection was created in order to create standard datasets on which the performance of machine learning methods can be compared.

ArchDB is a compilation of structural classifications of loops extracted from known protein structures. The structural classification is based on the geometry and conformation of the loop. The geometry is defined by four internal variables and the type of regular flanking secondary structures, resulting in 10 different loop types. Loops in ArchDB have been classified using an improved version (Espadaler et al.) of the original ArchType program published in 1997 by Oliva et al.

PSIbase is a molecular interaction database based on PSIMAP (PDB, SCOP) that focuses on structural interaction of proteins and their domains. This resource has been marked as Uncertain because its project home can no longer be found. Please get in touch if you have any information about this resource.

Protein Model Database
The Protein Model DataBase (PMDB) is a database that stores three dimensional protein models obtained by structure prediction techniques.

SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes.

Structural Classification Of Proteins
The SCOP database is curated both manually and with the use of automated tools. This freely available resource aims to provide a comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.

Molecular Modeling Database
The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more.

Transporter Classification Database
This freely accessible database details a comprehensive IUBMB-approved classification system for membrane transport proteins known as the Transporter Classification (TC) system. The TC system is analogous to the Enzyme Commission (EC) system for classification of enzymes, except that it incorporates both functional and phylogenetic information for organisms of all types. As of April 1, 2021, TCDB consists of 21,114 proteins classified in 16,558 non-redundant transport systems with 1,605 tabulated 3D structures, 19,196 reference citations describing 1,586 transporter families, of which 26% are members of 83 recognized superfamilies. Overall, this is an increase of over 50% since the last published update of the database in 2016. The most recent updates to the database contents and features include (1) adoption of a chemical ontology for substrates of transporters, (2) inclusion of new superfamilies, (3) a domain-based characterization of transporter families (tcDoms) for the identification of new members as well as functional and evolutionary relationships between families, (4) development of novel software to facilitate curation and use of the database, (5) addition of new subclasses of transport systems including 11 novel types of channels and 3 types of group translocators, and (6) the inclusion of many man-made (artificial) transmembrane pores/channels and carriers.

Protein Data Bank Japan
The Protein Data Bank is the single worldwide archive of structural data of biological macromolecules.

Ligand Expo
Ligand Expo is a data resource for finding information about small molecules bound to proteins and nucleic acids. Tools are provided to search the PDB dictionary for chemical components, to identify structure entries containing particular small molecules, and to download the 3D structures of the small molecule components in the PDB entry.

RCSB Protein Data Bank
This resource is powered by the Protein Data Bank archive: information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps students and researchers understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. As a member of the wwPDB, the RCSB PDB curates and annotates PDB data. The RCSB PDB builds upon the data by creating tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond.

TargetTrack, a target registration database, provides information on the experimental progress and status of targets selected for structure determination.

Nucleic Acids Database
The Nucleic Acids Database contains information about experimentally-determined nucleic acids and complex assemblies. NDB can be used to perform searches based on annotations relating to sequence, structure and function, and to download, analyze, and learn about nucleic acids.

Sanger Pfam Mirror
The Pfam database contains information about protein domains and families. For each entry a protein sequence alignment and a Hidden Markov Model is stored.

3D interacting domains
The database of 3D Interaction Domains (3did) is a collection of domain-domain interactions in proteins for which high-resolution three-dimensional structures are known. 3did exploits structural information to provide critical molecular details necessary for understanding how interactions occur.

SWISS-MODEL Repository of 3D protein structure models
The SWISS-MODEL Repository is a database of annotated 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline for protein sequences of selected model organisms.

The CATH database is a free, publicly available online resource that provides information on the evolutionary relationships of protein domains. It provides a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in this hierarchy; Class (secondary structure classification, e.g. mostly alpha), Architecture (classification based on overall shape), Topology (fold family) and Homologous superfamily (protein domains which are thought to share a common ancestor).

Biological Magnetic Resonance Databank
BMRB collects, annotates, archives, and disseminates (worldwide in the public domain) the important spectral and quantitative data derived from NMR spectroscopic investigations of biological macromolecules and metabolites. The goal is to empower scientists in their analysis of the structure, dynamics, and chemistry of biological systems and to support further development of the field of biomolecular NMR spectroscopy.

Electron Microscopy Data Bank
Cryo-electron microscopy reconstruction methods are uniquely able to reveal structures of many important macromolecules and macromolecular complexes. The Electron Microscopy Data Bank (EMDB) is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, and electron (2D) crystallography. The EMDB was founded at EBI in 2002, under the leadership of Kim Henrick. Since 2007 it has been operated jointly by the PDBe and the Research Collaboratory for Structural Bioinformatics (RCSB PDB) as a part of EMDataBank, which is funded by a joint NIH grant to PDBe, the RCSB and the National Center for Macromolecular Imaging (NCMI).

KnotProt: A database of proteins with knots and slipknots
KnotProt collects information about proteins with knots or slipknots. The knotting complexity of proteins is presented in the form of a matrix diagram that shows users the knot type of the entire polypeptide chain and of each of its subchains. The database presents extensive information about the biological function of proteins with non-trivial knotting and enables users to analyze new structures.

MobiDB is a database of intrinsically disordered regions (IDRs) and related features from various sources and prediction tools. Different levels of reliability and different features are reported as different and independent annotations. The database features three levels of annotation: manually curated, indirect and predicted. MobiDB annotates the binding modes of disordered proteins, whether they undergo disorder-to-order transitions or remain disordered in the bound state. In addition, disordered regions undergoing liquid-liquid phase separation or post-translational modifications are defined.

RepeatsDB is a database of annotated tandem repeat protein structures. Tandem repeats pose a difficult problem for the analysis of protein structures, as the underlying sequence can be highly degenerate. Several repeat types have been studied over the years, but their annotation was done on a case-by-case basis, thus making large-scale analysis difficult. We developed RepeatsDB to fill this gap. Using state-of-the-art repeat detection methods and manual curation, we systematically annotated the Protein Data Bank, predicting 10 745 repeat structures. In all, 2797 structures were classified according to a recently proposed classification schema, which was expanded to accommodate new findings. In addition, detailed annotations were performed in a subset of 321 proteins. These annotations feature information on start and end positions for the repeat regions and units. RepeatsDB is an ongoing effort to systematically classify and annotate structural protein repeats in a consistent way. It provides users with the possibility to access and download high-quality datasets either interactively or programmatically through web services.

Worldwide Protein Data Bank
The Worldwide PDB (wwPDB) organization manages the PDB archive and ensures that the PDB is freely and publicly available to the global community. The mission of the wwPDB is to maintain a single Protein Data Bank Archive of macromolecular structural data that is freely and publicly available to the global community. The wwPDB is composed of the RCSB PDB, PDBe, PDBj and BMRB.

Model Archive
The Model Archive provides a stable archive for computational macro-molecular models published in the scientific literature. The model archive provides a unique stable accession code (DOI) for each deposited model, which can be directly referenced in the corresponding manuscripts.

LinkProt: A database of proteins with topological links
LinkProt collects information about protein chains and complexes that form links. LinkProt detects deterministic links (with loops closed by cysteine), and determines the likelihood of formation of links in networks of protein chains called MacroLinks. Links are presented graphically in an intuitive way, using tools that involve surfaces of minimal area spanned on closed loops. The database presents extensive information about biological functions of proteins with links and enables users to analyze new structures.

Small angle scattering biological data bank
Curated repository for small angle scattering data and models. SASBDB contains X-ray (SAXS) and neutron (SANS) scattering data from biological macromolecules in solution.

WALTZ-DB 2.0 is a database for characterizing short peptides for their amyloid fiber-forming capacities. The majority of the data comes from electron microscopy, FTIR and Thioflavin-T experiments done by the Switch lab. Apart from that class of data we also provide the amyloid annotation for several other short peptides found in current scientific research papers. Structural models of the potential amyloid cores are provided for every peptide entry.

T-psi-C is a database of tRNA sequences and 3D tRNA structures. The T-psi-C database can be continuously updated by any member of the scientific community.

Virus Particle Explorer
VIPERdb is a database for icosahedral virus capsid structures. The emphasis is on providing data from structural and computational analyses on these systems, as well as high quality renderings for visual exploration.

Kinase-Ligand Interaction Fingerprints and Structures database
Kinase-Ligand Interaction Fingerprints and Structures database (KLIFS) is a database that revolves around the protein structure of catalytic kinase domains and the way kinase inhibitors can interact with them. Based on the underlying systematic and consistent protocol all (currently human and mouse) kinase structures and the binding mode of kinase ligands can be directly compared to each other. Moreover, because of the classification of an all-encompassing binding site of 85 residues it is possible to compare the interaction patterns of kinase-inhibitors to each other to, for example, identify crucial interactions determining kinase-inhibitor selectivity.

PDB-REDO is a databank of optimised (re-refined, rebuilt and validated) Protein Data Bank entries. It covers nearly all structure models derived from X-ray and electron diffraction that have deposited experimental data. Entries can be accessed by their PDB identifier.

Implementing Policies

This record is not implemented by any policy.


Record Maintainer

  • This record is in need of a maintainer.