How to cite this record: FAIRsharing.org: NCBITAXON; NCBI Taxonomy; DOI: https://doi.org/10.25504/FAIRsharing.fj07xj; Last edited: Feb. 22, 2018, 2:02 p.m.; Last accessed: Apr 19 2018 4:14 p.m.
|online documentation||http://purl.bioontology.org/ontology/NCB ...|
This computational biology resource mainly focuses on annotation and detection of eukaryotic linear motifs (ELMs) by providing both a repository of annotated motif data and an exploratory tool for motif prediction. ELMs, or short linear motifs (SLiMs), are compact protein interaction sites composed of short stretches of adjacent amino acids.
Pfam Protein Families
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Proteins are generally composed of one or more functional regions, commonly termed domains. Different combinations of domains give rise to the diverse range of proteins found in nature. The identification of domains that occur within proteins can therefore provide insights into their function. Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pfam entries which are related by similarity of sequence, structure or profile-HMM.
This resource clusters UniProt protein sequences into hierarchical trees. It allows the sub-families and super-families of a protein to be studied using UniRef50 clusters.
Comparative Toxicogenomics Database
The Comparative Toxicogenomics Database (CTD) advances understanding of the effects of environmental chemicals on human health. Biocurators manually curate chemical-gene, chemical-disease, and gene-disease relationships from the scientific literature. These core data are then internally integrated to generate inferred chemical-gene-disease networks. Additionally, the core data are integrated with external data sets (such as Gene Ontology and pathway annotations) to predict many novel associations between different data types. A unique and powerful feature of CTD is the set of inferred relationships generated by data integration, which help turn knowledge into discoveries by identifying novel connections between chemicals, genes, diseases, pathways, and GO annotations that might not otherwise be apparent using other biological resources.
ArchDB is a compilation of structural classifications of loops extracted from known protein structures. The structural classification is based on the geometry and conformation of the loop. The geometry is defined by four internal variables and the type of regular flanking secondary structures, resulting in 10 different loop types. Loops in ArchDB have been classified using an improved version (Espadaler et al.) of the original ArchType program published in 1997 by Oliva et al.
A CLAssification of Mobile genetic Elements
ACLAME is a database dedicated to the collection and classification of mobile genetic elements (MGEs) from various sources, comprising all known phage genomes, plasmids and transposons.
Giga Science Database
GigaDB primarily serves as a repository to host data and tools associated with articles in GigaScience; however, it also includes a subset of datasets that are not associated with GigaScience articles. GigaDB defines a dataset as a group of files (e.g., sequencing data, analyses, imaging files, software programs) that are related to and support an article or study.
UniCarbKB is an initiative that aims to promote the creation of an online information storage and search platform for glycomics and glycobiology research. The knowledgebase will offer a freely accessible and information-rich resource supported by querying interfaces, annotation technologies and the adoption of common standards to integrate structural, experimental and functional data.
probeBase is a manually maintained and curated database of rRNA-targeted oligonucleotide probes and primers. Contextual information and multiple options for evaluating in silico hybridization performance against the most recent rRNA sequence databases are provided for each oligonucleotide entry, which makes probeBase an important and frequently used resource for microbiology research and diagnostics. The major features of probeBase include a classification of probes and primers according to the NCBI taxonomy database and a powerful, customizable search function for querying target organisms, probe names, primers, target sites, and references. The probeBase match tool can be used to match near-full length rRNA sequences against probeBase and find all published probes targeting the query sequences. The new proxy match tool extends this analysis to partial rRNA sequences by exploiting full-length sequences in the rRNA sequence database SILVA to find published probes potentially targeting partial query sequences. A tool for submitting new or missing probe sequences or references helps to keep probeBase up-to-date.
The FAIRDOMHub is a publicly available resource built using the SEEK software, which enables collaborations within the scientific community. FAIRDOM will establish a support and service network for European Systems Biology. It will serve projects in standardizing, managing and disseminating data and models in a FAIR manner: Findable, Accessible, Interoperable and Reusable. FAIRDOM is an initiative to develop a community and establish an internationally sustained Data and Model Management service for the European Systems Biology community. FAIRDOM is a joint action of ERA-Net EraSysAPP and European Research Infrastructure ISBE.
The ENCODE (Encyclopedia of DNA Elements) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active. ENCODE results from 2007 and later are available from this project. This covers data generated during the two production phases 2007-2012 and 2013-present.
Microenvironment Perturbagen LINCS Center image server
The MEP LINCS project contributes to the development of the NIH Library of Integrated Network-based Cellular Signatures (LINCS) program by developing a dataset and computational strategy to elucidate how microenvironment (ME) signals affect cell intrinsic intracellular transcriptional- and protein-defined molecular networks to generate experimentally observable cellular phenotypes measured by high-content imaging.
GrainGenes, a Database for Triticeae and Avena
The GrainGenes website hosts a wealth of information for researchers working on Triticeae species, oat and their wild relatives. The website hosts a database encompassing information such as genetic maps, genes, alleles, genetic markers, phenotypic data, quantitative trait loci studies, experimental protocols and publications. The database can be queried by text searches, browsing, Boolean queries, MySQL commands, or by using pre-made queries created by the curators. GrainGenes is not solely a database; it also serves as an informative site for researchers, a means to communicate project aims and outcomes, and a forum for discussion.
MorphoBank is a web application providing an online database and workspace for evolutionary research in systematics (the science of determining the evolutionary relationships among species). MorphoBank invites scientists producing peer-reviewed research to upload images and affiliate data with those images (labels, species names, etc.). MorphoBank also offers a platform for live collaboration on phylogenetic matrices by teams in a private workspace where they can also affiliate images with phylogenetic matrices. MorphoBank stores digital versions of both text and image-based observations on phenotypes. Supported content includes phylogenetic matrices (Nexus or TNT format), particularly phenotypical matrices, 2D (including JPEG, GIF, PNG, TIFF and Photoshop) and 3D (PLY, STL, ZIP, TIFF and DCM) image data, and video (MPEG-4, QuickTime and WindowsMedia). MorphoBank also offers a Documents folder where researchers can store additional files about their research, such as PDFs, Word documents, and text files (e.g., morphometric data, phylogenetic trees).
The Ensembl genome annotation system, developed jointly by the EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented by the creation of five new sites, for bacteria, protists, fungi, plants and invertebrate metazoa, enabling users to use a single collection of (interactive and programmatic) interfaces for accessing and comparing genome-scale data from species of scientific interest from across the taxonomy.
Over 30,000 genome sequences from bacteria and archaea have been annotated and deposited in the public archives of the members of the International Nucleotide Sequence Database Collaboration. This site provides access to complete, annotated genomes from bacteria and archaea (present in the European Nucleotide Archive) through the Ensembl graphical user interface (genome browser).
From release 27 onwards, all protist genomes whose sequence and annotation have been completed and submitted to the International Nucleotide Sequence Database Collaboration (i.e. the ENA, GenBank and DDBJ databases) are available in Ensembl Protists. The release now consists of a total of over 150 genomes, of which over 100 have been taken directly from the INSDC archives and the remainder taken from other sources. The new genomes have been functionally annotated with InterPro entries and GO terms using InterPro v53.
A new genome assembly of Triticum aestivum cv. Chinese Spring is now available in Ensembl Plants. The assembly (TGACv1) and its accompanying annotation were produced by the Earlham Institute, formerly The Genome Analysis Centre (TGAC), as part of the Triticeae Genomics for Sustainable Agriculture project.
From release 28 onwards, all fungal genomes whose sequence and annotation have been completed and submitted to the International Nucleotide Sequence Database Collaboration (i.e. the ENA, GenBank and DDBJ databases) are available in Ensembl Fungi. The release now consists of a total of 589 genomes, of which 536 have been taken from the archives and 53 taken directly from other sources.
This site provides access to complete, annotated genomes from metazoa through the Ensembl graphical user interface (genome browser).
Hardwood Genomics Project
The Hardwood Genomics Project is a database for expressed genes, genetic markers, genetic linkage maps, and reference populations. It provides lasting genomic and biological resources for the discovery and conservation of genes in hardwood trees for growth, adaptation and responses to environmental stresses such as drought, heat, insect pests and disease. All original sequence data are being deposited in NCBI's Sequence Read Archive, and the genetic linkage maps and associated marker data will be available at the Dendrome database.
Visual Database for Organelle Genome
VDOG, the Visual Database for Organelle Genome, is a database of genome information for organelles. Most of the data in VDOG were originally extracted from GenBank, then re-organized and re-presented.
The Project Tycho® database aims to advance the availability and use of public health data for science and policy. We do this by acquiring new data, by building infrastructure for data standardization, integration, quality control, and redistribution, by developing innovative analytics, and by advocacy. Project Tycho contains a complete digitization of the entire history of weekly National Notifiable Disease Surveillance System (NNDSS) reports for the United States (1888-2013) in a database in computable format (Level 3 data). We have standardized a major part of these data for online access (Level 2 data). A subset of the U.S. data was cleaned further and used for a study on the impact of vaccination programs in the United States that was published in the NEJM (Level 1 data).
A resource providing data on bioentities and their associated ontology terms for Plant Biology. The database provides access to ontology-based annotations of genes, phenotypes and germplasms from about 90 plant species. A number of internal and external ontologies are used to annotate the biological data available from this resource.
The Open Biological and Biomedical Ontology (OBO) Foundry is a collective of ontology developers that are committed to collaboration and adherence to shared principles. The mission of the OBO Foundry is to develop a family of interoperable ontologies that are both logically well-formed and scientifically accurate. To achieve this, OBO Foundry participants voluntarily adhere to and contribute to the development of an evolving set of principles including open use, collaborative development, non-overlapping and strictly-scoped content, and common syntax and relations, based on ontology models that work well, such as the Gene Ontology (GO). The OBO Foundry is overseen by an Operations Committee with Editorial, Technical and Outreach working groups.
This record is maintained by schoch2