

Gene Ontology (GO) Gene Association File Format 2.1

Abbreviation: GAF 2.1


General Information
Ontology-based annotation data can be submitted to the Gene Ontology Consortium and other projects (e.g. the Planteome project) in the Gene Association File Format 2.1 (GAF 2.1). This annotation flat file format consists of 17 tab-delimited fields. GAF 2.1 is preferred over the GAF 2.0 format.
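The 17-field, tab-delimited layout described above can be read with a few lines of code. The following is a minimal sketch, not an official parser. It assumes the column order given in the GO Consortium's GAF 2.1 specification and assumes that lines beginning with "!" are header or comment lines. The snake_case key names, the parse_gaf helper, and the sample record (ExampleDB, EX:0001, the PMID placeholder) are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterable, Iterator

# Illustrative keys for the 17 GAF 2.1 columns, in file order.
# The spellings are this sketch's own; the order is assumed from the GAF 2.1 spec.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_or_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def parse_gaf(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per annotation line, skipping blank and '!' lines."""
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue  # header/comment lines (assumed to start with '!')
        fields = line.split("\t")
        if len(fields) != len(GAF_COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(GAF_COLUMNS)} "
                f"tab-delimited fields, found {len(fields)}"
            )
        yield dict(zip(GAF_COLUMNS, fields))


# Hypothetical sample input: a version header plus one made-up annotation.
sample = [
    "!gaf-version: 2.1",
    "\t".join([
        "ExampleDB", "EX:0001", "exg1", "", "GO:0008150", "PMID:0000000",
        "IDA", "", "P", "example gene 1", "", "protein", "taxon:9606",
        "20190408", "ExampleDB", "", "",
    ]),
]

records = list(parse_gaf(sample))
print(records[0]["db_object_symbol"], records[0]["go_id"], records[0]["aspect"])
```

Empty optional columns are kept as empty strings so that every record always has all 17 keys. A stricter reader would also validate individual field contents, which this sketch does not attempt.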



Awaiting DOI assignment.


This record is maintained by suzi.

Record added: April 8, 2019, 2:41 p.m.
Record updated: April 9, 2019, 11:13 a.m. by suzi.




Schemas

No XSD schemas defined


Access / Retrieve Data

Conditions of Use

Applies to: Data use




Publications

The Gene Ontology Resource: 20 years and still GOing strong.

The Gene Ontology Consortium
Nucleic Acids Res 2018


Related Standards

Reporting Guidelines

No guidelines defined

Terminology Artifacts

No semantic standards defined

Identifier Schemas

No identifier schema standards defined

Metrics

No metrics standards defined


Implementing Databases (18)
Aspergillus Genome Database
The Aspergillus Genome Database is a resource for genomic sequence data as well as gene and protein information for Aspergilli. This publicly available repository is a central point of access to genome, transcriptome and polymorphism data for the fungal research community.

FlyBase
Genetic, genomic and molecular information pertaining to the model organism Drosophila melanogaster and related sequences. This database also contains information relating to human disease models in Drosophila, the use of transgenic constructs containing sequence from other organisms in Drosophila, and information on where to buy Drosophila strains and constructs.

GeneDB
GeneDB is a genome database for prokaryotic and eukaryotic organisms and provides a portal through which data generated by the "Pathogen Genomics" group at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be accessed.

Saccharomyces Genome Database
The Saccharomyces Genome Database (SGD) collects and organizes information about the molecular biology and genetics of the yeast Saccharomyces cerevisiae. SGD contains a variety of biological information and tools with which to search and analyze it.

The Arabidopsis Information Resource
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana.

PomBase
PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as well as providing structural and functional annotation and access to large-scale data sets.

WormBase
WormBase is an international consortium of biologists and computer scientists dedicated to providing the research community with accurate, current, accessible information concerning the genetics, genomics and biology of C. elegans and related nematodes.

Candida Genome Database
The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology terms to describe the molecular function, biological process, and subcellular localization of gene products.

Gramene, a comparative mapping resource for grains
Gramene's purpose is to provide added value to data sets available within the public sector, which will facilitate researchers' ability to understand the grass genomes and take advantage of genomic sequence known in one species for identifying and understanding corresponding genes, pathways and phenotypes in other grass species.

Sol Genomics Network
The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the Solanaceae family, which includes species such as tomato, potato, pepper, petunia and eggplant.

Gene Ontology Annotation
The GOA (Gene Ontology Annotation) project provides high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB) and International Protein Index (IPI). This involves electronic annotation and the integration of high-quality manual GO annotation from all GO Consortium model organism groups and specialist groups.

Reactome - a curated knowledgebase of biological pathways
The cornerstone of Reactome is a freely available, open source relational database of signaling and metabolic molecules and their relations organized into biological pathways and processes. The core unit of the Reactome data model is the reaction. Entities (nucleic acids, proteins, complexes, vaccines, anti-cancer therapeutics and small molecules) participating in reactions form a network of biological interactions and are grouped into pathways. Examples of biological pathways in Reactome include classical intermediary metabolism, signaling, transcriptional regulation, apoptosis and disease. Inferred orthologous reactions are available for 17 non-human species including mouse, rat, chicken, puffer fish, worm, fly, yeast, rice, and Arabidopsis.

Rat Genome Database
The Rat Genome Database is the premier site for genetic, genomic, phenotype, and disease data generated from rat research. It provides easy access to corresponding human and mouse data for cross-species comparison, and its comprehensive data and innovative software tools make it a valuable resource for researchers worldwide.

Mouse Genome Database - a Mouse Genome Informatics (MGI) Resource
MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease. Data includes gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data.

EcoCyc E. coli Database
EcoCyc is a model organism database built on the genome sequence of Escherichia coli K-12 MG1655. Expert manual curation of the functions of individual E. coli gene products in EcoCyc has been based on information found in the experimental literature for E. coli K-12-derived strains. Updates to EcoCyc content continue to improve the comprehensive picture of E. coli biology. The utility of EcoCyc is enhanced by new tools available on the EcoCyc web site, and the development of EcoCyc as a teaching tool is increasing the impact of the knowledge collected in EcoCyc.

The Zebrafish Information Network
The Zebrafish Information Network, ZFIN, serves as the primary community database resource for the laboratory use of zebrafish. We develop and support integrated zebrafish genetic, genomic, developmental and physiological information and link this information extensively to corresponding data in other model organism and human databases.

dictyBase
dictyBase is a single-access database for the complete genome sequence and expression data of four Dictyostelid species providing information on research, genome and annotations. There is also a repository of plasmids and strains held at the Dicty Stock Centre. Relevant literature is integrated into the database, and gene models and functional annotation are manually curated from experimental results and comparative multigenome analyses.

Planteome
A resource providing data on bioentities and their associated ontology terms for Plant Biology. The database provides access to ontology-based annotations of genes, phenotypes and germplasms from about 90 plant species. A number of internal and external ontologies are used to annotate the biological data available from this resource.



Implementing Policies

This record is not implemented by any policy.

