Why are these fly and human genes considered "centrosomal"? (evidence)
In this new release, a total of 773 human genes and 348 fly genes were considered to be centrosomal based on different lines of evidence. For example, in the human case, a total of 108 genes (some of the encoded protein isoforms) were identified by Andersen et al. (Nature, 2003) using high-throughput proteomics analysis. Other genes were obtained from the MiCroKit database. Additional genes were included on the basis of their Gene Ontology annotations in public databases. Gene Ontology terms considered diagnostic of centrosomal localization include Cellular Component terms such as Centrosome or Spindle Pole, as well as Biological Process terms that are clearly related to the centrosome (e.g. centrosome cycle, centrosome duplication, centrosome separation, centrosome localization, mitotic centrosome separation, and centrosome organization and biogenesis). Moreover, annotations from the HPRD database and FlyBase were also considered, which allowed us to include genes that were not supported by any of the other kinds of evidence. Finally, human orthologs of mouse genes annotated with Gene Ontology terms associated with the centrosome were also considered to be part of the centrosome.
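The Gene Ontology criterion described above can be illustrated with a minimal sketch. It assumes annotations are available as a mapping from gene symbol to GO term names. The function name, the toy gene symbols and the matching by lowercased term name are assumptions made for illustration, not the procedure actually used to build the database.

```python
# GO terms named in the text as diagnostic of centrosomal localization.
# Matching is done on lowercased term names here; this is an assumption,
# a real pipeline would more likely match on GO identifiers.
DIAGNOSTIC_TERMS = {
    "centrosome",
    "spindle pole",
    "centrosome cycle",
    "centrosome duplication",
    "centrosome separation",
    "centrosome localization",
    "mitotic centrosome separation",
    "centrosome organization and biogenesis",
}


def centrosomal_by_go(annotations):
    """Return the set of genes carrying at least one diagnostic GO term.

    `annotations` maps a gene symbol to an iterable of GO term names.
    """
    selected = set()
    for gene, terms in annotations.items():
        normalized = {term.strip().lower() for term in terms}
        if normalized & DIAGNOSTIC_TERMS:
            selected.add(gene)
    return selected


# Hypothetical toy annotations, for illustration only.
example = {
    "GENE_A": ["Centrosome", "nucleus"],
    "GENE_B": ["cytoplasm"],
    "GENE_C": ["mitotic centrosome separation"],
}
print(sorted(centrosomal_by_go(example)))  # ['GENE_A', 'GENE_C']
```

In this sketch a single diagnostic term is enough to select a gene, which mirrors the way each kind of evidence described above is sufficient on its own.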
Domains were assigned using the RPS-BLAST program (part of the BLAST package at the NCBI). Two kinds of domain databases were used to predict the presence of domains in centrosomal proteins: Pfam, which is based on curated profiles of known domains, and Superfamily, which is built on structural domains as defined in the SCOP database. In addition, the presence of coiled coils was predicted with the COILS program. Coiled-coil structures seem to be extremely important in the organization of the centrosome and are responsible for many of the underlying protein-protein interactions. Of the 5169 peptides encoded by the 773 human genes present in our database, 1676 contain at least one predicted coiled-coil structure, and usually many. In the case of the fly genes, 391 of 930 peptides contain at least one predicted coiled-coil structure.
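To put the coiled-coil counts quoted above in proportion, the following short sketch converts them into fractions of peptides. Only the four counts come from the text; the variable and function names are illustrative.

```python
# Peptides with at least one predicted coiled coil, and total peptides,
# as quoted in the text for each species.
counts = {
    "human": (1676, 5169),
    "fly": (391, 930),
}


def coiled_coil_fraction(with_coiled_coil, total):
    """Fraction of peptides with at least one predicted coiled coil."""
    return with_coiled_coil / total


for species, (with_cc, total) in counts.items():
    fraction = coiled_coil_fraction(with_cc, total)
    print(f"{species}: {with_cc}/{total} = {fraction:.1%}")
# human: 1676/5169 = 32.4%
# fly: 391/930 = 42.0%
```

Roughly a third of the human peptides and about two fifths of the fly peptides therefore carry at least one predicted coiled coil.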
The domain structure of a protein can be highly informative about its function. For this reason, special attention was paid to the representation of the domain organization of proteins. This graphical representation is displayed at every level of the database, including groups of alternative protein isoforms, groups of orthologs, families, and sets of interacting proteins. By default, protein domains are represented based on the Pfam domain assignments. However, the user can switch to the Superfamily database in order to explore an alternative perspective.
Orthology relationships and phylogenetic patterns
The evolutionary history of a gene can shed light on its function and/or biological relevance. For each of the centrosomal genes we identified the orthologs in a set of species ranging from yeast to higher eukaryotes. The Compara database from Ensembl was used for this purpose. Orthology information is provided for each fly and human centrosomal gene in a table that lists the orthologs in each of the species considered and the kind of orthology relationship (one2one, one2many, etc.). A colored phylogenetic pattern is displayed so that the species in which a given centrosomal gene is absent, present or duplicated can be identified easily. The domain structure of the orthologs is also depicted.
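The idea behind the phylogenetic pattern can be sketched as follows. It assumes the number of orthologs found in each species is already known. Treating zero orthologs as absent, one as present and more than one as duplicated is an assumption made for this sketch, as are the function name and the toy species list. The database may derive the pattern differently.

```python
def phylogenetic_pattern(ortholog_counts, species_order):
    """Classify a gene as absent, present or duplicated in each species.

    `ortholog_counts` maps a species name to the number of orthologs found
    there; species missing from the mapping are treated as having none.
    """
    pattern = {}
    for species in species_order:
        n = ortholog_counts.get(species, 0)
        if n == 0:
            state = "absent"
        elif n == 1:
            state = "present"
        else:
            state = "duplicated"
        pattern[species] = state
    return pattern


# Hypothetical ortholog counts for one gene, for illustration only.
species_order = ["yeast", "fly", "mouse", "human"]
counts = {"fly": 1, "mouse": 2, "human": 1}
for species, state in phylogenetic_pattern(counts, species_order).items():
    print(f"{species}: {state}")
```

Keeping the species in a fixed order means the patterns of different genes can be compared column by column.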
Considerations on isoforms
For some genes, there was information about which of their protein isoforms have been found in the centrosome, but for many others this information was not available. Hence, we decided to include in the database all protein isoforms encoded by a gene described as centrosomal, although it is possible that some of those isoforms have an alternative cellular localization (e.g. the EPB41 gene and its 12 alternative isoforms). This situation is similar to the inheritance of function descriptions from genes to proteins. The molecular function is carried out by the protein, not the gene. Alternative protein isoforms may have substantially different functions, but this knowledge is usually poor or missing. Instead, functional information is usually attached to genes and automatically inherited by the protein isoforms.
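The gene-to-isoform inheritance described above can be made concrete with a small sketch. The "direct" and "inherited" tags, the function name and the toy identifiers are hypothetical and are not fields of the database. They only show how an inherited label could be kept distinguishable from one supported by isoform-level evidence.

```python
def inherit_annotations(gene_annotations, gene_isoforms, isoform_evidence=None):
    """Attach each gene's annotation to all of its protein isoforms.

    Isoforms listed in `isoform_evidence` are tagged "direct"; all other
    isoforms are tagged "inherited", so the two cases stay distinguishable.
    """
    isoform_evidence = isoform_evidence or set()
    result = {}
    for gene, isoforms in gene_isoforms.items():
        annotation = gene_annotations.get(gene)
        if annotation is None:
            continue  # nothing to inherit for unannotated genes
        for isoform in isoforms:
            source = "direct" if isoform in isoform_evidence else "inherited"
            result[isoform] = (annotation, source)
    return result


# Hypothetical gene with three isoforms, one observed experimentally.
genes = {"GENE_X": "centrosomal"}
isoforms = {"GENE_X": ["GENE_X-1", "GENE_X-2", "GENE_X-3"]}
observed = {"GENE_X-1"}

labels = inherit_annotations(genes, isoforms, observed)
for name, (annotation, source) in labels.items():
    print(name, annotation, source)
```

All three isoforms end up labelled centrosomal, but only one of them on the basis of isoform-level evidence, which is the caveat raised in the paragraph above.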
Web browsing compatibility and requirements