The OMA browser serves as an access point for the OMA database, which contains precomputed homology data for extant and ancestral genomes covering over 2800 species (see the latest list of species).
The OMA browser focuses on three main data types: genes, groups, and genomes. Gene-centric pages provide detailed information about a specific gene, including its sequence, cross-references, functional annotations, and evolutionary data. Group-centric pages classify genes into OMA Groups (Orthologous Groups; OGs) and Hierarchical Orthologous Groups (HOGs) to define families and subfamilies. Genome-centric pages offer information about extant or ancestral species, associated genes, related genomes, and a synteny viewer.
Gene-centric pages give all the information in OMA that is specific to a single gene. The gene is shown at the top of the page, with its OMA ID and UniProt ID. The scrollable menu on the left-hand side links to sub-pages with specific information, including the orthologs, paralogs, gene information, isoforms, GO annotations, sequences, and local extant and ancestral synteny for this gene.
Consider a scenario where you ran a gene network analysis and found that the human gene with UniProt ID OR2L5_HUMAN is involved in an interesting pathway. Search for this gene on the OMA homepage.
1. Based on the “Gene information” tab, what is this gene?
2. Where is this gene located in the genome?
3. Based on the Gene Ontology annotations, what function is this protein probably involved in? How reliable are these annotations?
4. Does this gene share any local conserved synteny with any other species in Hominidae? If so, which ones?
5. Go to the orthologs table. How many orthologs are inferred by OMA overall?
6. Pairwise orthologs in OMA refer to pairs of genes from different species that are considered to be orthologous to each other. How many 1:1 pairwise orthologs are there?
7. How many of the inferred orthologs are supported by HOG inference?
8. Why is there a difference between the number of pairwise orthologs and the number of HOG-supported orthologs?
9. How conserved is the domain architecture of these orthologs? What is this domain?
10. How many paralogs are there in Human for this gene? When did they duplicate?
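The relation types asked about in question 6 (1:1, 1:many, and so on) can be illustrated with a small sketch. It assumes you already have a list of orthologous gene pairs between two species; the gene names and the function `classify_pairs` are hypothetical, and the labelling convention (left side counts genes in the first species, right side counts genes in the second) is an assumption made for this example rather than a description of how OMA computes its table.

```python
from collections import Counter, defaultdict


def classify_pairs(pairs):
    """Label each (gene_a, gene_b) orthologous pair by its cardinality.

    Left symbol: '1' if gene_b has a single ortholog in species A, else 'm'.
    Right symbol: '1' if gene_a has a single ortholog in species B, else 'n'.
    """
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for gene_a, gene_b in pairs:
        a_to_b[gene_a].add(gene_b)
        b_to_a[gene_b].add(gene_a)

    labels = {}
    for gene_a, gene_b in pairs:
        left = "1" if len(b_to_a[gene_b]) == 1 else "m"
        right = "1" if len(a_to_b[gene_a]) == 1 else "n"
        labels[(gene_a, gene_b)] = f"{left}:{right}"
    return labels


# Hypothetical pairs between species A (H*) and species B (M*).
pairs = [("H1", "M1"), ("H2", "M2"), ("H2", "M3"), ("H3", "M4"), ("H4", "M4")]
labels = classify_pairs(pairs)
counts = Counter(labels.values())
print(counts)
```

Only the pair (H1, M1) is 1:1 here, because neither gene has any other ortholog in the opposite species.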
The evolution of a gene family describes the history of all the genes that descended from a common ancestral gene.
A Hierarchical Orthologous Group (HOG) is a set of genes that have descended from a common ancestral gene in a given ancestral species (i.e. at a specific taxonomic level). HOGs are hierarchical because groups defined at more recent clades are encompassed within larger groups defined at older clades, which makes them nested subfamilies.
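The nesting can be pictured as a tree of groups. The following is a minimal sketch, not OMA's internal data model: the class `Hog`, its fields, and the gene identifiers are all hypothetical and chosen only to show that a group at a recent clade is contained in the group at an older clade.

```python
from dataclasses import dataclass, field


@dataclass
class Hog:
    level: str                                    # taxonomic level of the group
    genes: list = field(default_factory=list)     # extant genes attached directly
    subhogs: list = field(default_factory=list)   # groups at more recent clades

    def members(self):
        """Return all extant genes descending from this group's ancestral gene."""
        found = list(self.genes)
        for sub in self.subhogs:
            found.extend(sub.members())
        return found


# Hypothetical family: two recent subfamilies nested in one older group.
primates = Hog("Primates", genes=["HUMAN_g1", "PANTR_g1"])
rodents = Hog("Rodentia", genes=["MOUSE_g1", "RATNO_g1"])
mammals = Hog("Mammalia", subhogs=[primates, rodents])

print(mammals.members())
print(set(primates.members()) <= set(mammals.members()))
```

Every gene of the Primates group also belongs to the Mammalia group, which is the "nested subfamily" property described above.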
The following exercises focus on analyzing the evolutionary history of a gene family. For an introduction to the iHam graphical viewer (needed to answer the following questions), see our documentation and YouTube video.
Open the HOG page corresponding to the gene from before (OR2L5_HUMAN -> Click on the Groups button). The HOG displayed is the largest HOG in which this gene is present (known as a “Root HOG” in OMA).
1. At what taxonomic level is the last common ancestral gene located? From which common ancestral genome did all these genes descend? At what taxonomic level did this gene originate? When did the Root HOG originate?
2. How many ancestral genes comprise this HOG at the root level?
3. How many extant genes comprise this HOG at the root level?
4. Which extant genomes have the most copies of this gene?
5. How many genes in this family (i.e. root HOG) are human genes?
6. In which lineages did the duplications likely take place that resulted in the multiple human genes?
7. How many genes in this family have 5 exons? In what species?
Genomes on the OMA Browser can be either extant (modern-day species) or ancestral. OMA leverages HOGs to model ancestral genomes; these ancestral genomes each correspond to an internal node of the Tree of Life. Conceptually, HOGs can be thought of as ancestral genes, as they encompass orthologs and paralogs descending from a common ancestral gene at a specific taxonomic level. Thus, the HOGs are proxies for ancestral genes in a common ancestor and the collection of HOGs at a given level are proxies for ancestral genomes.
We will first explore an extant genome: human. Search for it by typing “HUMAN” in the search bar and choosing species as the field, or navigate from the home page -> Explore -> Quick access to Genomes. Go to the extant human genome.
1. How many genes are in this species, not including alternative splice variants?
Next, let’s explore one of the ancestral genomes leading to human: Primates. Click on this genome to get to the Ancestral Genome page.
2. How many genes was this primate common ancestor inferred to have?
HOG inference may not always be 100% reliable. OMA provides a “Completeness Score” as a measure of HOG quality. The Completeness Score is defined as the number of species in the taxonomic clade that are present in the HOG, divided by the total number of species in that clade.
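The definition translates directly into a ratio. The sketch below computes it for a few made-up HOGs and keeps those at or above a chosen threshold; the function name `completeness_score`, the HOG labels, and the five-species clade are hypothetical and used for illustration only.

```python
def completeness_score(species_in_hog, species_in_clade):
    """Fraction of the clade's species that have at least one gene in the HOG."""
    clade = set(species_in_clade)
    if not clade:
        raise ValueError("the clade must contain at least one species")
    present = set(species_in_hog) & clade
    return len(present) / len(clade)


# Hypothetical clade of five species and three hypothetical HOGs.
clade = {"HUMAN", "PANTR", "GORGO", "MACMU", "CALJA"}
hogs = {
    "HOG_A": {"HUMAN", "PANTR", "GORGO", "MACMU"},
    "HOG_B": {"HUMAN"},
    "HOG_C": {"PANTR", "CALJA"},
}

scores = {name: completeness_score(sp, clade) for name, sp in hogs.items()}
threshold = 0.3
kept = sorted(name for name, score in scores.items() if score >= threshold)
print(scores)
print(kept)
```

With a threshold of 0.3, HOG_B (one species out of five) is dropped, so the filtered ancestral gene count in this toy example would be two instead of three.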
3. How many genes were in the primate common ancestor if we filter to only HOGs with at least 30% of the species present in the HOG?
Click on “Ancestral Gene Order.” Shown are the ancestral chromosome reconstructions, called “ancestral contigs.”
4. How many genes are in the first ancestral contig?