Finding Nemo’s genes: reef fish genome mapped and shared

Nemo’s genome has been deciphered and made publicly available, helping researchers further investigate fish ecology and evolution.

The genome of the orange clownfish, immortalized in the film Finding Nemo, has been deciphered, giving researchers the most detailed information so far on reef fish genomics.

“The Nemo genome is composed of 24 chromosomes. We were able to sequence about 97 percent of the underlying genome sequence and then place about 98 percent of that sequence into the 24 chromosomes of the species,” says computational biologist Robert Lehmann. “By any measure, that is a remarkable effort and represents a very complete genome assembly.”
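As a rough illustration of what those two percentages imply together, the short Python sketch below multiplies them. It assumes the 98 percent figure applies to the sequenced portion only, so that the product approximates the share of the whole genome placed into chromosomes; the variable names are illustrative and not taken from the researchers' work.

```python
# Figures quoted in the article (both approximate).
sequenced_fraction = 0.97  # share of the genome that was sequenced
placed_fraction = 0.98     # share of the sequenced portion placed into chromosomes

# Assumption: the second figure is a fraction of the first, so the share of
# the whole genome that ends up in chromosomes is their product.
in_chromosomes = sequenced_fraction * placed_fraction

print(f"Share of the genome placed into chromosomes: about {in_chromosomes:.0%}")
```

Under that reading, roughly 95 percent of the genome is anchored to the 24 chromosomes.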

KAUST researchers have made their data available to the scientific community ahead of the journal publication of their results.

The orange clownfish, Amphiprion percula, is a mainstay of marine biology research: it is used as a model species to answer questions related to social organization, sex change, habitat selection, predator-prey interactions and the effects of climate change and ocean acidification on fish. The availability of this genome assembly as a community resource will help researchers understand the ecology and evolution of reef fish more deeply.

The team used state-of-the-art technology to sequence the orange clownfish genome. “We began by using single-molecule real-time (SMRT) sequencing, a technology that has only recently become affordable to most research groups,” says co-author Damien Lightfoot, a molecular biologist on the team. Traditionally, genome sequencing is performed by reading many short stretches of a genome followed by deciphering and re-assembling these small pieces. The smaller the pieces, however, the more difficult it becomes to put them back together in the right order. SMRT sequencing differs from standard methods by producing relatively long reads of the DNA.

Using bioinformatics programs and the resources of KAUST’s Supercomputing Core Lab, Lehmann and the team rebuilt the DNA pieces into even longer ones—averaging 639 thousand nucleotides in length.

Because even these longer pieces are far shorter than the orange clownfish genome, which comprises approximately 939 million nucleotides, the team used another method to determine the likelihood that different pieces of the genome belonged next to each other. “By doing this, we were able to put the pieces together to assemble nearly complete chromosomes,” says Lehmann.
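To give a sense of the scale of that ordering problem, here is a back-of-the-envelope Python sketch based only on the figures quoted in this article. It assumes that "averaging" refers to the arithmetic mean piece length and that the pieces together span roughly the whole genome; the function name is hypothetical and this is not the team's assembly method.

```python
def estimate_piece_count(genome_size: int, mean_piece_length: int) -> float:
    """Estimate how many pieces of a given mean length span a genome."""
    return genome_size / mean_piece_length


# Figures quoted in the article (all approximate).
GENOME_SIZE = 939_000_000    # nucleotides in the orange clownfish genome
MEAN_PIECE_LENGTH = 639_000  # average length of the rebuilt pieces
CHROMOSOME_COUNT = 24

pieces = estimate_piece_count(GENOME_SIZE, MEAN_PIECE_LENGTH)
pieces_per_chromosome = pieces / CHROMOSOME_COUNT

print(f"Roughly {pieces:.0f} pieces in total")
print(f"Roughly {pieces_per_chromosome:.0f} pieces per chromosome on average")
```

Under these assumptions there are on the order of 1,500 pieces, or about 60 per chromosome, each of which has to be placed in the right order.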

The team assessed the completeness of their genome assembly by comparing it to the high-quality genome assemblies of 26 other fish species. The completeness of the orange clownfish genome was surpassed only by that of the Nile tilapia. Interestingly, the three most complete fish genome assemblies currently available are those produced with SMRT sequencing.

The team is currently sequencing the genomes of other reef fish species and planning to compare them to answer questions about how genome structure and evolution relate to differences in fish traits, such as their responses to climate change.