Newspaper article Pittsburgh Post-Gazette (Pittsburgh, PA)

Speedy as a Sailfish Program Estimates Gene Expression 20 to 30 Times Faster Than Other Methods

Article excerpt

Measuring gene expression - the degree to which a gene is active - is key to genetic research and development of new ways to treat disease.

The presence of specific messenger RNA indicates what genes are involved, while their total count can be used to estimate the level of gene expression. But the computer process traditionally used to identify messenger RNA (mRNA) in cell samples takes 10 to 15 hours to get results.

Now a team of researchers led by Carnegie Mellon University computational biologist Carl Kingsford has stepped forward with its Sailfish program, featuring an algorithm that estimates gene expression 20 to 30 times faster than current methods.

In other words, Sailfish gets results, sometimes more accurate ones, in 10 to 15 minutes.

That explains why the team, which includes Stephen M. Mount of the University of Maryland and Rob Patro, a CMU postdoctoral researcher, named the program after the world's fastest fish, whose top speeds would earn a driver a speeding ticket on Interstate 79.

"Understanding when a gene is on and off is an important tool in basic biology," said Mr. Kingsford, an associate professor in the school of computer science, who wrote the Sailfish algorithm. "The goal is to increase science and understand biology better."

The journal Nature Biotechnology published a report in April describing Sailfish and how it advances the computational process. Now available online for free, Sailfish is drawing praise from scientists who are using the program to speed up their research, with the novel opportunity of double-checking their results.

"It's benefited my research because it's an efficient, elegant program that has streamlined gene-expression analysis," said John Stanton-Geddes, a University of Vermont professor with a doctoral degree in ecology, evolution and behavior. He said he uses Sailfish to identify genes that change expression in response to temperature in two eastern ant species.

While any individual organism's genetic makeup is static, a CMU news release explains, activity of individual genes varies greatly over time, "making gene expression an important factor in understanding how organisms work and what occurs during disease processes."

"Gene activity can't be measured directly but can be inferred by monitoring RNA, the molecules that carry information from the genes for producing proteins and other cellular activities," it says.

The math and science explaining Sailfish and its algorithm are complicated. Here's a much-simplified explanation:

In research, cell samples of interest are ground up and analyzed in a sequencing machine, which spells out the combination of the four molecules that make up the RNA, each identified by a letter - namely A, C, G and U. Messenger RNA can have 100 to 1,000 of these letters.

Current methods require a process known as mapping, which takes RNA segments of 100 letters, known as "reads," and tries to find an inexact match of those letters in the 100,000 sequenced RNA. Because of the large number of letters in the "reads" and the often-complicated notion of what constitutes a good match, the computer process takes many hours of computation to identify the most likely RNA represented, along with the count of how many are present in the sample. …
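
For readers who want to see the idea of mapping in concrete terms, here is a toy sketch in Python. It is an illustration only, not the code used by Sailfish or by any real mapping program. The two short "transcripts," the eight-letter reads, the function names and the rule that a match may differ by at most one letter are all assumptions made up for this example.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(read, transcripts, max_mismatches=1):
    """Slide the read along every transcript and return the name of the
    transcript holding the closest inexact match, or None if no window
    is within max_mismatches letters. Ties go to the first one found."""
    best_name, best_dist = None, max_mismatches + 1
    for name, seq in transcripts.items():
        for start in range(len(seq) - len(read) + 1):
            dist = hamming(read, seq[start:start + len(read)])
            if dist < best_dist:
                best_name, best_dist = name, dist
    return best_name


def count_reads(reads, transcripts, max_mismatches=1):
    """Tally how many reads map to each transcript."""
    counts = Counter()
    for read in reads:
        name = best_match(read, transcripts, max_mismatches)
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical data: two made-up transcripts and four made-up reads.
transcripts = {
    "geneA": "AUGGCCAUUGUAAUGGGCCGC",
    "geneB": "AUGCGUACGUUAGCUAGCUAA",
}
reads = ["GCCAUUGU", "GCCAUAGU", "CGUACGUU", "UUUUUUUU"]

counts = count_reads(reads, transcripts)
print(counts)
```

Every read is compared against every possible position in every transcript, so the work grows with the number of reads, the number of transcripts and their lengths. That is a rough picture of why the mapping step described above can run for hours on real samples.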
