Gene Chips Accurately Diagnose Four Complex Childhood Cancers

Artificial Intelligence Used With Gene Expression Microarrays for the First Time

May 2001

BETHESDA, Md. - Scientists at the National Human Genome Research Institute and Lund University in Sweden have developed a method of genetic fingerprinting that can tell the difference between several closely related types of childhood cancer. The method combines, for the first time, the cutting-edge technology of gene chips with a form of artificial intelligence called an artificial neural network (ANN). The neural network automatically analyzes the large amounts of data produced by the gene chip to make a highly accurate diagnosis.

Using typical diagnostic technologies, the four types of childhood tumors used in this study can be difficult to tell apart because they look alike under the microscope; their similar appearance can lead to misdiagnosis and improper treatment. Gene chip technology, on the other hand, analyzes the pattern of activity of thousands of genes inside any cell type, including cancer cells. This approach, the researchers reported in the June issue of Nature Medicine, allows their computerized neural network to classify the different cancers with much greater accuracy.

"This research is a very exciting example of how genome technology is advancing the diagnosis of some of the most serious and challenging diseases," says Francis S. Collins, M.D., Ph.D., director of the National Human Genome Research Institute (NHGRI) in Bethesda, Maryland. "Studies like this one should help lead to the discovery of genes that are altered in these tumors, and this information may lead to the development of effective new treatments."

The study began by simultaneously analyzing more than 6,000 known genes present in all cells. Among those, the researchers identified 41 genes expressed in the tumors that had not been previously associated with these diseases. "I am convinced that we will find good targets for new drug treatments with this kind of approach," says Paul Meltzer, M.D., Ph.D., a senior investigator in the Cancer Genetics Branch of the NHGRI Division of Intramural Research and the paper's senior author. "We are clearly moving away from using chemotherapy to nonspecifically kill cells to developing targeted treatments."

Analyzing Many Genes

Nearly all cells in the human body carry exactly the same set of genes, and many of these genes, so-called housekeeping genes, are turned on in most cells. But skin cells are different from muscle cells because a unique set of skin genes is turned on in skin cells and a unique set of muscle genes is turned on in muscle cells. Genes give cells their unique characteristics, but the genes have to be turned on - or expressed - for the unique characteristics to appear. By identifying the pattern of gene expression for muscle cells, skin cells or any type of cells, including cancers, scientists can create a genetic fingerprint of that cell type. Genetic fingerprints can uniquely differentiate one cell type from other cell types.

To create these fingerprints, scientists use a device called a gene expression microarray, or, more commonly, a gene chip. The chip is simply a glass slide on which thousands of known gene samples have been printed in tiny spots. Cells to be tested are then manipulated in such a way that genes expressed in the cell will match up with the known gene samples, like two pieces of Velcro attaching to each other. The cellular genes are treated in such a way that they literally light up the gene dots on the chip. The luminescent pattern is then measured with a special type of microscope and the results fed into a computer for analysis.

The NHGRI team applied this technique to the childhood cancers to see whether it could test for several different types of tumors at the same time. Researchers first began using microarrays to classify cancer in 1998, including work by first author Javed Khan, now a principal investigator in the Pediatric Branch of the National Cancer Institute, who showed that the technology could distinguish between muscle cells and other cancers. "We hypothesized that the genes turned on in a certain cancer were unique, but were not sure if these genes were so disrupted in cancer that it would be impossible to distinguish one type from another," Khan says. Recently, other NHGRI scientists reported using the technique to tell the difference between certain inherited types of breast cancer.

In this study, for the first time, a gene expression microarray was used to tell the difference between four unique types of cancer: neuroblastoma, rhabdomyosarcoma, non-Hodgkin lymphoma (Burkitt's lymphoma) and Ewing's sarcoma. As a group, these cancers are referred to as the small, round blue cell tumors of childhood because of the way they look under the microscope.

"This is the first time anyone has taken several different kinds of cancer, and used their gene expression patterns for diagnostic classification," Meltzer says. The data form a complex pattern of signal intensities that represent the fingerprint for each tumor type.

As the number of genes discovered by the Human Genome Project (HGP) has grown, the amount of information generated by these tests has become huge, requiring newer analytical methods. The researchers turned to the field of artificial intelligence, specifically a type of computer system called an artificial neural network. The FBI already used artificial neural networks to analyze fingerprints from crime scenes; the scientists decided to try the same technology to analyze a genetic fingerprint from cancer.

Diagnostic Neural Network

"Artificial neural networks are built to mimic how neurons function in the brain," says Markus Ringnér, a theoretical physicist from Lund University in Lund, Sweden, and now a post-doctoral researcher at NHGRI. "It's all software, but the basic idea is that it can be trained to recognize patterns."

The scientists chose this system because it helps solve a difficult problem. "The challenge is to take the large amount of information on thousands of genes tested by the microarray," Meltzer says, "and use the information to make an accurate diagnosis."

To teach the neural network to recognize the genetic pattern produced by each type of cancer, Ringnér, together with Carsten Peterson, a senior colleague at Lund University, took the patterns from four correctly diagnosed childhood cancers and fed them into an ordinary personal computer that runs the neural network software. The computer performed a mathematical calculation that systematically identified the pattern produced by each cancer type. Each time the mass of data was run through the computer, the neural network became more effective in identifying that particular type of cancer. In other words, the neural network was learning to recognize the pattern.
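
The learning process described above can be illustrated with a small sketch. This is not the researchers' software: the release does not describe the network's architecture, so the example assumes the simplest possible case, a single-layer network with a softmax output, and trains it on synthetic "expression profiles" rather than real microarray data. All function names, the number of genes and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np

def make_synthetic_profiles(n_per_class=16, n_genes=200, n_classes=4, seed=0):
    """Build fake expression profiles (assumption: not real microarray data).

    Each class gets its own block of 10 "marker" genes with raised expression.
    """
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [], []
    for c in range(n_classes):
        block = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
        block[:, c * 10:(c + 1) * 10] += 3.0  # marker genes for class c
        X_parts.append(block)
        y_parts.append(np.full(n_per_class, c))
    return np.vstack(X_parts), np.concatenate(y_parts)

def softmax(z):
    """Turn raw network outputs into probabilities that sum to 1 per sample."""
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_network(X, y, n_classes=4, epochs=100, lr=0.02):
    """Train a single-layer softmax network by gradient descent.

    Every pass over the data nudges the weights so that the predicted
    probabilities move closer to the known diagnoses.
    """
    n_samples, n_genes = X.shape
    W = np.zeros((n_genes, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot encoding of the known diagnoses
    losses = []
    for _ in range(epochs):
        P = softmax(X @ W + b)
        # Cross-entropy loss: low when the correct class gets high probability.
        losses.append(-np.mean(np.log(P[np.arange(n_samples), y] + 1e-12)))
        grad = (P - Y) / n_samples
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b, losses

X, y = make_synthetic_profiles()
W, b, losses = train_network(X, y)
predictions = softmax(X @ W + b).argmax(axis=1)
accuracy = float((predictions == y).mean())
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}, training accuracy: {accuracy:.2f}")
```

The falling loss is the numerical counterpart of the network "learning to recognize the pattern": each pass through the data leaves it a little better at matching profiles to their known diagnoses.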

After 63 training runs in which the computer was taught to recognize the genetic patterns produced by the four tumor types, Khan sent Ringnér and Peterson in Lund images from a set of 25 unknown tissue samples to run through the system. Ringnér didn't know which kinds of cancers the samples represented, so it was a blinded experiment. But Khan threw his Swedish colleagues a curveball: five of the unknown samples were not any of these childhood cancers, but rather were samples of normal muscle and of other tumor types: undifferentiated sarcoma, osteosarcoma and prostate cancer.
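
A classifier can only handle samples like these, which match none of the tumor types it was trained on, if it is allowed to decline to answer. The release does not say what rule the researchers used for this, so the sketch below assumes a simple one: accept a diagnosis only when the network's top probability clears a threshold. The `diagnose` function and the 0.9 cutoff are hypothetical.

```python
import numpy as np

CLASS_NAMES = [
    "neuroblastoma",
    "rhabdomyosarcoma",
    "Burkitt's lymphoma",
    "Ewing's sarcoma",
]

def diagnose(probabilities, threshold=0.9):
    """Return a diagnosis, or "unclassified" if the network is not confident.

    `probabilities` holds one value per tumor type, in CLASS_NAMES order.
    The threshold is an assumed cutoff, not a value from the study.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    best = int(np.argmax(probabilities))
    if probabilities[best] < threshold:
        return "unclassified"
    return CLASS_NAMES[best]

print(diagnose([0.97, 0.01, 0.01, 0.01]))  # confident call
print(diagnose([0.40, 0.30, 0.20, 0.10]))  # no clear winner, so no diagnosis
```

Under this rule a sample whose pattern resembles none of the four tumor types tends to spread its probability across several categories and is left unclassified rather than forced into one.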

"I was worried about the ones I could not classify," Ringnér admits. "I had to make a decision about whether to shoehorn them into a category, or say I couldn't classify them. I admitted for these five, I could not tell, and that was the right answer." The neural network correctly identified what belonged in the childhood cancer category, and what did not belong, based on the genetic pattern.

Artificial neural networks have been working their way into clinical medicine to analyze test results where complex pattern recognition is required, such as X-ray images, Pap smears and electrocardiogram results; some of this work was pioneered by the Lund group. "This is the first time," Meltzer says, "that a neural network has been applied to this kind of analysis."

Future Research

While it may take some time for this technique to reach clinical practice, it has already pointed toward additional research. The research team started with more than 6,000 genes, and the artificial neural network analysis narrowed that number down to a mere 93 unique genes needed to differentiate the four tumor types. Of those, 41 had not previously been associated with these diseases and might provide important insights into the biology of these cancers and offer possible targets for new treatments.
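
The idea of narrowing thousands of genes down to the few that matter can be sketched in a few lines. The release does not explain how the neural network ranked genes, so this example swaps in a simpler, standard statistic: for each gene, the variation between tumor types divided by the variation within each type. The data are synthetic, and the `rank_genes` function and the planted gene positions are hypothetical.

```python
import numpy as np

def rank_genes(X, y, top_k):
    """Return indices of the top_k genes that best separate the classes.

    Score = between-class variation / within-class variation, per gene.
    This is a stand-in for the study's own (undescribed) ranking method.
    """
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    score = between / (within + 1e-12)  # small constant avoids division by zero
    return np.argsort(score)[::-1][:top_k]

# Synthetic data: 4 tumor types, 16 samples each, 500 genes of pure noise...
rng = np.random.default_rng(1)
y = np.repeat(np.arange(4), 16)
X = rng.normal(size=(len(y), 500))

# ...except for four planted genes, each strongly raised in one tumor type.
informative = np.array([3, 50, 120, 400])
for c, g in enumerate(informative):
    X[y == c, g] += 5.0

selected = rank_genes(X, y, top_k=4)
print(sorted(selected.tolist()))
```

Genes that behave the same in every tumor type score near zero and drop out, leaving a short list whose expression differs sharply from one type to another.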

"This technology will ultimately define the complete catalog of all the genes involved in cancer," Meltzer says. "The only remaining task will be deciding which of the genes, and their products, will make the best targets for new drug treatments."

Meanwhile, work has begun to use gene chip analyses to associate gene expression patterns with the clinical outcome. It's well known that the cancers of some patients progress faster than the cancers of others, even when the patients share the same diagnosis and stage. It now appears that genetic patterns can be used to predict how aggressive an individual's cancer will be.

The power of this method, Khan says, "is not only that we can diagnose these cancers, but in the very near future, we will be able to predict which patients will respond to treatment and which will not, and will therefore need stronger treatment."

Moreover, Meltzer says, "With this technology, we will be able to treat patients according to the specific genes that are expressed in their tumor," a more customized therapy rather than the generally toxic drugs now used for anyone with a particular diagnosis.

Getting the Diagnosis Right

Even as the researchers apply gene chips and artificial neural networks to research questions, the newly developed system could prove valuable for anxious parents and family members of sick children. Because of the visual similarity of the childhood tumors used in this study, diagnostic mistakes are sometimes made. But an accurate diagnosis can be critical for the child's survival. While these tumors are physically similar, the treatments are quite different. With the right therapy, up to 90 percent of children with Burkitt's lymphoma recover; about half survive Ewing's sarcoma and rhabdomyosarcoma, and up to 40 percent recover from neuroblastoma. Without accurate diagnosis and proper treatment, few children survive.

"Defining the features of individual cancer cells - their signatures - is an area rich in scientific opportunity," says Richard Klausner, director of the National Cancer Institute. "As this study suggests, our growing knowledge of the 'molecular fingerprints' of cancer cells can and will translate into clinical benefits."

Not having an accurate diagnosis, or knowing whether a child is likely to survive, can be psychologically devastating for affected families.

"We hope," Khan says, "that we can take the guesswork out of diagnosing these children and be able to more accurately individualize treatment to increase their chance of recovery."

The Research Team

The research team includes: Javed Khan, Jun S. Wei, Markus Ringnér and Paul S. Meltzer from the Cancer Genetics Branch, Division of Intramural Research, National Human Genome Research Institute, which is a part of the National Institutes of Health, Bethesda, Maryland; Carsten Peterson from the Complex Systems Division, Department of Theoretical Physics, Lund University, Lund, Sweden; Marc Ladanyi and Cristina R. Antonescu from the Department of Pathology, Memorial Sloan-Kettering Cancer Center in New York, New York; Frank Westermann and Manfred Schwab from the Department of Cytogenetics, German Cancer Research Center, Heidelberg, Germany; and Frank Berthold, from the Department of Pediatrics, Klinik für Kinderheilkunde der Universität zu Köln, Köln, Germany. Dr. Khan is now with the Pediatric Oncology Branch, Advanced Technology Center, National Cancer Institute, Gaithersburg, Maryland, and Markus Ringnér is a post-doctoral researcher from Lund University working at NHGRI.

This research was principally supported by the Division of Intramural Research of the National Human Genome Research Institute, and also received support from the Charles & Dana Nearburg Foundation, the Swedish Research Council, the Knut and Alice Wallenberg Foundation through the SWEGENE consortium and the Swedish Foundation for Strategic Research.

The research is reported in the June 2001 issue of Nature Medicine, Volume 7, Number 6.

Contact:
Geoff Spencer
NHGRI

Last updated: September 01, 2006