Michael Barmada

Apple SAN Solution Meets Massive Storage Needs

Dr. Michael Barmada at the University of Pittsburgh uses an Xserve G5 cluster to crack the mysteries of inherited diseases.

DNA stores more data per ounce than any man-made storage medium yet devised. That makes untangling it and decoding gene sequences for statistical analysis a mathematically intense undertaking.

“From a numerical point of view, if you’ve got 1,000 individuals in your study and you’re typing 100,000 to a million markers, you’ve got at least 200 million points of data,” says Dr. Michael Barmada, a statistical geneticist with the University of Pittsburgh. “It’s a lot of data to deal with. Apart from the simple logistical problem of how to create files that contain all of the data, how do you manipulate those files, how do you analyze them, how do you get a computer to actually do all of the calculations that are required?” And how do you store and organize million-plus data files from a few dozen researchers and students?

“[Xserve RAID] was half the price of the Dell/EMC solution and gave us four times the storage space. It was just an amazing benefit, cost-wise.”

The short answer: Use Macs. Barmada and his team have assembled the second-largest genetics cluster in the U.S. The Xserve G5 cluster crunches numbers with 282 processors day and night, storing results on an Xserve RAID linked together with Fibre Channel and managed with Xsan storage area networking software. The system is a key component of the ceaseless quest to understand genetic predispositions to common diseases like diabetes, high blood pressure and Alzheimer’s.

Sequencing Genes to Find Diseases

Barmada is part of a four-person statistical genetics team at the University of Pittsburgh’s Graduate School of Public Health. He compares the genomes of people who have diseases with the genomes of healthy people, looking for any differences that may influence a particular illness. He works closely with laboratory scientists who squeeze raw data out of gene sequences for processing. “We’re all interested in understanding genetic variation and human genetic disorders,” says Barmada.

Initially, Barmada reviewed genetic information from local populations, usually between 50 and 300 people. The sample sizes eventually grew. “The populations that the physicians are identifying are getting larger, or we’re doing collaborations with multiple groups,” he says. “Instead of getting 400 patients from one physician, we’ll get four or five or six thousand patients from a set of physicians around the country or 20,000 or 30,000 individuals from physicians around the world. For each of those individuals we’ll analyze 500,000 or a million markers in the not-too-distant future.”
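To get a feel for the scale Barmada describes, the figures he quotes can be run through a rough back-of-the-envelope count. The sketch below is only an illustration, not the team's actual method: the function name and the `values_per_marker` parameter are hypothetical, and the idea that each marker contributes two values per person (one per allele) is an assumption used to show how 1,000 individuals and 100,000 markers could add up to 200 million data points.

```python
def genotype_data_points(individuals: int, markers: int, values_per_marker: int = 2) -> int:
    """Rough count of raw data points in a genotyping study.

    values_per_marker=2 is an assumption (one value per allele);
    the article does not say how its totals were counted.
    """
    return individuals * markers * values_per_marker


# Smaller study quoted in the article: 1,000 individuals, 100,000 markers.
small_study = genotype_data_points(1_000, 100_000)

# Larger collaborative study: 20,000 individuals, 1,000,000 markers,
# counted once per marker as a conservative lower bound.
large_study = genotype_data_points(20_000, 1_000_000, values_per_marker=1)

print(f"{small_study:,}")  # 200,000,000
print(f"{large_study:,}")  # 20,000,000,000
```

Even counted conservatively at one value per marker, the larger study lands at 20 billion data points before any diet, lifestyle or environmental data is added.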

The process may yield more than 20 billion points of data per study, but the work doesn’t stop there. Barmada charts external influences like diet, lifestyle and environmental contaminants, collecting even more data. In the end, a predisposition for a disease can usually be traced to more than one gene or factor. “The problem with these types of diseases (cardiovascular disease, blood pressure or schizophrenia) is that there are so many potential genetic risk factors that basically there’s a linkage on every chromosome in the genome,” says Barmada.

Without powerful computers and a vast storage infrastructure, Barmada’s research would be impossible. Using cluster processing and an Apple Xsan storage system, he hopes to understand the genetic and environmental combinations that cause the common diseases that plague society.
