Michael Barmada
Apple SAN Solution Meets Massive Storage Needs
Dr. Michael Barmada at the University of Pittsburgh uses an Xserve G5 cluster to crack the mysteries of inherited diseases.
DNA stores more data per ounce than any man-made storage medium yet devised. That makes untangling it and decoding gene sequences for statistical analysis a mathematically intense undertaking.
"From a numerical point of view, if you've got 1,000 individuals in your study and you're typing 100,000 to a million markers, you've got at least 200 million points of data," says Dr. Michael Barmada, a statistical geneticist with the University of Pittsburgh. "It's a lot of data to deal with. Apart from the simple logistical problem of how to create files that contain all of the data, how do you manipulate those files, how do you analyze them, how do you get a computer to actually do all of the calculations that are required?" And how do you store and organize million-plus data files from a few dozen researchers and students?
"[Xserve RAID] was half the price of the Dell/EMC solution and gave us four times the storage space. It was just an amazing benefit, cost-wise."
The short answer: Use Macs. Barmada and his team have assembled the second-largest genetics cluster in the U.S. The Xserve G5 cluster crunches numbers with 282 processors day and night, storing results on an Xserve RAID linked together with Fibre Channel and managed with Xsan storage area networking software. The system is a key component of the ceaseless quest to understand genetic predispositions to common diseases like diabetes, high blood pressure and Alzheimer's.
Sequencing Genes to Find Diseases
Barmada is part of a four-person statistical genetics team at the University of Pittsburgh's Graduate School of Public Health. He compares the genomes of people who have diseases with the genomes of healthy people, looking for any differences that may influence a particular illness. He works closely with laboratory scientists who squeeze raw data out of gene sequences for processing. "We're all interested in understanding genetic variation and human genetic disorders," says Barmada.
Initially, Barmada reviewed genetic information from local populations, usually between 50 and 300 people. The sample sizes eventually grew. "The populations that the physicians are identifying are getting larger, or we're doing collaborations with multiple groups," he says. "Instead of getting 400 patients from one physician, we'll get four or five or six thousand patients from a set of physicians around the country or 20,000 or 30,000 individuals from physicians around the world. For each of those individuals we'll analyze 500,000 or a million markers in the not-too-distant future."
The process may yield more than 20 billion points of data per study, but the work doesn't stop there. Barmada charts external influences like diet, lifestyle and environmental contaminants, collecting even more data. In the end, a predisposition for a disease can usually be traced to more than one gene or factor. "The problem with these types of diseases (cardiovascular disease, blood pressure or schizophrenia) is that there are so many potential genetic risk factors that basically there's a linkage on every chromosome in the genome," says Barmada.
Without powerful computers and a vast storage infrastructure, Barmada's research would be impossible. Using cluster processing and an Apple Xsan storage system, he hopes to understand the genetic and environmental combinations that cause the common diseases that plague society.

