Data Sharing and Intellectual Property in a Genomic Epidemiology Network: Policies for Large-Scale Research Collaboration
Chokshi, Dave A., Parker, Michael, Kwiatkowski, Dominic P., Bulletin of the World Health Organization
One of the most important discoveries of genome sequencing projects is the extent of genomic diversity in humans (1) and in human pathogens. (2,3) We now have many of the tools required for genomic epidemiology--the systematic investigation of how natural genomic variation affects the clinical outcome of disease. Infectious diseases are a central focus for genomic epidemiology, because pathogens are a major force for evolutionary selection of the human genome, and pathogen genomes are continually evolving to counter adaptations in the human immune system, and to survive the drugs used against them. Genomic epidemiology has important practical applications for diseases of the developing world, particularly in tackling drug resistance and guiding vaccine development.
Although genomic epidemiology operates by the same fundamental principles as other forms of genetic research, the scale of research projects is much larger. Studies are currently under way that involve testing over half a million genetic variants (known as single nucleotide polymorphisms or SNPs) in thousands of individuals with different diseases, and in healthy people, to identify those regions of the genome that are associated with resistance or susceptibility to those diseases. A large number of SNPs must be tested because the whole genome is being investigated. Detecting effects of modest magnitude, in turn, requires a very large number of subjects in each study. (4)
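The arithmetic behind this scale can be made concrete with a rough sketch. It is an illustration only, not the consortium's analysis method: the Bonferroni correction, the 0.05 error rate and the cohort size of 10 000 are assumptions chosen for the example, and the function names are hypothetical. Only the figure of half a million SNPs comes from the text.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def genotype_count(n_variants: int, n_individuals: int) -> int:
    """Total number of individual genotype calls generated by a study."""
    return n_variants * n_individuals


n_snps = 500_000    # "over half a million" variants, as stated in the text
n_people = 10_000   # hypothetical cohort size, chosen for illustration
alpha = 0.05        # conventional family-wise error rate (assumption)

threshold = bonferroni_threshold(alpha, n_snps)
calls = genotype_count(n_snps, n_people)

print(f"per-SNP significance threshold: {threshold:.1e}")
print(f"genotype calls: {calls:,}")
```

Under these assumed inputs, each SNP must reach a significance level of about one in ten million, and the study generates five billion genotype calls. A threshold this stringent is one reason modest effects can be detected only with very large numbers of subjects.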
Thus large-scale epidemiological studies, often conducted in multiple populations, need to be combined with high-throughput genome technologies and advanced statistical computations. The consequent increase in scale often requires the formation of research consortia for investigations in genomic epidemiology.
This paper details issues arising during the formation of an international research consortium known as MalariaGEN (Malaria Genomic Epidemiology Network; www.MalariaGEN.net), whose aim is to use genomic epidemiology to identify molecular mechanisms of protective immunity against malaria--and thereby to guide malaria vaccine development. The consortium, which is funded through the Grand Challenges in Global Health initiative, (5) brings together clinical researchers, epidemiologists, immunologists, genome researchers and statisticians from 20 countries in Africa, Asia, Europe and North America. The consortium is funded to analyse DNA and clinical data from tens of thousands of individuals, generating billions of genetic data points. This large undertaking raises many ethical issues, which have been summarized elsewhere, particularly relating to consent, genetic database governance and the fact that the research involves communities in the developing world. (6) Here we focus specifically on the questions of data sharing and how intellectual property may be managed for the greatest return to society.
Innovation and access
The issues of data sharing and intellectual property are closely connected. For example, if a research consortium decides to release all data immediately into the public domain, this precludes the possibility of patenting discoveries. If there are good reasons for immediately releasing data into the public domain, but also good reasons for patenting discoveries, then we need policy guidelines to determine which option should take precedence in a given situation.
We propose two fundamental principles upon which to base policy decisions about data sharing and intellectual property: (1) impediments to innovation in research processes should be minimized, and (2) the fruits of research--eventual products that result from scientific discoveries--should be made as widely accessible as possible, particularly to the people who need them the most.
In the context of genomic epidemiology, fostering innovation involves two broad goals. The first goal is to accelerate basic scientific research by making data accessible to the researchers best able to build upon promising findings. …