Human Genome Analysis and Genetic Diversity: Most people are familiar with the idea that the sequence of the human genome has been determined. This "reference" human genome sequence does not match everyone exactly; in fact, it does not correspond to the genome of any one person, but is a framework sequence for the human genome, patched together from different sources. Real individual genomes differ from this reference sequence in a variety of ways: there is variation in the exact sequence, due to single base changes, but also variation in the general structure of the genome, due to variation in the numbers of repeated regions, including variation in whether a sequence is present or absent (and, if present, in how many copies). Although these differences are often neutral (irrelevant to a person's health and other characteristics), genome variation can also influence disease. Our research looks at how variation in the structure of the human genome arises, and investigates its possible influence on human disease.
Copy number variation in the human genome: Understanding the genetic basis of human health involves determining how the different genetic variants present in human populations contribute to health and disease in individuals. There are currently major efforts aimed at discovering links between substitutional ('SNP') variants and predisposition to disorders like type 1 diabetes, asthma and schizophrenia. However, many human genetic disorders can instead result from having the wrong number of copies of a gene, or parts of a gene, and variation in gene number has recently become widely appreciated as an important contributor to disease susceptibility. Variation in gene copy number is extensive in humans, and there are many examples of important genes that show variation in number; for example, people who are "Rhesus negative" completely lack any copies of the RHD gene, whereas "Rhesus positive" individuals have one or two copies.
Where there is variation in the number of copies of a gene or other DNA sequence, measuring the copy number is a surprisingly tough challenge even for modern DNA technology; it is far easier to sequence a region of DNA than to measure accurately whether it is present in four copies or five. An underlying strength of our research is in developing and refining methods for measuring copy number. Our innovations have included Multiplex Amplifiable Probe Hybridisation (MAPH, see Armour et al., 2000); this method was one of the first to measure copy number at multiple sites in a single convenient assay, and was used to uncover the remarkable copy number variation in the human beta-defensin genes. Instead of the expected two copies each of these antimicrobial genes, people can commonly have anywhere between 2 and 7 copies of a cluster of beta-defensins (including DEFB4 and DEFB103), and there are even variants with as many as 10 or 11 copies. The corresponding region of these 10- or 11-copy chromosomes is visibly expanded under the microscope, as a so-called "euchromatic variant" chromosome (Hollox et al., 2003). The nearby alpha-defensin genes are also variable, independently of the beta-defensin cluster, and we have used our high-resolution methods to investigate the details of this variation (see Aldred et al., 2005).
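As a rough illustration of why telling four copies from five is hard, the sketch below assumes an idealised assay in which the measured signal is exactly proportional to copy number. Under that assumption, going from n to n + 1 copies raises the signal by a fraction 1/n, so each extra copy becomes harder to detect as the copy number rises. The function name is hypothetical and chosen only for this example.

```python
def relative_signal_difference(n):
    """Fractional rise in signal when copy number goes from n to n + 1.

    Assumes (as an idealisation) that signal is proportional to copy number.
    """
    if n < 1:
        raise ValueError("copy number must be at least 1")
    return 1 / n


# Print the step size for a few starting copy numbers.
for n in (1, 2, 4, 10):
    step = relative_signal_difference(n)
    print(f"{n} -> {n + 1} copies: signal rises by {step:.0%}")
```

On this simple model, distinguishing four copies from five means resolving a 25% difference in signal, and distinguishing ten from eleven means resolving only 10%, so measurement noise that is harmless at low copy number can blur the higher classes together.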
More recently, with the particular goal of designing a method for accurate measurement of copy number variation in very large numbers of samples, such as the thousands of samples now used in case-control association studies, we have developed Paralogue Ratio Tests (PRT, see Armour et al., 2007). These tests are based on PCR using a single pair of primers to amplify products from two loci: one product from the region to be tested, the other from an unlinked "reference" locus. We have been able to show that comparing the levels of these two products can not only provide an accurate determination of copy number, but also deliver high throughput, and we have applied this method to case-control studies to examine the influence of copy number variation on susceptibility to disorders such as psoriasis, inflammatory bowel disease and rheumatoid arthritis (Hollox et al., 2008).
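The arithmetic behind comparing the two products can be sketched as follows. This is a minimal illustration, not the published analysis: it assumes that the reference locus is present in two copies per diploid genome, that both products amplify with similar efficiency, and that any residual bias is absorbed by a single correction factor derived from samples of known copy number. The function names and the `calibration` parameter are hypothetical.

```python
def estimate_copy_number(test_signal, reference_signal,
                         reference_copies=2, calibration=1.0):
    """Estimate the copy number of the test locus from a paralogue ratio.

    test_signal / reference_signal is the ratio of the two PCR products.
    reference_copies is the assumed copy number of the reference locus.
    calibration is a hypothetical correction factor: the ratio per copy
    observed in control samples of known copy number, relative to ideal.
    """
    if test_signal < 0 or reference_signal <= 0:
        raise ValueError("signals must be non-negative, reference positive")
    ratio = test_signal / reference_signal
    return reference_copies * ratio / calibration


def call_integer_copy_number(estimate):
    """Round a raw estimate to the nearest whole copy number (never below 0)."""
    return max(0, round(estimate))


# Example: a test product 2.1 times as abundant as the reference product.
raw = estimate_copy_number(test_signal=2100.0, reference_signal=1000.0)
print(raw, call_integer_copy_number(raw))
```

Because both products come from the same primer pair in the same reaction, tube-to-tube differences in amplification affect the two signals together and largely cancel in the ratio, which is the design idea the sketch relies on.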
Our current research investigates the difficult question of how copy number variation affects function. It is often assumed that gene expression is simply proportional to copy number, but several examples show that the truth is often more complicated. In particular, expression can also be simultaneously affected by gene sequence variants, trans-acting factors, and position within a repeat array. We are using our high-precision approaches to measuring gene copy number to underpin investigation of these questions for the human alpha-defensin, salivary amylase and CCL3L1 genes.
Armour, J.A.L., Sismani, C., Patsalis, P.C. and Cross, G. (2000). Measurement of locus copy number by hybridisation with amplifiable probes. Nucleic Acids Res., 28, 605-609.
Hollox, E.J., Armour, J.A.L. and Barber, J.C.K. (2003). Extensive normal copy number variation of a human β-defensin antimicrobial gene cluster. Am. J. Hum. Genet., 73, 591-600.
Aldred, P.M.R., Hollox, E.J. and Armour, J.A.L. (2005). Copy number polymorphism and expression level variation of the human alpha-defensin genes DEFA1 and DEFA3. Hum. Mol. Genet., 14, 2045-2052.
Armour, J.A.L. (2006). Tandemly repeated DNA: Why should anyone care? Mutat. Res., 598, 6-14.
Armour, J.A.L., Palla, R., Zeeuwen, P.L.J.M., den Heijer, M., Schalkwijk, J. and Hollox, E.J. (2007). Accurate, high-throughput typing of copy number variation using paralogue ratios from dispersed repeats. Nucleic Acids Res., 35, e19.
Hollox, E.J., Huffmeier, U., Zeeuwen, P.L.J.M., Palla, R., Lascorz, J., Rodijk-Olthuis, D., van de Kerkhof, P.C.M., Traupe, H., de Jongh, G., den Heijer, M., Reis, A., Armour, J.A.L. and Schalkwijk, J. (2008). Psoriasis is associated with increased β-defensin genomic copy number. Nat. Genet., 40, 23-25.