Analysis of rare variants is currently a major focus of genetic studies of human disease. Single-nucleotide polymorphism (SNP) genotypes can be assayed using microarray genotyping or by sequencing, but neither technology produces perfect genotype calls, especially at rare SNPs. Studies that collect both types of data are becoming increasingly common, so it may be possible to combine data types to increase accuracy. We present a method, called Chiamante, which calls genotypes on individuals with either array data, sequence data, or both. The model adapts to data quality and can estimate when either the array or the sequence data should be ignored when calling the genotypes at each SNP. As a special case, our method will call genotypes from only array data and outperforms existing methods in this scenario. We have applied our method to array and sequence data from Phase I of the 1000 Genomes Project and show that it provides improved performance, especially at rare SNPs. This method provides a foundation for future efforts to fuse genetic data from different sources, for example, when combining data from exome sequencing and exome microarrays. © 2012 Wiley Periodicals, Inc.
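To make the idea of combining data types concrete, the following is a minimal illustrative sketch and not the Chiamante model itself, whose details are not given in this abstract. It assumes that each data source supplies a likelihood for the three genotypes at a biallelic SNP (0, 1, or 2 copies of the alternate allele), that the sources are conditionally independent given the genotype, and that the prior follows Hardy-Weinberg proportions. The function names, the binomial read model, and the error rate are all hypothetical choices made for illustration. Passing `None` for a source stands in for ignoring that data type at a SNP.

```python
from math import comb


def hwe_prior(alt_freq):
    """Hardy-Weinberg prior over genotypes 0, 1, 2 (assumed for illustration)."""
    p = alt_freq
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def sequence_likelihoods(ref_reads, alt_reads, error=0.01):
    """Binomial likelihood of the read counts under each genotype.

    Hypothetical read model: the chance that a read shows the alternate
    allele is `error`, 0.5, or 1 - error for genotypes 0, 1, 2.
    """
    n = ref_reads + alt_reads
    p_alt_by_genotype = [error, 0.5, 1 - error]
    return [
        comb(n, alt_reads) * p_alt ** alt_reads * (1 - p_alt) ** ref_reads
        for p_alt in p_alt_by_genotype
    ]


def combine(prior, *sources):
    """Posterior over genotypes from a prior and per-source likelihoods.

    Each source is a list of three likelihoods, or None to ignore it.
    Sources are multiplied together, i.e. treated as independent.
    """
    posterior = list(prior)
    for likelihoods in sources:
        if likelihoods is None:
            continue  # this data type is ignored at this SNP
        posterior = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(posterior)
    return [p / total for p in posterior]


def call_genotype(posterior):
    """Return the most probable genotype (0, 1, or 2)."""
    return max(range(3), key=lambda g: posterior[g])


# Hypothetical array likelihoods, e.g. from a cluster model of probe intensities.
array_liks = [0.05, 0.90, 0.05]
seq_liks = sequence_likelihoods(ref_reads=5, alt_reads=5)

both = combine(hwe_prior(0.5), array_liks, seq_liks)
array_only = combine(hwe_prior(0.5), array_liks, None)
print(call_genotype(both), call_genotype(array_only))
```

In this toy setup, calling from array data alone is simply the case where the sequence source is `None`, mirroring the special case mentioned in the abstract.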

Original publication

DOI: 10.1002/gepi.21657
Type: Journal article
Journal: Genetic Epidemiology
Publication Date: 01/09/2012
Volume: 36
Pages: 527-537