Biometrics, 2012-03, Vol.68 (1), p.1-11
RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. In this article, we show that eQTL mapping, by directly modeling TReC using discrete distributions, has higher statistical power than the two-step approach: data normalization followed by linear regression. In addition, RNA-seq provides information on allele-specific expression (ASE) that is not available from microarrays. By combining the information from TReC and ASE, we can computationally distinguish cis- and trans-eQTL and further improve the power of cis-eQTL mapping. Both simulation and real data studies confirm the improved power of our new methods. We also discuss the design issues of RNA-seq experiments. Specifically, we show that by combining TReC and ASE measurements, it is possible to minimize cost and retain the statistical power of cis-eQTL mapping by reducing sample size while increasing the number of sequence reads per sample. In addition to RNA-seq data, our method can also be employed to study the genetic basis of other types of sequencing data, such as chromatin immunoprecipitation followed by DNA sequencing data. In this article, we focus on eQTL mapping of a single gene using the association-based method. However, our method establishes a statistical framework for future developments of eQTL mapping methods using RNA-seq data (e.g., linkage-based eQTL mapping), and the joint study of multiple genetic markers and/or multiple genes.
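The role of ASE in separating cis- from trans-eQTL can be illustrated with a small sketch. It is not the model used in the article: it assumes a plain binomial likelihood for allele-specific reads pooled across individuals heterozygous at the candidate marker, with no overdispersion and no TReC component. The function name `ase_lrt` and its arguments are hypothetical. Under a purely trans-acting effect both haplotypes are affected equally, so the expected allelic proportion is 0.5; a cis-acting effect shifts it away from 0.5.

```python
import math


def binom_loglik(k, n, p):
    """Log-probability of k successes in n trials under Binomial(n, p)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def ase_lrt(allele_b_reads, total_ase_reads):
    """Likelihood ratio test for allelic imbalance (simplified assumption).

    allele_b_reads[i]  : allele-specific reads from the haplotype carrying
                         allele B in heterozygous sample i
    total_ase_reads[i] : all allele-specific reads for that sample
    Returns (estimated proportion, LRT statistic, p-value).
    """
    k_total = sum(allele_b_reads)
    n_total = sum(total_ase_reads)
    if n_total == 0:
        raise ValueError("no allele-specific reads")

    # Maximum likelihood estimate of a shared proportion, clamped so that
    # log(p) and log(1 - p) stay finite when all reads come from one allele.
    eps = 1e-12
    p_hat = min(max(k_total / n_total, eps), 1.0 - eps)

    pairs = list(zip(allele_b_reads, total_ase_reads))
    ll_null = sum(binom_loglik(k, n, 0.5) for k, n in pairs)   # no imbalance
    ll_alt = sum(binom_loglik(k, n, p_hat) for k, n in pairs)  # imbalance

    stat = max(2.0 * (ll_alt - ll_null), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return p_hat, stat, p_value
```

Calling `ase_lrt([30, 45], [40, 60])` tests two heterozygous samples in which three quarters of the allele-specific reads come from the B haplotype. Real RNA-seq counts are typically overdispersed, so a practical analysis would need a distribution that allows for extra variation, and would combine this evidence with the TReC signal as the abstract describes.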
Total read count (TReC) ; eQTL ; RNA-seq ; Allele-specific expression (ASE) ; Haplotypes ; Quantitative trait loci ; Biometrics ; Sample size ; Linear regression ; Binomial distributions ; Genes ; Alleles ; P values ; BIOMETRIC METHODOLOGY ; Genotypes ; Sequence Analysis, RNA - methods ; Data Interpretation, Statistical ; Algorithms ; Base Sequence ; Computer Simulation ; Humans ; Molecular Sequence Data ; Models, Genetic ; Models, Statistical ; Quantitative Trait Loci - genetics ; RNA - genetics ; Anopheles ; Chromatin ; Analysis ; Nucleotide sequencing ; Gene expression ; DNA sequencing ; Quantitative genetics ; Genetic markers ; Ribonucleic acid--RNA ; Genomics ; Index Medicus