Leveraging biological replicates to improve analysis in ChIP-seq experiments.

Title: Leveraging biological replicates to improve analysis in ChIP-seq experiments.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Yang, Y, Fear, J, Hu, J, Haecker, I, Zhou, L, Renne, R, Bloom, D, McIntyre, LM
Journal: Comput Struct Biotechnol J
Date Published: 2014

ChIP-seq experiments identify genome-wide profiles of DNA-binding molecules including transcription factors, enzymes and epigenetic marks. Biological replicates are critical for reliable site discovery and are required for the deposition of data in the ENCODE and modENCODE projects. While early reports suggested two replicates were sufficient, the widespread application of the technique has led to an emerging consensus that the technique is noisy and that increasing replication may be worthwhile. Additional biological replicates also allow for quantitative assessment of differences between conditions. To date, how to confirm peak identification and determine signal strength across biological replicates has remained controversial, particularly when the number of replicates is greater than two. Using objective metrics, we evaluate the consistency of biological replicates in ChIP-seq experiments with more than two replicates. We compare several approaches for binding site determination, including two popular but disparate peak callers, CisGenome and MACS2. Here we propose read coverage as a quantitative measurement of signal strength for estimating sample concordance. Determining binding based on genomic features, such as promoters, is also examined. We find that increasing the number of biological replicates increases the reliability of peak identification. Critically, binding sites with strong biological evidence may be missed if researchers rely on only two biological replicates. When more than two replicates are performed, a simple majority rule (>50% of samples identify a peak) identifies peaks more reliably in all biological replicates than the absolute concordance of peak identification between any two replicates, further demonstrating the utility of increasing replicate numbers in ChIP-seq experiments.
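The majority rule described in the abstract can be illustrated with a minimal sketch. It assumes that peak calls have already been summarized per genomic feature (for example, per promoter), so that each replicate is represented as a set of feature identifiers. The function name `majority_rule` and the gene identifiers are hypothetical and are not taken from the paper; this is an illustration of the rule, not the authors' implementation.

```python
from collections import Counter
from typing import Hashable, Iterable, Set


def majority_rule(calls_per_replicate: Iterable[Set[Hashable]]) -> Set[Hashable]:
    """Return features with a peak called in strictly more than 50% of replicates."""
    replicates = [set(calls) for calls in calls_per_replicate]
    if not replicates:
        return set()
    # Count, for each feature, how many replicates called a peak there.
    counts = Counter(feature for calls in replicates for feature in calls)
    n = len(replicates)
    # 2 * k > n is the integer form of k / n > 0.5.
    return {feature for feature, k in counts.items() if 2 * k > n}


# Hypothetical feature IDs with a called peak in each of three replicates.
rep1 = {"geneA", "geneB", "geneC"}
rep2 = {"geneA", "geneC", "geneD"}
rep3 = {"geneA", "geneB"}

consensus = majority_rule([rep1, rep2, rep3])
print(sorted(consensus))  # ['geneA', 'geneB', 'geneC']
```

With three replicates, a feature needs calls in at least two of them to pass. Note that with exactly two replicates the strict >50% threshold requires both replicates to agree.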

Alternate Journal: Comput Struct Biotechnol J
PubMed ID: 24688750
PubMed Central ID: PMC3962196
Grant List: R01 AI048633 / AI / NIAID NIH HHS / United States
R01 CA088763 / CA / NCI NIH HHS / United States
R01 GM102227 / GM / NIGMS NIH HHS / United States