ChIP-Seq
The GenomeQuest ChIP-Seq solution provides state-of-the-art tools across the entire workflow, including alignment, peak modeling, and interactive analysis. It integrates the popular Model-based Analysis of ChIP-Seq (MACS) peak modeling software, exposes more than 15 key parameters, and allows multiple runs to be stacked and analyzed collectively.
From a web browser, researchers simply upload their read database and fill out two forms – one for alignment and one for peak modeling. For the reference database, users can select from a list of databases that GenomeQuest has aggregated and qualified, with extended annotations.
The interactive sequence browser of the ChIP-Seq workflow includes a table of peak modeling results with columns for gene name and description, chromosome, peak start and stop positions, peak length and high point, and all peak statistics. After their analysis, researchers can save all or part of this table as a new annotated database and share it as a reference for follow-on work.
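To make the idea of saving part of the results table concrete, here is a minimal Python sketch of selecting a subset of peak rows. It is only an illustration under stated assumptions: the column keys (`gene`, `chromosome`, `start`, `stop`, `high_point`), the `select_peaks` helper, and the sample rows are all hypothetical, and nothing here reflects how GenomeQuest stores or exports its table.

```python
from typing import Any, Dict, List, Optional

# Hypothetical row layout: one dict per peak, keyed by assumed column names.
Peak = Dict[str, Any]


def select_peaks(
    peaks: List[Peak],
    min_length: int = 0,
    chromosome: Optional[str] = None,
) -> List[Peak]:
    """Return the peaks that pass the filters, each with a derived 'length'."""
    selected: List[Peak] = []
    for peak in peaks:
        # Peak length is derived from the start and stop positions.
        length = int(peak["stop"]) - int(peak["start"])
        if length < min_length:
            continue
        if chromosome is not None and peak["chromosome"] != chromosome:
            continue
        selected.append({**peak, "length": length})
    return selected


if __name__ == "__main__":
    # Made-up rows for illustration only; not real peak calls.
    rows = [
        {"gene": "geneA", "chromosome": "chr1", "start": 100, "stop": 400, "high_point": 250},
        {"gene": "geneB", "chromosome": "chr2", "start": 1000, "stop": 1100, "high_point": 1040},
    ]
    for row in select_peaks(rows, min_length=200):
        print(row["gene"], row["chromosome"], row["length"])
```

The filtered list could then be written out in whatever format the follow-on work expects; the choice of a plain list of dictionaries is simply the smallest structure that mirrors a table with named columns.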
The GenomeQuest ChIP-Seq parameters for alignment include the read database, clean-up, low-quality base trimming and removal, and repeat removal. Peak modeling parameters for MACS include the aligned database, control alignment database (optional), mappable genome size, sequence read size, and sheared genomic fragment size.
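For readers who know MACS from the command line, the peak modeling parameters listed above correspond roughly to a handful of its options. The Python sketch below assembles such a command as an argument list without running it. It assumes the later MACS2 `callpeak` interface (`-t`, `-c`, `-g`, `-s`, `--bw`, `-n`), which may differ from the MACS version integrated into GenomeQuest. The `build_macs2_command` helper and the file names are hypothetical, and treating the sheared fragment size as the `--bw` band width is an assumption based on MACS guidance that this value can be set to the expected sonication fragment size.

```python
from typing import List, Optional


def build_macs2_command(
    treatment: str,
    genome_size: str,
    read_size: int,
    fragment_size: int,
    control: Optional[str] = None,
    name: str = "chipseq_run",
) -> List[str]:
    """Assemble a `macs2 callpeak` argument list; the command is not executed."""
    cmd = ["macs2", "callpeak", "-t", treatment]  # -t: aligned ChIP reads
    if control is not None:
        cmd += ["-c", control]  # -c: optional control alignment
    cmd += [
        "-g", genome_size,          # mappable genome size, e.g. "hs" or a number
        "-s", str(read_size),       # sequence read (tag) size
        "--bw", str(fragment_size), # band width, assumed here to be the sheared fragment size
        "-n", name,                 # prefix for output files
    ]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the resulting command.
    print(" ".join(build_macs2_command("chip.bam", "hs", 36, 300, control="input.bam")))
```

Building the command as a list rather than a single string keeps each value as a separate argument, so it can be passed to `subprocess.run` without shell quoting concerns if one chooses to execute it.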
Full details of the open-source MACS technology and bios of its authors, Yong Zhang and Tao Liu, are available on the MACS website and in their paper, published in Genome Biology.
"Genome-wide measurements of protein-DNA interactions and transcriptomes are increasingly done by deep DNA sequencing methods (ChIP-Seq and RNA-Seq)… Whereas early adopters necessarily developed their own custom computer code to analyze the first ChIP-Seq and RNA-Seq datasets, a new generation of more sophisticated algorithms and software tools are emerging to assist in the analysis phase of these projects…. parameters are often not fully known in advance, which means that computational analysis for a given experiment is usually performed iteratively and repeatedly, with results dictating whether additional sequencing is needed and is cost-effective… this means that the choice of software for running ChIP-Seq analysis favors packages that are simple to use repeatedly with multiple datasets."
- Nature Methods, November 2009
GenomeQuest Applications
All GenomeQuest science applications are delivered in an easy-to-use, integrated environment. They package all bioinformatics details – including read and reference data, scientific algorithms, third-party integrations, parameters, and the compute environment – into a simple form-based interface. Users query results using an interactive sequence browser. All data, parameters, and results can be shared with colleagues and teams using the GenomeQuest collaborative environment.
GenomeQuest applications are built to scale to thousands of samples, each with billions of reads. They support all popular sequencing machines (including Illumina, SOLiD, 454, and PacBio) and all common formats (including FASTQ, FASTA, SAM/BAM, VCF, SVA, and others).
