Resources
Explore the GenomeQuest solution further, including details on the GQ platform and some frequently asked questions from next-generation customers.
HS3 is GenomeQuest's arsenal of powerful sequence search methods and is the entry point for any next-generation sequencing data file. The suite allows local and global alignment methods to be combined and deployed in tandem to increase the yield of useful sequence data from next-generation sequencing runs. HS3 enables users to take high-performance computing for granted and focus instead on the biology of the question at hand.
The centerpiece of HS3 is a high-speed, word-based algorithm that identifies highly similar sequences quickly. The algorithm has no read-length limitation, allows user-definable word lengths and mismatch stringencies, and can handle gaps and sequence reads from any sequencing vendor platform. The scalability and ultra-high throughput of HS3 make it well suited for high-volume, "all-against-all" sequence comparisons.
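To make the idea of a word-based search concrete, here is a minimal sketch of a generic word (k-mer) seeding approach with a user-chosen word length and mismatch limit. It is not HS3's implementation, which is not described here. The function names `build_word_index` and `find_similar`, the parameters `word_length` and `max_mismatches`, and the toy sequences are all assumptions made for illustration, and the sketch handles mismatches only, not gaps.

```python
from collections import defaultdict


def build_word_index(subjects, word_length):
    """Map each word (k-mer) to the (subject_id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for subject_id, sequence in subjects.items():
        for offset in range(len(sequence) - word_length + 1):
            word = sequence[offset:offset + word_length]
            index[word].append((subject_id, offset))
    return index


def find_similar(query, subjects, index, word_length, max_mismatches):
    """Return sorted (subject_id, start, mismatches) for ungapped placements
    of the whole query that stay within the mismatch limit."""
    hits = set()
    for q_offset in range(len(query) - word_length + 1):
        word = query[q_offset:q_offset + word_length]
        # An exact word match "seeds" a candidate placement of the query.
        for subject_id, s_offset in index.get(word, ()):
            start = s_offset - q_offset
            subject = subjects[subject_id]
            if start < 0 or start + len(query) > len(subject):
                continue  # the query would overhang the subject
            window = subject[start:start + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches <= max_mismatches:
                hits.add((subject_id, start, mismatches))
    return sorted(hits)


# Hypothetical toy data, for illustration only.
subjects = {"s1": "ACGTACGTTAGC", "s2": "TTTTGGGGCCCC"}
index = build_word_index(subjects, word_length=4)
print(find_similar("CGTTAGA", subjects, index, word_length=4, max_mismatches=1))
```

Only subject positions that share at least one exact word with the query are ever compared in full, which is why this style of search scales better than comparing every query against every subject position. A shorter word length finds more candidates at the cost of more work, and a higher mismatch limit accepts more distant matches.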
We recently completed a project involving a metagenomics-scale comparison of the output from several next-generation sequencing runs against a RefSeq Genome collection. The details of one specific run give a better sense of the speed of the HS3 algorithm.
The query set comprised 375,661 sequences totaling about 103.4 million base pairs. The subject set contained about 19 million sequences totaling about 66 billion base pairs.
This sequence comparison run was conducted on a small cluster of 7 compute nodes and a head node. The compute nodes were identical, each with dual quad-core CPUs running at 2.33 GHz, for a total of 56 cores. Each compute node had 16 GB of RAM and 500 GB of local storage.
The head node was also a dual quad-core system, running at 2.66 GHz, with 64 GB of RAM and 4 TB of local storage.
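The following short calculation puts those figures in perspective. It uses only the numbers quoted above, several of which are stated as approximate, so the results are rough estimates. The "exhaustive cells" figure is an illustrative assumption: it is the number of base-against-base comparisons a naive full dynamic-programming alignment of every query base against every subject base would need, and it is not a measurement of what HS3 does.

```python
# Figures quoted in the text (several are approximate).
query_sequences = 375_661
query_bases = 103.4e6
subject_sequences = 19e6
subject_bases = 66e9

compute_nodes = 7
cores_per_node = 2 * 4      # dual quad-core CPUs
ram_per_node_gb = 16

# Derived quantities.
mean_query_length = query_bases / query_sequences
mean_subject_length = subject_bases / subject_sequences
total_cores = compute_nodes * cores_per_node
total_ram_gb = compute_nodes * ram_per_node_gb

# Illustrative assumption: cost of a naive all-bases-against-all-bases comparison.
exhaustive_cells = query_bases * subject_bases

print(f"mean query length:   {mean_query_length:.0f} bases")
print(f"mean subject length: {mean_subject_length:.0f} bases")
print(f"compute cores:       {total_cores}")
print(f"compute-node RAM:    {total_ram_gb} GB")
print(f"exhaustive cells:    {exhaustive_cells:.2e}")
```

The queries average roughly 275 bases each, and the 56-core total matches the figure stated above. A naive comparison on the order of 10^18 cells helps explain why a word-based method that examines only seeded candidates is attractive at this scale.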