Recently adopted deep sequencing techniques pose a new data processing challenge: mapping short fragment reads, at the scale of several hundred thousand, to open-access eukaryotic genomes. This problem can be solved with BLAST, BWA and similar sequence alignment tools. BLAST is one of the most frequently used tools in bioinformatics, and BWA is a relatively new, fast, lightweight tool that aligns short sequences efficiently. Local installations of these tools typically cannot handle such large problem sizes, so the sequence alignment process runs slowly, while web-based implementations cannot accept a high number of queries. The HP-SEE infrastructure provides access to massively parallel supercomputing resources. Using gUSE/WS-PGRADE, we have successfully created an online Bioinformatics eScience Gateway that is capable of serving the short fragment sequence alignment demand of the regional bioinformatics communities within the SEE region. Using workflows, we have ported the BLAST and BWA algorithms to the massively parallel HP-SEE infrastructure. In this paper we describe the Bioinformatics eScience Gateway and show, as a case study, how we implemented the ported BLAST workflow as a parameter study. With our online service, researchers can run high-throughput sequence alignments against eukaryotic genomes on HP-SEE’s supercomputing infrastructure to search for regulatory mechanisms controlled by short fragments.
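The parameter-study idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the gateway's actual gUSE/WS-PGRADE workflow, which the abstract does not detail. It only assumes the common pattern of splitting a set of query reads into chunks, aligning each chunk as an independent job, and merging the partial results. All function names (`read_fasta_records`, `split_into_chunks`, `merge_results`) are hypothetical, and the alignment step itself is left out.

```python
from typing import List


def read_fasta_records(text: str) -> List[str]:
    """Return each FASTA record (header line plus sequence lines) as one string."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # A new header closes the record collected so far.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def split_into_chunks(records: List[str], n_chunks: int) -> List[List[str]]:
    """Generator step (assumed): distribute records round-robin over the jobs."""
    n = max(1, min(n_chunks, len(records)))
    chunks: List[List[str]] = [[] for _ in range(n)]
    for i, record in enumerate(records):
        chunks[i % n].append(record)
    return chunks


def merge_results(partial_results: List[List[str]]) -> List[str]:
    """Collector step (assumed): concatenate per-job hit lists in job order."""
    return [hit for part in partial_results for hit in part]
```

Each chunk would then be passed to one alignment job (for example, one BLAST run per chunk), and the per-job outputs handed to `merge_results`. Round-robin distribution is chosen here only because it keeps chunk sizes balanced without knowing read lengths in advance.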
Speakers:
Mr Akos Balasko (SZTAKI)
Mr Gergely Windisch (Obuda University - Hungary)
Dr Miklos Kozlovszky (OU)