ALGA: de novo assembly from NGS reads
About ALGA
ALGA (ALgorithm for Genome Assembly) is a genome-scale de novo sequence assembler based on the overlap graph approach. It takes as input reads from next-generation DNA sequencing, paired or unpaired. No parameters need to be set by the user: ALGA adjusts them internally on the basis of the input data. The only optional parameter is the maximum allowed error rate in overlaps between reads, whose default (and suggested) value is 0.
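To make the overlap graph idea and the role of the error-rate parameter concrete, here is a minimal Python sketch. It is not ALGA's implementation: the function names, the `min_len` threshold, the all-pairs comparison, and the use of mismatch counting (Hamming distance) to measure overlap errors are all assumptions made for illustration.

```python
def overlap_length(a, b, min_len=3, max_error_rate=0.0):
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    Assumption: errors are counted as mismatching positions, and an overlap
    is accepted when mismatches <= max_error_rate * overlap length.
    Returns 0 if no overlap of at least `min_len` bases qualifies.
    """
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        suffix = a[len(a) - length:]
        prefix = b[:length]
        mismatches = sum(x != y for x, y in zip(suffix, prefix))
        if mismatches <= max_error_rate * length:
            return length
    return 0


def build_overlap_graph(reads, min_len=3, max_error_rate=0.0):
    """Directed graph: edge i -> j when a suffix of read i overlaps a prefix of read j.

    Naive all-pairs comparison, suitable only for toy inputs.
    """
    graph = {i: [] for i in range(len(reads))}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                length = overlap_length(a, b, min_len, max_error_rate)
                if length:
                    graph[i].append((j, length))
    return graph


reads = ["ACGTAC", "GTACGG", "CGGTTA"]
# With the default error rate of 0, only exact overlaps become edges:
# read 0 -> read 1 (overlap 4) and read 1 -> read 2 (overlap 3).
print(build_overlap_graph(reads))
```

In this toy model, raising `max_error_rate` admits overlaps with mismatches and therefore adds edges to the graph, while a value of 0 keeps only exact overlaps.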
ALGA incorporates several new ideas that yield more accurate contigs in acceptable time. These include the construction of a sparse yet informative graph, a graph reduction step that includes a procedure related to the minimum spanning tree problem on a local subgraph, and a graph traversal performed together with analysis of the contigs stored so far. The algorithm is one of the tools used for data processing in the ongoing national project Genomic Map of Poland.
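For readers unfamiliar with the minimum spanning tree problem mentioned above, the following Python sketch shows the generic problem on a small undirected weighted graph, solved with Kruskal's algorithm. How ALGA selects the local subgraph, how it weights edges, and how it uses the result are not described here, so the node numbering and edge weights below are purely hypothetical.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm on an undirected graph.

    `edges` is a list of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    Returns the list of edges kept in the spanning tree (or forest, if the
    graph is disconnected).
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Hypothetical local subgraph with four nodes; weights are illustrative only.
local_edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
print(minimum_spanning_tree(4, local_edges))
```

The tree keeps the cheapest set of edges that still connects all nodes, which is one way a dense local neighbourhood can be thinned without disconnecting it.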
Availability
- ALGA: the source code of the assembler and the user guide
- ALGA supplements: additional material from the comparison with other de novo assemblers