Long-read sequencing technologies are currently hindered by high error rates in their output, a consequence of the technical design of the approach. Because long reads offer many advantages, these error rates tend to be disregarded, yet they prevent accurate analysis of the data. Algorithms for long-read error correction can be divided into two groups: hybrid and non-hybrid. Hybrid methods exploit the high accuracy of short reads to correct errors in long reads. Non-hybrid methods, by contrast, perform self-correction using long reads alone, and usually include a step that generates consensus sequences from overlap information. In general, both kinds of algorithm improve the quality of long reads. Handling high error rates is not enough, however: the algorithms must also be computationally efficient and able to process large data sets. It is hard to imagine DNA sequencing today without long-read technologies, which have helped solve sequencing problems that were intractable until recently. Further development of error-correction algorithms will only make long-read sequencing more reliable and efficient.
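
To illustrate the consensus step mentioned for non-hybrid methods, the following is a minimal sketch, not the procedure of any particular tool. It assumes the overlapping reads have already been aligned to a common coordinate system and padded with a gap character so that they have equal length; the function name `consensus` and the gap symbol `-` are hypothetical choices made for this example. Each column of the alignment is resolved by a simple majority vote.

```python
from collections import Counter


def consensus(aligned_reads, gap="-"):
    """Column-wise majority vote over reads that are already aligned.

    Assumption: every read has been padded with `gap` to the same length.
    Columns whose most frequent symbol is the gap are dropped from the result.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    bases = []
    for column in zip(*aligned_reads):
        # Pick the most frequent symbol in this alignment column.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            bases.append(symbol)
    return "".join(bases)


if __name__ == "__main__":
    # Three overlapping reads; the second one carries a substitution error.
    reads = ["ACGTTA", "ACGATA", "ACGTTA"]
    print(consensus(reads))
```

Real correctors must first compute the overlaps and alignments that this sketch takes as given, and that step is typically where most of the computational cost lies.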