The goal of this thesis was to implement a sequential algorithm that searches
for subsequences in a genome. To reduce its execution time, we designed a
parallel version of the algorithm and implemented it on a graphics card. The
sequential algorithm searches for predefined subsequences in a genome
represented as a sequence of characters. For each subsequence, it calculates
the frequency of occurrences and the frequency of interactions at predefined
positions and at randomly modified positions in the genome. From these
frequencies it identifies sequences that occur more often at certain locations
in the given genome. Using data on protein-RNA interactions at certain
locations in the genome, together with the character sequences found, the
algorithm calculates and statistically evaluates the interaction frequencies.
The sequential algorithm was implemented in the C programming language, and the
parallel version was implemented with OpenCL.
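
The counting step described above can be illustrated with a minimal sketch in
C. It is not the implementation from the thesis: the function names
count_occurrences and count_at_positions, the zero-based position array, and
the rule that a subsequence must start exactly at a given position are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count all (possibly overlapping) occurrences of
 * `motif` in `genome`, both given as NUL-terminated character strings. */
size_t count_occurrences(const char *genome, const char *motif)
{
    assert(genome != NULL && motif != NULL);
    size_t n = strlen(genome);
    size_t k = strlen(motif);
    size_t count = 0;

    if (k == 0 || k > n)
        return 0;

    /* Slide a window of length k over the genome and compare. */
    for (size_t i = 0; i + k <= n; ++i) {
        if (memcmp(genome + i, motif, k) == 0)
            ++count;
    }
    return count;
}

/* Hypothetical helper: count how many of the given zero-based positions
 * are the start of an occurrence of `motif`. Positions too close to the
 * end of the genome to hold the whole motif are skipped. */
size_t count_at_positions(const char *genome, const char *motif,
                          const size_t *positions, size_t npos)
{
    assert(genome != NULL && motif != NULL);
    assert(positions != NULL || npos == 0);
    size_t n = strlen(genome);
    size_t k = strlen(motif);
    size_t count = 0;

    if (k == 0 || k > n)
        return 0;

    for (size_t j = 0; j < npos; ++j) {
        size_t p = positions[j];
        if (p <= n - k && memcmp(genome + p, motif, k) == 0)
            ++count;
    }
    return count;
}
```

Calling count_at_positions once with the predefined positions and once with
randomly modified positions would give two counts that can be compared for each
subsequence. Each window comparison is independent of the others, which is the
property that makes this kind of search a natural candidate for a parallel
implementation.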