The goal of this thesis was to implement a sequential algorithm that searches a genome for subsequences, and then to reduce its execution time by designing a parallel version and implementing it on a graphics card. The sequential algorithm searches for predefined subsequences in a genome represented as a sequence of characters. For each subsequence, it calculates the frequency of occurrences and the frequency of interactions at predefined positions and at randomly modified positions in the genome. From these frequencies it identifies sequences that occur more often at certain locations in the given genome. Using data on protein-RNA interactions at certain genome locations, together with the character sequences found, the algorithm calculates and statistically evaluates the interaction frequencies. The sequential algorithm was implemented in the C programming language, and the parallel version was implemented with OpenCL.
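As a rough illustration of the counting step described above, the following C sketch counts how often a subsequence occurs in a genome string, either over the whole genome or among start positions in a given region. It is only a minimal sketch under stated assumptions: the function names `count_occurrences` and `count_in_region`, the half-open region convention, and the counting of overlapping matches are hypothetical choices made for this example, not the implementation developed in the thesis.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count occurrences of `motif` whose start index lies
 * in the half-open range [start, end) of `genome`. Overlapping matches are
 * counted (an assumption of this sketch). */
size_t count_in_region(const char *genome, const char *motif,
                       size_t start, size_t end)
{
    assert(genome != NULL && motif != NULL);

    size_t n = strlen(genome);
    size_t k = strlen(motif);
    size_t count = 0;

    if (k == 0 || k > n)
        return 0;
    if (end > n)
        end = n; /* clamp the region to the genome length */

    /* Slide a window of length k over the allowed start positions. */
    for (size_t i = start; i < end && i + k <= n; i++) {
        if (memcmp(genome + i, motif, k) == 0)
            count++;
    }
    return count;
}

/* Hypothetical helper: count occurrences of `motif` in the whole genome. */
size_t count_occurrences(const char *genome, const char *motif)
{
    assert(genome != NULL && motif != NULL);
    return count_in_region(genome, motif, 0, strlen(genome));
}
```

Each start position is examined independently of the others, which is one reason a search of this kind lends itself to a data-parallel implementation on a graphics card.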