In this master's thesis we studied, implemented and compared sequential and parallel sorting algorithms. We implemented seven algorithms: bitonic sort, multistep bitonic sort, adaptive bitonic sort, merge sort, quicksort, radix sort and sample sort. Sequential algorithms were implemented on a central processing unit using C++, whereas parallel algorithms were implemented on a graphics processing unit using CUDA architecture. We improved the above mentioned implementations and adopted them to be able to sort input sequences of arbitrary length. We compared algorithms on six different input distributions, which consist of 32-bit numbers, 32-bit key-value pairs, 64-bit numbers and 64-bit key-value pairs. The results show that radix sort is the fastest sequential sorting algorithm, whereas radix sort and merge sort are the fastest parallel algorithms (depending on the input distribution). With parallel implementations we achieved speedups of up to 157-times in comparison to sequential implementations.