In this thesis we present the design and implementation of an OpenCL kernel on an FPGA. We first describe the limits on further performance gains from traditional CPU technology. The end of frequency scaling has caused a shift to multicore processing. However, multicore processing yields diminishing returns in real application performance because of limits in I/O and memory bandwidth. Heterogeneous computing is one way to increase both performance and efficiency, but writing software for such systems is challenging. OpenCL is a framework for heterogeneous systems that was originally developed by Apple Inc. and is now maintained by the Khronos Group. It allows programs to run on multicore CPUs, GPUs, DSPs, and FPGAs. Altera introduced the SDK for OpenCL, which converts OpenCL code into kernels that can run on an FPGA device. In this thesis we give a user-centric overview of the Altera SDK for OpenCL. In the comparison study we take a matrix multiplication function and compare its execution time on an ordinary CPU with the kernel execution time on the FPGA. Within the same study we also compare the kernel execution time of the optimized FPGA-B kernel with that of the non-optimized FPGA-A kernel. We find (i) that the same calculation using the OpenCL model provides a speedup of 131 times over ordinary CPU execution and (ii) that kernel optimization alone provides a speedup of 15 times.
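As an illustration of the CPU side of such a comparison, the following is a minimal sketch of a naive matrix multiplication and of the speedup ratio used to compare two execution times. It is not the code used in the thesis: the function names `matmul_cpu` and `speedup`, the row-major square-matrix layout, and the use of single-precision floats are all assumptions made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical CPU baseline (not the thesis code): naive multiplication
 * of two n x n matrices stored in row-major order, c = a * b.
 */
void matmul_cpu(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of a with column j of b. */
            for (size_t k = 0; k < n; ++k)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
    }
}

/*
 * Speedup of a candidate over a baseline, as a ratio of execution times.
 * A value above 1 means the candidate is faster than the baseline.
 */
double speedup(double baseline_seconds, double candidate_seconds)
{
    return baseline_seconds / candidate_seconds;
}
```

In this reading, a reported speedup of 131 times means the measured CPU time divided by the measured FPGA kernel time is about 131; the same ratio applied to the FPGA-A and FPGA-B kernel times gives the effect of kernel optimization.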