PageRank is Google's algorithm for ranking web pages by relevance. The pages can then be sorted by rank to provide better search results.
This MSc thesis considers how web search works, its relevance and general properties, and its weaknesses before the appearance of Google. One of the central questions is whether the mathematics behind the PageRank algorithm can be explained formally, and what mathematical knowledge this requires. Finally, we present an example implementation in the form of a web application, which demonstrates how PageRank works on a simplified model of the web.
The thesis presents the mathematics behind the PageRank algorithm, which draws on linear algebra and graph theory. Besides a formal mathematical description of the algorithm, we provide examples to illustrate how it works. The web is modeled as a directed graph, to which we assign a certain matrix. The result of PageRank on this matrix is the eigenvector corresponding to the matrix's eigenvalue 1. The eigenvector is computed with the power iteration method. We consider problems that can occur during the computation: whether the result of the power iteration method always makes sense, whether a given example can have more than one solution, and whether the result depends on the starting parameters.
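The power iteration described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from the thesis: the function name `pagerank`, the dictionary representation of the graph, the damping factor of 0.85, and the uniform redistribution of rank from pages without outgoing links are all assumptions that do not appear in this abstract. They are one common way to make the iteration converge to a single result regardless of the starting vector.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
           Every link target must itself be a key of the dict.
    damping: assumed damping factor (not taken from the abstract).
    """
    pages = sorted(links)
    n = len(pages)
    # Start from the uniform distribution over all pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Assumption: a page with no out-links spreads its rank
                # evenly over all pages, so the total rank stays 1.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share

        # Stop once successive iterates agree in the 1-norm.
        diff = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if diff < tol:
            break
    return rank


# Hypothetical four-page web; page "D" has no outgoing links.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": []}
ranks = pagerank(web)
```

The returned values sum to 1 (up to rounding), so they can be read as a probability distribution over pages. With `damping=1.0` and no special handling of pages without out-links, the iteration can fail to converge or can depend on the starting vector, which is the kind of problem the thesis examines.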
The main objective of this thesis is to provide a broad and concrete insight into web search, with an emphasis on PageRank, from the historical, mathematical and computer science viewpoints. We provide relevant examples to demonstrate how the algorithm works. With these examples we also demonstrate problems that can occur during computation with the power iteration method, together with their solutions. PageRank is presented comprehensively, in a way suitable for readers who have basic knowledge of web search, linear algebra and graph theory but still need an introduction to some advanced concepts.