Iskanje podobnih primerov v večrazsežnih prostorih : master's thesis

Kariž, Primož

Kariž, Primož (Author), Robnik Šikonja, Marko (Mentor)

PDF - Presentation file (6.44 MB)
MD5: 4713E3EB1464C2D3AC4EA54F3D6B7764

Abstract

Nearest-neighbour search is used in many fields, and it is important to be able to find nearest neighbours quickly. In high-dimensional spaces we do not know how to find exact neighbours quickly, so we settle for approximate ones. In this master's thesis we describe the most widely used exact and approximate methods for nearest-neighbour search. The exact methods are the R-, R*-, KD-, M-, PM- and ball-tree; the approximate ones are the RKD-tree, LSH, hierarchical k-means and the boundary forest. We implemented some of them ourselves and took others from existing libraries. We present and analyse the test results for nearest-neighbour search speed, accuracy and memory consumption. In the Python programming language we developed a library that contains the described methods and allows them to be used simply and uniformly through a programming interface. The library also supports automatic selection of the most suitable algorithm for a given dataset. The algorithm is selected using two decision trees that we built by analysing the test results.

Language:	Slovenian
Keywords:	algorithms, data structures, nearest-neighbour search, approximate nearest neighbours, high-dimensional space, R-tree, R*-tree, M-tree, PM-tree, ball-tree, KD-tree, RKD-tree, LSH, hierarchical k-means, computer science, computer and information science, master's theses
Work type:	Master's thesis/paper
Typology:	2.09 - Master's Thesis
Organization:	FRI - Faculty of Computer and Information Science
Publisher:	[P. Kariž]
Year:	2015
Number of pages:	126 pp.
PID:	20.500.12556/RUL-70240
UDC:	004.42(043.2)
COBISS.SI-ID:	1536276675
Publication date in RUL:	10.07.2015
Views:	1370
Downloads:	249

Licences

License:	CC BY-SA 2.5 SI, Creative Commons Attribution-ShareAlike 2.5 Slovenia
Link:	https://creativecommons.org/licenses/by-sa/2.5/si/deed.en
Description:	You are free to reproduce and redistribute the material in any medium or format. You are free to remix, transform, and build upon the material for any purpose, even commercially. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Secondary language

Language:	English
Title:	Searching nearest neighbours in high dimensional spaces
Abstract:	Nearest-neighbour search is used in many different problems, so it is important to be able to find nearest neighbours quickly. In high-dimensional spaces no fast exact methods are known, so we have to settle for approximate nearest neighbours. In this master's thesis we describe some well-known exact and approximate methods for nearest-neighbour search. The exact methods described are the R-, R*-, KD-, M-, PM- and ball-tree; the approximate ones are the RKD-tree, LSH, hierarchical k-means and the boundary forest. We implemented some of them ourselves and took others from existing libraries. We present and analyze the test results in terms of search speed, precision and memory requirements of the methods. We developed a library in the Python programming language that includes the described methods and provides a simple and consistent API. The library also allows automatic selection of the most suitable algorithm for a given dataset, based on two decision trees built from an analysis of the results.
Keywords:	algorithms, data structures, nearest neighbours search, approximate nearest neighbours, high-dimensional space, R-tree, R*-tree, M-tree, PM-tree, ball-tree, KD-tree, RKD-tree, LSH, hierarchical k-means, computer science, computer and information science, master's degree
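
The abstract contrasts exact and approximate nearest-neighbour search. A minimal sketch of that trade-off follows; it is not the thesis library or its API, which this record does not show. The names `exact_knn` and `HyperplaneLSH` are hypothetical. The sketch assumes Euclidean distance and uses a single hash table of random hyperplanes, which is one LSH family (designed for angular similarity) and may differ from the LSH variant evaluated in the thesis.

```python
import math
import random
from collections import defaultdict


def exact_knn(points, query, k):
    """Exact k nearest neighbours by linear scan (brute force)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


class HyperplaneLSH:
    """Single-table LSH with random hyperplanes (hypothetical, illustrative only)."""

    def __init__(self, points, n_bits=8, seed=0):
        rng = random.Random(seed)
        dim = len(points[0])
        self.points = points
        # Each hyperplane passes through the origin and has a random Gaussian normal.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)
        for i, p in enumerate(points):
            self.buckets[self._key(p)].append(i)

    def _key(self, p):
        # One bit per hyperplane: the side of the plane on which the point lies.
        return tuple(sum(a * b for a, b in zip(plane, p)) >= 0.0 for plane in self.planes)

    def query(self, q, k):
        # Only points in the query's bucket are candidates, so true
        # neighbours that hashed to other buckets can be missed.
        candidates = self.buckets.get(self._key(q), [])
        ranked = sorted(candidates, key=lambda i: math.dist(self.points[i], q))
        return ranked[:k]


# Synthetic data, assumed here purely for illustration.
rng = random.Random(42)
data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(2000)]
q = [rng.gauss(0.0, 1.0) for _ in range(16)]

exact = exact_knn(data, q, 10)
approx = HyperplaneLSH(data, n_bits=6).query(q, 10)

# Fraction of the true 10 nearest neighbours that the approximate search found.
recall = len(set(exact) & set(approx)) / len(exact)
print(f"recall@10 = {recall:.2f}")
```

The linear scan compares the query with every point, while the hash table inspects only one bucket. Raising `n_bits` makes buckets smaller, which speeds up queries but tends to lower recall. Practical LSH schemes usually offset this with several independent tables.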
