Details

Metodološki pristopi in praktični vidiki podatkovnega rudarjenja v antropologiji : doktorska disertacija
Pretnar Žagar, Ajda (Author); Podjed, Dan (Mentor); Zupan, Blaž (Co-mentor)

PDF - Presentation file (5.68 MB)
MD5: DD6A72A8EB641FF4584FE365C88FEB34

Abstract
Analiza velikih podatkov postaja v zadnjih letih vse bolj priljubljena. To se kaže tudi na področju antropologije, in sicer v pojavu novega področja, imenovanega računska antropologija. Čeprav se računalniki uporabljajo v antropologiji že vsaj od sredine 20. stoletja in četudi je računsko družboslovje danes uveljavljena veda, je antropoloških raziskav, ki uporabljajo računske pristope za analizo podatkov, relativno malo. Prva, ki sta opozorila na uporabnost računalnikov kot raziskovalnih pripomočkov v antropologiji, sta bila Lévi-Strauss, ki je želel učinkovito organizirati osnovne enote mitov, ter Edmund Leach, ki je iskal splošne vzorce družb. Tudi v Sloveniji je že leta 1970 Helena Ložar-Podlogar uporabljala sistem luknjanih kartic za zapisovanje gradiva o ženitovanjskih šegah, leta 1991 pa je Fikfak s sodelavci oblikoval digitalizirani informacijski sistem Göthove topografije. Kljub temu so študije ostajale večinoma kvalitativne, kvantitativni podatki pa so se obravnavali kot sekundarno gradivo. V drugem desetletju 21. stoletja smo priča tudi metodološkemu in epistemološkemu napredku pri vključevanju računskih analiz v antropološke raziskave. Povezovanje kvalitativnih in kvantitativnih pristopov izhaja iz tradicije mešanih metod, pri čemer pristopi zajemajo sodelovalne opise vizualizacij (t. i. etno-rudarjenje), kalibracijo analitskih parametrov in interpretacijo ugotovitev z etnografijo (t. i. zvezovanje), oblikovanje novega konceptualnega polja z obeh analitskih izhodišč (t. i. zlivanje) ter eksplicitno interdisciplinarno metodološko eksperimentiranje (t. i. hibridna metodologija). V disertaciji predlagam nov metodološki pristop, imenovan krožna mešana metoda, ki sočasno izhaja iz obeh izhodišč in vključuje večkratno reformulacijo hipotez in ugotovitev na podlagi analize podatkov ter etnografskih pripovedi.
Na primeru analize senzorskih meritev kvalitete delovnih prostorov določim prednosti in slabosti posameznega pristopa, raziščem možnosti prepletanja etnografije s podatkovnim rudarjenjem ter oblikujem metodološke smernice za uporabo računskih metod v antropologiji. Krožne mešane metode sicer izhajajo iz tradicije interdisciplinarnih raziskav v antropologiji, natančneje iz splošnih mešanih metod. Če se pri splošnih mešanih metodah zaporedno ali vzporedno prepletata kvalitativni in kvantitativni pristop, je poudarek pri krožnih mešanih metodah ravno na krožnosti. Krožnost lahko opišemo kot kontinuirano prehajanje med obema načinoma, kjer spodbuda za menjavo tehnike izhaja iz problema v podatkih. Ko z eno metodo izčrpamo možnosti novih informacij, uporabimo drugo metodo, ki problem osvetli na povsem nov način. Raziskovanje s pomočjo krožnih mešanih metod prikažem na primeru stavbe Fakultete za računalništvo in informatiko (FRI) Univerze v Ljubljani. V kompleksu, kjer se nahaja stavba FRI, naprave redno beležijo več kot dvajset tisoč vhodno-izhodnih signalov. V raziskavi sem za analizo izbrala temperaturo in prisotnost v prostoru, porabo energije ter kvaliteto zraka. Skupno sem zbrala več kot tri milijone meritev, ki sem jih nato zaokrožila na 15-minutne intervale. V podatkih sem identificirala ponavljajoče se vzorce s pomočjo vizualizacij, frekvenc pojavitev posameznih kombinacij ter gručenja. Izkaže se, da so vzorci uporabe različni glede na tip sobe. Leta 2018 smo v določenih sobah namestili senzorje kvalitete zraka, katerih namen je bil izboljšanje parametrov delovnega okolja. Z namestitvijo senzorja, ki je kvaliteto zraka sporočal z barvo lučke, se je pogostost zračenja v prostorih dvignila. Intervencija se je izkazala za uspešno, pri čemer pa mora le-ta biti zasnovana ljudem prijazno. Odzivnost senzorja je bila namreč slaba, kar je izzvalo frustracije uporabnikov, ki so senzorju sčasoma preprosto nehali zaupati. 
Odzive posameznikov na senzor sem identificirala z etnografskim pristopom. V prvih nekaj tednih se je večina uporabnikov na opozorilo o slabi kvaliteti zraka odzvala resno in v hipu reagirala (npr. z odpiranjem okna). Ker senzor dolgo časa ni nagradil uporabnika s povratno informacijo o izboljšanju kvalitete zraka, je zaupanje do naprave padlo, naraščale pa so frustracije. V začetku se je večina odzivala šaljivo, nato jezno, nazadnje pa so senzor preprosto začeli ignorirati. Iz tega sem ugotovila, da morajo biti tehnološke rešitve zasnovane tako, da imajo preprosta navodila ter ustrezajo pričakovanjem uporabnikov, sicer jih ti hitreje prenehajo uporabljati. Izkazalo se je, da je tovrstne tehnološke izdelke, ki vplivajo na vedenje in navade, treba oblikovati v sodelovanju z uporabniki, torej po načelu sodelovalnega oblikovanja. Kot nadgradnjo analize podatkov sem zasnovala sistem za analizo transkriptov intervjujev. Struktura intervjuja je določena z vprašanji in odgovori, kar je treba upoštevati pri analizi. Zato sem preizkusila šest načinov segmentacije intervjujev, kjer segmentacija upošteva sklope vprašanje-odgovor, hkrati pa poskuša besedilo razdeliti na tematsko enotne dele. Segmentirane intervjuje sem nato gručila s hierarhičnim razvrščanjem v skupine, ki je uspešno združilo vprašanja po tematskih sklopih. Na ta način sem enostavneje primerjala odgovore sogovornikov. Rezultate analize intervjujev sem ponovno dopolnila z etnografijo. Glavno sporočilo uporabnikov t. i. pametne stavbe je, da so sicer tehnološke rešitve zadovoljiv pripomoček, je pa uporabnikom potrebno omogočiti, da sami upravljajo z okoljem, ko to želijo. Hkrati je take stavbe že izhodiščno smiselno zasnovati glede na specifične potrebe uporabnikov, saj so kasnejše prilagoditve dražje ali celo neizvedljive. In če uporabniki menijo, da rešitev zanje ni ustrezna, hkrati pa je pomembna za njihovo delo ali dobro počutje, jo bodo s kreativnimi rešitvami prilagodili. 
Z analizo senzorskih podatkov v disertaciji pokažem, kako lahko kvantitativni pristopi, specifično strojno učenje, podatkovno rudarjenje in rudarjenje besedil, uspešno dopolnijo kvalitativne pristope, denimo opazovanje z udeležbo in intervjuje. Preplet metodologij pa je možno uporabiti tudi v drugih raziskovalnih kontekstih. Modele strojnega učenja lahko uporabimo za avtomatsko označevanje arhivskih slik, kot pokažem na primeru kozolcev. Mešane metode pa uspešno delujejo tudi v gospodarstvu, kar je razvidno iz primera razvoja uporabniškega vmesnika za medijski portal, kjer smo v sodelovanju z uporabniki ter s kombinacijo strojnega učenja in fokusnih skupin oblikovali izdelek, ki upošteva potrebe uporabnikov in želje naročnikov. Kot pokažem v disertaciji, kvantitativni pristopi v antropologiji ne zavračajo klasičnih etnografskih pristopov, temveč jih obogatijo in opolnomočijo. Računske tehnike so zelo primerne za obdelavo velikih količin podatkov, preučevanje longitudinalnih vzorcev in sočasnih pojavov ter preliminarne raziskave terena oziroma oblikovanje raziskovalnih vprašanj. Z etnografijo razkrite vzorce razložimo, osmislimo, postavimo v družbeni kontekst in jih dopolnimo s podrobnimi in bogatimi opisi. Glavna prednost krožnih mešanih metod pa je, da z njimi ustvarimo povratno raziskovalno zanko, ki omogoči novo perspektivo za posamezen vir podatkov in s tem zagotovi celostnejši vpogled v problem, ki ga raziskujemo.

Language:Slovenian
Keywords:računska antropologija, krožne mešane metode, znanost o podatkih, podatkovna etika, nadzor, pametne stavbe, metodologija, tehnologija
Work type:Dissertation
Typology:2.08 - Doctoral Dissertation
Organization:FF - Faculty of Arts
Place of publishing:Ljubljana
Publisher:A. Pretnar
Year:2021
Number of pages:157
PID:20.500.12556/RUL-167599
UDC:39:004.8(043.3)
COBISS.SI-ID:64481027
Publication date in RUL:03.03.2025

Secondary language

Language:English
Title:Methodological approaches and practical aspects of data mining in anthropology
Abstract:
The growing popularity of big data analysis in recent years is also evident in anthropology, most clearly in the emergence of a new field called computational anthropology. Even though computers have been used in anthropology since at least the mid-20th century, and even though computational social science is a well-established field, few anthropological studies use computational approaches for data analysis. The first to point out the usefulness of computers as research tools for anthropology were Lévi-Strauss, who wanted to organize the basic units of myths efficiently, and Edmund Leach, who was looking for general structures of societies. In Slovenia, Helena Ložar-Podlogar used a punch-card system as early as 1970 to record material on marriage rituals, and in 1991 Jurij Fikfak and his co-workers designed a digitized information system for Göth's topography. Nevertheless, most studies remained predominantly qualitative, with quantitative data treated only as secondary material. The second decade of the 21st century has also brought methodological and epistemological progress in the inclusion of computational analyses in anthropological research. The integration of qualitative and quantitative approaches stems from the tradition of mixed methods, where specific approaches include collaborative descriptions of visualizations (so-called "ethno-mining"), calibration of analytical parameters and interpretation of findings with ethnography (so-called "stitching"), designing a new conceptual field from both methods (so-called "blending"), and explicit interdisciplinary methodological experimentation (so-called "hybrid methodology"). In this dissertation, I propose a novel methodology called circular mixed methods, which draws on both approaches simultaneously and involves repeated reformulation of hypotheses and findings based on data analysis and ethnographic narratives.
Using the analysis of workspace sensor data as a case study, I show the advantages and shortcomings of circular mixed methods, explore the possibilities of intertwining ethnography with data mining, and propose a methodological framework for applying computational techniques in anthropology. Circular mixed methods stem from the tradition of interdisciplinary research in anthropology, specifically from general mixed methods. Whereas general mixed methods combine qualitative and quantitative approaches sequentially or in parallel, the focus in circular mixed methods is on circularity. Circularity means moving continuously between the two approaches, where the impetus for changing the technique comes from a problem in the data. Once one technique can no longer extract new information from the data, another is used, which sheds new light on the problem. The circular mixed methods are demonstrated on the case of the building of the Faculty of Computer and Information Science (FCIS) of the University of Ljubljana. Devices in the complex where the FCIS building is located regularly record over 20,000 input-output signals, from which I chose room temperature and occupancy, energy use, and air quality for the analysis. There were over three million data points in total, which I aggregated to 15-minute intervals. Repetitive patterns in the data were identified with visualizations, frequency counts of individual combinations, and clustering. Usage patterns turn out to differ by room type. In 2018, we installed air quality sensors in certain rooms, with the aim of improving workspace parameters. After the sensor, which signalled air quality with the color of its light, was installed, windows were opened more often. The intervention proved useful, but such an intervention has to be designed in a user-friendly way. The sensor was not very responsive, which frustrated the users, who stopped trusting it as time went by. I used ethnography to identify people's reactions to the air quality sensor.
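The aggregation of sensor readings to 15-minute intervals mentioned above could be sketched roughly as follows. This is only an illustration, not the procedure used in the dissertation: the abstract does not say how readings within an interval were combined, so averaging is an assumption here, and the function names and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def floor_to_15_min(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate_readings(readings):
    """Average (timestamp, value) readings within each 15-minute interval.

    Averaging is an assumption; a sum, median, or last value may suit
    other signals (e.g. occupancy) better.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[floor_to_15_min(ts)].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical room-temperature readings, not data from the study.
readings = [
    (datetime(2018, 3, 5, 9, 1), 21.0),
    (datetime(2018, 3, 5, 9, 14), 22.0),
    (datetime(2018, 3, 5, 9, 16), 23.0),
]
aggregated = aggregate_readings(readings)
```

The first two readings fall into the interval starting at 09:00 and the third into the one starting at 09:15, so the result has one averaged value per interval.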
In the first few weeks most users took the warning of poor air quality seriously and responded instantly (e.g. by opening a window). When the sensor failed for a long time to reward users with feedback that the air quality had improved, their trust in the device fell and their frustration grew. At first most users joked about the sensor, then grew angry, and finally simply ignored it. Technological solutions have to be designed so that they are easy to use and fit the users' expectations, otherwise users will stop using them sooner. The easiest way to develop such a product is by cooperating with the end users and following the principles of co-creation. To enhance the data analysis, I developed a system for interview transcript segmentation. The structure of an interview is determined by questions and answers, which has to be considered when analysing the text. I therefore tested six algorithms for interview segmentation, in which question-answer pairs are taken into account when determining the thematic structure of the text. It turns out that none of these approaches works well. The best-performing one is a simple algorithm that joins question-answer pairs if the second question contains fewer than five words or does not end with a question mark. The segmented interviews were then clustered with hierarchical clustering, which successfully joined segments with a similar topic. This approach made it easier to compare the answers of different research participants. The results of the interview analysis were once again supplemented with ethnography. The main takeaway is that technological solutions are a welcome tool, but they have to give the user some level of control. At the same time, "smart" buildings should be designed according to end users' requirements from the outset, as subsequent modifications are costly or even impossible. And when users feel that a solution is not appropriate, yet is vital for their work and well-being, they will hack and modify it to suit their needs.
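One possible reading of the simple segmentation heuristic described above is sketched below. It is an interpretation under stated assumptions rather than the dissertation's implementation: the function name, the whitespace-based word count, and the sample interview are all hypothetical.

```python
def segment_interview(qa_pairs, min_words=5):
    """Group (question, answer) pairs into thematic segments.

    A pair is merged into the previous segment when its question looks like
    a follow-up: it has fewer than `min_words` words or does not end with a
    question mark. Words are counted by splitting on whitespace (an assumption).
    """
    segments = []
    for question, answer in qa_pairs:
        text = question.strip()
        is_follow_up = len(text.split()) < min_words or not text.endswith("?")
        if segments and is_follow_up:
            segments[-1].append((question, answer))
        else:
            segments.append([(question, answer)])
    return segments


# Hypothetical interview excerpt, not taken from the study.
qa_pairs = [
    ("How do you feel about the air quality in your office?", "It gets stuffy."),
    ("Why is that?", "The windows are hard to reach."),
    ("I see, tell me more about the windows.", "We open them during breaks."),
    ("What do you think about the sensor on your desk?", "I stopped looking at it."),
]
segments = segment_interview(qa_pairs)
```

Here the short question and the prompt without a question mark are attached to the first segment, while the last full question starts a new one. The resulting segments could then be passed to a clustering step of the kind the abstract mentions.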
With the sensor data analysis I show how quantitative approaches, specifically machine learning, data mining and text mining, successfully complement qualitative approaches such as participant observation and interviews. Such mixing of methodologies can be just as fruitful in other research contexts. Predictive models can be used for automatic labelling of archive images, as I show on the case of hayracks. Mixed methods also work well in industry; I present the development of a user interface for a media portal, where the interface was designed in cooperation with the end users. By combining machine learning and focus groups, the team was able to design a final product that fit the needs of the users and the requirements of the client. Quantitative approaches in anthropology do not reject standard ethnographic approaches; they enrich and empower them. Computational techniques are particularly well suited to the analysis of large data sets, the study of longitudinal and simultaneous phenomena, and preliminary field research and the generation of research questions. Ethnography, in turn, explains and makes sense of the discovered patterns, places the results in their social and cultural context, and supplements the findings with rich descriptions. The main advantage of circular mixed methods is that they create a recurrent research loop, which provides an additional perspective on each data source and thus a more holistic insight into the studied phenomenon.

Keywords:computational anthropology, circular mixed methods, data science, data ethics, surveillance, smart building, methodology, technology
