Large enterprises rely heavily on IT infrastructure and must respond to and remediate problems and faults quickly, which requires identifying their underlying root causes. To automate this process, enterprises rely on root cause analysis. One of its components is a component discovery module, which also provides information about the dependencies between IT components. In this thesis, we focus on automatically building an IT component dependency graph from granular configuration data. We analyze the configuration data to first infer the dependencies between hosts and then find the dependencies between IT components. Furthermore, we use a supervised machine learning algorithm to assign each dependency a likelihood that it exists. We show that our approach is much faster and more accurate than the naive approach, which compares configuration parameters to each other. Moreover, we provide an extensive evaluation on a real dataset that takes into account the transitive property of dependencies and specific properties of root cause analysis. The evaluation results show that our proposed algorithm reaches 90% recall and 100% precision in discovering dependencies between generic IT components.