The study of genealogical trees of Bengal cats using machine learning methods to identify hereditary diseases

The aim of this work is to use data mining methods to estimate the probability of hereditary disease transmission, based on a database of Bengal cats, a breed with a high predisposition to hypertrophic cardiomyopathy (HCM). The Neo4j graph database management system was chosen for this task. A Python program was implemented to collect information from the web version of the cat database and to process the data obtained. Several approaches were used to determine an individual's HCM status from its pedigree. The analysis used three models: random forest, logistic regression, and a multilayer perceptron. Experiments showed that the most effective approach to this problem is relationship (link) prediction, and that the most effective of the candidate models is the random forest. In practical terms, the work examined methods for solving machine learning problems on graph-structured data.
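As a rough illustration of how a pedigree can be turned into input for a classifier such as a random forest, the sketch below walks up a toy pedigree and counts affected ancestors per generation. It is not the authors' implementation: the cat names, the statuses, the `PARENTS` and `HCM_STATUS` dictionaries, and both function names are hypothetical, and the feature design is an assumption made only for this example.

```python
from collections import deque

# Hypothetical toy pedigree: child -> (sire, dam). None marks an unknown parent.
PARENTS = {
    "kit": ("tom", "queen"),
    "tom": ("grandsire", None),
    "queen": (None, "granddam"),
    "grandsire": (None, None),
    "granddam": (None, None),
}

# Hypothetical known HCM statuses (True = affected).
HCM_STATUS = {"tom": False, "queen": True, "grandsire": True, "granddam": False}


def ancestors_by_generation(cat, parents, max_depth=3):
    """Breadth-first walk up the pedigree; returns {generation: set of ancestors}."""
    generations = {}
    queue = deque([(cat, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not look past the requested number of generations
        for parent in parents.get(node, (None, None)):
            if parent is not None:
                generations.setdefault(depth + 1, set()).add(parent)
                queue.append((parent, depth + 1))
    return generations


def pedigree_features(cat, parents, status, max_depth=3):
    """For each generation: [affected ancestors, ancestors with a known status]."""
    gens = ancestors_by_generation(cat, parents, max_depth)
    features = []
    for g in range(1, max_depth + 1):
        members = gens.get(g, set())
        known = [a for a in members if a in status]
        affected = sum(1 for a in known if status[a])
        features.extend([affected, len(known)])
    return features


features = pedigree_features("kit", PARENTS, HCM_STATUS, max_depth=2)
print(features)
```

A fixed-length vector like this could be passed to any tabular classifier. In the graph setting described in the abstract, the equivalent traversal would more likely be expressed as a query against the Neo4j database.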

Authors: N. A. Fomchenkova, Ya. A. Bekeneva

Direction: Informatics, Computer Technologies And Control

Keywords: pedigree, Bengal cat, HCM, graph, data mining, database

