Events at a production facility that are recorded by monitoring devices may be unrelated to one another and may be generated by different types of entities. In this case, their attribute sets may not fully coincide, apart from the attributes common to all recorded events. As a result, large data sets describing a variety of processes often consist of records that differ significantly in the structure and composition of their attributes. When such data are analyzed in aggregate with data mining methods, choosing a method suitable for the whole data set can be problematic. In addition, the authors suggest that different analysis methods can be used for different groups of data. In this paper, we propose a method for splitting data sets into groups depending on the composition of their attributes.
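To make the idea of grouping by attribute composition concrete, the following is a minimal Python sketch. It is not the method proposed in the paper, which the abstract does not spell out. It only illustrates the simplest possible variant: records with exactly the same set of attribute names fall into the same group. The function names, the sample events, and the rule that a `None` value counts as a missing attribute are all assumptions made for illustration.

```python
from collections import defaultdict


def group_by_attribute_set(records):
    """Group records (dicts) by the set of attribute names they carry.

    Assumption: an attribute whose value is None is treated as absent.
    """
    groups = defaultdict(list)
    for record in records:
        signature = frozenset(k for k, v in record.items() if v is not None)
        groups[signature].append(record)
    return dict(groups)


def common_attributes(groups):
    """Return the attributes shared by every group signature."""
    signatures = list(groups)
    if not signatures:
        return frozenset()
    return frozenset.intersection(*signatures)


# Hypothetical events from two kinds of monitoring devices.
events = [
    {"timestamp": 1, "device": "gate-1", "badge_id": "A17"},
    {"timestamp": 2, "device": "oven-3", "temperature": 182.5},
    {"timestamp": 3, "device": "gate-2", "badge_id": "B04"},
    {"timestamp": 4, "device": "oven-3", "temperature": 185.0, "badge_id": None},
]

groups = group_by_attribute_set(events)
shared = common_attributes(groups)

for signature, members in groups.items():
    print(sorted(signature), len(members))
print("common:", sorted(shared))
```

In this sketch the two gate events and the two oven events end up in separate groups, while `timestamp` and `device` are reported as the attributes common to all events. Each group could then be passed to an analysis method suited to its attributes.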

Authors: Ya. A. Bekeneva, S. I. Lebedev, I. I. Kholod, E. S. Novikova

Direction: Informatics and Computer Technologies

Keywords: Data grouping, data attributes, data from heterogeneous sources, decision tree, data mining
