Survey of Data Mining for Mechatronic Systems
Research output: Thesis › Diploma Thesis
2014. 86 p.
RIS (suitable for import to EndNote)
TY - THES
T1 - Survey of Data Mining for Mechatronic Systems
AU - Xu, Tian
PY - 2014
Y1 - 2014
N2 - Data mining is the process of applying various algorithms to transform an original data set, which may be affected by noise and missing values, into a form that humans can analyse more easily in order to extract information from it. This thesis gives an overview of the process and a brief introduction to commonly used algorithms. Among these, symbolisation methods offer particular advantages for data mining. They allow convenient visualisation for humans as well as automated search with symbolic queries, for example for repetitive pattern identification and discord detection. In particular, the Symbolic Aggregate Approximation (SAX) method allows efficient dimensionality reduction and indexing with a positive semi-definite distance measure. After the overview, the thesis focuses on mining a real data set recorded on a production machine. Twenty sensors delivered values over more than a year, resulting in approximately one billion measurements. For two exemplary sensors, the application of several algorithms is demonstrated, including preprocessing, k-means clustering, symbolisation, and dimensionality reduction. At the end of the data processing, relations between events in the data streams can readily be found with the help of token tables, and symbolic search for repetitive patterns becomes possible.
KW - Data mining
KW - time series
KW - classification
KW - clustering
KW - SAX
KW - symbolic queries
KW - lexical analysis
KW - k-means
M3 - Diploma Thesis
ER -