- © Tom Wu (Github)
A manufacturing process must be in control before its capability can be assessed. Statistical process control (SPC) charts are now used by organizations around the world as one of the primary tools for monitoring and improving process control. The SPC chart was invented by Walter A. Shewhart at Bell Labs in the 1920s. It is used to study how a process changes over time by plotting data points together with upper and lower control limits and a center line. A process is in statistical control when only common-cause variation exists and its statistical properties do not vary over time.
Image source: Wikimedia Commons
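As a point of reference, the control limits and center line of a Shewhart individuals chart can be sketched in a few lines. This is a minimal illustration, not code from this repository: the function names and the sample readings are hypothetical, and sigma is estimated from the average moving range divided by the standard constant d2 = 1.128.

```python
import numpy as np

def individuals_chart_limits(x):
    """Return (lcl, center, ucl) for a Shewhart individuals chart."""
    x = np.asarray(x, dtype=float)
    center = x.mean()
    # Average moving range between consecutive observations.
    mr_bar = np.abs(np.diff(x)).mean()
    # d2 = 1.128 is the standard constant for moving ranges of size 2.
    sigma_hat = mr_bar / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

def out_of_control(x, lcl, ucl):
    """Indices of points that fall outside the control limits."""
    x = np.asarray(x, dtype=float)
    return np.flatnonzero((x < lcl) | (x > ucl))

# Hypothetical in-control baseline readings used to set the limits.
baseline = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0]
lcl, center, ucl = individuals_chart_limits(baseline)

# Hypothetical new readings checked against those limits.
new_points = [10.1, 9.9, 11.5, 10.0]
flagged = out_of_control(new_points, lcl, ucl)
```

Limits like these assume a stable mean, which is why a drifting process needs a different approach.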
In this work, we monitor the trend of the data over time to ensure that trends remain consistent. Because the data has a trend over time, we cannot use the SPC chart to objectively assess process capability. Instead, we use dissimilarity analytics to monitor the trend. Dissimilarity analytics applies to time-series data and is a robust technique for comparing two or more time series.
Our algorithm builds on three main machine learning techniques: Dynamic Time Warping, Multidimensional Scaling, and K-means. The overall structure of the algorithm is shown in the following figure.
[1] Dynamic Time Warping, DTW [ Studying Note ]
[2] Multidimensional Scaling, MDS [ Studying Note ][ Sample Code ]
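To make the three stages concrete, here is a rough sketch of how DTW, MDS, and K-means can be chained: pairwise DTW distances between series, a low-dimensional embedding of that distance matrix, then clustering of the embedded points. It is an illustration under assumptions, not this repository's implementation. The DTW is a plain textbook version, classical (Torgerson) MDS stands in for whichever MDS variant the project uses, the k-means is a minimal Lloyd's loop with a deterministic farthest-point start, and the rising and flat series are synthetic.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with absolute-difference cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of series."""
    k = len(series)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    return dist

def classical_mds(dist, n_components=2):
    """Torgerson MDS: double-center squared distances, keep top eigenvectors."""
    k = dist.shape[0]
    centering = np.eye(k) - np.ones((k, k)) / k
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues (DTW is not a true metric).
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

def kmeans(points, n_clusters=2, n_iter=50):
    """Minimal Lloyd's algorithm with farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-ins: four rising windows and four flat windows.
t = np.linspace(0.0, 1.0, 40)
rng = np.random.default_rng(0)
rising = [5.0 * t + rng.normal(0, 0.1, t.size) for _ in range(4)]
flat = [2.5 + rng.normal(0, 0.1, t.size) for _ in range(4)]
series = rising + flat

dist = pairwise_dtw(series)      # stage 1: dissimilarity matrix
coords = classical_mds(dist)     # stage 2: 2-D embedding
labels = kmeans(coords)          # stage 3: cluster the embedded windows
```

In this toy setup the rising and flat windows end up in different clusters, the same kind of separation used to tell a normal trend from an abnormal one. For real workloads, a tested library such as the dtw-python package listed under Programming is a better choice than the hand-written loop.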
- Sample Dataset
The sample dataset in this work comes from a real panel process and was collected by an Electronic Data Capture (EDC) system. It contains 95,419 records of the test item between 1 February 2021 and 19 January 2022, during which parts were replaced 11 times.
- Normal Pattern
The overall pattern of the sample dataset is shown in the following figure. First, the normal pattern must be defined. We define it as the data recorded from 2021/2/12 01:54 to 2021/2/19 15:17.
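Selecting the normal window amounts to a timestamp filter. The sketch below shows one way to do it with NumPy. The hourly timestamps and the `values` array are hypothetical placeholders for the EDC export; only the two window boundaries come from the definition above.

```python
import numpy as np

# Hypothetical stand-in for the EDC export: hourly readings for February 2021.
timestamps = np.datetime64("2021-02-01T00:00") + np.arange(28 * 24) * np.timedelta64(60, "m")
values = np.linspace(0.0, 1.0, timestamps.size)

# Boundaries of the normal pattern as defined above.
normal_start = np.datetime64("2021-02-12T01:54")
normal_end = np.datetime64("2021-02-19T15:17")

# Boolean mask keeps only the readings inside the normal window.
in_normal = (timestamps >= normal_start) & (timestamps <= normal_end)
normal_times = timestamps[in_normal]
normal_values = values[in_normal]
```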
- Test Data
Now we detect the pattern of each test dataset (Test 1~7 in the above figure) by comparing it with the defined normal data.
- Model Performance (offline)
In the above figure, the trends of Test 6 and Test 7 look noticeably flat, and the algorithm confirms this impression. The model effectively distinguishes differences in trend, and its performance is stable. The detection results for each test dataset are shown in the following figure and GIF.
[1] Similarity Measures and Dimensionality Reduction Techniques for Time Series Data Mining, Carmelo Cassisi, Placido Montalto, Marco Aliotta, Andrea Cannata and Alfredo Pulvirenti, September 2012. [ Download Link ]
[2] Identification of Out-of-Trend Stability Results, PhRMA CMC Statistics and Stability Expert Teams, April 2003. [ Download Link ]
[3] Methods for Identifying Out-of-Trend Data in Analysis of Stability Measurements–Part I: Regression Control Chart, Máté Mihalovits and Sándor Kemény, November 2, 2017. [ Download Link ] [ Reading Note ]
[4] Methods for Identifying Out-of-Trend Data in Analysis of Stability Measurements—Part II: By-Time-Point and Multivariate Control Chart, Máté Mihalovits and Sándor Kemény, December 2017. [ Download Link ]
[5] Learning Confidence for Out-of-Distribution Detection in Neural Networks, Terrance DeVries and Graham W. Taylor, February 2018. [ Download Link ]
Articles
[1] Time Series Similarity Using Dynamic Time Warping -Explained, Abhishek Mishra, December 2020.
[2] Timeseries Classification: KNN & DTW, Mark Regan, October 2018.
Programming
[1] dtw-python: Dynamic Time Warping in Python
[2] Python Data Science Handbook
Please cite this repository (Trend Detection) if you use it.