To classify time series, we fit a decision tree and then evaluate how well it classifies the series.
We use the Synthetic Control Chart Time Series. This dataset contains 600 examples of control charts synthetically generated by the process in Alcock and Manolopoulos (1999).
data <- read.table("C:/07 - R Website/dataset/TS/synthetic_control.txt", header = FALSE)

# Data Preparation: one label per series, 100 series per pattern.
# The labels are stored as a factor because ctree() needs a factor response
# (since R 4.0, data.frame() no longer converts strings to factors).
pattern100 <- factor(c(rep('Normal', 100), rep('Cyclic', 100),
                       rep('Increasing trend', 100), rep('Decreasing trend', 100),
                       rep('Upward shift', 100), rep('Downward shift', 100)))

# Create data frame
newdata <- data.frame(data, pattern100)

# Classification with Decision Tree
library(party)
tree <- ctree(pattern100 ~ ., newdata)

# Classification Performance
# confusion matrix
tab <- table(Predicted = predict(tree, newdata), Actual = newdata$pattern100)
tab
                  Actual
Predicted          Cyclic Decreasing trend Downward shift Increasing trend Normal Upward shift
  Cyclic               97                0              3                0      0            0
  Decreasing trend      0               99              8                0      0            0
  Downward shift        0                1             89                0      0            0
  Increasing trend      2                0              0               96      0            6
  Normal                1                0              0                0    100            4
  Upward shift          0                0              0                4      0           90
sum(diag(tab))/sum(tab) # accuracy
From the result of the tree model (not shown here) we have 25 terminal nodes and 49 branches. The confusion matrix above covers the six patterns: the main diagonal holds the correctly classified observations, and the off-diagonal cells give the number of misclassified observations. The accuracy is 95.17% (571 of the 600 observations lie on the diagonal). Because the predictions are made on the same data used to fit the tree, this figure is likely optimistic. The largest single error is the 8 Downward shift observations that are predicted as Decreasing trend. Downward shift is also the most confused pattern overall, with 11 of its 100 observations misclassified (3 as Cyclic and 8 as Decreasing trend). In contrast, all 100 Normal observations are classified correctly.
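The reading of the confusion matrix can be checked with a few lines of base R. The sketch below is only an illustration: it re-enters the counts printed above by hand (so it runs without the data file or the party package), and the names classes, recall, precision, off and worst are introduced here and are not part of the original script.

```r
# Confusion matrix copied from the printed output (rows = predicted, columns = actual)
classes <- c("Cyclic", "Decreasing trend", "Downward shift",
             "Increasing trend", "Normal", "Upward shift")
tab <- matrix(c(97,  0,  3,  0,   0,  0,
                 0, 99,  8,  0,   0,  0,
                 0,  1, 89,  0,   0,  0,
                 2,  0,  0, 96,   0,  6,
                 1,  0,  0,  0, 100,  4,
                 0,  0,  0,  4,   0, 90),
              nrow = 6, byrow = TRUE,
              dimnames = list(Predicted = classes, Actual = classes))

# Overall accuracy: correctly classified observations over all observations
accuracy <- sum(diag(tab)) / sum(tab)   # 571 / 600

# Recall: share of each actual pattern that is predicted correctly
recall <- diag(tab) / colSums(tab)

# Precision: share of each predicted pattern that really is that pattern
precision <- diag(tab) / rowSums(tab)

# Largest single confusion: the biggest off-diagonal cell
off <- tab
diag(off) <- 0
worst <- which(off == max(off), arr.ind = TRUE)

round(accuracy, 4)
round(cbind(recall, precision), 3)
data.frame(predicted = classes[worst[1, 1]],
           actual    = classes[worst[1, 2]],
           count     = max(off))
```

Recall per pattern shows which actual patterns the tree misses most often, while precision shows how trustworthy each predicted label is. For example, every Normal series is recovered, yet a few series from other patterns are also labelled Normal, so its precision is below 1.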