Time Series Classification

To classify time series, we fit a decision tree and then evaluate its classification performance.

We use the Synthetic Control Chart Time Series data set. It contains 600 control charts, 100 for each of six pattern classes, generated synthetically by the process described in Alcock and Manolopoulos (1999).

data <- read.table("C:/07 - R Website/dataset/TS/synthetic_control.txt", header = FALSE)

# Data preparation: class labels, assuming the rows are ordered by class (100 consecutive series per pattern)
pattern100 <- c(rep('Normal', 100),
                rep('Cyclic', 100),
                rep('Increasing trend', 100),
                rep('Decreasing trend', 100),
                rep('Upward shift', 100),
                rep('Downward shift', 100))

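# Create data frame; the class label is stored as a factor, which ctree() needs
# for classification (since R 4.0, data.frame() no longer converts strings to factors)
newdata <- data.frame(data, pattern100 = factor(pattern100))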

# Classification with Decision Tree
library(party)
tree <- ctree(pattern100 ~ ., data = newdata)

# Classification Performance
tab <- table(Predicted = predict(tree, newdata), Actual = newdata$pattern100) # confusion matrix
tab

                  Actual
Predicted          Cyclic Decreasing trend Downward shift Increasing trend Normal Upward shift
  Cyclic               97                0              3                0      0            0
  Decreasing trend      0               99              8                0      0            0
  Downward shift        0                1             89                0      0            0
  Increasing trend      2                0              0               96      0            6
  Normal                1                0              0                0    100            4
  Upward shift          0                0              0                4      0           90

sum(diag(tab))/sum(tab) # accuracy

[1] 0.9516667
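
The overall accuracy hides how the errors are spread across classes. As a minimal sketch, the confusion matrix printed above can be transcribed by hand and broken down per class. The names `recall`, `precision` and `off` are illustrative and not part of the original analysis; with the real objects in memory, `tab` from the code above could be used directly instead of the transcribed matrix.

```r
# Confusion matrix transcribed from the output above
# (rows = predicted class, columns = actual class)
classes <- c("Cyclic", "Decreasing trend", "Downward shift",
             "Increasing trend", "Normal", "Upward shift")
tab <- matrix(c(97,  0,  3,  0,   0,  0,
                 0, 99,  8,  0,   0,  0,
                 0,  1, 89,  0,   0,  0,
                 2,  0,  0, 96,   0,  6,
                 1,  0,  0,  0, 100,  4,
                 0,  0,  0,  4,   0, 90),
              nrow = 6, byrow = TRUE,
              dimnames = list(Predicted = classes, Actual = classes))

# Overall accuracy: correctly classified observations / all observations
accuracy <- sum(diag(tab)) / sum(tab)

# Recall: share of each actual class that is predicted correctly
recall <- diag(tab) / colSums(tab)

# Precision: share of each predicted class that is actually that class
precision <- diag(tab) / rowSums(tab)

# Largest single confusion: biggest off-diagonal cell
off <- tab
diag(off) <- 0
worst <- which(off == max(off), arr.ind = TRUE)

round(cbind(recall, precision), 3)
classes[worst[, "row"]]  # predicted class of the largest confusion
classes[worst[, "col"]]  # actual class of the largest confusion
```

Recall answers "how many series of this pattern were found", while precision answers "how many series assigned to this pattern really belong to it"; a class can be perfect on one and not on the other.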

The fitted tree (not shown here) has 25 terminal nodes out of 49 nodes in total. The confusion matrix above covers the six pattern classes: the main diagonal holds the number of correctly classified observations, and the off-diagonal values give the number of misclassified observations. The accuracy is 95.17%. Note that it is measured on the same data used to fit the tree, so it is likely optimistic.
The largest single confusion is between Downward shift and Decreasing trend: 8 Downward shift observations are predicted as Decreasing trend. Downward shift is also the class with the most misclassified observations (11 of 100), followed by Upward shift (10 of 100). By contrast, all 100 Normal observations are correctly classified, although 5 observations from other classes are wrongly predicted as Normal.
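
The accuracy reported above is computed on the same 600 series that were used to fit the tree. A hold-out evaluation usually gives a fairer estimate. The following is only a sketch under stated assumptions: `holdout_accuracy` is a hypothetical helper that is not part of the original analysis, and the 70/30 stratified split and the seed are arbitrary choices. It assumes the party package is installed.

```r
library(party)  # provides ctree(); assumed to be installed

# Hypothetical helper (not from the original analysis): fit a ctree on a
# stratified random subset and measure accuracy on the held-out rows.
holdout_accuracy <- function(df, response, train_frac = 0.7, seed = 1) {
  set.seed(seed)
  # ctree() needs a factor response for classification
  df[[response]] <- factor(df[[response]])

  # Stratified split: sample train_frac of the row indices within each class
  idx_by_class <- split(seq_len(nrow(df)), df[[response]])
  train_idx <- unlist(lapply(idx_by_class, function(i) {
    i[sample.int(length(i), size = floor(train_frac * length(i)))]
  }), use.names = FALSE)

  train <- df[train_idx, , drop = FALSE]
  test  <- df[-train_idx, , drop = FALSE]

  # Fit on the training rows only, then predict the held-out rows
  fit  <- ctree(as.formula(paste(response, "~ .")), data = train)
  pred <- predict(fit, newdata = test)

  tab <- table(Predicted = pred, Actual = test[[response]])
  list(confusion = tab, accuracy = sum(diag(tab)) / sum(tab))
}

# Usage, assuming newdata has been built as shown earlier:
# res <- holdout_accuracy(newdata, "pattern100")
# res$confusion
# res$accuracy
```

The split is stratified so that each of the six patterns keeps the same share of series in the training and test sets. The resulting accuracy depends on the seed, so repeating the split with several seeds gives a better picture than a single run.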