Siren Platform User Guide

Models

Siren ML provides two types of models: Anomaly Detection and Future Prediction.

Anomaly detection

Anomaly detection models use unsupervised learning to automatically detect anomalies in a single-metric numerical time series. To train an anomaly detection model, you first need to select training data. This data is used during the model training to learn typical behaviors and seasonal patterns in your data.

Once trained, the anomaly detection model can be activated to perform either live or historical detections.

Live detection

During live detection, the anomaly detection model runs in real time to alert you to any unusual events in your data so that you can take timely and appropriate action.

Historical detection

Historical detection allows you to run the anomaly detection model on your existing data, which is useful for gaining insight into the past behavior of the data and highlighting unusual events that you may have missed.

Future prediction

Future prediction models predict future trends in a single-metric numerical time series. This type of model is particularly useful for supporting decision-making in planning and resource management tasks, as the predictive model can learn complex trends that are not always obvious to the decision-maker.

As with the anomaly detection model, the future prediction model requires training data to learn the behavior and patterns within your data prior to activation. When training is complete, the future prediction model can be activated to perform live predictions, where the machine learning model runs in real time to predict behavior at a user-specified time into the future.

These real-time future predictions can be viewed in the Machine Learning Explorer visualization. In this visualization, the future predictions are indicated with a red line and are accompanied by a blue shaded area that shows the confidence of the model’s prediction. The narrower this shaded region, the more confident the model is in its prediction.

Model training

Training of a new machine learning model consists of two phases: hyperparameter optimization and full model training.

Hyperparameter optimization finds the best model architecture and training parameters so that the most accurate model is developed. This is an important step as different datasets require different model architectures (such as the number of hidden layers in a neural network) and training parameters to attain the best results. The best model architecture and training parameters determined during the hyperparameter optimization are then used for the full model training.

This full model training iterates through your data multiple times to learn its behaviors and patterns. When training a model, the data is split into a training set, a validation set, and a test set.

The training set is used to update the model parameters; it is effectively what the machine learning model learns from.

The validation set is used to make sure the model is not overfitting the training data. When a model overfits the training data, it effectively memorizes the data instead of learning general patterns that are useful when handling new data. By occasionally assessing the performance of the machine learning model on a validation set during training, you can ensure that the model training is progressing as expected.

When training is complete, the test set is used to calculate the accuracy of the model on data it has not seen before; this is indicative of how well the model is expected to perform during live detection or prediction.
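
As a rough illustration, the following sketch shows one way a single-metric time series could be split chronologically into the three sets. The 80/10/10 proportions are an assumption for the example, not necessarily the ratios that Siren ML uses internally.

    # Illustrative chronological split of a single-metric time series into
    # training, validation, and test sets. The 80/10/10 proportions are an
    # assumption, not the exact split used by Siren ML.
    def split_time_series(values, train_frac=0.8, val_frac=0.1):
        n = len(values)
        train_end = int(n * train_frac)
        val_end = int(n * (train_frac + val_frac))
        train = values[:train_end]              # used to update the model parameters
        validation = values[train_end:val_end]  # used to check for overfitting
        test = values[val_end:]                 # used to estimate accuracy on unseen data
        return train, validation, test

    train, validation, test = split_time_series(list(range(1000)))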

During training, the machine learning algorithm is tasked with minimizing the output value of a function known as the cost (or loss) function, which differs depending on the intended use of the machine learning model. The output value of the cost function is referred to as the training loss when the function is assessed on the training set, and the validation loss when assessed on the validation set. The closer the loss value is to zero, the more accurate the model.
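
For example, if the cost function were mean squared error (an assumption made only for illustration; the actual cost function depends on the model type), the training and validation losses would be computed as in the following sketch:

    # Mean squared error as an illustrative cost function; the actual cost
    # function used by Siren ML depends on the intended use of the model.
    def mse_loss(predictions, targets):
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

    # Training loss: the cost function evaluated on the training set.
    training_loss = mse_loss([1.1, 1.9, 3.2], [1.0, 2.0, 3.0])
    # Validation loss: the same cost function evaluated on the validation set.
    validation_loss = mse_loss([0.8, 2.4, 2.7], [1.0, 2.0, 3.0])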

[Figure: model training progress]

Hyperparameter histogram

During hyperparameter optimization, different model architectures and training parameters are tested to find the best configuration for the data being modeled. Each test run is referred to as a hyperparameter trial. Each trial consists of a short training run and an evaluation of the loss function, which is presented on the histogram. Up to ten hyperparameter trials may be run for a model.

The hyperparameter histogram shows the evolution of the hyperparameter score over each of the trials. The model architecture and training parameters that produce the lowest-scoring trial are used as the configuration for the full model training.
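
Conceptually, the selection works like the following sketch, in which the trial configurations and scores are purely hypothetical:

    # Hypothetical hyperparameter trials: each candidate configuration is given
    # a score (its loss after a short training run). Names and values are
    # illustrative only; Siren ML runs up to ten such trials.
    trials = [
        {"hidden_layers": 1, "learning_rate": 0.010, "score": 0.042},
        {"hidden_layers": 2, "learning_rate": 0.001, "score": 0.031},
        {"hidden_layers": 3, "learning_rate": 0.005, "score": 0.058},
    ]

    # The lowest-scoring trial supplies the configuration for full model training.
    best_configuration = min(trials, key=lambda trial: trial["score"])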

Training loss curves

During full model training, the training progress is plotted live, showing the progression of both the training and validation losses. Ideally, the training loss and the validation loss decrease at a similar rate; the lower these values are, the better the model has learned to analyze and predict your data.

The parameters of the trained machine learning model (which are used for detections/predictions) are taken from the point when the validation loss is at its lowest. This indicates when the model has reached its best performance in terms of discovering useful trends and will have the best generalization to new data.

The graph can also be used to assess whether the model is overfitting, which is observed when the training loss continues to decrease while the validation loss stops decreasing or even begins to increase. In such an event, training stops automatically to prevent unnecessary training time. Similarly, if the validation loss plateaus for a substantial number of points, training also stops automatically, as the model is unlikely to learn further useful trends.
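
The stopping behavior can be pictured with a simplified sketch; the patience value (how many points the validation loss may fail to improve before training stops) is an assumption made for illustration, not Siren ML’s actual setting:

    # Simplified early-stopping check, evaluated after each validation step.
    # The patience value is an illustrative assumption.
    def should_stop(validation_losses, patience=5):
        if len(validation_losses) <= patience:
            return False
        best_loss = min(validation_losses)
        # Stop if the validation loss has not improved within the last `patience` points.
        return min(validation_losses[-patience:]) > best_loss

    # The parameters kept for detections/predictions come from the point with
    # the lowest validation loss, regardless of when training stops.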

Data storage

When a model is created, its configuration and generated neural network are stored on the filesystem by the Siren ML engine. The progress of hyperparameter optimization and training is stored in the sirenml-monitor index in Elasticsearch. The machine learning data output from models is stored in dedicated indices of the form ml-model-<modelName>-<date>. For example, for a model named MyPredictor, one of the output indices is ml-model-mypredictor-2019-07-17. Thus, to manually access data for this model, use the index pattern ml-model-mypredictor-*.
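
For example, the output data could be retrieved directly through the Elasticsearch search API; the host, port, and lack of authentication below are assumptions, so adjust them to match your cluster:

    # Minimal sketch of querying a model's output indices via the Elasticsearch
    # REST search API. Host, port, and security settings are assumptions.
    import requests

    # The index pattern matches every daily output index of the MyPredictor model.
    url = "http://localhost:9200/ml-model-mypredictor-*/_search"
    response = requests.get(url, json={"query": {"match_all": {}}, "size": 10})
    print(response.json()["hits"]["hits"])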