Anomaly detection, or outlier detection, is the identification of data points, observations, or events that do not conform to the expected patterns of a given group. Anomalies occur very infrequently, but they can signify large and significant threats to businesses, such as cyber intrusions, financial fraud, compliance violations, and machinery malfunctions. Anomaly detection has traditionally relied on subject matter experts to curate and set business rules that trigger red flags in data. This traditional approach is inherently flawed because:

  • As the methodology is highly dependent on subject matter experts' knowledge and experience, the results can be subjective or biased.
  • It entails a long and expensive development cycle to cultivate and refine the set of business rules.
  • Oftentimes, business rules are not precise enough, are difficult to manage over time, and may contradict one another.

Machine learning-based anomaly detection algorithms are a leap forward from rule-based solutions. However, conventional machine learning algorithms often fail to achieve a satisfactory level of performance for end users, for several reasons:

  • First, as anomalies are rare and expensive to investigate, many organizations have few or no historical labels on anomalies. The lack of historical data and the sparsity of labels impede the use of supervised machine learning algorithms.
  • Second, different anomaly detection algorithms rely on different assumptions and are domain specific. There is no one-size-fits-all anomaly detection algorithm.
  • Third, customizing anomaly detection algorithms is non-trivial because it relies heavily on feature engineering. This is an increasing challenge as data velocity and variety continue to rise in today's business world.

In recent years, deep learning-based anomaly detection solutions have gained great momentum and shown superior performance in various domains. In contrast to conventional machine learning algorithms, deep learning algorithms thrive in high-dimensional, data-rich environments. They can explore feature interactions and perform feature engineering on their own, and they adapt flexibly to different types of data, structured or unstructured, given an appropriate neural network design. Popular deep learning architectures that can be used in an anomaly detection framework include:

Autoencoder: Autoencoders learn compact representations of complex datasets through an unsupervised training process in which high-dimensional, multivariate data are encoded into a lower-dimensional representation. For example, when an autoencoder is trained on a dataset consisting entirely of normal transactions, it learns very well how to encode a normal transaction into a compact representation and then decode that representation back into a transaction, so the reconstruction error of this encode-decode process is very low. When the autoencoder is faced with abnormal transactions, however, we expect a higher reconstruction error, and those transactions can be flagged as anomalies.
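The sketch below is a minimal illustration of this reconstruction-error approach, assuming PyTorch; the `normal_data` tensor, the layer sizes, and the error threshold are hypothetical placeholders rather than recommended settings.

```python
import torch
import torch.nn as nn

n_features = 30      # assumed input dimensionality of a transaction record
encoding_dim = 4     # size of the compact (encoded) representation

# Encoder compresses the input into encoding_dim values; decoder reconstructs it.
autoencoder = nn.Sequential(
    nn.Linear(n_features, 16), nn.ReLU(),
    nn.Linear(16, encoding_dim), nn.ReLU(),   # encoder
    nn.Linear(encoding_dim, 16), nn.ReLU(),
    nn.Linear(16, n_features),                # decoder
)

optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder for a dataset of normal transactions only.
normal_data = torch.randn(1024, n_features)

# Train the autoencoder to reconstruct normal transactions.
for epoch in range(50):
    optimizer.zero_grad()
    reconstruction = autoencoder(normal_data)
    loss = loss_fn(reconstruction, normal_data)
    loss.backward()
    optimizer.step()

# At scoring time, the per-record reconstruction error is the anomaly score.
with torch.no_grad():
    new_data = torch.randn(8, n_features)              # placeholder for incoming transactions
    errors = ((autoencoder(new_data) - new_data) ** 2).mean(dim=1)
    threshold = 1.5                                    # assumed; typically set from errors on held-out normal data
    flagged = errors > threshold                       # True where a transaction looks anomalous
```

In practice, the threshold would be calibrated against the distribution of reconstruction errors on a held-out set of normal transactions rather than fixed to a constant.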

Generative Adversarial Network (GAN): A GAN is a framework for estimating generative models via an adversarial process in which two models, a generator and a discriminator, are trained simultaneously. The idea is that the two models compete with each other during training: the generator tries to capture the normal data distribution and generate realistic data, while the discriminator learns to distinguish between real and generated data. A well-trained discriminator can then be used to detect anomalies.
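As a rough illustration, the sketch below trains a small generator-discriminator pair on normal transactions and then uses the discriminator's score as an anomaly signal. PyTorch, the placeholder `normal_data` tensor, the network sizes, and the 0.5 score threshold are all assumptions for illustration, not a definitive implementation.

```python
import torch
import torch.nn as nn

n_features, latent_dim, batch_size = 30, 8, 64

# Generator maps random noise vectors to synthetic transactions.
generator = nn.Sequential(
    nn.Linear(latent_dim, 16), nn.ReLU(),
    nn.Linear(16, n_features),
)
# Discriminator outputs the probability that a transaction is real (normal).
discriminator = nn.Sequential(
    nn.Linear(n_features, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

# Placeholder for a dataset of normal transactions only.
normal_data = torch.randn(1024, n_features)

for step in range(1000):
    real = normal_data[torch.randint(0, len(normal_data), (batch_size,))]
    fake = generator(torch.randn(batch_size, latent_dim))

    # Discriminator step: score real normal data as 1, generated data as 0.
    opt_d.zero_grad()
    d_loss = bce(discriminator(real), torch.ones(batch_size, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(batch_size, 1))
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make generated data score as real.
    opt_g.zero_grad()
    g_loss = bce(discriminator(fake), torch.ones(batch_size, 1))
    g_loss.backward()
    opt_g.step()

# After training, a low discriminator score suggests a transaction does not
# resemble the normal data distribution, i.e. a potential anomaly.
with torch.no_grad():
    scores = discriminator(torch.randn(8, n_features)).squeeze(1)
    anomalies = scores < 0.5    # assumed threshold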