Wednesday, June 12 • 11:00am - 12:00pm
Anomaly Detection on Golden Signals

Anomaly detection on the golden signals (latency, traffic, errors, and saturation) can detect system failures and provide important clues for failure diagnosis. In this talk, we will introduce our algorithm toolbox for anomaly detection on these signals.

The toolbox uses historical data from each signal to build an appropriate probability model. Alerts are then generated from the probability of each new observation under that model. This probability relates directly to the false positive rate of the classification, and it reflects SRE engineers' intuition about how unusual an observation is. Furthermore, the probability values are comparable across different signals, which makes them a good feature for failure diagnosis. In our production system, alerting precision ranges from 70% to 90%, and recall is around 90%.
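
The abstract does not say which probability models the toolbox uses, so the following is only a minimal sketch of the general idea, assuming an empirical distribution built from historical values of a single signal. The function names, the smoothing rule, the threshold parameter, and the sample data are all hypothetical and are not taken from the talk.

```python
from bisect import bisect_left


def tail_probability(history, observation):
    """Estimate P(X >= observation) from historical values.

    Assumption: an empirical distribution with add-one smoothing,
    not the speaker's actual model.
    """
    ordered = sorted(history)
    # Count historical values greater than or equal to the observation.
    at_or_above = len(ordered) - bisect_left(ordered, observation)
    # Add-one smoothing keeps the estimate above zero for unseen extremes.
    return (at_or_above + 1) / (len(ordered) + 1)


def should_alert(history, observation, max_false_positive_rate=0.01):
    """Alert when the observation is rarer than the accepted false positive rate."""
    return tail_probability(history, observation) <= max_false_positive_rate


# Hypothetical latency samples in milliseconds (values from 100 to 119).
latency_history = [100 + (i % 20) for i in range(1000)]

print(should_alert(latency_history, 110))  # a value well inside the historical range
print(should_alert(latency_history, 500))  # a value far above anything seen before
```

Because the alert threshold is a probability rather than a raw value, the same threshold could in principle be applied to latency, traffic, errors, and saturation alike, which is one way to read the claim that the probabilities are comparable across signals.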

Speakers
Yu Chen

Baidu
Yu Chen is a Data Architect at the IOP group of Baidu’s SRE department. His work focuses on developing algorithms for alerting and diagnosis, in order to improve the stability of production systems. Previously, he worked at Microsoft Research Asia. His research interests are distributed...


Wednesday June 12, 2019 11:00am - 12:00pm GMT+08
Track 2: Room 331–332