Transform Broadcasting To The Digital Advertising Age
July 23, 2021
The DevOps team could not manually monitor multiple simultaneous sports matches. Remote staff did not have the access needed to make on-the-fly configuration changes in response to service degradation.
At the heart of the problem, client devices and OTT components were not continuously reporting the complete set of health and performance metrics. In addition, there was no centralized dashboard, monitoring, or smart alerting in place, which caused confusion and duplicated effort and increased the time to fix.
STAND 8 OTT engineers delivered a robust real-time monitoring and remote control system, which our client used to broadcast major world sports events. Using MQTT messaging, a consolidated dashboard, and thorough instrumentation monitoring, the client devices were configured to send real-time health metrics including CPU usage, memory usage, bandwidth availability, geolocation, and critical KPI stats (video start time, rebuffering ratio, etc.). Each health parameter was given an acceptable KPI threshold. If any metric breached its threshold, the DevOps team was notified immediately and could start triaging and addressing the issue through the centralized monitoring and control system.
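As a rough illustration of the threshold check described above, the following Python sketch evaluates one device's health report against per-metric KPI limits. It is not the system STAND 8 delivered: the metric names, the threshold values, the JSON payload shape, and the `find_breaches` helper are all assumptions made for this example, and the MQTT transport is left out so the snippet runs on its own.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    limit: float
    # True: alert when the value rises above the limit (e.g. CPU usage).
    # False: alert when the value falls below it (e.g. available bandwidth).
    higher_is_worse: bool = True


# Hypothetical KPI thresholds; real limits would be set per service.
THRESHOLDS = {
    "cpu_percent": Threshold(85.0),
    "memory_percent": Threshold(90.0),
    "video_start_time_s": Threshold(3.0),
    "rebuffering_ratio": Threshold(0.02),
    "bandwidth_mbps": Threshold(5.0, higher_is_worse=False),
}


def find_breaches(payload: str) -> list[str]:
    """Return the names of metrics in a JSON health report that breach their threshold."""
    metrics = json.loads(payload)
    breaches = []
    for name, threshold in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported by this device
        if threshold.higher_is_worse:
            breached = value > threshold.limit
        else:
            breached = value < threshold.limit
        if breached:
            breaches.append(name)
    return breaches


# A made-up report, shaped like something a device might publish over MQTT.
sample = json.dumps({
    "device_id": "device-001",
    "cpu_percent": 92.5,
    "memory_percent": 61.0,
    "bandwidth_mbps": 3.2,
    "video_start_time_s": 1.8,
    "rebuffering_ratio": 0.01,
})

print(find_breaches(sample))  # ['cpu_percent', 'bandwidth_mbps']
```

In a real deployment, a check like this would presumably run inside the handler that receives each device message, with any non-empty result forwarded to the alerting channel.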
The centralized dashboard and smart KPI alerting provided timely insights to our client’s OTT operations group and enabled them to quickly triage and identify the root cause of issues affecting performance.
Our client no longer had to manually monitor the metrics of hundreds of matches; instead, it could rely on a consolidated real-time dashboard view and exception-based alerting to focus on troubleshooting and root cause analysis. The net result was that the DevOps team could quickly and easily identify problem areas, fix those issues remotely, and maintain the highest quality of live streaming performance during the entire sports event. This ensured the broadcast companies met or exceeded their service level agreements (SLAs).
Longer-term, the DevOps teams have come to rely on the accuracy and integrity of the data collected across all client devices. By employing an automatic self-recovery model using AI/ML technology along with exception-based alerting, our client was able to effectively determine whether a potential outage was simply an outlier event that could be ignored or an impending critical issue that needed immediate attention.
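To make the outlier-versus-critical distinction concrete, here is a minimal Python sketch. It deliberately swaps in a simple rule-based stand-in (a sliding window that counts recent threshold breaches) for the AI/ML model mentioned above, whose details are not described here. The class name, window size, breach count, and sample readings are all hypothetical.

```python
from collections import deque


class SustainedBreachDetector:
    """Rule-based stand-in (not the AI/ML model): a single breach is an outlier,
    repeated breaches within the recent window are treated as critical."""

    def __init__(self, limit: float, window: int = 5, min_breaches: int = 3):
        self.limit = limit
        self.min_breaches = min_breaches
        # Keeps only the last `window` breach flags.
        self.recent = deque(maxlen=window)

    def observe(self, value: float) -> str:
        self.recent.append(value > self.limit)
        if sum(self.recent) >= self.min_breaches:
            return "critical"  # sustained problem, needs attention
        if self.recent[-1]:
            return "outlier"   # isolated spike, can be ignored for now
        return "ok"


# Hypothetical rebuffering-ratio readings from one device.
detector = SustainedBreachDetector(limit=0.02)
readings = [0.01, 0.05, 0.01, 0.04, 0.06, 0.07]
statuses = [detector.observe(r) for r in readings]
print(statuses)
# ['ok', 'outlier', 'ok', 'outlier', 'critical', 'critical']
```

A learned model could replace the fixed window and count with thresholds that adapt to each device or event, but the alerting decision it feeds would look much the same.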
Smart OTT monitoring and remote control technologies give Engineering and Operations support teams the opportunity to cut costs, decrease triage time, and leverage automatic service recovery using AI/ML technology. This allows OTT providers and broadcast companies to focus on technology advancements and innovation.