How to model stochastic behavior of failures in telco or IT systems using machine learning

SadeghKrmi/codedive2022


codedive2022

How to model stochastic behavior of failures in telco or IT systems using machine learning

Abstract:

In large-scale IT or telco networks, preventive maintenance (PM) with minimal repair at failures is required to slow the degradation of the live system and reduce its impact on the quality of the end-user experience. Network nodes fail stochastically, and their failures are related to the alarms and health-check statuses observed before a failure occurs: the more major or critical alarms a node generates, the higher its probability of failure. To predict failures and reduce financial and non-financial losses, a suitable approach and model are needed to prioritize failures for preventive maintenance.
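The idea in the abstract, that failure probability rises with the number of major or critical alarms and that nodes can then be prioritized for maintenance, can be illustrated with a minimal sketch. It is not the model used in this repository: it swaps in a plain logistic regression for the neural network, trains on synthetic data, and every name in it (`fit_logistic`, `failure_probability`, `make_synthetic_history`, the `node-*` labels, the two alarm-count features) is a hypothetical chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression with batch gradient descent (pure Python)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def failure_probability(w, b, x):
    """Predicted probability that a node with alarm counts x fails."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def make_synthetic_history(n_samples=400, seed=0):
    """Synthetic (assumed) history: [critical alarms, major alarms] -> failed?"""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        critical = rng.randint(0, 5)
        major = rng.randint(0, 5)
        # Assumed ground truth: critical alarms weigh more than major ones.
        true_p = sigmoid(-4.0 + 1.0 * critical + 0.5 * major)
        X.append([critical, major])
        y.append(1 if rng.random() < true_p else 0)
    return X, y


X, y = make_synthetic_history()
w, b = fit_logistic(X, y)

# Hypothetical live nodes with their current [critical, major] alarm counts.
nodes = {"node-a": [0, 1], "node-b": [4, 3], "node-c": [2, 0]}

# Prioritize preventive maintenance by descending predicted failure probability.
ranking = sorted(
    nodes,
    key=lambda name: failure_probability(w, b, nodes[name]),
    reverse=True,
)
print(ranking)
```

In a real deployment the features would presumably come from monitoring data rather than a random generator, and the model would be validated before its ranking is used to schedule maintenance.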

Prometheus diagram

image

Neural network

image

SHAP explainer

image
