
Efficient Incident Identification from Multi-dimensional Issue Reports via Meta-heuristic Search

In large-scale cloud systems, unplanned service interruptions and outages may cause severe
degradation of service availability. Such incidents can occur in bursts, which in turn
degrades user satisfaction. Identifying incidents rapidly and accurately is critical to cloud
system operation and maintenance. In industrial practice, incidents are typically detected by
analyzing the issue reports generated over time by monitoring cloud services. Identifying incidents
in a large number of issue reports is challenging. An issue report is typically
multi-dimensional: it has many categorical attributes, and it is difficult to identify the specific
attribute combination that indicates an incident. Existing methods generally rely on pruning-based
search, which is time-consuming on high-dimensional data, so they are not practical for incident
detection in large-scale cloud systems. In this paper, we propose MID (Multi-dimensional Incident
Detection), a novel framework for identifying incidents from large volumes of multi-dimensional
issue reports effectively and efficiently. The key to the MID design is encoding the problem as a
combinatorial optimization problem and solving it with a specifically tailored meta-heuristic search
method that rapidly generates attribute combinations with a high probability of indicating
incidents. We evaluate MID with extensive experiments using both synthetic data and real-world
data collected from a large-scale cloud system of Company M. The experimental
results show that MID significantly outperforms the current state-of-the-art methods in terms of
effectiveness and efficiency. MID has been successfully applied to Company M's cloud systems
and has helped greatly reduce manual maintenance effort.
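
To make the problem framing concrete, the sketch below treats incident identification as a search over attribute combinations and solves it with plain hill climbing. It is not MID's actual algorithm or objective, which the abstract does not specify: the scoring function, the neighbourhood moves, the attribute names (`region`, `service`, `os`) and the toy data are all assumptions chosen for illustration.

```python
from typing import Dict, Iterator, List

Report = Dict[str, str]  # one issue report: attribute name -> categorical value
Combo = Dict[str, str]   # an attribute combination (a partial assignment)


def count_matches(reports: List[Report], combo: Combo) -> int:
    """Number of reports whose attributes agree with every entry of combo."""
    return sum(all(r.get(k) == v for k, v in combo.items()) for r in reports)


def score(combo: Combo, current: List[Report], baseline: List[Report]) -> float:
    """Hypothetical objective (an assumption, not taken from the paper):
    reward combinations that cover many excess reports relative to a
    baseline window, weighted by the fraction of matches that are excess."""
    cur = count_matches(current, combo)
    excess = cur - count_matches(baseline, combo)
    if excess <= 0:
        return 0.0
    return excess * excess / cur


def neighbors(combo: Combo, current: List[Report]) -> Iterator[Combo]:
    """Combinations one move away: drop an attribute, or set/change one."""
    for attr in list(combo):
        yield {k: v for k, v in combo.items() if k != attr}
    domain: Dict[str, set] = {}
    for report in current:
        for k, v in report.items():
            domain.setdefault(k, set()).add(v)
    for attr in sorted(domain):
        for value in sorted(domain[attr]):
            if combo.get(attr) != value:
                yield {**combo, attr: value}


def hill_climb(current: List[Report], baseline: List[Report],
               max_steps: int = 100) -> Combo:
    """Greedy local search: move to the best neighbor until none improves."""
    best: Combo = {}
    best_score = score(best, current, baseline)
    for _ in range(max_steps):
        candidate = max(neighbors(best, current),
                        key=lambda c: score(c, current, baseline),
                        default=None)
        if candidate is None or score(candidate, current, baseline) <= best_score:
            break
        best, best_score = candidate, score(candidate, current, baseline)
    return best


# Toy data: a quiet baseline window, then a burst of eu/storage reports.
baseline = [
    {"region": "eu", "service": "storage", "os": "linux"},
    {"region": "eu", "service": "compute", "os": "windows"},
    {"region": "us", "service": "storage", "os": "linux"},
    {"region": "us", "service": "compute", "os": "windows"},
]
surge = [{"region": "eu", "service": "storage", "os": os_name}
         for os_name in ("linux", "windows") for _ in range(3)]
current = baseline + surge

result = hill_climb(current, baseline)
print(result)  # {'region': 'eu', 'service': 'storage'}
```

Each step here evaluates only the immediate neighbours of the current combination rather than enumerating the full space of combinations, which is the general appeal of heuristic search on high-dimensional data. A practical meta-heuristic would typically add randomization or restarts to escape local optima.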
