
LogRhythm Diagnostics Module

User Guide
June 22, 2016

USG-LogRhythm_Diagnostics-revA

LogRhythm, Inc. All rights reserved


This document contains proprietary and confidential information of LogRhythm, Inc., which is protected by
copyright and possible non-disclosure agreements. The Software described in this Guide is furnished under the
End User License Agreement or the applicable Terms and Conditions (Agreement) which governs the use of the
Software. This Software may be used or copied only in accordance with the Agreement. No part of this Guide may
be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and
recording for any purpose other than what is permitted in the Agreement.

Disclaimer
The information contained in this document is subject to change without notice. LogRhythm, Inc. makes no
warranty of any kind with respect to this information. LogRhythm, Inc. specifically disclaims the implied warranty
of merchantability and fitness for a particular purpose. LogRhythm, Inc. shall not be liable for any direct, indirect,
incidental, consequential, or other damages alleged in connection with the furnishing or use of this information.

Trademark
LogRhythm is a registered trademark of LogRhythm, Inc. All other company or product names mentioned may be
trademarks, registered trademarks, or service marks of their respective holders.

LogRhythm Inc.
4780 Pearl East Circle
Boulder, CO 80301

(303) 413-8745
www.logrhythm.com

LogRhythm Customer Support


support@logrhythm.com

Contents
Introduction
Module Contents
  Alarm Rules
  Reports
  Investigations
  Tails
Troubleshooting Guidance
Appendix: Summary of Changes
  Renamed Alarm Rules
  New Alarm Rules

Introduction
The LogRhythm Diagnostics module is provided as part of the LogRhythm Knowledge Base and includes content
intended to monitor the health of the LogRhythm deployment and generate alarms when key health-impacting
events occur. The module contains tails, reports, and investigations to monitor all diagnostic events, as well as
alarms triggered by specific conditions.

NOTE: The LogRhythm Diagnostics Module replaces content currently available in the QsEMP module and
will be automatically synchronized on all deployments. Existing rules have been modified for accuracy,
updated with new components where necessary, and have undergone settings changes to reduce alarm
volume. With the exception of LogRhythm Component Critical Condition, which is suppressed by default
for one hour, the default suppression for all alarms has been updated to two hours.
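
The suppression defaults in the note above can be pictured as a simple lookup. The following is a minimal Python sketch for illustration only. The one-hour and two-hour values come from the note, while the dictionary and function names are hypothetical and are not part of any LogRhythm API.

```python
from datetime import timedelta

# Values taken from the note above. The names below are hypothetical and
# do not correspond to any LogRhythm setting or API.
DEFAULT_SUPPRESSION = timedelta(hours=2)
SUPPRESSION_OVERRIDES = {
    "LogRhythm Component Critical Condition": timedelta(hours=1),
}


def default_suppression(rule_name: str) -> timedelta:
    """Return the default suppression period for an alarm rule name."""
    return SUPPRESSION_OVERRIDES.get(rule_name, DEFAULT_SUPPRESSION)
```

For example, default_suppression("LogRhythm Agent Heartbeat Missed") returns a two-hour period because that rule has no override.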

Module Contents
Alarm Rules
Alarm Rule Name | Alarm Description | Alarm Rule ID
LogRhythm Mediator Database Capacity Error | Alarms on the occurrence of the Mediator Database reaching 90% capacity. The Mediator Server inserting log data into the affected Mediator database will cease accepting new log messages from connected agents and will force agents to disconnect. | 96
LogRhythm Mediator Database Capacity Warning | Alarms on the occurrence of the Mediator Database reaching 80% capacity. At 90% capacity the Mediator Server inserting data into the affected Mediator database will cease accepting new log messages. | 97
LogRhythm Agent Heartbeat Missed | Alarms on the occurrence of a LogRhythm Agent Heartbeat Missed event which could indicate a LogRhythm Agent going down. | 98
LogRhythm Component Critical Condition | Alarms on the occurrence of any critical LogRhythm component event which indicates the failure of a LogRhythm component. | 99
LogRhythm Component Successive Errors | Alarms on successive occurrences of critical and error LogRhythm component events which likely indicate the failure of a LogRhythm component. | 100
LogRhythm Component Excessive Warnings | Alarms on excessive occurrences of critical, error or warning LogRhythm component events which could indicate pending failures of the LogRhythm solution. | 101
LogRhythm Mediator Heartbeat Missed | Alarms on the occurrence of a LogRhythm Mediator Heartbeat Missed event which could indicate that a log manager has gone down. | 102
LogRhythm MPE Rule Disabled | Alarms on the occurrence of a LogRhythm MPE Rule Disabled event. | 103
LogRhythm Silent Log Source Error | Alarms on a LogRhythm Silent Log Source Error event which could indicate a log source that has gone silent. | 104

LogRhythm Database Maintenance Failure | Alarms on a LogRhythm Database Maintenance job failing. | 210
LogRhythm Failed To Submit Batch Job To DB | Alarms on Mediator error "Failed to submit batch job to the database". | 212
LogRhythm Excessive Unprocessed Logs Spooled to Disk | A high number of logs have been spooled to disk. | 230
LogRhythm Excessive Processed Logs Spooled to Disk | The Log Insert Manager has spooled a high number of logs to disk. | 231
LogRhythm Excessive Events Spooled to Disk | The Event Insert Manager has spooled a high number of logs to disk. | 232
Perfmon Counter Reached Threshold Limit | Alarm for performance counter alerting on disk exhaustion. | 233
LogRhythm GLPR Error | Alarms on the occurrence of a LogRhythm GLPR processing or preparation error. | 408
LogRhythm AI Engine Heartbeat Missed | A heartbeat message from the LogRhythm AI Engine service was not received in the allotted time. | 676
LogRhythm AI Comm Manager Heartbeat Missed | A heartbeat message from the LogRhythm AI Engine Communication Manager service was not received in the allotted time. | 677
LogRhythm CMDB Database Warning | Alarms on the occurrence of the Case Management Database reaching 90% capacity. | 947
LogRhythm CMDB Stats Warning | Alarms when the LogRhythm Job Manager is unable to retrieve the Case Management stats. | 948
LogRhythm CMDB Database Error | Alarms when the Case Management Database has utilized more than 90% of its capacity. | 949
LogRhythm Agent Cannot Update | Alarms on the LogRhythm Diagnostic Event ID 7012 - The System Monitor Agent Cannot Update Itself. | 1002
LogRhythm Agent Needs Reboot | Alarms on the LogRhythm Diagnostic Event ID 7014 - The System Monitor Agent Has Been Updated But Requires A Reboot. | 1003
LogRhythm Network Monitor Heartbeat Missed | Alarms on the occurrence of a Network Monitor Heartbeat Missed event which could indicate that a network monitor has gone down. | 1084
LogRhythm Data Indexer Stopped | One or more data indexer services have stopped. | 1093
LogRhythm Data Indexer Configuration Fail | An attempt to change the LogRhythm Data Indexer configuration has failed. | 1094
LogRhythm Data Indexer Suspend | The LogRhythm Data Indexer reliable messaging has gone into suspend. | 1095
LogRhythm Data Indexer EMDB Sync Fail | The synchronization service failed to replicate critical EMDB tables. | 1096
LogRhythm Data Indexer Disk Limit Exceeded | The LogRhythm Data Indexer has exceeded its drive space threshold. | 1097

LogRhythm Data Indexer Max Index Exceeded | The LogRhythm Data Indexer has exceeded its TTL. | 1098
LogRhythm Data Indexer List Not Found | A query attempted to access a list which is not available on the DX cluster. | 1099
LogRhythm Data Indexer Repo Not Found | A query attempted to access a log repository which is not available on the DX cluster. | 1100
LogRhythm Data Indexer Cluster Health | The Data Indexer Elasticsearch health has changed. | 1101
LogRhythm Knowledge Base Update Error | Alarms if the LogRhythm Knowledge Base fails to automatically download or sync. | 1102
LogRhythm Mediator Recycling Hung MPE Threads | The LogRhythm Mediator has recycled due to hung MPE threads. | 1139
LogRhythm Scheduled Report Failure | The LogRhythm Job Manager has encountered an error when attempting to prepare, run, or export a scheduled report package. | 1140
LogRhythm AD Sync Failure | The LogRhythm Platform Manager has failed when attempting to sync Active Directory. | 1141

Reports
Report Name | Report Description | Report ID
LogRhythm Diagnostic Events | Provides a detailed account of critical and error conditions experienced by LogRhythm components. | 431

Investigations
Investigation Name | Investigation Description | Investigation ID
LogRhythm Diagnostic Events | This investigation returns all diagnostic events from any LogRhythm component (Agent, AI Engine, ARM, Mediator, etc.). | 12

Tails
Tail Name | Tail Description | Tail ID
LogRhythm Diagnostic Events | This tail returns all diagnostic events from any LogRhythm component (System Monitor Agent, AI Engine, ARM, Mediator, etc.). | 1


Troubleshooting Guidance
This section provides information about steps you can take to further analyze specific alarms or how to gather
additional information to provide to LogRhythm Customer Support.

ID: Alarm Potential Remediation Steps


96: LogRhythm Mediator Database Capacity Error
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check Data Management settings for proper tuning, ensuring only required logs are being brought online (use Classification-Based Data Management or run an investigation to identify major producers of white noise).
3. Investigate top talkers, log counts, and summary totals for spikes in the environment that could point to a misconfiguration or a potential threat (see the Log Volume Reports).
4. Ensure only required logs are being kept online.
5. If the LMDB is oversubscribed and a justified volume of logs is being brought online, the LMDB can be manually grown (drive space permitting), or the appliance/deployment may need to be re-scoped for additional resources.
6. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: This alarm directs immediate attention to the LMDB and indicates that the system has gone into suspend mode. This is caused by oversubscription of the LMDB. This alarm does not apply to the Data Processor.

97: LogRhythm Mediator Database Capacity Warning
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check Data Management settings for proper tuning, ensuring only required logs are being brought online (use Classification-Based Data Management or run an investigation to identify major producers of white noise).
3. Investigate top talkers, log counts, and summary totals for spikes in the environment that could point to a misconfiguration or a potential threat (see the Log Volume Reports).
4. Ensure only required logs are being kept online.
5. If the LMDB is oversubscribed and a justified volume of logs is being brought online, the LMDB can be manually grown (drive space permitting), or the appliance/deployment may need to be re-scoped for additional resources.
6. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: This alarm provides early warning to administrators that action may be necessary to maintain the LMDB and prevent the system from going into suspend mode. This is caused by oversubscription of the LMDB. This alarm does not apply to the Data Processor.
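
The relationship between the two Mediator Database capacity alarms (IDs 96 and 97) can be sketched as a threshold check. This is an illustrative Python sketch only. The 80% and 90% values come from the alarm descriptions in this guide, while the function name, its parameters, and the returned labels are hypothetical and do not reflect how the Mediator measures capacity internally.

```python
# Thresholds from the alarm descriptions for rules 97 (warning) and 96 (error).
# Function name, parameters, and labels are hypothetical.
WARNING_PCT = 80.0
ERROR_PCT = 90.0


def capacity_state(used_bytes: int, max_bytes: int) -> str:
    """Classify database usage as "ok", "warning", or "error"."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    pct = 100.0 * used_bytes / max_bytes
    if pct >= ERROR_PCT:
        return "error"    # rule 96: new log messages are no longer accepted
    if pct >= WARNING_PCT:
        return "warning"  # rule 97: early warning before suspend mode
    return "ok"
```

A check of this kind can help when deciding how much headroom remains before the warning condition escalates to the error condition.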

98: LogRhythm Agent Heartbeat Missed
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check System Monitor service health (try restarting the service).
3. Check network connectivity between the Agent and the Mediator.
4. Check scsm.log for errors.
5. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
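
Several remediation steps in this section ask you to check a component log such as scsm.log for errors. The Python sketch below shows one way to pull candidate lines out of a log file. It assumes that relevant lines contain the word "error" or "critical"; the actual log layout is not described in this guide, so the pattern and the function name are assumptions to adjust as needed.

```python
import re
from pathlib import Path

# Assumption: lines of interest contain "error" or "critical" (any case).
# The real log format is not documented here, so adjust the pattern.
ERROR_PATTERN = re.compile(r"\b(error|critical)\b", re.IGNORECASE)


def find_error_lines(log_path, pattern=ERROR_PATTERN):
    """Return (line_number, text) pairs for lines that match the pattern."""
    hits = []
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits
```

The line numbers in the result make it easier to quote the surrounding context when raising a ticket with LogRhythm Support.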

99: LogRhythm Component Critical Condition
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Review the status of the component listed in the alarm, and restart the component if necessary.
3. If an issue is identified, remediation steps will vary according to the affected component and the error observed.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Identifies and detects critical problems at an early stage on any LogRhythm component. It will most likely require analysis to verify the scope, validity, and priority if a source issue is identified.

100: LogRhythm Component Successive Errors
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding evidence from affected sources.
2. Review the status of the component listed in the alarm. Restart the component if necessary.
3. If an issue is identified, remediation steps will vary according to the affected component and the error observed.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Identifies and detects potential problems at an early stage on any LogRhythm component. It will most likely require analysis to verify the scope, validity, and priority if a source issue is identified.

101: LogRhythm Component Excessive Warnings
1. Investigate the event and related logs from the problem host around the time frame of the alarm, looking for common events that match the alarm criteria.
2. Review the status of the component listed in the alarm, and restart the component if necessary.
3. If an issue is identified, remediation steps will vary according to the affected component and the error observed.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Identifies and detects potential problems at an early stage on any LogRhythm component. It will most likely require analysis to verify the scope, validity, and priority if a source issue is identified.

102: LogRhythm Mediator Heartbeat Missed
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Review Mediator service health. If the service is stopped, start it. If it fails to run, check scmedsvr.log for errors during startup.
3. Check network connectivity between the LM and the EM.
4. Check scmedsvr.log.
5. If you are receiving false positives due to expected environmental factors, the heartbeat time-out interval may be tuned to match in the LM's properties.
6. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
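
The network connectivity checks mentioned for the heartbeat alarms can be approximated with a basic TCP connection test. The Python sketch below is illustrative only: this guide does not state which hosts or ports the LogRhythm components use, so the host and port arguments are placeholders that you must supply for your own deployment, and the function name is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    The host and port are supplied by the caller; this guide does not
    document the ports used between LogRhythm components.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful connection only shows that the port is reachable. It does not confirm that the service behind it is healthy, so the log checks in the steps above still apply.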

103: LogRhythm MPE Rule Disabled
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check scmpe.log and lps_detail.log for policy processing issues, statistics, stalled threads, further issues with the same rule, and/or subsequent issues with MPE rules in the same policy (possibly indicating a change or update to a log source or log source type, logging operations).
3. Investigate associated MPE Rules. It is recommended that you contact LogRhythm Support for this step.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: The MPE service now detects when individual rules continue to generate processing warnings and may disable rules repeatedly raising warnings if hung processes jeopardize the health of the system.
This capability increases the reliability of each Log Manager by allowing the MPE to more accurately identify and gracefully handle parsing rules that risk the health of the overall system.

104: LogRhythm Silent Log Source Error
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Examine the log source in the Client Console to check if the log source record has updated the "last log message" field since the receipt of the alarm.
3. Verify correct configuration of the Silent Log Message Source Settings in the log source properties.
4. Investigate the log source host to verify health or issues that may be interrupting communications to the collecting agent.
5. Ensure no configuration changes (security or administrative) have been made to the log source, and verify no changes in the communications path that may prevent logging to LogRhythm.
6. If this is expected behavior from the log source, silent log source settings may be tuned in the advanced properties of each log source.
7. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: While silent log source alarms can be extremely valuable, and can be set individually per log source, environmental factors, like a log source that is not very chatty, can cause flooding of this alarm. As this is tuned per log source, it can have a high administrative cost. When tuned, however, the value of these alarms can be extremely high and produce valuable insight into each log source's normal behavior.

210: LogRhythm Database Maintenance Failure
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Additional details about the status of specific job instances can be found by viewing the Job History in the SQL Server Agent properties.
3. Check health of SQL Services.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Each SQL Database Maintenance job comprises individual steps that perform a specific maintenance function. LogRhythm has several maintenance jobs designed to perform routine functions that age data from the databases and rebuild indexes to maintain efficient search functions. If the maintenance jobs do not run, it will have an impact on your system and could create suspense conditions and fill the databases to capacity.
The database maintenance jobs are implemented as SQL Server Agent jobs. There are three jobs that are in place on any LogRhythm 6.x database server:
  LogRhythm Weekday Maintenance: Runs Mon-Fri at a default time of 12:15 AM
  LogRhythm Saturday Maintenance: Runs Saturday at a default time of 12:15 AM
  LogRhythm Sunday Maintenance: Runs Sunday at a default time of 12:15 AM
One additional job that appears on Platform Managers only, LogRhythm Backup, is called by the LogRhythm Sunday Maintenance Job.
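
When reviewing the SQL Server Agent Job History, it helps to know which maintenance job was expected to run on a given date. The Python sketch below maps a calendar date to the job name using the default schedule described in the note above. The function name is hypothetical, and the sketch does not account for schedules that have been customized in your deployment.

```python
from datetime import date


def maintenance_job_for(day: date) -> str:
    """Return the maintenance job expected on the given date (default schedule)."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday <= 4:
        return "LogRhythm Weekday Maintenance"
    if weekday == 5:
        return "LogRhythm Saturday Maintenance"
    # On Platform Managers, the Sunday job also calls LogRhythm Backup.
    return "LogRhythm Sunday Maintenance"
```

For example, a failure reported on a Wednesday points to the LogRhythm Weekday Maintenance job history.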

212: LogRhythm Failed To Submit Batch Job To DB
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check SQL and Mediator service health.
3. Check logmsgprocessor.log, evtmsgprocessor.log, scmedsvr.log, and the SQL Server error and history logs around the relevant time frame for the cause of failure and/or related SQL maintenance errors.
4. Investigate volume reports and log spikes that may have caused a temporary failure and monitor for subsequent failures.
5. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Processed logs, coupled with their respective instructions for inserting, are batched together for insertion into the respective destination database.

230: LogRhythm Excessive Unprocessed Logs Spooled to Disk
1. Investigate top talkers, log counts, and summary totals for spikes in the environment that could point to a misconfiguration, oversubscription, or a potential threat (see the Log Volume Reports).
2. Monitor for continuance by checking Deployment Monitor and related Performance Counters to gather usage statistics (counter descriptions are available in the LR Help File).
3. Check scmedsvr.log and verify Mediator service health and connectivity.
4. Find and investigate any occurrences of the "LogRhythm MPE Rule Disabled" alarm relative to the time frame (use the alarm viewer, alarm reports, personal dashboard, notifications, and the Web Console).
5. Check scmpe.log and lps_detail.log for processing issues, statistics, stalled threads, and disabled MPE rules.
6. A restart of the Mediator service can help to clear processing threads that may be stalled while processing a message. However, if the parsing rule is not parsing a message correctly, rather than hitting an unexpected one-time exception, the issue will most likely return, depending on the frequency at which the source log is observed.
7. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Indicates that processing is unable to keep up with the rate at which logs are being received. This could be due to improper processing of logs, oversubscription, high resource utilization, or overall service health. This rule can be tuned to exclude lower-volume diagnostic events (e.g., Unprocessed Log Spooled Count Exceeds 1 Million) on high-volume deployments.
Component settings can be tuned to help a system catch up temporarily. However, each of these settings should always be set back to its original best practice value, unless there is a specific need to keep a new setting or unless otherwise directed. Manipulation of these settings can cause a waterfall effect. For example, if you increase the resources dedicated to insertion, you would leave fewer resources for processing incoming log messages.

231: LogRhythm Excessive Processed Logs Spooled to Disk
1. Investigate top talkers, log counts, and summary totals for spikes in the environment that could point to a misconfiguration, oversubscription, or a potential threat (see the Log Volume Reports).
2. Monitor for continuance by checking Deployment Monitor and related Performance Counters to gather usage statistics (counter descriptions are available in the LR Help File).
3. Check logmsgprocessor.log, evtmsgprocessor.log, and scmedsvr.log around the relevant time frame for clues.
4. Check SQL maintenance jobs for successes and failures.
5. Check resource utilization.
6. Check Data Management and RBP settings for improper tuning (identify through investigations) and ensure insert rates adhere to best practices.
7. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Spooling logs to disk is an expected condition that happens under periods of peak load. Excessive spooling, however, can result in disk starvation and the Mediator going into a suspend state. This rule can be tuned to exclude lower-volume diagnostic events (e.g., Unprocessed Log Spooled Count Exceeds 1 Million) on high-volume deployments.

232: LogRhythm Excessive Events Spooled to Disk
1. Investigate top talkers, log counts, and summary totals for spikes in the environment that could point to a misconfiguration, oversubscription, or a potential threat (see the Log Volume Reports).
2. Monitor for continuance by checking Deployment Monitor and related Performance Counters to gather usage statistics (counter descriptions are available in the LR Help File).
3. Check logmsgprocessor.log, evtmsgprocessor.log, and scmedsvr.log around the relevant time frame for clues.
4. Check SQL maintenance jobs for successes and failures.
5. Check resource utilization.
6. Check Data Management and RBP settings for improper tuning (identify through investigations) and ensure insert rates adhere to best practices.
7. Check for oversubscription (see the Log Volume Report by Day).
8. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: Spooling logs to disk is an expected condition that happens under periods of peak load. Excessive spooling, however, can result in disk starvation and the Mediator going into a suspend state. This rule can be tuned to exclude lower-volume diagnostic events (e.g., InsertMgr Event Spooled Count Exceeds 1 Million) on high-volume deployments.

233: Perfmon Counter Reached Threshold Limit
Remediation varies based on rule configuration.
Note: This rule is not typically enabled by default, but specific Performance Counters are added to the alarm criteria to assist with troubleshooting or to detect specific trouble conditions.

408: LogRhythm GLPR Error
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Review GLPR configurations.
3. Check GLPR Performance Counters.
4. Check scmpe.log for errors and disabled GLPRs.
5. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: The following error types may trigger this alarm:
  Collection Update Error: Error updating the entire rule base of GLPRs after a configuration change
  Preparation Error: Error observed while preparing a rule for processing
  Processing Error: Error observed while processing against a specific GLPR

676: LogRhythm AI Engine Heartbeat Missed
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Review the AI Engine Communication Manager service on the AIE appliance.
3. Check network connectivity between AIE and the Platform Manager.
4. Check LRAIEEngine.log.
5. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.

677: LogRhythm AI Comm Manager Heartbeat Missed
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Review the AI Engine Communication Manager service on the AIE appliance.
3. Check network connectivity between AIE and the Platform Manager.
4. Check lraiedp.log (Mediator) and lraiecommgr.log (AIE).
5. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: The threshold at which a LogRhythm System Monitor will trigger this alarm can be configured in the System Monitor properties.

947: LogRhythm CMDB Database Warning
Collect SQL error logs, note database and disk sizes, and then raise a ticket with LogRhythm Support.

948: LogRhythm CMDB Stats Warning
Collect SQL error logs, note database and disk sizes, and then raise a ticket with LogRhythm Support.

949: LogRhythm CMDB Database Error
Collect SQL error logs, note database and disk sizes, and then raise a ticket with LogRhythm Support.

1002: LogRhythm Agent Cannot Update
1. Determine the affected Agent by investigating logs and events.
2. Ensure the Agent is communicating and the last Agent Heartbeat is up to date.
3. Ensure the Agent is collecting logs as expected.
4. Collect scmedsvr.log and scsm.log, and then raise a ticket with LogRhythm Support.

1003: LogRhythm Agent Needs Reboot
Restart the LogRhythm System Monitor Agent.
Note: This alarm indicates that the noted System Monitor Agent service requires a restart. You should restart the Agent during off-peak hours and within an allowed change control window.

1084: LogRhythm Network Monitor Heartbeat Missed
1. Verify that you can log in to Network Monitor and access the UI.
2. Check the diagnostics page for clues.
3. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.

1093: LogRhythm Data Indexer Stopped
Attempt to start the Indexer services using the start script:
  Windows: C:\Program Files\LogRhythm\Data Indexer\tools\start-all-services.bat
  Linux: /usr/local/logrhythm/tools/start-all-services-linux.sh
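
If you script your response to this alarm, the start script can be selected by operating system. The Python sketch below only returns the documented path for the current platform and does not run anything. The two paths are quoted from the remediation step above, and the helper name is hypothetical.

```python
import platform

# Paths quoted from the remediation step above. The helper is hypothetical.
START_SCRIPTS = {
    "Windows": r"C:\Program Files\LogRhythm\Data Indexer\tools\start-all-services.bat",
    "Linux": "/usr/local/logrhythm/tools/start-all-services-linux.sh",
}


def start_script_for(system=None):
    """Return the documented start script path for the given or current OS."""
    system = system or platform.system()
    try:
        return START_SCRIPTS[system]
    except KeyError:
        raise ValueError(f"no start script documented for {system!r}") from None
```

Run the returned script with appropriate privileges on the Data Indexer host itself.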

1094: LogRhythm Data Indexer Configuration Fail
Please contact LogRhythm Support.

1095: LogRhythm Data Indexer Suspend
Please contact LogRhythm Support.

1096: LogRhythm Data Indexer EMDB Sync Fail
Please contact LogRhythm Support.

1097: LogRhythm Data Indexer Disk Limit Exceeded
Please contact LogRhythm Support.

1098: LogRhythm Data Indexer Max Index Exceeded
Please contact LogRhythm Support.

1099: LogRhythm Data Indexer List Not Found
Please contact LogRhythm Support.

1100: LogRhythm Data Indexer Repo Not Found
Please contact LogRhythm Support.

1101: LogRhythm Data Indexer Cluster Health
Retrieve the results of the "_cat/indices?v" command from a Web browser and provide the results to LogRhythm Support.

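
As an alternative to a Web browser, the "_cat/indices?v" output for alarm 1101 can be captured with a short script. The Python sketch below is illustrative only. It assumes the Data Indexer's Elasticsearch HTTP interface is reachable without authentication on the standard Elasticsearch port 9200, which this guide does not confirm for your deployment. The host name in the usage note and both function names are hypothetical.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def cat_indices_url(host="localhost", port=9200):
    """Build the _cat/indices?v URL. Host and port are assumptions."""
    return urlunsplit(("http", f"{host}:{port}", "/_cat/indices", "v", ""))


def fetch_cat_indices(host="localhost", port=9200, timeout=10.0):
    """Fetch the _cat/indices?v output as text for LogRhythm Support."""
    with urlopen(cat_indices_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, calling fetch_cat_indices("dx01") on a host that can reach a Data Indexer named dx01 returns the same table a browser would display, ready to save to a file and attach to a ticket.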
1102: LogRhythm Knowledge Base Update Error
1. Check the synchronization history for issues: in the Client Console, click Tools, click Knowledge, click Knowledge Base Manager, and then click View Synchronization History.
2. In the Knowledge Base Manager, attempt a manual download by clicking Check For Knowledge Base Updates.
3. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.

1139: LogRhythm Mediator Recycling Hung MPE Threads
1. Use LogRhythm to analyze and collect all information regarding the alarm, related events/logs, and surrounding logs from affected sources.
2. Check scmpe.log and lps_detail.log for policy processing issues, statistics, stalled threads, further issues with the same rule, and/or subsequent issues with MPE rules in the same policy (possibly indicating a change or update to a log source or log source type, logging operations).
3. Investigate associated MPE Rules. It is recommended that you contact LogRhythm Support for this step.
4. If the steps above do not provide a solution or if you require assistance, please contact LogRhythm Support.
Note: The MPE service now detects when individual rules continue to generate processing warnings and may disable rules repeatedly raising warnings if hung processes jeopardize the health of the system.
This capability increases the reliability of each Log Manager by allowing the MPE to more accurately identify and gracefully handle parsing rules that risk the health of the overall system.

1140: LogRhythm Scheduled Report Failure
Collect the lrjobmgr.log file, and then raise a ticket with LogRhythm Support.
1141: LogRhythm AD Sync Failure
1. Investigate events and logs that produced the alarm.
2. Ensure that the credentials used for AD Sync are still valid by performing an AD sync validation
test via the Active Directory Domain Manager on the Platform Manager tab of Deployment
Manager.
3. Determine if any recent network changes may have affected the communication between
LogRhythm and the AD server.
4. Collect the lrjobmgr.log file, and then raise a ticket with LogRhythm Support.
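
For alarm 1101 in the table above, the guide asks for the output of "_cat/indices?v" from a Web browser. Where a browser is not convenient, the same request can be scripted. The following is a minimal sketch, not a LogRhythm-supplied tool: it assumes the Data Indexer answers plain HTTP on port 9200, and the host name passed in is whatever your deployment uses. The helper names are illustrative only.

```python
from urllib.request import urlopen


def cat_indices_url(host, port=9200):
    """Build the URL a browser would request.

    Port 9200 and plain HTTP are assumptions; adjust for your deployment.
    """
    return f"http://{host}:{port}/_cat/indices?v"


def fetch_cat_indices(host, port=9200, timeout=10):
    """Return the raw text of _cat/indices?v, ready to attach to a ticket."""
    with urlopen(cat_indices_url(host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def non_green_lines(cat_text):
    """Return the data lines whose first column is not 'green'.

    Assumes the default column order, in which health is the first column
    and the first non-empty line is the header produced by '?v'.
    """
    lines = [ln for ln in cat_text.splitlines() if ln.strip()]
    return [ln for ln in lines[1:] if ln.split()[0] != "green"]
```

Calling fetch_cat_indices with your indexer host returns the text to send to LogRhythm Support unmodified; non_green_lines is only a convenience for spotting indices that are not reported as green before you raise the ticket.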


Appendix: Summary of Changes


This section summarizes the changes to existing alarm rules and new alarm rules that have been added in the
LogRhythm Diagnostics Module.

Renamed Alarm Rules


Old Rule Name | New Rule Name | ID
QsEMP : Mediator Database Capacity Error | LogRhythm Mediator Database Capacity Error | 96
QsEMP : Mediator Database Capacity Warning | LogRhythm Mediator Database Capacity Warning | 97
QsEMP : LogRhythm Agent Heartbeat Missed | LogRhythm Agent Heartbeat Missed | 98
QsEMP : LogRhythm Component Critical Condition | LogRhythm Component Critical Condition | 99
QsEMP : LogRhythm Component Successive Errors | LogRhythm Component Successive Errors | 100
QsEMP : LogRhythm Component Excessive Warnings | LogRhythm Component Excessive Warnings | 101
QsEMP : LogRhythm Mediator Heartbeat Missed | LogRhythm Mediator Heartbeat Missed | 102
QsEMP : LogRhythm MPE Rule Disabled | LogRhythm MPE Rule Disabled | 103
QsEMP : LogRhythm Silent Log Source Error | LogRhythm Silent Log Source Error | 104
QsEMP : LogRhythm Database Maintenance Failure | LogRhythm Database Maintenance Failure | 210
QsEMP : LogRhythm Failed To Submit Batch Job To DB | LogRhythm Failed To Submit Batch Job To DB | 212
QsEMP : Excessive Unprocessed Logs Spooled to Disk | LogRhythm Excessive Unprocessed Logs Spooled to Disk | 230
QsEMP : Excessive Processed Logs Spooled to Disk | LogRhythm Excessive Processed Logs Spooled to Disk | 231
QsEMP : Excessive Events Spooled to Disk | LogRhythm Excessive Events Spooled to Disk | 232
QsEMP : Perfmon Counter Reached Threshold Limit | Perfmon Counter Reached Threshold Limit | 233
QsEMP : LogRhythm GLPR Error | LogRhythm GLPR Error | 408
QsEMP : LogRhythm AI Engine Heartbeat Missed | LogRhythm AI Engine Heartbeat Missed | 676
QsEMP : LogRhythm AI Comm Manager Heartbeat Missed | LogRhythm AI Comm Manager Heartbeat Missed | 677
QsEMP : LogRhythm CMDB Database Warning | LogRhythm CMDB Database Warning | 947
QsEMP : LogRhythm CMDB Stats Warning | LogRhythm CMDB Stats Warning | 948
QsEMP : LogRhythm CMDB Database Error | LogRhythm CMDB Database Error | 949
QsEMP : LogRhythm Agent Cannot Update | LogRhythm Agent Cannot Update | 1002
QsEMP : LogRhythm Agent Needs Reboot | LogRhythm Agent Needs Reboot | 1003
QsEMP : LogRhythm Network Monitor Heartbeat Missed | LogRhythm Network Monitor Heartbeat Missed | 1084
QsEMP : LogRhythm Data Indexer Stopped | LogRhythm Data Indexer Stopped | 1093
QsEMP : LogRhythm Data Indexer Configuration Fail | LogRhythm Data Indexer Configuration Fail | 1094
QsEMP : LogRhythm Data Indexer Suspend | LogRhythm Data Indexer Suspend | 1095
QsEMP : LogRhythm Data Indexer EMDB Sync Fail | LogRhythm Data Indexer EMDB Sync Fail | 1096
QsEMP : LogRhythm Data Indexer Disk Limit Exceeded | LogRhythm Data Indexer Disk Limit Exceeded | 1097

QsEMP : LogRhythm Data Indexer Max Index Exceeded | LogRhythm Data Indexer Max Index Exceeded | 1098
QsEMP : LogRhythm Data Indexer List Not Found | LogRhythm Data Indexer List Not Found | 1099
QsEMP : LogRhythm Data Indexer Repo Not Found | LogRhythm Data Indexer Repo Not Found | 1100
QsEMP : LogRhythm Data Indexer Cluster Health | LogRhythm Data Indexer Cluster Health | 1101
QsEMP : LogRhythm Knowledge Base Update Error | LogRhythm Knowledge Base Update Error | 1102
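
The renames listed above follow a regular pattern: the "QsEMP : " prefix is dropped, and "LogRhythm " is added where the old name did not already carry it, with the Perfmon rule (ID 233) as the one exception shown in the table. If you maintain scripts or notes that refer to the old names, the pattern can be sketched as below. This is an illustration derived only from the table; the function and constant names are hypothetical and are not part of the product.

```python
OLD_PREFIX = "QsEMP : "
NEW_PREFIX = "LogRhythm "

# Rule that the table lists without the LogRhythm prefix (ID 233).
UNPREFIXED = {"Perfmon Counter Reached Threshold Limit"}


def new_rule_name(old_name):
    """Map an old alarm rule name to its new name, following the table."""
    if not old_name.startswith(OLD_PREFIX):
        # Not an old-style name (for example, a rule new in this module).
        return old_name
    base = old_name[len(OLD_PREFIX):]
    if base in UNPREFIXED or base.startswith(NEW_PREFIX):
        return base
    return NEW_PREFIX + base
```

The rule ID does not change with the rename, so where an ID is available it remains the more reliable key for matching old and new rules.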

New Alarm Rules


The following new alarm rules are included in the LogRhythm Diagnostics Module:

Alarm Rule | ID
LogRhythm Mediator Recycling - Hung MPE Threads | 1139
LogRhythm Scheduled Report Failure | 1140
LogRhythm AD Sync Failure | 1141
