
HSS9860 Maintenance and Troubleshooting (LTE)
www.huawei.com

Copyright © 2012 Huawei Technologies Co., Ltd. All rights reserved.


References
 HSS9860 Product Documentation

Objectives
 Upon completion of this course, you will be able to:

 Describe the symptoms of common HSS9860 LTE service faults
 Perform common maintenance operations for HSS9860 service faults
 Locate and troubleshoot common LTE service faults on the HSS9860

Contents
1. HSS9860 Maintenance and Troubleshooting Flow

2. HSS9860 Fault Information Collection Methods

3. HSS9860 Common LTE Fault Troubleshooting

4. HSS9860 Fault Troubleshooting Cases

Troubleshooting Flow
 The troubleshooting flow is:
 Information Collection
 Fault Classification
 Fault Location
 Fault Removal

[Figure: General flow of troubleshooting: Information Collection, Fault Classification, Fault Location, Fault Removal]


Troubleshooting Flow(Cont.)
 Information to be collected
 Specific fault symptoms (such as subscriber perception and system prompts)
 Time, place, and frequency of the fault
 Scope and impact of the fault
 Equipment running status before the fault occurs
 Operations performed on the equipment before the fault occurs and the results of the operations
 Measures that are taken after the fault occurs and the results after the measures are taken
 Equipment alarms when the fault occurs and the relevant or associated alarms

 Information collection means
 Fault report from the subscribers or customer center
 Fault report from the maintenance personnel in the neighboring office
 Alarm report from the alarm system
 Abnormalities found in daily maintenance or inspection


Troubleshooting Flow(Cont.)
 Determining the fault scope
 Correctly determining the fault scope sets the troubleshooting direction. It is the most important element for quickly removing the fault.
 In terms of fault symptoms and impacts, faults can be classified into two types: service faults and non-service faults. Each type of fault can be further classified.

 Determining the fault type
 Service fault: a fault that directly affects the service, for example, failure to access the network.
 Non-service fault: a fault that indirectly affects the service, for example, the disk array fault and OMU cluster fault.

Troubleshooting Flow(Cont.)
Original Information Analysis
 After the fault occurs, collect the related information (such as fault time, symptom, and place) through various means. Determine the scope and type of the fault.
 Analyze the information such as complaining subscriber numbers and IP addresses.

Check of Logs & Other NEs
 Check the operations before the fault occurs. Ask the on-site person and query the operation logs. Determine whether the fault is caused by incorrect operations.
 Know about the running state (such as version upgrade and failure) of the neighboring NEs before the fault occurs. Check whether there is some special event (such as an important holiday) that affects the network.

Alarm Analysis
 Try to remove the fault based on the suggestions in the alarm information.
 When multiple alarms are reported on the local alarm console, check the high-level alarms first based on the alarm level. The event alarms can be handled last.

Performance Measurement Information Analysis
 Check the measurement success rate. Compare the rate with that in the same time segment of the recent several days. Check whether there is obvious fluctuation and analyze the failure cause (through comparison of all modules).
 Based on the failure cause, check whether the local data configuration was modified recently, and whether the IP addresses of the neighboring NEs were modified recently.
 The performance measurement is suitable for locating the service faults and signaling faults.

Interface Tracing Analysis
 Perform message tracing on the services of the complaining subscriber.
 Analyze the message streams, trace the abnormal interruption points in the message streams, and compare the abnormal message stream with the message streams in normal cases. Check the data configuration of the subscriber and determine the fault cause.

Data Configuration Check
 Service configuration: including EPS service registration
 Signaling configuration: including Diameter link configuration
 DS configuration: including the verification switch, active/standby work mode, and IP addresses
 PGW configuration: including the redundancy in service layer, parallel/serial mode, and buffer length.
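
The success-rate comparison described under Performance Measurement Information Analysis can be sketched as follows. This is a minimal illustration, not an HSS9860 tool: the counter values, the function names, and the 5% tolerance are all assumptions chosen for the example.

```python
from statistics import mean


def success_rate(successes: int, attempts: int) -> float:
    """Return the success rate as a fraction; 0.0 when there were no attempts."""
    return successes / attempts if attempts else 0.0


def rate_dropped(today: float, previous_days: list, tolerance: float = 0.05) -> bool:
    """Flag a drop when today's rate is more than `tolerance` below the recent average."""
    if not previous_days:
        return False  # nothing to compare against
    baseline = mean(previous_days)
    return (baseline - today) > tolerance


# Hypothetical counters for the same time segment on three earlier days and today.
history = [success_rate(980, 1000), success_rate(975, 1000), success_rate(990, 1000)]
today = success_rate(800, 1000)
print(rate_dropped(today, history))
```

A drop flagged this way only says where to look; the failure cause still has to be analyzed per module as the slide describes.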


Troubleshooting Flow(Cont.)

 Methods for removing the fault:
 Modify the service global configuration
 Modify the service template configuration
 Modify the signaling configuration
 Modify the function configuration
 Isolate the board
 Replace the board
 Reset the board
 Switch over the module

Contents
2. HSS9860 Fault Information Collection Methods
2.1 Collecting the Alarm Information

2.2 Collecting the Operating System Log Information

2.3 Collecting the Board Hardware Information

2.4 Collecting the Performance Measurement Information

2.5 Collecting the Message Tracing Information

2.6 Using the NIC Tool

Querying Alarm Logs
 Querying alarm logs helps you identify the cause of a fault and rectify it.
1. Choose Alarm > Query Alarm Log from the menu bar.
2. Set the filter parameters in the displayed dialog box.
3. Click OK. The Query Alarm Log window is displayed.

Querying Alarm Logs(Cont.)
 You can also query alarm logs by running LST ALMLOG in the
MML Command - CGP window.
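
Once alarm records have been collected, the rule from the alarm analysis step (handle high-level alarms first, event alarms last) can be sketched as a sort. The record fields (`name`, `type`, `level`) and the level names below are assumptions for illustration; they are not the output format of LST ALMLOG.

```python
# Assumed severity names, most urgent first.
SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2, "Warning": 3}


def triage(alarms):
    """Order fault alarms by severity and place event alarms after them."""
    def key(alarm):
        is_event = alarm["type"] == "event"
        rank = SEVERITY_ORDER.get(alarm["level"], len(SEVERITY_ORDER))
        return (is_event, rank)
    return sorted(alarms, key=key)


# Hypothetical alarm records.
alarms = [
    {"name": "A", "type": "event", "level": "Critical"},
    {"name": "B", "type": "fault", "level": "Minor"},
    {"name": "C", "type": "fault", "level": "Critical"},
    {"name": "D", "type": "fault", "level": "Major"},
]
print([a["name"] for a in triage(alarms)])
```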

Saving Alarms
 Save the alarm information to the specified files for query.
1. Right-click in the alarm display pane. A shortcut menu is displayed.
2. Choose Save as. The Save dialog box is displayed.
3. Select the contents to be saved in Columns, the range of alarms to be saved in Save Rows, and a directory and file name in File name.
4. Click Save.

Collecting the Operating System Log Information
 The operating system logs are stored in the files named in the format of
messages*, boot.*, and mail.*. The log files are located in the /var/log directory on
each board.

 You can download the operating system log files from the OMU server through the
FTPS function of the OMU client.
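
After the files are downloaded, the naming formats above can be used to pick out the operating system logs from a directory listing. The following is a small sketch using Python's standard `fnmatch` module; the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Naming formats of the operating system log files, as listed above.
LOG_PATTERNS = ("messages*", "boot.*", "mail.*")


def is_os_log(filename: str) -> bool:
    """Return True when the file name matches one of the log naming formats."""
    return any(fnmatchcase(filename, pattern) for pattern in LOG_PATTERNS)


def select_os_logs(filenames):
    """Return the matching file names in sorted order."""
    return sorted(name for name in filenames if is_os_log(name))


# Hypothetical listing of a downloaded copy of /var/log.
listing = ["messages", "messages-20120101", "boot.log", "mail.err", "syslog", "bootlog"]
print(select_os_logs(listing))
```

The same helper could be applied to the result of `os.listdir()` on the local directory that holds the downloaded files.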

Collecting the Operating System Log Information (Cont.)
 You can log in to the board through KVM over IP to view the logs.

Collecting the Board Hardware Information
 To collect the hardware information of the boards, open the MML Command - CGP window and run the DSP BRD command.

Collecting the Performance Measurement Information
 Performance measurement collects the running information of the system in real time. The performance measurement information reflects the running status of the system. It can be used for fault identification when the system experiences a fault.

 You can export the performance measurement analysis result using either of the following methods:
 Exporting Measurement Results on a Real-Time Basis
 Exporting Measurement Results on a Scheduled Basis

Exporting Measurement Results on a Real-Time Basis
 You can set export conditions to export measurement results in real time and then
save the measurement results to a local terminal.

Exporting Measurement Results on a Scheduled Basis
 You can set exporting conditions to export measurement results according to the
schedule and then save the measurement results to a local terminal.

Collecting the Message Tracing Information
 Message tracing provides dynamic, real-time monitoring of the call connection process, resource usage, and service flow over ports and signaling links. The traced messages can be saved for later viewing.

 You can also use traced messages to locate a call connection failure quickly and troubleshoot the fault. In addition, the traced messages help you learn about the signaling exchange between NEs.

Using the NIC Tool
 You can use the Network Information
Collection (NIC) tool to collect required
information.
 When the health check is conducted by using the network management system, the HSS9860 can be deployed as an SAE-HSS, a GU-HLR, or an HSS9860. Therefore, install the adaptation package for the related NE before the health check.
 When the health check is conducted by using the VTS tool, the HSS9860 can be deployed only as the HSS9860. Therefore, install the HSS9860 adaptation package before the health check.
Contents
3. HSS9860 Common LTE Fault Troubleshooting and
Troubleshooting Cases
3.1 HSS9860 Common LTE Fault Troubleshooting

3.2 HSS9860 Troubleshooting Cases

Attach Procedure
 Function of the attach procedure:
 The UE registers with the EPS network.
 The always-on IP connectivity for UE/users of the EPS is enabled by establishing a default EPS bearer during Network Attachment.
 The MM context and EPS bearer context will be created in the MME and UE. The EPS bearer context will be created in S-GW and P-GW.

Attach Procedure(Cont.)

Network elements: UE, eNodeB, New MME, Old MME, HSS, S-GW, P-GW
(Slide callout: messages involved with HSS in the EPS Attach procedure)

1. Attach request
2. Identification req/rsp
3. Identity req/rsp
4. Security function
5. Update location request
6. Cancel location /Ack
7. Update location Ack
8. Create session request
9. Create session request
10. Create session response
11. Create session response

Attach Procedure(Cont.)

eNodeB New MME Old MME HSS S-GW P-GW

12. Initial Context Setup Request / Attach Accept

13. RRC Connection Reconfiguration / Complete

14. Initial Context Setup Response


15. Direct Transfer

16. Attach complete


17. Modify Bearer Request / Response

Uplink and downlink data

TAU Procedure

[Figure: TAU scenarios across S-GW1 and S-GW2, MME1 to MME3, and TA lists 1 to 4: Periodic TAU; Intra MME TAU; Inter MME TAU with SGW change; Inter MME TAU without SGW change]

Inter TAU with SGW change

Network elements: eNodeB, New MME, Old MME, HSS, New S-GW, Old S-GW, P-GW
(Slide callout: messages involved with HSS in the EPS TAU procedure)

1. TAU request
2. Context request
3. Context response
4. Security function
5. Context Ack
6. Create Session request
7. Modify bearer request
8. Modify bearer response
9. Create Session response

Inter TAU with SGW change(Cont.)

Network elements: eNodeB, New MME, Old MME, HSS, New S-GW, Old S-GW, P-GW
(Slide callout: messages involved with HSS in the EPS TAU procedure)

10. Update location request
11. Cancel location
12. Cancel location Ack
13. Update location Ack
14. Delete bearer request
15. Delete bearer response
16. TAU accept
17. TAU complete

Inter TAU without SGW change

Network elements: UE, eNodeB, New MME, Old MME, HSS, S-GW, P-GW
(Slide callout: messages involved with HSS in the EPS TAU procedure)

1. TAU request
2. Context request
3. Context response
4. Security function
5. Context Ack
6. Modify bearer request
7. Modify bearer request
8. Modify bearer response
9. Modify bearer response

Inter TAU without SGW change(Cont.)

Network elements: UE, eNodeB, New MME, Old MME, HSS, S-GW, P-GW
(Slide callout: messages involved with HSS in the EPS TAU procedure)

10. Update location request
11. Cancel location
12. Cancel location Ack
13. Update location Ack
14. TAU accept
15. TAU complete

EPS Attach/TAU Fault (unknownEpsSubscriber)
 Symptom:
In the EPS attach/TAU procedure, the HSS sends the MME a ULA message that contains the failure cause unknownEpsSubscriber.

 Fault Analysis (possible causes):
 The subscriber is not an EPS subscriber.
 The subscriber has the LOCK service.
 The subscriber has not registered EPSAPN.
 The subscriber has registered the ODB BAPOS service, and the value of the parameter ODBPOS_REJ_ULR is REJECT in MAPSERV.

EPS Attach/TAU Fault (unknownEpsSubscriber)
 Procedure: EPS location update
 Symptom: In the EPS location update procedure, HSS sends the message ULA which contains the failure cause unknownEpsSubscriber.
 Fault Analysis:
 The subscriber is not EPS subscriber.
 The subscriber has LOCK service.
 The subscriber doesn’t register EPSAPN.
 The subscriber registers the ODB BAPOS service, and the value of the parameter ODBPOS_REJ_ULR is REJECT in MAPSERV.
 Fault Diagnosis: LST OPTGPRS, LST APNTPL

EPS Attach/TAU Fault (roamingNotAllowed)
 Symptom:
In the EPS attach/TAU procedure, HSS sends the message ULA which contains the failure cause roamingNotAllowed to MME.

 Fault Analysis (possible causes):
 PLMN roaming restriction
 Diameter serving node roaming restriction
 EPS dynamic license not allowed

EPS Location Update Fault (roamingNotAllowed)

 PLMN roaming restriction?
 Run LST PLMNRSZI to check the PLMN template to which the subscriber is subscribed. Analyze the fault according to the configuration.
 Diameter serving node roaming restriction?
 Run LST DIAMRRS to check DIAMRRTPL. Analyze the fault according to the configuration.
 EPS dynamic license not allowed?
 Check whether a license alarm has been generated. Run DSP LICRATE to query the actual number of EPS dynamic subscribers on the OMU BE.
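
The mapping from ULA failure cause to diagnosis commands given on these slides can be kept as a small lookup table. This is only a sketch for organizing the checks; the dictionary and function names are illustrative, and the commands are the ones named on the slides.

```python
# Failure cause in the ULA -> MML commands the slides suggest for diagnosis.
DIAGNOSIS_COMMANDS = {
    "unknownEpsSubscriber": ["LST OPTGPRS", "LST APNTPL"],
    "roamingNotAllowed": ["LST PLMNRSZI", "LST DIAMRRS", "DSP LICRATE"],
}


def commands_for(cause: str) -> list:
    """Return the diagnosis commands for a failure cause, or an empty list if unknown."""
    return DIAGNOSIS_COMMANDS.get(cause, [])


print(commands_for("roamingNotAllowed"))
```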

Attach failure Case 1:
 Problem Description:
 The Diameter links between the HSS and the MME are all normal.
 In the HSS, subscriber tracing shows no messages, and the attach procedure fails.
 Diameter link tracing shows that the HSS returns error code 3002 to the MME.

Attach failure Case 1:
 Possible reasons:
 The subscription data in HSS is incorrect

 The link between HSS and MME is abnormal

Attach failure Case 1:
 Handling Process:
 Check the Diameter link status: normal.
 Check the subscription data of the subscriber: normal.
 Check the Diameter link messages: the AIA message sent to the MME carries the error code DIAMETER_UNABLE_TO_DELIVER (3002).

Attach failure Case 1:
 Handling Process:
 According to RFC3588:
 DIAMETER_UNABLE_TO_DELIVER 3002

This error is given when Diameter can not deliver the message to
the destination, either because no host within the realm supporting
the required application was available to process the request, or
because Destination-Host AVP was given without the associated
Destination-Realm AVP.

 That means the HSS cannot process the message from the MME because the Destination-Host in the message is inconsistent with the host name configured on the HSS.
Attach failure Case 1:
 Handling Process:
 Check the Destination-Host in the MME request message:

 Check the Host name in HSS:


– %%LST DMLE:;%%
– RETCODE = 0 Operation succeeded
– The result is as follows
– ------------------------
– Entity name = HSS
– Local device type = HSS9860
– Host name = hss01.gz.gd.node.epc.mnc001.mcc460.3gppnetwork.org
– Realm name = epc.mnc001.mcc460.3gppnetwork.org
– Diameter local entity ID = 0
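
The comparison in this step can be sketched as a small script that reads the `name = value` lines of the LST DMLE result and checks the configured host name against the Destination-Host seen in the MME request. The parsing helper is an illustration only, the Destination-Host value is hypothetical, and the case-insensitive comparison is an assumption.

```python
def parse_mml_result(text: str) -> dict:
    """Collect 'name = value' pairs from an MML result listing."""
    fields = {}
    for raw in text.splitlines():
        # Drop list markers such as leading dashes and surrounding spaces.
        line = raw.strip().lstrip("\u2013- ").strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def host_matches(destination_host: str, fields: dict) -> bool:
    """Compare the Destination-Host from the request with the configured host name."""
    return destination_host.strip().lower() == fields.get("Host name", "").lower()


RESULT = """
Entity name = HSS
Local device type = HSS9860
Host name = hss01.gz.gd.node.epc.mnc001.mcc460.3gppnetwork.org
Realm name = epc.mnc001.mcc460.3gppnetwork.org
Diameter local entity ID = 0
"""

fields = parse_mml_result(RESULT)
# Hypothetical Destination-Host value taken from an MME request.
print(host_matches("hss02.gz.gd.node.epc.mnc001.mcc460.3gppnetwork.org", fields))
```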

Attach failure Case 1:
 Solution:
 Modify the host name in the HSS so that it is consistent with the Destination-Host sent by the MME. The attach procedure then succeeds.

Attach failure Case 2:
 Problem Description:
 In the new LTE network, some test subscribers are defined.
 When the Maximum bandwidth is set to 15,000,000, the subscriber can attach to the network normally.
 When the Maximum bandwidth is changed to 2,000,000,000, the attach procedure fails.

Attach failure Case 2:
 Handling Process:
 The problem occurred after the Maximum bandwidth was modified in the new EPSQOSTPL, so the Maximum bandwidth may be the cause.
 Check the tracing messages in the HSS: the Authentication and Attach procedures are all successful.
 However, in the messages between the USN and the UGW, after the UGW received SM_MM_CTRL_CREATE_DEFAULT_BEARER_REQ, it refused to create the bearer with the error code no-resource-available.

Attach failure Case 2:
 Handling Process:
 There are two possible reasons for the UGW to refuse to create the bearer with the error code no-resource-available:
 No available IP address;
 No available bandwidth.
 Since the UGW has sufficient IP addresses, the cause must be the bandwidth.

Attach failure Case 2:
 Solution:
 Run MOD EPSQOSTPL to change the Maximum bandwidth to 200,000,000.
 The subscriber can then attach normally.

Attach failure Case 3:
 Problem Description:
 During the test of the new HSS in one LTE network, subscribers fail to attach.
 The tracing messages of the HSS show:
error-diagnostic: no-gprs-data-subscribed (1).

Attach failure Case 3:
 Handling Process:
 Run LST DYNSUBDATA to check the dynamic information of the subscriber: no subscriber data is found.
 Run LST APNTPL to check the APN template: it is correct.
 Run LST OPTGPRS to check the optimized GPRS/EPS data: APN_TYPE has the default value PS_APN. For the 4G network, APN_TYPE should be EPS_APN.

Attach failure Case 3:
 Solution:
 Run MOD OPTGPRS to change the optimized GPRS/EPS data and set the APN type to EPS_APN. The subscriber can then attach normally.
 MOD OPTGPRS: IMSI="460018888888888",
PROV=ADDPDPCNTX, APN_TYPE=EPS_APN, APNTPLID=1,
DEFAULTCFGFLAG=TRUE, EPS_QOSTPLID=1,
PDPTYPE=IPV4, ADDIND=DYNAMIC, VPLMN=TRUE,
CHARGE=NORMAL;
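
When many commands of this kind are prepared in a script, the APN_TYPE value can be checked before the commands are run. The sketch below splits an MML command into its name and parameters; the helper is hypothetical, and the naive comma split assumes that no parameter value contains a comma.

```python
def parse_mml_command(command: str):
    """Split 'NAME: K1=V1, K2=V2;' into the command name and a parameter dict."""
    name, _, params = command.strip().rstrip(";").partition(":")
    parsed = {}
    for item in params.split(","):
        if "=" in item:
            key, _, value = item.partition("=")
            parsed[key.strip()] = value.strip().strip('"')
    return name.strip(), parsed


COMMAND = (
    'MOD OPTGPRS: IMSI="460018888888888", PROV=ADDPDPCNTX, APN_TYPE=EPS_APN, '
    "APNTPLID=1, DEFAULTCFGFLAG=TRUE, EPS_QOSTPLID=1, PDPTYPE=IPV4, "
    "ADDIND=DYNAMIC, VPLMN=TRUE, CHARGE=NORMAL;"
)

name, params = parse_mml_command(COMMAND)
# For the 4G network in this case, APN_TYPE is expected to be EPS_APN.
print(name, params["APN_TYPE"] == "EPS_APN")
```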

Case 4: No IDR or DSR When Modifying PLMNRSZI
 Problem Description:
 During the test of the new HSS in one LTE network, the HSS does not send DSR or IDR to the MME when MOD PLMNRSZI is run.

Case 4: No IDR or DSR When Modifying PLMNRSZI
 Handling Process:
 Check the link status between the HSS and the MME: it is normal.
 Check the status of the subscriber:
 The subscription data is correct and includes the PLMN Roaming service;
 The subscriber is attached in the MME normally.

Case 4: No IDR or DSR When Modifying PLMNRSZI
 Handling Process:
 Run MOD PLMNRSZI:IMSI="XX",PROV=FALSE; to delete the PLMN Roaming service. The HSS does not send DSR to the MME, although LST PLMNRSZI shows that the service has been deleted in the HSS.
 Check the PLMN template: ADD PLMNTPL: HLRSN=1, TPLID=1, DEFAULTRULE=PERMISSION, MCC="460", MNC="11", ISLOCAL=TRUE, RULE=PERMISSION; No ZC code is set.
 If there is no ZC code in the PLMN template, the HSS does not send IDR/DSR to the MME.
Case 4: No IDR or DSR When Modifying PLMNRSZI
 Solution:
 Run ADD PLMNTPL in the HSS and set ZC code=0001;
 Run ADD ZC: ZC=0001 in the MME to set the mapping between TAC and ZC;
 Run MOD PLMNRSZI in the HSS again. The HSS now sends DSR or IDR to the MME normally.

Summary
 This course introduces:
 The common maintenance and troubleshooting flow of the HSS9860
 The related fault information collection methods
 The troubleshooting of HSS9860 LTE service faults, illustrated by cases

Thank you
www.huawei.com