RAN10 KPI Troubleshooting Guide

INTERNAL

BOM Code

Product Name

RAN10 KPI Troubleshooting Guide

Intended audience

Product Version

V100R010

Document Version

V1.0

Department

UMTS Maintenance and Development Department

RAN10 KPI Troubleshooting Guide


Prepared by

KPI Team of the UMTS Maintenance Department

Date

Reviewed by

Date

Reviewed by

Date

Approved by

Date

2009-3-6

Huawei Technologies Co., Ltd.


All rights reserved
2009-03-06

2016-12-19

Huawei Confidential

Page 1 of 131

Revision History

Version | Date | Description | Author
V1.0 | 2009-1-20 | The draft was completed. | Shen Yueping, Qian Jin, and Wu Yanwen
V1.0 | 2009-3-6 | The document was revised. | Shen Yueping, Qian Jin, and Wu Yanwen

Contents
1 Analysis Methodology of KPI-Related Problems
1.1 Problem Discussion
1.2 Narrowing the Scope
1.3 Locking the Scenario
1.4 Drive Test On Site
1.5 Reproducing the Mirroring Environment
1.6 Problem Analysis and Summary

2 RRC Access Success Rate (Service/Non-Service)
2.1 KPI Definition
2.2 Influence Factors
2.3 Analysis Process
2.4 List of Problem Information

3 RAB Access Success Rate (AMR/PS/VP/HSPA)
3.1 KPI Definition
3.2 Influence Factors
3.3 Analysis Process
3.4 List of Problem Information

4 Handover Success Rate (SHO/HHO)
4.1 Problems Related to Soft Handover Success Rate
4.1.1 KPI Definition
4.1.2 Influence Factors
4.1.3 Analysis Process
4.1.4 Cases of Soft Handover Failure
4.2 Problems Related to Hard Handover Success Rate
4.2.1 KPI Definition
4.2.2 Influence Factors
4.2.3 Analysis Process
4.2.4 Cases of Inter-Frequency Hard Handover Failure
4.3 List of Problem Information

5 Problems Related to Call Drop (AMR/PS/VP/HSPA)
5.1 KPI Definition
5.2 Influence Factors
5.3 Analysis Process
5.4 Cases of Call Drop
5.5 List of Problem Information

6 Inter-RAT Interoperability
6.1 Inter-RAT Handover from WCDMA to GSM (CS Domain)
6.1.1 KPI Definition
6.1.2 Influence Factors
6.1.3 Analysis Process
6.2 Inter-RAT Handover from GSM to WCDMA (CS Domain)
6.2.1 KPI Definition
6.2.2 Influence Factors
6.2.3 Analysis Process
6.3 Inter-RAT Handover from WCDMA to GPRS (PS Domain)
6.3.1 KPI Definition
6.3.2 Influence Factors
6.3.3 Analysis Process
6.4 Inter-RAT Handover from GPRS to WCDMA (PS Domain)
6.4.1 KPI Definition
6.4.2 Analysis Process
6.5 List of Problem Information

7 Information Collection
7.1 Performance Data of RNC
7.1.1 Purpose
7.1.2 Information to Be Collected
7.1.3 Method
7.2 RNC CHR/PCHR
7.2.1 Purpose
7.2.2 Information to Be Collected
7.2.3 Method
7.3 RNC IOS Tracing
7.3.1 Purpose
7.3.2 Information to Be Collected
7.3.3 Method
7.4 RNC IFTS/CDT (User Plane) Tracing
7.4.1 Purpose
7.4.2 Information to Be Collected
7.4.3 Method
7.5 Standard Signaling Tracing on the RNC
7.5.1 Purpose
7.5.2 Information to Be Collected
7.5.3 Method
7.6 UE QXDM LOG
7.6.1 Purpose
7.6.2 Information to Be Collected
7.6.3 Method
7.7 Real-Time Performance Monitoring of RNC
7.7.1 Purpose
7.7.2 Information to Be Collected
7.7.3 Method
7.8 RNC Script Configuration
7.8.1 Purpose
7.8.2 Information to Be Collected
7.8.3 Method
7.9 Operation Log of RNC
7.9.1 Purpose
7.9.2 Information to Be Collected
7.9.3 Method
7.10 Alarm Information on RNC
7.10.1 Purpose
7.10.2 Information to Be Collected
7.10.3 Method
7.11 Node B Configuration Script
7.11.1 Purpose
7.11.2 Information to Be Collected
7.11.3 Method
7.12 Node B CHR
7.12.1 Purpose
7.12.2 Information to Be Collected
7.12.3 Method
7.13 Node B Alarm
7.13.1 Purpose
7.13.2 Information to Be Collected
7.13.3 Method
7.14 Node B CDT
7.14.1 Purpose
7.14.2 Information to Be Collected
7.14.3 Method
7.15 Checking Whether Any Neighboring Cells Are Not Configured
7.15.1 Enabling Call Trace for Missing Neighboring Cell Detection Tracing
7.15.2 Stopping the MNCDT
7.15.3 Reporting the Missing Neighboring Cell Message
7.16 Soft Failure of DSP
7.17 Terminal Troubleshooting

Figures
Figure 1 Impact of the PS service upon the RTWP when some neighboring cells are not configured
Figure 2 Impact of the CS service upon the RTWP when the neighboring cell is not configured
Figure 3 Satellite map of the BTS
Figure 4 Traced RTWP waveform
Figure 5 Change in the number of subscribers of the cell
Figure 6 Signal quality of neighboring cells
Figure 7 Impact of the burst of a large number of RRC connection requests upon the RTWP
Figure 8 Cell audit message
Figure 9 Cell signal quality
Figure 10 Measurement report
Figure 11 CIO offset parameter
Figure 12 Relation between RSCP fading and Ec/N0 fading
Figure 13 Comparison of handover parameters
Figure 14 Flow on CS inter-RAT handover out of 3G
Figure 15 Relocation Required message
Figure 16 Relocation Command message
Figure 17 Handover Request ACK message
Figure 18 Flow on CS handover-in
Figure 19 Signaling of CS inter-RAT handover-in
Figure 20 Relocation_Request message
Figure 21 Flow on PS inter-RAT handover out of
Figure 22 Flow on LAU/RAU after the UE accesses the 2G cell
Figure 23 Querying the workarea of the BAM
Figure 24 Exporting the CHR log (by running the COL LOG command)
Figure 25 Types of objects to be traced
Figure 26 IOS Tracing dialog box
Figure 27 MoreInfo dialog box
Figure 28 Type of trace object
Figure 29 Configuration page of CDT parameters
Figure 30 Configuration page of IFTS parameters
Figure 31 Configuration page of user-plane tracing
Figure 32 Configuration page of performance monitoring
Figure 33 Uu interface tracing
Figure 34 Iub interface tracing
Figure 35 Iur interface tracing
Figure 36 Querying the DSP code of the CN
Figure 37 Configuring the QPST port
Figure 38 Connecting the equipment ports
Figure 39 Enabling log tracing
Figure 40 Real-time performance monitoring
Figure 41 RNC script configuration
Figure 42 Exporting the operation log by running the EXP LOG command
Figure 43 Alarm box of the LMT
Figure 44 Exporting the alarms
Figure 45 Exporting the NodeB configuration file through the M2000
Figure 46 Data Config File Transfer
Figure 47 FTP upload
Figure 48 Setting the CHR level of the NodeB
Figure 49 NodeB CHR reporting switch
Figure 50 Querying the alarm information
Figure 51 Saving the alarm information
Figure 52 Alarm box of the NodeB LMT
Figure 53 Modifying the properties of the monitor items of the NodeB CDT
Figure 54 Enabling CDT tracing of the NodeB cells
Figure 55 Basic setting
Figure 56 Setting other monitor items
Figure 57 Enabling call trace to check whether any neighboring cells are not configured
Figure 58 Configuration interface of intra-frequency MNCDT
Figure 59 MNCDT window
Figure 60 Intra-frequency measurement control after the intra-frequency MNCDT is enabled
Figure 61 Configuration interface of inter-frequency MNCDT
Figure 62 Configuration interface of inter-RAT MNCDT
Figure 63 Message tracing for the missing intra-frequency neighboring cells
Figure 64 Reported message about the missing intra-frequency neighboring cells
Figure 65 Message tracing for the missing inter-frequency neighboring cells
Figure 66 Reported message about the missing inter-frequency neighboring cells
Figure 67 Message about the missing inter-RAT neighboring cell
Figure 68 Analyzing the soft failure of the DSP through the CHR log
Figure 69 Resetting the DSP
Figure 70 Analyzing the special UEID through the CHR log

Tables
Table 1 Indicators related to RRC setup failure
Table 2 Cell traffic count; analysis of power congestion
Table 3 Number of top congested cells
Table 4 Indicator of cell CE congestion
Table 5 Number of CEs consumed by the DCH service
Table 6 Number of CEs consumed by the HSUPA service
Table 7 Analysis of cell code congestion indicators
Table 8 Analysis of transmission congestion indicators
Table 9 Indicators of CS RAB setup failure
Table 10 Indicators of PS RAB setup failure
Table 11 Indicators of PS RB setup failure
Table 12 Flow on RB setup failure because of invalid configuration
Table 13 Models of the known UEs that have invalid configuration
Table 14 Indicators related to soft handover failure
Table 15 Indicators related to inter-frequency hard handover failure
Table 16 Inter-frequency handover failure
Table 17 CS call drop rate
Table 18 Requirements for the EcIo and Ec threshold
Table 19 Requirements of IP-based networking for the transmission quality
Table 20 Indicators related to CS call drop
Table 21 Indicators related to PS call drop
Table 22 Indicators related to CS inter-RAT handover-out failure
Table 23 Indicators related to CS inter-RAT handover-in failure
Table 24 Indicators related to PS inter-RAT handover-out failure


This document describes troubleshooting methods for KPI-related problems in commercial
WCDMA networks and serves as a reference for network maintenance personnel.

1 Analysis Methodology of KPI-Related Problems

1.1 Problem Discussion

If the customer, network planning personnel, or customer service personnel report
KPI-related problems, collect the related information and make sure that you understand
the problems and the needs of the field personnel.

It is important to know the background of the problems, especially for KPI-related
problems that occur in commercial networks. Collect more information about the
problems by phone and by email. First, ascertain the urgency and importance of the
problems so that you can decide on appropriate measures.

Determine whether the problems are known problems according to the collected
information (for example, problem description and version).

For KPI-related problems, obtain the version information first. Some problems are
known problems or are related to known bugs that have already been analyzed and
solved.

Therefore, first obtain the bug information or release notes of the version to rule out
the impact of known problems on the KPIs.

According to the time at which the KPIs changed, determine whether the problems are
caused by network operations or parameter modifications, and pay particular attention
to the impact of network adjustments on the KPIs.

1.2 Narrowing the Scope


After you understand the problems clearly, analyze the overall KPI data and the KPI data
of the top N cells, compare the normal KPI data with the abnormal KPI data, and find the
main causes (performance counters) that affect the KPIs.
For example, if the call drop rate increases, analyze the call drop rate of the top N cells
and compare their KPIs in the normal period with their KPIs in the abnormal period. You
may find that the call drop rate of one cell has increased abnormally. By analyzing the
abnormal data of that cell, you can determine whether it is the primary cause of the
problem. If it is, the analysis of the network KPI problem can focus on the abnormal cells.
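The comparison of normal and abnormal KPI data for the top N cells can be sketched as follows. This is a minimal illustration only: the function name, the input layout (per-cell drop and release counts), and the sample values are assumptions and do not come from any RNC tool.

```python
def top_n_degraded(normal, abnormal, n=3):
    """Rank cells by the increase in call drop rate between two periods.

    normal, abnormal: dict mapping cell id -> (drops, releases).
    The layout is hypothetical; adapt it to the exported counter data.
    """
    def rate(counts):
        drops, releases = counts
        return drops / releases if releases else 0.0

    # Only cells present in both periods can be compared.
    deltas = {
        cell: rate(abnormal[cell]) - rate(normal[cell])
        for cell in abnormal
        if cell in normal
    }
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Illustrative counts: cell "B" degrades sharply, the others barely change.
normal = {"A": (5, 1000), "B": (4, 800), "C": (6, 1200)}
abnormal = {"A": (6, 1000), "B": (40, 800), "C": (7, 1200)}
ranking = top_n_degraded(normal, abnormal, n=2)
```

The cell at the head of the ranking is the first candidate for the detailed analysis described above.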

1.3 Locking the Scenario

Identify the main scenarios that affect the KPIs by analyzing the PCHR log.

Compared with the performance data, the PCHR log records more detailed information,
including the last 15 key signaling messages before the call is released. By analyzing
the PCHR log, you can collect the exception information about the KPI-related
problems and identify the common features of the signaling flow.

Based on the PCHR log, you can also obtain subscriber and terminal information, and
thus judge whether the problems are caused by the poor performance of a specific
subscriber's terminal or of a specific UE type.

Once you have preliminary analysis results, request the field personnel to enable IOS
tracing on the top cells to learn the scenarios and details of the problems.

IOS tracing is an effective troubleshooting means. As a cell-level tracing function, it
traces the subscribers in a cell and provides a large amount of information. However,
it can trace only a few cells and a limited number of subscribers. To achieve an optimal
effect, you need a preliminary understanding of the problem before enabling IOS
tracing.

Analyze the causes of the KPI-related problems in depth through the CHR, together
with the IOS tracing and the PCHR.

Normally, you can find some abnormalities through IOS tracing and then obtain more
detailed internal print information about the RNC by analyzing the CHR in the
corresponding time range. In addition, you can associate the CHR with the PCHR
records to analyze abnormal internal processes, for example, whether the problem is a
DSP soft failure.
The problem scenarios identified in this way are taken as the main scenarios that affect the KPIs.

1.4 Drive Test On Site


Normally, you can determine the causes that affect the KPIs through the preceding steps.
If you cannot, request the field personnel to arrange a drive test. Because drive tests are
expensive, analyze the problems thoroughly beforehand and determine the main cells or
scenarios where the problems occur.
A drive test shows the behavior and signaling of the UE, which are vital for analyzing
UE-related problems.

1.5 Reproducing the Mirroring Environment


Both the field drive test and mirroring in the HQ are means of reproducing the problems.
If the problems can be reproduced in the HQ, the expense is lower and the problems can be
analyzed more clearly. However, it is difficult to simulate the field scenarios (transmission
status and signal status) in the HQ. Therefore, reproduction in the HQ is suited to verifying
problems caused by parameter modifications, whereas problems that occur in the existing
network usually require a field drive test.
Note that you must have a preliminary analysis of the problems before reproducing them
(unless the problems are extremely urgent). Otherwise, the reproduction is blind.

1.6 Problem Analysis and Summary


First, determine whether the analyzed problem is the main factor that influences the KPIs.
This point is important because many factors affect the KPIs of the network, and you need
to identify clearly the major factors that cause the KPI changes.
Second, summarize the problems and share the related experience in a timely manner.


2 RRC Access Success Rate (Service/Non-Service)

2.1 KPI Definition


RRC setup success rate = (Number of Successful RRC Setups) / (Number of RRC Connection Attempts)
VS.RRC.SuccConnEstab.Rate = <RRC.SuccConnEstab.sum> / <VS.RRC.AttConnEstab.Cell>
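As a worked example of the definition above, the following minimal sketch computes the rate from the two counter values. The function name and the sample values are illustrative assumptions.

```python
def rrc_setup_success_rate(succ_conn_estab, att_conn_estab):
    """Return RRC.SuccConnEstab.sum / VS.RRC.AttConnEstab.Cell.

    Returns None when there were no attempts, because the ratio is
    undefined in that case (a choice made for this sketch).
    """
    if att_conn_estab == 0:
        return None
    return succ_conn_estab / att_conn_estab


# Illustrative values: 9850 successful setups out of 10000 attempts.
rate = rrc_setup_success_rate(9850, 10000)
print(f"RRC setup success rate: {rate:.2%}")
```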

2.2 Influence Factors


The process of RRC connection setup includes the following steps:
1. The UE sends the RRC Connection Request message through the RACH.

2. The RNC sends the RRC Connection Setup message through the FACH.

3. If the RRC connection is established on the DCH, the UE sends the RRC Connection
Setup Complete message through the uplink dedicated channel after the downlink
dedicated channel is set up and synchronized.

4. If the RRC connection is established on the CCH, the UE directly sends the RRC
Connection Setup Complete message through the RACH.
The RRC connection setup fails in the following scenarios:

The UE sends the RRC Connection Request message, but the RNC does not receive
the message.

The RNC receives the RRC Connection Request message sent by the UE and delivers
the RRC Connection Setup message, but the UE does not receive the RRC Connection
Setup message.

The RNC receives the RRC Connection Request message sent by the UE, and delivers
the RRC Connection Reject message.

The UE receives the RRC Connection Setup message, but does not send the RRC
Connection Setup Complete message.

The UE sends the RRC Connection Setup Complete message, but the RNC does not
receive the message.


Usually, problems related to the RRC setup success rate are found through the
performance counters of the RNC or through user complaints (or drive tests). In the
scenario where the UE sends the RRC Connection Request message but the RNC does
not receive it, the problem can be found only through user complaints or drive tests.
In the other scenarios, the problem can be found through the performance counters.
Usually, RRC setup failure is caused by the following factors:

Uplink RACH

Downlink coverage

Cell reselection parameter

Downlink synchronization

Uplink synchronization

Resource congestion

The equipment is abnormal.


Resource congestion includes power resource congestion, CE resource congestion,
code resource congestion, and transmission resource congestion. For a problem
caused by resource congestion, first check the actual utilization of resources and
verify that the congestion thresholds and configurations are correct.
For a problem caused by the other factors, the symptom is that no response is received
over the air interface during RRC setup. Generally, no reply on the Uu interface
(UU NoReply) is the main cause of RRC connection setup failure.

2.3 Analysis Process


1. Discussing the Problem, Ascertaining the Problem Background and Product
Version, and Excluding the Impacts of Known Bugs
Determine the time at which the RRC setup success rate decreased severely, analyze
whether the problem was caused by a network adjustment, and focus on the impact of
that adjustment. Obtain the known bug information for the corresponding version (ask
the product contact person or look for similar problems at other sites), and determine
whether the problem is a known problem.

2. Analyzing the Main Scenarios in Which RRC Setup Fails
Analyze how the causes of RRC access failure change through the performance
counters on the RNC, and determine which factor causes the decline in the RRC setup
success rate. Figure 1 lists the causes of RRC access failure defined by the
performance counters:

Figure 1 Indicators related to RRC setup failure

Measurement Item             Sub Items
RRC.FailConnEstab.Cong       VS.RRC.Rej.Power.Cong
                             VS.RRC.Rej.UL.CE.Cong
                             VS.RRC.Rej.DL.CE.Cong
                             VS.RRC.Rej.Code.Cong
                             VS.RRC.Rej.ULIUBBandCong
                             VS.RRC.Rej.DLIUBBandCong
VS.RRC.FailConnEstab         VS.RRC.Rej.RL.Fail
                             RRC.FailConnEstab.Cong
                             VS.RRC.Rej.AAL2.Fail
RRC.FailConnEstab.NoReply

3. Analyzing the Main Causes of RRC Access Failure in Depth

VS.RRC.Rej.Power.Cong
The RNC RRM runs the power admission algorithm. If the uplink or downlink admission
decision is a denial, the RNC RRM rejects the RRC setup.
In RAN10 and earlier versions, the power admission policy for RRC is as follows:
If the RRC Connection Request is caused by an emergency call, detach, or registration,
the RRC connection is admitted directly.
If the RRC Connection Request has any other cause, the RRC connection is admitted
according to the OLC threshold if OLC is enabled, and admitted directly if OLC is
disabled.
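The admission policy above can be sketched as a small decision function. This is an illustration under stated assumptions, not the RNC implementation: the cause strings, the representation of the load as a single number, and the strict "load below threshold" comparison are all hypothetical.

```python
# Establishment causes that bypass the power admission check (from the text).
# The string spellings are placeholders chosen for this sketch.
DIRECT_ADMIT_CAUSES = {"emergency_call", "detach", "registration"}


def admit_rrc(cause, olc_enabled, load, olc_threshold):
    """Return True if the RRC connection request passes power admission."""
    if cause in DIRECT_ADMIT_CAUSES:
        return True          # always admitted, regardless of load
    if not olc_enabled:
        return True          # no OLC, so no power-based rejection
    # Assumption: admission succeeds while the load is below the OLC threshold.
    return load < olc_threshold


# A normal call is rejected under overload, an emergency call is not.
normal_call = admit_rrc("originating_call", True, load=0.97, olc_threshold=0.95)
emergency = admit_rrc("emergency_call", True, load=0.97, olc_threshold=0.95)
```

The sketch makes the consequence visible: a power-based rejection can only be produced on the branch where OLC is enabled.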
Therefore, VS.RRC.Rej.Power.Cong is counted only when OLC is enabled in the network
and the network load is high enough to cause congestion. If the RRC setup success rate
decreases because this counter rises suddenly, find the top N cells that cause power
congestion and then query the changes in the maximum RTWP (VS.MaxRTWP) and
maximum TCP (VS.MaxTCP) of those cells. A severe increase in the RTWP indicates
uplink power congestion. A severe increase in the TCP indicates downlink power
congestion.
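The uplink or downlink judgment can be sketched as follows. The guide does not quantify "increases severely", so the 3 dB default below is purely a placeholder assumption, as are the function name and the sample values.

```python
def congestion_direction(rtwp_delta_db, tcp_delta_db, severe_rise_db=3.0):
    """Classify power congestion from the change in VS.MaxRTWP and VS.MaxTCP.

    rtwp_delta_db / tcp_delta_db: value in the congested period minus the
    value in a normal reference period. severe_rise_db is a hypothetical
    threshold that must be tuned per network.
    """
    directions = []
    if rtwp_delta_db >= severe_rise_db:
        directions.append("uplink")
    if tcp_delta_db >= severe_rise_db:
        directions.append("downlink")
    return directions


# Illustrative values: VS.MaxRTWP rose from -104 dBm to -92 dBm,
# while VS.MaxTCP changed by only 0.5 dB.
result = congestion_direction(rtwp_delta_db=-92.0 - (-104.0), tcp_delta_db=0.5)
```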

RTWP
The RTWP increases for the following reasons:

High traffic

External interference

Some neighboring cells are not configured.

Cells are reestablished.

The equipment is abnormal.


For uplink power congestion (the RTWP increases), judge the causes through the
analysis of performance data, and then propose appropriate solution suggestions.
Figure 1 lists the related performance counters.

Figure 1 Analysis of power congestion (cell traffic counters)

Category              DL                                    UL
RB Number
  CS Erlang           VS.AMR.Ctrl.DL12.2                    VS.AMR.Ctrl.UL12.2
                      VS.RB.DLConvCS.64                     VS.RB.ULConvCS.64
  PS Erlang           VS.RB.DLInterPS.8                     VS.RB.ULInterPS.8
                      VS.RB.DLInterPS.16                    VS.RB.ULInterPS.16
                      VS.RB.DLInterPS.32                    VS.RB.ULInterPS.32
                      VS.RB.DLInterPS.64                    VS.RB.ULInterPS.64
                      VS.RB.DLInterPS.128                   VS.RB.ULInterPS.128
                      VS.RB.DLInterPS.144                   VS.RB.ULInterPS.144
                      VS.RB.DLInterPS.256                   VS.RB.ULInterPS.256
                      VS.RB.DLInterPS.384                   VS.RB.ULInterPS.384
                      VS.RB.DLBkgPS.8                       VS.RB.ULBkgPS.8
                      VS.RB.DLBkgPS.16                      VS.RB.ULBkgPS.16
                      VS.RB.DLBkgPS.32                      VS.RB.ULBkgPS.32
                      VS.RB.DLBkgPS.64                      VS.RB.ULBkgPS.64
                      VS.RB.DLBkgPS.128                     VS.RB.ULBkgPS.128
                      VS.RB.DLBkgPS.144                     VS.RB.ULBkgPS.144
                      VS.RB.DLBkgPS.256                     VS.RB.ULBkgPS.256
                      VS.RB.DLBkgPS.384                     VS.RB.ULBkgPS.384
HSPA User             VS.HSDPA.UE.Mean.Cell                 VS.HSUPA.UE.Mean.Cell
Throughput
  R99 PS Throughput   VS.PS.Int.Kbps.DL8                    VS.PS.Int.Kbps.UL8
                      VS.PS.Int.Kbps.DL16                   VS.PS.Int.Kbps.UL16
                      VS.PS.Int.Kbps.DL32                   VS.PS.Int.Kbps.UL32
                      VS.PS.Int.Kbps.DL64                   VS.PS.Int.Kbps.UL64
                      VS.PS.Int.Kbps.DL128                  VS.PS.Int.Kbps.UL128
                      VS.PS.Int.Kbps.DL144                  VS.PS.Int.Kbps.UL144
                      VS.PS.Int.Kbps.DL256                  VS.PS.Int.Kbps.UL256
                      VS.PS.Int.Kbps.DL384                  VS.PS.Int.Kbps.UL384
                      VS.PS.Bkg.Kbps.DL8                    VS.PS.Bkg.Kbps.UL8
                      VS.PS.Bkg.Kbps.DL16                   VS.PS.Bkg.Kbps.UL16
                      VS.PS.Bkg.Kbps.DL32                   VS.PS.Bkg.Kbps.UL32
                      VS.PS.Bkg.Kbps.DL64                   VS.PS.Bkg.Kbps.UL64
                      VS.PS.Bkg.Kbps.DL128                  VS.PS.Bkg.Kbps.UL128
                      VS.PS.Bkg.Kbps.DL144                  VS.PS.Bkg.Kbps.UL144
                      VS.PS.Bkg.Kbps.DL256                  VS.PS.Bkg.Kbps.UL256
                      VS.PS.Bkg.Kbps.DL384                  VS.PS.Bkg.Kbps.UL384
  HSPA Traffic        VS.HSDPA.MeanChThroughput             VS.HSUPA.MeanChThroughput
                      VS.HSDPA.MeanChThroughput.TotalBytes  VS.HSUPA.MeanChThroughput.TotalBytes
Power                 VS.MeanTCP                            VS.MeanRTWP
                      VS.MaxTCP                             VS.MaxRTWP
                      VS.MinTCP                             VS.MinRTWP
Call Attempt Times    VS.RRC.AttConnEstab.Cell              VS.RRC.AttConnEstab.Cell
                      VS.RAB.AttEstab.AMR                   VS.RAB.AttEstab.AMR
                      VS.RAB.AttEstCS.Conv.64               VS.RAB.AttEstCS.Conv.64
                      VS.RAB.AttEstabPS.Cell                VS.RAB.AttEstabPS.Cell
                      VS.HSDPA.RAB.AttEstab                 VS.HSDPA.RAB.AttEstab
                      VS.HSUPA.RAB.AttEstab                 VS.HSUPA.RAB.AttEstab

The following section describes the judgment methods and solution suggestions in
different scenarios where the RTWP increases:

High traffic causes the rise in the RTWP


<Features>:

The RTWP increases abnormally when traffic is busy.

The admission of uplink power is rejected when traffic is in peak hours.

The RTWP becomes normal gradually while traffic decreases.

The corresponding traffic is high, about 80 equivalent Erlang (this is not a
necessary condition).

<Method of analysis>:
Through the performance data, analyze whether the RTWP increases as traffic
increases.
<Suggestions>:

It is recommended that TRXs should be added in hotspot areas.

If TRXs cannot be added within a short period, run the following command to
enable the uplink LDR function: ADD CELLALGOSWITCH:
NBMLdcAlgoSwitch=UL_UU_LDR-1;

The LDR function relieves, rather than eliminates, the congestion caused by high traffic. It
sacrifices the QoS of some users in exchange for the access success rate of new users.
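One way to check whether the RTWP rises together with traffic is to correlate the two series over the same measurement periods. The following sketch uses a plain Pearson correlation. The sample values and the idea of reading a coefficient close to 1 as "traffic-driven" are illustrative assumptions, not a rule from this guide.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series.

    Both series must vary (non-zero spread); otherwise the coefficient
    is undefined and this sketch would divide by zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-period samples: equivalent Erlang and mean RTWP in dBm.
traffic_erl = [10, 20, 40, 60, 80]
rtwp_dbm = [-105.0, -104.0, -101.0, -98.0, -94.0]
r = pearson(traffic_erl, rtwp_dbm)
```

A coefficient near 1 is consistent with the high-traffic scenario. A low coefficient suggests looking at the other causes of the rise in the RTWP.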

External interference causes the rise in the RTWP


<Features>:
After the preceding cause is excluded, external interference causes the rise in the
RTWP in two scenarios:

The RTWP of a cell is abnormal at regular intervals.

For the cells in an area, the coverage directions are basically the same and the
problem occurs at the same time and at the same frequency.

1) When traffic is low, the RTWP of multiple cells in the same area increases to
different degrees in the same time segment, and the symptom lasts for more than
20 minutes.

2) If you trace the waveform of the abnormal RTWP (by enabling the RTWP tracing
task of the cells), you can find that, after the RTWP increases, it varies gently and
has no remarkable fluctuation.

<Method of analysis>:

Analyze the performance data: In an area or cell where RRC power congestion
occurs, check whether the problem occurs in the same time segment. Query the
performance data at the minimum granularity. Check whether the traffic of the cell
increases sharply during the congestion period and thereby causes the abnormal
rise in the RTWP (to exclude traffic growth as the cause).

Analyze the PCHR log: Filter out all the records of RRC admission denial caused by
RTWP congestion, and determine the time at which congestion occurs (down to
the minute and second).

Analyze the geographical distribution: Query the geographical distribution
information about the cells. If the coverage directions of the cells are the same, it is
probable that external interference causes the problem.
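The check for a simultaneous rise in several cells can be sketched as follows. The sketch covers only the "same time segment" criterion. The input layout, the -95 dBm threshold, and the minimum cell count are assumptions for illustration, and traffic and duration still have to be checked separately.

```python
def simultaneous_rise(samples, rtwp_threshold_dbm=-95.0, min_cells=2):
    """Return the time segments in which several cells show a high RTWP.

    samples: dict mapping time segment -> dict of cell id -> RTWP in dBm
    (hypothetical layout). A segment qualifies when at least min_cells
    cells exceed rtwp_threshold_dbm.
    """
    segments = []
    for segment, cells in samples.items():
        high = sum(1 for rtwp in cells.values() if rtwp > rtwp_threshold_dbm)
        if high >= min_cells:
            segments.append(segment)
    return sorted(segments)


# Illustrative data: cells A and B rise together from 10:15 onwards.
samples = {
    "10:00": {"A": -104.0, "B": -103.0, "C": -105.0},
    "10:15": {"A": -92.0, "B": -90.0, "C": -104.0},
    "10:30": {"A": -91.0, "B": -93.0, "C": -94.0},
}
suspect_segments = simultaneous_rise(samples)
```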

<Suggestions>:
If the problem is caused by external interference, capture the evidence by scanning the
antenna interface and explain the cause to the customer.

The RTWP increases abnormally because some neighboring cells are not configured
There are two such scenarios:

A Huawei neighboring cell is not configured.

The cells of other vendors are not configured with Huawei neighboring cells.

<Features>:

In such scenarios, the RTWP increases abnormally because of the mobility of
subscribers. The problem occurs at random, in the time segments during which
subscribers move frequently.

The more frequently the RTWP increases abnormally, the more frequently RRC
congestion and service congestion occur.
Figure 2 shows the impact of different types of services upon the RTWP of the cells
(data source: O2 in Germany).
PS service:
The Nokia cell is not configured with the Huawei neighboring cell. In the Nokia cell,
a PS 384/384 service is initiated and data is uploaded.


Figure 2 Impact of the PS service upon the RTWP when some neighboring cells are not
configured

AMR service:
Cell1: 311
Cell2: 312
Cell1 and Cell2 are intra-frequency cells. Cell 311 is not configured with a
neighbor relation to Cell 312.


Figure 3 Impact of the CS service upon the RTWP when the neighboring cell is not configured

<Method of analysis>:

Analyze the performance data of the congested cells, and find out the distribution
of occurrence time of power congestion.

Trace and analyze the RTWP and number of subscribers of the cells in real time.

Use the NASTAR (or the neighboring cell configuration analysis of the intelligent
network optimization based on the PCHR log) to check whether the congested
cells have missing neighboring cells.

Confirm whether any neighboring cells are missing according to the preceding
analysis result and the engineering map information.

<Suggestions>:
It is recommended that the corresponding neighbor relation should be configured.
<Reference case>: O2 BPCR case
1. During the period from 2009-01-12 to 2009-01-18, the following cells of Cluster
UMTS_S0048_4 are severely congested: 23690, 43692, 24104, 23696, 3686,
23678, and 23701.

Figure 1 Number of top congested cells

CellId  CellName       Time (As hour)       VS.RRC.Rej.Power.Cong  VS.LCC.OverCongNumUL  VS.LCC.OverCongTimUL
23690   509310690S2    2009-1-12 20:00:00   335                    3                     680
24104   509311104S-2   2009-1-19 17:00:00   318                    3                     520
43692   509310692S-3   2009-1-15 12:00:00   238                    1                     320
24104   509311104S-2   2009-1-19 18:00:00   168                    7                     415
23690   509310690S2    2009-1-18 19:00:00   101                    1                     250
3686    509310686S-1   2009-1-16 14:00:00   89                     1                     260
23690   509310690S2    2009-1-15 13:00:00   63                     2                     145
23678   509310678S2    2009-1-18 20:00:00   39                     1                     75
43692   509310692S-3   2009-1-17 21:00:00   38                     3                     95
3686    509310686S-1   2009-1-16 12:00:00   37                     4                     155
23701   509310701S2    2009-1-14 18:00:00   26                     1                     40
23690   509310690S2    2009-1-15 20:00:00   22                     3                     135
23678   509310678S2    2009-1-13 18:00:00   19                     1                     40
23690   509310690S2    2009-1-15 11:00:00   18                     1                     60
43692   509310692S-3   2009-1-14 10:00:00   13                     1                     5
3678    509310678S1    2009-1-13 18:00:00   12                     1                     65
23696   509310696S2    2009-1-18 18:00:00   12                     1                     220
23945   509310945S2    2009-1-12 19:00:00   11                     1                     25
44104   509311104S-3   2009-1-19 16:00:00   8                      1                     15
23690   509310690S2    2009-1-14 18:00:00   7                      1                     85
23691   509310691S-2   2009-1-19 18:00:00   7                      2                     25
43685   509310685S3    2009-1-12 6:00:00    6                      1                     15
3701    509310701S1    2009-1-16 18:00:00   5                      1                     35
3673    509310673S1    2009-1-19 14:00:00   4                      1                     30
3686    509310686S-1   2009-1-19 15:00:00   4                      1                     20
3696    509310696S1    2009-1-17 19:00:00   4                      2                     75
23675   509310675S2    2009-1-18 13:00:00   4                      1                     20
23685   509310685S2    2009-1-16 12:00:00   4                      1                     40
23690   509310690S2    2009-1-17 19:00:00   4                      1                     30
23696   509310696S2    2009-1-17 12:00:00   4                      1                     65
23696   509310696S2    2009-1-17 18:00:00   4                      1                     55
23696   509310696S2    2009-1-17 19:00:00   4                      6                     155
43685   509310685S3    2009-1-13 15:00:00   4                      1                     25
44104   509311104S-3   2009-1-19 20:00:00   4                      1                     15
3675    509310675S1    2009-1-16 17:00:00   3                      1                     35
23676   509310676S2    2009-1-12 9:00:00    3                      3                     75
23685   509310685S2    2009-1-13 15:00:00   3                      2                     190
43678   509310678S3    2009-1-13 18:00:00   3                      1                     15
43685   509310685S3    2009-1-12 7:00:00    3                      1                     15
44104   509311104S-3   2009-1-19 18:00:00   3                      1                     10
23690   509310690S2    2009-1-16 13:00:00   2                      1                     30
43676   509310676S3    2009-1-18 13:00:00   2                      1                     35
23676   509310676S2    2009-1-12 8:00:00    1                      1                     15

The satellite map of the BTS shows that Nokia neighboring cells are near Cells 23690,
43692, and 24104. Therefore, it is suspected that the Nokia neighboring cells have a
great impact upon the RTWP of the Huawei BTS.
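To see which cells dominate the congestion, the hourly VS.RRC.Rej.Power.Cong values can be summed per cell. The following sketch does this for the ten largest hourly records of this case (CellId and rejection count only). The variable names are illustrative, and the remaining records are omitted for brevity.

```python
from collections import Counter

# (CellId, VS.RRC.Rej.Power.Cong) for the ten largest hourly records of the case.
hourly_records = [
    (23690, 335), (24104, 318), (43692, 238), (24104, 168), (23690, 101),
    (3686, 89), (23690, 63), (23678, 39), (43692, 38), (3686, 37),
]

# Sum the rejections of every hour per cell.
rejections_per_cell = Counter()
for cell_id, rejections in hourly_records:
    rejections_per_cell[cell_id] += rejections

# The three cells with the most power-congestion rejections.
top_cells = rejections_per_cell.most_common(3)
print(top_cells)
```

On this subset, the three leading cells are 23690, 24104, and 43692, which are the cells located near the Nokia neighboring cells.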


Figure 2 Satellite map of the BTS

2. On the RNC where the cluster is located, select the top 20 severely congested NodeBs
for RTWP tracing and find the RTWP waveform generated at the congestion time.
During the period of 16:00 to 17:00, the waveform of the abnormal RTWP of Cell
44587 is traced, as shown in Figure 1.

Figure 1 Traced RTWP waveform

Figure 2 shows the change in the number of subscribers of the corresponding cell.


Figure 2 Change in the number of subscribers of the cell

3. Query the neighboring cell configuration in the configuration script. Cell 44587 is
configured with Nokia neighboring cells: Site 175, Site 730, and Cell 43176 of RNC
525. In addition, the signal quality information about the Nokia neighboring cells is
exported to the PCHR log. The analysis shows that the RTWP of the Huawei cell rises
by 10 dB when the subscriber is in a Nokia cell.

Figure 1 Signal quality of neighboring cells

The RTWP increases abnormally because cells are reestablished


<Features>:
Cells are reestablished for equipment or transmission reasons. When the cells are
enabled again, a large number of RRC connection requests are generated because of
cell reselection over different subsystems.
The RTWP increases abnormally within a short period because of this burst of RRC
connection requests. As a result, when the uplink power admission algorithm is
enabled, some RRC connection requests are rejected because of power congestion.
The more subscribers the cell has, the longer the rise in the RTWP lasts.
As verified in the lab, the RTWP increases to an abnormal level if there are a large
number of RRC connection requests. For details, see Figure 2.


Figure 2 Impact of the burst of a large number of RRC connection requests upon the RTWP

<Method of analysis>:

Within two or three minutes after cells are reestablished, the RTWP fluctuates
continuously.

When both the admission algorithm and the OLC algorithm are enabled, a large
number of RRC connection requests with the cause of cell reselection over
different subsystems are rejected, and other types of service requests are also
rejected.

Cell reestablishment alarm: Query the system alarms and check whether any of the
following alarms are generated in the time segment when the RTWP increases
abnormally: Uplink CPRI Interface Abnormal, SAAL Link Unavailable, SCTP Link
Down, or Cell unavailable.

Analyze the PCHR data. Power congestion mainly occurs within a time segment of
2 to 5 minutes, and more than 60% of the power-congestion subscribers undergo
cell reselection over different subsystems.
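The 60% criterion on the PCHR data can be checked with a simple share calculation. The record layout and the cause string below are hypothetical placeholders, since the real PCHR field names are not given in this guide.

```python
def cause_share(records, cause):
    """Return the fraction of records whose establishment cause matches."""
    if not records:
        return 0.0
    matching = sum(1 for record in records if record["cause"] == cause)
    return matching / len(records)


# Hypothetical power-congestion records filtered from the PCHR log:
# 7 of 10 rejected requests were caused by cell reselection.
records = (
    [{"cause": "cell_reselection"}] * 7
    + [{"cause": "originating_call"}] * 3
)
share = cause_share(records, "cell_reselection")
matches_reestablishment_pattern = share > 0.60
```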
<Suggestions>:

Disable the uplink admission algorithm or OLC algorithm. If both the uplink
admission algorithm and OLC algorithm are enabled, the access success rate
decreases severely. If the algorithms are disabled, you can avoid the RRC
connection failure caused by power congestion.

Analyze the cause of cell reestablishment, and try to lower the occurrence
frequency of cell reestablishment in the network.

The RTWP increases abnormally because the equipment is abnormal


<Features>:

When traffic is not high, the RTWP of one or two sites increases stably by more
than 10 dB. The symptom lasts for more than 60 minutes.

After the RTWP increases, the RTWP varies gently and has no remarkable
fluctuation.

The minimum RTWP (VS.MinRTWP) always remains at a high level.


<Method of analysis>:

Analyze the performance data. If cells are congested, measure the MinRTWP value
of the cells.

Query the system alarms and check whether there exist any board-related alarms.
Process the alarms first.

Trace the RTWP and number of subscribers of the cells in real time.

If the real-time tracing result shows that the RTWP of the cells is abnormal
continuously and the number of subscribers is small, you can ascertain the cause on
site.
If all the preceding causes are excluded, you can suspect that the problem is caused
by the equipment. You can collect the related information and submit the
information to the Maintenance Department.
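The features of this scenario (a stable rise of more than 10 dB that lasts for more than 60 minutes, with VS.MinRTWP staying high) can be screened for in the performance data. The following sketch is an assumption-laden illustration: the baseline value, the 15-minute measurement period, and the function name are hypothetical.

```python
def longest_high_run_minutes(min_rtwp_dbm, period_min, baseline_dbm, rise_db=10.0):
    """Return the longest continuous time during which the minimum RTWP
    stays more than rise_db above the baseline.

    min_rtwp_dbm: consecutive VS.MinRTWP samples, one per measurement period.
    """
    longest = current = 0
    for value in min_rtwp_dbm:
        if value - baseline_dbm > rise_db:
            current += period_min
            longest = max(longest, current)
        else:
            current = 0   # the run is broken, start counting again
    return longest


# Illustrative 15-minute samples with an assumed normal level of -105 dBm.
samples = [-105.0, -93.0, -92.0, -93.0, -92.0, -93.0, -104.0]
duration = longest_high_run_minutes(samples, period_min=15, baseline_dbm=-105.0)
suspect_equipment = duration > 60
```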

<Suggestions>:
Collect the related information according to the following checklist, and ask the R&D
personnel to further analyze the problem.
1)

TCP

The TCP increases for the following reasons:

High traffic

Other causes
For downlink power congestion (the TCP increases), analyze the performance data,
judge whether the problem is related to the rise in traffic, and then propose
appropriate solution suggestions.

So far, commercial networks have not encountered RRC admission failure caused by an excessively
high TCP, so troubleshooting experience in this area is limited. The related content will be added
later.

VS.RRC.Rej.UL.CE.Cong/ VS.RRC.Rej.DL.CE.Cong
These counters measure the number of RRC connection rejections that occur either
because the admission algorithm of the RNC RRM denies admission for lack of uplink
or downlink CE resources, or because the NodeB returns CE Congestion when the RNC
delivers the RL_SETUP message. For the RL_Fail case in which the NodeB returns CE
Congestion, RAN10 and earlier versions have the following defect:
The CE capability of the NodeB is constrained by both the license configuration and
the hardware specifications. At present, the NodeB reports
IUB_INTERFACE_CELL_SYNC_NOT_SUPP and
IUB_INTERFACE_CELL_SYNC_ADJ_NOT_SUPP to the RNC if the CE licenses
are not enough. The RNC adds the two cause values to the VS.RRC.Rej.UL.CE.Cong
counter and the VS.RRC.Rej.DL.CE.Cong counter respectively. However, the NodeB
reports RADIO_RESOURCES_NOT_AVAILABLE to the RNC if the actual hardware
capability (CE resource) of the NodeB is not enough. The RNC then adds this cause
value to the VS.RRC.Rej.RL.Fail counter rather than to the corresponding CE
congestion counter.
You can observe the CE capability reported by the NodeB through the Iub NBAP
signaling.


Figure 3 Cell audit message

The CE license configuration of the NodeB should be lower than the hardware capability of the
NodeB. Why, then, can the hardware capability be insufficient while the licenses are not congested?
The reason is as follows: In RAN10 and earlier versions, the NodeB reports the CE capability to the
RNC as 110% of the configured licenses, regardless of the hardware capability. Therefore, the
configured licenses can exceed the hardware capability, which is not reasonable. The subsequent
versions will make the following improvements:
The following improvements are made on the NodeB:
If the CE Licenses are not enough, the NodeB reports the following cause through the Iub interface:
CELL_SYN_NOT_SUPP: The uplink CE licenses are not enough.
CELL_SYN_ADJ_NOT_SUPP: The downlink CE licenses are not enough.
The two cause values keep the design of the RAN10 version.
If the hardware CE capability is not enough, the NodeB reports the following cause through the Iub
interface:
UL_RADIO_RESOURCES_NOT_AVAILABLE: The uplink hardware CE capability is not enough,
the uplink logical resources (for example, FPID and CcTrchID) are not enough, and uplink
subscribers are allocated.
DL_RADIO_RESOURCES_NOT_AVAILABLE: The downlink hardware CE capability is not
enough, and the downlink logical resources (for example, FPID and CcTrchID) are not enough.
The following improvements are made on the RNC:
Both CELL_SYN_NOT_SUPP and UL_RADIO_RESOURCES_NOT_AVAILABLE reported by the
NodeB are considered as uplink CE insufficiency.
Both CELL_SYN_ADJ_NOT_SUPP and DL_RADIO_RESOURCES_NOT_AVAILABLE reported
by the NodeB are considered as downlink CE insufficiency.
In addition, the access failure because of the preceding four causes is excluded from the
VS.RRC.Rej.RL.Fail counter.
It is improbable that the NodeB reports RADIO_RESOURCES_NOT_AVAILABLE for the
insufficiency of other resources (for example, hardware resource). Therefore, you can basically
determine that the problem is caused by the insufficiency of CEs.

Because of the preceding defects, the number of CE congestions is not accurate. When
analyzing VS.RRC.Rej.UL.CE.Cong and VS.RRC.Rej.DL.CE.Cong, you also need to
consider VS.RRC.Rej.RL.Fail.
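The counter mapping in RAN10 described above, and its consequence for the analysis, can be sketched as follows. The mapping is taken from the text. The helper function, its name, and the sample counter values are assumptions for illustration.

```python
# Cause value reported by the NodeB -> counter incremented by the RNC (RAN10).
RAN10_CAUSE_TO_COUNTER = {
    "IUB_INTERFACE_CELL_SYNC_NOT_SUPP": "VS.RRC.Rej.UL.CE.Cong",
    "IUB_INTERFACE_CELL_SYNC_ADJ_NOT_SUPP": "VS.RRC.Rej.DL.CE.Cong",
    # Hardware CE shortage is hidden in the generic RL failure counter.
    "RADIO_RESOURCES_NOT_AVAILABLE": "VS.RRC.Rej.RL.Fail",
}


def ce_rejection_bounds(counters):
    """Return (lower, upper) bounds for CE-related RRC rejections.

    lower: rejections counted explicitly as CE congestion.
    upper: lower plus all RL failures, some of which may be CE shortage.
    """
    lower = (counters.get("VS.RRC.Rej.UL.CE.Cong", 0)
             + counters.get("VS.RRC.Rej.DL.CE.Cong", 0))
    upper = lower + counters.get("VS.RRC.Rej.RL.Fail", 0)
    return lower, upper


# Illustrative counter values for one cell and one measurement period.
bounds = ce_rejection_bounds({
    "VS.RRC.Rej.UL.CE.Cong": 12,
    "VS.RRC.Rej.DL.CE.Cong": 3,
    "VS.RRC.Rej.RL.Fail": 40,
})
```

A wide gap between the two bounds is a hint that the RL failures should be examined for hardware CE shortage.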
The common causes of CE congestion are as follows:

High traffic

The residual CEs maintained by the NodeB are not consistent with those
maintained by the RNC.
In case of CE congestion, analyze the top N congested cells through the
performance data, judge the causes of CE congestion, and propose appropriate
solution suggestions.

Figure 4 Indicator of cell CE congestion

Category              DL                                    UL
Traffic
  CS Erlang           VS.AMR.Ctrl.DL12.2                    VS.AMR.Ctrl.UL12.2
                      VS.RB.DLConvCS.64                     VS.RB.ULConvCS.64
  PS Erlang           VS.RB.DLInterPS.8                     VS.RB.ULInterPS.8
                      VS.RB.DLInterPS.16                    VS.RB.ULInterPS.16
                      VS.RB.DLInterPS.32                    VS.RB.ULInterPS.32
                      VS.RB.DLInterPS.64                    VS.RB.ULInterPS.64
                      VS.RB.DLInterPS.128                   VS.RB.ULInterPS.128
                      VS.RB.DLInterPS.144                   VS.RB.ULInterPS.144
                      VS.RB.DLInterPS.256                   VS.RB.ULInterPS.256
                      VS.RB.DLInterPS.384                   VS.RB.ULInterPS.384
                      VS.RB.DLBkgPS.8                       VS.RB.ULBkgPS.8
                      VS.RB.DLBkgPS.16                      VS.RB.ULBkgPS.16
                      VS.RB.DLBkgPS.32                      VS.RB.ULBkgPS.32
                      VS.RB.DLBkgPS.64                      VS.RB.ULBkgPS.64
                      VS.RB.DLBkgPS.128                     VS.RB.ULBkgPS.128
                      VS.RB.DLBkgPS.144                     VS.RB.ULBkgPS.144
                      VS.RB.DLBkgPS.256                     VS.RB.ULBkgPS.256
                      VS.RB.DLBkgPS.384                     VS.RB.ULBkgPS.384
  HSPA Traffic        VS.HSDPA.UE.Mean.Cell                 VS.HSUPA.UE.Mean.Cell
                      VS.HSDPA.MeanChThroughput             VS.HSUPA.MeanChThroughput
                      VS.HSDPA.MeanChThroughput.TotalBytes  VS.HSUPA.MeanChThroughput.TotalBytes
Call Attempt Times    VS.RRC.AttConnEstab.Cell              VS.RRC.AttConnEstab.Cell
                      VS.RAB.AttEstab.AMR                   VS.RAB.AttEstab.AMR
                      VS.RAB.AttEstCS.Conv.64               VS.RAB.AttEstCS.Conv.64
                      VS.RAB.AttEstabPS.Cell                VS.RAB.AttEstabPS.Cell
                      VS.HSDPA.RAB.AttEstab                 VS.HSUPA.RAB.AttEstab
Congestion            VS.LCC.LDR.Num.DLCE                   VS.LCC.LDR.Num.ULCE
                      VS.LCC.LDR.Time.DLCE                  VS.LCC.LDR.Time.ULCE
                      VS.RAB.FailEstPs.DLCE.Cong            VS.RAB.FailEstPs.ULCE.Cong
                      VS.RAB.FailEstCs.DLCE.Cong            VS.RAB.FailEstCs.ULCE.Cong
                      VS.RRC.Rej.DL.CE.Cong                 VS.RRC.Rej.UL.CE.Cong
                      VS.RRC.Rej.RL.Fail                    VS.RRC.Rej.RL.Fail
CE Used Number        VS.LC.DLCreditUsed.CELL               VS.LC.ULCreditUsed.CELL
                      VS.LC.DLCreditUsed.CELL.Max           VS.LC.ULCreditUsed.CELL.Max
                      VS.LC.DLCreditUsed.CELL.Min           VS.LC.ULCreditUsed.CELL.Min
CE Used Number,       VS.DLCE.Mean.Shared                   VS.ULCE.Mean.Shared
NodeB Count           VS.DLCE.Max.Shared                    VS.ULCE.Max.Shared

Figure 5 lists the number of CEs consumed by different services:


Figure 5 Number of CEs consumed by the DCH service

Direction | Spreading Factor | Number of CEs Consumed | Corresponding Credits Consumed | Typical Traffic Class

DL

256

3.4 kbit/s SRB

UL

256

DL

128

UL

64

DL

128

UL

64

DL

32

UL

16

DL

64

UL

32

1.5

13.6 kbit/s SRB

12.2 kbit/s AMR

64 kbit/s VP

32 kbit/s PS

DL

32

64 kbit/s PS

UL

16

DL

16

UL

10

DL

UL

10

20

128 kbit/s PS

384 kbit/s PS

Figure 6 Number of CEs consumed by the HSUPA service


Direction | Spreading Factor | HSUPA Phase 1 | HSUPA Phase 2 | Typical Traffic Class

UL

64

1+1+1

UL

32

1+1+1.5

1.5

64 kbit/s

UL

16

1+1+3

128 kbit/s

UL

1+1+5

256 kbit/s

UL

1+1+10

10

384 kbit/s

UL

2 x SF4

1+1+20

20

1.45 Mbit/s

UL

2 x SF2

Not supported

32

2.04 Mbit/s

UL

2 x SF2 + 2 x SF4

Not supported

48

5.76 Mbit/s

The following section describes the judgment methods and solution suggestions in
different scenarios of CE congestion:

High traffic congestion causes CE congestion


<Features>:

The CE Used Number is large and approaches the licensed capability.

CE admission is denied during peak traffic hours.

CE congestion disappears gradually as traffic decreases.

<Method of analysis>:

Analyze the performance data of the congested cells, find the NodeB to which the
congested cells belong, and obtain the KPI counters of all the cells of the NodeB.

Query the total number of CEs consumed by all cells under the NodeB (through the
performance data of the RNC) and the CE count measured by the NodeB, and check
whether they reach the upper limit of CE capability (uplink: license × 110% -
UlHoCeResvSf; downlink: license × 110% - DlHoCeCodeResvSf). UlHoCeResvSf and
DlHoCeCodeResvSf are configured through RNC MML commands.

Calculate the number of consumed CEs equivalently through the number of RBs of
each cell under the NodeB, and check whether they reach the upper limit of CE
capability.

Check whether CE congestion disappears gradually while traffic (CE Used
Number, RBs, and HSPA subscribers) decreases gradually.
If all the preceding conditions are met, you can basically determine that high traffic
causes CE congestion.

<Suggestions>:
You can take the following measures:

If the CE-based LDR function is not enabled in the existing network, consider
enabling the CE LDR algorithm to relieve the impact of CE congestion.

If the CE-based LDR function is enabled in the existing network, check through the
performance data whether VS.LCC.LDR.Num.DLCE, VS.LCC.LDR.Num.ULCE,
VS.LCC.LDR.Time.DLCE, and VS.LCC.LDR.Time.ULCE show that the function takes
effect.
If these counters record no LDR count or duration, the equipment has not entered
the LDR state. The possible causes are as follows:

A) The NodeB reports the CE capability as configured licenses × 110%. The RNC
triggers the LDR function by judging whether the difference between the CE
capability reported by the NodeB (configured licenses × 110%) and the number of
currently consumed CEs reaches the LDR threshold
(UlLdrCreditSfResThd/DlLdrCreditSfResThd). If configured licenses × 110% - LDR
threshold exceeds the hardware capability of the NodeB, the equipment can never
enter the LDR state.

B) The functions of the product are defective.

When the HSUPA function is enabled in the existing network, you can enable the
dynamic CE function of the NodeB or the HSUPA DCCC function if uplink CE
congestion is severe. Note the following points:

A) If you enable HSUPA DCCC, you must configure HSUPA admission to be based on
MBR access.

B) If you enable dynamic CEs of the NodeB, you must disable HSUPA DCCC and
configure HSUPA admission to be based on GBR access.

Expand the capacity, purchase CEs, or add TRXs.
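
The LDR trigger condition described in cause A can be sketched as follows. This is only an illustration: the function name and the idea of expressing the LDR threshold directly as a CE count are assumptions (the real UlLdrCreditSfResThd/DlLdrCreditSfResThd parameters are configured as spreading factors), and the 110% factor is taken from the text above.

```python
def ldr_can_trigger(licensed_ce: float, ldr_threshold_ce: float, hardware_ce: float) -> bool:
    """Return True if the residual CE count seen by the RNC can ever fall to the LDR threshold.

    All three arguments are CE counts (hypothetical simplification).
    """
    # The NodeB reports configured licenses x 110% as its CE capability.
    reported_capability = licensed_ce * 1.10
    # The NodeB cannot consume more CEs than its hardware provides, so this is
    # the smallest residual (reported capability - consumed CEs) the RNC can observe.
    min_residual = reported_capability - hardware_ce
    # LDR is entered only when the residual reaches the threshold.
    return min_residual <= ldr_threshold_ce
```

With 128 licensed CEs, a threshold of 10 CEs, and 128 hardware CEs, the residual never drops below about 12.8, so under these assumptions the LDR state is never entered.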

CE congestion is caused because the residual CEs maintained by the NodeB are not
consistent with those maintained by the RNC.
<Features>:

The CE Used Number is not high enough to reach the license capability.

CE admission denial occurs even if traffic is not high.

<Method of analysis>:

Analyze the performance data of the congested cells, find the NodeB to which the
congested cells belong, and obtain the performance data of all the cells of the
NodeB.

Query the total number of CEs consumed by all cells under the NodeB and the CE
Count measured by the NodeB, and check whether they are below the upper limit
of CEs.

Calculate the number of consumed CEs equivalently through the number of RBs of
each cell under the NodeB (query the number of RBs of each cell through the
performance data, and then calculate the number of consumed CEs according to the
CE consumption rules), and check whether it is below the upper limit of CEs.

If all the preceding conditions are met, you can basically determine that the
problem is caused because the residual CEs maintained by the NodeB are not
consistent with those maintained by the RNC (the former is less than the latter).
The possible cause is NodeB CE leakage.
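
The two CE congestion scenarios above differ mainly in whether the consumed CEs are close to the upper limit. A minimal sketch of that decision follows; the function name, the return labels, and the 0.9 "close to the limit" ratio are assumptions for illustration and are not defined by the RNC or NodeB.

```python
def classify_ce_congestion(rnc_ce_used: float, nodeb_ce_count: float,
                           ce_upper_limit: float, near_limit_ratio: float = 0.9) -> str:
    """Classify a congested NodeB from the CE usage seen by the RNC and by the NodeB."""
    limit = near_limit_ratio * ce_upper_limit  # hypothetical "approaches the limit" level
    rnc_near = rnc_ce_used >= limit
    nodeb_near = nodeb_ce_count >= limit
    if rnc_near and nodeb_near:
        # Both views approach the upper limit: congestion caused by high traffic.
        return "high_traffic"
    if not rnc_near and not nodeb_near:
        # Congestion although both views are below the limit: the residual CEs kept
        # by the NodeB and the RNC are probably inconsistent (possible CE leakage).
        return "residual_ce_mismatch"
    # The two views disagree; the text above does not cover this case.
    return "inconclusive"
```

The result is only a first filter; the equivalent CE calculation from the RB counters and the peak-hour correlation described above are still needed to confirm the cause.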

<Suggestions>:
In case of NodeB CE leakage, you need to contact the Maintenance Department for
further analysis.

VS.RRC.Rej.RL.Fail:
During RRC connection setup, the NodeB judges setup failure. The possible cause is
that the internal resources (hardware CE capability and logical resource) of the NodeB
are not enough.

The hardware CE capability is not enough


<Features>:

The CE Used Number is large and approaches the upper limit of CEs.

In peak hours, RL Reject occurs more frequently.

RL Reject becomes normal gradually while traffic decreases.

<Method of analysis>:

Analyze the performance data of the RL Reject cells, find the NodeB to which the
RL Reject cells belong, and obtain the performance data of all the cells of the
NodeB.

Query the total number of CEs consumed by all cells under the NodeB and the CE
Count measured by the NodeB, and check whether they approach the upper limit
of the hardware CE capability of the NodeB.
If all the preceding conditions are met, you can basically determine that the
problem is caused by the hardware specifications of the NodeB. The NodeB reports
the CE capability as configured licenses × 110% regardless of the hardware
specifications. If license × 110% - UlHoCeResvSf or license × 110% -
DlHoCeCodeResvSf exceeds the hardware capability of the NodeB, the problem
occurs.

<Suggestions>:

In the subsequent R11 version, the hardware specifications are taken into account
when the NodeB reports the CE capability. Then, the problem does not occur.

To avoid the problem, you can decrease the number of configured licenses. As a
result, the impacts of congestion can be relieved through the LDR function.

Other internal resources of the NodeB are not enough


The probability of occurrence is low. Feed back the occurrence (if available) to the
R&D department for analysis.

VS.RRC.Rej.Code.Cong
RRC setup rejection is mainly caused by the insufficiency of code resources. In a
high-traffic scenario (for example, indoor micro-cell coverage), code resources may be
insufficient, and you need to expand the capacity. Query the following counters to
determine whether the problem is caused by high traffic.
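Figure 7 Analysis of cell code congestion indicators

Category | DL
Traffic, CS Erlang | VS.AMR.Ctrl.DL12.2
Traffic, CS Erlang | VS.RB.DLConvCS.64
Traffic, PS Erlang | VS.RB.DLInterPS.8
Traffic, PS Erlang | VS.RB.DLInterPS.16
Traffic, PS Erlang | VS.RB.DLInterPS.32
Traffic, PS Erlang | VS.RB.DLInterPS.64
Traffic, PS Erlang | VS.RB.DLInterPS.128
Traffic, PS Erlang | VS.RB.DLInterPS.144
Traffic, PS Erlang | VS.RB.DLInterPS.256
Traffic, PS Erlang | VS.RB.DLInterPS.384
Traffic, PS Erlang | VS.RB.DLBkgPS.8
Traffic, PS Erlang | VS.RB.DLBkgPS.16
Traffic, PS Erlang | VS.RB.DLBkgPS.32
Traffic, PS Erlang | VS.RB.DLBkgPS.64
Traffic, PS Erlang | VS.RB.DLBkgPS.128
Traffic, PS Erlang | VS.RB.DLBkgPS.144
Traffic, PS Erlang | VS.RB.DLBkgPS.256
Traffic, PS Erlang | VS.RB.DLBkgPS.384
Congestion | VS.RAB.FailEstPs.Code.Cong
Congestion | VS.RAB.FailEstPs.Code.Cong
Congestion | VS.RRC.Rej.Code.Cong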

<Suggestions>:

Check the code setting of the HSDPA. The following configuration is recommended:
ADD CELLHSDPA: AllocCodeMode=Manual, HsPdschCodeNum=1; /// The RNC is statically configured with one HSPDSCH code.
SET MACHSPARA: DYNCODESW=OPEN; /// Enable the dynamic code switch of the NodeB.

Expand the capacity

VS.RRC.Rej.ULIUBBandCong/ VS.RRC.Rej.DLIUBBandCong

RRC setup failure is mainly caused by the transmission congestion on the IUB
interface. You can check the traffic and transmission configuration of the cells, and
thus judge whether the problem is caused by the insufficiency of transmission
resources.
Figure 8 Analysis of transmission congestion indicators

Category | DL | UL
Congestion | VS.RRC.Rej.DLIUBBandCong | VS.RRC.Rej.ULIUBBandCong
Congestion | VS.RAB.FailEstab.CS.DLIUBBand.Cong | VS.RAB.FailEstab.CS.ULIUBBand.Cong
Congestion | VS.RAB.FailEstab.PS.DLIUBBand.Cong | VS.RAB.FailEstab.PS.ULIUBBand.Cong
Iub bandwidth utility ratio, ATM | VS.AAL2PATH.PVCLAYER.TXBYTES | VS.AAL2PATH.PVCLAYER.RXBYTES
Iub bandwidth utility ratio, ATM | VS.QAAL2.AllocedFwd.AAL2BitRate | VS.QAAL2.AllocedBwd.AAL2BitRate
Iub bandwidth utility ratio, ATM | VS.QAAL2.AllocedMaxFwd.AAL2BitRate.Value | VS.QAAL2.AllocedMaxBwd.AAL2BitRate.Value
Iub bandwidth utility ratio, IP | VS.IPPATH.IPLAYER.TXBYTES | VS.IPPATH.IPLAYER.RXBYTES
Iub bandwidth utility ratio, IP | OS.ANI.IP.AllocedFwd | OS.ANI.IP.AllocedBwd

The following section describes several important concepts about Iub admission:

Iub bandwidth admission is based on the allocated bandwidth regardless of the
actual traffic.

In the versions later than the RAN10, the bandwidth is allocated for the PS service
according to GBR Active Factor.

The RAN10 provides the corresponding count indicators for both actual traffic and
allocated bandwidth of the Iub interface, but they need to be converted. The
admission is based on the PVC traffic consumed by the user. All traffic needs to be
converted to the PVC layer.
The following section describes several important count indicators:

The following describes the calculation of actual traffic on the Iub interface,
taking the downlink as an example:
ATM (kbps): SUM(VS.AAL2PATH.PVCLAYER.TXBYTES) × 8 / 3600 / 1000
Meaning: Add up the traffic of all AAL2PATHs of the Iub and divide the sum by the
measurement time (3600 seconds) to obtain the actual traffic (kbps).
The traffic is measured at the PVC layer, so it does not need to be converted.
IP (kbps): SUM(VS.IPPATH.IPLAYER.TXBYTES) × 8 / 3600 / 1000
Meaning: Add up the traffic of all IPPATHs of the Iub and divide the sum by the
measurement time (3600 seconds) to obtain the actual traffic (kbps).

The traffic is measured at the IP layer, so it does not need to be converted.

The following describes the bandwidth allocation of the Iub interface, taking the
downlink as an example:
ATM (kbps): VS.QAAL2.AllocedFwd.AAL2BitRate × 53 / 48 / 1000
Meaning: Convert the allocated bandwidth of the Qaal2 adjacent point
corresponding to the NodeB. The conversion factor from the AAL2 layer to the PVC
layer is 53/48.
IP (kbps): OS.ANI.IP.AllocedFwd / 1000
Meaning: OS.ANI.IP.AllocedFwd is the traffic of the IP layer, so it does not need to
be converted.
Generally, the allocated bandwidth should be approximate to the actual traffic.
Then, the configuration of the activation factor is appropriate. If there is a great
difference between them, you can optimize the configuration of the activation
factor appropriately.
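
The conversions above can be sketched in a few lines. The function names are illustrative assumptions. The sketch also assumes that the byte counters cover a one-hour measurement period and that the allocated-bandwidth counters are in bit/s, as the divisions by 3600 and 1000 in the formulas suggest.

```python
def actual_traffic_kbps(tx_bytes_per_path, period_s=3600):
    """Actual Iub traffic in kbps from per-path byte counters
    (VS.AAL2PATH.PVCLAYER.TXBYTES or VS.IPPATH.IPLAYER.TXBYTES)."""
    return sum(tx_bytes_per_path) * 8 / period_s / 1000


def atm_allocated_kbps(aal2_bit_rate_bps):
    """Allocated ATM bandwidth in kbps at the PVC layer.

    53/48 converts the AAL2-layer rate to the PVC layer (53-byte cell, 48-byte payload).
    """
    return aal2_bit_rate_bps * 53 / 48 / 1000


def ip_allocated_kbps(alloced_fwd_bps):
    """Allocated IP bandwidth in kbps; no layer conversion is applied."""
    return alloced_fwd_bps / 1000


# Example with made-up counter values: two AAL2 paths carrying 450 MB each in one hour.
actual = actual_traffic_kbps([450_000_000, 450_000_000])
allocated = atm_allocated_kbps(4_800_000)
ratio = allocated / actual  # a value far above 1 suggests reviewing the activation factor
```

Comparing `allocated` with `actual` per NodeB and per hour gives the same view as the two rows of counters in the figure above.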
If IUB congestion causes RRC access failure, the reason is usually that traffic
increases or the activation factor is not configured reasonably. Therefore, you need
to increase the Iub bandwidth or optimize the configuration of the activation factor.
The following section gives the judgment method and solution suggestions:

<Features>:

The allocated bandwidth of the Iub interface is high and is approximate to the
configured bandwidth.

<Method of analysis>:

Measure the actual traffic and allocated bandwidth (average value per hour) of the
Iub interface through the performance data. If the allocated bandwidth is high and
is approximate to the configured transmission bandwidth, the transmission
bandwidth may be congested.

<Suggestions>:

If the actual traffic is approximate to the allocated bandwidth, it indicates that high
traffic causes transmission congestion. The first consideration is to expand the
capacity and increase the bandwidth of the Iub interface.

If the actual traffic is low but the allocated bandwidth is high, it indicates that the
problem is caused by the inappropriate setting of the activation factor. You can
reduce the activation factor appropriately to raise the transmission utilization.

Other possible optimization means are to modify the service GBR and modify the
FP mode into the Silent mode. However, the two means are not recommended.

VS.RRC.Rej.AAL2.Fail:
The AAL2 Path setup fails on the Iub interface because the transmission is abnormal.
Such setup failure does not frequently occur in the existing network. If such cause
leads to KPI deterioration, feed back the problem to the R&D department.

RRC.FailConnEstab.NoReply
There are the following Noreply scenarios:

Uu Noreply is caused by cell reselection over different subsystems.

The RNC receives the RRC Connection Request message sent by the UE and
delivers the RRC Connection Setup message, but the UE does not receive the RRC
Connection Setup message (excluding the part of cell reselection over different
subsystems).

The UE receives the RRC Connection Setup message, but does not send the RRC
Setup Complete message.

The UE sends the RRC Setup Complete message, but the RNC does not receive the
message.
It is difficult to judge whether the UE receives the RRC Connection Setup message
only through the CHR log or performance data. To attain a definite result, you must
conduct drive test. Of course, you can attain the preliminary analysis result through
the CHR log or performance data before conducting the drive test.
The following section describes the judgment methods and solution suggestions in
different scenarios:

Uu Noreply is caused by cell reselection over different subsystems


<Features>:

In the PCHR log, an access success record of the same subscriber appears close in
time to the access failure.

Analysis data from Germany's O2 and Spain's VDF shows that this scenario accounts
for about 40% of the total RRC access failure count.

<Method of analysis>:
The count and proportion of cell reselections over different subsystems can be
obtained directly by following the operation guide. The guide has been prepared but
is not attached here because of its size.
Alternatively, analyze the problem as follows:
For an RRC access failure recorded in the PCHR log, you can determine that the
problem is caused by cell reselection over different subsystems under either of the
following sets of circumstances:

The last access of the corresponding subscriber is normal.

The RL Release time is later than the time of the current access failure.

Alternatively,

The next access of the corresponding subscriber is normal.

The difference between the time of the next normal access and the time of the
current access failure is less than (N300+1) × T300.

The cell of access failure and the cell of access success are not in the same subsystem.
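
The two sets of circumstances can be expressed as a small check over PCHR-style records. This is a hedged sketch: the `Access` record and its field names are hypothetical (they are not the PCHR log format), and it assumes that the "different subsystem" condition belongs to the second set only, which is one reading of the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Access:
    """Hypothetical per-access record extracted from the PCHR log."""
    time_s: float                              # time of the access attempt
    success: bool                              # True if the access was normal
    subsystem: int                             # subsystem serving the cell
    rl_release_time_s: Optional[float] = None  # RL Release time, if any


def is_inter_subsystem_reselection(failure: Access, previous: Optional[Access],
                                   following: Optional[Access],
                                   n300: int, t300_s: float) -> bool:
    """Apply the two alternative criteria to one RRC access failure."""
    # Set 1: the last access was normal and its RL was released after the failure.
    if (previous is not None and previous.success
            and previous.rl_release_time_s is not None
            and previous.rl_release_time_s > failure.time_s):
        return True
    # Set 2: the next access is normal, happens within (N300+1) x T300,
    # and is served by a different subsystem.
    if (following is not None and following.success
            and 0 <= following.time_s - failure.time_s < (n300 + 1) * t300_s
            and following.subsystem != failure.subsystem):
        return True
    return False
```

Running this check over all NoReply failures of a period gives the count and proportion that the clarification report for the customer needs.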
<Suggestions>:

In the RNC RAN11 050, the RRC access failure caused by cell reselection over
different subsystems is not considered as UU Noreply.

Provide a clarification report for the customer, thus explaining the impacts of cell
reselection over different subsystems and excluding such impacts.

The RNC receives the RRC Connection Request message sent by the UE and delivers
the RRC Connection Setup message, but the RNC does not receive the RRC Setup
Complete message.
<Method of analysis>:

First exclude the RRC access failure caused by cell reselection over different
subsystems through the PCHR log.

If UU Noreply is not caused by cell reselection over different subsystems,
discriminate the following scenarios and then analyze the problem deeply:
The RNC receives the RRC Connection Request message sent by the UE and
delivers the RRC Connection Setup message, but the UE does not receive the RRC
Connection Setup message (excluding the part of cell reselection over different
subsystems).
The UE receives the RRC Connection Setup message, but does not send the RRC
Setup Complete message.

The UE sends the RRC Setup Complete message, but the RNC does not receive the
message.
For details about the judgment methods and solution suggestions, see the following
section. However, the most direct method is to conduct drive test and make
signaling analysis. Therefore, the <features> in the following section define several
common judgment criteria, which are not absolute.

The RNC receives the RRC Connection Request message sent by the UE and delivers
the RRC Connection Setup message, but the UE does not receive the RRC Connection
Setup message (excluding the part of cell reselection over different subsystems).
<Features>:
Through the IOS, you can find that the RRC Connection Setup message is sent
repeatedly on the UU interface (based on the N300).
The possible causes are as follows:

The FACH coverage is poor.

The cell selection and reselection parameters are not set reasonably.

The equipment is abnormal or packets are lost during the transmission.

<Method of analysis>:

Analyze the EC/N0 information reported by the UE in the RRC Connection
Request message (you can obtain the EC/N0 information through the PCHR log). If
the EC/N0 value is lower than -12 dB (the default value), it indicates that the
problem is caused by poor coverage.

If the monitoring set in the RRC Connection Request message contains better cells,
it indicates that the problem may be caused by cell reselection.

If the EC/N0 reported by the UE in the RRC Connection Request message is higher
than -7 dB, it indicates that the equipment is abnormal or packets are lost during
the transmission (which seldom occurs).
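
The three checks can be combined into a first-pass classifier. This is a minimal sketch: the function name and cause labels are illustrative, and the thresholds are assumed to be -12 dB and -7 dB (EC/N0 values are negative in dB).

```python
def classify_setup_not_received(ecn0_db: float, better_cell_in_monitored_set: bool,
                                poor_threshold_db: float = -12.0,
                                good_threshold_db: float = -7.0) -> list:
    """Return the candidate causes for one failure, based on the RRC Connection Request."""
    causes = []
    if ecn0_db < poor_threshold_db:
        # Reported EC/N0 is below the poor-coverage threshold.
        causes.append("poor_coverage")
    if better_cell_in_monitored_set:
        # The monitored set reported by the UE contains a better cell.
        causes.append("cell_reselection")
    if ecn0_db > good_threshold_db:
        # Good radio conditions: suspect the equipment or transmission loss.
        causes.append("equipment_or_transmission")
    return causes
```

An empty list means none of the three criteria matched, in which case a drive test is the more reliable way to locate the cause.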

<Suggestions>:

If the problem is caused by poor coverage, you can take appropriate measures to
enhance the coverage, for example, add sites to fill the blind spots and adjust the
engineering parameters. If you cannot enhance the coverage, you can raise the
RACH power appropriately. During the adjustment, you need to consider the
PCPICH Ec/Io coverage of the existing network. For example, if the pilot Ec/Io in
the coverage area is higher than -12 dB after network optimization, you can ensure
the access success rate of the UE in the 3G idle state as long as the power ratio of
the common channels is configured to match an Ec/Io higher than -12 dB. If the UE
reselects to the GSM when the pilot Ec/Io is lower than -12 dB, you can ensure the
RRC setup success rate of the UE in a weak-signal coverage area after cell
reselection over different subsystems as long as the power ratio of the common
channels is configured to match an Ec/Io higher than -14 dB.

If the cell selection and reselection parameters are not set reasonably, you can
modify such parameters to raise the speed of cell selection and reselection.

If the EC/N0 value is ideal but the RRC Connection Setup message is not received,
feed back the symptom to the R&D department.

The RRC CONNECTION SETUP message is carried by the FACH. The UE sends the RRC
CONNECTION REQUEST message through the RACH after the preamble of the PRACH is received
at the UTRAN side and the power of the preamble is used as the benchmark. The transmit power of
the preamble can increase continuously until the UE receives a response (restricted by the maximum
count of preamble retransmissions). In some poor-coverage areas, the imbalance may occur between
the RACH coverage and FACH coverage. As a result, the RRC setup request sent by the UE can be
received at the UTRAN side, but the UE cannot receive the RRC Connection Setup sent by the RNC.

The UE receives the RRC Connection Setup message, but does not send the RRC
Setup Complete message.
<Features>:

Through the IOS, you can find that the RRC Connection Setup message is sent
infrequently on the UU interface and that the sending count does not reach the
count as specified by the N300.

If RRC access is based on the DCH, you do not find the RL Restore message on the
Iub interface.

If RRC access is based on the DCH, you can find that the transmit power of the UE
is low.
If both feature 1 and feature 2 (or feature 3) are available, it is probable that the UE
receives the RRC Connection Setup message but does not send the RRC Setup
Complete message.
If RRC access is based on the CCH and feature 1 appears, it is probable that the UE
receives the RRC Connection Setup message but does not send the RRC Setup
Complete message or that the UE sends the RRC Setup Complete message but the
RNC does not receive the message.
The possible causes are as follows:

Downlink synchronization fails.

The UE is abnormal.
If the RRC Setup Complete message is sent through the DCH, the UE does not
send the RRC Setup Complete signaling on the uplink unless the downlink is
synchronized, in accordance with the description of Synchronization procedure A
("The UE shall not transmit on uplink until higher layers consider the downlink
physical channel established").
3GPP TS 25.214 gives the following description: "The UE establishes downlink chip
and frame synchronization of DPCCH, using the P-CCPCH timing and timing
offset information notified from UTRAN. Frame synchronization can be confirmed
using the frame synchronization word." Therefore, if the UE cannot synchronize the
physical downlink channel, the cause may be related to the power of the common
channels or the initial power of the downlink DPCCH. The power of the common
channels is determined when the cells are configured. Except for the PCPICH, the
power of each channel is set relative to that of the PCPICH. The RNC informs the
NodeB of the downlink DPCCH power in the RL SETUP REQ message. The power
is estimated by using the open-loop power algorithm. The formula is as follows:

$$P_{TxInitial} = \frac{R}{W} \times \left(\frac{E_b}{N_o}\right)_{DL} \times \left[\frac{CPICH\_Tx\_power}{(E_c/N_o)_{CPICH}} - \alpha \times P_{txTotal}\right]$$

α refers to the downlink orthogonalization factor, and (Ec/No)CPICH refers to the
coverage status at the UE location.
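
As a worked illustration of the open-loop estimate, the sketch below evaluates P_TxInitial = (R/W) × (Eb/No)_DL × [CPICH_Tx_power / (Ec/No)_CPICH - α × P_txTotal] in linear units. The formula as written here, the function and argument names, and the use of dB/dBm inputs are assumptions for illustration. R is taken to be the service bit rate and W the 3.84 Mcps UMTS chip rate.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert dB (or dBm to mW) to a linear value."""
    return 10 ** (value_db / 10)


def initial_dl_power_dbm(bit_rate_bps: float, ebno_dl_db: float,
                         cpich_tx_power_dbm: float, cpich_ecno_db: float,
                         alpha: float, total_tx_power_dbm: float,
                         chip_rate_cps: float = 3.84e6) -> float:
    """Open-loop estimate of the initial downlink power in dBm."""
    cpich_mw = db_to_linear(cpich_tx_power_dbm)
    total_mw = db_to_linear(total_tx_power_dbm)
    # Bracketed term: CPICH power scaled by the reported Ec/No, minus the share of
    # the cell's total power weighted by the orthogonality factor alpha.
    bracket_mw = cpich_mw / db_to_linear(cpich_ecno_db) - alpha * total_mw
    if bracket_mw <= 0:
        raise ValueError("inputs give a non-positive power estimate")
    power_mw = bit_rate_bps / chip_rate_cps * db_to_linear(ebno_dl_db) * bracket_mw
    return 10 * math.log10(power_mw)
```

For example, with a 15 kbit/s bearer, Eb/No of 0 dB, a 30 dBm CPICH, CPICH Ec/No of -10 dB, and α = 0, the estimate is about 15.9 dBm. A lower reported Ec/No (poorer coverage) raises the estimate, which matches the explanation that the term reflects the coverage status at the UE location.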
Besides inference, you can also judge whether the downlink is synchronized from
the TPC received by the UE and the transmit power of the UE. 3GPP TS 25.214 gives
the following description: "UTRAN shall start the transmission of the downlink
DPCCH and may start the transmission of DPDCH if any data is to be transmitted.
The initial downlink DPCCH transmit power is set by higher layers. Downlink
TPC commands are generated as described in 5.1.2.2.1.2." Therefore, the downlink
DPCCH is transmitted after the RL is established. In accordance with the
preceding description, the pattern of the downlink TPC command word is n
instances of "01" followed by one "1". The RNC informs the NodeB of n when cells
are established. The parameter name is DlTpcPattern01Count. If the UE can
decode the downlink DPCCH, its transmit power rises by 1 dB every 2n+1 slots
until the NodeB judges that the uplink channel is synchronized. If the downlink is
synchronized, the transmit power of the UE should increase from the minimum
value to a high value within 1 second. If the UE transmit power shows no sign of
increasing, you can basically determine that the physical downlink channel of the
UE is not synchronized.
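
The ramp-up described above can be checked with a short calculation. The sketch assumes a 1 dB step per TPC command (as stated in the text), the standard 15 slots per 10 ms radio frame, and that "1" means power up and "0" means power down. The function names and the example value n = 10 are illustrative only.

```python
def tpc_pattern(n: int) -> str:
    """TPC pattern before uplink synchronization: n instances of "01", then one "1"."""
    return "01" * n + "1"


def net_power_change_db(pattern: str, step_db: float = 1.0) -> float:
    """Net UE power change after applying every TPC command in the pattern."""
    return sum(step_db if bit == "1" else -step_db for bit in pattern)


def ramp_time_s(delta_db: float, n: int, step_db: float = 1.0,
                slot_s: float = 10e-3 / 15) -> float:
    """Time needed to raise the UE power by delta_db with the pattern above."""
    cycle = tpc_pattern(n)
    cycles_needed = delta_db / net_power_change_db(cycle, step_db)
    return cycles_needed * len(cycle) * slot_s
```

With n = 10, one cycle lasts 21 slots and nets +1 dB, so a 50 dB rise takes about 0.7 s. This is consistent with the statement that the UE power should climb from the minimum to a high value within 1 second when the downlink is synchronized.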

If the UE can normally receive the uplink TPC and raise the power according to TPC,
you can determine that the UE is abnormal.
<Method of analysis>:

Check whether the transmit power of the UE increases till the maximum value. If
the transmit power does not increase, it indicates that the downlink is not
synchronized.

If the transmit power of the UE increases but the RRC Setup Complete message is
not sent on the uplink, it indicates that the UE is abnormal.

<Suggestions>:

If the downlink is not synchronized, you can raise the power of the PCPICH or
raise the initial transmit power of the downlink DPCH. However, the RNC does not
provide a parameter for controlling the initial transmit power of the downlink
DPCH separately, but can only control the minimum transmit power of the DPCH.
By configuring the minimum transmit power parameter of the DPCH, you can
control its initial transmit power.

It is improbable that the UE is abnormal. If the UE is really abnormal, you can
provide a clarification report or inquire the IOT about the related test results.

The UE sends the RRC Setup Complete message, but the RNC does not receive the
message.
<Features>:

Through the IOS, you can find that the RRC Connection Setup message is sent
infrequently on the UU interface and that the sending count does not reach the
count as specified by the N300.

If RRC access is based on the DCH, you can find the RL Restore message on the
Iub interface.

If RRC access is based on the CCH, the RACH has lots of bit errors.
If both feature 1 and feature 2 (or feature 3) are available, it is probable that the UE
sends the RRC Setup Complete message but the RNC does not receive the
message.

The possible causes are as follows:

The RACH has bit errors.

Uplink synchronization fails.

Packets are lost during the transmission.

<Method of analysis>:

If RRC access is based on the CCH, it is possible that the RACH has bit errors. You
can check the VS.ULBler.PSNrt.Rach8 and VS.MeanRTWP values. If
VS.ULBler.PSNrt.Rach8 or VS.MeanRTWP is high, it is possible that the RTWP
interference on the uplink causes the bit errors on the RACH and thus the RNC
cannot receive the RRC Setup Complete message correctly.

If RRC access is based on the DCH, it is possible that the uplink is not
synchronized. You can check whether the RL Restore Indication is available on the
Iub interface. If not, it is possible that the initial transmit power of the dedicated
uplink channel is relatively low.

If the RL Restore message is available but the RNC cannot receive the RRC Setup
Complete message correctly, it is possible that packets are lost during the
transmission or the equipment is abnormal. You need to feed back the symptom to
the R&D department.

<Suggestions>:

If the RACH has bit errors and the RTWP is extremely high, eliminate the uplink
interference according to the RTWP Check List.

If the problem is caused by the failure of uplink synchronization (which is
improbable, because the UE transmit power rises under initial uplink power
control), you can raise the Constant Value of the dedicated channel, thus raising
the initial transmit power of the uplink DPCCH of the UE. In addition, the problem
is related to the setting of the initial target value of the uplink SIR, which has a
great impact on uplink synchronization at the time of initial link establishment. If
the parameter is set to an extremely large value, the link initially established for
the UE may cause excessive uplink interference. If the parameter is set to an
extremely small value, uplink synchronization takes longer and may even fail. The
parameter is an RNC-level parameter and has a great impact on network
performance. Therefore, you need to modify the parameter with caution.

The RRC CONNECTION SETUP COMPLETE message is sent through the DPCH, and the UE
calculates the initial power of the uplink DPCCH according to the received
IE"DPCCH_Power_offset" and measured CPICH_RSCP value.
DPCCH_Initial_power = DPCCH_Power_offset - CPICH_RSCP
DPCCH_Power_offset is equal to Primary CPICH DL TX Power + UL Interference + Constant Value.
The Constant Value parameter can be configured on the background. If the Constant Value parameter
is set to an extremely low value, it is possible that the transmit power of the UE is not enough when
the UE sends the RRC CONNECTION SETUP COMPLETE message. However, the problem usually
does not occur under the current default parameter setting (in the V13C03B151 version, the default
value is -20).
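
The two relations in the note can be evaluated directly. This is a sketch only: the function names and the sample input values (33 dBm CPICH power, -105 dBm uplink interference, -100 dBm RSCP) are made up for illustration, while the Constant Value of -20 is the default mentioned above.

```python
def dpcch_power_offset(cpich_tx_power_dbm: float, ul_interference_dbm: float,
                       constant_value_db: float) -> float:
    """DPCCH_Power_offset = Primary CPICH DL TX Power + UL Interference + Constant Value."""
    return cpich_tx_power_dbm + ul_interference_dbm + constant_value_db


def dpcch_initial_power_dbm(power_offset: float, cpich_rscp_dbm: float) -> float:
    """DPCCH_Initial_power = DPCCH_Power_offset - CPICH_RSCP."""
    return power_offset - cpich_rscp_dbm


offset = dpcch_power_offset(33.0, -105.0, -20.0)
initial = dpcch_initial_power_dbm(offset, -100.0)
```

Each 1 dB added to the Constant Value raises the initial uplink DPCCH power by 1 dB, which is why a very low setting can leave the UE with too little power to deliver the RRC CONNECTION SETUP COMPLETE message.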

If the RL Restore message is received but the RRC Setup Complete message is not
available, it is possible that packets are lost during the transmission or the
equipment is abnormal. You need to feed back the symptom to the R&D
department.

2.4 List of Problem Information


Checklist for KPI
Troubleshooting-2.4 .xls

3 RAB Access Success Rate (AMR/PS/VP/HSPA)

3.1 KPI Definition


RAB setup success rate = (RAB setup success count)/(RAB attempt count)
VS.RAB.SuccEstabCS.AMR.Cell.Rate = <VS.RAB.SuccEstab.AMR> / <VS.RAB.AttEstab.AMR>
VS.RAB.SuccEstabPS.Cell.Rate = ( <VS.RAB.SuccEstabPS.Conv> + <VS.RAB.SuccEstabPS.Str> +
<VS.RAB.SuccEstabPS.Inter> + <VS.RAB.SuccEstabPS.Bkg> )/( <VS.RAB.AttEstabPS.Conv> +
<VS.RAB.AttEstabPS.Str> + <VS.RAB.AttEstabPS.Inter> + <VS.RAB.AttEstabPS.Bkg> )
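
The two KPI definitions above can be computed from a counter export as follows. This is a minimal sketch: the dictionary input and the function names are assumptions, and returning None for a cell with no attempts is a design choice made here to avoid division by zero, not part of the KPI definition.

```python
def rab_success_rate(success_count: float, attempt_count: float):
    """RAB setup success rate, or None when there were no attempts."""
    if attempt_count == 0:
        return None
    return success_count / attempt_count


def amr_rab_success_rate(counters: dict):
    """VS.RAB.SuccEstabCS.AMR.Cell.Rate from the AMR success and attempt counters."""
    return rab_success_rate(counters.get("VS.RAB.SuccEstab.AMR", 0),
                            counters.get("VS.RAB.AttEstab.AMR", 0))


def ps_rab_success_rate(counters: dict):
    """VS.RAB.SuccEstabPS.Cell.Rate summed over the four PS traffic classes."""
    classes = ("Conv", "Str", "Inter", "Bkg")
    success = sum(counters.get(f"VS.RAB.SuccEstabPS.{c}", 0) for c in classes)
    attempts = sum(counters.get(f"VS.RAB.AttEstabPS.{c}", 0) for c in classes)
    return rab_success_rate(success, attempts)
```

Because the PS rate sums successes and attempts before dividing, traffic classes with many attempts dominate the result, so a low rate in a small class can be hidden at cell level.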

3.2 Influence Factors


The process of RAB connection setup includes the following steps:
1. The CN sends the RAB ASSIGNMENT REQUEST message to the RNC through the
IU interface.

2. After receiving the RAB ASSIGNMENT REQUEST message, the RNC determines
that it needs to establish a new RAB. The RNC first performs resource admission.

3. If resource admission fails, the RNC returns the RAB ASSIGNMENT RESPONSE
message to the CN.

4. If resource admission is successful, the RNC sends the RADIO BEARER SETUP
message to the UE. If the radio bearer setup fails, the UE returns the RADIO
BEARER SETUP FAILURE message to the RNC. If receiving the RADIO BEARER
SETUP FAILURE message or no response, the RNC returns the RAB ASSIGNMENT
RESPONSE message to the CN.
RAB setup fails under the following scenarios:

The RNC receives the RAB ASSIGNMENT REQUEST message, and the admission
of code, CE or power resource fails.

The RNC receives the RAB ASSIGNMENT REQUEST message. The admission of
system resources (for example, the memory) fails.

After receiving the RAB ASSIGNMENT REQUEST message, the RNC sends the
RADIO BEARER SETUP message to the UE, but does not receive the RADIO
BEARER SETUP COMPLETE message sent by the UE.

After receiving the RAB ASSIGNMENT REQUEST message, the RNC sends the
RADIO BEARER SETUP message to the UE and receives the RADIO BEARER
SETUP FAILURE message sent by the UE.
Usually, RAB setup failure is caused by the following factors:

Resource congestion

Downlink coverage

Downlink synchronization

Uplink synchronization

Equipment abnormality

RAB parameters unsupported


Resource congestion includes power resource congestion, CE resource congestion,
code resource congestion, and transmission resource congestion. For problems
caused by resource congestion, first check the actual resource utilization, and
then verify that the congestion thresholds and related configurations are correct.
The problems related to downlink coverage and downlink synchronization mainly
occur when RAB setup fails in DRD scenarios.

3.3 Analysis Process


1. Discussing the Problem, Ascertaining the Problem Background and Product
Version, and Excluding the Impacts of Known Bugs
Ask the field personnel to feed back the related information, and obtain the
known bug information about the corresponding version (ask the product contact
person, or ask whether other sites have reported similar problems). Then
determine whether the problem is a known problem.
Determine the time at which the RAB setup success rate changed, and analyze
whether the problem is caused by network adjustment, focusing on the impacts of
that adjustment.

2. Narrowing the Analysis Scope, Analyzing Whether the Problem Occurs in Only
One or Two Cells, and Analyzing Whether the Top N Cells are Representative
Analyze the change of the causes of RAB access failure through the performance
counters on the RNC, and analyze which factor causes the decline of RAB setup
success rate.
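One way to see which factor drives the decline is to compare the failure counters of a normal period with those of the degraded period. The sketch below is illustrative only: the two input dictionaries (counter name to count) and the function name are assumptions, not an RNC export format.

```python
# Sketch: rank RAB setup failure counters by how much they grew between a
# baseline period and the degraded period. Input dicts are an assumed format.

def rank_failure_increase(baseline, degraded):
    """Return (counter, increase) pairs, largest increase first."""
    names = set(baseline) | set(degraded)
    deltas = {n: degraded.get(n, 0) - baseline.get(n, 0) for n in names}
    # Sort by descending increase; break ties by counter name for stability.
    return sorted(deltas.items(), key=lambda item: (-item[1], item[0]))
```

The first entry of the returned list is the counter that contributes most to the decline and is the natural starting point for the deeper analysis.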

Figure 1 Indicators of CS RAB setup failure


Measurement items by level (Level 1 measurement item, then sub items at
Levels 2, 3, and 4, shown by indentation):

VS.RAB.FailEstabCS.RNL
    VS.RAB.FailEstCS.Unsp
        VS.RAB.FailEstabCS.Cong
            VS.RAB.FailEstCs.Power.Cong
            VS.RAB.FailEstCs.Code.Cong
            VS.RAB.FailEstab.CS.DLIUBBand.Cong
            VS.RAB.FailEstab.CS.ULIUBBand.Cong
            VS.RAB.FailEstCs.ULCE.Cong
            VS.RAB.FailEstCs.DLCE.Cong
        VS.RAB.FailEstabCS.Unsp.Other
    VS.RAB.FailEstCS.RIPFail
    VS.RAB.FailEstCS.Relo
    VS.RAB.FailEstabCS.RNL.Other
VS.RAB.FailEstabCS.TNL
VS.RAB.FailEstabCS.other.CELL

Figure 2 Indicators of PS RAB setup failure


Measurement items by level (Level 1 measurement item, then sub items at
Levels 2 and 3, shown by indentation):

VS.RAB.FailEstPS.RNL
    VS.RAB.FailEstPS.Unsp
        VS.RAB.FailEstPs.Power.Cong
        VS.RAB.FailEstPs.Code.Cong
        VS.RAB.FailEstab.PS.DLIUBBand.Cong
        VS.RAB.FailEstab.PS.ULIUBBand.Cong
        VS.RAB.FailEstPs.ULCE.Cong
        VS.RAB.FailEstPs.DLCE.Cong
        VS.RAB.FailEstabPS.Unsp.Other
    VS.RAB.FailEstPS.RIPFail
    VS.RAB.FailEstPS.Par
    VS.RAB.FailEstPS.Relo
    VS.RAB.FlEstPS.RNL.Other
VS.RAB.FailEstPS.TNL
VS.RAB.FailEstPS.NResAvail
VS.RAB.FailEstabPS.Other.Cell

3. Analyzing the Causes of RAB Failure in Depth

VS.RAB.FailEstCs.Power.Cong / VS.RAB.FailEstPs.Power.Cong
The RNC RRM runs the power admission algorithm. If uplink or downlink admission
is denied, the RNC RRM rejects the RAB setup.
Power congestion occurs when the power admission switch is enabled (by running the
ADD CELLALGOSWITCH:; command) and the network load is high. If the RAB setup
success rate decreases because this indicator rises suddenly, find the top N
cells that cause power congestion and then query the changes in the maximum
RTWP (VS.MaxRTWP) and maximum TCP (VS.MaxTCP) of those cells. A sharp increase
in the RTWP indicates uplink power congestion. A sharp increase in the TCP
indicates downlink power congestion.
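The uplink/downlink judgment above can be sketched as a comparison of the two counters before and after the degradation. This is an illustration under stated assumptions: the text does not quantify a "sharp increase", so the rise thresholds below are hypothetical placeholders, and the dictionary input format is also assumed.

```python
# Sketch: classify power congestion direction from the change in the
# maximum RTWP and maximum TCP of a cell. The rise thresholds are
# hypothetical; the guide does not define what counts as a sharp increase.

def power_congestion_directions(before, after, rtwp_rise=3.0, tcp_rise=3.0):
    """Return a list containing "uplink" and/or "downlink"."""
    directions = []
    if after["VS.MaxRTWP"] - before["VS.MaxRTWP"] >= rtwp_rise:
        directions.append("uplink")      # RTWP rose sharply
    if after["VS.MaxTCP"] - before["VS.MaxTCP"] >= tcp_rise:
        directions.append("downlink")    # TCP rose sharply
    return directions
```

Both values must be given in the same unit before and after; an empty result suggests that the power congestion counter increase needs another explanation.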

For details about the causes of the rise in the RTWP and TCP, judgment methods, and
solution suggestions, see the section Analyzing the Main Causes of RRC Access
Failure Deeply.

VS.RAB.FailEstCs.ULCE.Cong / VS.RAB.FailEstCs.DLCE.Cong /
VS.RAB.FailEstPs.ULCE.Cong / VS.RAB.FailEstPs.DLCE.Cong
The RNC RRM makes the admission decision. These indicators count the RAB
rejections that occur because the RNC RRM denies admission for lack of uplink or
downlink CE resources, or because the NodeB returns CE Congestion when the RNC
delivers the RL_SETUP message.
The common causes of CE congestion are as follows:

High traffic

The residual CEs maintained by the NodeB are not consistent with those
maintained by the RNC.
For details about the analysis methods and solution suggestions, see the section
VS.RRC.Rej.UL.CE.Cong/ VS.RRC.Rej.DL.CE.Cong.

VS.RAB.FailEstCs.Code.Cong / VS.RAB.FailEstPs.Code.Cong
RAB setup is rejected mainly because code resources are insufficient. In a
high-traffic scenario (for example, indoor micro-cell coverage), code resources may
not be enough, and the capacity needs to be expanded. Query Figure 7 to determine
whether the problem is caused by high traffic.
<Suggestions>:

Check the code setting of the HSDPA. The following configuration is recommended:
ADD CELLHSDPA: AllocCodeMode=Manual, HsPdschCodeNum=1; /// The RNC is statically configured with one HS-PDSCH code.
SET MACHSPARA: DYNCODESW=OPEN; /// Enable the dynamic code switch of the NodeB.

Expand the capacity.

VS.RAB.FailEstab.CS.DLIUBBand.Cong / VS.RAB.FailEstab.CS.ULIUBBand.Cong /
VS.RAB.FailEstab.PS.DLIUBBand.Cong / VS.RAB.FailEstab.PS.ULIUBBand.Cong
RAB setup failure is mainly caused by transmission congestion on the Iub
interface. Check the traffic and transmission configuration of the cells to
judge whether the problem is caused by insufficient transmission resources. For
details about the related counts, see Figure 8.
For details about the analysis methods and solution suggestions, see the section
VS.RRC.Rej.ULIUBBandCong/ VS.RRC.Rej.DLIUBBandCong.

VS.RAB.FailEstabCS.Unsp.Other / VS.RAB.FailEstabPS.Unsp.Other
The RAB setup failures counted here include the following:

The RNC does not support RAB setup with the requested QoS parameters.

RRM admission fails.

These failures do not include failures caused by CE congestion, code congestion,
Iub congestion, or power congestion.
The common cause is as follows: Insufficient NodeB resources lead to RL Recfg
failure. Through the IOS tracing of the top N cells, you can judge
whether the RL Recfg Fail on the Iub interface is caused by
RADIO_RESOURCES_NOT_AVAILABLE.
The insufficient resources may be CE hardware resources or other resources.

The CE Hardware Resources are not Enough

<Features>:

The CE Used Number is large and approaches the upper limit of CEs.

In peak hours, Unsp.Other occurs more frequently.

Unsp.Other gradually returns to normal as traffic decreases.

The signaling on the Iub interface shows that RL Recfg Fail is caused by
RADIO_RESOURCES_NOT_AVAILABLE.

<Method of analysis>:

Analyze the performance data of the Unsp.Other cells, find the NodeB to which the
Unsp.Other cells belong, and obtain the performance data of all the cells of the
NodeB.

Query the total number of CEs consumed by all cells under the NodeB and the CE
Count measured by the NodeB, and check whether they approach the upper limit
of the hardware CE capability of the NodeB.
If all the preceding conditions are met, you can basically determine that the
problem is caused by the hardware specification limit of the NodeB. The
NodeB reports the CE capability according to the configured licenses × 110%,
regardless of the hardware specifications. If License × 110%
UlHoCeResvSf or License × 110% DlHoCeCodeResvSf exceeds the hardware
capability of the NodeB, the problem occurs.
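The condition above can be checked with simple arithmetic. The sketch below assumes that the NodeB reports 110% of the configured license count as its CE capability, and it deliberately ignores the UlHoCeResvSf and DlHoCeCodeResvSf terms, whose exact role in the comparison is not stated here. The function and parameter names are hypothetical.

```python
# Sketch: does the CE capability reported by the NodeB (assumed to be
# 110% of the licensed CEs) exceed what the hardware can provide?
# The handover reservation parameters are NOT modelled in this sketch.

def reported_capability_exceeds_hardware(licensed_ce, hardware_ce):
    """Compare licensed_ce * 110% with hardware_ce using integer math."""
    # Integer arithmetic avoids float rounding at the exact boundary.
    return licensed_ce * 110 > hardware_ce * 100
```

For example, a NodeB licensed for 128 CEs on hardware that supports exactly 128 CEs reports more capability than the hardware has, which matches the scenario described above.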

<Suggestions>:

In the subsequent R11 version, the hardware specifications are taken into account
when the NodeB reports the CE capability, so the problem does not occur.

To avoid the problem, you can decrease the number of configured licenses. The
impacts of congestion can then be relieved through the LDR function.

Unsp.Other Failure Caused by other Factors

Collect the performance data, CHR log, and IOS tracing data of the top N cells, and
return the information to the R&D department for analysis.

VS.RAB.FailEstCS.RIPFail / VS.RAB.FailEstPS.RIPFail

When the RNC sends the RAB ASSIGNMENT RESPONSE message indicating RAB
assignment failure to the CN, the indicator is measured in the best cell of the UE if the
failure cause value is Failure in the Radio Interface Procedure.
When analyzing such failures, consider the cause of RB setup failure to
analyze the cause of RIPFail in more depth. Figure 1 lists the related counts.
Figure 1 Indicators of PS RB setup failure
Measurement Item              Description
VS.FailRBSetup.CfgUnsup       Configuration unsupported
VS.FailRBSetup.PhyChFail      Physical channel failure
VS.FailRBSetup.CellUpd        Cell update occurred
VS.FailRBSetup.IncCfg         Invalid configuration
VS.FailRBSetup.NoReply        No reply

The following section describes the judgment methods and solution suggestions in
different RIPFail scenarios:

VS.FailRBSetup.CfgUnsup
In the RB setup phase, the UE returns the RB setup failure message with the cause
value Configuration unsupported. Usually, the failure occurs because the UE
capability does not support the RB setup. For example, the UE receives the RB setup
request for the VP service (as the calling or called party) while it is using the
128 kbps downlink data service. Most terminals do not support the concurrent VP
service and high-speed PS service on the downlink. Therefore, the UE directly
returns the RB setup failure message with the cause value Configuration unsupported.
If such failures increase or are the main factor that affects the RAB access
success rate, analyze the distribution of the affected UEs according to the PCHR
and determine whether the failures are concentrated on specific subscribers. If
yes, the UEs of those subscribers are defective.

VS.FailRBSetup.PhyChFail
In the RB setup phase, the UE returns the RB setup failure message with the cause
value Physical channel failure. After the UE receives the RB SETUP message, the
downlink DPDCH cannot be synchronized. For details about the
synchronization-related problems, see the section RRC.FailConnEstab.NoReply. The
optimization measures are as follows:
If the downlink is not synchronized, you can raise the power of the PCPICH or raise
the initial transmit power of the downlink DPCH. The RNC does not provide a
separate parameter for the initial transmit power of the downlink DPCH; it only
provides the minimum transmit power of the DPCH. By configuring the minimum
transmit power parameter of the DPCH, you can control its initial transmit power.

VS.FailRBSetup.CellUpd
At present, the customer has not encountered the RAB access failure because of the
cause. In case of such failure, return the problem to the R&D department for analysis.

VS.FailRBSetup.IncCfg
In the RB setup phase, the UE returns the RB setup failure message with the cause
value Invalid configuration. If such failures increase or are the main factor that
affects the RAB access success rate, analyze the distribution of the affected UEs
according to the PCHR. The current analysis shows that some terminals incorrectly
report the RB setup failure with the cause value Invalid configuration in the
following specific flow:
After receiving the RB SETUP message and before returning the RB SETUP
COMPLETE message, the UE returns the RRC_RB_SETUP_FAIL (Invalid
configuration) message to the RNC if it receives the RRC_DL_DIR_TRANSF
(Disconnect) message. For details, see the following figure.
Figure 2 Flow on RB setup failure because of invalid configuration

On the Iu interface, the scenario is Normal Release, so the failure cannot be
considered as RAB access failure, as shown in the following figure.

Figure 3 Models of the known UEs that have invalid configuration
IMEI (IMSI)                            UE Type        Produced by
35170801.429053.0 (262073937151768)    K800C/K800i    SonyEricsson
35159602.421420.0 (262074970737150)    W880i          SonyEricsson
35342701.649470.0 (262074905086910)    K800C/K800i    SonyEricsson

VS.FailRBSetup.NoReply
In the RB setup phase, the RNC delivers the RB SETUP message but does not receive
any response. Therefore, the RNC considers that RB setup fails. The main causes are
as follows:

The downlink SRB1 is abnormal, so the UE does not receive the RB SETUP
message. The RNC RLC is reset or RbSetupRspTmr times out.

The UE receives the RB SETUP message and returns the RB SETUP COMPLETE
message. However, the NodeB cannot demodulate the RB SETUP COMPLETE
message because the uplink SRB2 is abnormal.

If the UE receives the RB SETUP message but the downlink cannot be synchronized, the UE returns
the RB SETUP FAILURE message with the cause value Physical channel failure. In this case,
the setup failure is counted as VS.FailRBSetup.PhyChFail rather than VS.FailRBSetup.NoReply.

The judgment methods and solution suggestions in different scenarios are as
follows:
First trace the IFTS (L2 DATA Report Timer=100s) of the top N cells to capture
the abnormal signaling.
Through signaling analysis, check whether the uplink SRB2 of the UE sends new
data packets after the RB SETUP message is delivered. If yes, you can conclude
that the UE received the RB SETUP message and returned the RB SETUP COMPLETE
message.
If the uplink or downlink BER is high, analyze whether the problem is
related to the DCH activation time, that is, whether the activation time of the UE is
inconsistent with that of the NodeB (note: the problem occurs only in Sony
Ericsson UEs). The primary cause is uplink interference or poor downlink
coverage. Especially in the dual-TRX DRD scenario, the Ec/N0 difference between
TRXs is great because of the coverage imbalance between carriers. If all the
cases of RB setup timeout occur in the DRD scenario, you can raise the access
success rate by optimizing the DRD parameters and controlling the DRD occurrence
frequency.
If both TRXs support R99/HSPA, you can optimize the DRD parameters by using
the DRD algorithm based on load balance.

Enable the DRD algorithm switch for the HSDPA service (LdbDRDSwitchHSDPA).

Raise the DRD offset for the HSDPA service (LdbDRDOffsetHSDPA).

Lower the DRD power remainder threshold for the HSDPA service
(LdbDRDLoadRemainThdHSDPA).

Modify the parameter TIMERPOLL of the RLC layer (to the optimized value: 120)
to increase the SRB retransmission opportunities.

Raise the DRD EcNo threshold.

VS.RAB.FailEstCS.Relo / VS.RAB.FailEstPS.Relo
While the RNC is initiating relocation, it receives the RAB setup request message
and does not process the request. The problem is caused by nested procedures and
rarely occurs. It depends on the timing of subscriber behaviors, and is usually
handled in the core network.

VS.RAB.FailEstabCS.TNL / VS.RAB.FailEstPS.TNL

RAB setup fails because the transmission establishment fails. The problem
rarely occurs. If it occurs, collect the performance data, CHR log, and
IOS tracing data of the top N cells and return the information to the R&D
department for analysis.

VS.RAB.FailEstPS.Par
The RNC considers that the parameters delivered by the core network are invalid. The
problem rarely occurs. If it occurs, trace the IOS data of the top N
cells and return the data to the R&D department, which analyzes the detailed RAB
setup information.

Other
If the failures counted by VS.RAB.FlEstPS.RNL.Other,
VS.RAB.FailEstabPS.Other.Cell, VS.RAB.FailEstabCS.RNL.Other, or
VS.RAB.FailEstabCS.other.CELL increase, collect the performance data, CHR log, and
IOS data of the top N cells and return the information to the R&D department for
analysis.

3.4 List of Problem Information


Checklist for KPI
Troubleshooting-3.4 .xls


4 Handover Success Rate (SHO/HHO)

In an actual commercial network, handover-related problems are closely related to call
drop. In most cases, handover failure leads to call drop. Therefore, this chapter describes the
call drop caused by handover in the sections about handover success rate. Chapter 6
describes only the call drop that is not caused by handover failure.
By handover scenario, handover is categorized into soft handover (including softer handover),
intra-frequency hard handover, inter-frequency hard handover, and inter-RAT handover. By
service, handover is categorized into CS (AMR and VP), PS R99, HSDPA,
and HSUPA.
This chapter categorizes handover by handover scenario, and describes the success rate
of soft handover and inter-frequency hard handover. Intra-frequency hard handover seldom
occurs. It only occurs when soft handover is not supported, for example:

Handover between intra-frequency cells of different RNCs when no Iur interface is available

Handover when the Iur interface is available but the Iur interface resources are not enough

Handover because of the control of the rate threshold of the PS service of the cells

The inter-RAT interoperations involve the interoperations between the UMTS, GSM,
and CN. Therefore, Chapter 7 gives a description separately.

4.1 Problems Related to Soft Handover Success Rate
4.1.1 KPI Definition
The following section defines the soft handover success rate, thus laying a basis for the
analysis of soft handover success rate.
1. Soft Handover Success Rate of CS Service and PS R99 Service
VS.SHO.Success.Cell.Rate = ( <SHO.SuccRLAddUESide> +
<SHO.SuccRLDelUESide> )/( <SHO.AttRLAddUESide> + <SHO.AttRLDelUESide> )

2. Change Success Rate of HSDPA Serving Cell
VS.HSDPA.ServCellChg.Succ.Rate = <VS.HSDPA.SHO.CellChg.SuccOut> /
<VS.HSDPA.SHO.CellChg.AttOut>

3. Change Success Rate of HSUPA Serving Cell
VS.HSUPA.SHO.ServCellChg.Succ.Ratio = <VS.HSUPA.SHO.ServCellChg.Succ> /
<VS.HSUPA.SHO.ServCellChg.Att>

4.1.2 Influence Factors


The following factors affect the soft handover success rate:
1. Some Neighboring Cells are not Configured
During the initial optimization, call drop is mainly caused by missing neighboring
cells. You can check whether intra-frequency neighboring cells are missing by
using the following methods:
Method 1: Observe the EcIo information about the active set recorded by the UE and
the Best Server EcIo information recorded by the Scanner before call drop. If the EcIo
recorded by the UE is poor but the Best Server EcIo recorded by the Scanner is ideal,
check whether the Best Server scrambling code recorded by the Scanner appears in the
latest neighboring cell list of intra-frequency measurement control before call drop. If
the neighboring cell list of intra-frequency measurement control has no such
scrambling code, you can determine that some neighboring cells are not configured.
Method 2: If the UE accesses a cell immediately after call drop and the scrambling
code of the accessed cell is not consistent with the scrambling code at the time of call
drop, you can also suspect that some neighboring cells are not configured. You can
further analyze the problem by measurement control (find the latest intra-frequency
measurement control message by starting from the message at the call drop position)
and check the neighboring cell list of the measurement control message.
Method 3: Some UEs report the Detected Set information. If the Detected Set
information contains the corresponding scrambling code information before call drop,
you can also determine that some neighboring cells are not configured.
Missing neighboring cells can cause call drop. Redundant neighboring cells also
affect network performance. For example, they increase the intra-frequency
measurement load of the UE and, in serious cases, prevent required cells from
being added to the neighboring cell list. Therefore, you also need to pay
attention to redundant neighboring cells when analyzing handover-related
problems.
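Method 1 above can be sketched as a simple check on drive test records. This is an illustration only: the text does not give numeric limits for a "poor" or "ideal" EcIo, so the two thresholds below are hypothetical, as are the function and parameter names.

```python
# Sketch of Method 1: suspect a missing neighbor when the UE active set
# EcIo is poor, the Scanner best server EcIo is good, and the Scanner best
# server scrambling code is absent from the latest intra-frequency
# measurement control neighbor list. Thresholds are hypothetical.

def missing_neighbor_suspected(ue_active_set_ecio_db, scanner_best_ecio_db,
                               scanner_best_psc, neighbor_list_pscs,
                               poor_ecio_db=-14.0, good_ecio_db=-10.0):
    ue_poor = ue_active_set_ecio_db <= poor_ecio_db
    scanner_good = scanner_best_ecio_db >= good_ecio_db
    not_listed = scanner_best_psc not in neighbor_list_pscs
    return ue_poor and scanner_good and not_listed
```

A True result is only a hint; the neighbor list taken from the measurement control message must still be the latest one before the call drop.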

2. Pilot Pollution
Usually, pilot pollution is defined as follows: There are too many strong pilots at a
point, but no primary pilot is strong enough. Therefore, the following must be
defined when the pilot pollution criteria are laid down:

Definition of a strong pilot

Definition of too many

Definition of no primary pilot that is strong enough

Definition of a strong pilot
You can determine whether a pilot is a strong pilot according to its absolute strength,
measured by its RSCP. If the RSCP of the pilot exceeds a certain threshold, the
pilot is a strong pilot, that is,

$CPICH\_RSCP > Th_{RSCP\_Absolute}$


Definition of too many
You can judge whether there are too many pilots at a point by the number of
pilots. If the number of pilots at a point exceeds a certain threshold, there
are too many pilots at the point, that is,

$CPICH\_Number > Th_N$

Definition of no primary pilot that is strong enough
You can determine whether there is a primary pilot strong enough according to the
relative strength of multiple pilots at the point. If the difference between the signal
strength of the strongest pilot and the signal strength of the $(Th_N + 1)$th strongest
pilot at a point is below a certain threshold, there is no primary pilot strong
enough at the point, that is,

$(CPICH\_RSCP_{1st} - CPICH\_RSCP_{(Th_N+1)th}) < Th_{RSCP\_Relative}$

If both of the following conditions are met, there is pilot pollution at the point:

There are more than $Th_N$ pilots that meet the condition $CPICH\_RSCP > Th_{RSCP\_Absolute}$.

$(CPICH\_RSCP_{1st} - CPICH\_RSCP_{(Th_N+1)th}) < Th_{RSCP\_Relative}$

Assuming that $Th_{RSCP\_Absolute} = -95$ dBm, $Th_N = 3$, and $Th_{RSCP\_Relative} = 5$ dB, the
criteria for pilot pollution are as follows:

There are more than 3 pilots that meet the condition $CPICH\_RSCP > -95$ dBm.

$(CPICH\_RSCP_{1st} - CPICH\_RSCP_{4th}) < 5$ dB

If both conditions are met, you can determine that there exists pilot pollution.
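The criteria can be expressed as a short check on the RSCP values measured at one point. The sketch assumes the reading of the criteria in which a strong pilot has an RSCP above the absolute threshold, and the strongest pilot must lead the (Th_N + 1)th strongest pilot by less than the relative threshold. The defaults are the example values of -95 dBm, 3, and 5 dB; the function name is hypothetical.

```python
# Sketch: pilot pollution check for one measurement point.
# rscp_dbm is a list of CPICH RSCP values (dBm), one per detected pilot.

def is_pilot_pollution(rscp_dbm, th_abs=-95.0, th_n=3, th_rel=5.0):
    # Keep only strong pilots and order them from strongest to weakest.
    strong = sorted((r for r in rscp_dbm if r > th_abs), reverse=True)
    # Condition 1: more than th_n strong pilots.
    if len(strong) <= th_n:
        return False
    # Condition 2: the strongest pilot does not dominate the
    # (th_n + 1)th strongest pilot (index th_n in a 0-based list).
    return (strong[0] - strong[th_n]) < th_rel
```

With the example thresholds, four pilots at -80, -82, -83, and -84 dBm indicate pilot pollution, whereas a -70 dBm pilot with the same three weaker pilots does not, because the strongest pilot then leads the fourth by 14 dB.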
3. The Parameters of the Soft Handover Algorithm are not Set Correctly
You can adjust the handover algorithm to solve two types of problems: handover too
late and ping-pong handover.
Judging from the signaling flow, handover too late has the following symptom: For the
CS service, the UE does not receive the active set update message (for intra-frequency
hard handover, the UE does not receive the physical channel reconfiguration message).
The cause is as follows: The EcIo of the source cell decreases sharply after the
UE sends the measurement report, and the UE has already switched off its transmitter
because of downlink out-of-synchronization by the time the RNC sends the active set
update message. From the UE side, the active set update message is not received.
For the PS service, it is possible that the active set update message is not
received or that TRB reset occurs before the handover.
Judging from signals, handover too late has the following symptoms:

Corner effect: The EcIo of the source cell decreases sharply, and the EcIo of the
target cell increases sharply (increase to a high value suddenly).

Pinpoint effect: The EcIo of the source cell decreases sharply for some time and
then increases, and the EcIo of the target cell increases sharply within a short
period.
Judging from the signaling flow, the UE reports the 1a or 1c measurement report of
the neighboring cell before call drop, and the RNC receives the measurement report
and delivers the active set update message, but the UE does not receive the active
set update message.
Ping-pong handover has the following two symptoms:

The dominant cell changes quickly: Two or more cells become the dominant cell
alternately. The dominant cell has desirable RSCP and EcIo, and each cell acts as
the dominant cell for a short period.

There is no dominant cell: There are multiple cells, the RSCP is normal, the RSCP
difference is not great between the cells, and the EcIo of each cell is poor.
Judging from the signaling flow, you can see the following symptom: After a cell is
deleted, the 1A event of the cell is immediately reported and the active set update
message sent by the RNC cannot be received, thus causing the failure.

4. The Equipment (Including the UE) is Abnormal
Check whether there are abnormal alarms on the alarm subsystem, analyze the traced
messages, and determine at which step of the flow soft handover fails by querying
the failure message resolution. You can ask the local customer service and engineering
personnel to determine whether the equipment is abnormal. Note that UE exception
handling and unstable transmission quality are also common factors that
lower the handover success rate.
In the latest version, it is also possible that handover failure is caused by the version
quality. If the problem is not a known problem, you must ask the R&D personnel to
participate in the analysis.

4.1.3 Analysis Process


1. Discussing the Problem, Ascertaining the Problem Background and Product
Version, and Ruling Out the Possibility of Known Bugs
First ask the field personnel to feed back the related information and
symptoms of the problem, and then obtain the information about the known bugs of
the corresponding version by asking the contact persons of the RNC and NodeB or
referring to the Release Notes. In this way, you can determine whether the problem is
caused by a known version problem, for example, whether soft handover failure is
caused by abnormal power control of the version.
Determine the time at which the handover success rate changed, and analyze whether the
problem is caused by network adjustment (for example, adding or relocating sites),
focusing on the impacts of that adjustment. Experience shows that the soft
handover success rate of a mature commercial network rarely deteriorates
suddenly. If the KPI deteriorates severely in many areas, the cause is usually as
follows: The network is newly built or relocated, so some neighboring cells are not
configured. During the relocation, interoperations are performed between the Iur
interface of the local RNC and the Iur interface of the peer RNC. Therefore, the latest
actions performed on the network are critical information.

2. Narrowing Down the Analysis Scope, Analyzing Whether the Low Handover
Success Rate Is Caused by Certain Cells, and Analyzing the Performance Data
About Soft Handover Failure
If the preceding causes are ruled out, you need to analyze the performance data. The
performance data is one of the most important information sources for network
optimization and also the main evaluation criterion for network performance.
The handover-related performance data can be obtained at the RNC level and at the
cell level. The RNC-level performance data reflects the handover performance of the
whole network, and the cell-level performance data helps you locate the faulty cells.


The flow of soft handover includes the soft handover preparation process and soft
handover air-interface process. The preparation process indicates the process from
handover decision to completion of RL setup. The air-interface process refers to the
update process of the active set.
Check whether the soft handover success rate of the entire network and cells in busy
hours complies with the standard. If not, analyze the change in the handover success
rate of the cells and in the handover failure count and thus judge whether the problem
is caused by the performance worsening of certain cells.
Analyze the change tendency of the top N cells in the handover success rate and
handover failure count, compare them with the change tendency of the entire network
in the handover success rate and handover failure count, and thus judge whether the
top N cells can show the information about the handover success rate in the entire
network. If the top N cells show such information in the entire network, you can
address the top N cells.
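One simple way to judge whether the top N cells are representative is to measure the share of all handover failures that they account for. The sketch below is illustrative: the input dictionary (cell name to failure count) and the function name are assumptions, and the share at which the top N cells count as representative is left to the analyst.

```python
# Sketch: find the top N cells by handover failure count and the share of
# network-wide failures they account for. The input format is assumed.

def top_n_share(failures_by_cell, n):
    """Return (list of top-N cell names, their share of all failures)."""
    total = sum(failures_by_cell.values())
    if total == 0:
        return [], 0.0
    # Highest failure count first; ties broken by cell name.
    ranked = sorted(failures_by_cell.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ranked[:n]
    share = sum(count for _, count in top) / total
    return [cell for cell, _ in top], share
```

Running the same calculation for several consecutive periods shows whether the trend of the top N cells follows the trend of the entire network.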
Then, determine the main causes for the worsening (or noncompliance with
the standard) of the handover success rate, that is, find the counters of the main
handover failure causes.
Figure 1 lists the main failure cause counters for soft handover failure.
Figure 1 Indicators related to soft handover failure
Indicator: SHO.FailRLAddUESide.CfgUnsup
Description: Number of soft handover RL failures of the cells (the cause value
is Configuration unsupported).
The UE thinks that the active set update contents of adding or
deleting links by the RNC are not supported. Basically, the
scenario does not occur in a commercial network.

Indicator: SHO.FailRLAddUESide.Isr
Description: Number of soft handover RL failures of the cells (the cause
value is Incompatible simultaneous reconfiguration).
The UE feeds back that the soft or softer handover process of
adding or deleting links by the RNC is not compatible with
other concurrent processes. The RNC ensures serial
processing during the flow processing. The problem is caused
mainly because the processing of some UEs is defective.

Indicator: SHO.FailRLAddUESide.InvCfg
Description: Number of soft handover RL failures of the cells (the cause
value is Invalid configuration).
The UE thinks that the active set update contents of adding or
deleting links by the RNC are invalid. Basically, the scenario
does not occur in a commercial network.

Indicator: SHO.FailRLAddUESide.NoReply
Description: The RNC does not receive the response to the active set
update command of adding or deleting links. It is the main
cause of soft or softer handover failure in the network, and
mainly occurs in the area where the coverage quality is poor
or the handover area is small. You need to first consider RF
optimization. It is also possible that the equipment is
abnormal.

Indicator: Other
Description: Soft handover failure caused by other factors


For the change failure of the serving cell of HSDPA/HSUPA, no dedicated failure
cause counter is available. There is only one failure scenario: After the RNC sends
the PHYSICAL CHANNEL RECONFIGURATION message to the UE, the RNC does not receive
the PHYSICAL CHANNEL RECONFIGURATION COMPLETE message returned by the UE. You
need to analyze such failures through the CHR log and signaling tracing.
3.

Analyzing the Main Scenarios of Handover Failure


After you can basically lock the top N cells of handover failure and the performance
counter of handover failure, you need to analyze the CHR log and IOS data. The
analysis procedure is as follows:

Through the CHR analysis (by using the OMSTAR tool), judge whether soft
handover failures are concentrated on certain UEs (for details about the analysis
methods, see Chapter 7). If you can determine that the problem is caused by the
UEs of several specific IMSIs, you can query the IMEIs of those IMSIs.
The IMEI is a 15-digit number and is the hardware identification mark of the UE.
The first 8 digits indicate the vendor and model of the UE. If you can determine
that the problem usually occurs in UEs of a certain model, focus on the
compatibility of terminals.
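The IMEI check described above can be sketched as a small script. This is only an illustration: the IMEI values are made up, and the input is assumed to be a plain list with one IMEI per failed soft handover (how that list is exported from the CHR data is not covered here).

```python
from collections import Counter

def model_prefix(imei: str) -> str:
    """Return the first 8 digits of a 15-digit IMEI (vendor and model)."""
    digits = imei.strip()
    if len(digits) != 15 or not digits.isdigit():
        raise ValueError(f"not a 15-digit IMEI: {imei!r}")
    return digits[:8]

def failures_by_model(failed_imeis):
    """Count handover failures per 8-digit prefix, most frequent first."""
    return Counter(model_prefix(i) for i in failed_imeis).most_common()

# Hypothetical IMEIs, one entry per failed soft handover.
sample = [
    "353456010000017",
    "353456010000025",
    "353456010000033",
    "490154203237518",
]
ranking = failures_by_model(sample)
print(ranking)
```

If one prefix dominates the ranking, the failures are probably tied to one terminal model rather than to the network.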

Enable the IOS tracing of the top N cells to obtain the signaling of air-interface
failure. Through the signaling, you can infer whether the problem is an uplink
problem or downlink problem and analyze the failure scenarios from the IOS
signaling (for example, whether the problem is related to encryption or a specific
flow).

When finding the main failure scenarios, you can consider enabling the IFTS
tracing (with the L2 user plane) to further analyze the failure cause.

IFTS tracing enables you to trace the detailed information at the CDT user plane
level. For the handover failure in the soft handover preparation phase
(RL_ADD/RL_SETUP), for example, the resource request failure, you can obtain
the detailed print information about such failure. In the Uu Noreply scenario where
the ASU signaling message is sent but the ASU CMP message is not received, you
can also obtain the valid RLC-layer information from the user-plane message. As a
result, you can determine which of the following factors causes handover failure:

The UE downlink does not receive the ASU message.

The UE uplink returns the ASU CMP message, but the RNC does not receive the
ASU CMP message.
You can also trace the downlink BLER, RSCP, and Ec/No at the same time to know
the quality of the uplink/downlink signals at the time when the problem occurs,
thus helping you analyze the preceding RF problems (including the pilot pollution,
corner effect, and pinpoint effect).

IFTS tracing is similar to IOS tracing, which involves selecting eligible subscribers in
the specified cell at random for tracing. IFTS tracing enables you to trace the detailed
information at the CDT user-plane level, thus facilitating deep analysis. However, only
one subscriber can be selected in each cell at a time. If the KPIs (for example, the
handover success rate and call drop rate) are changed slightly, the effect of IFTS
tracing is not desirable and you can hardly trace the valid data. Usually, you can trace
the IOS information to know the main flow of the problem, and then trace the IFTS
data.
4.

Conducting Drive Tests on Site, and Analyzing the Causes Deeply


Normally, you can determine the causes of the general air-interface RF problems, soft
handover failure in the preparation phase, and FP synchronization failure by analyzing
the preceding CHR log, performance data, IOS data, and IFTS data. If the signal

quality is good during the handover and the Uu Noreply problem occurs, you need to
obtain more data. Sometimes, you need to conduct drive tests on site, reproduce the
handover failure, and obtain the QXDM log (including the L2 and L1 information) of
the UE for analysis.
The drive test must be well targeted. That is, you need to determine the main scenarios
or top N cells that affect the handover success rate. For example, you can determine
the handover that can be reproduced more easily:

Handover from cell A to cell B

Ping-pong handover between cell A and cell B


You can also determine to perform FTP download or ping small packets.
As a result, you can ensure high possibility of reproducing the handover failure
through the drive test and obtain the main failure signaling.
While conducting the drive test, enable NodeB CDT tracing and RNC CDT tracing
and analyze their signaling. Generally, Uu Noreply has the following symptoms:

The RNC does not send the ASU signaling to the NodeB effectively. For example,
packets are lost during the Iub transmission.

After receiving the ASU signaling, the NodeB does not send the ASU signaling
through the air interface successfully.

After the ASU signaling is sent from the air interface, the UE does not receive the
ASU signaling. This case is rare when the signal quality is good.

The UE does not send the ASU CMP message. This is a UE bug, and barely occurs
in existing commercial networks.

The UE sends the ASU CMP message, but the NodeB does not demodulate or
decode the message successfully.

The NodeB sends data, but the RNC L3 does not receive the data. For example,
packets are lost during the Iub transmission.
Obviously, only the 3rd and 4th symptoms may be caused by UE anomaly. If packets
are lost on the Iub interface, you can check the transmission quality through the
IPPM or VCLPM. If the transmission quality is poor, you need to solve the
transmission problem and check the effect. Other symptoms are mostly caused by
internal defects of the product, and require support from the R&D department.

4.1.4 Cases of Soft Handover Failure


1.

Problem Description
In June 2006, subscribers of the PCCW office in Hong Kong complained that call drop
easily occurred when they exited a tunnel. The performance data showed that the call
drop rate of the entire network did not increase greatly and the complaint was a
single-point complaint. The technical personnel conducted a drive test on site, and
captured the data at the RNC side and UE side. The analysis showed that the signals
outside the tunnel were strong but the UE did not report the 1A event, thus causing
the call drop.

2.

Problem Analysis
The analysis shows that the signal quality is not good in cell 486 of the active set, the
signal quality is good constantly in cell 472 of the monitoring set, and the conditions
for reporting the 1A event are met. For details, see Figure 1.

Figure 1 Cell signal quality

The UE never reports the 1A event. Finally, call drop occurs because the signal quality
of cell 486 is extremely poor. Why does the UE not report the 1A event? The possible
causes are as follows:

It is originally suspected that some neighboring cells are not configured, but you
can see cell 472 in the monitoring set. Therefore, the problem is surely not caused
because some neighboring cells are not configured.

Query the configured soft handover threshold, but no anomaly is found.

Is the UE abnormal? During the test, the UE can report the 1A event and other
cells, for example, the measurement report is shown in Figure 2.

Figure 2 Measurement report

In addition, the Huawei 636 UE encounters a similar problem, which indicates that the
problem is not limited to a single UE. Why does the UE not report the 1A event for
cell 472 when the UE is in serving cell 486? View the measurement control information
again, and you can find a difference between neighboring cell 472 and the other
neighboring cells: When cell 486 is configured with neighboring cell 472, CIO is set to
10; for other neighboring cells, CIO is set to 0.

Figure 3 CIO offset parameter

Is it the CIO configuration that causes the problem? According to the protocol, the
CIO is the cell individual offset used for event evaluation. That is, the UE adds the
CIO value to the measured CPICH value of the cell and uses the sum in its event
evaluation process.
The current measurement decision uses Ec/N0, whose value ranges from -24 dB to 0 dB.
In this case, CIO is set to 10, that is, 5 dB. When the signal quality of cell 472 is
good, the decision may be incorrect because the UE calculation overflows.
The terminal development personnel were asked about the impact of the CIO
configuration on the reporting of the measurement report by the UE. They obtained
Qualcomm's answer: In the 636 UE, the CIO configuration triggers a bug when the
UE reports the measurement report. In the 526 UE, the problem has been solved.
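The arithmetic behind the suspected overflow can be illustrated with a minimal sketch. It assumes that the CIO setting is expressed in 0.5 dB steps (so that 10 corresponds to 5 dB, as stated above) and that Ec/N0 is reported in the range -24 dB to 0 dB. The sample Ec/N0 values are hypothetical, and the code only shows when the sum leaves the range; it does not model how a specific UE behaves in that case.

```python
ECNO_MIN_DB = -24.0   # lower end of the CPICH Ec/N0 range
ECNO_MAX_DB = 0.0     # upper end of the CPICH Ec/N0 range
CIO_STEP_DB = 0.5     # assumption: a CIO setting of 10 corresponds to 5 dB

def evaluated_ecno(measured_db: float, cio_setting: int) -> float:
    """Value used for event evaluation: measured Ec/N0 plus the CIO."""
    return measured_db + cio_setting * CIO_STEP_DB

def exceeds_range(measured_db: float, cio_setting: int) -> bool:
    """True if the evaluated value falls outside the Ec/N0 range."""
    value = evaluated_ecno(measured_db, cio_setting)
    return not (ECNO_MIN_DB <= value <= ECNO_MAX_DB)

# Hypothetical measurements of a neighboring cell configured with CIO = 10.
for ecno in (-8.0, -4.0):
    print(ecno, evaluated_ecno(ecno, 10), exceeds_range(ecno, 10))
```

With CIO set to 10, any neighboring cell measured above -5 dB yields a sum above 0 dB, which matches the observation that the problem appears only when the signal quality of the neighboring cell is good.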
3.

Conclusion
If CIO is set to an extremely large value, the UE does not report the 1A event, which is
a UE bug. During the event evaluation of the UE, the UE calculation may overflow
at the time of measurement decision if CIO is configured, thus causing the
decision error.
Qualcomm admits that the Huawei 636 UE has the bug, and also points out that the
problem has been solved in the Huawei 526 UE.

4.2 Problems Related to Hard Handover Success Rate

4.2.1 KPI Definition
1.

Hard Handover Success Rate of CS Service and PS R99 Service


VS.HHO.InterFreq.Out.Cell.Rate = <VS.HHO.InterFreq.SuccOut> /
<VS.HHO.InterFreq.AttOut>
VS.HHO.InterFreq.In.Cell.Rate = <VS.HHO.InterFreqIn.Succ> /
<VS.HHO.InterFreqIn.Att>

2.

Change Success Rate of HSDPA Serving Cell (Inter-Frequency Handover)


VS.HSDPA.ServCellChg.Succ.Rate = <VS.HSDPA.SHO.CellChg.SuccOut> /
<VS.HSDPA.SHO.CellChg.AttOut>

3.

Change Success Rate of HSUPA Serving Cell (Inter-Frequency Handover)


VS.HSUPA.SHO.ServCellChg.Succ.Ratio = <VS.HSUPA.SHO.ServCellChg.Succ> /
<VS.HSUPA.SHO.ServCellChg.Att>

4.2.2 Influence Factors


The following factors affect the hard handover success rate:
1.

Some Neighboring Cells Are Not Configured


Like soft handover failure, it is one of the common causes of inter-frequency hard
handover failure that some neighboring cells are not configured. For details about the
troubleshooting method, see Chapter 7.

2.

The Inter-Frequency Handover Threshold or Compression Mode Threshold Parameter Is Set Improperly, So Handover Does Not Occur in a Timely Manner

Inter-frequency measurement may use the compression mode (some UEs, for example,
some Motorola terminals, have dual receivers, so they can measure the inter-frequency
signals without enabling the compression mode). When the UE enters the CELL_DCH
state or the best cell is updated, you need to configure the measurement of the 2D
and 2F events if the inter-frequency handover algorithm is enabled and the best cell
has the inter-frequency neighboring cell list. The absolute thresholds of 2D and 2F
are the enabling/disabling thresholds of inter-frequency measurement. The CPICH Ec/No
or RSCP measurement quantity and threshold are used according to the location
properties of the best cell in the active set. If the measured quality is below the
enabling threshold, the 2D event is reported and periodical inter-frequency
measurement is enabled through a decision. If the quality of the active set increases
to be higher than the disabling threshold, the 2F event is reported and
inter-frequency measurement is disabled.
The compression mode usually affects link quality and system capacity. Therefore, it is
recommended that inter-frequency measurement should not be enabled unless
necessitated. If the enabling threshold of the compression mode is extremely low, it is
difficult to enable the compression mode. As a result, call drop occurs in the existing
network because it is too late to trigger hard handover.
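The enabling and disabling behavior of the 2D and 2F events can be sketched as a two-threshold switch. This is a simplified illustration: it ignores hysteresis and time-to-trigger, the RSCP samples are hypothetical, and the threshold values of -95 dBm (2D) and -92 dBm (2F) are used only as example settings.

```python
def update_measurement_state(active: bool, quality: float,
                             thd_2d: float, thd_2f: float) -> bool:
    """Return whether inter-frequency measurement is active after one sample."""
    if not active and quality < thd_2d:
        return True    # 2D reported: enable inter-frequency measurement
    if active and quality > thd_2f:
        return False   # 2F reported: disable inter-frequency measurement
    return active      # no event: keep the current state

# Hypothetical RSCP samples (dBm) of the best cell while the UE moves.
samples_dbm = [-90, -94, -96, -93, -91]
active = False
states = []
for rscp in samples_dbm:
    active = update_measurement_state(active, rscp, thd_2d=-95, thd_2f=-92)
    states.append(active)
print(states)
```

Lowering the 2D threshold delays the moment at which the state switches to active, which is the "too late to trigger hard handover" situation described above.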

3.

The Inter-Frequency Measurement Quantity Is Not Selected Correctly, So Inter-Frequency Measurement Cannot Be Initiated in a Timely Manner

Sometimes, a commercial network encounters the following inter-frequency handover
failure: When the UE moves toward an inter-frequency cell, the compression mode is
never enabled to initiate inter-frequency measurement, and the UE accesses the
inter-frequency cell again only after call drop occurs.
Query the cell configuration, and you can find that the cell is configured to the TRX
center cell. That is, the 2D event, 2F event and inter-frequency measurement use
Ec/N0 as the measurement quantity.
The measured value of the pilot Ec/N0 depends on two factors: RSCP strength of the
pilot signals and downlink interference. For the WCDMA system, the downlink
interference mainly includes the downlink signal interference of the intra-frequency
cells (the current cell and neighboring cells) and background noise. The strength of
downlink interference of the intra-frequency cells is affected by the path loss and
slow fading. It is similar to the fading that is undergone by the wanted signals (for
example, the CPICH RSCP) to be received by the UE. At the coverage edge of a TRX,
when the UE moves from the TRX cell in use to another TRX cell, the CPICH RSCP and
interference fade at almost the same speed (the background noise is not affected by
the path loss, so the CPICH RSCP fades a little faster). Therefore, the CPICH Ec/I0
received by the UE changes extremely slowly. Both emulation tests and actual tests
show that the CPICH Ec/I0 can still reach about -12 dB when the CPICH RSCP received
by the UE is about -110 dBm.
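The reasoning above can be checked with a rough link-budget sketch. All power values are assumptions chosen for illustration (CPICH power, combined transmit power of the intra-frequency cells, and noise floor), and the same path loss is applied to the pilot and to the intra-frequency interference, as the text describes.

```python
import math

CPICH_TX_DBM = 33.0    # assumed CPICH transmit power
TOTAL_TX_DBM = 46.0    # assumed combined power of the intra-frequency cells
NOISE_DBM = -101.0     # assumed background noise at the UE receiver

def dbm_to_mw(dbm: float) -> float:
    return 10 ** (dbm / 10)

def mw_to_dbm(mw: float) -> float:
    return 10 * math.log10(mw)

def rscp_dbm(path_loss_db: float) -> float:
    """Received pilot power after the path loss."""
    return CPICH_TX_DBM - path_loss_db

def ecio_db(path_loss_db: float) -> float:
    """Pilot power relative to total received power (interference + noise)."""
    interference_mw = dbm_to_mw(TOTAL_TX_DBM - path_loss_db)
    i0_dbm = mw_to_dbm(interference_mw + dbm_to_mw(NOISE_DBM))
    return rscp_dbm(path_loss_db) - i0_dbm

for loss in (113.0, 143.0):
    print(loss, rscp_dbm(loss), round(ecio_db(loss), 1))
```

Under these assumptions, a 30 dB drop in RSCP changes Ec/I0 by less than 2 dB, so an Ec/I0-based threshold reacts much later than an RSCP-based one at the edge of the carrier coverage.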

Figure 1 Relation between RSCP fading and Ec/N0 fading

If Ec/I0 is used as the measurement quantity of the 2D event, the 2D event is probably
not triggered when call drop occurs in the UE. As a result, inter-frequency
measurement is not started.
In this case, you need to configure the cell to a TRX edge cell and use the RSCP as the
measurement quantity of the 2D or 2F event to initiate inter-frequency measurement
timely.
In the RAN10 or later versions, the RNC uses the Both mode of Ec/N0 and RSCP as the
inter-frequency measurement quantity by default, thus solving the problem fundamentally.
In the versions earlier than the RAN10 (for example, the V29 and V18), the
inter-frequency measurement quantity must be set correctly.

4.2.3 Analysis Process


Basically, inter-frequency handover failure can be analyzed by using the same way as soft
handover failure, especially when the handover success rate of the commercial networks
decreases. The method of analysis is usually as follows:
1.

Discussing the Problem, Ascertaining the Problem Background and Product Version, and Ruling Out the Possibility of Known Bugs

Like the analysis of soft handover failure, you need to rule out the known bugs, know
the recent actions performed on the network (for example, relocation and upgrade),
and compare the script before the occurrence of the problem with the script after the
occurrence of the problem.

2.

Narrowing Down the Analysis Scope, Analyzing Whether the Low Handover
Success Rate Is Caused by Certain Cells, and Analyzing the Performance Data
About Inter-Frequency Handover Failure
When analyzing the performance data, you need to check whether the inter-frequency
handover failures of the top N cells account for the majority of the total
inter-frequency handover failures in the entire network, and determine which type of
counter is related to the inter-frequency handover failures in the network.
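The top N check can be expressed as a share calculation. The sketch below is illustrative only: the cell names and failure counts are hypothetical, and the input is assumed to be a mapping from cell to its number of inter-frequency handover failures.

```python
def top_n_share(failures_by_cell: dict, n: int) -> float:
    """Share of all failures contributed by the n worst cells."""
    total = sum(failures_by_cell.values())
    if total == 0:
        return 0.0
    worst = sorted(failures_by_cell.values(), reverse=True)[:n]
    return sum(worst) / total

# Hypothetical failure counts per cell for one day.
sample = {"cell_A": 40, "cell_B": 30, "cell_C": 5, "cell_D": 5}
share = top_n_share(sample, 2)
print(f"top 2 cells: {share:.1%} of all failures")
```

A high share points to a cell-level problem; a low share suggests a network-wide cause such as a parameter or version change.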

Figure 1 lists the performance counter causes related to inter-frequency hard handover
failure.
Figure 1 Indicators related to inter-frequency hard handover failure
Number  Indicator                         Description
1       VS.HHO.InterFreqOut.CfgUnsupp     Configuration unsupported
2       VS.HHO.InterFreqOut.PyhChFail     Physical channel failure
3       VS.HHO.InterFreqOut.FailUSR       Incompatible simultaneous reconfiguration
4       VS.HHO.InterFreqOut.CellUpdt      Cell update occurred
5       VS.HHO.InterFreqOut.CfgInvalid    Invalid configuration
6       VS.HHO.InterFreqOut.NoReply       No response on the air interface
7       VS.HHO.InterFreqOut.DLCodeRej     Failure of inter-frequency hard handover from the current cell because of the failure of downlink code resource allocation
8       VS.HHO.InterFreqOut.ULAdmsnDeny   Failure of inter-frequency hard handover out of the current cell because of the uplink admission denial
9       VS.HHO.InterFreqOut.DLAdmsnDeny   Failure of inter-frequency hard handover out of the current cell because of the downlink admission denial
10      Other                             Failure of inter-frequency hard handover because of other factors


The 1st to 5th indicators all have the following feature: During inter-frequency hard
handover, after the RNC receives the PHYSICAL CHANNEL RECONFIGURATION
FAILURE message returned by the UE, the cause values carried in the message are
measured in the best cells of the UE respectively before inter-frequency hard handover
occurs. The problems seldom occur in the commercial networks. Some parameters
configured on the UE are not compatible with those configured in the network.
The 6th indicator has the following feature: During inter-frequency hard handover, the
RNC starts the timer to wait for the response from the UE after the RNC sends the
PHYSICAL CHANNEL RECONFIGURATION message to the UE; if the RNC does
not receive the response from the UE when the timer times out, the indicator is
measured in the best cell of the UE before inter-frequency hard handover occurs. Such
problems often occur in the commercial networks. A substantial part of such problems
occur under the scenario where the signal quality of the handover area is poor.
Therefore, you need to first check the signal quality of the current cell and target cell
at the time of handover and optimize the RF. It is also possible that the equipment is
abnormal.
The 7th to 9th indicators all have the following feature: When the RNC receives the
measurement report sent by the UE, the RNC initiates the inter-frequency hard
handover request if the decision conditions of inter-frequency hard handover are met.
After entering the inter-frequency hard handover flow, the RNC needs to make the
decision on target cell admission. If hard handover fails because of the failure of cell
admission, the indicators are measured in the best cells of the UE respectively before
inter-frequency hard handover occurs according to the cause of cell admission failure.
For such problems, you need to first check whether the resources of the target cell are
really congested. If a resource is really congested, you need to expand the capacity as
soon as possible. If no obvious resource congestion occurs but the admission of hard
handover fails, you can suspect that the network has bugs. Then, you can trace the
CDT information to see what causes the failure of admission.
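The three groups of indicators and their first checks can be summarized in a small triage sketch. The counter names are taken from the list of indicators above; the group labels, the wording of the checks, and the sample counts are illustrative assumptions.

```python
from collections import defaultdict

UE_FAILURE = "UE returned PHYSICAL CHANNEL RECONFIGURATION FAILURE"
NO_REPLY = "no response on the air interface"
ADMISSION = "target cell admission failure"

COUNTER_GROUP = {
    "VS.HHO.InterFreqOut.CfgUnsupp": UE_FAILURE,
    "VS.HHO.InterFreqOut.PyhChFail": UE_FAILURE,
    "VS.HHO.InterFreqOut.FailUSR": UE_FAILURE,
    "VS.HHO.InterFreqOut.CellUpdt": UE_FAILURE,
    "VS.HHO.InterFreqOut.CfgInvalid": UE_FAILURE,
    "VS.HHO.InterFreqOut.NoReply": NO_REPLY,
    "VS.HHO.InterFreqOut.DLCodeRej": ADMISSION,
    "VS.HHO.InterFreqOut.ULAdmsnDeny": ADMISSION,
    "VS.HHO.InterFreqOut.DLAdmsnDeny": ADMISSION,
}

FIRST_CHECK = {
    UE_FAILURE: "check UE compatibility with the network configuration",
    NO_REPLY: "check signal quality of the current and target cells",
    ADMISSION: "check whether target cell resources are congested",
}

def dominant_group(counts: dict) -> str:
    """Return the failure group with the largest total count."""
    totals = defaultdict(int)
    for counter, value in counts.items():
        totals[COUNTER_GROUP.get(counter, "other")] += value
    return max(totals, key=totals.get)

# Hypothetical daily counts for one cell.
sample = {
    "VS.HHO.InterFreqOut.NoReply": 120,
    "VS.HHO.InterFreqOut.DLCodeRej": 10,
    "VS.HHO.InterFreqOut.CfgUnsupp": 2,
}
group = dominant_group(sample)
print(group, "->", FIRST_CHECK.get(group, "analyze CHR and IOS data"))
```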
3.

Analyzing the Main Scenarios of Handover Failure


Like the analysis of soft handover failure, you need to also analyze the CHR log and
IOS or IFTS data after basically locking the top N cells of handover failure and the
performance counter of handover failure.

If the RNC receives the PHYSICAL CHANNEL RECONFIGURATION FAILURE message
returned by the UE, you can basically determine in which scenario some parameters
configured on the UE are not compatible with those configured on the network
according to the failure cause fed back by the UE. For such problems, you can
basically suspect the compatibility of the UE. You can use the CHR data to analyze
the corresponding IMSIs (for details, see Chapter 7) and inquire the customer about
the corresponding IMEIs of the IMSIs, thus judging whether the problems are caused
by the terminals of a specific model.

If no response is received on the air interface, you need to first judge whether the
signal quality of the current cell and target cell is normal at the time when the
problem occurs. If both the Ec/No of the current cell and the Ec/No of the target
cell are lower than -13 dB, it is possible that the UE does not receive the PHYSICAL
CHANNEL RECONFIGURATION message delivered by the network. Therefore, you can
preliminarily determine that the problem is caused by poor coverage, and you need to
optimize the network coverage. If the signal quality is good at the time of handover
but no response is received on the air interface, you can query the IFTS user-plane
information to further determine which of the following occurs:

The UE does not receive the PHYSICAL CHANNEL RECONFIGURATION message delivered
by the network.

The RNC does not receive the PHYSICAL CHANNEL RECONFIGURATION CMP message
returned by the UE.
Like the symptom that no response is received on the air interface during soft
handover failure, the problems may be caused by the quality defect of the network
version. To analyze the problem deeply, you may need to obtain more UE and
NodeB data.

For the admission denial of the target cell during hard handover, you need to trace
the detailed IFTS/CDT data to find the cause of admission denial.

4.

Conducting Drive Tests on Site, and Analyzing the Causes Deeply


Except the scenario under which no response is received on the air interface when the
signal quality is good, you can basically determine the cause through the preceding
analysis under other scenarios. If no response is received on the air interface when the
signal quality is good, it is possible that the network equipment is abnormal or the UE
processing is abnormal. Therefore, the field personnel need to conduct drive test and
test the top N cells to obtain the RNC CDT, NodeB CDT, and UE QXDM log data for
detailed analysis. The purpose is to determine at which step errors occur:

RNC

Iub transmission

NodeB

UE

The R&D personnel need to focus their analysis on such problems.

4.2.4 Cases of Inter-Frequency Hard Handover Failure

1.

Problem Description
In February 2009, the field personnel of the Orange site in Moldova reported that two
KPI-related problems occurred after the NodeB was upgraded to the V110 053 (note:
The local time was the evening of February 6th, and the NodeB was upgraded in the
morning of February 7th).

The inter-frequency handover success rate decreasing by 1%

Analyze the performance data, and you can find that the main cause counter of the
failure is VS.HHO.InterFreqOut.NoReply. In addition, no obvious top N cells are
available. Therefore, you can determine that the problem is a global problem and
is not caused by certain cells.

Figure 1 Inter-frequency handover failure

Cell Group    Time (As day)   VS.HHO.InterFreq.Fail.Cell.Rate
053 Cluster   2009-2-4        0.64%
053 Cluster   2009-2-5        0.91%
053 Cluster   2009-2-6        0.91%
053 Cluster   2009-2-7        1.96%
053 Cluster   2009-2-8        1.63%
053 Cluster   2009-2-9        1.87%
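The size of the degradation can be confirmed directly from the daily values above. The sketch below splits the six days at the upgrade date (February 7th) and compares the averages; only the split into "before" and "after" lists is an interpretation.

```python
# Daily VS.HHO.InterFreq.Fail.Cell.Rate values of the 053 cluster, in percent.
daily_fail_rate = {
    "2009-2-4": 0.64,
    "2009-2-5": 0.91,
    "2009-2-6": 0.91,
    "2009-2-7": 1.96,
    "2009-2-8": 1.63,
    "2009-2-9": 1.87,
}

before = [daily_fail_rate[d] for d in ("2009-2-4", "2009-2-5", "2009-2-6")]
after = [daily_fail_rate[d] for d in ("2009-2-7", "2009-2-8", "2009-2-9")]

before_avg = sum(before) / len(before)
after_avg = sum(after) / len(after)
delta = after_avg - before_avg

print(f"before: {before_avg:.2f}%  after: {after_avg:.2f}%  change: {delta:+.2f}")
```

The failure rate rises by about one percentage point, which is consistent with the reported 1% decrease in the inter-frequency handover success rate.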

The CS call drop rate increasing by 0.5% from 0.6% to an average of 1.1%

Figure 2 CS call drop rate

2.

Problem Analysis

Analyze the CS call drop rate through the performance data and CHR data. You can
basically determine that the problem is caused by a sharp increase of the
VS.RAB.Loss.PS.RF.UuNoReply value. Ask the field personnel to trace the IOS
data. The IOS analysis shows 17 call drops: the ASU timeout occurs 3 times and the
failure occurs 8 times after the compression mode is enabled, which is a high
proportion. You need to focus on the failure scenario after the compression mode is
enabled. You can find that the failure scenario is basically the same as the scenario
in the following figure. After the compression mode is enabled, the signal quality of
the current cell is poor (in the following figure, the 1A measurement report is
received after the compression mode is enabled, and the Ec/No of both the current
cell and neighboring cell is about -15 dB). Subsequently, the signal quality of the
inter-frequency neighboring cells is not measured. Because of the poor signal
quality, synchronization fails and thus call drop occurs. Now, you can basically
associate the CS call drop with the worsening of the inter-frequency handover success
rate. That is, the CS call drop count increases mainly because of the failure of
inter-frequency hard handover.

Now, you need to answer the following questions: Why is the signal quality already
so poor when the compression mode is enabled? Why does call drop occur upon the
out-of-step of the air interface before the inter-frequency cell signals are even
measured? Does the signal quality fluctuate dramatically? Is it too late to enable the
compression mode?
The real-time measurement data of the downlink RSCP and Ec/No is not traced
when the IOS data is traced, so it is difficult to answer the question whether the
signal quality fluctuates dramatically. You can only ask the field personnel to
trace such data next time.
To determine whether the compression mode is enabled at an appropriate time,
query the measurement control message of the compression mode for inter-frequency
handover. In the RAN10, the Both measurement quantity is used and there are two
measurement control messages. For the first measurement control message, the
measurement quantity is Ec/No, and the configured 2D and 2F thresholds and delay
are the default values. For the second measurement control message, the
measurement quantity is RSCP, and the configured 2D threshold and 2F threshold are
-100 dBm and -97 dBm respectively, both of which are 5 dB lower than the default
values (2D: -95 dBm and 2F: -92 dBm). This is surely the reason why it is too late
to initiate inter-frequency hard handover and the call drop count increases.

At this step, you can basically determine the cause of the problem. The field
personnel only upgraded the NodeB, and the parameters of the RNC were not changed
by the upgrade. Why do the field personnel report that the indicator deteriorated
after the NodeB was upgraded?
Query the previous script and operation log. Then, you can find the record of
modifying the inter-frequency handover threshold on the day of the upgrade.
Finally, the real cause is clear. The field personnel report that the modification
was performed by the customer and was not known to the customer service personnel.
[375083], [ admin], [ 1], [ 172.16.106.48], [ 18670], [ Y2009M02D07H11N36S39],
[ Y2009M02D07H11N36S40], [ 1], [ 0], [ 1], [SET INTERFREQHOCOV:
InterFreqCSThd2DRSCP=-100, InterFreqCSThd2FRSCP=-97, InterFreqR99PsThd2DRSCP=-100,
InterFreqHThd2DRSCP=-100, InterFreqR99PsThd2FRSCP=-97, InterFreqHThd2FRSCP=-100,
TargetFreqCsThdRscp=-97, TargetFreqR99PsThdRscp=-97, TargetFreqHThdRscp=-97;]
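A check of this kind can be automated when comparing operation logs with baseline values. The sketch below parses the parameter assignments of a SET INTERFREQHOCOV command and reports the deviation from the default RSCP thresholds quoted in the analysis (2D: -95 dBm, 2F: -92 dBm). The command string is shortened to the two CS thresholds, its values are written with negative signs as an assumption, and mapping the quoted defaults to these two parameter names is also an assumption.

```python
import re

# Shortened command; negative signs are assumed for both values.
COMMAND = ("SET INTERFREQHOCOV: InterFreqCSThd2DRSCP=-100, "
           "InterFreqCSThd2FRSCP=-97;")

# Defaults quoted in the analysis; the parameter mapping is an assumption.
DEFAULTS = {"InterFreqCSThd2DRSCP": -95, "InterFreqCSThd2FRSCP": -92}

def parse_params(command: str) -> dict:
    """Extract NAME=VALUE pairs with integer values from an MML command."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+)=(-?\d+)", command)}

def deviations(command: str, defaults: dict) -> dict:
    """Return configured-minus-default for every parameter that differs."""
    params = parse_params(command)
    return {name: params[name] - default
            for name, default in defaults.items()
            if name in params and params[name] != default}

print(deviations(COMMAND, DEFAULTS))
```

Both thresholds come out 5 dB below the defaults, which matches the finding that the compression mode was enabled too late.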

Figure 1 Comparison of handover parameters

3.

Conclusion
The real cause of the problem is as follows: The customer modified the inter-frequency
hard handover threshold without Huawei's prior consent, so the compression mode was
enabled too late and thus it was too late to initiate inter-frequency hard handover. To
restore the inter-frequency hard handover success rate and CS call drop rate to the
original level before the NodeB was upgraded, it is recommended that the customer
modify the parameter to the default value.

4.3 List of Problem Information

Checklist for KPI Troubleshooting-4.3 .xls

5 Problems Related to Call Drop (AMR/PS/VP/HSPA)

The call drop rate is a key indicator for assessing the network performance, and is also
among the first concerns of the customer. If the call drop rate increases, the problem is
usually urgent.
In a broad sense, the call drop rate includes the call drop rate of the CN and the call drop
rate of the UTRAN. You need to focus on the call drop rate at the UTRAN side. Chapter 5
focuses on the KPIs related to the call drops at the UTRAN side, but not the call drops
caused by handover failure.

5.1 KPI Definition


Formulas on the call drop rate (cell-level indicator):
VS.PS.Call.Drop.Cell.Rate = (<VS.RAB.Loss.PS.RF> + <VS.RAB.Loss.PS.Abnorm>) /
( <VS.RAB.Loss.PS.RF> + <VS.RAB.Loss.PS.Abnorm> + <VS.RAB.Loss.PS.Norm> )
VS.CS.AMR.Call.Drop.Cell.Rate = <VS.RAB.Loss.CS.AMR> / (<VS.RAB.Loss.CS.AMR> +
<VS.RAB.Loss.CS.Norm.AMR>)
VS.CS.VP.Call.Drop.Cell.Rate = <VS.RAB.Loss.CS.Conv64K> /
(<VS.RAB.Loss.CS.Conv64K> + <VS.Norm.Rel.CS.Conv.RB.64>)
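The formulas above translate directly into code. In the sketch below, the counter names are those used in the formulas; the sample counter values are hypothetical, and returning None when no RAB was released is a design choice of this example, not part of the KPI definition.

```python
def ratio(numerator: float, denominator: float):
    """Return numerator / denominator, or None if there is nothing to count."""
    return numerator / denominator if denominator else None

def ps_call_drop_rate(c: dict):
    abnormal = c["VS.RAB.Loss.PS.RF"] + c["VS.RAB.Loss.PS.Abnorm"]
    return ratio(abnormal, abnormal + c["VS.RAB.Loss.PS.Norm"])

def amr_call_drop_rate(c: dict):
    dropped = c["VS.RAB.Loss.CS.AMR"]
    return ratio(dropped, dropped + c["VS.RAB.Loss.CS.Norm.AMR"])

def vp_call_drop_rate(c: dict):
    dropped = c["VS.RAB.Loss.CS.Conv64K"]
    return ratio(dropped, dropped + c["VS.Norm.Rel.CS.Conv.RB.64"])

# Hypothetical counter values for one cell.
counters = {
    "VS.RAB.Loss.PS.RF": 6,
    "VS.RAB.Loss.PS.Abnorm": 4,
    "VS.RAB.Loss.PS.Norm": 990,
    "VS.RAB.Loss.CS.AMR": 5,
    "VS.RAB.Loss.CS.Norm.AMR": 995,
    "VS.RAB.Loss.CS.Conv64K": 0,
    "VS.Norm.Rel.CS.Conv.RB.64": 0,
}
print(ps_call_drop_rate(counters), amr_call_drop_rate(counters),
      vp_call_drop_rate(counters))
```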

5.2 Influence Factors


A great diversity of factors may cause call drop in a radio network. The chapter focuses on
the handover-unrelated call drop.
1.

Poor Coverage
For the voice service, call drop may be caused by poor coverage when the EcIo of the
CPICH is higher than -14 dB and the RSCP is higher than -100 dBm. Usually, poor
coverage indicates that the RSCP is poor. Figure 1 lists the requirements for the
planned outdoor EcIo and Ec (the data comes from the network planning result of an
operator, and is only for your reference).

Figure 1 Requirements for the EcIo and Ec threshold

Service   Bit rate of service (kbit/s)   DL EbNo (dB)   EcIo thresholds (dB)   Ec thresholds (dBm)
CS 12.2   12.2                           8.7            -13.3                  -103.1
CS 64     64                             5.9            -11.9                  -97.8
PS 64     64                             5.1            -12.7                  -98.1
PS 128    128                            4.5            -13.3                  -95.3
PS 384    384                            4.6            -10.4                  -90.6


To determine whether uplink coverage or downlink coverage is poor, you need to
query the dedicated channel power of the uplink or downlink before call drop. The
method is as follows:
You can basically determine that call drop is caused by poor uplink coverage under the
following circumstances:

The uplink transmit power increases to the maximum value before call drop.

The uplink BLER is poor or the single-subscriber tracing data recorded by the RNC
shows that the NodeB reports the failure.
You can basically determine that call drop is caused by poor downlink coverage
under the following circumstances:

The downlink transmit power increases to the maximum value before call drop.

The downlink BLER is poor.
If the uplink balances the downlink and there is no interference on the uplink or
downlink, the uplink transmit power and downlink transmit power are limited at the
same time. In this case, you do not need to strictly determine which is limited
first. In case of severe imbalance between the uplink and the downlink, you can
preliminarily determine that interference exists in the limited direction.
To determine whether the problem is caused by poor coverage, the simplest method
is to directly observe the traced measurement data. If both the RSCP and EcIo of
the best cell are low, you can determine that the problem is caused by poor
coverage.
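The uplink/downlink judgment above can be written as a small decision helper. This is a sketch under one assumption: either symptom in a direction (transmit power at its maximum, or poor BLER) is treated as sufficient to mark that direction as limited. The function and argument names are illustrative.

```python
def limited_direction(ul_tx_at_max: bool, ul_bler_poor: bool,
                      dl_tx_at_max: bool, dl_bler_poor: bool) -> str:
    """Classify which direction is limited before a call drop."""
    uplink = ul_tx_at_max or ul_bler_poor
    downlink = dl_tx_at_max or dl_bler_poor
    if uplink and downlink:
        return "both"       # balanced links limited at the same time
    if uplink:
        return "uplink"     # suspect uplink coverage or uplink interference
    if downlink:
        return "downlink"   # suspect downlink coverage or downlink interference
    return "none"

print(limited_direction(True, True, False, False))
```

A result of "uplink" or "downlink" alone indicates an imbalance, in which case interference in that direction should be examined in addition to coverage.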
Poor coverage is caused for the following reasons:

Sites are not enough

Sectors are connected incorrectly

Sites are switched off because of the faults of power amplifiers.


In some indoor space, the overhigh penetration loss can also cause poor coverage.
Sometimes, sectors are connected incorrectly or sites are switched off because of
faults, which also occurs in the existing network. For example, the coverage of
other cells is poor at the point of call drop. You need to discriminate the reasons
from each other.

2.

Call Drop Caused by Interference


Both uplink interference and downlink interference cause call drop. Normally, you can
basically determine that the problem is caused by downlink interference if call drop
occurs when the CPICH RSCP of the active set is higher than -85 dBm and the
general EcIo of the active set is lower than -13 dB. If handover is not initiated timely,
it is also possible that the RSCP of the serving cell is good but the EcIo of the
serving cell is poor, while both the RSCP and EcIo of the cells in the monitoring set
are good. If the uplink RTWP exceeds the normal value (-107 dBm to -105 dBm)
by 10 dB and the interference lasts more than 2 to 3 seconds, call drop may occur. This
problem must be given priority.
Usually, the downlink interference refers to pilot pollution, that is, more than three
cells satisfy the handover conditions in the coverage area. The fluctuation of signals
usually causes the replacement of the active set or the change of the best cell. When
the overall quality of the active set is not good (the EcIo of the CPICH usually
fluctuates around -10 dB), handover may fail easily, thus causing SRB reset or TRB
reset.
The uplink interference raises the uplink transmit power of the UE in connected
mode. As a result, the excessive BLER causes SRB reset or TRB reset, or call drop
occurs because the link goes out of step. Additionally, at the time of handover, the
newly established link cannot be synchronized because of the uplink interference. The
uplink interference comes from outside the system or inside the system. In most
scenarios, the uplink interference comes from outside the system.
Usually, the uplink balances the downlink if there is no interference, that is, both the
uplink transmit power and downlink transmit power are close to their
maximum values before call drop. In case of downlink interference, the uplink
transmit power is low or the uplink BLER is converged, but the downlink transmit power
reaches its maximum value and the downlink BLER is not converged. In case of
uplink interference, the mirror-image symptom appears: the downlink is normal but the
uplink transmit power reaches its maximum value and the uplink BLER is not converged.
You can use this method to analyze the actual problem.
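The thresholds in this section can be combined into a rough first-pass classifier.
The following Python sketch is only an illustration under stated assumptions: the
DropSample fields and the function name are hypothetical, the poor-coverage limits
are placeholders because the guide gives no numeric values for "low" RSCP and EcIo,
and the RTWP check assumes the upper end of the normal range (-105 dBm) plus 10 dB
held for at least 2 seconds.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
DL_RSCP_GOOD_DBM = -85.0      # active-set CPICH RSCP above this is "good"
DL_ECIO_POOR_DB = -13.0       # active-set EcIo below this is "poor"
RTWP_NORMAL_MAX_DBM = -105.0  # upper end of the normal RTWP range
RTWP_RISE_DB = 10.0
RTWP_MIN_DURATION_S = 2.0     # assumption: lower end of "2 to 3 seconds"

# Placeholder assumptions: the guide does not give numeric limits for "low".
POOR_RSCP_DBM = -100.0
POOR_ECIO_DB = -14.0


@dataclass
class DropSample:
    """Measurements taken just before a call drop (hypothetical record layout)."""
    rscp_dbm: float
    ecio_db: float
    rtwp_dbm: float
    rtwp_high_s: float  # how long the RTWP stayed at the raised level


def classify_drop(sample: DropSample) -> str:
    """Return a preliminary cause label for one call drop."""
    rtwp_limit = RTWP_NORMAL_MAX_DBM + RTWP_RISE_DB
    if sample.rtwp_dbm >= rtwp_limit and sample.rtwp_high_s >= RTWP_MIN_DURATION_S:
        return "uplink interference"
    if sample.rscp_dbm > DL_RSCP_GOOD_DBM and sample.ecio_db < DL_ECIO_POOR_DB:
        return "downlink interference"
    if sample.rscp_dbm < POOR_RSCP_DBM and sample.ecio_db < POOR_ECIO_DB:
        return "poor coverage"
    return "undetermined"
```

A label of "undetermined" means that the sample needs the manual analysis described
in this chapter; the sketch does not replace it.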
3. Abnormal Transmission
The call drop rate may increase because of the following factors:

The transmission equipment is abnormal.

Packets are lost.

Delay jitter exists.

As networks move toward all-IP, the IP-based networking of some sites cannot
provide high QoS. As a result, bursts of packet loss occur and cause call drop.
Usually, the QoS of an IP-based commercial network needs to meet the
requirements in Figure 1. Otherwise, the poor transmission quality may increase the
call drop rate or lower the HSPA service rate. A simple way to measure the
transmission quality is to enable IPPM measurement on the RNC LMT. For details,
refer to the RAN10 Transmission Troubleshooting Guide.

Figure 1 Requirements of IP-based networking for the transmission quality

4. Equipment (Including the UE) Anomaly


After the preceding causes are excluded, you need to suspect that the equipment is
abnormal. For example, the UE is abnormal, or the compatibility is not ensured.
Therefore, you need to query the log and alarms of the equipment to further analyze
the cause of call drop.

5.3 Analysis Process


1. Discussing the Problem, Ascertaining the Problem Background and Product
Version, and Ruling Out the Possibility of Known Bugs
You need to first ask the field personnel to feed back the related information and
symptoms of the problem, and then obtain the information about the known bugs of
the corresponding version (by inquiring the contact persons of the RNC and NodeB or
referring to the Release Notes). In this way, you can determine whether the problem is
caused by a known version problem, for example, whether abnormal call drop is
caused by memory leakage in a version.
Determine the time at which the call drop rate changed suddenly, analyze whether
the problem is caused by a network adjustment (for example, adding or relocating sites,
or upgrading the version), and focus on the impact of that adjustment. Previous
experience shows the following:

For a newly built commercial network, call drop is mostly caused because some
neighboring cells are not configured or the coverage quality is poor.

For a relocated network, call drop is mostly caused because some neighboring cells
are not configured or the interoperability of the Iur interface of the relocated
network is poor.

For an upgraded network, call drop is mostly caused because the new version
(including the new hardware and new functions) has some defects.

For a stable commercial network, it is improbable that the call drop rate suddenly
increases. If the problem really occurs, it is possible that the soft failure occurs in
the equipment DSP or the transmission is abnormal on a large scale.
Therefore, the latest actions performed on the network are the critical information.

2. Narrowing the Scope, Analyzing Whether the High Call Drop Rate is Caused by
One or Two Cells, and Analyzing the Main Count Distribution of Call Drop
If the preceding factors are excluded, you need to first analyze the performance data.
Firstly, analyze the change in the call drop rate and call drop count of the cells, and
judge whether the problem is caused by the performance degradation of one or two
cells.
Secondly, analyze the trend of the top N cells in call drop rate and call drop count,
compare it with the trend of the entire network, and judge whether the top N cells are
representative. If the top N cells are representative of the entire network, you can
focus the analysis on the top N cells.
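One rough way to judge whether the top N cells are representative is to compute
their share of the network-wide call drop count for each measurement period and
watch how the share moves. The following Python sketch is a simplification of the
trend comparison described above, not a prescribed method; the function name and
the per-cell dictionary layout are hypothetical.

```python
def top_n_share(drops_by_cell, n):
    """Return (top N cell names, their share of all call drops).

    drops_by_cell maps a cell name to its call drop count for one period.
    """
    total = sum(drops_by_cell.values())
    if total == 0:
        return [], 0.0
    ranked = sorted(drops_by_cell.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:n]
    share = sum(count for _, count in top) / total
    return [cell for cell, _ in top], share
```

If the share stays high while the network-wide call drop count rises, the top N cells
are probably driving the change. If the share falls, the problem is more likely spread
over the entire network.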
Then, you can determine the main reasons why the call drop rate increases (or is not
up to standard). The following section describes the main causes based on the traffic
measurement counters, taking the CS and PS services as examples.
Figure 1 lists the counters for the causes of CS call drop:

Figure 1 Indicators related to CS call drop

Indicator (Level 1)        Sub-indicator (Level 2)
VS.RAB.Loss.CS.RF          VS.RAB.Loss.CS.RF.RLCRst
                           VS.RAB.Loss.CS.RF.ULSync
                           VS.RAB.Loss.CS.RF.UuNoReply
                           VS.RAB.Loss.CS.RF.Oth
VS.RAB.Loss.CS.Abnorm      VS.RAB.RelReqCS.OM
                           VS.RAB.RelReqCS.UTRANgen
                           VS.RAB.RelReqCS.RABPreempt
                           VS.RAB.Loss.CS.Aal2Loss
                           VS.RAB.Loss.CS.Congstion.CELL
                           VS.Call.Drop.CS.Other

For the CS service, the common causes of call drop are as follows:

VS.RAB.Loss.CS.RF: Abnormal release because the link is out of step. The
coverage quality is poor (for example, the signal quality of the current cell is poor,
some neighboring cells are not configured, or the handover area is small), so the
UE switches off the transmitter abnormally or the uplink demodulation is out of
step. To solve the problem, you need to improve the coverage quality. This cause is
frequently encountered in newly built or relocated networks.

VS.RAB.Loss.CS.RF.RLCRst: Link release because of the downlink SRB reset.
The coverage quality is poor (for example, some neighboring cells are not
configured, or the handover area is small). To solve the problem, you need to
improve the coverage quality. This cause is frequently encountered in an initial
network.

VS.RAB.Loss.CS.RF.UuNoReply: The number of RABs released by the RNC
because of a failure in the radio interface procedure. The failure is usually
caused by an imbalance between the uplink coverage and downlink coverage and by
fast signal changes. You need to trace the IOS data or query the CHR log to further
analyze the cause of Uu Noreply.

VS.RAB.Loss.CS.Aal2Loss: The RNC initiates abnormal release after finding that
the AAL2 path on the Iu-CS interface is abnormal. This scenario is seldom
encountered in practice, and the corresponding alarm is generated when it occurs.
The transmission equipment may be faulty, or the RNC version may have some
defects.

VS.Call.Drop.CS.Other: Call drop because of other causes. Many causes of call
drop (for example, the flow interaction times out, or cell update fails) are not
counted separately and are counted into OTHER. In practice, call drops caused by
flow interaction timeout and cell update failure account for a high proportion, so
many call drops are recorded as OTHER. You need to further analyze the causes
through the CHR log.

VS.RAB.RelReqCS.OM: The CS link is released because of operation and
maintenance work (for example, the cell is blocked). Call drop because of this
cause is normal.

VS.RAB.RelReqCS.UTRANgen: The number of CS domain RABs released in the cell
for the UTRAN Generated Reason. In practice, this scenario is seldom encountered.

VS.RAB.RelReqCS.RABPreempt: The CS link is released because of high-priority
preemption. Such call drop occurs when the load is high and resources are
insufficient. You need to determine whether to expand the capacity according to the
call drop count.
Figure 2 lists the counters for the causes of PS call drop:

Figure 2 Indicators related to PS call drop

Indicator (Level 1)      Sub-indicator (Level 2)          Sub-indicator (Level 3)
VS.RAB.Loss.PS.Abnorm    VS.RAB.RelReqPS.OM
                         VS.RAB.RelReqPS.RABPreempt
                         VS.RAB.Loss.PS.GTPULoss
                         VS.RAB.Loss.PS.Congstion.CELL
                         VS.Call.Drop.PS.Other
VS.RAB.Loss.PS.RF        VS.RAB.Loss.PS.RF.RLCRst         VS.RAB.Loss.PS.SRBReset
                                                          VS.RAB.Loss.PS.TRBReset
                         VS.RAB.Loss.PS.RF.ULSync
                         VS.RAB.Loss.PS.RF.UuNoReply
                         VS.RAB.Loss.PS.RF.Oth
In terms of the counters, the PS service is similar to the CS service. The
differences between them are as follows:

Their CN interfaces are different.

The PS service involves the TRB reset.

The following section analyzes the causes of PS call drop:

VS.RAB.Loss.PS.RF: Abnormal release because the link is out of step. The
coverage quality is poor (for example, some neighboring cells are not configured,
or the handover area is small), so the UE switches off the transmitter abnormally or
the uplink demodulation is out of step. To solve the problem, you need to improve
the coverage quality. This cause is frequently encountered in an initial network.

VS.RAB.Loss.PS.SRBReset: Link release because of the downlink SRB reset. The
coverage quality is poor (for example, some neighboring cells are not configured,
or the handover area is small). To solve the problem, you need to improve the
coverage quality. This cause is frequently encountered in an initial network.

VS.RAB.Loss.PS.TRBReset: Link release because of the downlink TRB reset. The
coverage quality is poor (for example, some neighboring cells are not configured,
or the handover area is small). To solve the problem, you need to improve the
coverage quality. This cause is frequently encountered in an initial network.

VS.RAB.Loss.PS.GTPULoss: The RNC initiates abnormal release after finding
that the GTPU on the Iu-PS interface is abnormal. In practice, this scenario is
seldom encountered. It is usually caused by equipment faults or defects.

PS_RAB_DROP_OTHER: Call drop because of other causes. Many causes of
call drop (for example, the flow interaction times out, or cell update fails) are not
counted separately and are counted into OTHER. In practice, call drops caused by
flow interaction timeout and cell update failure account for a high proportion, so
many call drops are recorded as OTHER. You need to further analyze the causes
through the CHR log.
Generally, the main causes of call drop are RLC Reset, UU Noreply, and Other.
Such causes usually result from poor coverage or product defects. To further
analyze the causes, you need to query the CHR log and trace the IOS or IFTS data.

3. Locking the Problem Scenarios, and Determining the Main Scenarios Where the
Call Drop Rate Goes Up or is not Up to Standard
Firstly, determine the main causes of call drop through the performance data. If the
performance data alone is not enough to determine the causes, you also need to
analyze the IOS data and CHR log as follows:

Filter out the logs about the main causes of call drop among the PCHR logs, and
analyze the signal quality of the call drop active sets of all cells (or top N cells),
and thus judge whether call drop is caused by poor quality.

Through the PCHR logs, analyze the subscribers who undergo call drop because of
the main causes in all cells or the top N cells before and after the call drop rate is
changed, and judge whether one or two subscribers or the performance of the UE
of a specific brand affects the call drop rate. If yes, you can enable CDT tracing or
IOT test to further analyze the compatibility.

Enable the IOS tracing of top N cells, obtain the signaling about the main causes,
and determine the main scenarios of call drop from the signaling (check whether
the call drop is related to a specific flow, for example, softer handover and DRD).
In addition, analyze the fundamental cause of call drop by querying the CHR logs
generated in the corresponding time segment.

If you cannot determine the fundamental cause only through the IOS data, you can
consider enabling IFTS tracing (with the L2 user plane; you can deeply analyze the
call drop caused by SRB or TRB reset) after finding the main problem scenarios.

If you determine that the main cause lies in the transmission network layer, focus
on checking the transmission quality. Check the alarm information for
transmission-related alarms. If the problem is caused by the transmission quality,
transfer the problem directly to the NodeB team (transmission team).
4. Conducting Drive Test on Site, and Analyzing the Causes Deeply


If you still cannot determine the causes through the CHR log, performance data, and
IFTS data, you need to conduct drive test on site, reproduce the call drop signaling
under the main scenarios, and obtain the logs of the UE. Especially if no response is
received on the air interface, you need to query the logs of the UE.
The drive test should be well targeted, that is, determine the main scenarios or top N
cells of call drop, thus ensuring the high possibility of reproduction through drive test
and obtaining the main failure signaling.
You need to enable NodeB CDT tracing and RNC CDT tracing during the drive test,
and analyze their signaling information.

5.4 Cases of Call Drop


In November 2008, the StarHub site of Singapore was undergoing the Beta phase of the
RAN10. The field personnel fed back that the PS call drop rate went up sharply after the
EBBC cards of most sites in the existing network were activated on November 18th. As
shown below, the RNC402 can serve as a typical case. Before the EBBC cards are
activated, the PS call drop rate remains at about 0.3%. After the EBBC cards are activated,
the PS call drop rate increases to 1.2% sharply.

By querying the network actions performed before and after the problem occurs, you can
preliminarily determine that the problem is related to the activation of the EBBC cards. In
addition, the Beta version is new, so no similar known problems can be found. The analysis
of the performance data shows that no obvious top N cells are available. Therefore, the
problem is an entire-network problem. You can find that call drop is mostly caused by SRB
reset and TRB reset, which account for 94%.

The IOS data and IFTS data traced on site show that abnormal call drop mainly occurs
under the following two scenarios:

Scenario 1: After the new measurement control message is delivered upon completion
of soft handover, the L2 of the RNC does not receive the L2 ACK message sent by the
UE, thus causing SRB Reset. The SRB resets under the scenario account for 70% of
total SRB resets.

Scenario 2: After the active set update (ASU) message is delivered upon completion of
soft handover, the L2 of the RNC does not receive the L2 ACK message sent by the
UE, thus causing SRB Reset. The SRB resets under this scenario account for 30% of
total SRB resets.

Why is the L2 ACK message not received after the RNC delivers the Meas Ctrl or
ASU message?
For the SRB reset, you cannot determine the fundamental cause only through the data
at the RNC side. Even if you can obtain the user-plane data through IFTS tracing, you
can find only the following symptom: After the downlink delivers the Meas Ctrl or
ASU message, the L2 ACK message is not received; therefore, the last PDU with the
Poll is retransmitted repeatedly and subsequently, the RESET PDU is retransmitted
repeatedly. However, you cannot determine which of the following occurs:

The downlink data cannot reach the UE.

The UE uplink returns an acknowledgement, but the acknowledgement cannot reach
the RNC.
Therefore, the field personnel conduct a drive test and capture the CDT data and Probe
data (tracing the L1 and L2 data). After converting the Probe data into QCAT data, you
can see that the authorized SG of the uplink HSUPA of the UE is extremely small. As a
result, although the RLC layer of the UE returns the L2 ACK message for the data of
the RNC, the physical layer does not have enough grant to send it. As shown in the
following figure, the authorized SG of the HSUPA at the UE side is lowered step by
step from 8 to 7 and on down to 4 because the SG DOWN message is received
repeatedly.

As you know, the authorized SG should be at least 8 if the HSUPA uplink is to send one
336-bit PDU (336 bits + 18-bit TB header = 354 bits). If the authorized SG is lower
than 8, the data cannot be sent through the uplink. Finally, the downlink data is
retransmitted until the maximum count is reached, that is, the reset is initiated.
TB Size (bit)    MAC-e Data Rate (kbps)    After Coding
18               1.8                       138
186              18.6                      642
204              20.4                      696
354              35.4                      1146
372              37.2                      1200

BtEd/BtC         Ref ETPR
0.2199707        0.0484
0.7071068        0.5
0.7405316        0.5484
0.9755065        0.9516

Other columns of the original table: TB Index, RLC PDU Num, RLC Data Rate (kbps),
SF, Eqv Ch Num, SG LUPR. Their values (256 and 32) cannot be aligned to rows in
this copy.
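The arithmetic behind the 354-bit transport block can be checked with a few lines of
Python. This is a minimal sketch: the function names are hypothetical, the 10 ms TTI
is an assumption inferred from the table (354 bits corresponding to 35.4 kbps), and
the minimum SG of 8 is taken from the text as a plain threshold.

```python
RLC_PDU_BITS = 336           # PDU size quoted in the text
TB_HEADER_BITS = 18          # TB header size quoted in the text
TTI_MS = 10                  # assumption: 10 ms TTI, consistent with the table
MIN_SG_FOR_ONE_PDU = 8       # minimum authorized SG quoted in the text


def tb_size_bits(pdu_bits=RLC_PDU_BITS):
    """Transport block size for one RLC PDU plus the TB header."""
    return pdu_bits + TB_HEADER_BITS


def mac_e_rate_kbps(tb_bits):
    """MAC-e data rate when one transport block is sent per TTI (bits per ms = kbps)."""
    return tb_bits / TTI_MS


def can_send_one_pdu(authorized_sg):
    """True if the authorized SG is high enough to carry one 336-bit PDU."""
    return authorized_sg >= MIN_SG_FOR_ONE_PDU
```

With these values, an authorized SG of 7 or lower means that even a single PDU
carrying the L2 ACK cannot be sent, which matches the symptom seen in the Probe data.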

One question still needs to be answered. To prevent SRB or TRB resets caused by an
unlimited decrease of the SG, which leaves the uplink unable to send packets, the SG
is set to 8 and the authorized SG should not be lowered further because of
insufficient resources. Why, then, is the authorized SG lowered continuously
after it is set to 8?
This problem is a defect of product implementation, so you need to ask the
R&D personnel to take part in the analysis. Finally, the NodeB development personnel
found the bug. When the dynamic CE feature is activated and the
SRB over HSPA switch is enabled, the following problem occurs in some weak-signal
areas:
If CE congestion exists and the decoding DSP reports NACK information for the
uplink E-DPDCH, the downlink DSP delivers the RG Down command by mistake
and the SG of the UE is lowered to 4.
The fix for this NodeB defect has been incorporated into RAN10 B053.

5.5 List of Problem Information


Checklist for KPI Troubleshooting-5.5.xls

6 Inter-RAT Interoperability

Inter-RAT interoperability involves a great diversity of NEs, and failures are mainly
caused by incorrect or inconsistent parameter settings between NEs. When
analyzing such problems, you need to communicate fully with the customer, the core
network personnel, and the GSM personnel to obtain the related information and avoid
wasted effort.

6.1 Inter-RAT Handover from WCDMA to GSM (CS Domain)
6.1.1 KPI Definition
Definition of the RNC-level indicators:
VS.SRELOC.SuccPrep.IRHOCS.Rate = VS.SRELOC.SuccPrep.IRHOCS / VS.SRELOC.AttPrep.IRHOCS
VS.IRATHO.SuccCSOut.RNC.Rate = VS.IRATHO.SuccCSOut.RNC / VS.IRATHO.AttCSOut.RNC
Definition of the cell-level indicators:
VS.IRATHO.SuccRelocPrepOutCS.Cell.Rate = <IRATHO.SuccRelocPrepOutCS> / <IRATHO.AttRelocPrepOutCS>
VS.IRATHO.SuccOutCS.Cell.Rate = <IRATHO.SuccOutCS> / <IRATHO.AttOutCS>
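The RNC-level definitions above are plain ratios of a success counter to an attempt
counter. The following Python sketch shows one way to compute them from exported
counter values; the function names and the dictionary layout are hypothetical, and
returning None when there are no attempts is a design choice of the sketch, not a rule
from the guide.

```python
def success_rate(success, attempts):
    """Return success / attempts, or None when there were no attempts."""
    if attempts == 0:
        return None
    return success / attempts


def irat_cs_out_kpis(counters):
    """Compute the two RNC-level CS handover-out KPIs from raw counter values."""
    return {
        "VS.SRELOC.SuccPrep.IRHOCS.Rate": success_rate(
            counters["VS.SRELOC.SuccPrep.IRHOCS"],
            counters["VS.SRELOC.AttPrep.IRHOCS"],
        ),
        "VS.IRATHO.SuccCSOut.RNC.Rate": success_rate(
            counters["VS.IRATHO.SuccCSOut.RNC"],
            counters["VS.IRATHO.AttCSOut.RNC"],
        ),
    }
```

Keeping the two rates separate shows directly whether the relocation preparation
phase or the handover implementation phase is responsible for a drop in the overall
success rate.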

6.1.2 Influence Factors


Figure 1 Flow on CS inter-RAT handover out of 3G

The handover process includes the following two processes:

Relocation preparation process

The SRNC sends the RELOCATION REQUIRED message to the CN. The message
contains such information as the relocation type, relocation cause, source PLMN,
source LAC, source SAC, destination PLMN, and destination LAC.
The CN interacts with the GSM network by forwarding the request to the GSM MSC,
and prepares the related resources.
After the GSM-related resources are prepared, the CN sends the RELOCATION
COMMAND message to the SRNC. The message contains the layer 3 information
element, and the element carries the information about the related resources allocated
by the GSM.
If the allocation of all resources or some resources fails, the CN sends the
RELOCATION PREPARATION FAILURE message to the SRNC.

Handover implementation process


The RNC delivers the HANDOVER FROM UTRAN COMMAND message to the
UE. The message carries the RAB ID, activation time, GSM frequency, and the GSM
message in the form of a bit string.
After the UE accesses the GSM network, the CN sends the IU RELEASE COMMAND
message, instructing the RNC to release the resources of the UE in the
WCDMA system.
Relocation preparation failure has the following main causes:

The 2G equipment is abnormal or its resources are insufficient.

The CN parameters are not configured properly.

The configurations of GSM neighboring cells are not consistent with the actual
parameters.
Handover implementation failure has the following main causes:

The parameters of 2G neighboring cells are not configured correctly.

The 2G encryption algorithm is not consistent with the 3G encryption algorithm.

Adjacent-channel interference exists in 2G cells.

The handover threshold is not set reasonably.

6.1.3 Analysis Process


1. Discussing the Problem and Ascertaining the Problem Background and Product
Version
When the problem occurs, determine the key time at which the success rate changed,
and find out about recent adjustments to the 2G access network, 3G access network,
and CN. Analyze the impact of the key actions performed at that time on
the KPIs.

2. Determining the Main Scenarios


Firstly, measure the relocation preparation success rate and handover implementation
success rate at the RNC level and cell level respectively according to the performance
data of the RNC. Determine which flow causes the decrease of the inter-RAT
handover-out success rate, and check whether the success rate of the entire network or
the success rate of some cells decreases. If the problem only occurs in one or two cells,
it indicates that the problem is related to the configuration of the GSM neighboring cells.
Secondly, analyze which cause leads to the decrease of the inter-RAT handover-out
success rate. Figure 1 lists the failure causes defined by the performance counters.

Figure 1 Indicators related to CS inter-RAT handover-out failure

Indicator (Level 1)                  Sub-indicator (Level 2)
VS.SRELOC.FailPrep.IRATCSOut         VS.SRELOC.Fail.IRATCSOutNRpl
(Relocation preparation failure)     VS.SRELOC.Fail.IRATCSOutCanc
                                     VS.SRELOC.Fail.IRATCSOutTexp
                                     VS.SRELOC.Fail.IRATCSOutTfai
                                     VS.SRELOC.Fail.IRATCSOutTOve
                                     VS.IRATHO.PrepFailCSOut.UkwnRNC
                                     VS.IRATHO.PrepFailCSOut.NoRsrc
                                     VS.IRATHO.PrepFaiCSInTgtOveL
                                     VS.IRATHO.PrepFailCSOutReqinfnotavai
VS.IRATHO.FailCSOut.RNC              VS.IRATHO.FailCSOut.CfgUnRNC
(Handover implementation failure)    VS.IRATHO.FailCSOut.PhyFaRNC

3. Analyzing the Causes Case by Case

VS.SRELOC.Fail.IRATCSOutNRpl / VS.SRELOC.Fail.IRATCSOutTexp
After the SRNC sends the RELOCATION REQUIRED message, it starts a timer to
wait for the RELOCATION COMMAND message. If the RELOCATION
COMMAND message is not received when the timer expires, the SRNC sends the
RELOCATION CANCEL message and increments the counter.
<Method of analysis>

Check whether the RNC links and MSC links are normal.

Check the CN configuration, especially the transmission parameters of the 2G
MSC/VLR, for example, the data of the MTP layer, data of the SCCP layer, and
inter-MSC trunk data.

Query the CN configuration, and check whether inter-RAT handover is allowed.

Trace and analyze the MSC/BSS signaling. Ask the CN personnel and 2G
personnel to attend the analysis.

VS.SRELOC.Fail.IRATCSOutCanc
After requesting the handover preparation, the RNC receives the release command
sent by the CN. The usual causes are as follows:

The inter-RAT handover request is initiated during a signaling procedure (for
example, location update). The location update is complete before the handover flow
is complete, so the CN initiates the release.

The subscriber who sets up the call hangs up during the handover preparation, so
the CN initiates the release.
Although the handover is not complete, both cases are normal overlapping of
flows.

VS.SRELOC.Fail.IRATCSOutTfai
The relocation fails in the target CN/RNC or in the system. Usually, the cause is as
follows:

The CN configurations are not correct.

The BSS does not support the relocation.

<Method of analysis>:

Check the CN negotiation data.

Check whether the BSS allows inter-RAT handover-in.

Check whether the configurations of GSM neighboring cells are consistent with the
actual parameters. The BTS may fail to find the target cell. If the problem occurs
only to one or two cells, you can trace the IOS data, determine whether the
relocation failure occurs only to one or two target cells, and check the parameters
of the GSM neighboring cells of the target cells.

Figure 1 Relocation Required message

The message carries the address information about the BSC of the access network
that expects to provide services for the subscriber. Usually, the address is the CGI
(global cell id=PLMN + LAC + CELL ID) of the target cell. The message also
carries the information about the current cell, that is, PLMN + (LAC and SAC).
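Because the target cell is identified by its CGI, a field-by-field comparison between
the GSM neighboring cell configured on the RNC and the actual GSM cell quickly
shows which parameter is inconsistent. The following Python sketch assumes a
hypothetical Cgi record in which the PLMN is split into MCC and MNC; the sample
values used with it would be made up and are not taken from any network.

```python
from typing import NamedTuple


class Cgi(NamedTuple):
    """Cell global identity: PLMN (MCC + MNC) + LAC + cell ID."""
    mcc: str
    mnc: str
    lac: int
    cell_id: int


def mismatched_fields(configured: Cgi, actual: Cgi):
    """Return the names of the CGI fields that differ between two records."""
    return [
        field
        for field in Cgi._fields
        if getattr(configured, field) != getattr(actual, field)
    ]
```

An empty result means that the CGI configured on the RNC matches the actual cell. A
result such as ["lac"] points directly to the parameter that needs to be corrected.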

VS.IRATHO.PrepFailCSOut.UkwnRNC
The target RNC is unknown. The cause is the main cause of relocation failure. Usually,
the reason is that the MSC cannot find the route leading to the 2G cells.
<Method of analysis>:

Check the CN configuration. It is possible that the LAI of the 2G target cell is not
configured on the MSC.

Check the consistency of the parameters of GSM neighboring cells configured on
the RNC.

VS.IRATHO.PrepFailCSOut.NoRsrc
No resources are available. Usually, the BSC has no resources available for the access
of the UE or the 2G MSC has no information about the target cells.
<Method of analysis>:

Check the resource utilization of the 2G BSS. It is possible that no channel is
available because the channel is occupied by another subscriber.

Check the status of the target cell. The target cell may be faulty.

Check the mapping between the target cell and 2G MSC on the 3G MSC.
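The checks listed above for the relocation preparation counters can be kept in a
small lookup table so that the dominant counter leads straight to its checklist. The
following Python sketch is only an illustration: the check texts are paraphrased from
this section, the function name is hypothetical, and VS.SRELOC.Fail.IRATCSOutCanc
is left out on purpose because the text treats its two usual causes as normal.

```python
TIMEOUT_CHECKS = [
    "Check whether the RNC links and MSC links are normal",
    "Check the CN configuration (MTP, SCCP, and inter-MSC trunk data of the 2G MSC/VLR)",
    "Check whether the CN configuration allows inter-RAT handover",
    "Trace the MSC/BSS signaling with the CN and 2G personnel",
]

PREP_FAILURE_CHECKS = {
    "VS.SRELOC.Fail.IRATCSOutNRpl": TIMEOUT_CHECKS,
    "VS.SRELOC.Fail.IRATCSOutTexp": TIMEOUT_CHECKS,
    "VS.SRELOC.Fail.IRATCSOutTfai": [
        "Check the CN negotiation data",
        "Check whether the BSS allows inter-RAT handover-in",
        "Check the GSM neighboring cell configuration against the actual parameters",
    ],
    "VS.IRATHO.PrepFailCSOut.UkwnRNC": [
        "Check whether the LAI of the 2G target cell is configured on the MSC",
        "Check the GSM neighboring cell parameters configured on the RNC",
    ],
    "VS.IRATHO.PrepFailCSOut.NoRsrc": [
        "Check the resource utilization of the 2G BSS",
        "Check the status of the target cell",
        "Check the mapping between the target cell and 2G MSC on the 3G MSC",
    ],
}


def dominant_failure_checks(counter_values):
    """Return (counter name, checklist) for the largest known failure counter."""
    known = {
        name: value
        for name, value in counter_values.items()
        if name in PREP_FAILURE_CHECKS and value > 0
    }
    if not known:
        return None, []
    name = max(known, key=known.get)
    return name, PREP_FAILURE_CHECKS[name]
```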

VS.IRATHO.FailCSOut.CfgUnRNC
The handover is not supported by the configuration. Usually, the UE does not accept
the HANDOVER FROM UTRAN COMMAND message delivered by the RNC
because of an incorrect RNC signaling format, incompatibility of the UE, or incorrect
configuration of the encryption parameters.
<Method of analysis>:

The encryption parameters are not set correctly

Trace the IOS data of the top N cells, and query the encryption algorithm.

Figure 2 Relocation Command message

Check whether the parameter of the encryption algorithm on the BSC is consistent
with that carried by the relocation command.
In the 3G system, the encryption process is required. In the 2G system, the encryption process is
optional. Therefore, the 2G system can send an encryption-related parameter optionally when the UE
is handed over from the UMTS to the GSM.

If the 2G system does not send an encryption-related parameter, the MSOFTX3000 uses the default
handover configuration to reestablish a Cipher Mode Setting parameter and sends the parameter
to the RNC through the signalling message of Relocation command. When the 2G system
sends an encryption parameter carrying the chosen encryption algorithm, the
MSOFTX3000 uses the chosen encryption algorithm to establish the Cipher Mode Setting
parameter and sends the parameter to the RNC through the signalling of Relocation
command.

If they are not consistent, further trace the CN signaling and query the encryption
parameter received by the MSC.

Figure 3 Handover Request ACK message

<Suggestions>:
Modify the handover parameter configuration of the 2G LAC on the MSOFTX3000,
so that the encryption parameter carried by the CN to the RNC is consistent with the
encryption parameter used by the 2G system.
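The way the encryption parameter reaches the RNC, as described in the note above, can
be summarized in a few lines of Python. This is a simplified model, not the
MSOFTX3000 implementation: the function names are hypothetical, and the algorithm
names passed in (for example "A5/1") are ordinary GSM cipher identifiers used only
as sample values.

```python
from typing import Optional


def cipher_sent_to_rnc(chosen_by_2g: Optional[str], default_for_lac: str) -> str:
    """Algorithm carried in the Cipher Mode Setting of the Relocation Command.

    chosen_by_2g is the Chosen Encryption Algorithm from the 2G system, or None
    when the 2G system does not send one. default_for_lac is the handover
    parameter configured for the 2G LAC on the MSC.
    """
    if chosen_by_2g is not None:
        return chosen_by_2g
    return default_for_lac


def encryption_consistent(bsc_algorithm: str,
                          chosen_by_2g: Optional[str],
                          default_for_lac: str) -> bool:
    """True if the RNC is told the same algorithm that the BSC actually uses."""
    return cipher_sent_to_rnc(chosen_by_2g, default_for_lac) == bsc_algorithm
```

The model makes the failure condition explicit: when the 2G system sends no
algorithm, the handover works only if the default configured for the 2G LAC equals
the algorithm used on the BSC.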

UE compatibility

Trace the IOS data of the top N cells, obtain the failure flow, and analyze whether
there exists a typical scenario, for example, some flow interactions cause the UE to
return the message of Unsupported Configuration.

Obtain the IMSIs of the terminals through the CHR or PCHR log. If the problem
mainly occurs in one or two terminals, it indicates that the problem is caused by the
UE. Then, ask the customer for the IMEI that corresponds to the IMSI, and
query the type of the failed terminal.

If conditions permit, verify the problem in the HQ. Alternatively, ask the field
personnel to conduct a drive test.

Incorrect RNC signaling format

Trace the IOS data of the top N cells, and capture the failed cells.

Compare the HO_FROM_UTRAN_CMD_GSM generated at the failure time with
the signaling generated at the time of normal handover. A common problem is as
follows: The handover command does not carry the encryption indication. If this
problem occurs, you need to modify the handover parameter configuration of the
2G LAC on the MSC.

The ETSI GSM PHASE I protocol has a defect: The handover command does not carry the
encryption information. The ETSI GSM PHASE II protocol has rectified the defect. However, the
GSM devices of lots of vendors have not rectified the defect in accordance with the ETSI GSM
PHASE II protocol. If the CN does not reestablish the encryption for the RNC, a format error occurs.

<Suggestions>:
If the 2G BSC does not send the Chosen Encryption Algorithm parameter, configure
the handover parameter of the 2G LAC on the MSOFTX3000.

For other problems, directly collect the related information and feed back the
information to the R&D department for analysis.

VS.IRATHO.FailCSOut.PhyFaRNC
Inter-RAT handover implementation failure with this cause mainly arises as follows:
1) After receiving the Handover From Utran command, the UE attempts to access the
GSM system on the BTS.
2) The UE repeatedly sends the Handover Access message to the BTS through the
FACCH, starts the T3124 timer (the default is 320 ms), and stops sending the
message when it receives the PHY INFO message.
3) If the timer expires, the UE returns to the old UTRAN channel and reports a
physical channel failure.
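The three steps above reduce to a single timing decision at the UE. The following
Python sketch models only that decision; the function name and the way the delay is
passed in are hypothetical, and the 320 ms value is the default quoted in the text.

```python
T3124_MS = 320  # default T3124 value quoted in the text


def handover_access_outcome(phy_info_delay_ms):
    """Outcome of the GSM access attempt after the handover command.

    phy_info_delay_ms is the time from the first Handover Access message until
    the PHY INFO message arrives, or None if PHY INFO never arrives.
    """
    if phy_info_delay_ms is not None and phy_info_delay_ms < T3124_MS:
        return "access succeeded"
    return "physical channel failure, UE returns to the old UTRAN channel"
```

In practice, PHY INFO goes missing when the UE accesses a cell other than the one it
measured or when the 2G signal is too weak, which is why the analysis below starts
with the neighboring cell parameters and the handover threshold.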
<Method of analysis>:

Check the parameter configuration of the GSM neighboring cells. For example, if
the BCCHARFCN is not configured correctly, the cell in the measurement report
that reaches the handover threshold is not the actual cell accessed by the UE. As a
result, the signal quality of the actual handover cell does not satisfy the handover
requirements and thus the handover fails.

Check whether the handover threshold is set unreasonably, so that the handover is
triggered too easily while the signal quality of the 2G cell is still poor.

Check whether the handover failure is caused because the encryption algorithms
are not consistent.

If you still cannot solve the problem, ask the 2G personnel to attend the analysis.
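
The neighboring cell check described above can be partly automated. The following is a minimal sketch, assuming that the GSM neighbor definitions on the RNC and the actual 2G cell data have already been exported into two dictionaries that map a cell name to its BCCH ARFCN. The function name, the cell names, and the ARFCN values are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Optional, Tuple


def find_arfcn_mismatches(
    rnc_neighbors: Dict[str, int],
    bsc_cells: Dict[str, int],
) -> List[Tuple[str, int, Optional[int]]]:
    """Return (cell name, ARFCN on the RNC, ARFCN on the 2G side or None)
    for every GSM neighbor whose BCCH ARFCN differs or is unknown."""
    mismatches = []
    for cell, arfcn in sorted(rnc_neighbors.items()):
        actual = bsc_cells.get(cell)  # None if the 2G side does not know the cell
        if actual != arfcn:
            mismatches.append((cell, arfcn, actual))
    return mismatches


# Hypothetical example data (not taken from a real network).
rnc_neighbors = {"GSM_A": 62, "GSM_B": 75, "GSM_C": 90}
bsc_cells = {"GSM_A": 62, "GSM_B": 77}

for cell, configured, actual in find_arfcn_mismatches(rnc_neighbors, bsc_cells):
    print(f"{cell}: RNC has BCCH ARFCN {configured}, 2G side has {actual}")
```

Cells reported by such a check are candidates for the situation described above, in which the measured cell is not the cell that the UE actually accesses.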

6.2 Inter-RAT Handover from GSM to WCDMA (CS Domain)
6.2.1 KPI Definition
Definition of the RNC-level indicators:
VS.IRATHO.PrepSuccCSIn.RNC.Rate = VS.IRATHO.PrepSuccCSIn.RNC / VS.IRATHO.PrepAttCSIn.RNC
VS.IRATHO.SuccExecCSIn.RNC.Rate = VS.IRATHO.SuccExecCSIn.RNC / VS.IRATHO.AttExecCSIn.RNC
Definition of the cell-level indicators:
VS.IRATHO.SuccRelocPrepInCS.Cell.Rate = <VS.IRATHO.PrepSuccCSIn> / <IRATHO.AttIncCS>
VS.IRATHO.SuccInCS.Cell.Rate = <IRATHO.SuccIncCS> / <IRATHO.AttIncCS>
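
The RNC-level definitions above are simple ratios of counters. The following is a minimal sketch of the calculation, assuming that the counter values of one measurement period are available as a dictionary. The helper name and the sample values are hypothetical. The sketch returns None when no attempt is recorded, because the rate is undefined in that case.

```python
from typing import Mapping, Optional


def success_rate(counters: Mapping[str, float], success: str, attempts: str) -> Optional[float]:
    """Return success / attempts, or None if there were no attempts."""
    att = counters.get(attempts, 0)
    if att <= 0:
        return None  # the rate is undefined without attempts
    return counters.get(success, 0) / att


# Hypothetical counter values for one measurement period (not real data).
rnc_counters = {
    "VS.IRATHO.PrepAttCSIn.RNC": 200,
    "VS.IRATHO.PrepSuccCSIn.RNC": 190,
    "VS.IRATHO.AttExecCSIn.RNC": 190,
    "VS.IRATHO.SuccExecCSIn.RNC": 171,
}

prep_rate = success_rate(
    rnc_counters, "VS.IRATHO.PrepSuccCSIn.RNC", "VS.IRATHO.PrepAttCSIn.RNC"
)
exec_rate = success_rate(
    rnc_counters, "VS.IRATHO.SuccExecCSIn.RNC", "VS.IRATHO.AttExecCSIn.RNC"
)
print(f"Preparation success rate: {prep_rate:.1%}")
print(f"Execution success rate: {exec_rate:.1%}")
```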

6.2.2 Influence Factors


Figure 1 Flow on CS handover-in
(Message sequence among the UE, Node B, target RNC, CN, MSC, BSC, and BTS)
1. Handover Required (BSSMAP)
2. Prepare Handover (MAP/E)
3. Relocation Request (RANAP)
4. Relocation Request Ack. (RANAP)
5. Prepare Handover Response (MAP/E)
6. Handover Command (BSSMAP)
7. Handover Command (RR)
8. Relocation Detect (RANAP)
9. DCCH: Handover Complete (RRC)
10. Relocation Complete (RANAP)
11. Send End Signal Request (MAP/E)
12. Clear Command (BSSMAP)
13. Clear Complete (BSSMAP)
14. Send End Signal Response (MAP/E)

After receiving the RADIO LINK RESTORE INDICATION message, the RNC sends the
RELOCATION DETECT message to the MSC Server, notifying that the UE is handed over
from the GSM to the WCDMA.
The UE sends the HANDOVER TO UTRAN COMPLETE message to indicate that the
handover is complete. If the UE cannot complete the handover, the UE reports the handover
failure to the GSM.
After receiving the HANDOVER TO UTRAN COMPLETE message, the RNC sends the
RELOCATION COMPLETE message to the MSC Server, indicating that the handover is
complete. In addition, the RNC performs mobility management, UE capability enquiry,
and security mode control for the UE.
The relocation preparation failure is mainly caused by the following reasons:

The 3G cell resources are not enough.

Parameters are not configured correctly.

The handover failure is mainly caused by the following reasons:

The radio air interface is abnormal.

Parameters are not configured correctly.

6.2.3 Analysis Process


1.

Discussing the Problem and Ascertaining the Problem Background and Product
Version
When the problem occurs, determine the time at which the success rate changed,
and find out the recent adjustments made to the 2G access network, 3G access
network, and CN. Analyze the impacts of the key operations performed at that time
on the KPIs.

2.

Determining the Main Scenarios


Firstly, measure the inter-RAT CS handover-in success rate at the RNC level and cell
level respectively according to the performance data of the RNC, and thus determine
whether the success rate of the entire network or the success rate of some cells
decreases.
Secondly, analyze which cause leads to the decrease of the inter-RAT handover-in
success rate. Figure 1 lists the failure causes defined by the performance counters.

Figure 1 Indicators related to CS inter-RAT handover-in failure
Indicator (Level1)               Sub-indicator (Level2)
VS.SRELOC.FailPrep.IRATCSIn      VS.IRATHO.PrepFaiCSInCongRNC
                                 VS.IRATHO.PrepFaiCSInTfailRN
                                 VS.IRATHO.PrepFaiCSInTunsRNC
VS.IRATHO.Incoming.Fail.RNC      VS.IRATHO.FailExecCSIn.NRply

3.

Analyzing the Causes Case by Case

VS.IRATHO.PrepFaiCSInCongRNC / IRATHO.FailIncCS.ResUnavail
The relocation failure message is received, and the cause value is Resource
Unavailable, that is, the admission fails. The common resources include the power,
codes, CEs, and IUB transmission.
<Method of analysis>:

Analyze the success rate of the cells from the performance data, and obtain the list
of top N cells.

Analyze which resources limit the top N cells according to the utilization of each
resource. For details about the analysis method, see the section on the analysis of
resource congestion upon RRC setup failure.
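
The top N cell list mentioned above can be derived from the cell-level counters. The following is a minimal sketch, assuming that the per-cell values of IRATHO.AttIncCS and IRATHO.SuccIncCS have been loaded into a dictionary keyed by cell ID. The function name, the ranking rule (failure count first, then failure ratio), and the sample values are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

ATTEMPTS = "IRATHO.AttIncCS"
SUCCESSES = "IRATHO.SuccIncCS"


def top_n_failing_cells(
    cell_counters: Dict[int, Dict[str, int]], n: int = 10
) -> List[Tuple[int, int, float]]:
    """Return up to n tuples (cell ID, failures, failure ratio), worst first."""
    rows = []
    for cell_id, counters in cell_counters.items():
        attempts = counters.get(ATTEMPTS, 0)
        if attempts == 0:
            continue  # no attempts, so the cell cannot be ranked
        failures = attempts - counters.get(SUCCESSES, 0)
        rows.append((cell_id, failures, failures / attempts))
    # Sort by failure count, then by failure ratio, then by cell ID for stability.
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows[:n]


# Hypothetical per-cell counter values (not real data).
cells = {
    101: {ATTEMPTS: 50, SUCCESSES: 49},
    102: {ATTEMPTS: 40, SUCCESSES: 20},
    103: {ATTEMPTS: 0, SUCCESSES: 0},
    104: {ATTEMPTS: 10, SUCCESSES: 4},
}

for cell_id, failures, ratio in top_n_failing_cells(cells, n=2):
    print(f"Cell {cell_id}: {failures} failures ({ratio:.0%} of attempts)")
```

Ranking by the absolute number of failures first keeps cells with very few attempts from dominating the list. Other ranking rules are equally possible.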

IRATHO.FailIncCS.TRNCSysFailReloc/ VS.IRATHO.PrepFaiCSInTfailRN
The relocation fails in the target system or RNC.
<Method of analysis>:

Check the parameter configuration of the CN.

Check the configuration of the 3G neighboring cells on the BSC.

VS.IRATHO.FailExecCSIn.NRply
The handover fails because the UE has no response.
<Method of analysis>:

Analyze the success rate of the cells from the performance data, and obtain the list
of top N cells.

Check whether the neighboring cell parameters of the top N cells are configured for
the 2G cells, and ensure that the target cells of the handover are correct.

If the handover failure rate is high, directly trace the IFTS data of the top N cells.
Otherwise, it is recommended that you conduct drive test and trace the CDT and
Probe data.

Analyze whether the uplink is synchronized through the CDT or IFTS data, that is,
whether the RNC receives the RL_RESTORE_IND message. If the uplink
synchronization indication is not received, you need to further determine whether
the transmit power of the UE increases and reaches the maximum value.

Figure 1 Signaling of CS inter-RAT handover-in

If the transmit power of the UE increases to the maximum value, the downlink is
synchronized and the uplink synchronization fails. A possible cause is that the
transmit power of the dedicated uplink channel is relatively low.

If the transmit power of the UE does not increase, the downlink synchronization
fails. A possible cause is that the minimum power of the downlink DPCCH is
configured to an extremely small value.

If the synchronization indication is received, the RNC does not receive the
HO_UTRAN_CMP message. Possible causes are that the encryption algorithms are
not consistent or that packets are lost during the transmission.
To check whether the encryption algorithms are consistent, compare the
encryption parameter that the CN carries to the RNC with the encryption
algorithm carried in the Handover to UTRAN Command message that the BSC
delivers to the UE.
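
The branching described above can be summarized as a small decision function. The following is a minimal sketch. The two boolean inputs are assumed to have been determined manually from the CDT or IFTS data, and the enumeration and function names are hypothetical.

```python
from enum import Enum


class Finding(Enum):
    UPLINK_SYNC_FAILED = "downlink synchronized, uplink synchronization failed"
    DOWNLINK_SYNC_FAILED = "downlink synchronization failed"
    HO_COMPLETE_MISSING = "synchronized, but HO_UTRAN_CMP not received"


def classify_no_reply(rl_restore_ind_received: bool, ue_tx_power_reached_max: bool) -> Finding:
    """Map the two observations from the trace to the case described in the text."""
    if rl_restore_ind_received:
        # Uplink is synchronized: check encryption consistency and packet loss.
        return Finding.HO_COMPLETE_MISSING
    if ue_tx_power_reached_max:
        # The UE ramps up its power, so it has downlink synchronization.
        return Finding.UPLINK_SYNC_FAILED
    # The UE does not ramp up its power: it never synchronized on the downlink.
    return Finding.DOWNLINK_SYNC_FAILED


print(classify_no_reply(rl_restore_ind_received=False, ue_tx_power_reached_max=True).value)
```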

Figure 2 Relocation_Request message

To determine whether packets are lost during the transmission, capture the
necessary information about the site and feed back the information to the R&D
department for analysis.

6.3 Inter-RAT Handover from WCDMA to GPRS (PS Domain)
6.3.1 KPI Definition
Definition of the RNC-level indicators:
VS.IRATHO.SuccPSOutUTRAN.RNC.Rate = VS.IRATHO.SuccPSOutUTRAN.RNC / VS.IRATHO.AttPSOutUTRAN.RNC
Definition of the cell-level indicators:
VS.IRATHO.SuccOutPSUTRAN.Cell.Rate = <IRATHO.SuccOutPSUTRAN> / <IRATHO.AttOutPSUTRAN>

6.3.2 Influence Factors


Figure 1 Flow on PS inter-RAT handover out of

The handover failure is mainly caused by the following reasons:

The neighboring cell parameters are not configured correctly.

The CN configuration is not correct or does not support the handover.

Interference exists in the 2G cell.

6.3.3 Analysis Process


1.

Discussing the Problem and Ascertaining the Problem Background and Product
Version
When the problem occurs, determine the time at which the success rate changed,
and find out the recent adjustments made to the 2G access network, 3G access
network, and CN. Analyze the impacts of the key operations performed at that time
on the KPIs.

2.

Determining the Main Scenarios


Firstly, measure the inter-RAT PS handover-out success rate at the RNC level and cell
level respectively according to the performance data of the RNC, and thus determine
whether the success rate of the entire network or the success rate of some cells
decreases.
Secondly, analyze which cause leads to the decrease of the inter-RAT handover-out
success rate. Figure 1 lists the failure causes defined by the performance counters.
Figure 1 Indicators related to PS inter-RAT handover-out failure
Indicator (Level1)          Sub-indicator (Level2)
VS.IRATHO.PSOut.FailPS      VS.IRATHO.PSOut.CfgUnsup
                            VS.IRATHO.PSOut.PhyCHFail
                            VS.IRATHO.PSOut.Unpec
                            VS.IRATHO.PSOut.NoReply

3.

Analyzing the Causes Case by Case

VS.IRATHO.PSOut.PhyCHFail / IRATHO.FailOutPSUTRAN.PhyChFail
After receiving the CELL CHANGE ORDER FROM UTRAN message, the UE starts
the T309 timer. The T309 timer is stopped if the UE sets up a connection in the new
cell. Once the T309 timer times out, the UE returns to the original 3G cell and sends
the CCO failure message.
<Method of analysis>:

Check the configuration of the GSM neighboring cell parameters. If the parameters
are not configured correctly, the access is initiated in an incorrect target cell.

Check whether the status and KPIs of the target cell are normal.

Check the resource utilization of the target cell, and determine whether the access
failure is caused by the insufficiency of resources.

Check whether strong interference exists in the GSM radio environment. The
downlink interference affects the reading of the downlink SI information. The
uplink interference prevents uplink signaling, for example, the Channel Request
message, from being sent successfully.

VS.IRATHO.PSOut.NoReply / VS.IRATHO.CCO.FailOutPSUTRAN.Nrply
After sending the CCO message, the RNC starts the Trelocoverall timer. The timer is
stopped when the UE returns the CCO failure message or when the RNC receives the
IU RELEASE CMD message (the cause value is Normal release) sent by the SGSN.
Once the timer times out, the RNC actively sends the IU RELEASE REQUEST
message to the SGSN. If the RNC receives the SRNS Context Req message during
the period, it restarts the timer.
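
The timer handling described above can be illustrated with a small simulation. The following is a minimal sketch, assuming that the events are given as (time in seconds since the CCO message was sent, event name) pairs sorted by time. The event names, the function name, and the timer length of 10 seconds are hypothetical and are not the product defaults.

```python
from typing import Iterable, Tuple

STOP_EVENTS = ("CCO_FAILURE", "IU_RELEASE_CMD")  # events that stop the timer
RESTART_EVENT = "SRNS_CONTEXT_REQ"               # event that restarts the timer


def run_trelocoverall(events: Iterable[Tuple[float, str]], timer_s: float) -> str:
    """Replay the events against the timer and report how the procedure ends."""
    deadline = timer_s  # the timer starts when the CCO message is sent (t = 0)
    for time_s, name in events:
        if time_s >= deadline:
            break  # the timer expired before this event arrived
        if name in STOP_EVENTS:
            return f"timer stopped by {name}"
        if name == RESTART_EVENT:
            deadline = time_s + timer_s  # restart the timer
    return "timeout: RNC sends IU RELEASE REQUEST"


# The SRNS Context Req at 3 s restarts the timer, so the release at 12 s is in time.
print(run_trelocoverall([(3.0, "SRNS_CONTEXT_REQ"), (12.0, "IU_RELEASE_CMD")], timer_s=10.0))
# Without the restart, the same release arrives after the timer has expired.
print(run_trelocoverall([(12.0, "IU_RELEASE_CMD")], timer_s=10.0))
```

The second call shows the case counted as no reply: the UE may still complete the procedure in the 2G cell, but the release arrives after the timer has expired.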
<Method of analysis>:

Check whether the 2G SGSN allows the handover-in.

Determine the top N cells according to the performance data, conduct drive test,
trace the UE LOG, and check whether the RAU is complete after the UE accesses
the 2G cell.

Figure 1 Flow on LAU/RAU after the UE accesses the 2G cell

<Suggestions>:
If it takes an extremely long period to complete the RAU, the possible cause is the
radio environment of the target cell. Therefore, if the UE can complete the RAU in the
2G cell, you can extend the Trelocoverall timer to raise the inter-RAT handover
success rate.

6.4 Inter-RAT Handover from GPRS to WCDMA (PS Domain)
6.4.1 KPI Definition
VS.IRATHO.SuccPSInUE.RNC.Rate= VS.IRATHO.SuccPSInUE.RNC /
VS.IRATHO.AttPSInUE.RNC

6.4.2 Analysis Process


The inter-RAT handover-in success rate in the PS domain is consistent with the RRC setup
success rate with the cause value of inter-RAT cell reselection. For details about the
analysis of the PS inter-RAT handover-in problems, see the section on the RRC setup
success rate.

6.5 List of Problem Information


Checklist for KPI Troubleshooting-5.10 .xls

7 Information Collection

7.1 Performance data of RNC


7.1.1 Purpose

To determine the KPIs that are deteriorated

To know the magnitude, trend, and scope of KPI changes

To provide the guidance for subsequent IOS tracing, drive test, and feedback of the
CHR data (in which subrack)

7.1.2 Information to Be Collected

Performance data generated within one week before the KPIs are changed

All performance data generated after the problem occurs

7.1.3 Method
1.

Collecting the Information Through the RNC by Using FTP


The BAM of the RNC automatically collects and saves the performance data.
Therefore, the related field personnel can log in to the RNC BAM to obtain the desired
performance data by using FTP.
On the RNC BAM, the performance data is saved in the following path:
V2 platform: \BAM\VersionA (VersionB)\FTP\MeasResult
V1 platform: \BAM\\FTP\MeasResult
You can query the workarea directory of the BAM by running the LST BAMAREA command.

Figure 1 Querying the workarea of the BAM

2.

Collecting the information from the M2000


The M2000 periodically retrieves the performance data from the BAM of the RNC and
saves it. Therefore, the performance data of the existing network can be obtained from
the M2000 by using FTP.
You can obtain the performance data by using the following method:
Log in to the M2000 by using FTP: ftp://(M2000 IP).
Enter the FTP username and password of the M2000.
The performance data is saved in the following folder: ftp://(M2000 IP)/ftproot/pm/
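
The FTP steps above can also be scripted. The following is a minimal sketch that uses the Python standard library module ftplib. It assumes that the account may list and read the folder /ftproot/pm/ and that the folder contains the result files directly. The function name, the host, the credentials, and the local directory are placeholders.

```python
import ftplib
import os
from typing import List


def download_pm_files(ftp, remote_dir: str, local_dir: str) -> List[str]:
    """Download every regular file in remote_dir and return the local paths."""
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():
        local_path = os.path.join(local_dir, os.path.basename(name))
        try:
            with open(local_path, "wb") as handle:
                ftp.retrbinary(f"RETR {name}", handle.write)
        except ftplib.error_perm:
            # Not a regular file (for example, a subdirectory): skip it.
            os.remove(local_path)
            continue
        saved.append(local_path)
    return saved


# Usage sketch (host and credentials are placeholders):
# with ftplib.FTP("M2000_IP") as ftp:
#     ftp.login("ftp_user", "ftp_password")
#     print(download_pm_files(ftp, "/ftproot/pm/", "pm_data"))
```

The function takes the connection as a parameter so that the same code can be reused for the RNC BAM, where only the remote path differs.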

7.2 RNC CHR/PCHR


7.2.1 Purpose

To view the classification of main KPIs (for example, handover and call drop rate)
through full record

To check whether soft failure occurs in the DSP, whether the problem occurs in the
terminals of the same model, and whether there exists identical print information about
internal errors

7.2.2 Information to Be Collected

Data generated before and after the problem occurs

Data of the corresponding subrack if the field personnel determine that the problem is
a single-subrack problem

7.2.3 Method
1.

Collecting the information from the RNC by using FTP


For the RNC of the RAN10 version, the CHR log and PCHR log are merged into one
file.
Log in to the RNC BAM (remotely, locally or through FTP). Then, you can obtain the
CHR data on the RNC in the following path:
X:\Bsc6800\BAM\Common\Famlog\fmt

2.

Collecting the information through the COL LOG command


By running the COL LOG command, you can export the CHR log, alarm log, and
operation log at a time. This method is recommended if you need to feed the
information back.

Figure 1 Exporting the CHR log (by running the COL LOG command)

The exported file is named FixInfo_Host.zip. After decompressing the file, you can
obtain the operation information.
By default, the exported file is saved in the following directory:
V2 platform: \BAM\VersionA\FTP or \BAM\VersionB\FTP
V1 platform: \BAM\\FTP
You can query the workarea directory of the BAM by running the LST BAMAREA command.

7.3 RNC IOS Tracing


7.3.1 Purpose

To determine the processes that cause the problem

7.3.2 Information to Be Collected

Trace the faulty top N cells

7.3.3 Method
In the Maintenance window of the operation and maintenance system, click Trace
Management and select the types of objects to be traced.
Figure 1 Types of objects to be traced

Double-click IOS. Then, the IOS Tracing dialog box is displayed.

Figure 2 IOS Tracing dialog box

Set the related parameters in the dialog box. For the traced events, you usually select
Select Default. In special cases, you can select other events or select Select All. A
maximum of 50 calls and 32 cells can be traced at a time. You can start a maximum of
eight tasks.
The trace tasks occupy a large number of resources. It is recommended that you create the trace tasks
when the system is idle, or that you trace only one IOS at a time. If the system is busy, the running trace
task may be terminated automatically.
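
Before creating the trace tasks, you can check a planned set of tasks against the limits stated above. The following is a minimal sketch. It assumes that the limits of 50 calls and 32 cells apply to each task, which the text does not state explicitly. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_CALLS = 50  # calls traced at a time (from the text above)
MAX_CELLS = 32  # cells traced at a time (from the text above)
MAX_TASKS = 8   # trace tasks that can be started (from the text above)


@dataclass
class IosTraceTask:
    """Hypothetical description of one planned IOS trace task."""
    cell_ids: List[int]
    max_calls: int


def check_trace_plan(tasks: List[IosTraceTask]) -> List[str]:
    """Return the limit violations; an empty list means that the plan fits."""
    problems = []
    if len(tasks) > MAX_TASKS:
        problems.append(f"{len(tasks)} tasks planned, maximum is {MAX_TASKS}")
    for index, task in enumerate(tasks, start=1):
        if len(task.cell_ids) > MAX_CELLS:
            problems.append(f"task {index}: {len(task.cell_ids)} cells, maximum is {MAX_CELLS}")
        if task.max_calls > MAX_CALLS:
            problems.append(f"task {index}: {task.max_calls} calls, maximum is {MAX_CALLS}")
    return problems


# A single task that covers 40 cells exceeds the cell limit.
plan = [IosTraceTask(cell_ids=list(range(101, 141)), max_calls=50)]
print(check_trace_plan(plan))
```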

Click More Info. Then, you can set the browsing and saving of messages, as shown in
Figure 3. You need to pay attention to two parts in Figure 3. In the area marked with 1, you
need to select the desired traffic classes, for example, the BE service or stream service. In
the area marked with 2, you need to set the RAB properties to trace the events selectively.
Especially, you can obtain the filtered information when analyzing the specific problems.

Figure 3 MoreInfo dialog box

Click OK to start the tracing task.

7.4 RNC IFTS/CDT (User Plane) Tracing


7.4.1 Purpose

To deeply analyze the data (mainly the L2 data) of a typical scenario

To extract the TCP data and voice Wav data

7.4.2 Information to Be Collected

User-plane tracing and L2 measurement

Visibility of the performance monitoring item

Top N faulty cells for IFTS tracing

7.4.3 Method
Trace the internal printed messages of the RNC by modifying the script file: Use
UltraEdit-32 or Notepad to open the text file RncTestConfig.xml under the specified
directory of the LMT version used for tracing, for example,
D:\HWLMT\adaptor\clientadaptor\RNC\BSC6800V100R008C01B082\style\defaultstyle\locale\en_US\rnctest\RncTestConfig.xml
Set the value of each parameter to 1.
<DESC descname="CDTMSGTYPE">
<PARAS>
<PARA name="UI_FAM_UT_STANDARD_MSG" value="1"/>
<PARA name="UI_FAM_UT_INTRA_MSG" value="1"/>
<PARA name="UI_FAM_UT_CTRL_TBL" value="1"/>
<PARA name="UI_FAM_UT_STATE_TRANS" value="1"/>
<PARA name="UI_FAM_UT_PRINT_INFO" value="1"/>
<PARA name="UI_FAM_UT_FUNC_CALL" value="1"/>
<PARA name="UI_FAM_UT_L2_DATA_FWD_MSG" value="1"/>
<PARA name="UI_FAM_UT_L2_TXT_FWD_MSG" value="1"/>
<PARA name="UI_FAM_UT_GTPU_DATA_FWD_MSG" value="1"/>
<PARA name="UI_FAM_UT_REAL_TIME_INFO" value="1"/>
<PARA name="UI_FAM_UT_FMR_SIG_DT_FWD_MSG" value="1"/>
<PARA name="UI_FAM_UT_FMR_UP_DT_FWD_MSG" value="1"/>
<PARA name="UI_FAM_UT_RADIO_PERF_INFO" value="1"/>
<PARA name="UI_FAM_UT_CELL_INFO" value="1"/>
<PARA name="UI_FAM_UT_ALPATH_PVC_INFO" value="1"/>
</PARAS>
</DESC>
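
Instead of editing the file manually, you can set the values with a short script. The following is a minimal sketch that uses the Python standard library XML parser. It assumes that the DESC element shown above is nested somewhere inside a single root element of RncTestConfig.xml. The root element name CONFIG in the sample and the function name are hypothetical. Note that this parser does not preserve comments or the XML declaration, so back up the original file before overwriting it.

```python
import xml.etree.ElementTree as ET


def enable_all_cdt_messages(xml_text: str) -> str:
    """Set value="1" on every PARA inside the CDTMSGTYPE description."""
    root = ET.fromstring(xml_text)
    for desc in root.iter("DESC"):
        if desc.get("descname") == "CDTMSGTYPE":
            for para in desc.iter("PARA"):
                para.set("value", "1")
    return ET.tostring(root, encoding="unicode")


# Shortened, hypothetical sample of the file content.
sample = """<CONFIG>
  <DESC descname="CDTMSGTYPE">
    <PARAS>
      <PARA name="UI_FAM_UT_STANDARD_MSG" value="0"/>
      <PARA name="UI_FAM_UT_CELL_INFO" value="1"/>
    </PARAS>
  </DESC>
  <DESC descname="OTHER">
    <PARAS>
      <PARA name="X" value="0"/>
    </PARAS>
  </DESC>
</CONFIG>"""

updated = enable_all_cdt_messages(sample)
print(updated)
```

Only the parameters under CDTMSGTYPE are changed. Other DESC elements keep their values.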
After modifying the script, start the LMT of the corresponding RNC version and log in to
the BAM. In the Maintenance window of the operation and maintenance system, click
Trace Management and select the types of objects to be traced. For both IFTS tracing and
CDT tracing, you need to select CDT.
Figure 1 Type of trace object

Double-click CDT. Then, the dialog box of task parameter setting is displayed. If you select
UEID in the CDT Match Type box, the tracing task is CDT tracing. On the UE ID tab,
enter the traced UE IMSI and select the saving path. The default saving path is X:\HW
LMT\client\output\RNC\BSC6810V200R010C01B061\trace. You can also set a custom
saving path and file name. If the tracing period is extremely long, multiple files are
generated. Generally, the end of the filename is 1 or -2. Click OK to start the tracing
task. The CDT data of up to two UEs can be traced simultaneously. However, the sum of
the number of started CDTs and number of standard-interface tasks of the UE cannot
exceed six.
Figure 2 Configuration page of CDT parameters

If you select IFTS in the CDT Match Type box, the tracing task is IFTS tracing. On the
UE ID tab, you can set the tracing period. The value 0 indicates that the tracing period is
not restricted. In addition, you need to set the ID of the cell to be traced, select an SPU
subsystem, select traffic classes, or RRC setup reasons. Finally, click OK to start the tracing
task.

Figure 3 Configuration page of IFTS parameters

For CDT tracing or IFTS tracing, it is recommended that the user-plane tracing be attached.
If necessary, you need to attach the performance monitor tracing.

Figure 4 Configuration page of user-plane tracing

On the Other tab, set the information about the user-plane tracing. Generally, you need to
select Periodically Data Report (it is set to 2s). L2 Data Report Time(s) is set to 100s.
AUTO_PACKET_GENERATE cannot be selected.
On the Monitor tab, you can select the performance monitor items to be traced.

Figure 5 Configuration page of performance monitoring

7.5 Standard Signaling Tracing on the RNC


7.5.1 Purpose

Mainly analyze the problems of RRC access on the Uu interface.

Analyze the Iub/Iur-specific problems.

7.5.2 Information to Be Collected

Trace the related interface signaling on a case-by-case basis.

7.5.3 Method
On the Maintenance page of the RNC LMT, choose Trace Management > Interface
Trace Task, select and double-click the corresponding interface, and configure the tracing
task. Then, you can start the interface tracing task. Standard signaling tracing includes the
message tracing of the Uu interface, Iub interface, Iur interface, and Iu interface.
1.
Uu Interface Tracing
On the Maintenance page of the RNC LMT, choose Trace Management > Interface
Trace Task, and double-click UU Interface. Then, the following interface is
displayed. Click OK on the interface.
Figure 1 Uu interface tracing

When configuring the cells to be traced, enter the cells in the format of
R1:C1/C2;R2:C1/C2, for example, 174:101/102;175:201. Here, 174 and 175 are RNC
IDs, and 101, 102, and 201 are cell IDs. For Tracing message type, it is recommended
that you select Select All if you are not sure of the problem. You can also select the
appropriate tracing message types as needed.
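
The cell list format R1:C1/C2;R2:C1/C2 can be checked before it is entered in the dialog box. The following is a minimal sketch of a parser for this format. The function name is hypothetical, and the sketch assumes that all IDs are decimal integers, as in the example above.

```python
from typing import Dict, List


def parse_traced_cells(spec: str) -> Dict[int, List[int]]:
    """Parse 'R1:C1/C2;R2:C1/C2' into a mapping of RNC ID to cell IDs."""
    result: Dict[int, List[int]] = {}
    for group in spec.split(";"):
        group = group.strip()
        if not group:
            continue  # tolerate a trailing semicolon
        rnc_part, separator, cells_part = group.partition(":")
        if not separator:
            raise ValueError(f"missing ':' in group {group!r}")
        rnc_id = int(rnc_part)
        # int() raises ValueError for an empty or non-numeric cell ID.
        cell_ids = [int(cell) for cell in cells_part.split("/")]
        result.setdefault(rnc_id, []).extend(cell_ids)
    return result


print(parse_traced_cells("174:101/102;175:201"))
```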
2.

Iub Interface Tracing


On the Maintenance page of the RNC LMT, choose Trace Management > Interface
Trace Task, and select and double-click IUB Interface. Then, the following interface
is displayed. You can choose to trace the specified NodeB or all NodeBs.

Figure 1 Iub interface tracing

3.

Iur Interface Tracing


IUR interface tracing enables you to trace the messages between the current RNC and
all neighboring RNCs or the messages between the current RNC and the specific
neighbor RNC. To trace the messages between the current RNC and the specific RNC,
you need to run the LST N7DPC command to query the DSP code of the neighbor
RNC.
On the Maintenance page of the RNC LMT, choose Trace Management > Interface
Trace Task, and select and double-click IUR Interface. Then, the following interface
is displayed. To trace the messages of the specified DSP, you need to enter the DSP
code. Note that the DSP code must be in the hexadecimal format. Finally, click OK.

Figure 1 Iur interface tracing

4.

Iu Interface Tracing
IU interface tracing enables you to trace the messages between the current RNC and
all CNs or the messages between the current RNC and the specific CN. To trace the
messages between the current RNC and the specific CN, you need to run the LST
N7DPC command to query the DSP code of the specified CN, as shown in the
following figure.

Figure 1 Querying the DSP code of the CN

On the Maintenance page of the RNC LMT, choose Trace Management > Interface
Trace Task, and select and double-click IU Interface. Then, the following interface is
displayed. To trace the messages of the specified DSP, enter the DSP code. Note that
the DSP code must be in the hexadecimal format. Finally, click OK.

7.6 UE QXDM LOG


7.6.1 Purpose

To analyze the problems related to signaling flow by querying the RNC data, and
analyze the KPI problems caused by the specific terminals and under specific
scenarios

To analyze the problems related to power control

To analyze the user-plane problems

7.6.2 Information to Be Collected

During the drive test, obtain the log at the UE side that corresponds to the log at the
network side.

7.6.3 Method
Install the QPST and QXDM software (Note: To use the QXDM, you need to activate the
QXDM online or apply for a license and activate the QXDM manually). To install the data
card driver, insert the data card into the port. Then, query the port number in the Windows
Device Manager.
Configure the QPST. Choose QPST Configuration > Port and click the Add New Port
button at the lower right corner. Then, the Add New Port dialog box is displayed. Select
the data card port found in the previous step. If you add the port successfully, you can view the
information about the port in the QPST Configuration window.
Figure 1 Configuring the QPST port

Choose Qxdm Options > Communication, select the port identified in the Windows
Device Manager, and enable log tracing.
Figure 2 Connecting the equipment ports

Choose Qxdm Options > Logging View Configuration, select the message items to be
traced on the Message Packets and Log packets tabs, set the saving path of the tracing
files on the Misc tab, and click OK.

Figure 3 Enabling log tracing

Choose Options > Logging or press the ALT+L shortcut key to start the tracing. Press the
ALT+L shortcut key again to stop the tracing.

7.7 Real-Time Performance Monitoring of RNC


7.7.1 Purpose

To know the signal quality of the air interface on the uplink and downlink before and
after call drop and handover

7.7.2 Information to Be Collected


7.7.3 Method
Real-time performance monitoring includes connection performance monitoring, cell
performance monitoring, link performance monitoring, and board resource monitoring.
During the troubleshooting, connection performance monitoring and cell performance
monitoring are often used.
Log in to the RNC LMT, and choose Realtime Performance Monitoring > Connection
performance monitoring on the Maintenance tab. In the displayed dialog box, select the item
to be monitored and the file saving directory, and click OK. The parameter settings for cell
performance monitoring, link performance monitoring, and board resource monitoring are
the same as the preceding operation.

Figure 1 Real-time performance monitoring

7.8 RNC Script Configuration


7.8.1 Purpose

To know the network configuration, neighbor relation between cells, switch setting,
and parameter setting from the configuration script and file

7.8.2 Information to Be Collected

The configuration script generated before and after the problem occurs.

7.8.3 Method
Run the EXP INNERCFGMML command on the RNC LMT, export the configuration
data on the BAM as an MML script file, and save the script file in the default or specified
path.
Figure 1 RNC script configuration

Extract the configuration script under the corresponding directory.


By default, the script configuration is saved in the following path:
V2 platform: \BAM\VersionA\FTP or \BAM\VersionB\FTP
V1 platform: \BAM\\FTP
You can query the workarea directory of the BAM by running the LST BAMAREA command.

7.9 Operation Log of RNC


7.9.1 Purpose

To know the main suspicious operations performed before and after the problem
occurs

7.9.2 Information to Be Collected

Logs generated before and after the problem occurs

7.9.3 Method
1.

Collecting the information through the EXP LOG command


Run the EXP LOG command on the RNC LMT to export the operation log generated
in a certain time segment on the RNC.

Figure 1 Exporting the operation log by running the EXP LOG command

By default, the operation log data is saved in the following path:


V2 platform: \BAM\VersionA\FTP or \BAM\VersionB\FTP
V1 platform: \BAM\\FTP
You can query the workarea directory of the BAM by running the LST BAMAREA command.

2.

Collecting the information through the COL LOG command


You can also export the CHR log, alarm information, and operation log information by
running the COL LOG MML command on the LMT. If you also need to collect such
data at the same time, the method is recommended. For details, see 7.2.3.

7.10 Alarm Information on RNC


7.10.1 Purpose

To check whether there exists the alarm information about the corresponding problem,
for example, intermittent interrupt of transmission and high DSP utilization rate

7.10.2 Information to Be Collected

Alarm information generated before and after the problem occurs

7.10.3 Method
1.

Saving the alarms from the alarm box of the LMT


Open the Alarm Browsing window on the LMT, select the alarm to be saved, and
save the alarm as a csv file, html file, or txt file. You can define the saving directory
and filename yourself.

Figure 1 Alarm box of the LMT

2.

Collecting the information through the EXP ALMLOG command


Run the EXP ALMLOG MML command on the LMT and set the related parameters.
Then, you can obtain the corresponding alarm log. You can save the alarm log as a csv
file or txt file.

Figure 1 Exporting the alarms

Alarm Severity: It is recommended that you set the parameter to the default value. As a
result, the LMT returns all types of alarm log information.
Returned Records: Preferably, set the parameter to a value larger than 500 so that the
LMT returns as many records as possible, including the alarms generated at the time
when the problem occurred.
Filename: The filename is the system time at which the alarm is exported.
By default, the exported file is saved in the following directory:
V2 platform: \BAM\VersionA\FTP or \BAM\VersionB\FTP\ExportAlmLog
V1 platform: \BAM\\FTP\ExportAlmLog
You can query the workarea directory of the BAM by running the LST BAMAREA command.

3.

Collecting the information through the COL LOG command


You can run the COL LOG command to export the CHR log, operation log, and alarm
information at the same time. For details about data collection, see 7.2.3.

7.11 Node B Configuration Script


7.11.1 Purpose

To have a general knowledge of the basic data and algorithm of the NodeB

To check the configuration data of the RNC for consistency

7.11.2 Information to Be Collected

Configuration script generated before and after the problem occurs

Typical site or faulty site

7.11.3 Method
1.
Collecting the information from the M2000


Run the ULD CFGFILE MML command on the NodeB through the M2000 to upload
the configuration file to the FTP server.
Figure 1 Exporting the NodeB configuration file through the M2000

Log in to the FTP server through an FTP client, and obtain the configuration data under
the specified path.
2.

Collecting the information from the NodeB LMT


On the Maintenance page of the NodeB LMT, choose Service Navigation >
Software Management, and select Data Config File Transfer.

Figure 1 Data Config File Transfer

Double-click Data Config File Transfer, and the following dialog box is displayed.
Select Upload (NodeB to FTP Server), set Compress Flag to Compress, and select a
saving path for the exported file.

Figure 2 FTP upload

Set the related parameters of the FTP server. You can use the current built-in FTP
server or the specified FTP server. After setting the FTP server, click OK. Then, the
LMT collects the configuration script of the NodeB.
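
When the configuration file is exported to an FTP server, as in the M2000 method, fetching it can also be scripted instead of using an interactive FTP client. The following Python sketch uses the standard ftplib module. The host, account, and remote path are placeholders because this guide does not specify them, and the function names are illustrative only.

```python
from ftplib import FTP
from pathlib import Path


def fetch_config(ftp, remote_path: str, local_dir: str) -> Path:
    """Download one file over an already open FTP session.

    `ftp` is an ftplib.FTP instance (or any object with a compatible
    retrbinary method). Returns the path of the local copy.
    """
    local_path = Path(local_dir) / Path(remote_path).name
    with open(local_path, "wb") as out:
        # RETR transfers the remote file in binary mode; every received
        # block is passed to out.write.
        ftp.retrbinary(f"RETR {remote_path}", out.write)
    return local_path


def fetch_from_server(host: str, user: str, password: str,
                      remote_path: str, local_dir: str = ".") -> Path:
    """Open a session, fetch the file, and close the session again.

    All arguments are placeholders: use the FTP server and the path that
    were specified when the configuration file was exported.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        return fetch_config(ftp, remote_path, local_dir)
```

Keeping the transfer in a function that accepts an open session makes it possible to reuse one login for several NodeB files.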

7.12 Node B CHR


7.12.1 Purpose

To check whether data arrives at the NodeB

To measure the number of packets that are transmitted successfully or discarded
on a dedicated channel or a common channel

To measure the number of preambles received on the PRACH

7.12.2 Information to Be Collected

Data generated in the time segment during which the problem occurs

7.12.3 Method
Log in to the NodeB LMT, and run an MML command to enable the CHR function of the
NodeB. By default, the CHR function of the NodeB is disabled. If the CHR function is
already enabled, skip this step.


Figure 1 Setting the CHR level of the NodeB

Collect the NodeB CHR logs. After the problem occurs, download the CHR logs of the
NodeB through the M2000, RNC FtpServer, or local FtpServer of the NodeB.
Whichever mode you use, ensure that the CHR function is enabled.
Figure 2 NodeB CHR reporting switch

7.13 Node B Alarm


7.13.1 Purpose

To check whether any alarm information relates to the problem, for example, an
intermittent transmission interruption.

7.13.2 Information to Be Collected

The alarm information generated before and after the problem occurs.

7.13.3 Method
1. Collecting the information from the M2000


On the default interface of the M2000, choose Fault Query to query each of the four
types of alarms separately.


Figure 1 Querying the alarm information

Select all NodeBs, BBUs, or RRUs, and select all levels and types. Specify the
corresponding time range. Click Query.
Click Save to save the alarm information as a TXT file.
Figure 2 Saving the alarm information

Query the other three types of alarms in the same way, and save the alarm information
of each type to a file in the selected folder.
2. Saving the alarm information from the alarm box of the LMT

Open the Alarm Browsing window on the LMT, select the alarm to be saved, and
save the alarm as a csv file, html file, or txt file. You can define the saving directory
and filename yourself.
Figure 1 Alarm box of the NodeB LMT

7.14 Node B CDT


7.14.1 Purpose

To analyze the service-related problems

7.14.2 Information to Be Collected

During the drive test, collect the logs at the network side and at the RNC side.

7.14.3 Method
Open the TraceTask.ini file in the following path: Disk letter: \HW
LMT\adaptor\clientadaptor\NodeB\Version number\style\defaultstyle\conf\trace. Find the
property page label to be modified (including the Iub and Uu interfaces of the user, and Iub
and Uu interfaces of the cells), and set the check marks of the monitor items to be traced to
0 or 1.
The modification rules are as follows:
Name of monitor item = Check mark, ID of monitor item, Parameter 1 is required or not,
maximum of Parameter 1, minimum of Parameter 1, default of Parameter 1
Parameter 2 is required or not, maximum of Parameter 2, minimum of Parameter 2, default
of Parameter 2
Parameter 3 is required or not, maximum of Parameter 3, minimum of Parameter 3, default
of Parameter 3
Value of check mark: 1, ticked on the interface; 0, not ticked on the interface (default); 2, not
displayed on the interface
Blue fonts: Except for the fields that state whether a parameter is required, the contents in
blue font can be left blank, but their positions must be reserved.
Whether the parameter is required: 0, not required; 1, required; 2, non-editable on the
interface
For details, see Figure 1.
Figure 1 Modifying the properties of the monitor items of the NodeB CDT
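
The modification rules can be checked with a small script before the LMT reads the file. The following Python sketch parses one monitor item line. It assumes that all fields are separated by commas, that the check mark and IDs are decimal integers, and that each parameter occupies exactly four positions; the item name in the usage note is invented. Verify these assumptions against the actual TraceTask.ini file before relying on the result.

```python
from dataclasses import dataclass
from typing import List, Optional

# Meanings taken from the modification rules in this section.
CHECK_MARK = {0: "not ticked (default)", 1: "ticked", 2: "not displayed"}
REQUIRED = {0: "not required", 1: "required", 2: "non-editable"}


@dataclass
class Param:
    required: int
    maximum: Optional[int]
    minimum: Optional[int]
    default: Optional[int]


@dataclass
class MonitorItem:
    name: str
    check_mark: int
    item_id: int
    params: List[Param]


def _opt_int(text: str) -> Optional[int]:
    """Blank positions are allowed, but the position itself must exist."""
    text = text.strip()
    return int(text) if text else None


def parse_monitor_item(line: str) -> MonitorItem:
    """Parse 'Name = check mark, ID, P1 flag, P1 max, P1 min, P1 default, ...'."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError("missing '=' in monitor item line")
    fields = value.split(",")
    if len(fields) < 2:
        raise ValueError("check mark and item ID are mandatory")
    check_mark, item_id = int(fields[0]), int(fields[1])
    if check_mark not in CHECK_MARK:
        raise ValueError(f"check mark must be 0, 1, or 2, got {check_mark}")
    rest = fields[2:]
    if len(rest) % 4 != 0:
        raise ValueError("each parameter needs four positions, even if blank")
    params = []
    for i in range(0, len(rest), 4):
        required = int(rest[i])        # this field may not be left blank
        if required not in REQUIRED:
            raise ValueError(f"required flag must be 0, 1, or 2, got {required}")
        params.append(Param(required, *(_opt_int(f) for f in rest[i + 1:i + 4])))
    return MonitorItem(name.strip(), check_mark, item_id, params)
```

For example, `parse_monitor_item("DemoItem = 1, 101, 1, 100, 0, 50, 0, , , , 0, , , ")` yields a ticked item with one required parameter and two parameters whose blank positions are preserved.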

On the LMT, double-click CDT Cell Tracing, set the IDs of logical cells to be traced,
select the monitor items to be traced on the Iub/Uu page, and click OK.
Figure 2 Enabling CDT tracing of the NodeB cells

Double-click User Tracing. The trace method can be the initial link establishment time,
CRNCID, or IMSI. If Trace Method is set to Chain Time, the time that you enter
must be consistent with the time of the BTS.


Figure 3 Basic setting

To trace a specified IMSI, run the MOD NODEB: NodeBId = xxx,
NodebTraceSwitch=ON command on the RNC (xxx indicates the NodeB ID).
When you enable user tracing on the NodeB LMT, set Trace Method to IMSI, and enter the
corresponding IMSI.
Figure 4 Setting other monitor items

Select the corresponding CDT monitor items on the Iub interface and Uu interface
respectively.


7.15 Checking Whether Any Neighboring Cells Are Not Configured

Missing neighboring cell detection checks whether the neighbor relation has been
configured for every neighboring cell, so that you can find the neighboring cells that are
missing from the configuration.
There are three types of missing neighboring cell detection: intra-frequency,
inter-frequency, and inter-RAT. The three types are independent of each other. After the
LMT delivers the MNCDT message, all cells in the RNC undergo the missing neighboring
cell detection independently.
Intra-frequency detection: The Trigger Condition of the 1A event in the intra-frequency
measurement control is set to Monitored Set plus Detected Set. Then, the UE also reports
the cells that it measures in the detected set.
Inter-frequency detection: You need to set the frequency and the range of scrambling
codes to be detected. With inter-frequency detection enabled, the measurement control
can carry a maximum of 32 neighboring cells, but a single cell is normally configured
with fewer neighboring cells than that. Therefore, the cells configured for missing
neighboring cell detection are used to fill the measurement object list up to 32
neighboring cells.
Inter-RAT detection: You need to configure the network color code, BTS color code, band
indication, and frequency range to be detected. As with inter-frequency detection, the
cells configured for missing neighboring cell detection are used to fill the measurement
object list up to 32 neighboring cells.
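
The way the measurement object list is filled up to 32 cells can be illustrated as follows. This Python sketch is not the RNC algorithm: the order in which candidate cells are taken (ascending scrambling code) and the representation of a cell as a (downlink UARFCN, scrambling code) pair are assumptions made for the example.

```python
from typing import List, Tuple

Cell = Tuple[int, int]   # (downlink UARFCN, primary scrambling code)
MAX_CELLS = 32           # maximum number of cells in the measurement control


def build_measurement_list(configured: List[Cell],
                           dl_uarfcn: int,
                           psc_start: int,
                           psc_end: int,
                           limit: int = MAX_CELLS) -> List[Cell]:
    """Return the configured neighbors plus detection candidates, up to `limit`.

    Configured neighbors keep their place at the head of the list. The free
    positions are filled with candidates from the detection range
    psc_start..psc_end on the detection frequency.
    """
    cells = list(configured[:limit])
    known = set(cells)
    for psc in range(psc_start, psc_end + 1):
        if len(cells) >= limit:
            break                      # the list is full
        candidate = (dl_uarfcn, psc)
        if candidate not in known:     # do not list a configured neighbor twice
            cells.append(candidate)
            known.add(candidate)
    return cells
```

With two configured neighbors and a detection range wider than 30 scrambling codes, only 30 candidates fit in one measurement control. This is consistent with the later observation that the cell list includes only some cells in the MNCDT range.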

7.15.1 Enabling Call Trace for Missing Neighboring Cell Detection Tracing

As with interface tracing, you enable the call trace for missing neighboring cell
detection tracing (MNCDT) on the LMT, as shown in Figure 1.


Figure 1 Enabling call trace to check whether any neighboring cells are not configured

Double-click MNCDT, and the following interface is displayed.


Figure 2 Configuration interface of intra-frequency MNCDT

On the configuration interface, you can select one of three types of MNCDT: intra-frequency,
inter-frequency, and inter-RAT. Figure 2 shows the configuration interface of
intra-frequency MNCDT.
2. Enable intra-frequency MNCDT


For intra-frequency MNCDT, you do not need to set any parameters. Set Detection
Type to Intra Freq, and click OK. Then, the following window is displayed.


Figure 1 MNCDT window

After you enable the intra-frequency MNCDT, the intra-frequency measurement control
message contains the following information: The TriggerCondition of the 1A event is
detectedSetAndMonitoredSetCells.
Figure 2 Intra-frequency measurement control after the intra-frequency MNCDT is enabled

3. Enable inter-frequency MNCDT


Set Detection Type to Inter Freq.
The following configuration interface is displayed.


Figure 1 Configuration interface of inter-frequency MNCDT

Uplink UARFCN: Uplink UARFCN of the cell that undergoes the MNCDT
Downlink UARFCN: Downlink UARFCN of the cell that undergoes the MNCDT
Start of Primary Scrambling Code: Minimum scrambling code that undergoes the
MNCDT
End of Primary Scrambling Code: Maximum scrambling code that undergoes the
MNCDT
Constraints:

The user is responsible for keeping the uplink UARFCN and the downlink UARFCN
consistent with each other. For scalability, and because the protocol does not
stipulate the relationship between them, the system does not check this
relationship.

The End of Primary Scrambling Code is greater than or equal to the Start of
Primary Scrambling Code.
Click OK. Then, the MNCDT window is displayed, as shown in Figure 1.
Observe the measurement control: After you enable the inter-frequency MNCDT,
the cell list of the inter-frequency measurement control includes some cells in the
MNCDT range.

4. Enable inter-RAT MNCDT


Set Detection Type to Inter RAT.
The following configuration interface is displayed.


Figure 1 Configuration interface of inter-RAT MNCDT

Network color code (NCC): The NCC of the cell that undergoes the MNCDT.
BTS color code (BCC): The BCC of the cell that undergoes the MNCDT.
Frequency Indicator: The DCS 1800 and PCS 1900 bands have some overlapping frequency
numbers. The frequency indicator specifies which band an overlapping frequency
number belongs to.
Start of BCCH ARFCN: The minimum frequency number that undergoes the MNCDT.
End of BCCH ARFCN: The maximum frequency number that undergoes the MNCDT.
Click OK. Then, the MNCDT window is displayed.
Observe the measurement control: After you enable the inter-RAT MNCDT, the cell
list of the inter-RAT measurement control includes some cells in the MNCDT range.
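
Before entering the inter-frequency and inter-RAT detection parameters, the values can be sanity-checked offline. The following Python sketch is an illustration, not part of the LMT. The scrambling code constraint (end not smaller than start) comes from this section. The value ranges (primary scrambling codes 0 to 511, NCC and BCC 0 to 7, ARFCNs 512 to 810 shared by DCS 1800 and PCS 1900) are general 3GPP values, and the ARFCN ordering check is an assumption by analogy with the scrambling code constraint. The function names are invented.

```python
from typing import List

PSC_MIN, PSC_MAX = 0, 511              # primary scrambling codes in UTRA FDD
OVERLAP_MIN, OVERLAP_MAX = 512, 810    # ARFCNs used by both DCS 1800 and PCS 1900


def check_psc_range(psc_start: int, psc_end: int) -> List[str]:
    """Check the scrambling code range of the inter-frequency MNCDT."""
    problems = []
    for label, value in (("Start", psc_start), ("End", psc_end)):
        if not PSC_MIN <= value <= PSC_MAX:
            problems.append(f"{label} of Primary Scrambling Code {value} "
                            f"is outside {PSC_MIN}..{PSC_MAX}")
    if psc_end < psc_start:
        problems.append("End of Primary Scrambling Code is smaller than Start")
    # The uplink/downlink UARFCN relationship is not checked here either:
    # as stated above, keeping the pair consistent is left to the user.
    return problems


def check_inter_rat(ncc: int, bcc: int,
                    arfcn_start: int, arfcn_end: int) -> List[str]:
    """Check the inter-RAT MNCDT parameters."""
    problems = []
    if not 0 <= ncc <= 7:
        problems.append(f"NCC {ncc} is outside 0..7")
    if not 0 <= bcc <= 7:
        problems.append(f"BCC {bcc} is outside 0..7")
    if arfcn_end < arfcn_start:        # assumed by analogy with the PSC range
        problems.append("End of BCCH ARFCN is smaller than Start")
    return problems


def band_indicator_matters(arfcn_start: int, arfcn_end: int) -> bool:
    """True if the range touches ARFCNs that exist in both bands."""
    return arfcn_start <= OVERLAP_MAX and arfcn_end >= OVERLAP_MIN
```

For example, `band_indicator_matters(700, 900)` returns True, so the Frequency Indicator must be set to the band that is actually deployed.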

7.15.2 Stopping the MNCDT


As with interface tracing, you stop the corresponding MNCDT by closing the message
tracing window.

7.15.3 Reporting the Missing Neighboring Cell Message

After the UE sends a measurement report on the missing neighboring cells and the
report meets the handover requirements, the RNC displays the information about the
missing neighboring cells in the missing neighboring cell message tracing window. For
example, Figure 1 shows the message tracing window for the missing intra-frequency
neighboring cells. The window displays the following information: serial number,
generation time, cell ID (ID of the best cell), standard message type, URNTI, and
message contents.


Figure 1 Message tracing for the missing intra-frequency neighboring cells

As shown in Figure 1, 16 messages are reported. Double-click a message to display its
contents in a separate window.
Figure 2 Reported message about the missing intra-frequency neighboring cells

The information elements (only the MNCDT-related ones) are described as follows:

The reported message about the missing intra-frequency neighboring cells
ulRnti: ulRnti of the UE.
ucActSetNum: Number of cells in the active set.
ausActCellId: Array of the cell IDs of the active set; the first cell is the best cell.
ucDetectCellNum: Number of detected missing neighboring cells.
ausSrimbleCode: Array of the scrambling codes of the detected missing neighboring cells.
Figure 3 Message tracing for the missing inter-frequency neighboring cells

Double-click the message. The following window is displayed.
Figure 4 Reported message about the missing inter-frequency neighboring cells

The information elements are described as follows:

ulRnti: ulRnti of the UE.
ucActSetNum: Number of cells in the active set.
ausActCellId: Array of the cell IDs of the active set; the first cell is the best cell.
ucDetectCellNum: Number of detected missing neighboring cells.
usUlUarFcn: Uplink UARFCN of the detected missing inter-frequency neighboring cell.
usDlUarFcn: Downlink UARFCN of the detected missing inter-frequency neighboring cell.
usPsc: Scrambling code of the detected missing neighboring cell.
Figure 5 shows the messages about the missing inter-RAT neighboring cells.


Figure 5 Message about the missing inter-RAT neighboring cell

ucNCC refers to the network color code of the reported missing neighboring cell, and
ucBcc refers to the BTS color code of the reported missing neighboring cell.
ucInterRatBandInd refers to the band indicator, and usInterRatArfcn refers to the frequency
number.
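
When many missing neighboring cell messages are reported, it helps to count how often each (best cell, scrambling code) pair appears, so that the most frequently detected missing neighbors are handled first. The following Python sketch assumes that the intra-frequency messages have already been transcribed into dictionaries keyed by the field names above. This guide does not describe an export format for the tracing window, so that input form and the function name are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rank_missing_neighbors(
        reports: Iterable[Dict]) -> List[Tuple[Tuple[int, int], int]]:
    """Count (best cell ID, scrambling code) pairs over all reports.

    Each report is a dict with the keys ucActSetNum, ausActCellId,
    ucDetectCellNum, and ausSrimbleCode. The result is sorted with the
    most frequently reported pair first.
    """
    counts = Counter()
    for report in reports:
        # Only the first ucActSetNum entries of the array are valid.
        active = report["ausActCellId"][:report["ucActSetNum"]]
        if not active:
            continue
        best_cell = active[0]          # the first cell is the best cell
        detected = report["ausSrimbleCode"][:report["ucDetectCellNum"]]
        for psc in detected:
            counts[(best_cell, psc)] += 1
    return counts.most_common()
```

A pair that is reported repeatedly for the same best cell is a strong candidate for a neighbor relation that should be added.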

7.16 Soft Failure of DSP


If access failures or call drops are concentrated on a certain DSP within a short period,
the cause may be a soft failure of that DSP. To check for a DSP soft failure, query the
CHR log.
Import the CHR log to the tool, and select a DSP log. If the problem mainly occurs on the
same CPU ID within a period, the problem may be caused by a soft failure of the DSP.


Figure 1 Analyzing the soft failure of the DSP through the CHR log
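
The check for failures concentrated on one CPU ID can be sketched as follows. This Python example assumes that the failure records have been extracted from the CHR log as (timestamp, CPU ID) pairs. The extraction step, the 80% share threshold, and the minimum record count are illustrative assumptions, not values from this guide.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Optional, Tuple


def suspect_cpu(records: Iterable[Tuple[datetime, int]],
                start: datetime,
                end: datetime,
                min_share: float = 0.8,
                min_count: int = 10) -> Optional[int]:
    """Return the CPU ID that dominates the failures in [start, end].

    Returns None if there are fewer than `min_count` failures in the
    period or if no single CPU ID reaches `min_share` of them.
    """
    counts = Counter(cpu for ts, cpu in records if start <= ts <= end)
    total = sum(counts.values())
    if total == 0 or total < min_count:
        return None                    # too few failures to judge
    cpu, hits = counts.most_common(1)[0]
    return cpu if hits / total >= min_share else None
```

The returned value can be formatted with `hex()` before it is entered into the CPUID tool, which expects a hexadecimal CPUID.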

Use the CPUID tool to query the DSP ID that corresponds to the CPU ID, and then solve the
problem by resetting that DSP. The tool has a V1 version and a V2 version, one for each
RNC platform. Use the version of the CPUID tool that matches your platform.

CPU ID.rar

Enter the hexadecimal CPUID in the specified field. Click From CPUID to obtain the
corresponding DSP ID. Run the RST DSP command on the LMT to reset the DSP.
Figure 2 Resetting the DSP


7.17 Terminal Troubleshooting


A KPI-related problem may be caused by a UE compatibility issue. Therefore, you need
to check the compatibility of the UE.
Normally, first check whether the problem is associated with a single IMSI. If the
problem occurs only for a single IMSI, a specific terminal model is the likely cause.
By associating the IMSI with the IMEI, you can find the terminal type of the faulty UE.
Figure 1 Analyzing the special UEID through the CHR log
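
The association of failures with an IMSI and then with a terminal type can be sketched as follows. This Python example assumes that the failing IMSIs have been extracted from the CHR log into a list and that an IMSI-to-IMEI mapping is available from another source; neither step is described in this guide. The terminal type is identified by the Type Allocation Code (TAC), which is the first eight digits of the IMEI. The IMSI and IMEI values in the usage note are dummy test values.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def suspect_terminals(
        failed_imsis: Iterable[str],
        imsi_to_imei: Dict[str, str],
        top: int = 3) -> List[Tuple[str, int, Optional[str]]]:
    """List the IMSIs with the most failures and the TAC of their terminal.

    Returns (IMSI, failure count, TAC) tuples. The TAC is None when the
    IMEI of the IMSI is unknown.
    """
    counts = Counter(failed_imsis)
    result = []
    for imsi, failures in counts.most_common(top):
        imei = imsi_to_imei.get(imsi)
        tac = imei[:8] if imei else None   # TAC identifies the terminal model
        result.append((imsi, failures, tac))
    return result
```

For example, with `failed = ["001010000000001"] * 5 + ["001010000000002"]` and `{"001010000000001": "490154203237518"}` as the mapping, the first entry of the result is `("001010000000001", 5, "49015420")`. If several failing IMSIs share one TAC, the terminal model rather than a single subscriber is the more likely cause.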
