XXXX
Project Manager: XXXX
Prepared for: XXXX
Prepared By: XXXX
www.aircominternational.com
Table of Contents
1 INTRODUCTION
2 NETWORK OVERVIEW
2.1 Core Hardware Location per City
2.2 BSS Hardware Location per City
3 CAPACITY ANALYSIS
3.1 VLR Subscriber Register Capacity
3.2 VLR Subscriber Capacity Currently in Use
3.3 VLR Subscriber Capacity Utilization
3.4 HLR Subscriber Capacity and Utilization
3.5 MGW Capacity and Utilization
4 PERFORMANCE INDICATORS
4.1 Introduction
4.2 Concepts
4.3 Availability
4.3.1 System Downtime
4.3.2 Signaling Performance, SS7 Link availability, ETSI
4.4 Accessibility
4.4.1 Authentication
4.4.2 Ciphering, GSM
4.4.3 CP Processor Load
4.4.4 Location Update
4.4.5 Mobile IN Calls
4.4.6 Channel Assignment
4.4.7 Short Messages Service (SMS), ORG
4.4.8 Short Messages Service (SMS), TERM
4.4.9 Successful SMS Delivery Terminating SMS
4.4.10 Signaling Performance, SS7 Link Congestion
List of Figures
List of Tables
Table 1 Core Hardware Location Per City
Table 2 BSS Hardware Location per City
Table 3 VLR Subscriber Register Capacity
Table 4 Clock Reference in XXXX Network
Table 5 HW FAULT MSC
Table 6 HW FAULT BSC
Table 7 Unused Cell ID Definitions
Table 8 Software Level Integrity
Table 9 SIGTRAN-1
Table 10 SIGTRAN-2
Table 11 SIGTRAN-3
Table 12 SIGTRAN-4
DISTRIBUTION LIST
NAME POSITION / DEPARTMENT
APPROVALS
APPROVED BY SIGNATURE DATE
XXXX
AIRCOM INTERNATIONAL
XXXX COUNTRY
1 INTRODUCTION
Aircom conducted a Technical Audit of the XXXX Network between XXXX and XXXX. The audit combined data
collection, discussions with XXXX technical teams, desk-based research, detailed interviews, and analysis of
documentation and information supplied by XXXX. This NSS audit report has been prepared based on the data
provided by the Core planning and O&M personnel of XXXX.
2 NETWORK OVERVIEW
City    Site              Node     MSC   HLR   MGW
CITY5                     XXMSC1   1
CITY4                     XXMSC1   1
CITY3                     XXMSC1   1
CITY6                     XXMSC1   1
CITY2                     XXMSC1   1
CITY1   Technical Villa   XXMSC3   1
CITY1   Technical Villa   XXMSC4   1
CITY1   Park plaza        XXMSC    1
CITY1   Park plaza        MSCS2    1
CITY1                     HLR1           1
CITY1                     HLR2           1
CITY1   Technical Villa   MGW11                1
CITY1   Technical Villa   MGW12                1
CITY2                     MGW21                1
CITY1   Park plaza        MGW31                1
Total                              9     2     4
Table 1: Core Hardware Location per City
CITY7 BGNRBSC 1 1
CITY8 GZNRBSC 1 1
CITY5 HEBSC1 1 1
CITY4 JABSC1 1 1
CITY9 JZNRBSC 1 1
CITY3 KDBSC1 1 1
CITY10 KHRBSC1 1 1
CITY6 KUBSC1 1 1
CITY2 MABSC1 1 2
CITY2 MABSC2 1
CITY11 NEBSC1 1 1
Total 16 16
3 CAPACITY ANALYSIS
This section contains the outcomes of capacity audits.
3.1 VLR Subscriber Register Capacity
NODE Name TOTNSUB REGISTERED VLR CAPACITY Available Capacity
4 PERFORMANCE INDICATORS
4.1 Introduction
This section defines switching system performance indicators for the MSC and MSC Server. The MSC is the call
control handling node in both layered and non-layered architectures. All counter descriptions in this section are
given for information only. The Application Information documents should be consulted for the latest and more
detailed counter descriptions.
4.2 Concepts
Performance indicators defined in this section focus on reliability and on how a service is executed in the
MSC/VLR Server.
The MSC/VLR Server is the call control handling node in the Ericsson Core Network. It contains counters that are
stepped, or not stepped, based on information received from other core network elements and nodes. Some
counters even reflect end-user and radio network behavior. See figure 2.
Figure 3-3: Benchmarking, System Improvements, Performance monitoring
Availability
Availability is defined as the ability of an item to be in a state to perform a required function at a given point of
time or at any instant of time within a given time interval, assuming that the external resources, if required, are
provided.
Serve-ability
The ability of a service to be obtained, within specified tolerances and other given conditions, when requested
by the user and to continue to be provided without excessive impairment for a requested duration. Serve-ability
performance is subdivided into service accessibility performance, service retain-ability performance and
service integrity performance.
Accessibility
The ability of a service to be obtained, within specified tolerances and other given conditions, when requested by
the end-user.
Retain-ability
Retain-ability reflects the ability of the user to keep a service once it was accessed under given conditions for a
requested period of time.
Integrity
Integrity reflects the ability of a user to receive requested service at desired quality. No Integrity PIs are defined
for the MSC.
4.3 Availability
4.3.1 System Downtime
The accumulated System Down Time (SDT) for the last 12 months, in seconds, shows no major downtime in the
network.
Link unavailability is caused by transmission fluctuation. XXXX should resolve this issue to improve the KPI.
4.4 Accessibility
4.4.1 Authentication
The average successful Authentication results for the complete XXXX network are shown in the figures below.
Figure 7 Authentication
Recommendations:
The Authentication Success rate indicates normal conditions across the network. The values are currently around
97%, which is on par with the world average according to the previously mentioned benchmark and above the
minimum recommended value of 95%.
Recommendations:
The Ciphering Success rate indicates normal conditions across the network. The values are currently around
99%, which is on par with the world average according to the previously mentioned benchmark and above the
minimum recommended value of 95%.
Recommendations:
The central processor load in all the nodes was considered normal, and the peak load in the busy hour did not
reach the maximum recommended limit (75%).
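The comparisons made in the recommendations above (success rates against a 95% minimum, busy-hour CP load against a 75% ceiling) can be expressed as a small check. The following is a minimal sketch: the class and function names are illustrative and do not come from any vendor tool, and only the limits themselves are taken from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiLimit:
    """A recommended limit for one KPI, expressed as a percentage."""
    name: str
    limit: float
    higher_is_better: bool


def kpi_ok(kpi: KpiLimit, observed: float) -> bool:
    """Return True when the observed percentage respects the recommended limit."""
    if kpi.higher_is_better:
        return observed >= kpi.limit   # success rates must stay at or above the minimum
    return observed <= kpi.limit       # load figures must stay at or below the ceiling


# Limits quoted in this section (names are hypothetical labels).
AUTHENTICATION = KpiLimit("Authentication success rate", 95.0, True)
CIPHERING = KpiLimit("Ciphering success rate", 95.0, True)
CP_LOAD = KpiLimit("CP busy-hour peak load", 75.0, False)

if __name__ == "__main__":
    # Approximate network-wide values reported above.
    print(AUTHENTICATION.name, kpi_ok(AUTHENTICATION, 97.0))
    print(CIPHERING.name, kpi_ok(CIPHERING, 99.0))
```

Keeping the direction of the comparison in the limit definition avoids mixing up "minimum" KPIs such as success rates with "maximum" KPIs such as processor load.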
Recommendations:
The Location Update Success rate indicates normal conditions in XXMSC3 and XXMSC; the values are
currently around 97%, which is on par with the world average according to the previously mentioned benchmark.
On the other hand, Location Update Success rates in the other MSCs show slightly lower values. There, the
performance measurements gathered over consecutive days show a significant drop that recurs on a daily basis.
There are many common reasons for Location Update failure: Unknown IMSI in HLR, timeout, MAP fallback,
network failure, congestion, and others. Further investigation is needed to determine the actual reasons.
The following location update signaling flows show how the above-mentioned counters are stepped:
All the MSCs in the XXXX Network showed a value of 100% for successful IN calls, so no recommendation is
needed on this KPI.
The results show normal behavior regarding channel assignment, and no additional recommendations are
needed.
Recommendations:
According to the above table, the SMS originating success rate is low for the complete period on all the
MSC-S. There are some known reasons for the SMS sending failure rate.
Recommendations:
From the above figures, we can see that the performance measurements are low before reaching the required level.
Most of the MSC-S show an average equal to the world and European averages. Some known causes for low SMS
receiving rates are:
Absent Subscriber: The receiving user is either powered off or out of the service area.
System Failure: Mostly related to the radio network and the MS, such as SDCCH assignment failure,
call drop when receiving SMS, etc.
The average results are above the recommended KPI minimum value, so no additional recommendation is needed.
Recommendations:
Dimensioning rules allow a link load of 30% in a non-failure situation and 60% in a failure situation.
It is very important that the load is kept within these limits, because once an SS7 link reaches a
certain load level the message success rate decreases dramatically.
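One common reading of the 30%/60% rule is that signaling traffic is shared across the links of a link set, so that when one of two links fails the surviving link carries both shares. The sketch below checks a link set against both limits. It assumes equal load sharing and equal redistribution after a failure, which is an assumption made for illustration and not something stated in the audit data; the function names are hypothetical.

```python
NORMAL_LIMIT = 0.30   # maximum per-link load with all links in service
FAILURE_LIMIT = 0.60  # maximum per-link load after a link failure


def load_after_failures(load_per_link: float, links: int, failed: int = 1) -> float:
    """Per-link load once `failed` links are lost, assuming the total traffic
    is redistributed equally over the remaining links."""
    remaining = links - failed
    if remaining <= 0:
        raise ValueError("no link left in service")
    return load_per_link * links / remaining


def within_dimensioning(load_per_link: float, links: int) -> bool:
    """True when the link set respects both the normal and the failure limit."""
    if load_per_link > NORMAL_LIMIT:
        return False
    return load_after_failures(load_per_link, links) <= FAILURE_LIMIT


if __name__ == "__main__":
    # Two links at 25% each: a single failure leaves one link at 50%.
    print(within_dimensioning(0.25, 2))
```

With two links, staying under 30% in normal operation automatically keeps the surviving link under 60%; with larger link sets the failure limit is reached even later.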
EOS codes were observed in XXMSC1, XXMSC3, XXMSC, XXMSC, XXMSC1 and XXMSC1. The errors are caused by
improper CIC assignment, including cross connections of E1s, as a result of which subscribers received
wrong (ambiguous) calls and cross talk. To rectify the issue, it is recommended to check all
interconnect routes individually with the TCTDI command to make sure all CICs are integrated properly.
Remove unnecessary configuration to have a clean alarm list. Blocked devices on routes are responsible for low
ASR, route congestion and call rejection. See attached file for more detail.
4.4.13 Paging
Figure 20 Paging
The XXMSC3, XXMSC, XXMSC and XXMSC1 paging results show normal behavior, in accordance with the
global values.
On the other hand, for the XXMSC1, XXMSC1 and XXMSC1 located outside CITY1, the values could be improved
somewhat by improving radio coverage; for example, an attached mobile that is out of coverage cannot receive or
respond to a page.
Check the parameter settings of the network; tuning them can often improve paging performance, especially if
coverage is not the main problem.
The key parameters are the time between periodic registrations, the Implicit IMSI detach function, the number of
LAs and the size of the LAs. TMSI should be used at least for the first page.
Recommendations: (Succ_GSM_Paging)
The XXMSC1, XXMSC1, XXMSC1 and XXMSC1 MSCs show slightly lower average results than the global
benchmark (around 88%) mentioned above.
As for the other MSCs, there seem to be problems, as the number of repeated page attempts to a location area
over the A-interface is high. The following causes might explain the low paging success rate:
LA dimensioning should be carried out in order to have a proper number of LAs per MSC. If the LAs are
under-dimensioned, the paging success rate is affected; if they are over-dimensioned, the LU load increases
and the LU success rate is affected.
A low paging success rate could be explained by coverage problems, by the Implicit IMSI detach function not
being used, or by T3212 being set too high.
Paging performance depends mainly on radio performance, especially radio coverage, radio capacity, and cell
and frequency planning that reduces interference as much as possible.
Figure 1: Paging of a MS
Other strategies than those recommended affect the paging load as follows:
No second page: No second page reduces the paging load in both the BTS and the BSC. The
disadvantage is risk of more unsuccessful MS paging.
Global second page: Compared to a local second page, a global second page increases the
paging load. The advantage is that MSs that, for some reason, have the wrong LA status in the
VLR stand a better chance of being successfully paged.
TMSI for second pages: If the second page is global, IMSI must be used to identify the MS. If the
second page is local, either IMSI or TMSI can be used to identify the MS. Using TMSI increases
the paging capacity in the BTS. The drawback is that some pages may be unsuccessful if an MS
has the wrong TMSI in the VLR, for example, immediately after having crossed an LA border.
The major failures in the ORG-Setup are due to subscriber missed calls, early disconnects and wrong
dialing.
In the XXMSC1 area the wrong dialing ratio is high. Call testing is required to identify the missing routes.
In this audit it is observed that in the areas where the MT-SUCC% is low, the major cause of degradation is a low
paging success rate. The relationship between MT-SUCC% and MT-Subscriber unreachable is also presented to give a
picture of the impact of radio coverage on MT calls.
4.5 Retain-ability
4.5.1 Inter MSC Handover /Intra-MSC Handover
This performance indicator reflects the successful incoming and outgoing inter-MSC handover attempts, including
subsequent handovers. Events are counted for each neighboring MSC.
In many directions the inter-MSS handover (in and out) success rates are low. The external LAC
definitions need to be verified with the help of the radio team. In a few cases the intra-MSS handover success
rate is also low. This should be checked by the BSS team, because the MSS does not play any role in the
intra-MSS handover procedure.
Recommendations: The network LAC diagram should be maintained with the help of the radio team. The core
network personnel should define the external or adjacent LACs according to the radio geographical boundaries
designed by the Radio department.
5 FINDING
5.1 Roaming
The ROAMWARE version that XXXX is using is only capable of retaining users, i.e., it only holds users who
are already on the XXXX network or who have registered for the first time due to better radio coverage. This
does not help to attract new incoming roamer registrations to the XXXX network.
In order to capture the maximum number of new incoming roamers with priority for XXXX, a newer version of
ROAMWARE that includes the capturing feature should be used. (See attached file for more detail.)
Time synchronization in particular ensures that all nodes share the same time reference, which is important for
charging and O&M functions. For example, it may be crucial to know exactly when (in terms of
day/hour/minute/second/millisecond) a certain event occurred, so that events from different nodes can be
correlated. Event correlation is of fundamental importance not only for troubleshooting and charging but also
for services such as the XXXX Revenue Assurance Solution.
Time synchronization is achieved through time servers, which provide Time-of-Day (ToD) information and deliver
it over an IP network to the clients, i.e., the network nodes, by means of the Network Time Protocol (NTP) or its
simplified version, the Simple Network Time Protocol (SNTP). (More details are available in the attached file.)
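To make the NTP/SNTP mechanism more concrete, the sketch below shows the standard client-side arithmetic: converting an NTP timestamp (seconds since 1900 plus a 32-bit fraction) to Unix time, and deriving the clock offset and round-trip delay from the four timestamps of one request/response exchange. It is a minimal illustration that does not contact any server; the function names are illustrative, and the sample timestamps in the usage lines are made up.

```python
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_EPOCH_DELTA = 2_208_988_800


def ntp_to_unix(seconds: int, fraction: int) -> float:
    """Convert a 64-bit NTP timestamp (32-bit seconds, 32-bit fraction) to Unix time."""
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


def clock_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Estimated offset of the client clock relative to the server.

    t1: client transmit time, t2: server receive time,
    t3: server transmit time, t4: client receive time.
    """
    return ((t2 - t1) + (t3 - t4)) / 2


def round_trip_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Network round-trip delay, excluding the server's processing time."""
    return (t4 - t1) - (t3 - t2)


if __name__ == "__main__":
    # Hypothetical exchange: the client clock is 5 s behind the server.
    print(clock_offset(100.0, 105.25, 105.5, 100.75))
    print(round_trip_delay(100.0, 105.25, 105.5, 100.75))
```

A node whose computed offset stays large is a candidate for the event-correlation problems described above, since its timestamps cannot be compared directly with those of other nodes.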
CONNECTED
KABSC5   0ETM2,MS-0   1ETM2,MS-0   EX,SB   NOT CONNECTED
NEBSC1   0ETM2,MS-0   0ETM2,MS-1   EX,UPD   NOT CONNECTED
KHRBSC1   0ETM2,MS-0   0ETM2,MS-1   EX,SB   NOT CONNECTED
KUBSC1   0ETM2,MS-0   0ETM2,MS-1   EX,ABL   NOT CONNECTED
MABSC1   1ETM2,MS-0   3ETM2,MS-0   EX,SB   NOT CONNECTED
MABSC2   0ETM2,MS-0   4ETM2,MS-0   EX,SB   NOT CONNECTED
KDBSC1   1ETM2,MS-0   1ETM2,MS-1   EX,ABL   NOT CONNECTED
HLR1   0E1551,MS-0   0E1551,MS-1   SB,EX   NOT CONNECTED
HLR2   0E1551,MS-0   0E1551,MS-1   SB,EX   NOT CONNECTED
XXMSC3   NOT CONNECTED   NOT CONNECTED   NOT CONNECTED
XXMSC4   NOT CONNECTED   NOT CONNECTED   NOT CONNECTED
XXMSC   NOT CONNECTED   NOT CONNECTED   NOT CONNECTED
XXMSC   NOT CONNECTED   NOT CONNECTED   NOT CONNECTED
XXMSC1   NOT CONNECTED   NOT CONNECTED   NOT CONNECTED
XXMSC1   1E1551,MS-0   1E1551,MS-1   RCM-0   MBL,MBL,EX   NOT CONNECTED
XXMSC1   0E1551,MS-0   0E1551,MS-1   RCM-0   MBL,MBL,EX   NOT CONNECTED
XXMSC1   1E1551,MS-0   1E1551,MS-1   RCM-0   EX,SB,SB   NOT CONNECTED
XXMSC1   0E1551,MS-0   0E1551,MS-1   RCM-0   MBL,MBL,EX   NOT CONNECTED
Table 4: Clock Reference in XXXX Network
Recommendation: Define the proper selection type (ST value) on both sides of each trunk route.
Observation & Recommendation: Analysis of the alternate routing cases in XXMSC1, XXMSC2 and XXMSC3 shows
that some branching is not defined properly for overflow traffic.
EOS codes were observed in XXMSC1, XXMSC1, XXMSC1 and XXMSC1. The errors are caused by
improper CIC assignment, including cross connections of E1s, as a result of which subscribers received wrong
(ambiguous) calls and cross talk. To rectify the issue, it is recommended to check all interconnect routes
individually with the TCTDI command to make sure all CICs are integrated properly.
Recommendation: Set the BTDM and T3212 values appropriately for implicit detach marking of mobile subscribers.
Check radio coverage and link fluctuation.
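The relation between the two timers can be sketched as a simple consistency check. T3212 is broadcast in units of six minutes (deci-hours), with 0 meaning that periodic location updating is not used. The sketch assumes that BTDM is configured in minutes and that it should be no shorter than the T3212 period, so that a reachable subscriber is not marked as detached before its next periodic update; both points are assumptions to be verified against the exchange parameter documentation, and the function names are illustrative.

```python
T3212_UNIT_MINUTES = 6  # one T3212 step is a deci-hour, i.e. six minutes


def t3212_minutes(t3212: int) -> int:
    """Periodic location update interval in minutes for a broadcast T3212 value."""
    if not 0 <= t3212 <= 255:
        raise ValueError("T3212 is an 8-bit value (0..255)")
    return t3212 * T3212_UNIT_MINUTES


def implicit_detach_consistent(t3212: int, btdm_minutes: int) -> bool:
    """True when the implicit detach timer cannot expire before the periodic update.

    Assumption: btdm_minutes is the supervision time after which the MSC marks
    a silent subscriber as detached.
    """
    if t3212 == 0:
        # No periodic updates are sent, so any supervision timer would
        # eventually mark reachable subscribers as detached.
        return False
    return btdm_minutes >= t3212_minutes(t3212)


if __name__ == "__main__":
    # Hypothetical settings: T3212 = 40 (4 hours) with BTDM = 240 minutes.
    print(implicit_detach_consistent(40, 240))
```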
With the recommended setting mentioned below users will observe improved voice call quality with no delay.
In the analysis of the B Number Table of all MSCs, all parameters were found to be correctly defined, with the
exception of XXMSC3, where there should be no Charging Case on the Announcement Route:
ANBSI:B=99-8,RC=94,L=4;
ANBSI:B=99-9,RC=95,L=4;
The analysis found highly congested announcement routes and blocked devices in XXMSC1, XXMSC1 and
XXMSC1.
In order to reduce congestion, all blocked devices should be fixed and more HW should be added. This will
increase the QoS for the subscriber.
Recommendation: Many devices on trunk routes are blocked due to lack of O&M. Preventive maintenance
and proper integration are highly recommended. Blocked devices on routes are responsible for low ASR, route
congestion and call rejection.
Recommendation: The RPs highlighted in red show high error counts and therefore need to be replaced with higher
versions. A CSR to Ericsson should be raised for this as a priority.
Recommendation: The RPs highlighted in red show high error counts and therefore need to be replaced with higher
versions. A CSR to Ericsson should be raised for this as a priority.
This section presents the comparison of the cells defined on the MSC and on the BSS. The main objective of this
exercise was to identify the extra cells defined on the MSC, to remove the junk data in order to free space in
the cell tables, and to organize the cell tables. The list below shows the cells identified as extra on the MSC
by comparison with the BSS data.
Notice:
Please do not delete any cell from the MSC side prior to final confirmation from the BSS team. The BSS team
should double-check the traffic on these cells. The cell ID deletion should take place with the cooperation of
the BSS and NSS teams.
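The MSC-versus-BSS comparison described above amounts to a set difference between two cell lists. A minimal sketch follows; the function name is illustrative and the cell identifiers in the usage lines are invented, since the audit's actual export format is not shown here. The output is only a candidate list and remains subject to the BSS confirmation required by the notice.

```python
from typing import Iterable, List


def extra_cells_on_msc(msc_cells: Iterable[str], bss_cells: Iterable[str]) -> List[str]:
    """Return cell IDs defined on the MSC that have no counterpart in the BSS data.

    IDs are normalized (whitespace stripped, upper-cased) before comparison so
    that formatting differences between the two exports do not create false
    mismatches.
    """
    msc = {cell.strip().upper() for cell in msc_cells}
    bss = {cell.strip().upper() for cell in bss_cells}
    return sorted(msc - bss)


if __name__ == "__main__":
    # Hypothetical cell IDs for illustration only.
    msc_export = ["CELL1001", "CELL1002", " cell1003 "]
    bss_export = ["CELL1001", "CELL1003"]
    print(extra_cells_on_msc(msc_export, bss_export))
```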
The system log defined in all MSCs is of fixed size, which eventually results in loss of data once it reaches
its maximum limit, because new data overwrites the previous data. It is therefore recommended to define a
transfer queue for direct data transfer to the OSS in order to avoid data loss.
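The overwrite behavior of a fixed-size log, and the effect of forwarding entries before they are overwritten, can be illustrated with a toy model. This is only a sketch of the principle using a bounded queue; it does not represent how the MSC or the OSS transfer queue is actually implemented, and the function names are illustrative.

```python
from collections import deque
from typing import Iterable, List, Tuple


def write_fixed_log(events: Iterable[int], capacity: int) -> List[int]:
    """Write events to a fixed-size log; the oldest entries are silently overwritten."""
    log = deque(maxlen=capacity)
    for event in events:
        log.append(event)
    return list(log)


def write_with_transfer(events: Iterable[int], capacity: int) -> Tuple[List[int], List[int]]:
    """Same log, but the oldest entry is forwarded before it would be overwritten."""
    log = deque(maxlen=capacity)
    transferred = []
    for event in events:
        if len(log) == capacity:
            transferred.append(log[0])  # forward the entry about to be dropped
        log.append(event)
    return list(log), transferred


if __name__ == "__main__":
    print(write_fixed_log(range(5), 3))       # events 0 and 1 are lost
    print(write_with_transfer(range(5), 3))   # events 0 and 1 are forwarded instead
```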
The signaling error reports from the nodes were analyzed, and it was concluded that the data coming from the
nodes is missing some necessary information that helps in identifying and rectifying the problems that occur.
The missing-information issue has been resolved to allow accurate fault fixing in future. (See attached file.)
Analysis of alarms on the APG shows that on some nodes the APG drive is almost full. Once it is
completely full, the APG will go down and no statistical data will be delivered, so no performance reports
can be generated for the management of the network. It is therefore recommended to maintain the APG drive
properly.
A lot of unused route data is defined in the BSCs as well as in the MSCs. This results in high CP load and
increased call setup time. To avoid this situation, this data should be removed and proper size alteration
should be done for enhanced CP performance.
After investigating the alarms (software fault) on the nodes, it is concluded that the software running on all
the MSCs is defective. In order to avoid events such as system restart (i.e., an outage in the network), a CSR
should be raised immediately to fix the issue.
6 SIGTRAN
but the receiver is not able to identify the association to which the packet belongs.
Recommendation: As shown in the table for XXMSC3 and XXMSC1, M3UA interruptions were recorded from
110311 to 130311. Check the cause of the interruptions on the MPBN side.
Recommendation: Check the Event Record properly; there is a timeout somewhere in the network.
7 M-MGW KPI
7.1 Scope
This study covers the request of XXXX for a list of the KPIs needed on the M-MGW. It can be used for:
7.2 Introduction
XXXX has M-MGW R5 on an ATM backbone, and the KPIs suggested in this study relate to the ATM network and
M-MGW R5.
In addition to the KPIs mentioned above, it is also important to know the traffic/load.
Figure: call flow between the MSC and the M-MGw with the KPI category measured at each stage: AddReq received
(internal accessibility), bearer establishment (external accessibility), through connection with QoS related
counters stepped (integrity, BER/BLER), and NotifyReq or SubReq release (retainability, as the ratio of mature
released connections to all connections).
Accessibility
Retainability
Retainability should be a single KPI that covers the following measurements:
Measurement starts after the external bearer is up, i.e. where external accessibility ends.
Considers failures of internal resources, e.g. MSB or ET in the MGw, that lead to the call being disconnected
abnormally.
External retainability
GCP commands that are replied to with an error code due to external failure.
The external part can be left at lower priority, as it can be assumed to be covered by other nodes contributing
to the network retainability.
Integrity
The integrity is the ability of an external connection to maintain requested service at desired quality.
Traffic load
This category provides information about the current status of a node, mainly from a resource usage point of view.
The following KPIs should be considered to check the traffic during special events (high traffic) or after a
network change.
The internal accessibility is the ability to obtain requested service from the system between the reception of a
GCP Add message and the sending of a GCP AddReply message.
This KPI can be used, for example, for monitoring the utilization and congestion rate of resources.
MGW Accessibility
Check if the event 80 % Capacity Limit Met for Media Stream Channels or the event 100 % Capacity
Limit Met for Media Stream Channels is issued.
Check software capacity licenses.
Analyze the following PIs to see if the problem concerns ATM, IP or TDM traffic, AAL2 Termination
Seizure Success Rate, IP Termination Seizure Success Rate and TDM Termination Reservation Success
Rate.
Identify and redimension (if possible) the congested resources in the node.
Check the status of related resources and devices.
Check the counter MgwApplication.pmNrOfRejsByStaticAdmCtrl.
The major KPI to monitor is Incoming AAL2 Connection Reservation Success Rate:
The Incoming AAL2 Connection Reservation Success Rate measurement is used for calculating the incoming AAL2
connection reservation success rate initiated by the adjacent node. This measurement is made for AAL2 Access
Point (Aal2Ap).
MGW11: Nb Init Fault = 0, Nb Init = 4486564122
MGW21: Nb Init Fault = 0, Nb Init = 75132256
MGW31: Nb Init Fault = 0, Nb Init = 5027045203
MGW11: Call Rejection = 0
MGW21: Call Rejection = 0
MGW31: Call Rejection = 0
7.13 Retainability
It shall be possible to measure retainability at M-MGw node level. In addition, it shall be ensured that external
faults and problems, independent of the M-MGw, are excluded from the M-MGw retainability result.
The external part can be left at lower priority, as it can be assumed to be covered by other nodes
contributing to the network retainability.
Note: the core network level retainability shall be measured in the MSC server.
The Service Retainability measurement shows the M-MGw's ability to retain services, once obtained, for the
desired duration. The measurement is made for the physical M-MGw.
Retainability
pmNrOfGcpNotifyCsdFaultAEst
The total number of Circuit Switched Data (CSD) termination faults encountered after bearer establishment
(between establishment of the bearer and reception of the Gateway Control Protocol (GCP) Sub) that result in the
sending of a GCP Notify message towards the MGC.
Condition: The counter is incremented when a notify message is sent for CSD calls (both internal and external
reasons counted) between establishment of bearer and GCP Sub (tear down of connection).
pmNrOfGcpNotifySpeechFaultAEst
The total number of speech termination faults encountered after bearer establishment (between
establishment of the bearer and reception of the Gateway Control Protocol (GCP) Sub) that result in the sending
of a GCP Notify message towards the Media Gateway Controller (MGC).
Condition: The counter is incremented when a notify message is sent for speech calls (both internal and external
reasons counted) between establishment of bearer and GCP Sub (tear down of connection).
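The report does not give the retainability formula itself. As an illustration only, the two fault counters above could be combined into a ratio as sketched below. The denominator, called established_terminations here, is a hypothetical input standing for the number of successfully established bearers in the same period; the actual counter and the exact formula should be taken from the vendor's PI documentation.

```python
from typing import Optional


def service_retainability(
    csd_faults_after_est: int,
    speech_faults_after_est: int,
    established_terminations: int,
) -> Optional[float]:
    """Share of established connections that were not released due to a fault.

    csd_faults_after_est:     value of pmNrOfGcpNotifyCsdFaultAEst
    speech_faults_after_est:  value of pmNrOfGcpNotifySpeechFaultAEst
    established_terminations: hypothetical count of established bearers

    Returns None when there was no traffic in the period, so that an idle
    period is not reported as either 0% or 100%.
    """
    if established_terminations <= 0:
        return None
    faults = csd_faults_after_est + speech_faults_after_est
    return 1.0 - faults / established_terminations


if __name__ == "__main__":
    # Hypothetical counter values for one measurement period.
    print(service_retainability(2, 3, 1000))
```

Because both counters include faults with external causes, a ratio computed this way mixes M-MGw and external problems; separating them would need additional cause information.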
7.14 Integrity
The integrity is the ability of an external connection to maintain requested service at desired quality.
It shall be possible to measure integrity on a M-MGw node level. Even though it might be difficult to get an
objective view on what level of integrity (=quality of service) is still normal and acceptable M-MGw shall have
indicators for data handling quality.
PI Integrity Healthy
ATM Transport QoS, Jitter 99,9%
Traffic over ATM, except broadband signalling, is left out since quality-related measurements on ATM would
cause considerably high load on the node.
For the same reason, all current ATM quality supervision measurements have to be set ON separately, and
their number is limited. Besides, ATM is considered very reliable and robust, so monitoring it would not be
meaningful (except when building up the network or debugging specific problems).
The SS7 over ATM QoS measurement is used for calculating the SS7 broadband signalling quality (over ATM). It
shows the ratio of successfully handled signalling packets. The measurement is made for physical M-MGw.
Formulas
The SS7 over TDM QoS measurement is used for calculating the incoming and outgoing SS7 narrowband
signalling quality (over TDM). It shows the ratio of successfully handled signalling packets. The measurement is
made for physical M-MGw.
The Signaling over IP QoS, IP Packet Discard Ratio measurements are used for calculating the IP Packet Discard
Ratio (IPDR) of connections in an IP interface, defined for signaling over IP traffic, on an ET-MFG board. The
measurement is made for IpInterface.
0 0 0
0 0 0
The Signaling over IP QoS, IP Packet Error Ratio (Host) measurements are used for calculating the received IP
Packet Error Ratio (IPER) in an IP host in the M-MGw, for signaling over IP related traffic. The measurement is
made for IpAccessHostGpb.
0 0 0
The AAL2 Bearer Establishment Success Rate measurement is used to monitor the AAL2 bearer establishment
success rate. The measurement is made per VMGw.
Very slight rejection in MGW11. Recommended actions when falling below the healthy value range:
Identify and redimension (if possible) the congested resources in the local node.
7.14.6 SCTP
Number of SCTP packets received from the peers, with an invalid checksum
MGW11 MGW21 MGW31
0 0 0
0 0 0
7.14.8 M3UA
Here congestion is not calculated by formula but on an average basis. The very slight congestion in MGW11 can
be ignored, as it occurs in peak hours only, but the recommendation is to increase the number of associations.
Disturbance in the IP backbone was observed quite often. The times when disturbance was seen in MGW11 and
MGW22 are listed below.
'20110317054500 0 0 '20110317054500 0 0
'20110317060000 0 0 '20110317060000 0 0
'20110317061500 0 0 '20110317061500 0 0
'20110317063000 0 0 '20110317063000 0 0
'20110317064500 0 0 '20110317064500 0 0
'20110317070000 0 0 '20110317070000 0 0
'20110317071500 0 0 '20110317071500 0 0
'20110317073000 0 0 '20110317073000 0 0
'20110317074500 0 0 '20110317074500 0 0
'20110317080000 0 0 '20110317080000 0 0
'20110317081500 0 0 '20110317081500 0 0
'20110317083000 0 0 '20110317083000 0 0
'20110317084500 0 0 '20110317084500 0 0
'20110317090000 0 0 '20110317090000 0 0
'20110317091500 0 0 '20110317091500 0 0
'20110317093000 0 0 '20110317093000 0 0
'20110317094500 0 0 '20110317094500 0 0
'20110317100000 0 0 '20110317100000 0 0
'20110317101500 0 0 '20110317101500 0 0
'20110317103000 0 0 '20110317103000 0 0
'20110317104500 0 0 '20110317104500 0 0
'20110317110000 0 0 '20110317110000 0 0
'20110317111500 0 0 '20110317111500 0 0
'20110317113000 0 0 '20110317113000 0 0
'20110317114500 0 0 '20110317114500 0 0
'20110317120000 0 0 '20110317120000 0 0
'20110317121500 0 0 '20110317121500 0 0
'20110317123000 0 0 '20110317123000 0 0
'20110317124500 0 0 '20110317124500 0 0
'20110317130000 0 0 '20110317130000 0 0
'20110317131500 0 0 '20110317131500 0 0
'20110317133000 0 0 '20110317133000 0 0
'20110317134500 0 0 '20110317134500 0 0
'20110317140000 0 0 '20110317140000 0 0
'20110317141500 0 0 '20110317141500 0 0
'20110317143000 0 0 '20110317143000 0 0
'20110317144500 0 0 '20110317144500 0 0
'20110317150000 0 0 '20110317150000 0 0
'20110317151500 0 0 '20110317151500 0 0
'20110317153000 0 0 '20110317153000 0 0
'20110317154500 0 0 '20110317154500 0 0
'20110317160000 0 0 '20110317160000 0 0
'20110317161500 0 0 '20110317161500 0 0
'20110317163000 0 0 '20110317163000 0 0
'20110317164500 0 0 '20110317164500 0 0
'20110317170000 0 0 '20110317170000 0 0
'20110317171500 0 0 '20110317171500 0 0
'20110317173000 0 0 '20110317173000 0 0
'20110317174500 0 0 '20110317174500 0 0
'20110317180000 0 0 '20110317180000 0 0
'20110317181500 0 0 '20110317181500 0 0
'20110317183000 0 0 '20110317183000 0 0
'20110317184500 0 0 '20110317184500 0 0
'20110317190000 0 0 '20110317190000 0 0
'20110317191500 0 0 '20110317191500 0 0
'20110317193000 0 0 '20110317193000 0 0
'20110317194500 0 0 '20110317194500 0 0
'20110317200000 0 0 '20110317200000 0 0
'20110317201500 0 0 '20110317201500 0 0
'20110317203000 0 0 '20110317203000 0 0
'20110317204500 0 0 '20110317204500 0 0
'20110317210000 0 0 '20110317210000 0 0
'20110317211500 0 0 '20110317211500 0 0
'20110317213000 0 0 '20110317213000 0 0
'20110317214500 0 0 '20110317214500 0 0
'20110317220000 0 0 '20110317220000 0 0
'20110317221500 0 0 '20110317221500 0 0
'20110317223000 0 0 '20110317223000 0 0
'20110317224500 0 0 '20110317224500 0 0
'20110317230000 0 0 '20110317230000 0 0
'20110317231500 0 0 '20110317231500 0 0
'20110317233000 0 0 '20110317233000 0 0
'20110317234500 0 0 '20110317234500 0 0
'20110318000000 0 0 '20110318000000 0 0
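A sample like the one above can be scanned for non-zero intervals instead of being read row by row. The following is a minimal sketch, assuming each row holds one (timestamp, counter, counter) group per node and that timestamps use the YYYYMMDDhhmmss form with a leading apostrophe, as in the table. The function name and the meaning of the two counters are illustrative assumptions, not part of any vendor tool.

```python
from datetime import datetime

def nonzero_intervals(rows):
    """Return (node_index, timestamp, counters) for every sample with a non-zero counter."""
    hits = []
    for row in rows:
        fields = row.split()
        # Assumption: each node contributes three fields: 'timestamp, counter, counter.
        for node_index in range(len(fields) // 3):
            stamp, first, second = fields[3 * node_index: 3 * node_index + 3]
            counters = (int(first), int(second))
            if any(counters):
                when = datetime.strptime(stamp.lstrip("'"), "%Y%m%d%H%M%S")
                hits.append((node_index, when, counters))
    return hits

# Two rows copied from the table above; both contain only zero counters.
sample = [
    "'20110317054500 0 0 '20110317054500 0 0",
    "'20110317060000 0 0 '20110317060000 0 0",
]
print(nonzero_intervals(sample))  # prints []
```

An empty result means that no disturbance was recorded in the listed intervals.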
This category provides information about the current status of a node, mainly from a resource usage point of view.
We suggest monitoring the following KPIs for traffic and load:
Q.2630
GCP
SCCP
SCCP Policing 0
SCCP Relay NA
MTP3/MTP3b/M3UA
MTP2
SCTP
AAL2
ATM
Usage Rate of Received and Transmitted ATM Cells on an ATM Port 0-80%
IP
Ethernet Interface)
TDM
The Usage Rate of Received and Transmitted ATM Cells on VC Link measurements are used to calculate the
usage rate (as %) on a VC link during the measurement period. The measurement is made per Virtual Channel. It
is recommended that this measurement is only applied to a preselected group of VCs. If this measurement is
performed for all VPs and VCs, the amount of generated statistical data will be huge.
56 % 99.62% 64.32%
The TDM Termination Reservation Success Rate measurement is used to calculate the rate of successful
reservation of TDM terminations within a TDM group. The availability of underlying resources is also taken into
consideration in the measurement. The measurement is made per TDM termination group.
Primarily, the status of TDM termination groups should be monitored in the MSC server. In case these
measurements indicate problems with TDM traffic, the TDM measurements in M-MGw provide detailed
information.
Abnormal rejection was found on this E1. The cause may be blocking or a CIC mismatch; close monitoring is required.
MGW21 COUNTER
PcmNr425231_MOD3-25-2-31 256505
PcmNr425231_MOD3-25-2-31 256505
The Media Stream Resource Reservation Rate measurement is used for calculating the current connection
reservation rate of devices in this device pool and to show the traffic profile at the end of the measurement
period. The measurement is made for MsDevicePool.
The GCP Message Statistics measurements are used to show the amount of received and sent GCP messages as
well as the consistency of the GCP link. The measurement is made per Vmgw.
Healthy value range: the relative difference (NrOfSentMessages / ReceivedMessages) does not change
significantly between different measurements.
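The healthy value criterion for GCP messages can be checked mechanically across measurement periods. The sketch below is only illustrative: the function name, the input layout (one sent/received pair per period) and the 5% tolerance are assumptions, since the report does not quantify "significantly".

```python
def gcp_ratio_drift(samples, tolerance=0.05):
    """Flag periods whose sent/received ratio moved more than `tolerance`
    (relative) against the previous usable period.

    samples: list of (sent_messages, received_messages), one per period.
    Returns a list of (period_index, ratio) for the flagged periods.
    """
    flagged = []
    previous = None
    for index, (sent, received) in enumerate(samples):
        if received == 0:
            # Ratio undefined for this period; restart the comparison.
            previous = None
            continue
        ratio = sent / received
        # `previous` is skipped when it is None or zero (nothing to compare against).
        if previous and abs(ratio - previous) / previous > tolerance:
            flagged.append((index, ratio))
        previous = ratio
    return flagged

# Hypothetical counts: the third period deviates clearly from the second.
print(gcp_ratio_drift([(1000, 1000), (1010, 1000), (1500, 1000)]))  # prints [(2, 1.5)]
```

A stable ratio yields an empty list; a flagged period is a prompt to inspect the GCP link, not proof of a fault.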
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc2 950744.60 936077.79
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc0 950881.59 936071.61
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc2 950806.33 936138.88
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc0 950943.35 936132.47
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc2 950870.25 936202.23
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc1 950939.71 936072.26
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc0 951007.46 936195.77
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc2 950934.80 936265.73
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc1 951003.15 936135.81
Mtp3bSpItu=3-4121,Mtp3bSls=3-4200_27,Mtp3bSlItu=3-4200_slc0 951071.71 936259.06
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc3 678397.49 673986.69
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc1 678426.09 674006.83
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc3 678444.72 674033.70
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc1 678473.05 674052.26
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc3 678490.42 674079.21
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc1 678519.10 674097.53
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc3 678536.12 674124.59
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc1 678564.51 673980.82
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc3 678581.20 674142.05
Mtp3bSpItu=3-4120,Mtp3bSls=3-4204-1,Mtp3bSlItu=3-4204_slc1 678609.56 674169.19
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301030.09 403715.10
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301039.49 403735.61
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301052.57 403763.61
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301060.23 403781.02
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301068.13 403798.68
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301075.64 403815.37
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301083.12 403831.99
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301091.04 403849.47
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301099.20 403867.80
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 301106.78 403885.00
This sample covers the top 10 sending and receiving links. It shows that only one or two links are being used. If
these are the only links available, further links should be added; if more links are available, load sharing should
be enabled in both the local and the remote node.
Load sharing should be implemented on the local side and on the remote side. It was observed that the other SLCs
do not carry any traffic at all.
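The load-sharing check described above can be sketched as follows. This is a minimal illustration, assuming per-link message counts are available as a mapping keyed by link set and SLC; the function name, the input layout and the sample counts are hypothetical and are not taken from the audit data.

```python
from collections import defaultdict

def idle_links_by_link_set(link_counts):
    """Return {link_set: [idle SLCs]} for link sets that carry traffic
    on some links while other links in the same set carry none.

    link_counts: dict mapping (link_set, slc) -> number of messages.
    """
    by_set = defaultdict(dict)
    for (link_set, slc), count in link_counts.items():
        by_set[link_set][slc] = count

    report = {}
    for link_set, links in by_set.items():
        total = sum(links.values())
        idle = sorted(slc for slc, count in links.items() if count == 0)
        # Only report sets that are in use but not sharing load across all links.
        if total > 0 and idle:
            report[link_set] = idle
    return report

# Hypothetical counts: one link set uses a single SLC, the other shares load.
counts = {
    ("2-4400", "Slc0"): 300000,
    ("2-4400", "Slc1"): 0,
    ("3-4204-1", "slc1"): 150000,
    ("3-4204-1", "slc3"): 149000,
}
print(idle_links_by_link_set(counts))  # prints {'2-4400': ['Slc1']}
```

A link set that appears in the result is a candidate for enabling load sharing; a set with only one configured link would instead need additional links.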
Mtp3bSpItu=2-4121,Mtp3bSls=2-4105_30,Mtp3bSlItu=2-4105_slc1 1768
Mtp3bSpItu=2-4121,Mtp3bSls=2-4105_30,Mtp3bSlItu=2-4105_slc1 1768
Mtp3bSpItu=2-4121,Mtp3bSls=2-4105_30,Mtp3bSlItu=2-4105_slc1 1768
Mtp3bSpItu=2-4121,Mtp3bSls=2-4105_30,Mtp3bSlItu=2-4105_slc1 1768
Link congestion recorded in MGW21
Mtp3bSpItu=3-4122,Mtp3bSls=3-4202,Mtp3bSlItu=Slc0 391
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 391
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 391
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 391
Mtp3bSpItu=2-4122,Mtp3bSls=2-4400,Mtp3bSlItu=Slc0 391
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
Mtp3bSpItu=2-4120,Mtp3bSls=2-4122,Mtp3bSlItu=Slc0 141
The TDM Termination Group Utilization Rate measurement is used to calculate the current utilization rate of a
Time Division Multiplexing (TDM) termination group and the amount of TDM traffic in the node. In normal cases
(that is, no TDM misconfiguration), the MSC server is aware of the maximum TDM capacity of all connected M-
MGWs.
A low value means that resources are underutilized and should be shifted to other destinations where they are
needed, so that the node capacity is fully used. This KPI is based on an average; 100% means full utilization.
Utilization rate and Reservation Success are checked for all the Device Pool Resources:
MGW11
1545 Subrack=1,Slot=11,PlugInUnit=1 53
1415 Subrack=1,Slot=6,PlugInUnit=1 54
1430 Subrack=1,Slot=6,PlugInUnit=1 59
1400 Subrack=1,Slot=6,PlugInUnit=1 61
MGW21
1545 Subrack=1,Slot=10,PlugInUnit=1 23
1415 Subrack=1,Slot=10,PlugInUnit=1 22
1515 Subrack=1,Slot=10,PlugInUnit=1 22
1530 Subrack=1,Slot=10,PlugInUnit=1 22
MGW31
1400 Subrack=3,Slot=2,PlugInUnit=1 49
1415 Subrack=3,Slot=2,PlugInUnit=1 49
1400 Subrack=4,Slot=3,PlugInUnit=1 48
1400 Subrack=4,Slot=2,PlugInUnit=1 48
Note: The healthy value range above should be considered more as a recommendation. Depending on how the
network is dimensioned, the healthy value range may also exceed the upper value defined above, even when the
network is operating under normal conditions. The healthy value range may also temporarily exceed the upper
limit above, for example during peak hours.
Decreased traffic handling capacity of overloaded resources, which may eventually result in restarts.
The H.248 Load Control function may be activated, causing the MSC server to reroute traffic to other VMGws in
other nodes.
Event Overload in VMGw Pool is issued when there is an overload situation in a VMGw pool.
The Current Traffic Load measurement shows the current traffic level of an M-MGw. The result of the formula is an
estimate of the traffic level in Erlang. An exact Erlang calculation is not possible because the M-MGw operates on
connection level and is not able to distinguish between individual calls.
MGW11
Average per day = 29%
Maximum in one Hour = 69%
MGW21
Average per day = 24.60%
Maximum in one Hour = 58.39%
MGW31
Average per day = 19.53%
Maximum in one Hour = 52.52%
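The two figures reported per node (daily average and maximum in one hour) can be derived from periodic load samples. The following sketch assumes 15-minute samples expressed as percentages and takes "maximum in one hour" to mean the highest mean over a clock hour. Both are assumptions, as the report does not state how the figures were aggregated, and the sample values are hypothetical.

```python
def daily_average_and_peak_hour(quarter_hour_loads):
    """Return (daily average, highest hourly mean) from 15-minute load samples.

    quarter_hour_loads: load percentages, four per hour, in time order.
    """
    if not quarter_hour_loads or len(quarter_hour_loads) % 4 != 0:
        raise ValueError("expected a non-empty list with four samples per hour")

    # Mean of each group of four consecutive samples (one clock hour).
    hourly_means = [
        sum(quarter_hour_loads[i:i + 4]) / 4
        for i in range(0, len(quarter_hour_loads), 4)
    ]
    daily_average = sum(quarter_hour_loads) / len(quarter_hour_loads)
    return daily_average, max(hourly_means)

# Hypothetical day: 23 quiet hours at 10% and one busy hour at 50%.
loads = [10] * 92 + [50] * 4
average, peak = daily_average_and_peak_hour(loads)
print(round(average, 2), peak)
```

Reporting both values matters because a moderate daily average can hide a peak hour that approaches the dimensioning limit.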
The main contributors to the loading of the TRHs are: processing of the paging messages, processing of the
measurement reports from the MSs, signalling caused by call handling, processing of the location updates and
processing of the SMS messages. If the traffic intensity and/or level becomes too high, the TRHs could become
overloaded.
Recommendations:
Dimensioning rules allow a 30% load in a non-failure situation and a 60% load in a failure situation.
It is very important that the load limits are followed: when an SS7 link reaches a certain load level, the
message success rate decreases dramatically.
This load limit is a function of the message length (Location Updates being one of the worst cases).
Recommendation: Check transmission stability. It affects the whole network as well as the ASR.
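The 30% and 60% limits above are consistent with a pair of links that share load evenly: if one link fails, the surviving link carries twice its normal load. The sketch below illustrates that arithmetic and a simple limit check. The function names are hypothetical, and even load sharing across the link set is an assumption.

```python
NORMAL_LIMIT = 0.30   # allowed load in a non-failure situation (from the rule above)
FAILURE_LIMIT = 0.60  # allowed load in a failure situation (from the rule above)

def load_after_single_failure(load_per_link, link_count):
    """Load on each surviving link after one link fails, assuming the
    traffic is shared evenly over the remaining links."""
    if link_count < 2:
        raise ValueError("no surviving link to take over the traffic")
    return load_per_link * link_count / (link_count - 1)

def within_limits(load_per_link, link_count):
    """True when the link set respects both dimensioning limits."""
    failure_load = load_after_single_failure(load_per_link, link_count)
    return load_per_link <= NORMAL_LIMIT and failure_load <= FAILURE_LIMIT

print(load_after_single_failure(0.30, 2))  # prints 0.6
print(within_limits(0.35, 2))              # prints False
```

With more than two links in a set, a single failure raises the per-link load by less than a factor of two, so the non-failure limit becomes the binding one.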
This command is used to initiate a printout of the transcoder pool data for one, several or all defined transcoder
pools. In most cases RNOTRA should be equal to POOLACT; if not, the number of TRA demux devices should be
checked again.
<RRTPP:TRAPOOL=ALL;
1 768 708 60
2 744 744 0
END
This command is used to initiate a printout of transcoder pool idle level supervision data for one, several or all
defined transcoder pools. Each limit has its own alarm class. The given alarm class is assigned to limit 2. Limit 1
is assigned to the nearest alarm class below the alarm class of limit 2. If the alarm class A2 is given in the
command, limit 1 will be assigned to alarm class A3. This means that limit 2 is a more serious limit.
The limits are given as a percentage of the required number of transcoder resources in the transcoder pool.
<RRISP:TRAPOOL=ALL;
AMRHR OFF
EFR OFF
HR ON A2 2 1
FR ON A2 2 1
END
RRMSP:TRAPOOL=ALL;
This command is used to initiate a printout of transcoder pool mean hold time supervision data for one, several or
all transcoder pools.
The alarm level for a transcoder pool is a percentage of the transcoder pool mean hold time. If the mean hold
time for any of the transcoder resources in the transcoder pool falls below the current pool alarm level the RADIO
TRANSMISSION TRANSCODER POOL MEAN HOLD TIME SUPERVISION alarm is issued. We suggest that ALPERC
be 40 in most cases.
<RRMSP:TRAPOOL=ALL;
AMRHR OFF
EFR OFF
HR ON A1 3 20
FR ON A1 0 20
END
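The mean hold time supervision rule described above (the alarm level is a percentage of the pool mean hold time, and the alarm is raised when any resource falls below it) can be sketched as follows. The function name, the resource names and the hold times are hypothetical; only the rule itself and the suggested ALPERC value of 40 come from the text.

```python
def mean_hold_time_alarms(pool_mean_hold_time, resource_hold_times, alperc=40):
    """Return the resources whose mean hold time is below the alarm level.

    pool_mean_hold_time: mean hold time of the whole transcoder pool (seconds).
    resource_hold_times: dict mapping resource name -> mean hold time (seconds).
    alperc: alarm level as a percentage of the pool mean hold time.
    """
    alarm_level = pool_mean_hold_time * alperc / 100
    return [
        name
        for name, hold_time in resource_hold_times.items()
        if hold_time < alarm_level  # strictly below the level raises the alarm
    ]

# Hypothetical pool with a 60 s mean hold time: the alarm level is 24 s.
resources = {"TRA-0": 55, "TRA-1": 20, "TRA-2": 24}
print(mean_hold_time_alarms(60, resources))  # prints ['TRA-1']
```

A resource with an unusually short mean hold time is often one that drops calls early, which is why a value well below the pool mean is treated as a fault indicator.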
9.1 Documentation
During information gathering it was found that no centralized database is available for reference that comprises
all the details and information regarding the network. The available information was scattered and, most of the
time, incomplete. Standard documentation is recommended for large networks like XXXX; it will not only help
with tracking but also facilitate network planning, optimization, and operations & maintenance. In order to build
the documentation database, Aircom recommends that XXXX ask the vendor for documented details of the
following.
The documents available to the XXXX team include only an overview of the equipment and processes; these
high-level documents are not sufficient for understanding or maintaining the proper functioning of the network.
Therefore Aircom recommends that the vendor be asked to provide details of the following:
During the audit of the Core Network it was the responsibility of XXXX to provide Core Network performance KPI
reports, which were never provided. The KPIs should be available for monitoring the network; some of the most
important KPIs to be monitored on a daily basis include Busy Hour Call Attempts (BHCA) for either MSC. The BHCA
trend is one of the key measurements for MSC and MGW licensing. BHCA is also directly related to subscriber
calling behaviour, so it serves as a quick reference for the calling pattern of the network.
Aircom generated all the KPIs manually for the audit by extracting the raw data reports from the APG and
processing them according to the Ericsson standard formulas. This KPI generation process was very time
consuming.
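As an illustration of how a busy hour figure could be produced automatically rather than by hand, the sketch below finds the 60-minute window with the most call attempts in a series of 15-minute counts. It assumes quarter-hour counters and a sliding window aligned to those quarters; the function name and the sample counts are hypothetical, and this is not the Ericsson standard formula referred to above.

```python
def busy_hour(quarter_hour_attempts):
    """Return (start_index, attempts) of the four consecutive quarter-hours
    with the highest total number of call attempts."""
    if len(quarter_hour_attempts) < 4:
        raise ValueError("need at least four quarter-hour samples")

    window = sum(quarter_hour_attempts[:4])
    best_start, best_total = 0, window
    for start in range(1, len(quarter_hour_attempts) - 3):
        # Slide the window by one quarter: add the new sample, drop the oldest.
        window += quarter_hour_attempts[start + 3] - quarter_hour_attempts[start - 1]
        if window > best_total:
            best_start, best_total = start, window
    return best_start, best_total

# Hypothetical counts for seven consecutive quarter-hours.
print(busy_hour([10, 20, 30, 40, 50, 5, 5]))  # prints (1, 140)
```

The returned start index identifies the first quarter-hour of the busy hour, and the total is the BHCA value for that window under the stated assumptions.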
Lack of training:
In interviews and discussions it became clear that the daily activities and procedures followed by the XXXX team
are not efficient or productive. For example, no periodic Busy Hour report is generated automatically by a
performance tool; the XXXX team calculates BH performance manually, which is inaccurate and highly error-prone.
In addition, after studying the procedures followed by the XXXX team in detail, it was concluded that the team
requires proper training in troubleshooting and in handling problems occurring in the network. With the way
problems are currently handled, it is doubtful that the root cause of a problem can be traced quickly.
9.2 Procedures
Aircom interacted with the XXXX team and developed an understanding of the processes and procedures
followed.
XXXX does not have a dedicated Planning and Optimization team of its own. All planning and design
activities are conducted by the vendor. The drawback is that this gives the vendor maximum privileges
over the BOQs and makes the vendor itself a demand-generating organization. In addition, confidential
network information is disclosed to the vendor in this process, as it has extensive knowledge of the
network.
Currently, KPI reporting for the core is done with Excel sheets, and the KPI information they contain is
incomplete, so it cannot be used extensively for analysis. In addition, it is very difficult to maintain
and retrieve records from Excel sheets. All the KPIs currently available to XXXX are provided by the
vendor, and only on demand, as XXXX is completely dependent on the vendor and cannot carry out its
maintenance and optimization processes independently.
9.3 Recommendations
1. SCTP Associations & Signaling links: One LIP (Local IP) was observed carrying a high SCTP association
load while another LIP carries only a small load, so the signaling load is unbalanced. A complete and
thorough redesign of the signaling aspects of the core network is therefore required, in which all
signaling is optimally dimensioned.
2. Over-dimensioned HLR/VLR: To improve the performance of the HLR, reconciliation should be done; for
VLR performance, a route optimization activity should be performed. In addition to improving the
performance of these network elements, these processes will help to save expansion costs when meeting
commercial demands.
3. Planning and Design Team: XXXX should build an experienced design, planning and optimization team
dedicated to evaluating network status, planning, design and dimensioning, in order to meet future
network expansion requirements and make optimal use of the currently available hardware resources in
the best interest of the XXXX network.
4. Performance Management Tool: It is strongly recommended that all Core Network KPIs and
counters be available in the performance management tool. This will speed up the maintenance
process and, most importantly, give a true picture of network performance at different levels.
5. Technical Training of Employees: Aircom recommends that the XXXX work force be organized
according to industry best practice and that all relevant staff be trained in specific domains. The
specialized teams mentioned below are of pivotal importance in any organization:
10 CONCLUSION
This document has been prepared to serve as a reference listing all the control points in the Core network in
terms of process failure and/or configuration and dimensioning.
Care has been taken to report the issues in terms of factual data, after applying all standard calculations
wherever applicable.
The purpose of this document is to be shared as a knowledge base and to prompt action on all the
recommendations mentioned.
11 APPENDIX
Roaming Documentations