
Contents

3.8 Blade Labeling
CUDB user management
CUDB Node Health Check
Printing Active Alarms
Checking COM SA Status
3. Identify which DSG is hosted in this DS
4. Close previous command line interface: exit
cudbDsgMastershipChange


3.8 Blade Labeling


The blades are labeled according to Linux Open Telecom Cluster (LOTC) name principles.

Common UNIX commands used in CUDB


Find command
To search the whole drive for a file by name, type the following:
find / -name myresume.odt

To search from the current directory or from your home directory, and for
other common variants:

find . -name game          # search below the current directory
find ~ -name game          # search below your home directory
find ~ -atime 100          # files last accessed 100 days ago
find / -empty              # empty files and directories
find / -readable           # files readable by the current user
find / -name "*.mp3"                               # quote wildcards so the shell does not expand them
find / -name "*.mp3" -fprint nameoffiletoprintto   # write the result list to a file
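
As an illustrative, CUDB-flavored sketch (the backup path below is the one
used in the backup sections later in this document; adjust it to your
installation), find can also be used to list old NDB backup directories on a
payload blade:

# List NDB backup directories older than 30 days (illustrative path, adjust as needed)
find /local/cudb/mysql/ndbd/backup/BACKUP -maxdepth 1 -type d -name 'BACKUP-*' -mtime +30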

3.8.1 For the three subracks


Figure 1 Subrack 0 Layout

Blade:  SC_2_1  SC_2_2  PL_2_3  PL_2_4  PL_2_5  PL_2_6  PL_2_7  PL_2_8  PL_2_9  PL_2_10  PL_2_11  PL_2_12
Slot:   0-1     0-3     0-5     0-7     0-9     0-11    0-13    0-15    0-17    0-19     0-21     0-23
Figure 2 Subrack 1 Layout

Figure 3 Subrack 2 Layout


node         type     shelf  slot  blade    ip          alias
node 1 control 0 1 SC_2_1 10.22.0.1 OAM1
node 2 control 0 3 SC_2_2 10.22.0.2 OAM2
node 3 payload 0 5 PL_2_3 10.22.0.3 PL0
node 4 payload 0 7 PL_2_4 10.22.0.4 PL1
node 5 payload 0 9 PL_2_5 10.22.0.5 PL2 new
node 6 payload 0 11 PL_2_6 10.22.0.6 PL3 new
node 7 payload 0 13 PL_2_7 10.22.0.7 DS1_0
node 8 payload 0 15 PL_2_8 10.22.0.8 DS1_1 new
node 9 payload 0 17 PL_2_9 10.22.0.9 DS2_0 new
node 10 payload 0 19 PL_2_10 10.22.0.10 DS2_1
node 11 payload 0 21 PL_2_11 10.22.0.11 DS3_0
node 12 payload 0 23 PL_2_12 10.22.0.12 DS3_1
node 13 payload 1 1 PL_2_13 10.22.0.13 DS4_0 new
node 14 payload 1 3 PL_2_14 10.22.0.14 DS4_1

CUDB user management

List all users


CUDB_11 SC_2_1# cut -d: -f1 /etc/passwd
mysql
open_ldap
system_monitor
g_soap
cluster_supervisor
cudbOperator
cudbadmin
eric-cudb
reasuser
root
bin
daemon
lp
mail
games
wwwrun
ftp
nobody
dhcpd
ntp
man
news
uucp
messagebus
at
polkituser
haldaemon
sshd

CUDB_21 SC_2_2# passwd


Changing password for root.
New password:mcelR123
Retype new password:mcelR123
Password changed.
CUDB_21 SC_2_2#
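
A minimal sketch, assuming the blade aliases SC_2_1 and SC_2_2 are reachable
over SSH as shown elsewhere in this document, to confirm that a given account
(cudbadmin is used only as an example) exists on both controllers:

# Check on both controllers whether the account exists in the passwd database
for sc in SC_2_1 SC_2_2; do
    echo -n "$sc: "
    ssh "$sc" 'getent passwd cudbadmin >/dev/null && echo "cudbadmin present" || echo "cudbadmin missing"'
done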

CUDB Node Health Check

Printing Active Alarms


CUDB_21 SC_2_1# fmactivealarms
Active alarms:
!---------------------------------------------------------------
Module : PREVENTIVE-MAINTENANCE
Error Code : 4
Resource Id : 1.3.6.1.4.1.193.169.100.1
Timestamp : Fri Mar 25 12:50:04 CAT 2016
Model Description : Logchecker found critical error(s), Preventive
Maintenance.
Active Description : Preventive Maintenance: Logchecker has found critical
error(s).
Event Type : 1
Probable Cause : 1024
Severity : critical
Orig Source IP : 10.202.6.67
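
A minimal sketch, assuming every active alarm prints a "Severity :" line in
the format shown above, to get a quick count of alarms per severity:

# Count active alarms grouped by severity (relies on the fmactivealarms output format above)
fmactivealarms | grep 'Severity' | sort | uniq -c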

Checking COM SA Status


CUDB_11 SC_2_2# cudbHaState | grep COM
COM state:
COM is assigned as ACTIVE in controller SC-1
COM is assigned as STANDBY in controller SC-2

Checking SAF Cluster State


CUDB_11 SC_2_2# cudbHaState

LOTC cluster uptime:


--------------------
Wed Oct 1 21:00:45 2014

LOTC cluster state:


-------------------
Node safNode=SC_2_1 joined cluster | Thu Apr 30 15:24:19 2015
Node safNode=SC_2_2 joined cluster | Thu Apr 30 12:21:00 2015
Node safNode=PL_2_3 joined cluster | Wed Oct 1 21:01:52 2014
Node safNode=PL_2_4 joined cluster | Wed Oct 1 21:01:56 2014
Node safNode=PL_2_5 joined cluster | Wed Oct 1 21:01:56 2014
Node safNode=PL_2_6 joined cluster | Wed Oct 1 21:02:01 2014
Node safNode=PL_2_7 joined cluster | Wed Oct 1 21:02:01 2014
Node safNode=PL_2_8 joined cluster | Fri Oct 3 11:46:36 2014
Node safNode=PL_2_9 joined cluster | Wed Oct 1 21:01:52 2014
Node safNode=PL_2_10 joined cluster | Wed Oct 1 21:02:06 2014
Node safNode=PL_2_11 joined cluster | Tue Apr 21 03:05:01 2015
Node safNode=PL_2_12 joined cluster | Wed Oct 1 21:01:56 2014
Node safNode=PL_2_13 joined cluster | Wed Oct 1 21:01:56 2014
Node safNode=PL_2_14 joined cluster | Wed Oct 1 21:02:01 2014

AMF cluster state:


------------------
saAmfNodeAdminState."safAmfNode=SC-1,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=SC-1,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=SC-2,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=SC-2,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-3,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-3,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-4,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-4,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-5,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-5,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-6,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-6,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-7,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-7,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-8,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-8,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-9,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-9,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-10,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-10,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-11,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-11,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-12,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-12,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-13,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-13,safAmfCluster=myAmfCluster": Enabled
saAmfNodeAdminState."safAmfNode=PL-14,safAmfCluster=myAmfCluster": Unlocked
saAmfNodeOperState."safAmfNode=PL-14,safAmfCluster=myAmfCluster": Enabled

CoreMW HA state:
----------------
CoreMW is assigned as ACTIVE in controller SC-2
CoreMW is assigned as STANDBY in controller SC-1

COM state:
----------
COM is assigned as ACTIVE in controller SC-1
COM is assigned as STANDBY in controller SC-2
SI HA state:
------------
saAmfSISUHAState."safSu=PL-4,safSg=2N,safApp=ERIC-
CUDB_SOAP_NOTIFIER"."safSi=2N-1": standby(2)
saAmfSISUHAState."safSu=PL-3,safSg=2N,safApp=ERIC-
CUDB_SOAP_NOTIFIER"."safSi=2N-1": active(1)
saAmfSISUHAState."safSu=SC-2,safSg=PLDB_2N,safApp=ERIC-
CUDB_CS"."safSi=PLDB_2N-1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=2N,safApp=ERIC-CUDB_CUDBOI"."safSi=2N-1":
standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=2N,safApp=ERIC-
CUDB_LDAPFE_MONITOR"."safSi=2N-1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=DS3_2N,safApp=ERIC-CUDB_CS"."safSi=DS3_2N-
1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=DS1_2N,safApp=ERIC-CUDB_CS"."safSi=DS1_2N-
1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=DS4_2N,safApp=ERIC-CUDB_CS"."safSi=DS4_2N-
1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=DS2_2N,safApp=ERIC-CUDB_CS"."safSi=DS2_2N-
1": standby(2)
saAmfSISUHAState."safSu=SC-2,safSg=2N,safApp=ERIC-
CUDB_BC_SERVER_MONITOR"."safSi=2N-1": standby(2)
saAmfSISUHAState."safSu=Control2,safSg=2N,safApp=ERIC-EVIP"."safSi=2N":
standby(2)
saAmfSISUHAState."safSu=Control1,safSg=2N,safApp=ERIC-EVIP"."safSi=2N":
active(1)
saAmfSISUHAState."safSu=SC-1,safSg=2N,safApp=ERIC-
CUDB_BC_SERVER_MONITOR"."safSi=2N-1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=2N,safApp=ERIC-CUDB_CUDBOI"."safSi=2N-1":
active(1)
saAmfSISUHAState."safSu=SC-1,safSg=2N,safApp=ERIC-
CUDB_LDAPFE_MONITOR"."safSi=2N-1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=DS3_2N,safApp=ERIC-CUDB_CS"."safSi=DS3_2N-
1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=DS2_2N,safApp=ERIC-CUDB_CS"."safSi=DS2_2N-
1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=DS1_2N,safApp=ERIC-CUDB_CS"."safSi=DS1_2N-
1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=PLDB_2N,safApp=ERIC-
CUDB_CS"."safSi=PLDB_2N-1": active(1)
saAmfSISUHAState."safSu=SC-1,safSg=DS4_2N,safApp=ERIC-CUDB_CS"."safSi=DS4_2N-
1": active(1)

SU States:
----------
Status OK
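
A minimal sketch, relying on the cudbHaState output format shown above, to
print only AMF node states that are not Unlocked/Enabled (an empty result
means all nodes are healthy):

# Show only AMF admin/oper states that deviate from Unlocked/Enabled
cudbHaState | grep -E 'saAmfNode(Admin|Oper)State' | grep -vE 'Unlocked|Enabled'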

Checking eVIP Agent Status


1. Log in to the eVIP console with the following command:

CUDB_11 SC_2_2# telnet `/opt/vip/bin/getactivecontrol` 25190


Trying fe80::1234%evip_macvlan0...
Connected to fe80::1234%evip_macvlan0.
Escape character is '^]'.

* Copyright (C) 2011 by Ericsson AB


* S - 125 26 STOCKHOLM
* SWEDEN, tel int + 46 10 719 0000
*
* The copyright to the computer program herein is the property of Ericsson
AB. The
* program may be used and/or copied only with the written permission from
Ericsson
* AB, or in accordance with the terms and conditions stipulated in the
* agreement/contract under which the program has been supplied.
*
* All rights reserved.

EVIP> show agents alb_0


+---------------------------[ ALB alb_0 (ACTIVE) ]---------------------------+
+------------------------------------- PN -----------------------------------+
| pnagent (28) | lbesel_pn (56) |
|[1] fe80::ff:fe01:b : ACTIVE |[1] fe80::ff:fe01:b : ACTIVE |
|[2] fe80::ff:fe01:e : ACTIVE |[2] fe80::ff:fe01:e : ACTIVE |
|[3] fe80::ff:fe01:11 : ACTIVE |[3] fe80::ff:fe01:11 : ACTIVE |
|[4] fe80::ff:fe01:14 : ACTIVE |[4] fe80::ff:fe01:14 : ACTIVE |
|[5] fe80::ff:fe01:17 : ACTIVE |[5] fe80::ff:fe01:17 : ACTIVE |
|[6] fe80::ff:fe01:1a : ACTIVE |[6] fe80::ff:fe01:1a : ACTIVE |
|[7] fe80::ff:fe01:1d : ACTIVE |[7] fe80::ff:fe01:1d : ACTIVE |
|[8] fe80::ff:fe01:20 : ACTIVE |[8] fe80::ff:fe01:20 : ACTIVE |
|[9] fe80::ff:fe01:23 : ACTIVE |[9] fe80::ff:fe01:23 : ACTIVE |
|[10] fe80::ff:fe01:26 : ACTIVE |[10] fe80::ff:fe01:26 : ACTIVE |
|[11] fe80::ff:fe01:29 : ACTIVE |[11] fe80::ff:fe01:29 : ACTIVE |
|[12] fe80::ff:fe01:2c : ACTIVE |[12] fe80::ff:fe01:2c : ACTIVE |
|[13] fe80::ff:fe01:2f : ACTIVE |[13] fe80::ff:fe01:2f : ACTIVE |
|[14] fe80::ff:fe01:32 : ACTIVE |[14] fe80::ff:fe01:32 : ACTIVE |
+----------------------------------------------------------------------------+
| ersipc (0) | repdb (56) |
|[1] fe80::ff:fe01:b : ACTIVE |[1] fe80::ff:fe01:b : ACTIVE |
|[2] fe80::ff:fe01:e : ACTIVE |[2] fe80::ff:fe01:e : ACTIVE |
|[3] fe80::ff:fe01:11 : ACTIVE |[3] fe80::ff:fe01:11 : ACTIVE |
|[4] fe80::ff:fe01:14 : ACTIVE |[4] fe80::ff:fe01:14 : ACTIVE |
|[5] fe80::ff:fe01:17 : ACTIVE |[5] fe80::ff:fe01:17 : ACTIVE |
|[6] fe80::ff:fe01:1a : ACTIVE |[6] fe80::ff:fe01:1a : ACTIVE |
|[7] fe80::ff:fe01:1d : ACTIVE |[7] fe80::ff:fe01:1d : ACTIVE |
|[8] fe80::ff:fe01:20 : ACTIVE |[8] fe80::ff:fe01:20 : ACTIVE |
|[9] fe80::ff:fe01:23 : ACTIVE |[9] fe80::ff:fe01:23 : ACTIVE |
|[10] fe80::ff:fe01:26 : ACTIVE |[10] fe80::ff:fe01:26 : ACTIVE |
|[11] fe80::ff:fe01:29 : ACTIVE |[11] fe80::ff:fe01:29 : ACTIVE |
|[12] fe80::ff:fe01:2c : ACTIVE |[12] fe80::ff:fe01:2c : ACTIVE |
|[13] fe80::ff:fe01:2f : ACTIVE |[13] fe80::ff:fe01:2f : ACTIVE |
|[14] fe80::ff:fe01:32 : ACTIVE |[14] fe80::ff:fe01:32 : ACTIVE |
+------------------------------------- LBE ----------------------------------+
| lbeagent (66) | sesel_lbe (22) |
|[2] fe80::1:f4ff:fe01:3 : ACTIVE |[2] fe80::1:f4ff:fe01:3 : ACTIVE |
|[1] fe80::1:f4ff:fe01:4 : ACTIVE |[1] fe80::1:f4ff:fe01:4 : ACTIVE |
+------------------------------------- FE -----------------------------------+
| feeagent (44) | lbesel_fe (26) |
|[1] fe80::1:f6ff:fe01:7 : ACTIVE UP |[1] fe80::1:f6ff:fe01:7 : ACTIVE |
|[2] fe80::1:f6ff:fe01:9 : ACTIVE UP |[2] fe80::1:f6ff:fe01:9 : ACTIVE |
+----------------------------------------------------------------------------+
| sesel_fe (22) | |
|[1] fe80::1:f6ff:fe01:7 : ACTIVE | |
|[2] fe80::1:f6ff:fe01:9 : ACTIVE | |
+------------------------------------- SE -----------------------------------+
| seagent (70) | lbesel_se (42) |
|[2] fe80::1:f5ff:fe01:5 : ACTIVE RDY |[2] fe80::1:f5ff:fe01:5 : ACTIVE |
|[1] fe80::1:f5ff:fe01:6 : ACTIVE RDY |[1] fe80::1:f5ff:fe01:6 : ACTIVE |
+----------------------------------------------------------------------------+
| sesel_se (12) | |
|[2] fe80::1:f5ff:fe01:5 : ACTIVE | |
|[1] fe80::1:f5ff:fe01:6 : ACTIVE | |
+------------------------------------ IPSEC ---------------------------------+
| ikeagent (0) | ipsecuagent (44) |
| |[1] fe80::ff:fe01:d : ACTIVE RDY |
| |[2] fe80::ff:fe01:10 : ACTIVE RDY |
| |[3] fe80::ff:fe01:13 : ACTIVE RDY |
| |[4] fe80::ff:fe01:16 : ACTIVE RDY |
| |[5] fe80::ff:fe01:19 : ACTIVE RDY |
| |[6] fe80::ff:fe01:1c : ACTIVE RDY |
| |[7] fe80::ff:fe01:1f : ACTIVE RDY |
| |[8] fe80::ff:fe01:22 : ACTIVE RDY |
| |[9] fe80::ff:fe01:25 : ACTIVE RDY |
| |[10] fe80::ff:fe01:28 : ACTIVE RDY |
| |[11] fe80::ff:fe01:2b : ACTIVE RDY |
| |[12] fe80::ff:fe01:2e : ACTIVE RDY |
| |[13] fe80::ff:fe01:31 : ACTIVE RDY |
| |[14] fe80::ff:fe01:34 : ACTIVE RDY |
+----------------------------------- XALBSEL --------------------------------+
| xalbsel (26) | |
|[1] fe80::ff:fe01:c : ACTIVE | |
|[2] fe80::ff:fe01:f : ACTIVE | |
|[3] fe80::ff:fe01:12 : ACTIVE | |
|[4] fe80::ff:fe01:15 : ACTIVE | |
|[5] fe80::ff:fe01:18 : ACTIVE | |
|[6] fe80::ff:fe01:1b : ACTIVE | |
|[7] fe80::ff:fe01:1e : ACTIVE | |
|[8] fe80::ff:fe01:21 : ACTIVE | |
|[9] fe80::ff:fe01:24 : ACTIVE | |
|[10] fe80::ff:fe01:27 : ACTIVE | |
|[11] fe80::ff:fe01:2a : ACTIVE | |
|[12] fe80::ff:fe01:2d : ACTIVE | |
|[13] fe80::ff:fe01:30 : ACTIVE | |
|[14] fe80::ff:fe01:33 : ACTIVE | |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
eRSIP state: ACTIVE cIPSEC state: ACTIVE RDY

OK

EVIP> exit
Checking ESA Processes
Use the esa status command to check whether the ESA processes are running
on both controller blades. All ESA agents must be running. See the example
below for how to execute the command and what the expected output looks like.

CUDB_11 SC_2_2# esa status


[info] ESA Master Agent is running.
[info] ESA Sub Agent is running.
[info] ESA PM Agent is running.
CUDB_11 SC_2_2# ssh OAM2 esa status
[info] ESA Master Agent is running.
[info] ESA Sub Agent is running.
[info] ESA PM Agent is running.
CUDB_11 SC_2_2#
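
A minimal sketch, assuming the OAM1/OAM2 host aliases from cluster.conf
resolve on the controllers (as in the ssh OAM2 example above), to run the
same check on both controllers in one loop:

# Run "esa status" on both controllers and show any agent line that is not reported as running
for oam in OAM1 OAM2; do
    echo "== $oam =="
    ssh "$oam" 'esa status' | grep -v 'is running'
done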

Checking MySQL Replication State

Check the MySQL replication state between the master and slave replica(s)
with the following command
CUDB_11 SC_2_1# cudbCheckReplication
[info] cudbCheckReplication is running with options: '', BEHIND_LIMIT: 10
[info] Acquiring mastership information.
[info] Acquiring DSG status information.
[info] Injecting verification data to master DS units.
[info] Sleeping 10 second(s) to wait for replication.
[info] Checking replication in slave DS units.
[info] Node21-DSG0 replication: OK
[info] Node21-DSG1 replication: OK
[info] Node21-DSG2 replication: OK
[info] Node21-DSG3 replication: OK
[info] Node21-DSG4 replication: OK
[info] Summary: channels checked: 5 -> PASSED: 5, FAILED: 0, UNKNOWN: 0.
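
A minimal sketch, keyed on the summary line format shown above, that can be
used in a health-check script to fail when any replication channel is not OK
(the log path is illustrative):

# Exit non-zero if cudbCheckReplication reports any FAILED or UNKNOWN channel
cudbCheckReplication | tee /tmp/repl_check.log | grep -q 'FAILED: 0, UNKNOWN: 0'
echo "replication check exit code: $?"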

Checking LDAP FE Processes and Errors

Check that slapd and cudbLdapFeMonitor processes are running on each PLDB
blade by executing the following command:

CUDB_21 SC_2_1# for i in `awk '/^node.*[PL]_2_.*$/ {print $4}'


/cluster/etc/cluster.conf | xargs`; do echo -n $i:' ';ssh $i "ps -ef | grep
slapd| grep -v grep";echo; done
PL_2_3: root 9296 1 0 Mar25 ? 00:03:18
/opt/ericsson/cudb/Monitors/LdapFeMonitor/bin/cudbLdapFeMonitor --cpu-set
0x00000F80 --arguments -
f|/home/cudb/dataAccess/ldapAccess/ldapFe/config/slapd.conf|-h|
ldap://10.22.0.3:389/ ldaps://10.22.0.3:636/ ldap://192.168.0.3:389/
ldaps://192.168.0.3:636/ ldap://10.202.6.65:389/ ldap://10.202.6.68:389/
ldap://10.202.6.66:389/ ldaps://10.202.6.66:636/ ldaps://10.202.6.65:636/
ldaps://10.202.6.68:636/ ldapi://%2Ftmp%2Fslapd.sock --ldap-uri ldap://PL_2_3
--config-file
/opt/ericsson/cudb/Monitors/LdapFeMonitor/etc/cudbLdapFeMonitor.conf
root 10284 1 4 Mar25 ? 05:14:57
/opt/ericsson/cudb/ldapfe/libexec/slapd -f
/home/cudb/dataAccess/ldapAccess/ldapFe/config/slapd.conf -h
ldap://10.22.0.3:389/ ldaps://10.22.0.3:636/ ldap://192.168.0.3:389/
ldaps://192.168.0.3:636/ ldap://10.202.6.65:389/ ldap://10.202.6.68:389/
ldap://10.202.6.66:389/ ldaps://10.202.6.66:636/ ldaps://10.202.6.65:636/
ldaps://10.202.6.68:636/ ldapi://%2Ftmp%2Fslapd.sock
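
A more readable variant of the check above, as a sketch, assuming the PLDB
blades carry the PL0-PL3 host aliases shown in cluster.conf; it also checks
the cudbLdapFeMonitor process:

# Check slapd and cudbLdapFeMonitor on each PLDB blade (PL0-PL3 aliases from cluster.conf)
for blade in PL0 PL1 PL2 PL3; do
    echo "== $blade =="
    ssh "$blade" "ps -ef | egrep 'slapd|cudbLdapFeMonitor' | grep -v grep"
done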

Check for errors in the LDAP FE printout with the following command:

CUDB_21 SC_2_1# cudbFollowLdapfeLogs


PL_2_13: Mar 30 14:34:28 PL_2_13 slapd[11732]: connection_read(244): no
connection!
PL_2_13: Mar 30 14:40:40 PL_2_13 slapd[11732]: connection_read(161): no
connection!
PL_2_13: Mar 30 14:42:48 PL_2_13 slapd[11732]: connection_read(161): no
connection!
PL_2_13: Mar 30 14:48:02 PL_2_13 slapd[11732]: connection_read(161): no
connection!

Checking LDAP FE Occupation and Operations

CUDB_11 SC_2_1# cudbTpsStat


03/30/16-15:09:55 | PL_2_9 | PL_2_8 | PL_2_7 | PL_2_6 | PL_2_5 | PL_2_4 | PL_2_3 | PL_2_14 | PL_2_13 | PL_2_12 | PL_2_11 | PL_2_10 | SUMMARY
--------------------------------------------------------------------------------------------------------------------------------------------
TPS               |      0 |     83 |     91 |    235 |     96 |    246 |    104 |      55 |     109 |     100 |      93 |      91 |    1303
ERR5123           |      0 |      0 |      0 |      0 |      0 |      0 |      0 |       0 |       0 |       0 |       0 |       0 |       0
ERR80             |      0 |      0 |      0 |      0 |      0 |      0 |      0 |       0 |       0 |       0 |       0 |       0 |       0
PROXY_IN          |      0 |      0 |      0 |      0 |      0 |      0 |      0 |       0 |       0 |       0 |       0 |       0 |       0

Checking MySQL Database Consistency

Check the database consistency between MySQL clusters (that is, between the
PLDB and DSG master replicas and their slaves) with the following command
CUDB_11 SC_2_1# cudbCheckConsistency
[info] cudbCheckConsistency is running with options: '', MAXDIFF_LIMIT:
1.00%, CHECK_LIMIT: 100
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node21-DSG0 consistency: OK - row count difference 0.00%
[info] Node21-DSG1 consistency: OK - row count difference 0.00%
[info] Node21-DSG2 consistency: OK - row count difference 0.00% in 'CPK'
(1 rows, MA: 1386319 <> SL: 1386318)
[info] Node21-DSG3 consistency: OK - row count difference 0.00% in 'CPK'
(1 rows, MA: 1528125 <> SL: 1528124)
[info] Node21-DSG4 consistency: OK - row count difference 0.00%
[info] Summary: slaves checked: 5 -> PASSED: 5, FAILED: 0, UNKNOWN: 0.

Checking BC Cluster States


In addition to the cudbSystemStatus command output, check the Blackboard
Coordination (BC) cluster state with the following command on each node
CUDB_11 SC_2_1# /opt/ericsson/cudb/sm_bcclient/bin/BCClient.sh
cs
N11-D0-I1 --> active
N11-D0-I2 --> standby
N11-D1-I1 --> active
N11-D1-I2 --> standby
N11-D2-I1 --> active
N11-D2-I2 --> standby
N11-D3-I1 --> active
N11-D3-I2 --> standby
N11-D4-I1 --> active
N11-D4-I2 --> standby
dsg
N11-D0 --> alive [21]
N11-D1 --> alive [86]
N11-D2 --> alive [83]
N11-D3 --> alive [87]
N11-D4 --> alive [87]
dsgStatusList
D0 --> [size:44] {S1-N11-D0=alive [21], S2-N21-D0=alive
[21]}
D1 --> [size:44] {S1-N11-D1=alive [86], S2-N21-D1=alive
[85]}
D2 --> [size:44] {S1-N11-D2=alive [83], S2-N21-D2=alive
[81]}
D3 --> [size:44] {S1-N11-D3=alive [87], S2-N21-D3=alive
[82]}
D4 --> [size:44] {S1-N11-D4=alive [87], S2-N21-D4=alive
[83]}
masterList
D0 --> master-S1-N11-D0 [1458897874241]
D1 --> master-S1-N11-D1 [1430406148672]
D2 --> master-S1-N11-D2 [1458897874247]
D3 --> master-S1-N11-D3 [1430523867073]
D4 --> master-S1-N11-D4 [1434545875471]
status --> majority[1, 2] AR[] NS[]
leader --> S1-N11-I2
auxiliar
S2-N21-I1 --> Hello
S2-N21-I2 --> Hello
working
mode --> normal
actions
alarms
clear
raise
Checking BC Server States

Perform the following steps to check where the BC servers are running, and in
which mode they are running:

1. Execute the following command to check which blades the BC servers are
running on:

CUDB_11 SC_2_1# cudbSystemStatus -b

Execution date: Wed Mar 30 15:16:17 CAT 2016

Checking BC clusters:

[Site 1]

SM leader: Node 11 OAM2

Node 10.201.6.68
BC server in SC_2_1 ......... running
BC server in SC_2_2 ......... running
BC server in PL_2_5 ......... running (Leader)

[Site 2]

SM leader: Node 21 OAM1

Node 10.202.6.68
BC server in SC_2_1 ......... running
BC server in SC_2_2 ......... running (Leader)
BC server in PL_2_5 ......... running

2. Execute the following command on the blades where the BC servers are
running to check their mode of operation:
CUDB_11 SC_2_1# cudbManageBCServer -check4
Actual Host=SC_2_1
Correct number of processes running in blade SC_2_1
CUDB_11 SC_2_1# cudbManageBCServer -checkMode
Actual Host=SC_2_1
Mode: follower
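
A minimal sketch that repeats the two checks above on every blade that hosts
a BC server (blade names taken from the cudbSystemStatus -b output earlier;
adjust the list to your node):

# Check BC server process count and mode on each blade hosting a BC server
for blade in SC_2_1 SC_2_2 PL_2_5; do
    echo "== $blade =="
    ssh "$blade" 'cudbManageBCServer -check4; cudbManageBCServer -checkMode'
done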

CUDB_21 SC_2_1# cudbManageStore --all --order status

cudbManageStore stores to process: pl ds1 (in dsgroup1) ds2 (in dsgroup2)


ds3 (in dsgroup3) ds4 (in dsgroup4).

Store pl in dsgroup 0 is alive and reporting status ACTIVE.


Store ds1 in dsgroup 1 is alive and reporting status ACTIVE.
Store ds2 in dsgroup 2 is alive and reporting status ACTIVE.
Store ds3 in dsgroup 3 is alive and reporting status ACTIVE.
Store ds4 in dsgroup 4 is alive and reporting status ACTIVE.
cudbManageStore command successful.
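
A minimal sketch, keyed on the wording of the cudbManageStore output above,
that prints only stores which are not reporting ACTIVE (an empty result means
all stores are ACTIVE):

# Show any store whose reported status is not ACTIVE
cudbManageStore --all --order status | grep 'reporting status' | grep -v 'ACTIVE'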

Server Platform, Blade Replacement

This OPI specifies how to replace a blade in a CUDB node. A replacement may
be done because of a blade fault, or as part of a hardware upgrade.

For further information see CUDB Node EBS Hardware Description.

login as: root


Using keyboard-interactive authentication.
Password:mcelR123
Last login: Sun Apr 19 21:41:49 2015 from 10.14.0.55
1. Log in to the active SC.

1.1 Display the active and standby controllers:


CUDB_21 SC_2_2# cudbHaState | grep COM
COM state:
COM is assigned as ACTIVE in controller SC-1
COM is assigned as STANDBY in controller SC-2
1.2 Switch to the active SC:
CUDB_21 SC_2_2# ssh SC_2_1

2. Identify whether the blade is a system controller (SC), a DS, or a PL.


Identify the DS number in the Linux Open Telecom Cluster (LOTC) configuration
file:
CUDB_21 SC_2_2# cat /cluster/etc/cluster.conf
node 11 payload PL_2_11
node 12 payload PL_2_12
node 13 payload PL_2_13
node 14 payload PL_2_14
………………………………………………………………….

host all 10.22.0.11 DS3_0


host all 10.22.0.12 DS3_1
host all 10.22.0.13 DS4_0
host all 10.22.0.14 DS4_1

……………………………………………………………………………

interface 11 eth3 ethernet 34:07:fb:ed:2f:90


interface 11 eth4 ethernet 34:07:fb:ed:2f:91
interface 11 eth2 ethernet 34:07:fb:ed:2f:92
blank 34:07:fb:ed:2f:93
interface 11 eth0 ethernet 34:07:fb:ed:2f:94
interface 11 eth1 ethernet 34:07:fb:ed:2f:95

…………………………………………………………………………………………….
#Define default network routes
route all multicast interface bond0
route control default gateway 10.202.4.151

The DMX IP address is the "route control default gateway" IP minus 1.


In this node the route control default gateway is 10.202.4.151, so the DMX IP
is 10.202.4.150.
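
A minimal sketch of the same calculation, assuming (as in the example above)
that the gateway and the DMX differ only in the last octet:

# Derive the DMX IP (gateway - 1) from the default gateway in cluster.conf
GW=$(awk '/^route control default gateway/ {print $5}' /cluster/etc/cluster.conf)
DMX_IP="${GW%.*}.$(( ${GW##*.} - 1 ))"
echo "default gateway: $GW -> DMX IP: $DMX_IP"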

3. Identify which DSG is hosted in this DS.

Open a command line interface (/opt/com/bin/cliss) on the COM active
controller and show the configuration of the local DS, where <node> is the
node's identification number and <ds> is the DS's identification number:

CUDB_21 SC_2_1# /opt/com/bin/cliss
>show configuration ManagedElement=1,CudbSystem=1,CudbLocalNode=21,CudbLocalDs=3
CudbLocalDs=3
dsGroupId=3
enabled=true
instancePriority=2
up

4. Close the previous command line interface with exit:


CUDB_21 SC_2_1# exit
logout
Connection to SC_2_1 closed.

Find out whether that DSG is a master or not:


CUDB_21 SC_2_2# cudbSystemStatus -R

Execution date: Wed Apr 15 22:55:17 CAT 2015

Checking Replication Channels in the System:


Node | 11 | 21
====================
PLDB ___|__S2_|__M__
DSG 1 __|__M__|__S2_
DSG 2 __|__S2_|__M__
[-E-] DSG 3 __|__M__|__Xu_
DSG 4 __|__S2_|__M__

[-E-]

CUDB_21 SC_2_2#
2.1 Preparation for the replacement
Lock the blade at SAF level:
CUDB_21 SC_2_2# cmw-node-lock PL_2_11

Make a copy of rpm.conf and keep only the linux-control or linux-payload RPM
in rpm.conf:
CUDB_21 SC_2_1# cp /cluster/nodes/14/etc/rpm.conf
/cluster/nodes/14/etc/rpm.conf_FULL
CUDB_21 SC_2_1# grep -ia linux /cluster/nodes/14/etc/rpm.conf_FULL >
/cluster/nodes/14/etc/rpm.conf
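
The same two commands in a parameterized form (a sketch only; NODE is the
LOTC node id of the blade being replaced):

# Back up rpm.conf and keep only the linux-control/linux-payload RPM line
NODE=14
cp /cluster/nodes/$NODE/etc/rpm.conf /cluster/nodes/$NODE/etc/rpm.conf_FULL
grep -ia linux /cluster/nodes/$NODE/etc/rpm.conf_FULL > /cluster/nodes/$NODE/etc/rpm.conf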

Connect and log in, in expert mode, to the active DMX of the subrack where
the blade to be replaced is located:

CUDB_21 SC_2_1# ssh -p 2024 expert@10.202.4.150


expert@10.202.4.150's password: expert
Welcome to the DMX CLI
expert connected from 10.202.4.148 using ssh on blade_0_25
expert@blade_0_25 09:29:53>
expert@blade_0_25 09:29:53>
show ManagedElement 1 DmxFunctions 1 BladeGroupManagement 1 Group CUDB
ShelfSlot Blade 1
ShelfSlot 0-21
Blade 1
operationalState enabled
availabilityStatus noStatus
productNumber "ROJ 208 840/3"
productRevisionState R6A
productName GEP3-HD300
serialNumber "A065094550 "
manufacturingDate 2013-12-23Z
vendorName "Ericsson AB"
changeDate 2014-09-03T08:32:50Z
busType ipmi
firstMacAddress 34:07:fb:ed:2f:8f
consecutiveMacAddresses 12
ShelfSlot 0-23
Blade 1
operationalState enabled
availabilityStatus noStatus
productNumber "ROJ 208 840/3"
productRevisionState R6A
productName GEP3-HD300
serialNumber "A065094544 "
manufacturingDate 2013-12-23Z
vendorName "Ericsson AB"
changeDate 2014-09-03T08:32:50Z
busType ipmi
firstMacAddress 34:07:fb:ed:2f:47
consecutiveMacAddresses 12
ShelfSlot 1-1
Blade 1
operationalState enabled
availabilityStatus noStatus
productNumber "ROJ 208 840/3"

productRevisionState R6A
productName GEP3-HD300
serialNumber "A065094341 "
manufacturingDate 2013-12-21Z
vendorName "Ericsson AB"
changeDate 2014-09-03T08:38:10Z
busType ipmi
firstMacAddress 34:07:fb:ed:28:b7
consecutiveMacAddresses 12
ShelfSlot 1-3
Blade 1
operationalState enabled
availabilityStatus noStatus
productNumber "ROJ 208 840/3"
productRevisionState R6A
productName GEP3-HD300
serialNumber "A065094584 "
manufacturingDate 2013-12-23Z
vendorName "Ericsson AB"
changeDate 2014-09-03T08:38:11Z
busType ipmi
firstMacAddress 34:07:fb:ed:30:c7
consecutiveMacAddresses 12
[ok][2015-04-17 09:34:39]
expert@blade_0_25 09:34:39> conf
Power down the node (blade) to be replaced. From the DMX CLI, run the
following command with the specific shelf and slot numbers and press Enter:
set ManagedElement 1 Equipment 1 Shelf <subrack> Slot <slot> Blade 1
administrativeState locked
Then type commit and press Enter.

expert@blade_0_25 09:34:39> set ManagedElement 1 Equipment 1 Shelf 0 Slot 21


Blade 1 administrativeState locked
commit

1. Disconnect any cables from the board to be replaced.


2. Use an Electrostatic Discharge (ESD) wrist strap. Connect the strap to
the ESD connection point in the upper part of the cabinet.
3. Loosen the mounting screws and remove the board from the subrack.
4. Extract the new board unit from its ESD bag.
5. Check the product identity of the replacement board unit.
6. Align the processor board with the upper card rails in the subrack.
7. Carefully push the new processor board plug-in unit into the subrack.
8. Tighten the mounting screws with a torque of 0.5 – 0.7 Nm.
9. Reconnect any cables as described in CUDB Node EBS Hardware
Installation Guide for EBS and in CUDB Node EBS BSP Hardware
Installation Guide for EBS BSP. Tighten the screws of all connectors
with a TORX® T5 screwdriver to a maximum torque of 0.3 Nm.
10. Put the replaced unit in the ESD bag from the replacement unit.
Use the packaging material from the replacement unit to repackage the
replaced unit.
11. Board insertion is complete. Remove the ESD wrist strap.
12. Connect and log in, in expert mode, to the active DMX of the subrack
where the blade was replaced, as described in CUDB Node EBS
Hardware Configuration Guide.

Power off the new blade by issuing the following command:


expert@blade_0_25 22:29:39> conf
Entering configuration mode private
[ok][2015-04-20 22:38:09]

[edit]
expert@blade_0_25 22:38:09% set ManagedElement 1 Equipment 1 Shelf 0 Slot 21
Blade 1 administrativeState locked
[ok][2015-04-20 22:38:28]

[edit]
expert@blade_0_25 22:38:28% commit
Commit complete.
[ok][2015-04-20 22:38:35]

[edit]
Power on the blade by issuing the following command:
expert@blade_0_25 09:34:39> set ManagedElement 1 Equipment 1 Shelf 0 Slot 21
Blade 1 administrativeState unlocked
commit

Check that the correct firmware version is used by retrieving the firmware
data for the inserted blade
request ManagedElement 1 Equipment 1 Shelf 0 Slot 21 Blade 1 getFirmwareData
type running
expert@blade_0_25 15:38:20> conf
expert@blade_0_25 06:31:24% request ManagedElement 1 Equipment 1 Shelf 1 Slot
3 Blade 1 getFirmwareData type running
IpmiFirmwareData {
type upg
productNumber CXC138912
productRevisionState R7B
version 3.20
}
[ok][2015-04-17 15:38:57]
If the firmware version has been updated, the GEP3 blade must be restarted
(lock and then unlock it again):
expert@blade_0_25 22:29:39> conf
Entering configuration mode private
[ok][2015-04-20 22:38:09]

[edit]
expert@blade_0_25 22:38:09% set ManagedElement 1 Equipment 1 Shelf 0 Slot 21
Blade 1 administrativeState locked
[ok][2015-04-20 22:38:28]

[edit]
expert@blade_0_25 22:38:28% commit
Commit complete.
[ok][2015-04-20 22:38:35]

[edit]
expert@blade_0_25 09:34:39> set ManagedElement 1 Equipment 1 Shelf 0 Slot 21
Blade 1 administrativeState unlocked
commit
Check that the BIOS version of the blade is not lower than R6A01 by executing
the following command:
CUDB_21 SC_2_1# ssh PL_2_12
Last login: Fri Apr 17 10:26:19 2015 from sc_2_1
CUDB_21 PL_2_12# dmidecode --type BIOS | grep Version
Version: R7A02

CUDB_21 SC_2_1# ssh -p 2024 expert@10.202.4.150


expert@10.202.4.150's password:
Welcome to the DMX CLI
expert connected from 10.202.4.148 using ssh on blade_0_25
expert@blade_0_25 22:59:12> show ManagedElement 1 DmxFunctions 1
BladeGroupManagement 1 Group CUDB ShelfSlot Blade 1 firstMacAddress
SHELF-SLOT ID   BLADE ID   FIRST MAC ADDRESS
---------------------------------
0-1 1 34:07:fb:ed:2f:d7
0-3 1 34:07:fb:ed:2f:5f
0-5 1 34:07:fb:ed:2f:2f
0-7 1 34:07:fb:ed:28:27
0-9 1 34:07:fb:ed:2f:9b
0-11 1 34:07:fb:ed:31:03
0-13 1 34:07:fb:ed:2f:17
0-15 1 34:07:fb:ed:30:43
0-17 1 34:07:fb:ed:30:97
0-19 1 34:07:fb:ed:30:d3
0-21 1 34:07:fb:ed:2f:8f
0-23 1 34:07:fb:ed:2f:47
1-1 1 34:07:fb:ed:28:b7
1-3 1 34:07:fb:ed:30:c7

[ok][2015-04-20 22:59:33]
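
Before editing, a small precautionary sketch (assumption: node 11 is the
blade being replaced) to keep a dated copy of cluster.conf and display the
interface lines that will be updated:

# Keep a dated copy of cluster.conf and show the MAC lines for the replaced node
cp /cluster/etc/cluster.conf /cluster/etc/cluster.conf.$(date +%Y%m%d)
grep '^interface 11 ' /cluster/etc/cluster.conf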

Edit the LOTC file /cluster/etc/cluster.conf and replace the old MAC
addresses with the new ones obtained:
vi /cluster/etc/cluster.conf
Press i to insert and update the interface lines for the node, for example:
interface 11 eth3 ethernet 34:07:fb:f1:77:20
interface 11 eth4 ethernet 34:07:fb:f1:77:21
interface 11 eth2 ethernet 34:07:fb:f1:77:22
interface 11 eth0 ethernet 34:07:fb:f1:77:24 (was 23)
interface 11 eth1 ethernet 34:07:fb:f1:77:25 (was 24)
Press Esc, then type :wq to save and exit.

Then reboot the node:


CUDB_21 SC_2_1# cluster reboot -n 11
Rebooting node 11 (PL_2_11)
Succeeded to execute /sbin/reboot on 11 (PL_2_11)

Restore the full rpm.conf and unlock the node at SAF level:

CUDB_21 SC_2_1# cp /cluster/nodes/11/etc/rpm.conf_FULL /cluster/nodes/11/etc/rpm.conf
cmw-node-unlock PL_2_11

CUDB_21 SC_2_1# cd /opt/ericsson/cudb/OAM/support/bin

CUDB_21 SC_2_1# ./cudbPartTool rebuild -n 14


CUDB partitioning tool for EBS

-= Rebuilding payloads =-

Reset partition table in PL_2_14


WARNING: command "/bin/umount -f /local" in PL_2_14 exit with non-zero
status.
WARNING: command "/bin/umount -f /local2" in PL_2_14 exit with non-zero
status.
Building partitions in PL_2_14
Formatting partitions in PL_2_14

Done.
CUDB_21 SC_2_1#

CUDB_21 SC_2_1# ssh PL_2_14


Last login: Tue Jan 19 11:07:33 2016 from sc_2_1
CUDB_21 PL_2_14# df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 2.0G 2.0G 0 100% /
root 2.0G 2.0G 0 100% /
tmpfs 12G 696K 12G 1% /dev/shm
shm 12G 696K 12G 1% /dev/shm
192.168.0.100:/.cluster 62G 24G 35G 41% /cluster
CUDB_21 PL_2_14# exit
logout
Connection to PL_2_14 closed.
CUDB_21 SC_2_1# cd /opt/ericsson/cudb/OAM/support/bin/
CUDB_21 SC_2_1# ./cudbPartTool check -n 14

CUDB partitioning tool for EBS

-= Cluster filesystem analysis =-

Payload PL_2_14 report:


WARNING: local storages not mounted.

Done.
CUDB_21 SC_2_1# cmw-node-lock PL_2_14
CUDB_21 SC_2_1# cd /opt/ericsson/cudb/OAM/support/bin/
CUDB_21 SC_2_1# ./cudbPartTool rebuild -n 14

CUDB partitioning tool for EBS

-= Rebuilding payloads =-
Reset partition table in PL_2_14
WARNING: command "/bin/umount -f /local" in PL_2_14 exit with non-zero
status.
WARNING: command "/bin/umount -f /local2" in PL_2_14 exit with non-zero
status.
Building partitions in PL_2_14
Formatting partitions in PL_2_14

Done.
CUDB_21 SC_2_1# ./cudbPartTool check -n 14

CUDB partitioning tool for EBS

-= Cluster filesystem analysis =-

Payload PL_2_14 report:


WARNING: local storages not mounted.

Done.
CUDB_21 SC_2_1# cp /cluster/nodes/14/etc/rpm.conf_FULL
/cluster/nodes/14/etc/rpm.conf

CUDB_21 SC_2_2# cluster reboot -n 14


Rebooting node 14 (PL_2_14)
Succeeded to execute /sbin/reboot on 14 (PL_2_14)

CUDB_21 SC_2_1# ssh PL_2_14
PL_2_14:~ # exit
logout
Connection to PL_2_14 closed.
CUDB_21 SC_2_1# ssh PL_2_14
Last login: Tue Jan 19 12:03:28 2016 from sc_2_1
CUDB_21 PL_2_14# df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 2.0G 1.4G 716M 66% /
root 2.0G 1.4G 716M 66% /
tmpfs 12G 0 12G 0% /dev/shm
shm 12G 0 12G 0% /dev/shm
192.168.0.100:/.cluster 62G 24G 35G 41% /cluster
/dev/sdb1 138G 188M 131G 1% /local
/dev/sdb2 138G 188M 131G 1% /local2
CUDB_21 PL_2_14#
command log

CUDB_21 SC_2_1# cudbManageStore --all --order status

cudbManageStore stores to process: pl ds1 (in dsgroup1) ds2 (in dsgroup2)


ds3 (in dsgroup3) ds4 (in dsgroup4).

Store pl in dsgroup 0 is alive and reporting status ACTIVE.


Store ds1 in dsgroup 1 is alive and reporting status ACTIVE.
Store ds2 in dsgroup 2 is alive and reporting status ACTIVE.
Store ds3 in dsgroup 3 is alive and reporting status ACTIVE.
Store ds4 in dsgroup 4 is alive and reporting status DEGRADED.
cudbManageStore command successful.

cudbSystemStatus reports the following:

Node 21:
PL Cluster (20%) .............................OK
DSG1 Cluster (83%) ...........................OK
DSG2 Cluster (84%) ...........................OK
[-W-] DSG3 Cluster (82%) ...........................NOK: degraded
DSG4 Cluster (84%) ...........................OK

Run fmactivealarms to verify that the node is down:
!---------------------------------------------------------------
Module : STORAGE-ENGINE
Error Code : 2
Resource Id : 1.3.6.1.4.1.193.169.1.2.2.3.3.10.22.0.11
Timestamp : Tue Feb 10 02:04:23 CAT 2015
Model Description : Cluster node down, Storage Engine.
Active Description : Storage Engine (DS-group #3): NDB node #3 down @
10.22.0.11.
Event Type : 4
Probable Cause : 546
Severity : major
Orig Source IP : 10.202.6.67
---------------------------------------------------------------!

According to /cluster/etc/cluster.conf, the affected DSG3 blade (DS3_0, 10.22.0.11) is PL_2_11:


cat /cluster/etc/cluster.conf

#Define number of blades. Id, type and name are fixed.


node 1 control SC_2_1
node 2 control SC_2_2
node 3 payload PL_2_3
node 4 payload PL_2_4
node 5 payload PL_2_5
node 6 payload PL_2_6
node 7 payload PL_2_7
node 8 payload PL_2_8
node 9 payload PL_2_9
node 10 payload PL_2_10
node 11 payload PL_2_11
node 12 payload PL_2_12
node 13 payload PL_2_13
node 14 payload PL_2_14

host all 10.22.0.1 OAM1


host all 10.22.0.2 OAM2
host all 10.22.0.3 PL0
host all 10.22.0.4 PL1
host all 10.22.0.5 PL2
host all 10.22.0.6 PL3
host all 10.22.0.7 DS1_0
host all 10.22.0.8 DS1_1
host all 10.22.0.9 DS2_0
host all 10.22.0.10 DS2_1
host all 10.22.0.11 DS3_0
host all 10.22.0.12 DS3_1
host all 10.22.0.13 DS4_0
host all 10.22.0.14 DS4_1
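
A minimal sketch that automates this lookup, assuming (as in the blade table
earlier in this document) that the LOTC node id equals the last octet of the
blade's internal IP address; DS3_0 is used as the example alias:

# Map a DS alias to its blade name via the host and node sections of cluster.conf
IP=$(awk '/^host all .* DS3_0$/ {print $3}' /cluster/etc/cluster.conf)   # 10.22.0.11
ID=${IP##*.}                                                             # node id 11 (last octet)
awk -v id="$ID" '$1 == "node" && $2 == id {print "DS3_0 is hosted on blade " $4}' /cluster/etc/cluster.conf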

cudbAnalyser -a reports the following:

Feb 15 14:53:42 PL_2_11 mysqld: 150215 14:53:42 [ERROR] mysqld: Table


'./mysql/ndb_binlog_index' is marked as crashed and should be repaired
Feb 15 14:53:47 PL_2_11 mysqld: 150215 14:53:47 [ERROR] mysqld: Table
'./mysql/ndb_binlog_index' is marked as crashed and should be repaired
Feb 15 14:53:50 PL_2_11 mysqld: Ndb Event Buffer : Fatal error.

The kernel log on PL_2_11 reports the following:

Feb 18 14:17:24 PL_2_11 kernel: [948887.876817] mptscsih: ioc0: attempting


task abort! (sc=ffff880614e15cc0)
Feb 18 14:17:24 PL_2_11 kernel: [948887.876823] sd 7:0:0:0: [sdb] CDB:
Read(10): 28 00 02 d4 21 f8 00 04 00 00
Feb 18 14:17:24 PL_2_11 kernel: [948888.347563] mptbase: ioc0:
LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
cb_idx mptscsih_io_done
Feb 18 14:17:24 PL_2_11 kernel: [948888.347759] mptscsih: ioc0: task abort:
SUCCESS (rv=2002) (sc=ffff880614e15cc0) (sn=2056861)
Feb 18 14:17:24 PL_2_11 kernel: [948888.347763] mptscsih: ioc0: attempting
task abort! (sc=ffff8806162560c0)
Feb 18 14:17:24 PL_2_11 kernel: [948888.347766] sd 7:0:0:0: [sdb] CDB:
Read(10): 28 00 02 d6 a4 d0 00 00 08 00
Feb 18 14:17:25 PL_2_11 kernel: [948888.846374] mptbase: ioc0:
LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
cb_idx mptscsih_io_done
Feb 18 14:17:25 PL_2_11 kernel: [948888.846573] mptscsih: ioc0: task abort:
SUCCESS (rv=2002) (sc=ffff8806162560c0) (sn=2056862)
Feb 18 14:17:25 PL_2_11 kernel: [948889.273865] net_ratelimit: 10 callbacks
suppressed
Feb 18 14:17:31 PL_2_11 kernel: [948895.266576] net_ratelimit: 10 callbacks
suppressed
Feb 18 14:17:37 PL_2_11 kernel: [948901.259299] net_ratelimit: 10 callbacks
suppressed

A detailed health check of the disk (smartctl) does not respond:

CUDB_21 PL_2_11# smartctl -a /dev/sdb


smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-8 by Bruce Allen, http://smartmon tools.sourceforge.net

^C^C^C^C

Solution:

Replace the affected blade PL_2_11 using the procedures in Alex. There is a
hardware fault with the memory in the GEP3 blade of DSG3.

Procedure 1: CUDB EBS Node Hardware Replacement, 10/1553-CSH109067/6 Uen C


Procedure 2: Server Platform, Blade Replacement: 4/1543-CSH 109 067/6 Uen B
CUDB_11 SC_2_1# cudbSystemStatus -c

Execution date: Mon Apr 13 14:04:10 CAT 2015

Checking Clusters status:


dsg node stat memu role
0 11 2 21 Slave
1 11 2 84 Master
2 11 2 84 Slave
3 11 2 84 Master
4 11 2 85 Slave
0 21 2 21 Master
1 21 2 84 Slave
2 21 2 84 Master
3 21 0 -1 Unreachable
4 21 2 85 Master

[-W-] There are Clusters in wrong state:


D3 in N21 down.
1. Check the masters:
CUDB_21 SC_2_1# cudbSystemStatus -r -R

Execution date: Mon Apr 13 11:55:43 CAT 2015

Checking Replication Channel in the System


Node 21 :
DSG0 is Master
Replication in DSG1(Node=21--Chan=2).... OK -- Delay = 0
DSG2 is Master
DSG3 is in state down
DSG4 is Master
Node 11 :
Replication in DSG0(Node=11--Chan=2).... OK -- Delay = 0
DSG1 is Master
Replication in DSG2(Node=11--Chan=2).... OK -- Delay = 0
DSG3 is Master
Replication in DSG4(Node=11--Chan=2).... OK -- Delay = 0
Checking Replication Channels in the System:
Node | 11 | 21
====================
PLDB ___|__S2_|__M__
DSG 1 __|__M__|__S2_
DSG 2 __|__S2_|__M__
[-E-] DSG 3 __|__M__|__Xu_
DSG 4 __|__S2_|__M__

[-E-]
CUDB_21 SC_2_1# cudbAnalyser -a
Started on ACTIVE SC...
[INFO] Checking files:
//home/cudb/monitoring/preventiveMaintenance//CUDB_21_201504130125.log
//home/cudb/monitoring/preventiveMaintenance//CUDB_21_201504121325.log
logfile versions:0.0.72/0.0.72
[ERROR] ALARM: There are active cluster alarms (Severity: Major)
Node Hostname Severity Type Problem
Information
11 PL_2_11 Major 3 Disk Usage
Disk usage above threshold major 90% (/ (100%), / (100%))
[ERROR] OS: One or more partition on one of the blades are getting full
(Severity: Major)
CUDB21 DS3_0 rootfs 2.0G 2.0G 0 100% /
CUDB21 DS3_0 root 2.0G 2.0G 0 100% /
[ERROR] cudbSystemStatus: CUDB Process is not running as expected (Severity:
Warning)
[-W-] MySQL server process (Master).................Not running in: DS3_0
[-W-] MySQL server process (Slave)..................Not running in: DS3_0
[-W-] MySQL server process (Access).................Not running in: DS3_0
DS3_1
[ERROR] MYSQL: Binlog is not written (Severity: Major)
CUDB21 DS3_0 0
CUDB21 DS3_1 0

CUDB_11 SC_2_1# more /cluster/etc/cluster.conf


#Define number of blades. Id, type and name are fixed.
node 1 control SC_2_1
node 2 control SC_2_2
node 3 payload PL_2_3
node 4 payload PL_2_4
node 5 payload PL_2_5
node 6 payload PL_2_6
node 7 payload PL_2_7
node 8 payload PL_2_8
node 9 payload PL_2_9
node 10 payload PL_2_10
node 11 payload PL_2_11
node 12 payload PL_2_12
node 13 payload PL_2_13
node 14 payload PL_2_14

Then login to p

1. Take backup on Master

CUDB_11 SC_2_2# cudbHaState | grep COM


COM state:
COM is assigned as ACTIVE in controller SC-1
COM is assigned as STANDBY in controller SC-2

Execute the following command to back up the PLDB or a specific DSG cluster
of the node:
CUDB_11 SC_2_2# cudbManageStore --ds 4 -o backup

cudbManageStore stores to process: ds4 (in dsgroup4).

Starting Backup ...


Launching order Backup for ds4 in dsgroup 4.
Obtaining Mgm Information.
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2375 -e "START BACKUP 999 WAIT COMPLETED"
..ok
BACKUP-999 renamed in PL_2_13 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2015-07-01_15-12
BACKUP-999 renamed in PL_2_14 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2015-07-01_15-12

Backup finished successfully for store ds4.


Stores where order backup was successfully completed: ds4.
cudbManageStore command successful.

4. Performing Unit Data Restore

CUDB_21 SC_2_2# cudbUnitDataBackupAndRestore -d 1 -n 21

CREATE PART
--------------------------------

creating backup on node 11

cudbManageStore stores to process: ds1 (in dsgroup1).

Starting Backup ...


Launching order Backup for ds1 in dsgroup 1.
Obtaining Mgm Information.
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2372 -e "START BACKUP 999 WAIT COMPLETED"

Potential data inconsistency between replicas

CUDB_21 SC_2_2# fmactivealarms


!---------------------------------------------------------------
Module : STORAGE-ENGINE
Error Code : 19
Resource Id : 1.3.6.1.4.1.193.169.1.2.19.4
Timestamp : Thu Jul 02 00:37:21 CAT 2015
Model Description : Potential data inconsistency between replicas found,
Storage Engine.
Active Description : Storage Engine (DS-group #4): Potential data
inconsistency between replicas found.
Event Type : 4
Probable Cause : 160
Severity : major
Orig Source IP : 10.202.6.67
---------------------------------------------------------------!

CUDB_21 SC_2_2# cudbCheckConsistency


This checks the consistency manually with the default values given in the
configuration file. No alarms are raised.

[info] cudbCheckConsistency is running with options: '', MAXDIFF_LIMIT:


1.00%, CHECK_LIMIT: 100
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node11-DSG0 consistency: OK - row count difference 0.00%
[info] Node21-DSG1 consistency: OK - row count difference 0.00% in 'CPK'
(3 rows, MA: 1236814 <> SL: 1236817)
[info] Node11-DSG2 consistency: OK - row count difference 0.00%
[info] Node21-DSG3 consistency: OK - row count difference 0.00% in 'CPK'
(4 rows, MA: 1244658 <> SL: 1244662)
[error] Node21-DSG4 consistency: NOK (FAILED) - row count difference
50.51% in 'CPA' (499 rows, MA: 988 <> SL: 489)!
[info] Summary: slaves checked: 5 -> PASSED: 4, FAILED: 1, UNKNOWN: 0.

CUDB_21 SC_2_2# cudbCheckConsistency -m 0.00 -c 0


Checks the consistency strictly so that no difference is accepted and all
the tables are checked. No alarms are raised.

[info] cudbCheckConsistency is running with options: '-m 0.00 -c 0',


MAXDIFF_LIMIT: 0.00%, CHECK_LIMIT: 0
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node11-DSG0 consistency: OK - row count difference 0.00%
[error] Node21-DSG1 consistency: NOK (FAILED) - row count difference
0.00% in 'AU1' (1 rows, MA: 2339962 <> SL: 2339961)!
[error] Node11-DSG2 consistency: NOK (FAILED) - row count difference
0.00% in 'CPK' (3 rows, MA: 1179644 <> SL: 1179647)!
[error] Node21-DSG3 consistency: NOK (FAILED) - row count difference
0.00% in 'CPK' (1 rows, MA: 1244769 <> SL: 1244768)!
[error] Node21-DSG4 consistency: NOK (FAILED) - row count difference
100.00% in 'CP11' (1 rows, MA: 1 <> SL: 0)!
[info] Summary: slaves checked: 5 -> PASSED: 1, FAILED: 4, UNKNOWN: 0.

Disk fault and replacement

After the group data restore is performed, all stored procedures related to
the application counter process are lost, and therefore must be created
again.

Application counters
CUDB_21 SC_2_2# cudbSystemStatus -c

Execution date: Fri Jul 3 11:22:02 CAT 2015

Checking Clusters status:


dsg node stat memu role
0 11 2 20 Slave
1 11 2 84 Master
2 11 2 85 Slave
3 11 2 85 Master
4 11 2 85 Master
0 21 2 20 Master
1 21 2 81 Slave
2 21 2 85 Master
3 21 2 82 Slave
4 21 2 42 Slave

CUDB_21 SC_2_1# fmactivealarms


Active alarms:
!---------------------------------------------------------------
Module : APPLICATION-COUNTERS
Error Code : 1
Resource Id : 1.3.6.1.4.1.193.169.8.1.4.22.71.69.84.95.82.83.65.83.85.66.83.53.56.95.67.79.85.78.84.69.82.83
Timestamp : Wed Jul 01 17:56:04 CAT 2015
Model Description : Fault retrieving subscriber statistics, Application Counters.
Active Description : Application Counters: fault retrieving subscriber statistics for DS-group #4 in Group counter GET_RSASUBS58_COUNTERS (error application counters trap for DS#4.).
Event Type : 4
Probable Cause : 158
Severity : major
Orig Source IP : 10.202.6.67

CUDB_21 SC_2_1# pmreadcounter | grep VLR*


LOCSUBS; NVLRBARREDSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRKNOWNSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRPURGESUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRRESTSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRUNKNOWNSUBS; 2015-05-02 01:46:17; 0

CUDB_21 SC_2_2# cd /cluster/software/app_counters


CUDB_21 SC_2_2# chmod +x app_counters.pl
CUDB_21 SC_2_2# c

UDC HLR and HSS Applications Counters Installation, version Rev. E 4.0.6

-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Active OAM blade...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading System Hosts info...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading System config...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Active Alarms...
[warn]
* Alarms exist in the node...Are you sure you want to continue ? (y/n)y
* Proceeding...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking the CUDB System Status..
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Please choose a system application counter uninstallation option:
* A. Uninstallation of HLR application counters
* B. Uninstallation of HSS application counters
* C. Uninstallation of All application counters
* X. Exit
* Please enter your choice(a/b/c/x):c
* Proceeding with HLR & HSS application counters uninstallation.
-------------------------------------------------------------------------
---------------------------------------------------------
* Deleting previous counter installation crontab files..
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Deleting previous counter installation files...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading Counters config files..
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* INFORMATION: Creation of the Pm Counter Job file for the active
application counters
* can be handled by either an Ericsson OSS-RC or this program!
* WARNING : Will an Ericsson OSS-RC used to handle the Pm Counter Job
file for application counters ? (y/n)y
* The Pm Counter Jobs file for the active application counters will be
created by OSS-RC! [warn]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reloading Jobs..
* Stopping PmAgent in node 10.22.0.1 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.1 ... OK
* Stopping PmAgent in node 10.22.0.2 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.2 ... OK
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Mastership of PLDB...
[ ok ]
* This node HAS the Mastership of the PLDB !!!
* Proceeding with Master uninstallation...
* Deleting procedures in DS1_0.. [ ok ]
* Deleting procedures in DS1_1.. [ ok ]
* Deleting procedures in DS2_0.. [ ok ]
* Deleting procedures in DS2_1.. [ ok ]
* Deleting procedures in DS3_0.. [ ok ]
* Deleting procedures in DS3_1.. [ ok ]
* Deleting procedures in DS4_0.. [ ok ]
* Deleting procedures in DS4_1.. [ ok ]
* Deleting procedures in PL0.. [ ok ]
* Dropping database cudb_application_counters in PL0.. [ ok ]
* Deleting procedures in PL1.. [ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Clearing Application Counter Alarms, 92 alarms found... Remaining alarms:
-1 [ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
CUDB_21 SC_2_2# ./app_counters.pl -i

UDC HLR and HSS Applications Counters Installation, version Rev. E 4.0.6

-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Active OAM blade...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading System Hosts info...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading System config...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Active Alarms...
[warn]
* Alarms exist in the node...Are you sure you want to continue ? (y/n)y
* Proceeding...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking the CUDB System Status..
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Please choose a system application counter installation option:
* A. Installation of HLR application counters
* B. Installation of HSS application counters
* C. Installation of All application counters
* X. Exit
* Please enter your choice(a/b/c/x):a
* Proceeding with HLR application counter installation.
-------------------------------------------------------------------------
---------------------------------------------------------
* Copying Files...HLR...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Checking Mastership of PLDB...
[ ok ]
* This node HAS the Mastership of the PLDB !!!
* Proceeding with Master installation...
* Creating procedures in DS1_0...
[ ok ]
* Creating procedures in DS1_1...
[ ok ]
* Creating procedures in DS2_0...
[ ok ]
* Creating procedures in DS2_1...
[ ok ]
* Creating procedures in DS3_0...
[ ok ]
* Creating procedures in DS3_1...
[ ok ]
* Creating procedures in DS4_0...
[ ok ]
* Creating procedures in DS4_1...
[ ok ]
* Creating procedures in PL0...
[ ok ]
* Creating tables in PL0..
[ ok ]
* Creating procedures in PL1...
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reading Counters config files..
[ ok ]
-------------------------------------------------------------------------
---------------------------------------------------------
* Please choose an installation option
* A. Installation of HLR counters (Includes all HLR default and regional
counters)
* B. Installation of HLR default counters (Excludes all HLR regional
counters)
* C. Installation of HLR default counters and custom selection of HLR
regional counters
* D. Custom installation of HLR default and regional counters
* X. Exit
* Please enter your choice(a/b/c/d/x):b
-------------------------------------------------------------------------
---------------------------------------------------------
* INFORMATION: Creation of the Pm Counter Job file for the active
application counters
* can be handled by either an Ericsson OSS-RC or this program!
* WARNING : Will an Ericsson OSS-RC used to handle the Pm Counter Job
file for application counters ? (y/n)y
* The Pm Counter Jobs file for the active application counters will be
created by OSS-RC! [warn]
-------------------------------------------------------------------------
---------------------------------------------------------
* Reloading Jobs..
* Stopping PmAgent in node 10.22.0.1 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.1 ... OK
* Stopping PmAgent in node 10.22.0.2 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.2 ... OK
-------------------------------------------------------------------------
---------------------------------------------------------
CUDB_21 SC_2_2#

CUDB_21 SC_2_1# cudbCheckConsistency


[info] cudbCheckConsistency is running with options: '', MAXDIFF_LIMIT:
1.00%
, CHECK_LIMIT: 100
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node11-DSG0 consistency: OK - row count difference 0.00% in
'CUDBMultiServiceConsumer' (3 rows, MA: 10239596 <> SL: 10239593)
[info] Node21-DSG1 consistency: OK - row count difference 0.00% in 'CPK'
(1 rows, MA: 1361644 <> SL: 1361643)
[info] Node11-DSG2 consistency: OK - row count difference 0.00% in 'CPK'
(1 rows, MA: 1316144 <> SL: 1316143)
[info] Node21-DSG3 consistency: OK - row count difference 0.00% in 'AU1'
(1 rows, MA: 2582871 <> SL: 2582870)
[error] Node21-DSG4 consistency: NOK (FAILED) - row count difference
51.82% in 'CPA' (483 rows, MA: 932 <> SL: 449)!
[info] Summary: slaves checked: 5 -> PASSED: 4, FAILED: 1, UNKNOWN: 0.
CUDB_21 SC_2_1# cudbUnitDataBackupAndRestore -d 4 -n 21

CREATE PART
--------------------------------

creating backup on node 11

cudbManageStore stores to process: ds4 (in dsgroup4).

Starting Backup ...


Launching order Backup for ds4 in dsgroup 4.
Obtaining Mgm Information.
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2375 -e "START BACKUP 999 WAIT COMPLETED"
..ok
BACKUP-999 renamed in PL_2_13 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-01-21_15-33
BACKUP-999 renamed in PL_2_14 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-01-21_15-33

Backup finished successfully for store ds4.


Stores where order backup was successfully completed: ds4.
cudbManageStore command successful.

DIR CREATE PART


--------------------------------
creating backup directory /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-
01-21_15-33 on node 21 on blade 10.22.0.13
creating backup directory /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-
01-21_15-33 on node 21 on blade 10.22.0.14

COPY PART
--------------------------------

copying backup files from node 11 blade


10.22.0.13:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-01-21_15-33 to
node 21 blade 10.22.0.13:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-01-
21_15-33
copying backup files from node 11 blade
10.22.0.14:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-01-21_15-33 to
node 21 blade 10.22.0.14:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-01-
21_15-33
RESTORE PART
--------------------------------

Restoring backup on node 21

cudbManageStore stores to process: ds4 (in dsgroup4).

Launching restore order in CUDB Node 21 to store ds4 in dsgroup 4.


Starting restore in CUDB Node 21 for store ds4, backup path
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-01-21_15-33, sql scripts
path /home/cudb/storageEngine/config/schema/ds/internal/restoreTempSql.
Waiting for restore order(s) to be completed in CUDB Node 21 for stores :
ds4.

The consistency check still reports a failure (now in DSG2):

CUDB_21 SC_2_2# cudbCheckConsistency


[info] cudbCheckConsistency is running with options: '', MAXDIFF_LIMIT:
1.00%, CHECK_LIMIT: 100
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node21-DSG0 consistency: OK - row count difference 0.00% in
'IDEN_MSISDN' (11 rows, MA: 7111383 <> SL: 7111394)
[info] Node21-DSG1 consistency: OK - row count difference 0.00% in 'AU1'
(1 rows, MA: 2558493 <> SL: 2558494)
[error] Node21-DSG2 consistency: NOK (FAILED) - row count difference
1.49% in 'CPL' (8 rows, MA: 536 <> SL: 544)!
[info] Node21-DSG3 consistency: OK - row count difference 0.00%
[info] Node21-DSG4 consistency: OK - row count difference 0.00% in 'CPK'
(3 rows, MA: 1396386 <> SL: 1396383)
[info] Summary: slaves checked: 5 -> PASSED: 4, FAILED: 1, UNKNOWN: 0.

Execute the following command to back up the PLDB or a specific DSG cluster
of the active node:
CUDB_11 SC_2_1# cudbManageStore --ds 2 -o backup

cudbManageStore stores to process: ds2 (in dsgroup2).


Starting Backup ...
Launching order Backup for ds2 in dsgroup 2.
Obtaining Mgm Information.
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2373 -e "START BACKUP 999 WAIT COMPLETED"
..ok
BACKUP-999 renamed in PL_2_9 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_14-49
BACKUP-999 renamed in PL_2_10 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_14-49

Backup finished successfully for store ds2.


Stores where order backup was successfully completed: ds2.
cudbManageStore command successful.

Copy the backup files from the PL blades to the cluster file system:


CUDB_11 SC_2_1# scp -r PL_2_9:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_14-49 /cluster/PL_2_9/BACKUP-2016-03-29_14-49
BACKUP-999.3.log
100% 1246KB 1.2MB/s 00:00
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.3.Data
100% 307MB 34.1MB/s 00:09
CUDB_11 SC_2_1# scp -r PL_2_10:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_14-49 /cluster/PL_2_10/BACKUP-2016-03-29_14-49
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.4.Data
100% 307MB 38.4MB/s 00:08
BACKUP-999.4.log
100% 1244KB 1.2MB/s 00:00

Back on the standby (slave) node, copy the backup files from the active node:
CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_9/BACKUP-2016-03-29_14-49
/cluster/PL_2_9/BACKUP-2016-03-29_14-49/
BACKUP-999.3.log
100% 1246KB 1.2MB/s 00:00
BACKUP-999-0.3.Data
100% 307MB 43.8MB/s 00:07
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00s

CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_10/BACKUP-2016-03-29_14-49 /cluster/PL_2_10/BACKUP-2016-03-29_14-49/
BACKUP-999.4.log
100% 1244KB 1.2MB/s 00:00
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.4.Data
100% 307MB 30.7MB/s 00:10

Copy from the cluster file system to the PL blades:

CUDB_21 SC_2_2# scp -r /cluster/PL_2_10/BACKUP-2016-03-29_14-49 PL_2_10:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.4.Data
100% 307MB 61.4MB/s 00:05
BACKUP-999.4.log
100% 1244KB 1.2MB/s 00:00
CUDB_21 SC_2_2# scp -r /cluster/PL_2_9/BACKUP-2016-03-29_14-49
PL_2_9:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.3.log
100% 1246KB 1.2MB/s 00:00
BACKUP-999-0.3.Data
100% 307MB 61.3MB/s 00:05
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00

Performing the Unit Data Restore (takes approximately 30 minutes)


CUDB_21 SC_2_2# cudbManageStore --ds 2 -o restore --location
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-29_14-49

cudbManageStore stores to process: ds2 (in dsgroup2).

Launching restore order in CUDB Node 21 to store ds2 in dsgroup 2.


Starting restore in CUDB Node 21 for store ds2, backup path
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-29_14-49, sql scripts
path /home/cudb/storageEngine/config/schema/ds/internal/restoreTempSql.
Waiting for restore order(s) to be completed in CUDB Node 21 for stores :
ds2.
restore order finished successfully in CUDB Node 21 for store ds2.
restore order(s) completed in CUDB Node 21 for stores : ds2.
Stores where order restore was successfully completed: ds2.
Closing connections for all blades of DSUnitGroup 2.
cudbManageStore command successful.

CUDB_21 SC_2_2# cudbCheckConsistency


[info] cudbCheckConsistency is running with options: '', MAXDIFF_LIMIT:
1.00%, CHECK_LIMIT: 100
[info] Acquiring mastership information.
[info] Checking consistency in slave DS units.
[info] Node21-DSG0 consistency: OK - row count difference 0.00% in
'IDEN_MSISDN' (9 rows, MA: 7111385 <> SL: 7111394)
[info] Node21-DSG1 consistency: OK - row count difference 0.00%
[info] Node21-DSG2 consistency: OK - row count difference 0.00% in 'CPK'
(4 rows, MA: 1386992 <> SL: 1386988)
[info] Node21-DSG3 consistency: OK - row count difference 0.00% in 'AU1'
(1 rows, MA: 2378545 <> SL: 2378544)
[info] Node21-DSG4 consistency: OK - row count difference 0.00% in 'CPK'
(1 rows, MA: 1396218 <> SL: 1396219)
[info] Summary: slaves checked: 5 -> PASSED: 5, FAILED: 0, UNKNOWN: 0.

Storage Engine (PLDB): Synchronization to current master impossible

!---------------------------------------------------------------
Module : STORAGE-ENGINE
Error Code : 1
Resource Id : 1.3.6.1.4.1.193.169.1.1.1
Timestamp : Fri Mar 25 13:25:41 CAT 2016
Model Description : Unable to synchronize cluster, Storage Engine.
Active Description : Storage Engine (PLDB): Synchronization to current master
impossible.
Event Type : 3
Probable Cause : 514
Severity : major
Orig Source IP : 10.202.6.67

Execute the following command to back up the PLDB or a specific DSG cluster
of the node

CUDB_11 SC_2_1# cudbManageStore --pl --order backup

cudbManageStore stores to process: pl.

Starting Backup ...


Launching order Backup for pl in dsgroup 0.
Obtaining Mgm Information.
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2290 -e "START BACKUP 999 WAIT COMPLETED"
..ok
BACKUP-999 renamed in PL_2_3 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15
BACKUP-999 renamed in PL_2_4 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15
BACKUP-999 renamed in PL_2_5 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15
BACKUP-999 renamed in PL_2_6 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15

Backup finished successfully for store pl.


Stores where order backup was successfully completed: pl.
cudbManageStore command successful.

Create the directories if they do not exist:


CUDB_11 SC_2_1# mkdir PL_2_3
CUDB_11 SC_2_1# mkdir PL_2_4
CUDB_11 SC_2_1# mkdir PL_2_5
CUDB_11 SC_2_1# mkdir PL_2_6

Copy the backup files from the PL blades to the cluster file system:

CUDB_11 SC_2_1# scp -r PL_2_3:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-29_16-15 /cluster/PL_2_3/BACKUP-2016-03-29_16-15
BACKUP-999.3.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.3.Data
100% 129MB 64.4MB/s 00:02
BACKUP-999.3.log
100% 4096 4.0KB/s 00:00
CUDB_11 SC_2_1# scp -r PL_2_4:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15 /cluster/PL_2_4/BACKUP-2016-03-29_16-15
BACKUP-999.4.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999.4.log
100% 2048 2.0KB/s 00:00
BACKUP-999-0.4.Data
100% 129MB 64.4MB/s 00:02
CUDB_11 SC_2_1# scp -r PL_2_5:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15 /cluster/PL_2_5/BACKUP-2016-03-29_16-15
BACKUP-999.5.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.5.Data
100% 129MB 64.4MB/s 00:02
BACKUP-999.5.log
100% 2048 2.0KB/s 00:00
CUDB_11 SC_2_1# scp -r PL_2_6:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-29_16-15 /cluster/PL_2_6/BACKUP-2016-03-29_16-15
BACKUP-999.6.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999.6.log
100% 2560 2.5KB/s 00:00
BACKUP-999-0.6.Data
100% 129MB 64.4MB/s 00:02

On the standby (slave) node, create the directories if they do not exist:

CUDB_21 SC_2_2# mkdir PL_2_3


CUDB_21 SC_2_2# mkdir PL_2_4
CUDB_21 SC_2_2# mkdir PL_2_5
CUDB_21 SC_2_2# mkdir PL_2_6
Copy the backup files from the active node to the slave node:
CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_3/BACKUP-2016-03-29_16-15
/cluster/PL_2_3/BACKUP-2016-03-29_16-15/
BACKUP-999.3.log
100% 4096 4.0KB/s 00:00
BACKUP-999-0.3.Data
100% 129MB 64.4MB/s 00:02
BACKUP-999.3.ctl
100% 342KB 342.0KB/s 00:00
CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_4/BACKUP-2016-03-29_16-15
/cluster/PL_2_4/BACKUP-2016-03-29_16-15/
BACKUP-999.4.log
100% 2048 2.0KB/s 00:00
BACKUP-999.4.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.4.Data
100% 129MB 25.8MB/s 00:05
CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_5/BACKUP-2016-03-29_16-15
/cluster/PL_2_5/BACKUP-2016-03-29_16-15/
BACKUP-999-0.5.Data
100% 129MB 42.9MB/s 00:03
BACKUP-999.5.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999.5.log
100% 2048 2.0KB/s 00:00
CUDB_21 SC_2_2# scp -r 10.201.6.67:/cluster/PL_2_6/BACKUP-2016-03-29_16-15
/cluster/PL_2_6/BACKUP-2016-03-29_16-15/
BACKUP-999.6.log
100% 2560 2.5KB/s 00:00
BACKUP-999.6.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.6.Data
100% 129MB 32.2MB/s 00:04

Copy from the cluster file system to the PL blades:

CUDB_21 SC_2_2# scp -r /cluster/PL_2_3/BACKUP-2016-03-29_16-15 PL_2_3:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.3.log
100% 4096 4.0KB/s 00:00
BACKUP-999-0.3.Data
100% 129MB 64.4MB/s 00:02
BACKUP-999.3.ctl
100% 342KB 342.0KB/s 00:00
CUDB_21 SC_2_2# scp -r /cluster/PL_2_4/BACKUP-2016-03-29_16-15
PL_2_4:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.4.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.4.Data
100% 129MB 64.4MB/s 00:02
BACKUP-999.4.log
100% 2048 2.0KB/s 00:00
CUDB_21 SC_2_2# scp -r /cluster/PL_2_5/BACKUP-2016-03-29_16-15
PL_2_5:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.5.log
100% 2048 2.0KB/s 00:00
BACKUP-999-0.5.Data
100% 129MB 128.7MB/s 00:01
BACKUP-999.5.ctl
100% 342KB 342.0KB/s 00:01
CUDB_21 SC_2_2# scp -r /cluster/PL_2_6/BACKUP-2016-03-29_16-15
PL_2_6:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.6.log
100% 2560 2.5KB/s 00:00
BACKUP-999.6.ctl
100% 342KB 342.0KB/s 00:00
BACKUP-999-0.6.Data
100% 129MB 64.4MB/s 00:02

The restore takes approximately 20 minutes.


CUDB_21 SC_2_2# cudbManageStore --pl -o restore --location
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-29_16-15

cudbManageStore stores to process: pl.

Launching restore order in CUDB Node 21 to store pl in dsgroup 0.


Starting restore in CUDB Node 21 for store pl, backup path
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-29_16-15, sql scripts
path /home/cudb/storageEngine/config/schema/pl/internal/restoreTempSql.
Keeping LDAPCounters stopped until restore have finished ....
Waiting for restore order(s) to be completed in CUDB Node 21 for stores :
pl.
restore order finished successfully in CUDB Node 21 for store pl.
Resuming LDAPCounters.
Repopulating system monitor tables with data from configuration model (can
take one minute).
restore order(s) completed in CUDB Node 21 for stores : pl.
Stores where order restore was successfully completed: pl.
Closing connections for all blades of DSUnitGroup 0.
cudbManageStore command successful.

Checking MySQL server connection:


[-W-] MySQL Master Server connection Fault in....: DS3_0
[-W-] MySQL Slave Server connection Fault in.....: DS3_0
[-W-] MySQL Access Server connection Fault in....: DS3_0
CUDB_11 SC_2_2# cudbManageStore --ds 3 -o backup

cudbManageStore stores to process: ds3 (in dsgroup3).

Starting Backup ...


Launching order Backup for ds3 in dsgroup 3.
Obtaining Mgm Information.
BACKUP-999 renamed in PL_2_11 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
999_OLD_2016-03-30_09-23
BACKUP-999 renamed in PL_2_12 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
999_OLD_2016-03-30_09-23
Trying backup on mgmt access 1, wait a moment ...
ndb_mgm 10.22.0.1 2374 -e "START BACKUP 999 WAIT COMPLETED"
..ok
BACKUP-999 renamed in PL_2_11 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-30_09-23
BACKUP-999 renamed in PL_2_12 to /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-30_09-23

Backup finished successfully for store ds3.


Stores where order backup was successfully completed: ds3.
cudbManageStore command successful.

CUDB_11 SC_2_2# scp -r PL_2_11:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-30_09-23 /cluster/PL_2_11/BACKUP-2016-03-30_09-23
BACKUP-999.3.log
100% 1396KB 1.4MB/s 00:00
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.3.Data
100% 316MB 39.5MB/s 00:08
CUDB_11 SC_2_2# scp -r PL_2_12:/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-
2016-03-30_09-23 /cluster/PL_2_12/BACKUP-2016-03-30_09-23
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999.4.log
100% 1392KB 1.4MB/s 00:00
BACKUP-999-0.4.Data
100% 316MB 28.8MB/s 00:11
On the standby (slave) node:
CUDB_21 SC_2_1# scp -r 10.201.6.67:/cluster/PL_2_11/BACKUP-2016-03-30_09-23
/cluster/PL_2_11/BACKUP-2016-03-30_09-23/
BACKUP-999.3.log
100% 1396KB 1.4MB/s 00:00
BACKUP-999-0.3.Data
100% 316MB 45.1MB/s 00:07
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00
CUDB_21 SC_2_1# scp -r 10.201.6.67:/cluster/PL_2_12/BACKUP-2016-03-30_09-23
/cluster/PL_2_12/BACKUP-2016-03-30_09-23/
BACKUP-999.4.log
100% 1392KB 1.4MB/s 00:01
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.4.Data
100% 316MB 24.3MB/s 00:13
CUDB_21 SC_2_1# scp -r /cluster/PL_2_11/BACKUP-2016-03-30_09-23
PL_2_11:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.3.log
100% 1396KB 1.4MB/s 00:00
BACKUP-999-0.3.Data
100% 316MB 63.1MB/s 00:05
BACKUP-999.3.ctl
100% 204KB 204.0KB/s 00:00
CUDB_21 SC_2_1# scp -r /cluster/PL_2_12/BACKUP-2016-03-30_09-23
PL_2_12:/local/cudb/mysql/ndbd/backup/BACKUP
BACKUP-999.4.ctl
100% 204KB 204.0KB/s 00:00
BACKUP-999-0.4.Data
100% 316MB 63.3MB/s 00:05
BACKUP-999.4.log
100% 1392KB 1.4MB/s 00:00

CUDB_21 SC_2_1# cudbManageStore --ds 3 -o restore --location /local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-30_09-23

cudbManageStore stores to process: ds3 (in dsgroup3).

Launching restore order in CUDB Node 21 to store ds3 in dsgroup 3.


Starting restore in CUDB Node 21 for store ds3, backup path
/local/cudb/mysql/ndbd/backup/BACKUP/BACKUP-2016-03-30_09-23, sql scripts
path /home/cudb/storageEngine/config/schema/ds/internal/restoreTempSql.
Waiting for restore order(s) to be completed in CUDB Node 21 for stores :
ds3.
restore order finished successfully in CUDB Node 21 for store ds3.
restore order(s) completed in CUDB Node 21 for stores : ds3.
Stores where order restore was successfully completed: ds3.
Closing connections for all blades of DSUnitGroup 3.
cudbManageStore command successful.
Checking MySQL server connection:
MySQL Master Servers connection ..............OK
MySQL Slave Servers connection ...............OK
MySQL Access Servers connection ..............OK
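The DSG2 and DSG3 transcripts above repeat the same manual pattern: back up the DS group on the healthy node, stage the files under /cluster, copy them to the peer node, push them to the local blades, and run the restore. The lines below are a minimal shell sketch of that flow; the DSG number, blade pair, backup name and active-node VIP are example values taken from the transcripts and must be adapted to the actual fault. It is an outline only, not a replacement for the step-by-step checks above.

# --- on the active (healthy) node ---
DSG=2                                    # DS group to repair (example)
BLADES="PL_2_9 PL_2_10"                  # blade pair hosting this DSG (example)
BACKUP=BACKUP-2016-03-29_14-49           # name printed by cudbManageStore (example)
cudbManageStore --ds $DSG -o backup
for b in $BLADES; do
    mkdir -p /cluster/$b
    scp -r $b:/local/cudb/mysql/ndbd/backup/BACKUP/$BACKUP /cluster/$b/$BACKUP
done

# --- on the node being repaired (set the same DSG/BLADES/BACKUP values here) ---
ACTIVE_VIP=10.201.6.67                   # OAM VIP of the healthy node (example)
for b in $BLADES; do
    mkdir -p /cluster/$b
    scp -r $ACTIVE_VIP:/cluster/$b/$BACKUP /cluster/$b/$BACKUP
    scp -r /cluster/$b/$BACKUP $b:/local/cudb/mysql/ndbd/backup/BACKUP
done
cudbManageStore --ds $DSG -o restore --location /local/cudb/mysql/ndbd/backup/BACKUP/$BACKUP
cudbCheckConsistency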

Application Counters: fault retrieving subscriber statistics for DS-group


Module : APPLICATION-COUNTERS
Error Code : 1
Resource Id :
1.3.6.1.4.1.193.169.8.1.3.22.71.69.84.95.82.83.65.83.85.66.83.49.48.95.67.79.
85.78.84.69.82.83
Timestamp : Wed Mar 30 10:32:21 CAT 2016
Model Description : Fault retrieving subscriber statistics, Application
Counters.
Active Description : Application Counters: fault retrieving subscriber
statistics for DS-group #3 in Group counter GET_RSASUBS10_COUNTERS (error
application counters trap for DS#3.).
Event Type : 4
Probable Cause : 158
Severity : major
Orig Source IP : 10.201.6.67

CUDB_21 SC_2_1# pmreadcounter | grep VLR*


LOCSUBS; NVLRBARREDSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRKNOWNSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRPURGESUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRRESTSUBS; 2015-05-02 01:46:17; 0
LOCSUBS; NVLRUNKNOWNSUBS; 2015-05-02 01:46:17; 0

CUDB_21 SC_2_2# cd /cluster/software/app_counters


CUDB_21 SC_2_2# chmod +x app_counters.pl
CUDB_21 SC_2_1# ./app_counters.pl -u

UDC HLR and HSS Applications Counters Installation, version Rev. E 4.0.6

-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Active OAM blade...
[fail]
NOT Active OAM Blade...Execute the script from the Active OAM blade...Exiting
CUDB_21 SC_2_1# ssh sc_2_2
Last login: Wed Mar 30 10:11:24 2016 from 10.100.32.199
CUDB_21 SC_2_2# ./app_counters.pl -u

UDC HLR and HSS Applications Counters Installation, version Rev. E 4.0.6

-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Active OAM blade...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading System Hosts info...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading System config...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Active Alarms...
[warn]
* Alarms exist in the node...Are you sure you want to continue ? (y/n)y
* Proceeding...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking the CUDB System Status..\
Please choose a system application counter uninstallation option:
* A. Uninstallation of HLR application counters
* B. Uninstallation of HSS application counters
* C. Uninstallation of All application counters
* X. Exit
* Please enter your choice(a/b/c/x):c
* Proceeding with HLR & HSS application counters uninstallation.
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Deleting previous counter installation crontab files..
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Deleting previous counter installation files...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading Counters config files..
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* INFORMATION: Creation of the Pm Counter Job file for the active
application counters
* can be handled by either an Ericsson OSS-RC or this program!
* WARNING : Will an Ericsson OSS-RC used to handle the Pm Counter Job
file for application counters ? (y/n)y
* The Pm Counter Jobs file for the active application counters will be
created by OSS-RC!
[warn]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reloading Jobs..
* Stopping PmAgent in node 10.22.0.1 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.1 ... OK
* Stopping PmAgent in node 10.22.0.2 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.2 ... OK
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Mastership of PLDB...
[ ok ]
* This node DOES NOT have the Mastership of the PLDB !!!
* Proceeding with Slave uninstallation...
* Deleting procedures in DS1_0..[ ok ]
* Deleting procedures in DS1_1. [ ok ]
* Deleting procedures in DS2_0..[ ok ]
* Deleting procedures in DS2_1..[ ok ]
* Deleting procedures in DS3_0..[ ok ]
* Deleting procedures in DS3_1..[ ok ]
* Deleting procedures in DS4_0..[ ok ]
* Deleting procedures in DS4_1..[ ok ]
* Deleting procedures in PL0… .[ ok ]
* Deleting procedures in PL1. .[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Clearing Application Counter Alarms, 0 alarms found
[ ok ]
------------------------------------------------------------------------
CUDB_21 SC_2_2# ./app_counters.pl -i

UDC HLR and HSS Applications Counters Installation, version Rev. E 4.0.6

-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Active OAM blade...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading System Hosts info...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading System config...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Active Alarms...
[warn]
* Alarms exist in the node...Are you sure you want to continue ? (y/n)y
* Proceeding...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking the CUDB System Status..
[ ok ] -------------------------------------------------------------------
-----------------------------------------------------------------------------
-----
* Please choose a system application counter installation option:
* A. Installation of HLR application counters
* B. Installation of HSS application counters
* C. Installation of All application counters
* X. Exit
* Please enter your choice(a/b/c/x):a
* Proceeding with HLR application counter installation.
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Copying Files...HLR...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Checking Mastership of PLDB...
[ ok ]
* This node DOES NOT have the Mastership of the PLDB !!!
* Proceeding with Slave installation...
* Creating procedures in DS1_0...
[ ok ]
* Creating procedures in DS1_1...
[ ok ]
* Creating procedures in DS2_0...
[ ok ]
* Creating procedures in DS2_1...
[ ok ]
* Creating procedures in DS3_0...
[ ok ]
* Creating procedures in DS3_1...
[ ok ]
* Creating procedures in DS4_0...
[ ok ]
* Creating procedures in DS4_1...
[ ok ]
* Creating procedures in PL0...
[ ok ]
* Creating procedures in PL1...
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reading Counters config files..
[ ok ]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Please choose an installation option
* A. Installation of HLR counters (Includes all HLR default and regional
counters)
* B. Installation of HLR default counters (Excludes all HLR regional
counters)
* C. Installation of HLR default counters and custom selection of HLR
regional counters
* D. Custom installation of HLR default and regional counters
* X. Exit
* Please enter your choice(a/b/c/d/x):b
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* INFORMATION: Creation of the Pm Counter Job file for the active
application counters
* can be handled by either an Ericsson OSS-RC or this program!
* WARNING : Will an Ericsson OSS-RC used to handle the Pm Counter Job
file for application counters ? (y/n)y
* The Pm Counter Jobs file for the active application counters will be
created by OSS-RC!
[warn]
-------------------------------------------------------------------------
----------------------------------------------------------------------------
* Reloading Jobs..
* Stopping PmAgent in node 10.22.0.1 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.1 ... OK
* Stopping PmAgent in node 10.22.0.2 ... OK
* ESA PmAgent has been successfully stopped.
* Starting PmAgent in node 10.22.0.2 ... OK

CUDB_21 SC_2_2# ssh 10.22.0.5


ssh_exchange_identification: Connection closed by remote host
CUDB_21 SC_2_2# ssh 10.22.0.5
Go to expert mode to lock and unlock the PL:
CUDB_21 SC_2_2# ssh -p 2024 expert@10.202.4.150
expert@10.202.4.150's password:expert

expert@blade_0_0 12:50:01> show ManagedElement 1 DmxFunctions 1


BladeGroupManagement 1 Group CUDB ShelfSlot Blade 1 firstMacAddress
SHELF
SLOT BLADE FIRST MAC
ID ID ADDRESS
---------------------------------
0-1 1 34:07:fb:ed:2f:d7
0-3 1 34:07:fb:ed:2f:5f
0-5 1 34:07:fb:ed:2f:2f
0-7 1 34:07:fb:ed:28:27
0-9 1 34:07:fb:ed:2f:9b
0-11 1 34:07:fb:ed:31:03
0-13 1 34:07:fb:ed:2f:17
0-15 1 34:07:fb:ed:30:43
0-17 1 34:07:fb:ed:30:97
0-19 1 34:07:fb:ed:30:d3
0-21 1 34:07:fb:f1:77:1f
0-23 1 34:07:fb:ed:2f:47
1-1 1 34:07:fb:ed:28:b7
1-3 1 34:07:fb:ed:29:cb

expert@blade_0_0 12:53:00% show ManagedElement 1 DmxFunctions 1


BladeGroupManagement 1 Group CUDB ShelfSlot Blade 1
ShelfSlot 0-1 {
Blade 1 {
userLabel SC-1;
administrativeState unlocked;
width 2;
ipmiOffset 0;
baseOffset [ 0 ];
dataOffset [ 0 ];
}
}
ShelfSlot 0-3 {
Blade 1 {
userLabel SC-2;
administrativeState unlocked;
width 2;
ipmiOffset 0;
baseOffset [ 0 ];
dataOffset [ 0 ];
}
}
ShelfSlot 0-5 {
Blade 1 {
userLabel PL-3;
administrativeState unlocked;
width 2;
ipmiOffset 0;
baseOffset [ 0 ];
dataOffset [ 0 ];
}
}
ShelfSlot 0-7 {
Blade 1 {
userLabel PL-4;
administrativeState unlocked;
width 2;
ipmiOffset 0;
baseOffset [ 0 ];
dataOffset [ 0 ];
}
}
ShelfSlot 0-9 {
Blade 1 {
userLabel PL-5;
administrativeState locked;
width 2;
ipmiOffset 0;
baseOffset [ 0 ];
dataOffset [ 0 ];

[ok][2017-08-14 12:50:30]
expert@blade_0_0 12:50:30> conf
Entering configuration mode private
[ok][2017-08-14 12:52:14]

[edit]
expert@blade_0_0 12:52:14% set ManagedElement 1 Equipment 1 Shelf 0 Slot 9
Blade 1 administrativeState locked
[ok][2017-08-14 12:52:53]

[edit]
expert@blade_0_0 12:52:53% commit
Commit complete.
[ok][2017-08-14 12:53:00]

[edit]
[edit]
expert@blade_0_0 12:57:09% set ManagedElement 1 Equipment 1 Shelf 0 Slot 9
Blade 1 administrativeState unlocked
[ok][2017-08-14 12:57:31]

[edit]
expert@blade_0_0 12:57:31% commit
Commit complete.
[ok][2017-08-14 12:57:36]

Exit from expert mode and reboot the PL:


CUDB_21 SC_2_2# cluster reboot -n 5
Rebooting node 5 (PL_2_5)
Succeeded to execute /sbin/reboot on 5 (PL_2_5)

ReadinessState=OUT-OF-SERVICE(1)

Lock and unlock the PL


CUDB_21 SC_2_2# cudbHaState
safSu=PL-3,safSg=NoRed,safApp=ERIC-
CoreMW|AdminState=UNLOCKED(1)|OperState=ENABLED(1)|PresenceState=UNINSTANTIAT
ED(1)|ReadinessState=OUT-OF-SERVICE(1)
CUDB_21 SC_2_2# cmw-node-lock PL_2_3
CUDB_21 SC_2_2# cudbSystemStatus
…………………………………………………………………
PLs................................
Storage Engine process (ndbd).................Running
LDAP FE.......................................Running
KeepAlive process.............................Running
MySQL server process (Master).................Running
MySQL server process (Slave)..................Running
MySQL server process (Access).................Running
[-W-] CudbNotifications process.....................Not running in: PL0
LDAP FE Monitor process.......................Running

CUDB_21 SC_2_2# cmw-node-unlock PL_2_3


CUDB_21 SC_2_2# cudbSystemStatus

PLs................................
Storage Engine process (ndbd).................Running
LDAP FE.......................................Running
KeepAlive process.............................Running
MySQL server process (Master).................Running
MySQL server process (Slave)..................Running
MySQL server process (Access).................Running
CudbNotifications process.....................Running
LDAP FE Monitor process.......................Running

consistency: NOK (UNKNOWN)


[error] Node21-DSG1 consistency: NOK (UNKNOWN) - could not get data from
slave DS unit.
Run the following command on both nodes:
CUDB_11 SC_2_2# /etc/init.d/cudbCheckConsistencySrv restart

cudbDsgMastershipChange

This command moves the master of one DSG to a selected node. It must be
launched on the CUDB node that is to hold the new master replica. The command
works for both 1+1 and 1+1+1 configurations. The mastership can be changed if
the standby replica is in state S1.
CUDB_21 SC_2_1# cudbDsgMastershipChange -d --dsg 3
Processing DSG mastership change...
Replication checking...
Replication in DSG3(Node=21--Chan=1).... OK -- Delay = 0
Warning: This step could stuck forever if the allowed synchronization delay
between the current master replica and the future master replica is never
reached. If you experience that the command takes too long to finish you can
stop the command with CTRL+C. You would have more chances to finish the
command execution by slowling down the traffic in the master or using the
option --time with a higher value.
\ Distance to master is 200. Proceeding...

Putting the master in maintenance...


Waiting for mastership switchover...
Putting the original master back to ready mode...

Master replica on node: 11


Distance to master replica: 0 seconds
cudbDsgMastershipChange: Success.
DSG 3 master replica moved from node 11 to 21 successfully
CUDB_21 SC_2_1# cudbSystemStatus -R

Execution date: Wed Mar 30 16:46:37 CAT 2016

Checking Replication Channels in the System:


Node | 11 | 21
====================
PLDB ___|__M__|__S2_
DSG 1 __|__S2_|__M__
DSG 2 __|__M__|__S2_
DSG 3 __|__S1_|__M__
DSG 4 __|__M__|__S2_

Software backup
CUDB_11 SC_2_1# cudbSwBackup -i

CUDB SW Backups are stored in two directories:


/cluster/home/cudb/swbackup and
/cluster/storage/no-backup
The backup files are:
- Two tar backup files
- One sql backup file

CUDB_11 SC_2_1# cudbSwBackup -l

CUDB SW backups
---------------
atp_to10_002_swbackup
backup_cudb_11_03-07-2014
bkupg_13BR1K_CUDB_13B_FD1_IRC2
cudbSWBackup_180916
cudb_sw_backup_20_05_14

CUDB_11 SC_2_1# cd /cluster/home/cudb/swbackup


CUDB_11 SC_2_1# ls -lt
total 1900
-rw-r--r-- 1 root root 401772 Sep 18 21:45 cudbSWBackup_180916.tar
-rw-r--r-- 1 root root 4257 Sep 18 21:45 cudbSWBackup_180916-
cudbSmpConfig.sql
-rw-r--r-- 1 root root 4257 Sep 11 2014 atp_to10_002_swbackup-
cudbSmpConfig.sql
-rw-r--r-- 1 root root 401621 Sep 11 2014 atp_to10_002_swbackup.tar
-rw-r--r-- 1 root root 400607 Aug 28 2014 bkupg_13BR1K_CUDB_13B_FD1_IRC2.tar
-rw-r--r-- 1 root root 4257 Aug 28 2014 bkupg_13BR1K_CUDB_13B_FD1_IRC2-
cudbSmpConfig.sql
-rw-r--r-- 1 root root 400602 Jul 3 2014 backup_cudb_11_03-07-2014.tar
-rw-r--r-- 1 root root 4257 Jul 3 2014 backup_cudb_11_03-07-2014-
cudbSmpConfig.sql
-rw-r--r-- 1 root root 269761 May 20 2014 cudb_sw_backup_20_05_14.tar
-rw-r--r-- 1 root root 4257 May 20 2014 cudb_sw_backup_20_05_14-
cudbSmpConfig.sql
CUDB_11 SC_2_1# cudbT

Take all masters when one of the CUDB nodes is out of service


cudbTakeAllMasters cudbTpsStat
CUDB_11 SC_2_1# cudbTakeAllMasters -h
===================================================
USE of cudbTakeAllmasters
===================================================
Format : cudbTakeAllmasters [-h | --help]

Where:

-h | --help Shows command usage and help

Important commands from the training


!Pages 157-158 of the CUDB 13B Operation & Service Configuration book (commands)
!Pages 168-198 of the CUDB 13B Operation & Service Configuration book (backups)
!Pages 191-191 of the CUDB 13B Operation & Service Configuration book (see the command for the automatic backup)

ALEX ALLEX en_lzn7020327_r3l

CUDB_21 SC_2_1# esa status

Print the CUDB software version:

CUDB_21 SC_2_1# cudbSwVersionCheck

CUDB_21 SC_2_1# cudbGetLogs

Print alarms:
CUDB_21 SC_2_1# fmactivealarms

Complete backup, run manually:

CUDB_21 SC_2_1# cudbDataBackup

Automatic backup

CUDB_21 SC_2_1# cudbDataBackup and restore -c (automatically strips/unzips the file)

Or

CUDB_21 SC_2_1# cudbDataBackup -q

List the backups:
CUDB_21 SC_2_1# cd /home/cudb/systemDataBackup
CUDB_21 SC_2_1# dir

To remove the backups:


CUDB_21 SC_2_1# rm BACKUP-2015-09-29_09-56-11.2.4.tar

For file transfers we use WinSCP.

For the file protocol we have to use SFTP.

In case of provisioning failure, see:

Paragem de Provisioning.txt

Schedule the automatic backup (this command can only be run on the active node).

Example:
CUDB_21 SC_2_1# cudbHaState

COM state:
----------
COM is assigned as ACTIVE in controller SC-1
COM is assigned as STANDBY in controller SC-2

Configure the crontab on SC-1, the ACTIVE controller:


CUDB_21 SC_2_1# crontab -e
!this opens the current crontab; amend it with the entry below, then save and

quit the editor

0 2 * * * cudbDataBackup -q
0 - minute
2 - hour
So the automatic backup is scheduled for 02:00.

Save the crontab (crontab.txt).
To print it:
CUDB_21 SC_2_1# crontab -l
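As a concrete illustration, the crontab entry added via crontab -e on the active SC could look like the line below; the 02:00 schedule and the -q option come from the notes above, while the log file path is only a suggested example.

# field order: minute hour day-of-month month day-of-week command
0 2 * * * cudbDataBackup -q >> /home/cudb/systemDataBackup/cron_backup.log 2>&1
# NOTE: cron runs with a minimal PATH; if the command is not found, use its
# full installation path (compare the OAM paths shown by crontab -l further below).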

Automatic backup restore

CUDB_21 SC_2_1# cudbDataBackup and restore -r (automatic restore)

When DS1 (node 11) is down, restore its data from DS1 (node 21) to (node 11):

CUDB_21 SC_2_1# cudbUnitDataBackupAndRestore


-n (node id)
-p (PLDB)
-d (DS group)
Example: CUDB_21 SC_2_1# cudbUnitDataBackupAndRestore -n 21 -d 2 (DSG2)
Extraction of statistics from the CUDB
Number of HLR and AUC subscribers

CUDB_Commands.txt

LOGS IN THE CUDB

CUDB logs: see Dump.txt

Check CLOCK REFERENCE


cat /cluster/etc/cluster.conf | grep ntp
cat /cluster/etc/cluster.conf | grep timezone
ntpq -c peers

reference clock.txt
CUDB Health Check

1. Check the CUDB node status: SC_2_1# cudbSystemStatus

2. Check that replication between CUDB nodes is working properly: SC_2_1# cudbCheckReplication
It performs a deeper check of the replication status than the cudbSystemStatus command above.
NOTE: Press Enter if the command result does not appear after a while.

3. Check the cluster state: SC_2_1# cudbHaState
It shows the cluster status in more detail than the cudbSystemStatus command above.

4. Check consistency between the master and slave copies of the DSGs: SC_2_1# cudbCheckConsistency

5. Check the CPU load of the CUDB blades (shows the percentage of CPU load used): SC_2_1# cudbMpstat
Note: Press CONTROL-C to exit.
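The first four checks above can be chained into one non-interactive pass and saved for later review; this is only a convenience sketch (the log path is an example), and cudbMpstat is left out because it runs until CONTROL-C is pressed.

# Minimal daily health-check sketch; run on the active SC.
LOG=/home/cudb/monitoring/healthcheck_$(date +%Y%m%d_%H%M).log   # example path
{
    echo "=== cudbSystemStatus ===";     cudbSystemStatus
    echo "=== cudbCheckReplication ==="; cudbCheckReplication
    echo "=== cudbHaState ===";          cudbHaState
    echo "=== cudbCheckConsistency ==="; cudbCheckConsistency
} > $LOG 2>&1
echo "Health check written to $LOG"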

6. Dump CUDB

Dump.txt

7. Back up CUDB per DS

Backups_CUDB_03072015.txt

___________________________________________________________________________

8. Find out in which DS a subscriber is located

saber o cliente em DS esta.txt

9. Check the status of the DSs

CUDB_21 SC_2_1# cudbManageStore --ds all --order status

___________________________________________________________________________

10. Print the backups

CUDB_21 SC_2_2# cudbSwBackup --print

CUDB_21 SC_2_2# cudbSwBackup --list

______________________________________________________________________________

11. DATABASE LOCKED FOR BACKUP

DATABASE LOCKED FOR BACKUP.txt

12. DUMP AUC_HLR and only HLR

Dump AUC_HLR and only HLR.txt

Login: root
Psw: ericsson

CUDB_21 SC_2_2# cudbSystemStatus -h

-a | --alarms Print Alarms


-s | --sm-status Print System Monitor status
-c | --cluster-status Old Cluster status for ERICSSON personnel only
-C | --new-cluster-status Print Cluster status
-m | --check-mysql Check MySql server connections
-p | --check-cudbprocess Check Cudb process
-b | --bc-status Print the status of the BC process
-r | --replication-status Old Replication status for ERICSSON personnel only
-R | --new-replication-status Print Replication status
-v | --version Print CUDB version
-h | --help Shows command usage and help
Legend for -R (replication status):
'M' : Master
'Sn' : Slave - replication channel #n is active
'S?' : Slave - replication status is unknown
'[S]' : Slave - replication channels down
'Xm' : Wrong state - masterless
'Xu' : Wrong state - unreachable

NOTE: if there is no option supplied, -v -s -b -C -R -a -m -p options are supposed.


CUDB_21 SC_2_2# cudbSystemStatus <option> (one of -v -s -b -C -R -a -m -p)
Example

CUDB_21 SC_2_2# cudbSystemStatus -b


!check the data inconsistency (in %)
CUDB_21 SC_2_2# cudbCheckConsistency
!check whether someone attempted a failed login; it shows the IP address that failed

CUDB_21 SC_2_1# fmactivealarms


CUDB_21 SC_2_2# fmsendmessage -h
DESCRIPTION:
The command triggers the sending of an alarm or event.
COMMAND FORMAT:
fmsendmessage <action> <moduleId> <errorCode> <resourceId> ["<activeDescr>"] [<sourceIp>]
ARGUMENTS:
action
The type of action to trigger.
-r: Trigger 'raise alarm' message
-c: Trigger 'clear alarm' message
-e: Trigger 'event' message
moduleId
The module identity of the alarm to trigger.
errorCode
The error code of the alarm to trigger.
resourceId
The resource identity of the alarming object.
activeDescr
The text to send in the alarm message.
sourceIp
The IP address to use as the alarm originating source.

CUDB_21 SC_2_1# cudbGetLogs


CUDB_21 SC_2_2# cudbGetLogs

Not started on the ACTIVE SC, exiting. Log in to the other SC, and start log collection there.
cron crontab

CUDB_21 SC_2_2# crontab -h


crontab: invalid option -- 'h'
crontab: usage error: unrecognized option
usage: crontab [-u user] file
crontab [-u user] [ -e | -l | -r ]
(default operation is replace, per 1003.2)
-e (edit user's crontab)
-l (list user's crontab)
-r (delete user's crontab)

CUDB_21 SC_2_2# crontab -l


# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (/var/spool/cron/tabs/root installed on Wed Sep 3 10:58:03 2014)
# (Cron version V5.0 -- $Id: crontab.c,v 1.12 2004/01/23 18:56:42 vixie Exp $)
25 0,12 * * * /bin/bash /opt/ericsson/cudb/OAM/bin/cudbGetLogs
50 0,12 * * * /bin/bash /opt/ericsson/cudb/OAM/bin/cudbAnalyser --auto-check --send-alarm --save-counter >
/home/cudb/monitoring/preventiveMaintenance/cron_analysis.SC_2_2.log
37 0 * * * /bin/bash /opt/ericsson/cudb/Monitors/bin/cudbCheckConsistency --locked --alarms >/dev/null 2>&1 || true
7 0 * * * /bin/bash /opt/ericsson/cudb/Monitors/bin/cudbCheckReplication --locked --alarms >/dev/null 2>&1 || true
0,15,30,45 * * * * /home/cudb/oam/performanceMgmt/appCounters/scripts/appCounters.cron >> /dev/null
*/1 * * * * /opt/ericsson/cudb/Monitors/keepAlive/bin/keepAlive_monitor.sh >/dev/null 2>&1

CUDB_21 SC_2_2# cudbAnalyser -h


Usage:
The purpose of this script is to run a log analysis on the logs collected by cudbGetLogs.sh
cudbAnalyser [-h|--help] [-l <logfile> | --logfile <logfile>] [-p | --previous-logfile <previous-logfile>] [-s | --save-counter]
Examples:
cudbAnalyser --logfile /home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108190300.log --previous-logfile
/home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108160300.log
cudbAnalyser -l /home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108190300.log -p
/home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108160300.log --save-counter
cudbAnalyser -l /home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108190300.log -p
/home/cudb/monitoring/preventiveMaintenance/CUDB_28_201108160300.log --send-alarm
cudbAnalyser --auto-check --save-counter
cudbAnalyser -h
cudbAnalyser -help
Options:
-h | --help Shows command usage and help
-l <logfile> | --logfile <logfile> points to the newer logfile
-p <logfile> | --previous-logfile <logfile> points to an older logfile
NOTE:the timestamp of <logfile> has to be newer than <previous-logfile> in order to have a correct diff analysis.
-s | --save-counter if set, save counter under
/home/cudb/monitoring/preventiveMaintenance/cudb_analyser_error_counter.log
that show the number of failing checks. Needed for ESA integration

-a | --auto-check if set, you don't have to use --logfile and --previous-logfile switches, as the script will
check the last two log files automatically.
NOTE: --auto-check works on the CUDB node only

-q | --quiet if set, then error printouts are less verbose

-d | --debug if set, then debug printouts are produced
-S | --send-alarm if set, then an alarm is also sent to ESA, based on the results

-w | --weight-threshold <threshold> print errors with WEIGHT >= weight-threshold (default:2). Setting this value to 0
will print more errors,
which might help a detailed system audit.

CUDB_21 SC_2_2# cluster alarm -h


usage: alarm <options>
options:
-l, --list Print alarm status
-s, --status Print alarm status (same as -l)
--short Print short human readable format (default)
--full Print full human readable format
--machine Print full machine readable format
-n, --node <node id> Select node
-a, --all All nodes

CUDB_21 SC_2_2# cudbTpsStat


CUDB_21 SC_2_2# cudbMpstat

!Log in to another PL/DS/CUDB blade

CUDB_11 SC_2_2# ssh pl_2_14


CUDB_11 SC_2_2# nodessh 21
CUDB_21 SC_2_1# ssh ds4_1

!Back panel (blade roles)

SQL / Linux system / node cluster
Blade Server: Storage Engine Node
OAM Management Server: SC 2-1, SC 2-2
PL (0,1,2,3) DSG, SQL/node cluster: PL 2-3, PL 2-4, PL 2-5, PL 2-6
DS 0, DS 1 (DSG 1,2), node cluster: PL 2-7, PL 2-8 (payload)
Console access:

The 1st subrack: PG;
The 2nd subrack: CUDB, but it only has DS 4;
The 3rd subrack: CUDB, blades 0_5, 0_7, 0_9, 0_11 (2x PLD);
blades 0_13 & 0_15, 0_17 & 0_19, 0_21 & 0_23 (3x DS);

! each PLD or DS uses two (2) blades because of synchronization


! the console connection is made via the "CONSO" port on the blade

DMX - the connection is made with the console cable on the first CMXB port

psw : Tre,14
!this command shows the active DMX
@dmxr
Psw: expert

!QUESTIONS

- How to perform the CUDB backup and collect it
- How to perform the subscriber dump and collect it
- How to collect the maximum set of alarms for the K: drive report

#login:

#CUDB 11:
#VIP:
ssh root@10.201.6.67 / ericsson

#SYSOAM SC_2_1/SC_2_2:
ssh root@10.201.4.148 /ericsson
ssh root@10.201.4.149 /ericsson

#CUDB 21:
#VIP:
ssh root@10.202.6.67 / ericsson

#SYSOAM SC_2_1/SC_2_2:
ssh root@10.202.4.148 /ericsson
ssh root@10.202.4.149 /ericsson

#Login to active DMX (from any SC):

#CUDB11:
ssh expert@10.201.4.150 -p2024 -c 3des-cbc
#pwd:
expert
#CUDB21:
ssh expert@10.202.4.150 -p2024 -c 3des-cbc
#pwd:
expert

#Login to CMX from DMX


#expert@blade_0_25 13:57:33>
wizard bridge-config
#Specify the CMXB:
#1: bridge 0-26
#2: bridge 0-28
#3: bridge 1-26
#4: bridge 1-28

########################################################################################################
#############################

#System Status
cudbSystemStatus

#check only replication channels:


cudbSystemStatus -r -R

#check only cluster status:

cudbSystemStatus -c

#check only active alarms:

cudbSystemStatus -a

#check consistency (checks if slave DS units hold approximately as many rows in their tables as the corresponding master DS
unit in the same DSG)

cudbCheckConsistency

#check replications (This command checks if active MySQL replication channels are functional in the CUDB system)

cudbCheckReplication

#collect info (This command is to create tarball of CUDB logs)

cudbCollectInfo

#to monitor the CPU load on one or more blades in a CUDB node, with adjustable periodicity

cudbMpstat

#check the number of successful and unsuccessful LDAP transactions per second on all blades in the local CUDB node

cudbTpsStat

#check information on installed software packages from both IMM and LOTC

cudbSwVersionCheck

#check the High Availability state

cudbHaState

########################################################################################################
#############################

#check active alarms:


fmactivealarms

#clear irrelevant alarms:


#example:
#!---------------------------------------------------------------
#Module : SECURITY
#Error Code :2
#Resource Id : 1.3.6.1.4.1.193.169.11.2.10.201.6.69
#Timestamp : Tue Jul 01 12:12:19 CEST 2014
#Model Description : Root Login Failed, Security.
#Active Description : Root Login Failed @10.201.6.69
#Event Type : 10
#Probable Cause : 600
#Severity : warning
#Orig Source IP : 10.202.6.67
#---------------------------------------------------------------!

#fmsendmessage <action> <moduleId> <errorCode> <resourceId> ["<activeDescr>"] [<sourceIp>]

fmsendmessage -c SECURITY 2 1.3.6.1.4.1.193.169.11.2.10.201.6.69 10.202.6.67

!---------------------------------------------------------------
Module : STORAGE-ENGINE
Error Code : 19
Resource Id : 1.3.6.1.4.1.193.169.1.2.19.2
Timestamp : Sun Mar 27 00:37:22 CAT 2016
Model Description: Potential data inconsistency between replicas found, Storag e Engine.
Active Description: Storage Engine (DS-group #2): Potential data inconsistency between replicas found.
Event Type :4
Probable Cause : 160
Severity : major
Orig Source IP : 10.202.6.67
---------------------------------------------------------------!
! Resolução
CUDB_21 SC_2_2# fmsendmessage -c STORAGE-ENGINE 19 1.3.6.1.4.1.193.169.1.2.19.2 10.202.6.69

########################################################################################################
#############################

#LOGCHECKER

#CUDB_21 SC_2_2# fmactivealarms


#Active alarms:
#!---------------------------------------------------------------
#Module : PREVENTIVE-MAINTENANCE
#Error Code :4
#Resource Id : 1.3.6.1.4.1.193.169.100.1
#Timestamp : Tue Jul 01 00:50:04 CEST 2014
#Model Description : Logchecker found critical error(s), Preventive Maintenance.
#Active Description : Preventive Maintenance: Logchecker has found critical error(s).
#Event Type :1
#Probable Cause : 1024
#Severity : critical
#Orig Source IP : 10.202.6.67
#---------------------------------------------------------------!
#CUDB_21 SC_2_2#
#To print the errors (run in Active SC):
cudbAnalyser --auto-check
#or
cudbAnalyser -a

#To get new logs for the analyser:


cudbGetLogs

#To check cluster alarms:


cluster alarm -l

########################################################################################################
#############################

#CUDB UNIT DATA BACKUP


#if the replication channel is down (a consolidated loop sketch follows the per-DSG command list below):

#CUDB11:
#PLDB:
cudbUnitDataBackupAndRestore -p -n 11
#DS1
cudbUnitDataBackupAndRestore -d 1 -n 11
#DS2
cudbUnitDataBackupAndRestore -d 2 -n 11
#DS3
cudbUnitDataBackupAndRestore -d 3 -n 11
#DS4
cudbUnitDataBackupAndRestore -d 4 -n 11

#CUDB21:
#PLDB:
cudbUnitDataBackupAndRestore -p -n 21
#DS1
cudbUnitDataBackupAndRestore -d 1 -n 21
#DS2
cudbUnitDataBackupAndRestore -d 2 -n 21
#DS3
cudbUnitDataBackupAndRestore -d 3 -n 21
#DS4
cudbUnitDataBackupAndRestore -d 4 -n 21
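When several units of the same node are out of sync, the per-unit commands above can be wrapped in a loop. This is a sketch only; in practice, run it for the units that actually lost synchronization (example below: node 21).

# Repair the PLDB and all four DSGs of one node in sequence.
NODE=21
cudbUnitDataBackupAndRestore -p -n $NODE || exit 1
for dsg in 1 2 3 4; do
    cudbUnitDataBackupAndRestore -d $dsg -n $NODE || exit 1
done
cudbCheckConsistency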

########################################################################################################
#############################

#After a CUDB system, node, or replica is successfully restored, all stored procedures related to the application counters are
lost,
#and therefore must be created again.

# show procedure status:


mysql -hpl0 -pmysql -umysql -P 15000 -e "call cudb_user_data;"
mysql -hpl0 -pmysql -umysql -P 15001 -e "show procedure status\G";
mysql -hpl1 -pmysql -umysql -P 15000 -e "show procedure status\G";
mysql -hpl1 -pmysql -umysql -P 15002 -e "show procedure status\G";

mysql -hds1_0 -pmysql -umysql -P 15010 -e "show procedure status\G";


mysql -hds1_1 -pmysql -umysql -P 15010 -e "show procedure status\G";
mysql -hds1_0 -pmysql -umysql -P 15011 -e "show procedure status\G";
mysql -hds1_1 -pmysql -umysql -P 15012 -e "show procedure status\G";
mysql -hds2_0 -pmysql -umysql -P 15020 -e "show procedure status\G";
mysql -hds2_1 -pmysql -umysql -P 15020 -e "show procedure status\G";
mysql -hds2_0 -pmysql -umysql -P 15021 -e "show procedure status\G";
mysql -hds2_1 -pmysql -umysql -P 15022 -e "show procedure status\G";

mysql -hds3_0 -pmysql -umysql -P 15030 -e "show procedure status\G";


mysql -hds3_1 -pmysql -umysql -P 15030 -e "show procedure status\G";
mysql -hds3_0 -pmysql -umysql -P 15031 -e "show procedure status\G";
mysql -hds3_1 -pmysql -umysql -P 15032 -e "show procedure status\G";

mysql -hds4_0 -pmysql -umysql -P 15040 -e "show procedure status\G";


mysql -hds4_1 -pmysql -umysql -P 15040 -e "show procedure status\G";
mysql -hds4_0 -pmysql -umysql -P 15041 -e "show procedure status\G";
mysql -hds4_1 -pmysql -umysql -P 15042 -e "show procedure status\G";

#Check slave status:


#mysql -p<pswd> -u<user> -P<mysql port> -h<host> -e "show slave status\G;"

mysql -pmysql -umysql -P15000 -hpl0 -e "show slave status\G;"


mysql -pmysql -umysql -P15001 -hpl0 -e "show slave status\G;"
mysql -pmysql -umysql -P15000 -hpl1 -e "show slave status\G;"
mysql -pmysql -umysql -P15002 -hpl1 -e "show slave status\G;"

mysql -pmysql -umysql -P15000 -hds1_0 -e "show slave status\G;"


mysql -pmysql -umysql -P15001 -hds1_0 -e "show slave status\G;"
mysql -pmysql -umysql -P15000 -hds1_1 -e "show slave status\G;"
mysql -pmysql -umysql -P15002 -hds1_1 -e "show slave status\G;"

mysql -pmysql -umysql -P15000 -hds2_0 -e "show slave status\G;"


mysql -pmysql -umysql -P15001 -hds2_0 -e "show slave status\G;"
mysql -pmysql -umysql -P15000 -hds2_1 -e "show slave status\G;"
mysql -pmysql -umysql -P15002 -hds2_1 -e "show slave status\G;"

mysql -pmysql -umysql -P15000 -hds3_0 -e "show slave status\G;"


mysql -pmysql -umysql -P15001 -hds3_0 -e "show slave status\G;"
mysql -pmysql -umysql -P15000 -hds3_1 -e "show slave status\G;"
mysql -pmysql -umysql -P15002 -hds3_1 -e "show slave status\G;"

mysql -pmysql -umysql -P15000 -hds4_0 -e "show slave status\G;"


mysql -pmysql -umysql -P15001 -hds4_0 -e "show slave status\G;"
mysql -pmysql -umysql -P15000 -hds4_1 -e "show slave status\G;"
mysql -pmysql -umysql -P15002 -hds4_1 -e "show slave status\G;"
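To avoid typing the twenty commands above one by one, the same blade/port pairs can be looped over and the output reduced to the replication health fields. This is a sketch, assuming the standard mysql credentials used throughout these notes.

for target in "pl0 15000" "pl0 15001" "pl1 15000" "pl1 15002" \
              "ds1_0 15000" "ds1_0 15001" "ds1_1 15000" "ds1_1 15002" \
              "ds2_0 15000" "ds2_0 15001" "ds2_1 15000" "ds2_1 15002" \
              "ds3_0 15000" "ds3_0 15001" "ds3_1 15000" "ds3_1 15002" \
              "ds4_0 15000" "ds4_0 15001" "ds4_1 15000" "ds4_1 15002"; do
    set -- $target
    echo "=== $1 port $2 ==="
    mysql -pmysql -umysql -P$2 -h$1 -e "show slave status\G" | \
        grep -E "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master"
done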

#Load stored Procedures:


#PLDB:
mysql -hpl0 -P15000 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hpl0 -P15001 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hpl1 -P15000 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hpl1 -P15002 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hpl0 -P15000 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hpl0 -P15001 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hpl1 -P15000 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hpl1 -P15002 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "

#DSG-1
mysql -hds1_0 -P15010 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds1_1 -P15010 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds1_0 -P15011 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds1_1 -P15012 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds1_0 -P15010 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds1_1 -P15010 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds1_0 -P15011 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds1_1 -P15012 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "

#DSG-2
mysql -hds2_0 -P15020 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds2_1 -P15020 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds2_0 -P15021 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds2_1 -P15022 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds2_0 -P15020 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds2_1 -P15020 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds2_0 -P15021 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds2_1 -P15022 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "

#DSG-3:
mysql -hds3_0 -P15030 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds3_1 -P15030 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds3_0 -P15031 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds3_1 -P15032 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds3_0 -P15030 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds3_1 -P15030 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds3_0 -P15031 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds3_1 -P15032 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "

#DSG-4:
mysql -hds4_0 -P15040 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds4_1 -P15040 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds4_0 -P15041 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds4_1 -P15042 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_hlr.sql "
mysql -hds4_0 -P15040 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds4_1 -P15040 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds4_0 -P15041 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
mysql -hds4_1 -P15042 --user=mysql --password=mysql -e " source
/home/cudb/oam/performanceMgmt/appCounters/procedures/create_stored_procedures_auc.sql "
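The same reload can be expressed as one loop over every host/port combination listed above, feeding each procedure file via standard input (equivalent to the -e "source ..." form). A sketch, assuming the standard mysql credentials:

PROC_DIR=/home/cudb/oam/performanceMgmt/appCounters/procedures
for target in "pl0 15000" "pl0 15001" "pl1 15000" "pl1 15002" \
              "ds1_0 15010" "ds1_1 15010" "ds1_0 15011" "ds1_1 15012" \
              "ds2_0 15020" "ds2_1 15020" "ds2_0 15021" "ds2_1 15022" \
              "ds3_0 15030" "ds3_1 15030" "ds3_0 15031" "ds3_1 15032" \
              "ds4_0 15040" "ds4_1 15040" "ds4_0 15041" "ds4_1 15042"; do
    set -- $target
    for sql in create_stored_procedures_hlr.sql create_stored_procedures_auc.sql; do
        echo "Loading $sql on $1:$2"
        mysql -h$1 -P$2 --user=mysql --password=mysql < $PROC_DIR/$sql
    done
done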

#to clean the alarms:


/home/cudb/oam/performanceMgmt/appCounters/scripts/appCounters.cron

########################################################################################################
#############################

#LDAP Search:

ldapsearch -x -H ldap://PL0 -D cudbUser=HLR,ou=admin,dc=mcel -w HLRUSER1 -b


IMSI=643012205447126,dc=IMSI,ou=identities,dc=mcel -s sub -a always

ldapsearch -x -H ldap://PL0 -D cudbUser=HLR,ou=admin,dc=mcel -w HLRUSER1 -b


MSISDN=258822173114,dc=MSISDN,ou=identities,dc=mcel -s sub -a always

#check ldap users:


ldapsearch -h PL0 -x -D "cudbUser=HLR,ou=admin,dc=mcel" -w HLRUSER1 -b "dc=mcel" -s base

########################################################################################################
#############################

#LOCK/UNLOCK blades:
#login to dmx:
ssh expert@10.201.4.150 -p2024 -c 3des-cbc
#or
ssh expert@10.202.4.150 -p2024 -c 3des-cbc
#enter to configure mode
configure
#lock/unlock blades

set ManagedElement 1 DmxFunctions 1 BladeGroupManagement 1 Group CUDB ShelfSlot 0-1 Blade 1 administrativeState
locked
set ManagedElement 1 DmxFunctions 1 BladeGroupManagement 1 Group CUDB ShelfSlot 0-1 Blade 1 administrativeState
unlocked
#0-1 0-3 0-5 0-7 0-9 0-11 0-13 0-15 0-17 0-19 0-21 0-23
#1-1 1-3 1-5 1-7 1-9 1-11 1-13 1-15 1-17 1-19 1-21 1-23

set ManagedElement 1 DmxFunctions 1 BladeGroupManagement 1 Group DMX ShelfSlot 0-0 Blade 1 administrativeState
locked
set ManagedElement 1 DmxFunctions 1 BladeGroupManagement 1 Group DMX ShelfSlot 0-0 Blade 1 administrativeState
unlocked
#0-0 0-25 0-26 0-28 1-0 1-25 1-26 1-28

commit

########################################################################################################
#############################
#restart LDAP processes (use only if the Ericsson support team asks you to)
#kill:
#for a in `seq 7 14`; do ssh PL_2_$a pkill -9 slapd; done

#print:

#for a in `seq 7 14`; do ssh PL_2_$a ps -ef | grep -ia 'slapd '; done

#start:

#for a in `seq 7 14`; do ssh PL_2_$a chmod a+x /opt/ericsson/cudb/ldapfe/libexec/slapd ; done
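After the LDAP processes are back, it is worth confirming that slapd is running on every LDAP front-end blade. A quick check based on the same ssh loop pattern as above (blade range PL_2_7..PL_2_14 assumed, as in the commands above):
# Sketch: confirm slapd is running on each LDAP front-end blade.
for a in `seq 7 14`; do
    echo "== PL_2_$a =="
    ssh PL_2_$a pgrep -l slapd || echo "slapd NOT running on PL_2_$a"
done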

########################################################################################################
#############################

Node degraded “DSG4 Cluster (82%) ...........................NOK: degraded”

Checking Clusters status:


Node 11:
PL Cluster (21%) .............................OK
DSG1 Cluster (86%) ...........................OK
DSG2 Cluster (83%) ...........................OK
DSG3 Cluster (87%) ...........................OK
DSG4 Cluster (87%) ...........................OK
Node 21:
PL Cluster (21%) .............................OK
DSG1 Cluster (84%) ...........................OK
DSG2 Cluster (81%) ...........................OK
DSG3 Cluster (82%) ...........................OK
[-W-] DSG4 Cluster (82%) ...........................NOK: degraded

- DSG memory usage on DSG1, DSG3 and DSG4 in CUDB11 is also on the high side. Alarms have been reporting this since February:
[Feb 09 15:40:04]( Storage Engine (DS-group #4) memory usage at Warning level. )
[Feb 09 15:40:04]( Storage Engine (DS-group #1) memory usage at Warning level. )
[May 13 14:15:08]( Storage Engine (DS-group #3) memory usage at Warning level. )

- If one DSG is reported faulty, the automatic backup will fail. In our case DSG4 is reported faulty:
CUDB_21 SC_2_2# sh -x /cluster/home/cudb//ExecuteBackup.sh
+ PATH_SWBACKUP=/opt/ericsson/cudb/OAM
++ cat /etc/nodeid
+ bladeNumber=2
+ bladeCmwHaState=
+ case ${bladeNumber} in
++ immlist -a saAmfSISUHAState 'safSISU=safSu=Cmw2\,safSg=2N\,safApp=ERIC-
ComSa,safSi=2N,safApp=ERIC-ComSa'
++ cut -d= -f2
+ bladeCmwHaState=1
+ '[' 1 -eq 1 ']'
+ /bin/bash -l cudbDataBackup -q -L
/home/cudb/systemDataBackup
Listening for current PLDB and DSGs status reports (may take upto 2 minutes)
ERROR: DSG 4 does not have an eligible replica from which to take the backup
ERROR: Not all the DSGs have an eligible replica from which to take the backup
CUDB_21 SC_2_2#
! Solution

Please perform a backup and restore for DSG4.


CUDB_21 SC_2_2# cudbUnitDataBackupAndRestore -d 4 -n 21
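Once the backup and restore has finished, re-check the DSG state. A quick filter of the system status output is enough (the grep pattern is illustrative):
# Sketch: confirm DSG4 is no longer degraded after the restore.
cudbSystemStatus | grep -Ei "DSG4|DS4"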
Counters problem

Execute it on all PLDB or DSG SQL access servers for every counter SQL definition file, as follows:
- Establish a new administrative CUDB CLI session towards one of the SC blades of the target CUDB node: ssh <admin_user>@<CUDB_Node_OAM_VIP_Address>

- In case the counter belongs to the PLDB, execute the following commands:
shell> mysql -h PL0 -P 15000 --user=<user_name> --password=<password> < <applicationCountersProcedure.sql>
shell> mysql -h PL1 -P 15000 --user=<user_name> --password=<password> < <applicationCountersProcedure.sql>
In the above example, <user_name> and <password> stand for the user name and password for the MySQL access servers.

- In case the counter belongs to a DSG, execute the following command for the two access servers in the DSG cluster:
shell> mysql -h <DsHostName> -P <AccessServerPort> --user=<user_name> --password=<password> < <applicationCountersProcedure.sql>

In the above example, <DsHostName> is a DSG blade, while <AccessServerPort> is the access port set in
the configuration model through the <accessPort> attribute of the <CudbDsGroup> object.
<user_name> and <password> stand for the user name and password for the MySQL access servers.
In our case, faulty counters are in DSG1 and DSG4. Please perform the actions below:
mysql -h DS3_0 -P 15031 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_hlr.sql
mysql -h DS3_1 -P 15032 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_hlr.sql
mysql -h DS4_0 -P 15041 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_hlr.sql
mysql -h DS4_1 -P 15042 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_hlr.sql

mysql -h DS3_0 -P 15031 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_auc.sql
mysql -h DS3_1 -P 15032 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_auc.sql
mysql -h DS4_0 -P 15041 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_auc.sql
mysql -h DS4_1 -P 15042 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_auc.sql

mysql -h DS3_0 -P 15031 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_nph.sql
mysql -h DS3_1 -P 15032 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_nph.sql
mysql -h DS4_0 -P 15041 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_nph.sql
mysql -h DS4_1 -P 15042 --user=mysql --password=mysql < /cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL/create_stored_procedures_nph.sql
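Since the same three procedure files are loaded into the same four access servers, a nested loop avoids the repetition. This is only a sketch; the host:port pairs and the SQL path are taken from the commands above.
# Sketch: load the hlr, auc and nph stored procedures into the DS3/DS4 access servers.
SQL_DIR=/cluster/software/app_counters/software_records_HLR/Application_Counters_CXC/SQL
for TARGET in DS3_0:15031 DS3_1:15032 DS4_0:15041 DS4_1:15042; do
    HOST=${TARGET%:*}
    PORT=${TARGET#*:}
    for PROC in hlr auc nph; do
        mysql -h ${HOST} -P ${PORT} --user=mysql --password=mysql < ${SQL_DIR}/create_stored_procedures_${PROC}.sql
    done
done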

! Awaiting feedback.

CUDB_21 SC_2_2# cudbSystemStatus


Checking Clusters status:
Node 11:
PL Cluster (19%) .............................OK
DSG1 Cluster (86%) ...........................OK
DSG2 Cluster (78%) ...........................OK
DSG3 Cluster (87%) ...........................OK
DSG4 Cluster (87%) ...........................OK
Node 21:
[-W-] PL Cluster (14%) .............................NOK: degraded
DSG1 Cluster (77%) ...........................OK
DSG2 Cluster (81%) ...........................OK
DSG3 Cluster (82%) ...........................OK
DSG4 Cluster (82%) ...........................OK

Checking NDB status:


[-W-] PL NDB's (3/4) ...............................NOK: degraded
DS1 NDB's (2/2) ..............................OK
DS2 NDB's (2/2) ..............................OK
DS3 NDB's (2/2) ..............................OK
DS4 NDB's (2/2) ..............................OK

SMART disk checks (smartctl)

CUDB_21 PL_2_6# smartctl -a /dev/sdb


smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===


Device Model: TOSHIBA MQ01ABF050
Serial Number: 35JLWCI2T
Firmware Version: AM0P1A
User Capacity: 500,107,862,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu May 2 12:29:38 2019 CAT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Enable SMART on the disk
CUDB_21 PL_2_6# smartctl --smart=on --offlineauto=on --saveauto=on /dev/sdb
smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF ENABLE/DISABLE COMMANDS SECTION ===


SMART Enabled.
SMART Attribute Autosave Enabled.
SMART Automatic Offline Testing Enabled every four hours.
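With SMART enabled, a short self-test can be started and its result read back afterwards. These are standard smartctl options, shown here only as a sketch:
# Sketch: run a short SMART self-test and read back the results.
smartctl -t short /dev/sdb      # start the short self-test (takes a couple of minutes)
smartctl -l selftest /dev/sdb   # list self-test results once it has finished
smartctl -H /dev/sdb            # overall health assessment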

CUDB_21 PL_2_13# smartctl -a /dev/sdb


smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl open device: /dev/sdb failed: No such device


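The "No such device" error on PL_2_13 means the kernel does not expose /dev/sdb at all, so the problem is the disk itself rather than smartctl. Before escalating, confirm what the blade actually sees (a sketch using standard tools):
# Sketch: check whether the kernel sees a second disk on this blade.
cat /proc/partitions            # block devices known to the kernel
ls -l /dev/sd*                  # device nodes that actually exist
dmesg | grep -i sdb             # any kernel messages about sdb (e.g. link errors)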