
WCDMA RAN, Rel. RU40,
Operating Documentation,
Issue 05

Replacing Multicontroller RNC Hardware Units
DN09109953
Issue 02A
Approval Date 2014-02-10


The information in this document is subject to change without notice and describes only the product defined in the introduction of this documentation. This documentation is intended for the use of Nokia Solutions and Networks customers only for the purposes of the agreement under which the document is submitted, and no part of it may be used, reproduced, modified or transmitted in any form or means without the prior written permission of Nokia Solutions and Networks. The documentation has been prepared to be used by professional and properly trained personnel, and the customer assumes full responsibility when using it. Nokia Solutions and Networks welcomes customer comments as part of the process of continuous development and improvement of the documentation.

The information or statements given in this documentation concerning the suitability, capacity, or performance of the mentioned hardware or software products are given "as is" and all liability arising in connection with such hardware or software products shall be defined conclusively and finally in a separate agreement between Nokia Solutions and Networks and the customer. However, Nokia Solutions and Networks has made all reasonable efforts to ensure that the instructions contained in the document are adequate and free of material errors and omissions. Nokia Solutions and Networks will, if deemed necessary by Nokia Solutions and Networks, explain issues which may not be covered by the document.

Nokia Solutions and Networks will correct errors in this documentation as soon as possible. IN NO EVENT WILL Nokia Solutions and Networks BE LIABLE FOR ERRORS IN THIS DOCUMENTATION OR FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, DIRECT, INDIRECT, INCIDENTAL OR CONSEQUENTIAL OR ANY LOSSES, SUCH AS BUT NOT LIMITED TO LOSS OF PROFIT, REVENUE, BUSINESS INTERRUPTION, BUSINESS OPPORTUNITY OR DATA, THAT MAY ARISE FROM THE USE OF THIS DOCUMENT OR THE INFORMATION IN IT.

This documentation and the product it describes are considered protected by copyrights and other intellectual property rights according to the applicable laws.

NSN is a trademark of Nokia Solutions and Networks. Nokia is a registered trademark of Nokia Corporation. Other product names mentioned in this document may be trademarks of their respective owners, and they are mentioned for identification purposes only.

Copyright Nokia Solutions and Networks 2014. All rights reserved

Important Notice on Product Safety


This product may present safety risks due to laser, electricity, heat, and other sources of danger.
Only trained and qualified personnel may install, operate, maintain or otherwise handle this product and only after having carefully read the safety information applicable to this product.
The safety information is provided in the Safety Information section in the Legal, Safety and Environmental Information part of this document or documentation set.

Nokia Solutions and Networks is continually striving to reduce the adverse environmental effects of its products and services. We would like to encourage you as our customers and users to join us in working towards a cleaner, safer environment. Please recycle product packaging and follow the recommendations for power use and proper disposal of our products and their components.
If you should have questions regarding our Environmental Policy or any of the environmental services we offer, please contact us at Nokia Solutions and Networks for any additional information.


Table of Contents
This document has 55 pages

Summary of changes

1 Replacing the faulty chassis in a running system
1.1 Replacing a faulty chassis
1.1.1 Removing the faulty chassis from the running system
1.1.1.1 Steps
1.1.2 Installing the new chassis
1.1.2.1 Steps

2 Replacing the hard disk drive on hard disk drive carrier AMC
2.1 Removing the faulty hard disk drive
2.1.1 Steps
2.2 Installing the new hard disk drive
2.2.1 Steps

3 Replacing the failed hard disk drives on both CFPU nodes
3.1 Removing the faulty hard disk drives
3.2 Installing the new hard disk drives

4 Replacing an AMC
4.1 Removing an AMC
4.2 Installing an AMC

5 Replacing a fan module
5.1 Removing a fan module
5.1.1 Steps
5.2 Installing a fan module
5.2.1 Steps

6 Replacing an add-in card
6.1 Removing an add-in card
6.1.1 Steps
6.2 Installing an add-in card
6.2.1 Steps

7 Replacing a power distribution unit
7.1 Removing a power distribution unit (PDU)
7.1.1 Steps
7.2 Installing a power distribution unit (PDU)
7.2.1 Steps

8 Replacing a power supply unit
8.1 Removing a power supply unit
8.1.1 Steps
8.2 Installing a power supply unit
8.2.1 Steps

9 Replacing the air filter

10 Dealing with sensor alarms

11 Communication between active and standby units in a BCN cluster fails
11.1 Description
11.2 Symptoms
11.3 Recovery procedures

List of Figures

Figure 1 The SAS/SATA switch in the HDSAM-A
Figure 2 The hard disk drive on the hard disk drive carrier AMC
Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 4 The SAS/SATA switch in the HDSAM-A
Figure 5 The hard disk drive on the hard disk drive carrier AMC
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 7 Pulling the hot swap handle of an AMC
Figure 8 Removing an AMC from the BCN module
Figure 9 Inserting an AMC into the BCN module
Figure 10 Pressing the hot swap handle
Figure 11 BCN top cover screws
Figure 12 Removing the BCN top cover
Figure 13 BCN add-in card screws
Figure 14 Pulling an add-in card out from the BCN module
Figure 15 Inserting an add-in card into BCN module
Figure 16 BCN add-in card screws
Figure 17 Installing BCN top cover
Figure 18 Power distribution units in the cabinet
Figure 19 Replacing a PDU
Figure 20 Installing PDU to the cabinet
Figure 21 PDU grounding cable
Figure 22 Unscrewing the two thumbscrews
Figure 23 Opening the air filter cover and pulling out the air filter
Figure 24 Positions of the PSUs and fan trays

Summary of changes
Changes between document issues are cumulative. Therefore, the latest document issue contains all changes made to previous issues.
See Guide to WCDMA RAN and I-HSPA Documentation.
Changes made between issues 02 (RU40) and 02A (RU40)
Instructions for a graceful shutdown have been added when replacing an add-in card.
Changes made between issues 01B (RU30) and 02 (RU40)
Instructions apply to both BCN-A and BCN-B hardware. The example display outputs have been updated and may vary slightly as a result.
Changes made between issues 01A (RU30) and 01B (RU30)
Replacing the hard disk drive on hard disk drive carrier AMC has been updated to include verification steps.
has been added.

1 Replacing the faulty chassis in a running system
1.1 Replacing a faulty chassis
Purpose
In a multichassis environment (two or more chassis), if an existing chassis is faulty, you need to remove it and install a new chassis in its place.
Before you start
Ensure that:

• The embedded software is upgraded to the required version in the replacement chassis.
• The initial LMP settings, such as the switch configuration and the backplane resiliency configuration, are properly configured for the replacement chassis. For more information, see Commissioning Multicontroller RNC.

1.1.1 Removing the faulty chassis from the running system

1.1.1.1 Steps

1 Identify the chassis to be replaced in the running system.
In this section, the chassis to be replaced refers to chassis-2.

2 Check all the nodes that are running in the chassis to be replaced.
To check all the running nodes present in the chassis to be replaced, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list

cabinet-1  : unit /cabinet-1
chassis-1  : unit /cabinet-1/chassis-1
chassis-2  : unit /cabinet-1/chassis-2
LMP-1-1-1  : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1  : node available /cabinet-1/chassis-2/piu-1
CFPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-0,1,10,2,3,4,5,6,7,8,9
CSPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-0,1,2,3,4,5
USPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-0,1,2,3,4
EIPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-0,1,2,3,4,5
CSPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-0,1,2,3,4,5
USPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-0,1,2,3,4
USPU-4     : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-0,1,2,3,4
EIPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-0,1,2,3,4,5
USSR-0     : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-11
CSUP-0     : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-10,11,6,7,8,9
USUP-0     : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-10,11,5,6,7,8,9
EITP-0     : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-10,11,6,7,8,9
CSUP-2     : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-10,11,6,7,8,9
USUP-2     : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-10,11,5,6,7,8,9
USUP-4     : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-10,11,5,6,7,8,9
EITP-2     : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-10,11,6,7,8,9
CFPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5     : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1     : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1     : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1     : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1     : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3     : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3     : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5     : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3     : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster    : cluster available

The output shows that nodes CFPU-1, CSPU-1, USPU-1, EIPU-1, CSPU-3, USPU-3, USPU-5, EIPU-3, USSR-1, CSUP-1, USUP-1, EITP-1, CSUP-3, USUP-3, USUP-5, and EITP-3 are present in the chassis to be replaced.
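Picking the chassis-2 nodes out of a long listing by eye is error-prone. The following is a minimal Python sketch of one way to do it offline, assuming the listing has been saved as text with one entry per line in the form "name : node state path". The function name, the regular expression, and the shortened sample text are illustrative assumptions and are not part of the product.

```python
import re

# One node entry per line: "<name> : node <state> <path>".
# The single-line layout is an assumption about how the listing was captured.
NODE_LINE = re.compile(r"^(?P<name>\S+)\s*:\s*node\s+(?P<state>\S+)\s+(?P<path>/\S+)")


def nodes_in_chassis(listing, chassis):
    """Return names of add-in card nodes whose hardware path is in the chassis."""
    found = []
    for line in listing.splitlines():
        match = NODE_LINE.match(line.strip())
        if not match:
            continue  # skips unit, cluster, prompt, and blank lines
        path = match.group("path")
        # Keep only add-in card nodes; the LMP sits directly under piu-1.
        if f"/{chassis}/" in path and "/addin-" in path:
            found.append(match.group("name"))
    return found


# Shortened, illustrative sample in the same shape as the listing.
SAMPLE = """\
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-0,1
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
"""

if __name__ == "__main__":
    print(nodes_in_chassis(SAMPLE, "chassis-2"))
```

The match on "/chassis-2/" with both slashes avoids confusing chassis-2 with a hypothetical chassis-20.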


3 Check if the current SCLI session is running on a node located in the chassis to be replaced.
If the SCLI session is running on a node (node name identified by the prompt) located in the chassis to be replaced, perform a switchover for the /SSH recovery group to keep the connectivity to the cluster during the chassis replacement. To perform a switchover for the /SSH recovery group, enter the following command:
set has switchover force managed-object /SSH

The SSH connection breaks when the switchover command is executed, and the SSH session must be started again.

4 Disable cluster manager nodes located in the chassis to be replaced.
To identify the nodes configured as a cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)

The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
From the output of step 4, it is observed that one cluster manager node (CFPU-1) is located in the chassis to be replaced. Hence, the following steps must be executed:
a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU-1
b) Check that the CFPU-1 node, where CMF was disabled, has the CMF-DISABLED status and that the other cluster manager node (in this case, the CFPU-0 node) has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5
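The status check in substep b) can be expressed as a small rule: the node in the chassis to be replaced must report CMF-DISABLED, and some other cluster manager must report CMF-SERVING. The sketch below illustrates that rule in Python on captured output text. It assumes each status line has the form "node: CMF-state", optionally followed by a priority field; the function names are hypothetical.

```python
def parse_cmf_status(output):
    """Map node name -> CMF state from lines such as 'CFPU-1: CMF-DISABLED'."""
    states = {}
    for line in output.splitlines():
        name, sep, rest = line.partition(":")
        # Only lines whose first field after the colon is a CMF-* state count.
        if sep and rest.strip().startswith("CMF-"):
            states[name.strip()] = rest.split()[0]
    return states


def safe_to_remove(states, node):
    """True if node has CMF disabled and another node is still serving."""
    others = [state for name, state in states.items() if name != node]
    return states.get(node) == "CMF-DISABLED" and "CMF-SERVING" in others


if __name__ == "__main__":
    captured = "CFPU-1: CMF-DISABLED priority: 6\nCFPU-0: CMF-SERVING priority: 5"
    print(safe_to_remove(parse_cmf_status(captured), "CFPU-1"))
```

Requiring a serving peer, and not only the disabled state, guards against continuing while no cluster manager is active.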


5 Lock all the managed objects in the chassis to be replaced.
To lock all the managed objects, enter the following command:
set has lock managed-object <mo-name1> <mo-name2>...

The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following command:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1
/CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully locked, the following output is displayed:
root@CFPU-0 [RNC-1002] > set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1
/EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)
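Because the same managed-object list is reused for the lock, power off, and unlock commands, it can help to build the command text once from the node list. The following Python sketch only assembles strings; it does not talk to the SCLI. It assumes that the managed nodes are exactly those with the CFPU, CSPU, USPU, and EIPU prefixes used in the example, which is an inference from this example and not a documented rule.

```python
# Assumption: node-name prefixes of managed nodes, taken from the example only.
MANAGED_PREFIXES = ("CFPU", "CSPU", "USPU", "EIPU")


def managed_objects(node_names):
    """Return '/NAME' for every node whose prefix is in MANAGED_PREFIXES."""
    return [f"/{name}" for name in node_names
            if name.split("-")[0] in MANAGED_PREFIXES]


def has_command(action, node_names):
    """Build 'set has <action> managed-object ...' for the managed nodes."""
    objects = managed_objects(node_names)
    if not objects:
        raise ValueError("no managed nodes in the given list")
    return f"set has {action} managed-object " + " ".join(objects)


if __name__ == "__main__":
    chassis_2 = ["CFPU-1", "CSPU-1", "USPU-1", "EIPU-1", "USSR-1", "CSUP-1"]
    for action in ("lock", "power off", "unlock"):
        print(has_command(action, chassis_2))
```

Review the generated line against the hardware listing before entering it, since the prefix filter is only a heuristic.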

6 Power off all the managed objects in the chassis to be replaced.
To power off the managed objects, enter the following command:
set has power off managed-object <mo-name1> <mo-name2>...
Example
To power off all the managed nodes in chassis-2, enter the following command:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1
/EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully

7 Verify that all the managed objects in the chassis to be replaced are powered off.
To verify the availability of a managed node, enter the following command:
show has state availability managed-object <mo-name>
Example
To verify that the availability status of all managed nodes in chassis-2 is availability (POWEROFF), enter the following command:
show has state availability managed-object /CFPU-1 /CSPU-1 \
/CSPU-3 /EIPU-1 /EIPU-3 /USPU-1 /USPU-3 /USPU-5

The following output must be displayed:
OBJECT     AVAILABILITY
/CFPU-1    POWEROFF
/CSPU-1    POWEROFF
/CSPU-3    POWEROFF
/EIPU-1    POWEROFF
/EIPU-3    POWEROFF
/USPU-1    POWEROFF
/USPU-3    POWEROFF
/USPU-5    POWEROFF
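Before any cable is disconnected, every managed object in the chassis should report POWEROFF. As a hedged illustration, the Python sketch below reads a captured two-column state table and lists the objects that are not in the expected state. It assumes one "object state" pair per line with the object name starting with a slash; the helper names are hypothetical.

```python
def parse_state_table(output):
    """Map managed object -> state from 'OBJECT STATE' rows; skips the header."""
    rows = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows have exactly two fields and an object name such as /CFPU-1.
        if len(parts) == 2 and parts[0].startswith("/"):
            rows[parts[0]] = parts[1]
    return rows


def not_in_state(rows, expected):
    """Return the sorted objects whose state differs from the expected one."""
    return sorted(obj for obj, state in rows.items() if state != expected)


if __name__ == "__main__":
    captured = """\
OBJECT     AVAILABILITY
/CFPU-1    POWEROFF
/CSPU-1    POWEROFF
"""
    print(not_in_state(parse_state_table(captured), "POWEROFF"))
```

An empty result means that every listed object is in the expected state.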

8 Disconnect all the cables connected to the chassis to be replaced.
To disconnect the cables connected to the chassis to be replaced, follow these steps:
a) If a PDU is used with the BCN module, switch off the circuit breaker on the PDU for the BCN module in question.
b) Disconnect the power feed cables.
c) Disconnect all the cables from the transceivers on the front side of the chassis.
d) Disconnect the BCN grounding cable.
e) Keep the network cables attached to the front cable tray of the chassis.
f) Uninstall the cable tray with the attached network cables from the BCN module.
The cable tray is uninstalled by unscrewing the two thumbscrews fixing the cable tray to the BCN module. If the screws are too tight to be opened by hand, loosening the screws that fix the BCN module mounting flanges to the cabinet might help. For more information about detaching the cable tray, refer to the document Installing BCN Modules to the IR206 Cabinet.
g) Move the cable tray with the attached network cables under the module, so that the module can be easily pulled out from the rack.

9 Remove the HDD AMC from the AMC bay.
If the chassis has an AMC slot, remove the HDD AMC. Follow the instructions in the section Replacing an AMC.

10 Remove the chassis to be replaced from the rack.

1.1.2 Installing the new chassis

1.1.2.1 Steps

1 Insert the new chassis in the rack.

2 Insert the AMC back in the AMC bay.
If the chassis had an HDD AMC equipped in the AMC bay, insert the removed HDD AMC into the same slot of the replacement chassis. Follow the instructions in the section Replacing an AMC.


3 Connect all the cables to the new chassis.
a) Install the cable tray with the attached network cables back to the BCN module. For more information about the cable tray installation, see the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e) If a PDU is used with the BCN module, switch on the circuit breaker on the PDU for the BCN module in question.

4 Check that the LMP and all nodes of the new chassis are available.
To check that the LMP and all nodes of the new chassis are available, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list

cabinet-1  : unit /cabinet-1
chassis-1  : unit /cabinet-1/chassis-1
chassis-2  : unit /cabinet-1/chassis-2
LMP-1-1-1  : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1  : node available /cabinet-1/chassis-2/piu-1
CFPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-0,1,10,2,3,4,5,6,7,8,9
CSPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-0,1,2,3,4,5
USPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-0,1,2,3,4
EIPU-0     : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-0,1,2,3,4,5
CSPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-0,1,2,3,4,5
USPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-0,1,2,3,4
USPU-4     : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-0,1,2,3,4
EIPU-2     : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-0,1,2,3,4,5
USSR-0     : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-11
CSUP-0     : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-10,11,6,7,8,9
USUP-0     : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-10,11,5,6,7,8,9
EITP-0     : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-10,11,6,7,8,9
CSUP-2     : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-10,11,6,7,8,9
USUP-2     : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-10,11,5,6,7,8,9
USUP-4     : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-10,11,5,6,7,8,9
EITP-2     : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-10,11,6,7,8,9
CFPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1     : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5     : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3     : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1     : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1     : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1     : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1     : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3     : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3     : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5     : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3     : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster    : cluster available

The output must show that all the nodes of the new chassis are now available.

The new chassis and its nodes take some time to boot up.

5 Set up the post configuration for the LMP of the new chassis.
To set up the post configuration for the LMP of the new chassis, enter the following commands:
cd /opt/nokiasiemens/SS_FSetup/bin
./configBCNLmp.py
The following output is displayed:
INFO Copy ssh keys to LMPs.
INFO Using credential file : /mnt/state/_global/etc/credentials/BCNLMP/root.cred
INFO Copying /tftpboot/lmp/hosts file to all LMPs.
INFO Changing the syslog.conf on all LMPs.
INFO Changing the ntp.conf on all LMPs.
INFO Configuring port monitor for all lmps.
INFO Changing the mch.conf on all LMPs.
INFO Removing bcn_sfp module loading from all LMPs.
INFO Patching fastpath reset script on all LMPs.
INFO Adding node reset init script to all LMPs.
INFO Removing PET/SNMP trap configuration on all LMPs
INFO Creating LMP configuration backup for automated configuration restore, this might take up to 5 minutes.

Then exit with Ctrl+C.
6 Unlock all the nodes in the new chassis.
To unlock all the nodes in the new chassis, enter the following command:
set has unlock managed-object <mo-name1> <mo-name2>...
Example
To unlock all the nodes in chassis-2, enter the following command:
set has unlock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 \
/CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:
/CFPU-1 unlocked successfully.
/CSPU-1 unlocked successfully.
/USPU-1 unlocked successfully.
/EIPU-1 unlocked successfully.
/CSPU-3 unlocked successfully.
/USPU-3 unlocked successfully.
/USPU-5 unlocked successfully.
/EIPU-3 unlocked successfully.

7 Check that all the nodes in the new chassis are operational.
Wait for the nodes to restart. After the nodes have restarted, wait for the operational state to become OPERATIONAL (ENABLED). Enter the following command to view the operational state of the node:
show has state managed-object <mo-name1> <mo-name2>...
Example
To check that all the nodes in chassis-2 have the OPERATIONAL (ENABLED) status, enter the following command:
show has state operational managed-object /CFPU-1 /CSPU-1
/USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:

OBJECT     OPERATIONAL
/CFPU-1    ENABLED
/CSPU-1    ENABLED
/CSPU-3    ENABLED
/EIPU-1    ENABLED
/EIPU-3    ENABLED
/USPU-1    ENABLED
/USPU-3    ENABLED
/USPU-5    ENABLED
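Since the nodes need time to restart, the operational state usually has to be checked more than once. The following Python sketch shows a generic wait loop with a timeout. It is deliberately independent of how the state is obtained: fetch_states is a hypothetical callable, supplied by the caller, that returns a mapping of managed object to state. The timeout and interval defaults are arbitrary assumptions.

```python
import time


def wait_for_state(fetch_states, expected, timeout_s=600.0, interval_s=10.0,
                   sleep=time.sleep):
    """Poll fetch_states() until every object reports the expected state.

    Returns True on success and False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        states = fetch_states()
        pending = sorted(obj for obj, state in states.items()
                         if state != expected)
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated readings; "DISABLED" is a placeholder for any non-ENABLED value.
    readings = iter([{"/CFPU-1": "DISABLED"}, {"/CFPU-1": "ENABLED"}])
    print(wait_for_state(lambda: next(readings), "ENABLED", interval_s=0))
```

Using a monotonic clock keeps the timeout unaffected by system clock adjustments during the wait.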

8 Enable CMF on the node configured as cluster manager.
To identify the nodes configured as cluster manager, enter the following command:
show has view managed-object /ClusterHA

The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)

The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
From the output of step 8, it is observed that one cluster manager node (CFPU-1) is located in the new chassis. Hence, the following steps must be executed:
a) Enable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf enable node-name /CFPU-1
The following output is displayed:
Cluster management functionality enabled on host CFPU-1
b) Check that the CFPU-1 node, where the CMF was enabled, has the CMF-BACKUP status and that the CFPU-0 node has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output is displayed:
CFPU-1: CMF-BACKUP priority: 6
CFPU-0: CMF-SERVING priority: 5

2 Replacing the hard disk drive on hard disk drive carrier AMC
Purpose
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years.
You may also need to replace the hard disk drive if it is faulty or if it needs to be upgraded or serviced.
Before you start

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 1 The SAS/SATA switch in the HDSAM-A
(Block diagram labels: handle switch, IPMB-L, MMC, LEDs, SAS/SATA switch, 2.5" SAS or SATA drive, mechanical adapter, 5 V and 12 V power, 2 x SAS, AMC connector.)

2.1 Removing the faulty hard disk drive

2.1.1 Steps

1 Log into the CFPU node where the hard disk is not faulty.

2 Lock the node where the faulty hard disk drive is located.
To lock the node where the faulty hard disk drive is located, enter the following command:
set has lock managed-object <mo-name>
Example
set has lock managed-object /CFPU-1
The following output is displayed:
/CFPU-1 locked successfully.

3 Power off the node where the faulty hard disk drive is located.
To power off the node where the faulty hard disk drive is located, enter the following command:
set has power off managed-object <mo_name>
Example
set has power off managed-object /CFPU-1
The following output is displayed:
/CFPU-1 is powered OFF successfully.

4 Remove the AMC from the AMC bay.
Follow the instructions in section Replacing an AMC.

5 Place the AMC so that the faulty hard disk drive side is facing down. Unscrew the four screws on the metal bracket of the AMC module, then turn the module over carefully while holding the hard disk drive.

6 Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).

Figure 2 The hard disk drive on the hard disk drive carrier AMC

2.2 Installing the new hard disk drive

2.2.1 Steps

1 Connect the new hard disk drive to the SAS connector of HDSAM-A.
Connect the new hard disk drive to the SAS connector in the HDSAM-A by pushing it gently (from left to right in the following figure).

Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC
(Figure labels: hard disk drive, SAS connector, inserted screw.)

2 Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.

3 Install the AMC module back into the AMC bay.
Follow the instructions in section Replacing an AMC.

4 Enable network boot for the node with the new hard disk drive.
To enable the network boot for the node with the new hard disk drive, enter the following commands:
a) Log in as root.
set user username root
b) Power on the node.
hwcli -np on <node_name>
Wait a few seconds before proceeding to the next step.
c) Reset the node.
hwcli -nr -B 3 <node_name>
d) Exit root.
exit

Issue:02A

DN09109953

19

Replacingtheharddiskdriveonharddiskdrivecarrier
AMC

ReplacingMulticontrollerRNCHardwareUnits

Example:
set user username root
hwcli -np on CFPU-1

Thefollowingouputisdisplayed:
Powering on

CFPU-1

[ok]

hwcli -nr -B 3 CFPU-1

Thefollowingouputisdisplayed:
Resetting

CFPU-1

[ok]

TheCFPUnodetakessometimetorebootandtheavailabilitycanbe
checkedbyloggingthroughtheSSH.
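The availability check after the reboot can be automated from a workstation that can reach the node. The following is only a sketch: it polls the SSH port until a TCP connection succeeds. The function name `wait_for_port`, the timeout values, and the idea that the node name resolves from the workstation are assumptions, not part of the product.

```python
import socket
import time


def wait_for_port(host: str, port: int = 22, timeout_s: float = 600.0,
                  interval_s: float = 5.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows that the port is listening;
            # it does not prove that an SSH login will work.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(interval_s, remaining))


if __name__ == "__main__":
    # "CFPU-1" is the example node name used in this section.
    print("reachable" if wait_for_port("CFPU-1", timeout_s=30.0) else "not reachable")
```

A successful result is only a hint that the node has booted; log in through SSH to confirm.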
5

Disable the watchdog on the node with the new hard disk drive.
To disable the watchdog on the node with the new hard disk drive through SSH, enter the following command:
ssh <node_name> \ wdctl -d
Example:
ssh CFPU-1 \ wdctl -d
exit

Initialize the new disk from the other node where the hard disk is not faulty.
To initialize the new disk, enter the following command:
initialise hw

The following output is displayed:
Hardware successfully initialized

To run the initialization script and display the console output, press the spacebar several times after entering the command.

Reboot the node with the new hard disk drive from the local disk.
Enter the following commands:
set user username root
hwcli -nr -B 2 <node_name>
exit

Example:
Enter:
set user username root
hwcli -nr -B 2 CFPU-1
exit

The following output is displayed:
Resetting CFPU-1 [ok]

The node will restart and synchronize the Distributed Replicated Block Devices (DRBD). You can enter the watch -n 10 cat /proc/drbd command to see how the synchronization is progressing. However, if the watch -n 10 cat /proc/drbd command fails, the cat /proc/drbd command must be executed.
The set user username root command must be executed before watch -n 10 cat /proc/drbd.
Do not restart the node during the DRBD synchronization. The initialization process of the new disk is not ready until the synchronization is successfully completed.
Example:
# watch -n 10 cat /proc/drbd
Every 10.0s: cat /proc/drbd    Wed Apr 24 10:54:39 2013

version: 8.3.7 (api:88/proto:86-92)
srcversion: 35B9BF7C501212268498452
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---ns:512081 nr:0 dw:635 dr:512860 al:4 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:0
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---ns:104849 nr:0 dw:31303 dr:106531 al:6 bm:7 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:0
2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---ns:204756 nr:0 dw:36 dr:205264 al:3 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:0
3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---ns:2890764 nr:0 dw:46448 dr:2882497 al:32 bm:190 lo:0 pe:0 ua:0 ap:0 ep:1
wo:b oos:12469416
[==>.................] sync'ed: 18.9% (12176/14996)M
finish: 0:05:37 speed: 36,872 (30,744) K/sec
4: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:16024 nr:0 dw:78448 dr:174701 al:489 bm:200 lo:0 pe:0 ua:0 ap:0 ep:1
wo:b oos:3062668
5: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:533 nr:0 dw:4921 dr:4830 al:13 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:102360
6: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:0 nr:0 dw:349 dr:1461 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
7: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:0 nr:0 dw:12 dr:675 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
8: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:133 nr:0 dw:1894 dr:759 al:5 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:511948
9: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:0 nr:0 dw:116 dr:4461 al:7 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:2916224
10: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-ns:0 nr:0 dw:1500 dr:956 al:3 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:102360

The example shows that the first blocks (0 to 2) are now synchronized: their oos (out of sync) value is zero. For block 3, the synchronization is in progress and the progress is displayed. Once synchronization is complete, the oos value for all blocks will be 0.
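The completion check described above (all oos values equal to 0) can be scripted against saved /proc/drbd output. The following is a sketch only. The function names are hypothetical, and the parsing assumes the layout shown in the example, where the oos counter may wrap onto the line after the device line.

```python
import re

DEVICE_RE = re.compile(r"^\s*(\d+): cs:(\S+)")   # e.g. "3: cs:SyncSource ..."
OOS_RE = re.compile(r"\boos:(\d+)")              # out-of-sync counter


def parse_drbd_status(text: str) -> dict[int, dict]:
    """Map each DRBD device number to its connection state and oos value."""
    devices: dict[int, dict] = {}
    current = None
    for line in text.splitlines():
        dev = DEVICE_RE.match(line)
        if dev:
            current = int(dev.group(1))
            devices[current] = {"cs": dev.group(2), "oos": None}
        if current is not None:
            oos = OOS_RE.search(line)
            if oos:
                devices[current]["oos"] = int(oos.group(1))
    return devices


def sync_complete(devices: dict[int, dict]) -> bool:
    """True when at least one device was found and every oos value is 0."""
    return bool(devices) and all(d["oos"] == 0 for d in devices.values())


if __name__ == "__main__":
    with open("/proc/drbd") as f:   # run on the node itself
        status = parse_drbd_status(f.read())
    for number, info in sorted(status.items()):
        print(number, info["cs"], info["oos"])
    print("synchronized" if sync_complete(status) else "still synchronizing")
```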
8

Check that serving and backup CMF (Cluster Management Functionality) are
working normally.
Enter:
show cmf status recovery-unit node-name <mo-name>


Example:
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-1
CFPU-0@RNC-37
[2013-04-24 13:18:51 +0200]
Recovery units with DRBD resources for managed object /CFPU-1:
/CFPU-1/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNOMUServer-1: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_SECONDARY 1/0 (peer/wait secondary)
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-0
CFPU-0@RNC-37
[2013-04-24 13:18:55 +0200]
Recovery units with DRBD resources for managed object /CFPU-0:
/CFPU-0/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNOMUServer-0: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_PRIMARY 1/0 (peer/wait secondary)

Compare the DRBD devices listed for both managed objects; they should match.
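The comparison can be sketched in a few lines of Python when the two command outputs have been saved as text. This is an illustration under assumptions: the function names are hypothetical, and only the /dev/drbdN lines are compared, because recovery unit names can legitimately differ between nodes (for example QNOMUServer-0 and QNOMUServer-1).

```python
import re

# Matches lines such as "/dev/drbd3: DRBD_SECONDARY 1/0 (peer/wait secondary)"
DEV_RE = re.compile(r"(/dev/drbd\d+):\s+(DRBD_\w+)")


def drbd_devices(output: str) -> dict[str, str]:
    """Map each DRBD device in the command output to its reported role."""
    return {m.group(1): m.group(2) for m in DEV_RE.finditer(output)}


def missing_devices(output_a: str, output_b: str) -> tuple[set[str], set[str]]:
    """Return (devices only in output_a, devices only in output_b)."""
    a = set(drbd_devices(output_a))
    b = set(drbd_devices(output_b))
    return a - b, b - a


if __name__ == "__main__":
    node1 = "/CFPU-1/FSLogServer: 1/1 (peer[s]/drbd device[s] up)\n/dev/drbd3: DRBD_SECONDARY 1/0 (peer/wait secondary)\n"
    node0 = "/CFPU-0/FSLogServer: 1/1 (peer[s]/drbd device[s] up)\n/dev/drbd3: DRBD_PRIMARY 1/0 (peer/wait secondary)\n"
    print(missing_devices(node1, node0))
```

Two empty sets mean that both managed objects list the same DRBD devices.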
9

Unlock the node with the new hard disk drive.


Enter:
set has unlock managed-object <mo-name>

Example:
set has unlock managed-object /CFPU-1

The following output is displayed:
/CFPU-1 unlocked successfully.

3 Replacing the failed hard disk drives on both CFPU nodes
Summary
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years.
You may also need to replace the hard disk drives if they are faulty or need to be upgraded or serviced.
Purpose
To replace the failed hard disk drives on both the CFPU nodes.
Before you start

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 4 The SAS/SATA switch in the HDSAM-A

3.1 Removing the faulty hard disk drives


1

Remove the AMC from the AMC bay.


Follow the instructions in section Replacing an AMC.

Place the AMC so that the faulty hard disk drive side is facing down. Unscrew
the four screws on the metal bracket of the AMC module, then turn the module
over.

Disconnect the faulty hard disk drive.


Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).
Figure 5 The hard disk drive on the hard disk drive carrier AMC

Repeat steps 1 to 3 for removing the other hard disk drive.

3.2 Installing the new hard disk drives


1

Connect the new hard disk drive to the SAS connector of HDSAM-A.
Connect the new hard disk drive to the SAS connector in the HDSAM-A by pushing it gently (from left to right in the following figure).
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC (callouts: hard disk drive, SAS connector, inserted screw)

Turn the AMC over and attach the new hard disk drive to the AMC with four
screws.
Tighten the screws so that their heads are in line with the metal bracket.

Install the AMC module back into the AMC bay.


Follow the instructions in section Replacing an AMC.

Check the embedded software version on the new hard disk (HDSAM-A).
Use the following command:
show sw-manage embedded-sw version all

Upgrade the embedded software version.


If there are newer embedded software versions, upgrade the embedded software. For instructions, see Upgrading Embedded Software.

Repeat steps 1 to 5 to install the hard disk drive on the other CFPU node.

Perform the full restoration for the system.


For instructions, see Commissioning mcRNC.

4 Replacing an AMC
Purpose
You may need to replace an AMC if it is faulty or if it needs to be replaced due to configuration changes, extensions or servicing.

When sending a faulty hard disk drive AMC to be replaced, remember to remove the hard disk drive.
Before you start

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.

4.1 Removing an AMC


1

Gently pull the hot swap handle on the front panel of the AMC.
Do not pull the handle out all the way yet. Pulling the handle notifies the hardware management system that you are going to remove the AMC and tells it to finish all processes.
The hot swap LED starts flashing.
Figure 7 Pulling the hot swap handle of an AMC

Wait until the hot swap LED turns solid blue.
This may take a few seconds.

Pull the hot swap handle again more firmly and slide the AMC out of the bay.
Figure 8 Removing an AMC from the BCN module

If you are not installing another AMC immediately, install an AMC filler into the empty AMC bay.
This is to ensure adequate cooling and a proper EMC shield in the module.

4.2 Installing an AMC


1

Check that the EMC gasket is correctly in place and that its contacts are clean.

Insert the AMC into the bay, sliding it along the guide rails as shown in the
figure below.
Make sure that the AMC is firmly seated in the module's connectors.
Figure 9 Inserting an AMC into the BCN module

Press the hot swap handle firmly.


Wait until the blue hot swap LED turns off and the power LED turns solid green.
Figure 10 Pressing the hot swap handle

If hard disk cross connecting is used, the hard disk AMC can only be placed in AMC bay 1.

5 Replacing a fan module


Summary
The fan modules are located at the rear of the BCN module.
BCN fan modules

Before you start

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The fan module can be replaced while the BCN is powered on. Only one fan module can be replaced at a time.
Prepare the spare fan unit for replacement beforehand. After a fan is removed from the BCN module, the system starts to heat up very quickly. Proceed immediately with the new fan installation. The following procedure applies to all three fan modules of the BCN module.

5.1 Removing a fan module


5.1.1 Steps
1

Unscrew the two thumbscrews attaching the fan module to the BCN.
The Phillips screws are built into the fan module and can be loosened either by hand or with a screwdriver.

Pull the fan module out from the BCN module.

5.2 Installing a fan module

5.2.1 Steps
1

Insert the fan module into its slot at the rear side of the BCN module.

Tighten the fan module's thumbscrews.
The Phillips screws are built into the fan module and can be tightened either by hand or with a screwdriver.


6 Replacing an add-in card


Before you start
Power off the BCN module before removing or installing an add-in card.

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.

6.1 Removing an add-in card


6.1.1 Steps
1

Option Description
If the BCN module is installed in the cabinet,
Then
a) Gracefully shut down the BCN module.

1. Identify the chassis where the plug-in unit to be replaced is located.
In this section, the chassis refers to chassis-2.
2. Check all the nodes that are running in the chassis where the plug-in unit is locate
To check all the running nodes present in the chassis to be replaced, enter the fol
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list
cabinet-1 : unit            /cabinet-1
chassis-1 : unit            /cabinet-1/chassis-1
chassis-2 : unit            /cabinet-1/chassis-2
LMP-1-1-1 : node available  /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available  /cabinet-1/chassis-2/piu-1
CFPU-0    : node available  /cabinet-1/chassis-1/piu-1/addin-1/CPU-1
CSPU-0    : node available  /cabinet-1/chassis-1/piu-1/addin-2/CPU-1
USPU-0    : node available  /cabinet-1/chassis-1/piu-1/addin-3/CPU-1
EIPU-0    : node available  /cabinet-1/chassis-1/piu-1/addin-4/CPU-1
CSPU-2    : node available  /cabinet-1/chassis-1/piu-1/addin-5/CPU-1
USPU-2    : node available  /cabinet-1/chassis-1/piu-1/addin-6/CPU-1
USPU-4    : node available  /cabinet-1/chassis-1/piu-1/addin-7/CPU-1
EIPU-2    : node available  /cabinet-1/chassis-1/piu-1/addin-8/CPU-1
USSR-0    : node available  /cabinet-1/chassis-1/piu-1/addin-1/CPU-1
CSUP-0    : node available  /cabinet-1/chassis-1/piu-1/addin-2/CPU-1
USUP-0    : node available  /cabinet-1/chassis-1/piu-1/addin-3/CPU-1
EITP-0    : node available  /cabinet-1/chassis-1/piu-1/addin-4/CPU-1
CSUP-2    : node available  /cabinet-1/chassis-1/piu-1/addin-5/CPU-1
USUP-2    : node available  /cabinet-1/chassis-1/piu-1/addin-6/CPU-1
USUP-4    : node available  /cabinet-1/chassis-1/piu-1/addin-7/CPU-1
EITP-2    : node available  /cabinet-1/chassis-1/piu-1/addin-8/CPU-1
CFPU-1    : node available  /cabinet-1/chassis-2/piu-1/addin-1/CPU-1
CSPU-1    : node available  /cabinet-1/chassis-2/piu-1/addin-2/CPU-1
USPU-1    : node available  /cabinet-1/chassis-2/piu-1/addin-3/CPU-1
EIPU-1    : node available  /cabinet-1/chassis-2/piu-1/addin-4/CPU-1
CSPU-3    : node available  /cabinet-1/chassis-2/piu-1/addin-5/CPU-1
USPU-3    : node available  /cabinet-1/chassis-2/piu-1/addin-6/CP
USPU-5    : node available  /cabinet-1/chassis-2/piu-1/addin-7/CP
EIPU-3    : node available  /cabinet-1/chassis-2/piu-1/addin-8/CP
USSR-1    : node available  /cabinet-1/chassis-2/piu-1/addin-1/CP
CSUP-1    : node available  /cabinet-1/chassis-2/piu-1/addin-2/CP
USUP-1    : node available  /cabinet-1/chassis-2/piu-1/addin-3/CP
EITP-1    : node available  /cabinet-1/chassis-2/piu-1/addin-4/CP
CSUP-3    : node available  /cabinet-1/chassis-2/piu-1/addin-5/CP
USUP-3    : node available  /cabinet-1/chassis-2/piu-1/addin-6/CP
USUP-5    : node available  /cabinet-1/chassis-2/piu-1/addin-7/CP
EITP-3    : node available  /cabinet-1/chassis-2/piu-1/addin-8/CP
cluster   : cluster available

The output provides information that nodes CFPU-1, CSPU-1, USPU-1, EIPU present in the chassis to be replaced.
3. Check if the current SCLI session is running on a node located in the chassis.
If the SCLI session is running on a node (node name identified by the prompt)
connectivity to the cluster during chassis replacement. To perform a switchove
set has switchover force managed-object /SSH

The SSH connection breaks when the switchover command is exec

4. Disable cluster manager nodes located in the chassis.
To identify the nodes configured as a cluster manager, enter the following com
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)

The output indicates that /ClusterHA recovery group has recovery units for C
From the output of step 4, it is observed that one cluster manager node (CFPU

a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 node
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU
b) Check CFPU-1 node where CMF was disabled has CMF-DISABLED status
command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5

5. Lock all the managed objects in the chassis.
To lock all the managed objects, enter the following command:
set has lock managed-object <mo-name1> <mo-name2>...

The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following commands:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1
If the nodes are successfully locked, the following output is displayed:

root@CFPU-0 [RNC-1002] > set has lock managed-object /CFPU-1 /CSPU-1 /USPU
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)

6. Power off all the managed objects in the chassis.
To power off the managed objects, enter the following command:
set has power off managed-object <mo-name1> <mo-name2>...
Example
To power off all the managed nodes in chassis-2, enter the following commands:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1 /EI
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully

7. Verify all the managed objects in the chassis to be replaced are powered off.
To verify the availability of a managed node, enter the following command:
show has state availability managed-object <mo-name>
Example
To verify that the availability status of all managed nodes in chassis-2 are availa
show has state availability managed-object /CFPU-1 /CSPU-1 \
The following output must be displayed:
OBJECT   AVAILABILITY
/CFPU-1  POWEROFF
/CSPU-1  POWEROFF
/CSPU-3  POWEROFF
/EIPU-1  POWEROFF
/EIPU-3  POWEROFF
/USPU-1  POWEROFF
/USPU-3  POWEROFF
/USPU-5  POWEROFF
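The power-off verification can be double-checked with a small script once the availability table has been saved as text. This is a sketch under assumptions: the function name is hypothetical, and the input is expected to have one OBJECT and one AVAILABILITY value per line, as in the expected output of the verification command.

```python
def not_powered_off(table_text: str, expected: set[str]) -> set[str]:
    """Return the expected managed objects that are missing or not POWEROFF."""
    states: dict[str, str] = {}
    for line in table_text.splitlines():
        parts = line.split()
        # Data rows look like "/CFPU-1 POWEROFF"; the header row is skipped.
        if len(parts) == 2 and parts[0].startswith("/"):
            states[parts[0]] = parts[1]
    return {mo for mo in expected if states.get(mo) != "POWEROFF"}


if __name__ == "__main__":
    output = """OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
/CSPU-3 POWEROFF
/EIPU-1 POWEROFF
/EIPU-3 POWEROFF
/USPU-1 POWEROFF
/USPU-3 POWEROFF
/USPU-5 POWEROFF
"""
    managed = {"/CFPU-1", "/CSPU-1", "/CSPU-3", "/EIPU-1",
               "/EIPU-3", "/USPU-1", "/USPU-3", "/USPU-5"}
    print(sorted(not_powered_off(output, managed)))
```

An empty result means that every listed managed object reports POWEROFF.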

b) Disconnect the power feed cables.
c) Disconnect the network cables from the transceivers on the front side of the modu
d) Disconnect the BCN grounding cable.
e) Keep the network cables attached to the front cable tray of the BCN module.
f) Uninstall the cable tray with attached network cables from the BCN module.
The cable tray is uninstalled by unscrewing the two thumbscrews fixing the cable t
mounting flanges to the cabinet might help.
For more information about detaching the cable tray, refer to the document Installi
g)
h) Move the cable tray with attached network cables under the module, so the modul

Unscrew the two thumbscrews securing the top cover of the BCN module.
The screws are located at the rear side of the module as shown in the figure below.
The Phillips screws are built into the top cover of the BCN module and can be loosened either by hand or with a screwdriver.
Figure 11 BCN top cover screws

Option Description
If the BCN module is installed in the cabinet,
Then
a) Pull the module out of the cabinet, until it locks into the outmost position.
b)

Slide the top cover of the module towards the rear side until it stops. Lift the top cover upwards.
Figure 12 Removing the BCN top cover

Unscrew the two thumbscrews securing the add-in card to the rails inside the
BCN module.
Figure 13 BCN add-in card screws

Slide the add-in card upwards to remove it from the BCN module.
Figure 14 Pulling an add-in card out from the BCN module

6.2 Installing an add-in card


6.2.1 Steps
1

Slide the add-in card into the rails inside the BCN module until the pins of the
card fall into connectors of the main board.
Figure 15 Inserting an add-in card into BCN module

Secure the add-in card to the rails with built-in thumbscrews.


The Phillips screws are built into the add-in card and can be tightened either by hand or with a screwdriver.
Figure 16 BCN add-in card screws

Place the BCN module's cover on the top of the module, leaving a small gap between the top cover and the front edge of the module.
Figure 17 Installing BCN top cover

Slide the top cover to the front side of the module, until it falls into place.

Option Description
If the BCN module is installed in the cabinet,
Then
a)
b) Push the module back into the cabinet, until it locks into position.
Pull the green latches on the inner sliding rails towards you and slide the BCN module into the cabinet.

Tighten the thumbscrews of the top cover.

Option Description
If the BCN module is installed in the cabinet,
Then
a) Install the cable tray with attached network cables back to the BCN module.
For more information about the cable tray installation, check the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e)
f) Power on the BCN module.

7 Replacing a power distribution unit


Purpose
If the power distribution unit (PDU) is faulty, you must replace it with a new one.
Figure 18 Power distribution units in the cabinet (front view)

Before you start


Make sure you have a digital multimeter or voltage meter available.

Danger of hazardous voltages and electric shock!
Before connecting or removing any power supply cables to or from the power distribution unit, make sure that both site power feeds to the power distribution unit are off, the circuit breakers on the front panel of the power distribution unit are in the OFF position, and the equipment is properly earthed (grounded).
Danger of hazardous voltages and electric shock!
Make sure your hands are dry and remove any metal objects such as rings before touching the power supply equipment.
Risk of personal injury.
Observe the given torque ranges at all times. Incorrect torque can result in damage to equipment, unreliability, and fire hazards due to excessive power dissipation and high temperature of materials.

7.1 Removing a power distribution unit (PDU)


7.1.1 Steps
1

Make sure that the redundant PDU is functional.

Switch off the circuit breakers on the PDU you are going to remove.

Check the PDU input feeds with a digital multimeter to ensure there are no
voltages in the cables.

Disconnect all cables from the PDU.


a) Disconnect the four power feed cables from the PDU.
b) Disconnect the CGNDB grounding cable from the PDU.
c) Disconnect the eight PSU input feeds from the PDU.

Unscrew the four fixing screws attaching the PDU to the cabinet.
Figure 19 Replacing a PDU (front view)

Remove the PDU from the cabinet.

7.2 Installing a power distribution unit (PDU)


7.2.1 Steps
1

Insert the PDU into the cabinet and align the holes of its mounting ear with the
cabinet mounting rail.

Attach the PDU to the cabinet with four M6x12 screws.


Figure 20 Installing PDU to the cabinet (front view)

Connect the PDU grounding cable (CGNDB) to the PDU.


Figure 21 PDU grounding cable (rear view)

Check the PDU input feeds with a digital multimeter to ensure there are no
voltages in the cables.

Connect the site power supply cables to the PDU (for DC power supply only).

Connect the site power supply cables to the PDU (for AC power supply only).

Connect the eight PSU input feeds to the PDU.

Switch on the site power supply to the PDU.

Switch on the circuit breakers on the PDU.


8 Replacing a power supply unit


Before you start

Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.

8.1 Removing a power supply unit


8.1.1 Steps
1

Option Description
If there is a PDU used with the BCN module,
Then
Switch off the circuit breaker on the PDU for the power supply unit to be replaced.

Unplug the power cable connected to the power supply unit.

Unscrew the two thumbscrews attaching the power supply unit to the BCN.
The Phillips screws are built into the power supply unit and can be loosened either by hand or with a screwdriver.

Pull the power supply unit out from the BCN module.
Removing an AC PSU from the BCN module
Removing a DC PSU from the BCN module

8.2 Installing a power supply unit


8.2.1 Steps
1

Insert the power supply unit to its slot at the rear side of the BCN module so
the screws built into the unit are on the right-hand side.

Tighten the unit's thumbscrews.
The Phillips screws are built into the power supply unit and can be tightened either by hand or with a screwdriver.

Plug the power cable into the power supply unit.

Attach the cable clamp to the cable.

Option Description
If there is a PDU used with the BCN module,
Then
Switch on the circuit breaker on the PDU for the power supply unit that was replaced.

9 Replacing the air filter


Purpose
Inspect the air filter regularly. To prevent dust from accumulating inside the equipment, the filter element should be replaced twice a year.

Steps
1

Unscrew the two thumbscrews attaching the air filter cover to the BCN module.
Figure 22 Unscrewing the two thumbscrews

Open the air filter cover and pull out the air filter.
Figure 23 Opening the air filter cover and pulling out the air filter

Push the new air filter into the guide rails on both sides of the air filter cover.

Push the air filter cover back and fasten the two thumbscrews.

Record the date of the air filter change.

10 Dealing with sensor alarms


Symptoms
An alarm about the sensor value of a Field Replacement Unit (FRU) is received.
The following is an example of the alarm:
Alarm ID: 2813
Specific problem: 70307 - VOLTAGE OUT OF LIMIT
Managed object: fshwModuleId=addin-5,fshwPIUId=piu1,fshwEquipmentHolderId=chassis-2,fshwEquipmentHolderId=cabinet-1,
fsFragmentId=HW,fsClusterId=ClusterRoot
Severity: 2 (critical)
Cleared: no
Clearing: automatic
Acknowledged: no
Ack. user ID: N/A
Ack. time: N/A
Alarm time: 2012-03-12 09:08:29:940 EET
Event type: x5 (equipment)
Application: fshaProcessInstanceName=HPIMonitor,fshaRecoveryUnitName=FSHPIMonitorServer,fsipHostName=CFPU0,fsFragmentId=Nodes,fsFragmentId=HA,fsClusterId=ClusterRoot
IAppl Addl. Info: Unit={BCNOC-A} Position=/chassis-2/slot-5
Sensor={number=218,Name=VDD_QLM3}
Appl. Addl. Info: 0.044
Notification ID: 8422
Extended event type : x1 (raise)
Control indicator: 7 (full visible)
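The Unit, Position, and Sensor values in an alarm like the example above can be pulled out with a short script, which helps when many alarms have to be triaged. The following is a sketch only. The function name and the returned keys are hypothetical, and the patterns assume the field layout of the example alarm.

```python
import re


def alarm_sensor_info(alarm_text: str) -> dict:
    """Extract the unit, position, and sensor details from an alarm printout."""
    unit = re.search(r"Unit=\{([^}]*)\}", alarm_text)
    position = re.search(r"Position=(\S+)", alarm_text)
    sensor = re.search(r"Sensor=\{number=(\d+),Name=([^}]*)\}", alarm_text)
    return {
        "unit": unit.group(1) if unit else None,
        "position": position.group(1) if position else None,
        "sensor_number": int(sensor.group(1)) if sensor else None,
        "sensor_name": sensor.group(2) if sensor else None,
    }


if __name__ == "__main__":
    example = (
        "IAppl Addl. Info: Unit={BCNOC-A} Position=/chassis-2/slot-5\n"
        "Sensor={number=218,Name=VDD_QLM3}\n"
    )
    print(alarm_sensor_info(example))
```

Fields that are not found are returned as None, so a partially captured alarm does not raise an error.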

Recovery procedures
1. Determine the sensor name from the Sensor field of the IAppl Addl. Info section of the alarm.
2. Determine the affected FRU from the Position field of the IAppl Addl. Info section of the alarm.
3. Check the sensor data of the FRU in trouble with the help of the sensor name and FRU name.
BCN includes several sensors that report on hardware conditions. Many of the sensor readings can be used to diagnose the hardware fault.
Follow the steps below to check the sensor data of the FRU in trouble:
a) Check the LMP version.
Issue the following command to show the LMP version:
sw_fw_versioninfo
Example:
root@LMP-1-2-1:~# sw_fw_versioninfo
Active U-Boot Version 5.3.0 (in flash 0)
Backup U-Boot Version 5.3.0 (in flash 1)
LMP Version 5.3.0
PCB Version A104-3
LED CPLD Version 05
PCI-LPC bridge XP2 Version 05
VCMC Version 5.3.0


PWR1014 Version 0007


FRUD Version 5.3.0
Part Number C111721.B3B
Add-in Card 1
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 2
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 3
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 4
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 5
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 6
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 7
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

Add-in Card 8
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3

AMC 1
MMC Version 1.10
Part Number C110598.B3A
hdsam-a_ad_frud Version 01.10.0000

AMC 2
PSU info 0
Board Mfg            : EMERSON
Board Product        : BAFE-B
Board Serial         : TR120201616
Board Part Number    : C112156.C1A
Board Extra          : 1f01
Product Manufacturer : EMERSON
Product Name         : DS1200-3-007
Product Part Number  : PSU
Product Version      : 04
Product Serial       : I510JS000H04P
PSU info 1
Board Mfg            : EMERSON
Board Product        : BAFE-B
Board Serial         : TR120201617
Board Part Number    : C112156.C1A
Board Extra          : 1f01
Product Manufacturer : EMERSON
Product Name         : DS1200-3-007
Product Part Number  : PSU
Product Version      : 04
Product Serial       : I510JS000J04P

b) Check sensors.
1. List the hardware sensors.
Issue the following command to list all the hardware sensors:
mch_cli ShowSensor

This will list all the sensors, with the Logical Unit Number (LUN) and
sensor addresses, attached to a hardware unit.
Example:
root@LMP-1-2-1:~# mch_cli ShowSensor
Entity: Unknown
Hot Swap PSU 1     0x00 0x43
PSU1 IN_Curr       0x00 0x2c
PSU1 Fan 1         0x00 0x28
PSU1 Temp 2        0x00 0x26
PSU1 Temp 1        0x00 0x24
PSU1 Status        0x00 0x22
PSU1 OUT_Curr      0x00 0x20
PSU1 OUT_3V3       0x00 0x1e
PSU1 OUT_12V       0x00 0x1c
PSU1 INPUT         0x00 0x1a
Entity: BAFE-B
Hot Swap PSU 2     0x00 0x44
PSU2 IN_Curr       0x00 0x2d
PSU2 Fan 1         0x00 0x29
PSU2 Temp 2        0x00 0x27
PSU2 Temp 1        0x00 0x25
PSU2 Status        0x00 0x23
PSU2 OUT_Curr      0x00 0x21
PSU2 OUT_3V3       0x00 0x1f
PSU2 OUT_12V       0x00 0x1d
PSU2 INPUT         0x00 0x1b
Entity: BMFU-A
Hot Swap CU 1      0x00 0x45
Fan 2              0x00 0x0a
Fan 1              0x00 0x09
Entity: BMFU-A
Hot Swap CU 2      0x00 0x46
Fan 4              0x00 0x0c
Fan 3              0x00 0x0b
Entity: BAFU-A
Hot Swap CU 3      0x00 0x47
Fan 6              0x00 0x0e
Fan 5              0x00 0x0d
Entity: BCNMB-A
POST Error         0x00 0x48
LMP Reset          0x00 0x3d
SEL status         0x00 0x3c
BMC Watchdog       0x00 0x3b
CLOCK_IRQ          0x00 0x30
Reset Button       0x00 0x2e
VCCA               0x00 0x19
1.0V               0x00 0x18
1.25V_GE           0x00 0x17
1.25V_XG           0x00 0x16
VCC3               0x00 0x15
VCC5               0x00 0x14
12V                0x00 0x13
0.9V               0x00 0x12
1.8V               0x00 0x11
1.1V               0x00 0x10
3.3SB              0x00 0x0f
Inlet3 Temp        0x00 0x08
Inlet2 Temp        0x00 0x07
Outlet Temp        0x00 0x06
Inlet1 Temp        0x00 0x05
BCM56820 Temp      0x00 0x04
BCM56512 Temp      0x00 0x03
LMP Temp           0x00 0x02
PEX Temp           0x00 0x01
InterSrc NewSel    0x00 0xf5
InterSrc Loss      0x00 0xf4
Sync2 NewSel       0x00 0xf3
Sync2 Loss         0x00 0xf2
Sync1 NewSel       0x00 0xf1
Sync1 Loss         0x00 0xf0
External alarm     0x00 0xfe
Entity: Unknown AMC at 00
Hot Swap AMC 1     0x00 0x31
Entity: Unknown AMC at 00
Hot Swap AMC 2     0x00 0x32
Entity: Unknown AMC at 00
Hot Swap AMC 3     0x00 0x33
Entity: Unknown AMC at 00
Hot Swap AMC 4     0x00 0x34
Entity: Unknown AMC at 00
Hot Swap AMC 5     0x00 0x35
Entity: Unknown AMC at 00
Hot Swap AMC 6     0x00 0x36
Entity: Unknown AMC at 00
Hot Swap AMC 7     0x00 0x37
Entity: Unknown AMC at 00
Hot Swap AMC 8     0x00 0x38
Entity: Unknown AMC at 00
Hot Swap AMC 9     0x00 0x39
Entity: Unknown AMC at 00
Hot Swap AMC 10    0x00 0x3a
Entity: CPU 1
RESET_TYPE         0x00 0x5d
BOOT               0x00 0x5c
BOOT_ERROR         0x00 0x5b
VDD_QLM3           0x00 0x5a
VDD_QLM2           0x00 0x59
VDD_QLM1           0x00 0x58
VDD_QLM0           0x00 0x57
VDD_VTT0           0x00 0x56
DDR_VDD            0x00 0x55
VDD_OCORE          0x00 0x54
MON_3VSB           0x00 0x53
MON_12V            0x00 0x52
Tmp421 Temp        0x00 0x51
BMC Watchdog       0x00 0x50
Entity: CPU 2
RESET_TYPE         0x00 0x7d
BOOT               0x00 0x7c
BOOT_ERROR         0x00 0x7b
VDD_QLM3           0x00 0x7a
VDD_QLM2           0x00 0x79
VDD_QLM1           0x00 0x78
VDD_QLM0           0x00 0x77
VDD_VTT0           0x00 0x76
DDR_VDD            0x00 0x75
VDD_OCORE          0x00 0x74
MON_3VSB           0x00 0x73
MON_12V            0x00 0x72
Tmp421 Temp        0x00 0x71
BMC Watchdog       0x00 0x70
Entity: CPU 3
RESET_TYPE         0x00 0x9d
BOOT               0x00 0x9c
BOOT_ERROR         0x00 0x9b
VDD_QLM3           0x00 0x9a
VDD_QLM2           0x00 0x99
VDD_QLM1           0x00 0x98
VDD_QLM0           0x00 0x97
VDD_VTT0           0x00 0x96
DDR_VDD            0x00 0x95
VDD_OCORE          0x00 0x94
MON_3VSB           0x00 0x93
MON_12V            0x00 0x92
Tmp421 Temp        0x00 0x91
BMC Watchdog       0x00 0x90
Entity: CPU 4
RESET_TYPE         0x00 0xbd
BOOT               0x00 0xbc
BOOT_ERROR         0x00 0xbb
VDD_QLM3           0x00 0xba
VDD_QLM2           0x00 0xb9
VDD_QLM1           0x00 0xb8
VDD_QLM0           0x00 0xb7
VDD_VTT0           0x00 0xb6
DDR_VDD            0x00 0xb5
VDD_OCORE          0x00 0xb4
MON_3VSB           0x00 0xb3
MON_12V            0x00 0xb2
Tmp421 Temp        0x00 0xb1
BMC Watchdog       0x00 0xb0
Entity: CPU 5
RESET_TYPE         0x00 0xdd
BOOT               0x00 0xdc
BOOT_ERROR         0x00 0xdb
VDD_QLM3           0x00 0xda
VDD_QLM2           0x00 0xd9
VDD_QLM1           0x00 0xd8
VDD_QLM0           0x00 0xd7
VDD_VTT0           0x00 0xd6
DDR_VDD            0x00 0xd5
VDD_OCORE          0x00 0xd4
MON_3VSB           0x00 0xd3
MON_12V            0x00 0xd2
Tmp421 Temp        0x00 0xd1
BMC Watchdog       0x00 0xd0
Entity: CPU 6
RESET_TYPE         0x01 0x0d
BOOT               0x01 0x0c
BOOT_ERROR         0x01 0x0b
VDD_QLM3           0x01 0x0a
VDD_QLM2           0x01 0x09
VDD_QLM1           0x01 0x08
VDD_QLM0           0x01 0x07
VDD_VTT0           0x01 0x06
DDR_VDD            0x01 0x05
VDD_OCORE          0x01 0x04
MON_3VSB           0x01 0x03
MON_12V            0x01 0x02
Tmp421 Temp        0x01 0x01
BMC Watchdog       0x01 0x00
Entity: CPU 7
RESET_TYPE         0x01 0x2d
BOOT               0x01 0x2c
BOOT_ERROR         0x01 0x2b
VDD_QLM3           0x01 0x2a
VDD_QLM2           0x01 0x29
VDD_QLM1           0x01 0x28
VDD_QLM0           0x01 0x27
VDD_VTT0           0x01 0x26
DDR_VDD            0x01 0x25
VDD_OCORE          0x01 0x24
MON_3VSB           0x01 0x23
MON_12V            0x01 0x22
Tmp421 Temp        0x01 0x21
BMC Watchdog       0x01 0x20
Entity: CPU 8
RESET_TYPE         0x01 0x4d
BOOT               0x01 0x4c
BOOT_ERROR         0x01 0x4b
VDD_QLM3           0x01 0x4a
VDD_QLM2           0x01 0x49
VDD_QLM1           0x01 0x48
VDD_QLM0           0x01 0x47
VDD_VTT0           0x01 0x46
DDR_VDD            0x01 0x45
VDD_OCORE          0x01 0x44
MON_3VSB           0x01 0x43
MON_12V            0x01 0x42
Tmp421 Temp        0x01 0x41
BMC Watchdog       0x01 0x40
Entity: AMC 1
Version change     0x01 0x68
DC/DC Failure      0x01 0x67
MMC Temp           0x01 0x66
HDD Temp           0x01 0x65
+5V Backend        0x01 0x64
+12V Backend       0x01 0x63
+3.3V MP           0x01 0x62
+12V Payload       0x01 0x61
Hot Swap           0x01 0x60


2. Get the hardware sensor threshold.
Find the sensor address of the sensor you want and issue the following
command to get the sensor threshold of the hardware unit:
mch_cli GetSensorThreshold <LUN> <Sensor addr>
Example:
root@LMP-1-2-1:~# mch_cli GetSensorThreshold 0x00 0xda
sensor VDD_QLM3 (218)
Lower Non-Critical : NA
Lower Critical : NA
Lower Non-Recoverable : 1.1720
Upper Non-Critical : NA
Upper Critical : NA
Upper Non-Recoverable : 1.2920
support thresholds: lnr unr

3. Get the hardware sensor data.
Issue the following command to get the sensor data of the hardware unit you
want:
mch_cli ReadSensor <LUN> <Sensor addr>
Example:
root@LMP-1-2-1:~# mch_cli ReadSensor 0x00 0xda
Sensor: VDD_QLM3
Lun: 0x00
Number: 0xda
Value: 1.236000

4. If the sensor value is not within the lower and upper threshold values, try replacing
the FRU in trouble to correct the sensor value. See chapter Replacing hardware
units for details.
5. If the problem persists even after following the instructions, contact your local
Nokia Solutions and Networks representative with your observations on the sensor
values.
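
The comparison in step 4 can be sketched in Python as follows. This is an illustrative helper, not part of the product: the function names are hypothetical, and the sample texts are copied from the examples in this chapter with each field on one line. Deriving the sensor address from the alarm's sensor number (218 becomes 0xda) matches the example above, but is an assumption that may not hold for sensors on LUN 0x01; take the LUN and address from the mch_cli ShowSensor listing in any case.

```python
import re

# Sample texts copied from the examples in this chapter (one field per line).
ALARM_SENSOR_FIELD = "Sensor={number=218,Name=VDD_QLM3}"
THRESHOLD_OUTPUT = """\
sensor VDD_QLM3 (218)
Lower Non-Critical : NA
Lower Critical : NA
Lower Non-Recoverable : 1.1720
Upper Non-Critical : NA
Upper Critical : NA
Upper Non-Recoverable : 1.2920
support thresholds: lnr unr
"""
READ_OUTPUT = """\
Sensor: VDD_QLM3
Lun: 0x00
Number: 0xda
Value: 1.236000
"""


def sensor_address(alarm_sensor_field):
    """Hex form of the sensor number in the alarm (assumption: LUN 0x00 sensor)."""
    match = re.search(r"number=(\d+)", alarm_sensor_field)
    if match is None:
        raise ValueError("no sensor number found in alarm field")
    return f"0x{int(match.group(1)):02x}"


def parse_thresholds(text):
    """Map each 'Lower ...' / 'Upper ...' threshold to a float, or None for NA."""
    thresholds = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if not sep or not name.startswith(("Lower", "Upper")):
            continue  # skips the sensor header and the 'support thresholds' line
        thresholds[name] = None if value == "NA" else float(value)
    return thresholds


def parse_reading(text):
    """Return the numeric 'Value:' field of the ReadSensor output."""
    match = re.search(r"^Value:\s*(\S+)", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no Value field found in sensor reading")
    return float(match.group(1))


def violated_thresholds(value, thresholds):
    """Names of the defined thresholds that the value falls outside of."""
    violated = []
    for name, limit in thresholds.items():
        if limit is None:
            continue
        if name.startswith("Lower") and value < limit:
            violated.append(name)
        elif name.startswith("Upper") and value > limit:
            violated.append(name)
    return violated


if __name__ == "__main__":
    limits = parse_thresholds(THRESHOLD_OUTPUT)
    reading = parse_reading(READ_OUTPUT)
    print(sensor_address(ALARM_SENSOR_FIELD), reading,
          violated_thresholds(reading, limits))
```

An empty list from violated_thresholds means the reading lies within all defined thresholds; any name in the list indicates a reading that, per step 4, points to replacing the FRU.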

Position of the PSUs and fan trays


The following picture shows the positions of the PSUs and fan trays.
Figure 24 Positions of the PSUs and fan trays

11 Communication between active and standby units in a BCN cluster fails
11.1 Description
A prolonged communication failure between the active and standby units in a box controller node
(BCN) cluster causes a split-brain situation. If the cluster internal
network connection between BCN modules fails, the cluster may get partitioned into two
independent parts, which attempt to provide the same services. As a result, the BCN
modules do not function properly. Possible causes for the problem are:

Improper cabling between the BCN boxes.
Tampering with the cables connecting the BCN boxes.
Incorrect switch configurations.
Malfunctioning hardware or embedded software.

11.2 Symptoms
Improper handling of the hardware might lead to a scenario where two isolated parts of
the BCN cluster are running and trying to provide the same services. In this split-brain
situation, the following problems might occur:

Storage resources replicated using Distributed Replicated Block Device (DRBD) get
updated independently on both sides.
As both CLA nodes run an independent instance of the cluster management
software, nodes may get reset continuously because they can communicate with only
one management software instance at a time.
External IP addresses are assigned to both units, which causes IP address
conflicts and various communication errors.
The CLA nodes might reboot continuously.

11.3 Recovery procedures


1. Power off all the BCN modules.

2. Power on one of the CLA node BCN modules.

3. Wait until the BCN module starts.
If no console connection is available, wait for 3 minutes.

4. Power on the remaining BCN modules.
If the CLA node is up and running in the powered-on BCN module, power on the
remaining BCN modules. This overwrites the disk devices of the units activated last
with the copies from the unit that was started first.

5. Verify that all the nodes and services are up and running.
To verify that the split-brain situation is over and all the services are up and running,
enter the following command:
show has summary

The example displays a 2 BCN configuration. If there are more BCNs, the
display is different.
_nokadmin@CFPU-0 [RNC-37] > show has summary
CFPU-0@RNC-37
Node status
Nodes in configuration : 16
Unlocked nodes : 16

RG status
RGs in configuration : 68
Unlocked RGs : 68

RU status
RUs in configuration : 274
Unlocked RUs : 274

Process status
Processes in configuration : 1482
Unlocked processes : 1482

[2013-05-09 10:07:57 +0200]

If there are differences between Nodes in configuration and Unlocked nodes, then split-brain is still active.

There might still be differences in Node status even if split-brain is over. Check
whether any nodes are locked. If so, unlock them and check again.
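
The count comparison described above can be sketched in Python. This is a minimal, hypothetical helper rather than a product tool: it assumes the show has summary output has been captured as text with each label and its count on one line (for example "Nodes in configuration : 16"), and the function names are invented for illustration.

```python
import re

# Sample text based on the 2 BCN example in this chapter; the
# one-line-per-field layout is an assumption about the captured output.
SUMMARY_OUTPUT = """\
CFPU-0@RNC-37
Node status
Nodes in configuration : 16
Unlocked nodes : 16
RG status
RGs in configuration : 68
Unlocked RGs : 68
RU status
RUs in configuration : 274
Unlocked RUs : 274
Process status
Processes in configuration : 1482
Unlocked processes : 1482
[2013-05-09 10:07:57 +0200]
"""


def parse_has_summary(text):
    """Return {kind: (configured, unlocked)} with kind in lower case."""
    configured, unlocked = {}, {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+) in configuration\s*:\s*(\d+)\s*$", line)
        if match:
            configured[match.group(1).lower()] = int(match.group(2))
            continue
        match = re.match(r"\s*Unlocked (\w+)\s*:\s*(\d+)\s*$", line)
        if match:
            unlocked[match.group(1).lower()] = int(match.group(2))
    # unlocked count is None when the matching 'Unlocked ...' line is missing
    return {kind: (count, unlocked.get(kind)) for kind, count in configured.items()}


def mismatched_kinds(summary):
    """Kinds (nodes, rgs, rus, processes) whose two counts differ."""
    return [kind for kind, (conf, unl) in summary.items() if conf != unl]


if __name__ == "__main__":
    summary = parse_has_summary(SUMMARY_OUTPUT)
    print(summary)
    print("mismatches:", mismatched_kinds(summary))
```

A mismatch for nodes only shows that the counts differ. As noted above, it can mean either that split-brain is still active or that some nodes are simply locked, so the result still needs to be checked manually.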
