
VCS Hardware Troubleshooting & Boot Process
6/17/2013
Vernon Depee
Objectives
• After this lesson, learners will be able to:
• Recall the most common forms of hardware failure on the Cisco VCS
• Determine if one of these hardware failures is taking place
• Recall the important steps of the boot process on the Cisco VCS and use this knowledge to troubleshoot issues with VCS booting
Phase 1 – VCS Hardware
• HDD
• NIC
• FANS
VCS Hard Drives
• sda – Primary Drive. If this fails, the VCS will not boot at all. This is mounted to the root and /tandberg.
• sdb – Secondary Drive. If this fails, the VCS will act “odd”. This is mounted to /mnt/harddisk.
SDA Failure:
• The VCS will never begin loading the OS. Something similar to this may be frozen on the console:
SDB failure:
• SDB can fail in several ways:
• 1.) When the system boots, sdb will never be detected
• 2.) sdb will go unreachable after the system has booted
• 3.) sdb may be detected during system boot, but bounce back and forth between operating and not operating properly
Testing sdb: ls
• Good:
    ls -l /dev/sd*
    brw-rw---- 1 root root 8, 0 2012-12-14 15:25 /dev/sda
    brw-rw---- 1 root root 8, 1 2012-12-14 15:25 /dev/sda1
    brw-rw---- 1 root root 8, 2 2012-12-14 15:25 /dev/sda2
    brw-rw---- 1 root root 8, 3 2012-12-14 15:25 /dev/sda3
    brw-rw---- 1 root root 8, 5 2012-12-14 15:25 /dev/sda5
    brw-rw---- 1 root root 8, 6 2012-12-14 15:25 /dev/sda6
    brw-rw---- 1 root root 8, 7 2012-12-14 15:25 /dev/sda7
    brw-rw---- 1 root root 8, 8 2012-12-14 15:25 /dev/sda8
    brw-rw---- 1 root root 8, 16 2012-12-14 15:25 /dev/sdb
    brw-rw---- 1 root root 8, 17 2012-12-14 15:25 /dev/sdb1
    brw-rw---- 1 root root 8, 18 2012-12-14 15:25 /dev/sdb2
Testing sdb: ls
• Bad:
    ls -l /dev/sd*
    brw-rw---- 1 root root 8, 0 2012-12-14 15:25 /dev/sda
    brw-rw---- 1 root root 8, 1 2012-12-14 15:25 /dev/sda1
    brw-rw---- 1 root root 8, 2 2012-12-14 15:25 /dev/sda2
    brw-rw---- 1 root root 8, 3 2012-12-14 15:25 /dev/sda3
    brw-rw---- 1 root root 8, 5 2012-12-14 15:25 /dev/sda5
    brw-rw---- 1 root root 8, 6 2012-12-14 15:25 /dev/sda6
    brw-rw---- 1 root root 8, 7 2012-12-14 15:25 /dev/sda7
    brw-rw---- 1 root root 8, 8 2012-12-14 15:25 /dev/sda8
• sdb was not detected by the VCS
Testing sdb: smartctl
• Good:
    ~ # smartctl --all /dev/sdb
    smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
    Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
    === START OF INFORMATION SECTION ===
    Model Family: Seagate Barracuda 7200.12 family
    Device Model: ST3250318AS
    Serial Number: 5VY11DNS
    Firmware Version: CC38
    User Capacity: 250,059,350,016 bytes
    Device is: In smartctl database [for details use: -P show]
    ATA Version is: 8
    ATA Standard is: ATA-8-ACS revision 4
    Local Time is: Fri Dec 14 15:28:01 2012 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
Testing sdb: smartctl
• Bad:
    ~ # smartctl --all /dev/sdb
    smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
    Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
    === START OF INFORMATION SECTION ===
    Model Family: Seagate Barracuda 7200.12 family
    Device Model: ST3250318AS
    Serial Number: 5VY11DNS
    Firmware Version: CC38
    User Capacity: 250,059,350,016 bytes
    Device is: In smartctl database [for details use: -P show]
    ATA Version is: 8
    ATA Standard is: ATA-8-ACS revision 4
    Local Time is: Fri Dec 14 15:28:01 2012 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: FAILED
Testing sdb: smartctl
• Bad:
    ~ # smartctl --all /dev/sdb
    smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
    Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
    Smartctl open device: /dev/sdb failed: No such device
• Basically, anything other than PASSED is bad.
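The "anything other than PASSED is bad" rule can be applied mechanically. The following is a minimal sketch, not something shipped on the VCS: the function name `smart_verdict` is hypothetical, and it only inspects `smartctl --all` text supplied on standard input.

```shell
# Hypothetical helper: reads `smartctl --all /dev/sdb` output on stdin and
# prints "good" only when the overall-health result is exactly PASSED.
smart_verdict() {
    # Keep the text after the colon on the overall-health line, if present.
    result=$(sed -n 's/^SMART overall-health self-assessment test result:[[:space:]]*//p' | head -n 1)
    if [ "$result" = "PASSED" ]; then
        echo good
    else
        # FAILED, a missing result line, or "No such device" all count as bad.
        echo bad
    fi
}
```

A possible use on a live system would be `smartctl --all /dev/sdb | smart_verdict`.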
Testing sdb: df
• Good:
    ~ # df | grep sdb
    /dev/sdb2 230741396 1691148 217329224 1% /mnt/harddisk
    ~ #
Testing sdb: df
• Bad:
    ~ # df | grep sdb
    ~ #
• sdb is not mounted
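The `ls` and `df` checks for sdb can be combined into two small predicates. This is an illustrative sketch only: the function names `sdb_detected` and `sdb_mounted` are hypothetical, and both read command output on standard input so they can be tried against saved text as well as a live box.

```shell
# Hypothetical helpers; both read command output on stdin.

# Succeeds if `ls -l /dev/sd*` output lists the whole-disk node /dev/sdb.
sdb_detected() {
    awk '$NF == "/dev/sdb" { found = 1 } END { exit !found }'
}

# Succeeds if `df` output shows an sdb partition mounted on /mnt/harddisk.
sdb_mounted() {
    awk '$1 ~ /^\/dev\/sdb/ && $NF == "/mnt/harddisk" { found = 1 } END { exit !found }'
}
```

For example, `ls -l /dev/sd* | sdb_detected || echo "sdb not detected"` and `df | sdb_mounted || echo "sdb not mounted"`.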
Testing Ethernet Ports
• Good:
    ~ # ifconfig -a | grep eth
    eth0 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B4
    eth1 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B5
    eth2 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B6
    eth3 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B7
Testing Ethernet Ports
• Bad:
    ~ # ifconfig -a | grep eth
    eth0 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B4
    eth1 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B6
    eth2 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B7
• All 4 Ethernet ports must show
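Counting the ports by eye is easy to get wrong when a missing NIC causes the remaining interfaces to be renumbered. A minimal sketch of an automated count follows; `check_eth_ports` is a hypothetical name, and the function only parses `ifconfig -a` text read from standard input.

```shell
# Hypothetical helper: counts the ethN lines in `ifconfig -a` output read
# from stdin and compares the count with the four ports a VCS should show.
check_eth_ports() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that.
    count=$(grep -c '^eth[0-9]' || true)
    if [ "$count" -eq 4 ]; then
        echo "ok: 4 ports"
    else
        echo "bad: $count of 4 ports"
    fi
}
```

On a live system this could be run as `ifconfig -a | check_eth_ports`.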
Serial Number Range for Potential NIC Issues
• If the serial number is in the range 52A19267-52A25510 for VCS and 55A00001-55A0001 for Conductor, the VCS is at risk for a failed NIC.
• Exact Serial Numbers can be seen on the Excel doc at http://172.18.105.107/patches/SN%20list%20of%20potential%20NIC%20issue_ver2.xlsx
Fan Issues
• A VCS can run fine with 2 failed fans
• Process RMA if requested by customer even if only one fan failure. Customer will get a refurbished VCS that may have more severe problems.
• VCS running X5 and X6 will report a fan alarm even if the fan is now operating at a proper speed. Alarms can be manually reset.
• Current VCS code will only raise fan alarm if 2 or more fans have failed.
VCS Generations
• 52A0 – Brage1
• 52A1, 52A2 – Brage2
• Brage1 is quite old and has reached the expected HDD life for sdb
Phase 2 – VCS Boot Process
• Grub
• Init scripts
VCS Boot Process - Grub
• The VCS boots off of grub on sda1. Grub will then have the VCS boot either off of sda5 (image1) or sda6 (image2).
• The active image is mounted onto /.
VCS Boot Process - inittab
• /etc/inittab is referenced. /etc/inittab calls /etc/init.d/rc with the current run level.
• /etc/init.d/rc has different groups of run levels
• 0: call /etc/rc.shutdown and halt system.
• 1-5: call /etc/rc.sysinit with the run level.
• 6: call /etc/rc.shutdown and reboot.
VCS Boot Process – rc.sysinit
• /etc/rc.sysinit calls all of the startup scripts in /etc/init.d/. Scripts starting with E are called first; after this, scripts starting with S are called. Scripts are called in numerical order.
• Example: E00bootlogd is called first and S99vmtoolsd is called last.
• /etc/init.d/ scripts are called with an argument of either start, stop, or restart.
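The ordering rule (E-scripts first, then S-scripts, each group in numerical order) can be reproduced from a directory listing. This is a sketch under stated assumptions, not the actual rc.sysinit logic: `boot_order` is a hypothetical name, and it assumes every script uses a two-digit prefix, so a plain byte-wise sort matches numerical order.

```shell
# Hypothetical helper: reads script names (one per line, e.g. from
# `ls /etc/init.d`) and prints them in the order described on the slide:
# E-scripts first, then S-scripts, each group sorted by its two-digit prefix.
boot_order() {
    names=$(cat)
    printf '%s\n' "$names" | grep '^E[0-9][0-9]' | LC_ALL=C sort
    printf '%s\n' "$names" | grep '^S[0-9][0-9]' | LC_ALL=C sort
}
```

Running `ls /etc/init.d | boot_order` on a VCS would then list the scripts in the order the slide describes.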
VCS Boot Process – Mounting of other Partitions
• /etc/init.d/E26mount reads /etc/partitions.conf to get the location of the ro and rw partitions.
• The script will then detect which partition is mounted as root and mount the appropriate rw partition on /tandberg.
• Ro partition 1 is /dev/sda5
• Rw partition 1 is /dev/sda7
• Ro partition 2 is /dev/sda6
• Rw partition 2 is /dev/sda8
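The ro/rw pairing can be written down as a simple lookup. The sketch below is illustrative and is not the E26mount script itself; `rw_partition_for` is a hypothetical name, and the device pairs are the ones listed on the slide.

```shell
# Hypothetical helper: maps the root (ro) device to the rw partition that
# ends up mounted on /tandberg, using the pairs listed on the slide.
rw_partition_for() {
    case "$1" in
        /dev/sda5) echo /dev/sda7 ;;   # image1
        /dev/sda6) echo /dev/sda8 ;;   # image2
        *) echo "unknown root device: $1" >&2; return 1 ;;
    esac
}
```

This gives a quick cross-check against `df` output: if root is /dev/sda5, /tandberg should be on /dev/sda7.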
VCS Boot Process – Clusterdb
• With /etc/init.d/S66clusterdb the VCS loads the cluster database, which stores the configuration of the VCS and can replicate the configuration to cluster peers (if there are any).
• If, for any reason, clusterdb fails to start, it can take several minutes for this process to report a failure. Since the VCS will not move forward in the boot process until it receives feedback from the current stage in the boot process, this may look like a frozen VCS.
VCS Boot Process - /tmp/hwfail
• The /etc/init.d/S75tandberg script, which launches the tandberg application, checks to see if a file named /tmp/hwfail exists. If it does, the tandberg application will not start and a message will be printed to the console saying that /tmp/hwfail exists and that the app will not start.
    # Don't start the image if the hardware is broken
    if [ -f /tmp/hwfail ]; then
        echo "/tmp/hwfail exists: TANDBERG application startup inhibited"
        do_log "Event=\"Hardware Failure\" Detail=\"Application startup inhibited\""
        exit 0
    fi
VCS Boot Process - /tmp/hwfail
• Luckily, if the /tmp/hwfail file exists, it will tell you why the file exists. The contents of this file can be read with the command
• cat /tmp/hwfail
• The output of the cat command should tell you what section of the hardware has failed.
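The flag-file check can be wrapped in a small function for use from a root shell. This is a minimal sketch, not part of the S75tandberg script: `report_hwfail` is a hypothetical name, and the optional path argument exists only so the function can be tried against a test file.

```shell
# Hypothetical helper: reports whether the hardware-failure flag file exists
# and, if so, prints its contents. The path defaults to /tmp/hwfail.
report_hwfail() {
    flag=${1:-/tmp/hwfail}
    if [ -f "$flag" ]; then
        echo "hwfail present:"
        cat "$flag"
        return 1
    fi
    echo "no hwfail flag"
}
```

A non-zero return means the flag file exists and the application startup would be inhibited.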
VCS Boot Process – Useful Services
• /etc/init.d/S10network brings up the network interfaces on the VCS.
• /etc/init.d/S11dnsmasq starts the DNS masquerade.
• /etc/init.d/S76opends calls OpenDS on TMSAgent systems.
• /etc/init.d/S80httpd calls apache.
• Go ahead and ls /etc/init.d to see a complete list of what’s going on here.
Live Lookup of Boot Process
The End!
• Questions?
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
All About Snapshots
Alan Ford
alanford@cisco.com
Objectives
• After this lesson learners will be able to:
• Collect system snapshots from a Cisco VCS
• Analyze key portions of a full system snapshot
Types of Snapshots
• Status snapshot
  - Configuration and Status XML
  - Generated from clusterdb: more complete than “xconf” and “xstat” output
  - Aside: note difference between clustering and clusterdb
  - Can also get from http://vcs/configuration.xml etc
• Logging snapshot
  - Last two instances of key logs
  - Rarely useful – diagnostics logs are typically more useful when replicating a problem
• Full snapshot
  - What you typically want – all logs stored, and other useful information
Layout of snapshot
• Most useful things are under mnt/harddisk/snapshot/plugins/
  - harddisklogs
  - oak_txas_files
  - oak_crashlogs
  - clusterdb
• Also of note:
  - tandberg/etc and tandberg/persistent: contain the contents of these configuration directories, so you can check for configuration errors
oak_txas_files
• XML representations of system configuration and status
  - Equivalent to doing xconf / xstat / etc
  - Includes additional config no longer present in xconf
• configuration.xml
  - System configuration. Includes version, option keys, etc.
• status.xml
  - System status. Of particular interest: ResourceUsage, Registrations
• history.xml
  - Recent detailed call and search history
oak_crashlogs (pre-X7.2) / crashlogs (7.2+)
• .tar.gz file of recent crashes
• Note any process on the box can crash – some serious, some not
  - VCS software (“app”) will automatically restart, but call and registration state will be lost
  - “net” is harmless
  - As are “linuxstatus.py” or “managementframework.py” (prior to X7.2)
  - “winbindd” could necessitate a restart
• Look inside crash XML dumps to see if they are repetitive
• Load into https://10.50.152.110 to see if they’re known
• Please check the output makes sense based on symptoms/logs!
• If in doubt, please do check with us!
clusterdb
• Textual dumps of the ClusterDB tables
• Those with a serial number suffix: sharded (peer-specific) records
  - i.e. records under complete control of that peer
• Non-sharded (global) tables
  - Any peer can edit
• Status is typically sharded
harddisklogs
• All logs in logs/ subdirectory
• Prior to X7: you need to untar the logs.tar.gz file too
• Logs are on a single partition on the VCS – not version-specific
• upgrade.log
  - Logs upgrades on the box. Allows correlation of upgrade transforms, reboots, service and load changes, etc.
• tandberg-webconfig.log
  - Logs all configuration changes. Very useful.
• apache*log
  - Logs web server accesses and errors, including TMS probes.
harddisklogs (2)
• sysinit.log
  - Very useful – logs every startup and shutdown
  - You can see uncontrolled shutdowns here (i.e. a startup followed by another startup) – box would have been hard power cycled
  - You can also see clusterdb, opends, etc failing to start.
• kernel
  - Output from the kernel. E.g. useful to look for hard disk errors.
• critical
  - Critical errors from other logs. Not actually particularly useful.
• daemon.log (previously samba.log)
  - Output from system daemons, notably ntpd (time synchronisation) and racoon (IPSec for clustering)
sysinit.log
• Clean reboot:
    Mon Nov 19 14:14:49 UTC 2012 system shutdown completed Linux csn-hen-vcsd1 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
    Mon Nov 19 09:15:02 EST 2012 system initialisation started Linux (none) 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
• Unclean reboot:
    Mon Nov 19 14:43:59 EST 2012 system initialisation complete
    Mon Nov 26 09:38:35 EST 2012 system initialisation started Linux (none) 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
• Restarts
    Sun Jun 5 07:36:24 GMT 2011 system restart complete
• Failed service startups:
    Sat Jun 4 23:10:44 PDT 2011 S66clusterdb startup Failed!
    Mon May 21 14:09:28 GMT 2012 S76opends startup Failed!
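The clean/unclean distinction in sysinit.log (a startup that is not preceded by a completed shutdown) lends itself to a quick scan. The following is a rough sketch, assuming the log phrases look like the samples on the slide; `count_unclean_boots` is a hypothetical name, and the very first startup in a log is not counted because nothing is known about what came before it.

```shell
# Hypothetical helper: reads sysinit.log text on stdin and counts startups
# that were not immediately preceded by a completed shutdown.
count_unclean_boots() {
    awk '
        /system shutdown completed/      { last = "shutdown"; next }
        /system initialisation started/  { if (last != "" && last != "shutdown") n++
                                           last = "start"; next }
        /system initialisation complete/ { last = "up"; next }
        END { print n + 0 }
    '
}
```

For example, `count_unclean_boots < sysinit.log` run inside the logs/ directory of an unpacked snapshot.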
kernel
• Hard disk errors:
    Oct 31 19:49:43 vcs01 kernel: ata4: lost interrupt (Status 0x50)
    Oct 31 19:49:43 vcs01 kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
    Oct 31 19:49:43 vcs01 kernel: ata4: SError: { RecovComm PHYRdyChg CommWake DevExch }
    Oct 31 19:49:43 vcs01 kernel: ata4.00: failed command: WRITE DMA
    Oct 31 19:49:43 vcs01 kernel: ata4.00: cmd ca/00:08:11:05:68/00:00:00:00:00/ea tag 0 dma 4096 out
    Oct 31 19:49:43 vcs01 kernel: res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
    Oct 31 19:49:43 vcs01 kernel: ata4.00: status: { DRDY }
    Oct 31 19:49:43 vcs01 kernel: ata4: hard resetting link
    Oct 31 19:49:43 vcs01 kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Oct 31 19:49:43 vcs01 kernel: ata4.00: configured for UDMA/133
    Oct 31 19:49:43 vcs01 kernel: ata4: EH complete
• Ethernet link down
    Aug 3 11:46:03 cisco kernel: e1000e: eth0 NIC Link is Down
• Probable memory issue
    Jun 1 02:48:10 vcsc kernel: EXT2-fs (loop18): error: unable to read superblock
    Jun 1 02:48:10 vcsc kernel: FAT: unable to read boot sector
    Jun 1 02:48:10 vcsc kernel: FAT: unable to read boot sector
    Jun 1 02:48:10 vcsc kernel: isofs_fill_super: bread failed, dev=loop18, iso_blknum=16, block=32
• Some kernel panics (attaching monitor is best)
harddisklogs - sensors
• Very, very useful. Data polled every 10 minutes.
• <date>Tue May 1 06:11:06 BST 2012</date>
• <df> = Disk Usage
    /dev/sda5    1004496   360156    593312  38% /
    /dev/ram0     193687    53898    129789  30% /var
    /dev/ram1    1470144    10316   1385128   1% /tmp
    /dev/sda7    1004496   510568    442900  54% /tandberg
    /dev/sdb2  230741396 17203248 201817124   8% /mnt/harddisk
• <inodes> = i-node usage
    /dev/sda5     63872  9330    54542  15% /
    /dev/ram0     50000    96    49904   1% /var
    /dev/ram1     93696  1763    91933   2% /tmp
    /dev/sda7     63872  3122    60750   5% /tandberg
    /dev/sdb2  29310976  2111 29308865   1% /mnt/harddisk
Full Disks?
• 100% df or inode usage can cause all sorts of odd behaviour including crashes, no web/CLI access, and odd failures
  - May see “no space left on device” in crashes
• Some common scenarios…
• Full df /tandberg?
  - Anything unexpected in /tandberg
  - Remember to always cd /mnt/harddisk before running tcpdumps!!
• Full df /mnt/harddisk?
  - Check out /mnt/harddisk/log for excessive OpenDS logs
• Full inodes in /tmp?
    <inodes>
    Filesystem Inodes IUsed IFree IUse% Mounted on
    /dev/ram1 93696 93696 0 100% /tmp
  - Probably OpenDS failing to start. Check sysinit.log and /tmp contents
  - Run tmsagent_destroy_and_purge_data (reboot first)
  - …And ideally, upgrade to TMS PE!
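Spotting a full filesystem in the `<df>` or `<inodes>` sections of a sensors log can be automated. This is an illustrative sketch only: `full_filesystems` is a hypothetical name, and it assumes the usual `df` layout in which the use percentage is the second-to-last column and the mount point is the last.

```shell
# Hypothetical helper: reads `df` or `df -i` output on stdin and prints the
# mount point of every filesystem whose use column has reached 100%.
full_filesystems() {
    awk 'NR > 1 && $(NF-1) ~ /^[0-9]+%$/ && $(NF-1) + 0 >= 100 { print $NF }'
}
```

Both `df | full_filesystems` and `df -i | full_filesystems` work, since the two outputs share the same column positions for the percentage and mount point.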
Hard Drive Failure
• Kernel log errors normally seen
• Also worth checking <smartctl> in sensors:
    SMART Error Log Version: 1
    No Errors Logged
• If anything says “ERROR”, it’s most likely dead
• Also, any very large jumps (NOT the absolute value, but the change) in seek errors would mean slow response, and repeated occurrences of such a jump = high likelihood of imminent failure
    7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 262400221
    7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 262803563
harddisklogs – sensors (2)
• <sensors>
    Fan 1: 10546 RPM (min = 7670 RPM, div = 8)
    Fan 2: 10546 RPM (min = 7670 RPM, div = 8)
    Fan 3: 10546 RPM (min = 7670 RPM, div = 8)
    Sys Temp1: +35.0 C (high = +45.0 C) sensor = thermistor
    Sys Temp2: +32.0 C (high = +45.0 C) sensor = thermistor
    CPU Temp: +35.0 C (high = +50.0 C) sensor = thermal diode
• <ifconfig>
    eth0 Link encap:Ethernet HWaddr 00:10:F3:0F:5F:38
    inet addr:10.50.164.11 Bcast:10.50.164.127 Mask:255.255.255.128
    inet6 addr: fe80::210:f3ff:fe0f:5f38/64 Scope:Link
    inet6 addr: 2001:420:4:ea:8::164:11/64 Scope:Global
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:13222992 errors:0 dropped:0 overruns:0 frame:0
    TX packets:13522239 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:5676051031 (5413.1 Mb) TX bytes:6576901582 (6272.2 Mb)
    Interrupt:18 Memory:fdae0000-fdb00000
Fan and NIC Failures
• Fan failures
  - In X7.2.1+ we will only raise an alarm if two or more fans have failed.
  - Testing has shown the VCS can fully operate with one failed fan.
  - We also have a high temperature alarm as a safety net.
  - Single fan failures pre-X7.2.1 can be RMA’d if customer insists, but please point to CSCud24211 first – it is safe to operate with one fan failure and the alarm can be ignored. Check out sensors to make sure only one fan has failed.
• NIC failure
  - If a VCS is uncommunicative, check out the number of eth devices presented in kernel log
  - “ifconfig -a | grep eth” should show four devices
  - CSCua4749 documents the batch of VCSs with potential LAN faults
• Route misconfiguration
  - If only certain boxes are reported unavailable, do check out the routing table…
harddisklogs – sensors (3)
• <routes>
    Destination Gateway Genmask Flags Metric Ref Use Iface
    default 10.50.164.1 0.0.0.0 UG 0 0 0 eth0
    10.50.164.0 * 255.255.255.128 U 0 0 0 eth0
• <netstatan>
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program
    tcp 0 0 10.50.164.11:5061 10.54.26.2:27992 ESTABLISHED 13612/app
    tcp 0 0 10.50.164.11:27104 10.50.161.49:5061 ESTABLISHED 13612/app
• <uptime>
    11:24:33 up 3 days, 14:29, 2 users, load average: 0.00, 0.04, 0.05
• <ps> = what programs are running, and since when
    root 13565 0.0 0.0 10984 1220 ? S May24 0:00 /bin/sh /sbin/tandberg start
    root 13612 2.7 9.1 720856 368892 ? Sl May24 143:45 P* /tandberg/images/app
harddisklogs – sensors (4)
• <top> = processes ordered by CPU usage, & resource stats
    top - 11:24:34 up 3 days, 14:29, 2 users, load average: 0.00, 0.04, 0.05
    Tasks: 149 total, 1 running, 148 sleeping, 0 stopped, 0 zombie
    Cpu(s): 4.1%us, 2.2%sy, 0.3%ni, 93.2%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 4044584k total, 3994964k used, 49620k free, 103884k buffers
    Swap: 9225516k total, 1428k used, 9224088k free, 1667680k cached
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    13612 root 20 0 703m 360m 19m S 8 9.1 143:45.20 /tandberg/images/app
• <proc_meminfo> = lots of useful memory stats, particularly:
    Committed_AS: 3212248 kB
• NB: Output from all sensors log modules – sometimes in more detail – at the time the snapshot was taken is in the plugins/sysmonitor directory in a snapshot
Linux Memory Management
• Numbers can be very misleading. Top output:
    Mem: 4044584k total, 3994964k used, 49620k free, 103884k buffers
    Swap: 9225516k total, 1428k used, 9224088k free, 1667680k cached
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    13612 root 20 0 703m 360m 19m S 8 9.1 143:45.20 /tandberg/images/app
• Used memory: this is not important. A Linux box will attempt to use all memory – that which is not used will be used to cache (see “cached”).
  - That which is in use is roughly SUM(RESident)
• Swap: once all memory has been used, and most caching is eliminated, Linux will start to swap. This can be bad.
• Committed_AS: roughly what the box needs to use to service all applications’ memory requests (roughly SUM(VIRTual))
• <proc_meminfo> gives more caching data too.
Load Issues
• If Committed_AS goes significantly over 4GB (about 4500000), the VCS will likely start swapping. Swapping leads to the VCS becoming less responsive. This is indicative of a need to reduce load before more problems persist.
• Load average: if regularly exceeding 2, things are worrying.
• VCS “app” CPU usage %. Check <top> over time. If regularly exceeding 50%, things are worrying.
• Re-registration intervals
  - SIP in particular is very heavy on messaging
  - If the VCS is a Control, it is safe – and recommended – to increase the refresh intervals to 1800/3600 or even 3600/3600, depending on how busy the VCS is.
• Search rule optimization to reduce search load
  - Try not to use AnyAlias if at all possible; do tailored, intelligent search rules
• Memory leaks
  - Most known issues fixed in X7.1, a few more in X7.2, and one about intra-cluster searches fixed in X7.2.2
• High intra-cluster latency causes partitioning/re-clustering which is load-heavy
• OpenDS: If OpenDS is in use, move to TMS PE!
• Unnecessary high logging levels (snapshot full of DEBUG?)
• Unbalanced load across cluster (=> use DNS SRV)
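The Committed_AS rule of thumb can be checked directly against `/proc/meminfo` or the `<proc_meminfo>` section of a sensors log. The sketch below is illustrative: `committed_as_status` is a hypothetical name, and the default limit of 4500000 kB is simply the rough figure quoted on the slide, not a documented threshold.

```shell
# Hypothetical helper: reads /proc/meminfo-style text on stdin and compares
# Committed_AS (in kB) with a limit; 4500000 is the rough figure quoted on
# the slide and can be overridden with the first argument.
committed_as_status() {
    limit=${1:-4500000}
    awk -v limit="$limit" '
        $1 == "Committed_AS:" {
            seen = 1
            status = ($2 + 0 > limit + 0) ? "over" : "ok"
            print status, $2
        }
        END { if (!seen) print "unknown" }
    '
}
```

On a live system: `committed_as_status < /proc/meminfo`. A single reading says little on its own; as with load average and CPU, the trend over several sensors samples is what matters.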
harddisklogs (3)
• erlang.log
  - Errors from the Erlang runtime, which runs the ClusterDB.
  - Search for “CRASH” or “ERROR”.
  - Gives indication if “mnesia” is overloaded – if so all sorts of crashes and unresponsiveness and dropped registrations can happen. Will typically happen when box is heavily loaded.
• opends*log
  - Useful for debugging OpenDS startup issues. Regular occurrence:
    OpenDS’s configuration file gets corrupted (fixed in X7.1)
    Fails to start
    Gets stuck in restart loop
    Fills /tmp with files (fixed in X7.2), until web server stops responding
  - Fix with tmsagent_destroy_and_purge_data
  - Or upgrade to TMS PE!
harddisklogs – developer_log
• Developer-oriented logging. Useful place to look for ERRORs.
• developer.clusterdb:
    2012-05-24T21:25:42+01:00 usc01-vcs1 UTCTime="2012-05-24 20:25:42,122" Module="developer.clusterdb.clustermanager" Level="INFO" Node="clusterdb@10.50.164.11" Detail="Received erlang node up event" Node="clusterdb@10.50.165.11" State="undefined"
• But don’t be misled:
    May 1 05:48:29 gbsyca1vg001 tvcs: UTCTime="2012-05-01 04:48:29,223" Module="developer.ssl" Level="ERROR" CodeLocation="ppcmains/ssl/ttssl/ttssl_openssl.cpp(67)" Method="::TTSSLErrorOutput" Thread="0x7fa51ea49700": TTSSL_continueHandshake: Failed to establish SSL connection
• Merging data in clusterdb
  - See how long things take – system will be suffering during this. If lots of node up/node down events seen, that is indicative of network issues (including >30ms latency between peers if no significant box load is seen)
  - Useful to compare with other peer snapshots too
• Alarms, replication, etc, etc
harddisklogs – messages
• messages = events, e.g.
    2012-05-20T10:49:48+01:00 usc01-vcs1 tvcs: Event="Registration Requested" Service="H323" Src-ip="10.50.162.52" Src-port="1719" Src-alias-type="H323" Src-alias="alanford.ex90@usc01.cisco.com" Protocol="UDP" Level="1" UTCTime="2012-05-20 09:49:48,524"
• Messages include:
  - Registration [ Requested | Accepted | Rejected ]
  - Search [ Attempted | Completed | Cancelled ]
  - Source Aliases Rewritten
  - Call [ Attempted | Connected | Rejected | Disconnected ]
  - Message [ Sent | Received ] (H.225 Signalling)
  - Request [ Sent | Received ] (SIP Signalling)
harddisklogs – network_log
• Logs actual network messages. Example H.225:
    2012-05-28T13:59:51+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 12:59:51,410" Module="network.h323" Level="INFO": Dst-ip="10.50.161.23" Dst-port="1720" TM Detail="Sending H.225 Setup H.323v6 Bandwidth:768kbps DestAlias:hallam2.ex90@usc01.cisco.com DestCSAddr: ['IPv4''TCP''10.50.161.23:1720'] Vendor:TANDBERG"
• Example SIP. Trace with Call-ID:
    2012-05-28T16:00:19+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 15:00:19,232" Module="network.sip" Level="INFO": Src-ip="10.50.165.15" Src-port="25053" Detail="Receive Request Method=INVITE, Request-URI=sip:usc01-15edgbaston@cisco.com, Call-ID=ddd21639c2e8af1d@10.55.80.222"
    2012-05-28T16:00:19+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 15:00:19,245" Module="network.sip" Level="INFO": Dst-ip="10.50.165.15" Dst-port="25053" Detail="Sending Response Code=100, Method=INVITE, To=sip:usc01-15edgbaston@cisco.com, Call-ID=ddd21639c2e8af1d@10.55.80.222"
• Note: unified_log = network + messages + developer_log
Diagnostic Logging
• Provides a simple way to get protocol logs and other pertinent log messages from the VCS
• Select Maintenance > Diagnostics > Diagnostics Logging
• Replaces “netlog” from X7.0 onwards
• Set “network” to DEBUG (equivalent to “netlog 2”)
• Also set “interworking” to DEBUG unless you’re sure no interworking is going on
• Will affect what is in snapshot (log is taken from unified_log)
• Logging is started/stopped across whole cluster, but must be downloaded separately
New Snapshot Features in X7.2
• Alarms folder logging active alarms
• Sensors logs now log “appstats” – the ResourceUsage output (calls and registrations in use at that point in time)
    <ResourceUsage item="1">
      <Calls item="1">
        <Traversal item="1">
          <Current item="1">0</Current>
          <Max item="1">-</Max>
          <Total item="1">680</Total>
        </Traversal>
        <NonTraversal item="1">
          <Current item="1">0</Current>
          <Max item="1">8</Max>
          <Total item="1">1099</Total>
        </NonTraversal>
      </Calls>
      <Registrations item="1">
        <Current item="1">-7</Current>
        <Max item="1">89</Max>
        <Total item="1">-92</Total>
      </Registrations>
    </ResourceUsage>
Tips, tricks and common issues…
Working with Diagnostic Logs
• Tracing SIP call
  - Look for the INVITE
  - Then trace the Call-ID
  - Beware of forking
• Debugging Interworking
  - OpenLogicalChannel (OLC) only sent on media
  - Use the “LCList” output for monitoring logical channels
• Configuration
  - “Status snapshot” is more complete than xstat/xconf
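The "look for the INVITE, then trace the Call-ID" procedure can be sketched as a two-step filter over a saved diagnostic log. This is a rough illustration under assumptions: `trace_first_invite` is a hypothetical name, each log entry is assumed to sit on a single line, and the Call-ID is assumed to appear as `Call-ID=<value>` as in the network_log samples.

```shell
# Hypothetical helper: given a log file, finds the Call-ID of the first
# line mentioning Method=INVITE and prints every line carrying that Call-ID.
trace_first_invite() {
    log=$1
    # Extract the Call-ID value up to the next quote, space or comma.
    call_id=$(sed -n '/Method=INVITE/s/.*Call-ID=\([^" ,]*\).*/\1/p' "$log" | head -n 1)
    [ -n "$call_id" ] || { echo "no INVITE found" >&2; return 1; }
    # Fixed-string match so characters such as "." and "@" are literal.
    grep -F -- "Call-ID=$call_id" "$log"
}
```

Because a forked call produces several dialogs, a single Call-ID may not show every leg; the output is a starting point rather than a complete trace.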
Resetting Stuff
• Backups!
  - Emergency command-line backups:
    touch /tmp/request/system-backup
    Check for /tmp/backup-restore-complete
    Find it in /mnt/harddisk/backuprestore/
    NB: the private key is not backed up; do this separately
• xcom DefaultValuesSet 2 + xcom DefaultLinksAdd
• revertimage
• factory-reset
  - Files in /mnt/harddisk/factory-reset/
  - tandberg-image.tar.gz + rk
  - NB will not be present on a non-upgraded !
• Command-line snapshot
  - Run “snapshot.sh”, find in /mnt/harddisk/snapshot/
• Manual upgrade (scp to /tmp/tandberg-image.tar.gz + /tmp/release-key)
RMA
• Should only be used for hardware failure
• See earlier diagnosis of some issues
• See RMA Guide on Cisco website (VCS Docs/Troubleshooting): http://www.cisco.com/en/US/docs/telepresence/infrastructure/vcs/troubleshooting/RMA_Procedures_for_Cisco_VCS_and_TelePresence_Conductor_Appliances.pdf
• Factory reset script (log in as root, run “factory-reset”) reinstalls the last valid image; suitable for many software issues
Security Vulnerabilities
• “VCS Fails scan for reason XXX”
• Check out grwiki list for fixes in later VCS versions: http://grwiki/index.php/VCS/Security_Notifications
• There are a number of false positives (e.g. PHP CGI, WebDAV)
• If you have a CVE number for a third party application (e.g. Apache, openssl), check out the report e.g.: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-001
• This will typically say which versions are vulnerable
• Then check against the version of software running on the most recent VCS version
• For certificate errors, familiarise and point towards VCS Certificate Creation and Use Guide (Cisco.com, VCS Configuration Guides): http://www.cisco.com/en/US/docs/telepresence/infrastructure/vcs/config_guide/Cisco_VCS_Certificate_Creation_and_Use_Deployment_Guide_X7-2.pdf
Checking versions of software
• Log into VCS as root (examples from X7.2.1)
• Apache:
    ~ # /apache2/bin/httpd -version
    Server version: Apache/2.4.2 (Unix)
    Server built: Sep 21 2012 09:55:19
• OpenSSL:
    ~ # openssl
    OpenSSL> version
    OpenSSL 1.0.1c 10 May 2012
• OpenSSH:
    ~ # sshd -v
    sshd: illegal option -- v
    OpenSSH_5.8p2, OpenSSL 1.0.1c 10 May 2012
• PHP:
    ~ # php --version
    PHP 5.3.15 with Suhosin-Patch (cli) (built: Sep 21 2012 10:04:36)
    Copyright (c) 1997-2012 The PHP Group
    Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies
• Linux kernel:
    ~ # uname -a
    Linux X082 2.6.38.4 #1 SMP Fri Sep 21 10:48:47 BST 2012 x86_64 GNU/Linux
“VCS is Unresponsive”
• Can mean a lot of things
• Just speed of access? Check out load on box.
• Network problems? Or crashes?
• Before restarting an unresponsive VCS, attach a keyboard/monitor to see if there is a kernel panic etc. If so, RMA is appropriate.
• Snapshot logs can tell a lot:
  - Kernel logs: network interfaces up/down?
  - Apache logs: TMS or other communications still functioning?
  - Unified logs: any noticeable gaps in the logs?
  - Sysinit logs: clean/unclean shutdowns etc
  - Sensors logs: uptime etc
License Alarms
• Raised when license usage hits 90% of a limit.
• Alerting that at some point since the last restart this limit has been hit. It is a courtesy to inform the customer in case service is affected and they wish to purchase more licenses.
• The alarm is not lowered when usage reduces, since usage may only occur briefly. However, the customer can acknowledge this alarm.
• There are two different errors listed, which are subtly different:
  - Capacity warning - The number of concurrent traversal calls has approached the unit's physical limit
  - Capacity warning - The number of concurrent non-traversal calls has approached the licensed limit
• The "unit's physical limit" refers to 100 Traversal Calls, or 500 Non-Traversal calls. The "licensed limit" refers to the number of licenses the customer has in their cluster.
• Bear in mind that a call license is allocated as soon as a call is attempted, so even calls to unknown or busy users can temporarily eat a license.
• Customers should deploy cluster load balancing (e.g. DNS SRV)
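The 90% trigger is simple integer arithmetic, which makes it easy to check against the Max values in a ResourceUsage dump. The following is a minimal sketch; `license_alarm_due` is a hypothetical name, and the limits used in the examples (100 traversal, 500 non-traversal) are the physical limits quoted on the slide.

```shell
# Hypothetical helper: succeeds when usage has reached 90% of the limit,
# the point at which the slide says the alarm is raised.
# Usage: license_alarm_due <peak-usage> <limit>
license_alarm_due() {
    current=$1
    limit=$2
    # Integer form of current / limit >= 0.90, avoiding floating point.
    [ $((current * 100)) -ge $((limit * 90)) ]
}
```

For example, `license_alarm_due 90 100 && echo "traversal alarm expected"`. Since the alarm reflects the peak since the last restart, the Max value is the one to test, not Current.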
Other common issues
• Stuck calls in status, pre-X7.2
  - Call status sync between “xstat” and what was seen on the web had issues before X7.2 – rewritten in X7.2 and issues resolved (CSCtr28842)
• Upgrades losing data (FindMe accounts, SNMP group, etc)
  - Non-ASCII character issue. Fixed in X7.2.2.
• Out of Search Resources
  - A lot can go wrong if searches start to be dropped. Check out “xstatus zones searches” (or equivalent in status.xml). Optimize the search rules!
    <Status> <Zones>
      <Searches>
        <Current item="1">1</Current>
        <Total item="1">10024</Total>
        <Dropped item="1">0</Dropped>
      </Searches>
    </Zones> </Status>
• Upgrade issues
  - Remember upgrades from pre-X6 must go to X6.1 before going to X7+
Questions?
