8700 Hands-On Session - cheat sheet - D. Yurach - April 2018
===========================================================

Key 8700 branches in Perforce
=============================
# main development branch:
//oneos/branches/cn5410/dev/main/
# release branches:
//oneos/branches/cn5410/release/saos-8.*
# latest 8.6.0.5 release:
//oneos/branches/cn5410/release/saos-8.6.0-pwgw/

Building a designer software image
==================================
# my local workspace:
cd /localdisk/dyurach/perforce/release/saos-8.6.0-pwgw/
# Ciena source code:
ls src/
# external packages:
ls packages/
# external packages are pre-built, and their binaries are cached here.
# If you make changes to a package, you must rebuild and submit the cached
# binaries:
ls cache/

# get help on the build system ("go" command).
# be patient, it takes a while...
./go TARG=saos-thorin help
# to build software for the real hardware, use: TARG=saos-thorin
# The first time, you will be prompted to install a toolchain, and given the
# command to issue.
# "-j16" speeds up the build by running parallel jobs
# "BLD=<string>" is an optional release tag that is added to the name of the
# package that gets built
./go TARG=saos-thorin -j16 BLD=djy02
# build results are found here:
ls build/saos-thorin/
ls build/saos-thorin/fs/releasefs/
ls build/saos-thorin/fs/releasefs/release/
# copy the software package to an FTP server. I have a local server on my
# workstation:
cp build/saos-thorin/fs/releasefs/rel_saos8700_8.6.0_blddjy02.tgz /localdata/ftp/software
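# a minimal sketch of the copy step as a reusable helper. It assumes the package
# name follows rel_saos8700_<version>_bld<tag>.tgz, as in the examples in this
# sheet. saos_pkg_name and saos_publish_pkg are hypothetical helper names, not
# part of the build system.

```shell
# Build the package file name from a release version and a BLD tag.
# Assumption: the name follows rel_saos8700_<version>_bld<tag>.tgz.
saos_pkg_name() {
    version=$1
    tag=$2
    printf 'rel_saos8700_%s_bld%s.tgz\n' "$version" "$tag"
}

# Copy the package for a given tag from the build tree to the FTP directory.
# Arguments: <build-fs-dir> <ftp-dir> <version> <tag>
saos_publish_pkg() {
    pkg=$(saos_pkg_name "$3" "$4")
    if [ ! -f "$1/$pkg" ]; then
        # Fail loudly when the tag does not match what was built.
        echo "package not found: $1/$pkg" >&2
        return 1
    fi
    cp "$1/$pkg" "$2/"
}

# Example, using the paths from this sheet:
# saos_publish_pkg build/saos-thorin/fs/releasefs /localdata/ftp/software 8.6.0 djy02
```

# deriving the name from the BLD tag keeps the build, copy, and install steps
# pointing at the same package.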

Basic SAOS CLI
==============
# telnet to the management IP address. This logs you in to the primary CTX.
# use username "gss" password "pureethernet" for access to diagnostics shells, etc.
telnet <activeIp>
# one-line help for SAOS CLI commands:
help
# search for commands containing a given keyword:
cli search keyword module
# use <TAB> for command completion

# node configuration is stored in a text file as a series of CLI commands.
# list the available config files.
# The "Default Load File" is the one that will be used if the node restarts:
config list
# show the current "running" configuration on the node
config show brief
# show any differences between the "running" config and the saved config file:
config show differences-from-saved
# save the current "running" config to a file (by default, the Default Save File):
config save
# search the config file for commands containing a given string:
config search string module
# reset the node to a different config.
# The node is restarted, and the specified config file is loaded:
config reset-to-user-config filename traffic-2_20-3_1-snmp-config

# show the alarms that are active on the node:
alarm show
# show the alarm history - can narrow the time to yesterday, today, etc:
alarm show history today
# show the fault log (module communication errors, etc):
fault log show
# list the available event logs:
log show
# view the FLASH event log:
log view destination flash
# view events matching a keyword:
log view destination flash keyword Fabric tail 10

# basic software management commands
# show the software currently installed on the node:
software show
# remove a software package that you no longer need:
software remove package rel_saos8700_8.6.0_bld212
# install a new software package from an FTP server. The following is for the FTP
# server on my workstation, so you would specify your own server, or one hosting SW
# packages from loadbuild:
software install ftp-server 10.177.111.243 login-id anonymous package-path /software/rel_saos8700_8.6.0_blddjy02.tgz
# once package is installed, it is distributed to all modules.
# You should wait until this is done, and "Sync State" is blank for ALL modules:
software show state
# perform a forced upgrade to the installed package.
# This involves a full chassis restart and takes about half an hour or more.
software forced-upgrade package rel_saos8700_8.6.0_blddjy02
# alternatively, you can perform a graceful or rolling upgrade between supported
# releases:
software activate package rel_saos8700_8.6.0_blddjy02

# show the modules in the chassis:
module show
# show more information about a specific module:
module show slot LM2
module show slot LM2 info
module temp show slot LM2
# restart module - step SAOS agents to runMode Shutdown, then reset module:
module restart slot LM2
# reset module without bringing SAOS agents to runMode Shutdown first:
module reset slot LM2
# prepare a module for removal from the chassis:
module hot-swap slot LM2
# administratively disable a module. All ports and traffic are disabled:
module disable slot LM2
# power down a line module:
module shutdown slot LM2
# re-enable a module that was disabled or shutdown:
module enable slot LM2
# delete a pre-configured module. All services on that module must be deleted
# first.
module delete slot LM4
# pre-configure a module in a slot.
# If not pre-configured, module is created automatically when inserted in chassis.
module create slot LM4 model <model>

# chassis-level commands
# fabric link status:
sm-fabric health slot CTX1.sm
# general capabilities and status:
chassis show
# fan and power:
chassis fan-tray show
chassis power show
# restart the entire node. Takes about 10 to 60 minutes, depending on the scale of
# the configuration:
chassis restart

# system-level commands:
system show
system health show

# port commands:
# show ports on a given LM:
port show slot lm2
# show summary port statistics:
port show statistics
# show non-zero summary statistics:
port show statistics active
# show port details:
port show port 2/1
port show port 2/1 statistics
port xcvr show slot lm2
# for a CSLM, show PTP, OTU, and ODU details:
port xcvr show xcvr 4/1 ptp
port otn show otu 4/1
port otn show odu 4/1-1

# perform a high-availability switchover to the other CTX:
module high-avail switchover-to-standby

# capture a system state dump, and FTP it to the given location on an FTP server:
system state-dump ftp-server <ipAddr> login-id anonymous include-corefiles include-datapath include-optics file-name /state-dump/blddjy02-test

Diagnostics Shell
=================
# invoking the Linux diagnostic shell from the SAOS CLI:

# issue a command on the primary CTX:
diag shell command "gsMgr dump 1"
# issue a command on a given slot:
diag shell slot 2 command "led show"
# access the diagnostic shell on the primary CTX:
diag shell
# access the diagnostic shell on the given slot (username root, password ciena123):
diag shell slot 2

CTX Diag Shell
--------------
# help for SAOS commands:
?
# help for NI commands:
x
# additional utilities and scripts are available:
ls /ciena/bin/
ls /ciena/scripts/
# for example, to see registers in MALDEN FPGA:
maldenreg
# you can issue a SAOS CLI command from the diagnostic shell by prefixing it with
# "saos":
saos module show

# node status from NI:
x nodestatus
# linx and mapper commands:
linxstat -help
maptest help

# HA database - show tables and dump a table:
dbdump
dbdump bladeTable

# key commands for debugging coreSwitch, HAL, optics, etc:
csm
hal help
optif
# FAP on the LM
arad

# Broadcom debug shell and some useful sub-commands:
bcmdbg
# show port status:
ps
# show NIF status:
diag nif
# show ascii-art representation of key statistics:
diag count g

# many of the subsystems have debug logs that can be enabled:
csm debug
hal debug
optif debug

# debug logs are typically saved to the "ramlog" (a circular log buffer)
# to dump the contents of the ramlog:
rld
# to clear the ramlog:
rlc
# to print new logs as they are added to the ramlog:
rld -c

# useful log directories:
# cesd-ctm output (saosLog.*), CLI command logs (cliLog.*), event logs
# (eventLog.*):
ls /flash1/log/
# fault logs:
ls /flash1/fault/
# syslogs - Linux, NI, some cesd-ctm logs:
ls /var/log/messages
ls /var/log/archive/messages*.gz
# dmesg logs captured after a kernel panic:
ls /var/log/archive/paniclogs/
# core files and other information captured after a cesd-ctm crash:
ls /var/crash/ /var/crash/old/

LM Diag Shell
-------------
# very similar to the Linux diagnostics shell on the CTX

# key commands for debugging coreSwitch, HAL, optics, etc:
optif
csm
hal help
arad help

# Broadcom debug shell and some useful sub-commands:
bcmdbg
ps
diag nif
diag count g

# as on the CTX, many of the subsystems have debug logs that can be enabled.
# For example, to enable HAL "xcvr" debug logs:
hal debug set xcvr on
# to view logs as they are printed to the ramlog:
rld -c

# useful log directories:
# cesd-pslm output (saosLog.*):
ls /flash0/log/
# syslogs - Linux, NI, some cesd-pslm logs:
ls /var/log/messages
ls /var/log/archive/messages*.gz
# dmesg logs captured after a kernel panic:
ls /var/log/archive/paniclogs/
# core files and other information captured after a cesd-pslm crash:
ls /var/crash/ /var/crash/old/

Debugging using GDB
===================
# see https://confluence.ciena.com/display/ces/Thorin+8.x+Development+Helper+Scripts
# for useful short-cuts and helper scripts.

# you can run GDB on your workstation and connect to cesd processes on a CTX or LM
# for source-level debugging.
# first, run the "setup" utility on the primary CTX diag shell, and enter your FTP
# server information.
# You can also choose "blade developers mode" to prevent BladeMgr from resetting
# unresponsive blades.
setup
# Then run the "bug" command without an argument to connect to cesd-ctm on the
# primary CTX, or with an LM argument to connect to cesd-pslm on that LM:
bug lm2
# it will tell you what command to run on your Linux workstation...

# then on your Linux workstation, from the directory where you built the load that
# is running on the node, run the bug command with the IP address of the node:
cd /localdisk/dyurach/perforce/release/saos-8.6.0-pwgw/
./src/software/saos-tce/apps/saos-tce/devscripts/bug thorin lm2 10.184.105.241
# you can now set breakpoints, etc. For example, from GDB:
l halPortDump
b halPortDump
c

# now from the LM diag shell, you can run a command that will cause the breakpoint
# to be hit:
hal portDump

# to delete the breakpoint and detach in GDB:
del 1
detach
quit

# GDB can also be used to debug cesd-ctm on the CTX. For example...
# run "bug" from CTX diag shell:
bug
# run bug on your Linux workstation
./src/software/saos-tce/apps/saos-tce/devscripts/bug thorin ctx 10.184.105.241
# install a breakpoint...
b bmAgentSmStateConfirm

# now if you cause a cesd-pslm crash and restart on an LM, you will see the
# blade manager's agent state machine for that blade hit the breakpoint.
# from the LM diag shell:
pkill -SEGV cesd-pslm
# Note that this will cause a core dump to be created. It takes a minute or two:
ls -ltra /var/crash/

# you should probably continue execution quickly after the breakpoint has been hit.
# Otherwise, signal timeouts could cause processes to assert.
# Detaching from the cesd-ctm process doesn't seem to work - cesd-ctm crashes. Be
# aware!
del 1
detach
quit

State dumps
===========
# used to capture system state (configuration, log files, core files, etc.) for
# later debugging

# use the "system state-dump" command on the node to generate a state dump.
# You will usually want to specify the FTP server where the state dump should be
# placed.
# The "include-" options specify additional information that should be gathered,
# but take longer (10+ minutes):
system state-dump ftp-server 10.177.111.243 login-id anonymous include-corefiles include-datapath include-optics file-name /state-dump/8700-demo-180410

# untar a state dump:
mkdir /localdisk/dyurach/work/8700-demo-180410 && cd /localdisk/dyurach/work/8700-demo-180410
tar xzf /localdata/ftp/state-dump/8700-demo-180410.tar.gz
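# the mkdir/cd/tar steps above can be wrapped in a small helper. This is a sketch
# only: unpack_statedump is a hypothetical name, and it assumes the dump arrives
# as <name>.tar.gz, as in the example above.

```shell
# Unpack a state dump tarball into its own directory under a work area.
# Arguments: <tarball> <work-dir>. Prints the directory that was created.
unpack_statedump() {
    tarball=$1
    workdir=$2
    # Name the target directory after the tarball, minus the .tar.gz suffix.
    name=$(basename "$tarball" .tar.gz)
    mkdir -p "$workdir/$name" || return 1
    tar xzf "$tarball" -C "$workdir/$name" || return 1
    echo "$workdir/$name"
}

# Example, using the paths from this sheet:
# cd "$(unpack_statedump /localdata/ftp/state-dump/8700-demo-180410.tar.gz /localdisk/dyurach/work)"
```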

# "sdpeek" command gathers key information from state dump, useful as a starting
# point:
sdpeek -h
sdpeek -a

# detailed look at a state dump from a JIRA (JE-74069):
cd /localdisk/dyurach/work/JE-74069/StablilityEndNodeA
sdpeek -a

Primary CTX
-----------
# main state dump file with information gathered from primary CTX:
# date, software load, modules, alarm history, running config, etc
tmp/statedump.txt (or tmp/statedump.txt.gz)
# data gathered from various managers and subsystems on primary CTX:
tmp/*.txt.gz
# default load file:
flash0/config/*

# command logs
flash1/log/cmdLog.*
# event logs
flash1/log/eventLog.*
# fault logs
flash1/fault/*

All Modules
-----------
# various logs are gathered from modules that were accessible at time of state
# dump.
# primary CTX is under "./"
# secondary CTX is under "./CTX<n>/"
# each LM is under "./LM<n>/"

# data gathered from various managers, agents, or subsystems:
tmp/*.txt.gz
tmp/dp_csa.txt.gz tmp/dp_hal.txt.gz tmp/procdump.txt.gz tmp/rld.txt.gz tmp/m2.txt.gz

# current syslog messages file and most recent archived messages file:
var/log/messages var/log/archive/messages.1.gz
# cesd-ctm logs on CTX:
flash1/log/saosLog.*
# cesd-pslm logs on LM:
flash0/log/saosLog.*
# dmesg preserved after a Linux kernel panic:
var/log/archive/paniclogs/
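# because every module keeps its syslog under the same relative paths, one loop
# can search all of them. A sketch, to be run from the top of an unpacked state
# dump. sd_grep_syslog is a hypothetical helper; the directory layout is the one
# listed above.

```shell
# Search syslog messages for a keyword in every module of a state dump.
# Covers the primary CTX (./), the secondary CTX (CTX<n>/), and each LM (LM<n>/).
sd_grep_syslog() {
    keyword=$1
    for dir in . CTX* LM*; do
        # Skip unmatched globs and modules with no syslog directory.
        [ -d "$dir/var/log" ] || continue
        for f in "$dir"/var/log/messages "$dir"/var/log/archive/messages*.gz; do
            [ -f "$f" ] || continue
            # gzip -cdf decompresses .gz files and passes plain files through.
            gzip -cdf "$f" | grep -- "$keyword" | while IFS= read -r line; do
                printf '%s: %s\n' "$f" "$line"
            done
        done
    done
}

# Example:
# sd_grep_syslog Fabric
```

# each matching line is prefixed with the file it came from, so hits from
# different modules can be told apart.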

# core and other files from a CESD crash:
var/crash/ var/crash/old/
var/crash/*.txt var/crash/*.linxstat var/crash/*.procdump
# to run GDB on a core file (release build; loadbuild directory must be available):
bug ./var/crash/old/cesd-ctm-5586-1522398991.core.gz
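# when several core files are present, it helps to see which process crashed and
# when. A sketch only: it ASSUMES core names follow <process>-<pid>-<epoch>.core.gz,
# which is inferred from the single example above and not confirmed. core_info is
# a hypothetical helper.

```shell
# Split a core file name into process name, PID, and timestamp.
# Assumption: names look like <process>-<pid>-<epoch-seconds>.core.gz.
core_info() {
    base=$(basename "$1" .core.gz)
    epoch=${base##*-}   # text after the last "-"
    rest=${base%-*}     # everything before the last "-"
    pid=${rest##*-}     # text after the last "-" of the remainder
    proc=${rest%-*}     # what is left is the process name (may contain "-")
    echo "$proc $pid $epoch"
}

core_info ./var/crash/old/cesd-ctm-5586-1522398991.core.gz
```

# if the last field really is epoch seconds, GNU date can convert it with
# "date -u -d @<epoch>".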

8700 Simulator
==============
# see https://confluence.ciena.com/display/ces/8700+Simulator and sub-pages

# to build software for the simulator, use: TARG=saos-simthorin
# The first time, you will be prompted to install a toolchain, and given the
# command to issue.
# "-j16" speeds up the build by running parallel jobs
# "BLD=<string>" is an optional release tag that is added to the name of the
# package that gets built
./go TARG=saos-simthorin -j16 BLD=djy02

# create your own XML config file:
cp src/generic/sim/tools/sc/config/simnode-thorin.xml src/generic/sim/tools/sc/config/my-simnode-thorin.xml
chmod +w src/generic/sim/tools/sc/config/my-simnode-thorin.xml
# increase RAM assigned to PSLM:
vi src/generic/sim/tools/sc/config/my-simnode-thorin.xml

# set up X authorization - simulator must be run as root, and "xhost" is no longer
# available:
xauth list $DISPLAY
sudo su -
xauth add $DISPLAY MIT-MAGIC-COOKIE-1 <cookie>

# run the simulator from root login:
cd /localdisk/dyurach/perforce/release/saos-8.6.0-pwgw
# simctrl script controls the simulator. Use "create" to build a new set of VM
# images and launch simulator:
src/software/saos-tce/tools/bin/simctrl create src/generic/sim/tools/sc/config/my-simnode-thorin.xml
# a VM is created for each CTX and LM listed in the XML file, and a console window
# is opened for each.

# you can telnet to the primary CTX at the IP address listed in the XML file:
telnet 192.168.122.10
module show
port show

# the diagnostics CLI is available in the console window for each module.
# the "simcli" command can be run to simulate various operations on the module.
# on LM, it can be used to simulate fiber connections between ports:
simcli
set fiber 1 lm VM2 1-A-1 2 save
set fiber 2 lm VM2 1-A-1 1 save

port show

# you can use "stop" and "start" to shut down and start up an existing simulation.
# once you are finished, you can use "destroy" to wipe it out:
src/software/saos-tce/tools/bin/simctrl destroy src/generic/sim/tools/sc/config/my-simnode-thorin.xml