1 Introduction
System Monitoring and Troubleshooting are important topics that will help you find
performance bottlenecks, optimize performance and troubleshoot problems that
you may encounter while implementing solutions on Oracle Linux 6. While there are
several tools available for System Monitoring and Troubleshooting on Linux, we will
look at some of the important Monitoring and Troubleshooting tools that are
available for Oracle Linux 6.
With a few basic exercises, we will introduce the learner to some ways to monitor
and troubleshoot Oracle Linux 6 systems. Upon completion of this
lab, participants will have learned how to use some of the common troubleshooting
and monitoring tools available for Oracle Linux 6.
2 Overview
In this lab we will be practicing with some of the Oracle Linux 6 system monitoring,
data collection and troubleshooting tools.
Some of the tools/utilities that we'll review are listed below.
OSWatcher Tool
Sosreport Tool
System Tools: sar, iostat, vmstat, strace, top, tcpdump, ethereal(wireshark)
System Monitor GNOME Application
Kdump / Netconsole
Crash and vmcore dump
DTrace Introduction
The practice for this lab can be accomplished with a single VirtualBox Oracle Linux 6
instance with some additional installed RPM packages. You must have an instance of
Oracle Linux 6.3 running in your VirtualBox environment.
OL 6- Lab 07
Page 2 of 99
A current 64 bit laptop with at least 2GB RAM and 20GB free disk space
Operating system: A 64-bit version of Microsoft Windows, Mac OS X, Linux
or Solaris. Alternatively, a 32-bit host OS installed on a 64-bit CPU with VT-x/AMD-V enabled in the BIOS.
Oracle VirtualBox Software 4.2.10 or later (with Extension Pack installed)
Oracle Linux 6.3 instance running inside VirtualBox:
o VM Image Provided by instructor or downloaded on your own
o Installed in Lab 1 of Oracle Linux 6 Boot camp
The following assumptions have been made regarding the environment where this
lab is being performed:
1. Network connectivity to the Internet is available
2. Your Oracle Linux 6.3 VirtualBox instance has been installed and you've
assigned a normal user/password and a root user password.
a. The recommended user name is student1
b. The recommended password is oracle
c. The recommended root password is oracle
2) Install the ksh shell if you are using the ksh-based distribution (tarball) of
the OSWatcher Black Box tool.
3) Install OSWatcher Black Box Tool
We will install the OSWatcher tool in the /opt directory of the Oracle Linux 6 system.
Move the oswbb5201.tar file that was downloaded to the /opt directory as shown
below.
[root@examplehost Downloads]# pwd
/root/Downloads
[root@examplehost Downloads]# ls oswbb5201.tar
oswbb5201.tar
[root@examplehost Downloads]# mv oswbb5201.tar /opt/
[root@examplehost Downloads]# cd /opt
[root@examplehost opt]# ls oswbb5201.tar
oswbb5201.tar
Extract the contents of the tar file using the tar command.
[root@examplehost opt]# tar xvf oswbb5201.tar
View the contents of the extracted tar file. The contents are extracted into the
/opt/oswbb directory, as shown below.
[root@examplehost opt]# cd /opt/oswbb
[root@examplehost oswbb]# ls
analysis            iosub.sh        OSWatcher.sh  oswrds.sh  src            topaix.sh
docs                mpsub.sh        oswbba.jar    oswsub.sh  startOSWbb.sh  vmsub.sh
Exampleprivate.net  nfssub.sh       oswib.sh      profile    stopOSWbb.sh   xtop.sh
gif                 OSWatcherFM.sh  oswnet.sh     pssub.sh   tarupfiles.sh
[root@examplehost oswbb]#
The startOSWbb.sh is the main script that is used to run the OSWatcher tool. If you
have downloaded the ksh shell version of the OSWatcher tool and do not have ksh
installed on your Oracle Linux system, you will see a message similar to what is
shown below when you run the OSWatcher tool. You will not see this message if you
are using the Bourne shell version of this tool; instead the tool will start running and
collecting information.
[root@examplehost oswbb]# ./startOSWbb.sh
bash: ./startOSWbb.sh: /usr/bin/ksh: bad interpreter: No such file or directory
[root@examplehost oswbb]#
If you see the above message, then you must first install the ksh shell RPM package
using the yum install ksh command.
[root@examplehost /]# yum install ksh
After installing the ksh shell, you can run the which command to verify its location. By
default, the binary is installed as /bin/ksh. Create a symbolic link named
/usr/bin/ksh that points to /bin/ksh. This link is required because the OSWbb scripts
expect to find the ksh binary in the /usr/bin directory.
[root@examplehost /]# which ksh
/bin/ksh
[root@examplehost /]# ln -s /bin/ksh /usr/bin/ksh
[root@examplehost /]#
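The link step above can be made safe to repeat. The following is a minimal sketch, not part of OSWatcher: ensure_symlink is a hypothetical helper name, and it creates the link only when nothing exists at the link path yet.

```shell
# ensure_symlink TARGET LINK
# Hypothetical helper: create LINK pointing at TARGET only if LINK is absent.
ensure_symlink() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        # Something is already there (file, directory, or dangling link).
        echo "exists: $link"
    else
        ln -s "$target" "$link" && echo "created: $link -> $target"
    fi
}

# On the lab system, as root, the call would be:
#   ensure_symlink /bin/ksh /usr/bin/ksh
```

Running it a second time reports that the link exists instead of failing with a "File exists" error from ln.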
You can run the OSWatcher tool without specifying any options by running the
startOSWbb.sh script. This starts the script with the default values of 30 seconds
and 48 hours. This means that OSWbb records data at intervals of 30
seconds, and records data for 48 hours.
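To get a feel for how much data those defaults imply, the arithmetic can be sketched as below. snapshots_retained is a hypothetical helper, not an OSWatcher command; it only computes hours * 3600 / interval.

```shell
# snapshots_retained INTERVAL_SECONDS ARCHIVE_HOURS
# Hypothetical helper: number of snapshots that fit in the archive window.
snapshots_retained() {
    interval_seconds=$1
    archive_hours=$2
    echo $(( archive_hours * 3600 / interval_seconds ))
}

# Defaults described in the text: 30-second interval, 48 hours.
snapshots_retained 30 48
```

With the defaults this comes to 5760 snapshots per collector, which is worth keeping in mind when sizing the archive directory.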
Start this lab by running the OSWatcher tool script without specifying any
frequency/duration options. After it has collected 2-3 sets of data, you can stop the
script using the stopOSWbb.sh script from another terminal window. See sample
screenshots below.
[root@examplehost /]# cd /opt/oswbb
[root@examplehost oswbb]# ./startOSWbb.sh
[root@examplehost oswbb]#
Info...You did not enter a value for snapshotInterval.
Info...Using default value = 30
Info...You did not enter a value for archiveInterval.
Info...Using default value = 48
Setting the archive log directory to/opt/oswbb/archive
Testing for discovery of OS Utilities...
VMSTAT found on your system.
IOSTAT found on your system.
MPSTAT found on your system.
NETSTAT found on your system.
TOP found on your system.
Testing for discovery of OS CPU COUNT
OSWbb is looking for the CPU COUNT on your system
CPU COUNT will be used by oswbba to automatically look for cpu problems
CPU COUNT found on your system.
CPU COUNT = 1
Discovery completed.
Starting OSWatcher Black Box v5.2.0 on Mon Jan 28 11:38:39 PST 2013
With SnapshotInterval = 30
With ArchiveInterval = 48
To stop the OSWatcher tool, run the stopOSWbb.sh script from another terminal
window.
[root@examplehost oswbb]# ./stopOSWbb.sh
[root@examplehost oswbb]#
oswvmstat
Examine the oswmeminfo directory and see the file that is captured in this
directory.
You can see that the meminfo file contains a listing of the contents of
/proc/meminfo file of your Oracle Linux system and provides memory
information. View the file using view or vi command.
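Because the captured file follows the /proc/meminfo layout, single fields can be pulled out with awk. This is a minimal sketch; meminfo_kb is a hypothetical helper name, and the file argument defaults to the live /proc/meminfo.

```shell
# meminfo_kb FIELD [FILE]
# Hypothetical helper: print the value (in kB) of one field such as MemFree.
# FILE defaults to /proc/meminfo; an OSWatcher meminfo capture can be passed
# instead, in which case one value is printed per captured snapshot.
meminfo_kb() {
    field=$1
    file=${2:-/proc/meminfo}
    awk -v key="$field:" '$1 == key { print $2 }' "$file"
}

# Example on a live system:
#   meminfo_kb MemFree
```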
You can review the other directories/files and see the data captured by this tool.
Refer to screenshot below for a short description.
Now that we have looked at the basic functionality, let's proceed by looking at the
documentation and readme file of the OSWatcher tool. The documentation is
available under the oswbb/docs directory.
You can find the PDF documentation file under the oswbb/docs/OSWatcher_BlackBox
directory along with the readme file.
Read the README.txt file and practice some of the examples in this file to get
familiarised with this tool and the various options. This is left as an exercise to
perform on your own. Remember to run the stopOSWbb.sh script once you have
finished working on this lab.
Run the yum info command on the sos package to see the information about this
package.
[root@examplehost /]# yum info sos
Read the man pages of sosreport tool to understand the options available for this
tool.
[root@examplehost /]# man sosreport
[root@examplehost /]#
You may also want to read the man pages by running the man sos.conf command.
[root@examplehost /]# man sos.conf
[root@examplehost /]#
Run the sosreport tool with default options and default configuration as shown in
the example below.
In the above example, you can see that the sosreport tool ran all the plugins and
collected all the information. The collected information is saved as a sosreport in the
/tmp directory, and the file name includes the host name and the case number,
as shown in the example below.
[root@examplehost /]# cd /tmp
[root@examplehost tmp]# pwd
/tmp
[root@examplehost tmp]# ls sosreport*
sosreport-examplehost.999-20130114124805-840d.tar.xz
sosreport-examplehost.999-20130114124805-840d.tar.xz.md5
[root@examplehost tmp]#
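The .md5 file next to the archive can be used to check that the report was not damaged in transit. The sketch below assumes the .md5 file holds the hex digest as its first field; verify_md5 is a hypothetical helper name, not part of the sos package.

```shell
# verify_md5 ARCHIVE
# Hypothetical helper: succeed when md5sum of ARCHIVE matches ARCHIVE.md5.
# Assumption: the first whitespace-separated field of the .md5 file is the digest.
verify_md5() {
    archive=$1
    expected=$(awk 'NR == 1 { print $1 }' "$archive.md5")
    actual=$(md5sum "$archive" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Example with the file names from the listing above:
#   verify_md5 /tmp/sosreport-examplehost.999-20130114124805-840d.tar.xz && echo OK
```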
Let's now look at the sosreport data that was collected. The report is saved as a
tar.xz file as can be seen from the file name. We will use the xz tool to uncompress
this file and then untar it using the tar command as shown below.
[root@examplehost /]# cd /tmp
[root@examplehost tmp]# pwd
/tmp
[root@examplehost tmp]# xz -d sosreport-examplehost.999-20130114124805-840d.tar.xz
[root@examplehost tmp]#
[root@examplehost tmp]# tar -xvf sosreport-examplehost.999-20130114124805-840d.tar
[root@examplehost tmp]#
Once the files from the sosreport have been untarred, you can view the data that is
collected by this tool. To check the files collected by the sosreport tool, look for a
directory under /tmp/ starting with hostname and containing the same timestamp
string as the tar file. See the example below and follow along on your system.
Take a look at a couple of files to get an idea of the data collected by the sosreport
tool. In the example below, we look at the uptime file and the sestatus file. The
uptime file provides system uptime information and the sestatus file provides
information about SELinux configuration of your Oracle Linux 6 system.
Disable the selinux plugin so that information about SELinux is not collected in the
sestatus file. To disable or skip a sosreport plugin, run the command using the -n
option as shown below.
[root@examplehost /]# sosreport -n selinux
sosreport (version 2.2)
This utility will collect some detailed information about
the hardware and setup of your Oracle Linux system.
....
....
Please enter your first initial and last name
[examplehost]:
Please enter the case number that you are generating this
report for: 111
Running plugins. Please wait ...
Completed [53/53] ...
Creating compressed archive...
The report created in the above run should have excluded the selinux information
and you should not see the sestatus file present in the report files. To verify this,
run the commands as shown below.
As you can see, the sestatus file was not created in the above report. This confirms
that the selinux plugin was disabled correctly using the -n option with the
sosreport command. We will not look at all the files included in this report for this
lab, but you can examine all the other files at your leisure.
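The same check can be made without unpacking the report, by listing the archive members. This is a minimal sketch; archive_contains is a hypothetical helper name, and it relies on GNU tar detecting the compression of the archive when listing.

```shell
# archive_contains ARCHIVE NAME
# Hypothetical helper: succeed when a member whose last path component is NAME
# appears in the tar listing of ARCHIVE.
archive_contains() {
    archive=$1
    name=$2
    tar -tf "$archive" | grep -Eq "(^|/)$name\$"
}

# Example: expected to print "absent" for a report created with -n selinux.
#   archive_contains /tmp/sosreport-FILE.tar.xz sestatus && echo present || echo absent
```

The file name in the usage comment is a placeholder; substitute the report created on your system.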
5.3 System Tools: sar, iostat, vmstat, strace, top, tcpdump, ethereal
In this lab, we will look at some of the system tools available in Oracle Linux that can
be used for monitoring and troubleshooting problems. We will look at the common
tools sar, iostat, vmstat, strace, top, tcpdump and ethereal in this lab.
We will start with the sar command by reviewing its man pages to
understand what information this tool can collect on an Oracle
Linux 6 system. The sar command collects system activity reports (sar) and is part
of the sysstat package, as can be seen by running the rpm command below.
[root@examplehost /]# which sar
/usr/bin/sar
[root@examplehost /]# rpm -qf /usr/bin/sar
sysstat-9.0.4-20.el6.x86_64
[root@examplehost /]#
[root@examplehost /]# man sar
Run the sar 3 5 command. This will capture system activity every 3 seconds for 5
iterations and show the output on the screen. Sample output shown below. Review
and understand the various parameters that are reported in the sar output.
[root@examplehost /]# sar 3 5
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

01:25:49 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:25:52 PM     all      0.00      0.00      0.34      0.00      0.00     99.66
01:25:55 PM     all      0.00      0.00      0.66      0.00      0.00     99.34
01:25:58 PM     all      0.33      0.00      0.33      0.00      0.00     99.33
01:26:01 PM     all      0.00      0.00      0.33      0.00      0.00     99.67
01:26:04 PM     all      0.00      0.00      0.33      0.00      0.00     99.67
Average:        all      0.07      0.00      0.40      0.00      0.00     99.53
[root@examplehost /]#
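The Average line in the sar report can be reproduced from the individual samples, which is a useful way to confirm you are reading the columns correctly. This is a minimal sketch; avg_idle is a hypothetical helper name, and it assumes 12-hour timestamps as in the sample output (time, AM/PM, then "all"), with %idle as the last column.

```shell
# avg_idle
# Hypothetical helper: read sar CPU output on stdin and print the mean %idle.
# Assumption: sample rows look like "01:25:52 PM all ... 99.66", so the
# third field is "all" and the last field is %idle. The header row and the
# "Average:" row do not match and are skipped.
avg_idle() {
    awk '$3 == "all" { sum += $NF; n++ }
         END { if (n) printf "%.2f\n", sum / n }'
}

# Example on a live system:
#   sar 3 5 | avg_idle
```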
If you want to capture the sar output into a file, you can run the sar command with
the -o option as shown below. In this example, the output is captured in binary form
in the /tmp/sar.out file.
    %user     %nice   %system   %iowait
     0.25      0.00      0.50      0.00
     0.00      0.00      0.25      0.00
     0.25      0.00      0.25      0.00
     0.17      0.00      0.33      0.00
To read the sar output that was captured in the /tmp/sar.out file, you can run the
sar command with the -f option as shown below.
    %user     %nice   %system   %iowait
     0.25      0.00      0.50      0.00
     0.00      0.00      0.25      0.00
     0.25      0.00      0.25      0.00
     0.17      0.00      0.33      0.00
To check the swap space utilization report using sar, you can use the -S option of
the sar command as shown in the example below. The kbswpfree value is the
amount of free swap space in kilobytes and the kbswpused value is the amount of
used swap space in kilobytes.
For Oracle employees and authorized partners only. Do
not distribute to third parties.
2013 Oracle Corporation
 %swpused  kbswpcad
     0.00
     0.00
     0.00
     0.00
     0.00
You can use the -r option of the sar command to check memory utilization
statistics. In this report, the kbmemfree value is the amount of free memory
available in kilobytes. The kbmemused value is the amount of used memory in
kilobytes excluding kernel.
 %memused kbbuffers
    85.25     57492
    85.25     57492
    85.25     57492
    85.25     57492
    85.25     57492
We will now simulate a high CPU usage situation. Run the following yes command
on your system in a terminal window to simulate a high CPU situation.
[root@examplehost /]# yes > /dev/null
You may run the above yes command in two separate terminal windows to
increase the CPU load if needed.
    %user     %nice   %system   %iowait
    22.64      0.00      2.36      0.00
    96.22      0.00      3.78      0.00
    96.21      0.00      3.79      0.00
    94.74      0.00      5.26      0.00
    97.57      0.00      2.43      0.00
    97.92      0.00      2.08      0.00
    98.28      0.00      1.72      0.00
    97.31      0.00      2.69      0.00
    97.97      0.00      2.03      0.00
    98.32      0.00      1.68      0.00
    89.62      0.00      2.77      0.00
The -q option of the sar command reports the queue length and load averages. The runq-sz
value is the run queue length (the number of tasks waiting for run time). A run queue
size that is greater than the number of CPUs on your system usually indicates
a CPU bottleneck. The plist-sz value is the number of tasks in the task list.
[root@examplehost /]# sar -q 3 10
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

02:22:11 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
02:22:14 PM       ...       242      0.73      0.47      0.59
02:22:17 PM       ...       242      0.73      0.47      0.59
02:22:20 PM       ...       242      0.75      0.48      0.59
02:22:23 PM       ...       241      0.77      0.49      0.59
02:22:26 PM       ...       242      0.77      0.49      0.59
02:22:29 PM       ...       243      0.79      0.50      0.59
02:22:32 PM       ...       243      0.79      0.50      0.59
02:22:35 PM       ...       243      0.89      0.52      0.60
02:22:38 PM       ...       243      0.98      0.54      0.61
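The rule of thumb above (run queue longer than the CPU count) can be written as a small check. This is a sketch only; cpu_saturated is a hypothetical helper name, and a single sample above the threshold is a hint rather than proof of a bottleneck.

```shell
# cpu_saturated RUNQ_SZ [CPU_COUNT]
# Hypothetical helper: succeed when the run queue length exceeds the CPU count.
# CPU_COUNT defaults to the number of online processors reported by getconf.
cpu_saturated() {
    runq=$1
    cpus=${2:-$(getconf _NPROCESSORS_ONLN)}
    [ "$runq" -gt "$cpus" ]
}

# Example: a run queue of 3 on the single-CPU lab VM.
#   cpu_saturated 3 1 && echo "possible CPU bottleneck"
```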
Iostat:
We will now look at the iostat command. As before, let's read the man pages of
iostat to become familiar with this command and then use some of the options.
[root@examplehost /]# man iostat
[root@examplehost /]# which iostat
/usr/bin/iostat
[root@examplehost /]# rpm -qf /usr/bin/iostat
sysstat-9.0.4-20.el6.x86_64
[root@examplehost /]#
[root@examplehost /]# man iostat
Running the iostat command without any options will show the statistics since the
time the system was booted. Go through the output and review the data that this
command reports by referring to the man pages.
[root@examplehost /]# iostat
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

avg-cpu:  %user   ...  %steal   %idle
           0.44   ...    0.00   99.26

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.53        13.48         2.67    1185116     234322
dm-0              0.62        13.12         2.66    1153234     234168
dm-1              0.00         0.03         0.00       2600         24
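When a report lists many devices, it can help to pick out the busiest one automatically. The sketch below assumes the six-column device layout shown in the sample output (device name, tps, then four block counters); busiest_device is a hypothetical helper name.

```shell
# busiest_device
# Hypothetical helper: read iostat device output on stdin and print the name
# of the device with the highest tps value.
# Assumption: device rows have six fields with tps in the second field.
# Rows that start with a number (the avg-cpu values) are ignored.
busiest_device() {
    awk 'NF == 6 && $1 != "Device:" && $1 !~ /^[0-9.]+$/ && $2 ~ /^[0-9.]+$/ {
             if (best == "" || $2 + 0 > max) { max = $2 + 0; best = $1 }
         }
         END { if (best != "") print best }'
}

# Example on a live system:
#   iostat -d | busiest_device
```

For the sample report above this selects dm-0, the device with 0.62 transfers per second.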
Running the iostat command with the -c option displays just the CPU utilization
report. In the example below, it captures the data every 2 seconds, 4 times, and
then exits.
[root@examplehost /]# iostat -c 2 4
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

avg-cpu:  %user   ...  %steal   %idle
           0.44   ...    0.00   99.25

avg-cpu:  %user   ...  %steal   %idle
           1.51   ...    0.00   97.99

avg-cpu:  %user   ...  %steal   %idle
           7.07   ...    0.00   90.40

avg-cpu:  %user   ...  %steal   %idle
           0.50   ...    0.00   99.50
[root@examplehost /]#
Running the iostat command with the -d option shows the device utilization report.
[root@examplehost /]# iostat -d 3 3
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.53        13.46         2.68    1185116     235482
dm-0              0.62        13.10         2.67    1153234     235328
dm-1              0.00         0.03         0.00       2600         24

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               3.01         0.00        26.76        ...         80
dm-0              3.34         0.00        26.76        ...         80
....
....
....
Using the -p option with the iostat command, you can specify a device name for
which you want to see the report. A sample output is shown below.
[root@examplehost /]# iostat -p sda 2 2
Linux 2.6.39-300.17.3.el6uek.x86_64 (examplehost.com)   01/16/2013   _x86_64_ (1 CPU)

avg-cpu:  %user   ...  %steal   %idle
           0.44   ...    0.00   99.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.53        13.44         2.68    1185116     236298
sda1              0.01         0.31         0.00      27130        130
sda2              0.49        13.12         2.68    1156754     236168

avg-cpu:  %user   ...  %steal   %idle
           0.50   ...    0.00   99.01

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00        ...        ...
sda1              0.00         0.00         0.00        ...        ...
sda2              0.00         0.00         0.00        ...        ...
[root@examplehost /]#
If you wish, you can explore more options of the iostat command by trying them out
at your leisure.
Vmstat:
Let us now look at the vmstat command and begin by reading the man pages.
[root@examplehost /]# which vmstat
/usr/bin/vmstat
[root@examplehost /]# man vmstat
Run the vmstat command on your system and examine the output. Note that this
report includes CPU, memory, swap, IO and system information.
Examine the output in the above screenshot and note that the si and so columns are 0.
This means no swap activity is taking place. Also, notice that the CPU is almost idle (look
at the id column), which indicates a system with little activity.
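The si/so check described above can be automated over a longer capture. This is a minimal sketch that assumes the default vmstat column order (r, b, swpd, free, buff, cache, si, so, ...), so si is the seventh field and so is the eighth; swap_samples is a hypothetical helper name.

```shell
# swap_samples
# Hypothetical helper: read vmstat output on stdin and print how many samples
# show swap-in or swap-out activity.
# Assumption: default column order, with si in field 7 and so in field 8.
# Header rows are skipped because their first field is not a number.
swap_samples() {
    awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) { hits++ }
         END { print hits + 0 }'
}

# Example on a live system:
#   vmstat 2 5 | swap_samples
```

Keep in mind that the first row vmstat prints reports averages since boot, so a nonzero value there does not necessarily mean the system is swapping right now.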
We will now run the vmstat command using the -a option. The -a option shows
information about active and inactive memory in the report, as seen in the sample
output below.
Notice that, unlike the sar command, vmstat does not include a timestamp
in the report unless you specify the -t option. The example report below, run with the -t option,
shows the timestamp for each sample.
[root@examplehost /]# vmstat -t 2 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ ---timestamp---
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy  id wa st
 0  0      8 126432  59276 379388    0    0     7     1   75  140  0  0  99  0  0  2013-01-16 15:00:23 PST
 0  0      8 126416  59276 379388    0    0     0     0   78  148  0  0  99  0  0  2013-01-16 15:00:25 PST
 0  0      8 126432  59276 379388    0    0     0     0   70  141  0  0 100  0  0  2013-01-16 15:00:27 PST
[root@examplehost /]#
Strace:
The strace command is used to capture system calls and signals of a running
process or a process being launched. Start by reading the man pages as always.
[root@examplehost /]# which strace
/usr/bin/strace
[root@examplehost /]# man strace
In this lab, we will run strace against a running process and capture the output and
also run strace before launching a process. We will not go into the details of
analyzing strace output but you should take a look at the data collected by strace
and see the system calls a process or an application makes when it runs. This strace
command usually comes in handy when you are trying to launch an application and
it is failing to start.
Run the strace command against a running process like firefox on your system as
shown below. Terminate the strace command after capturing the output for about
3-5 seconds using Control-C. You can view the output of strace command collected
in the /tmp/strace_ffox.out file using the view or vi command.
[root@examplehost /]# ps -ef | grep firefox
root      6974     1  1 15:11 ?        00:00:15 /usr/lib64/firefox/firefox
root      7088  6339  0 15:29 pts/1    00:00:00 grep firefox
[root@examplehost /]#
[root@examplehost /]# strace -o /tmp/strace_ffox.out -p 6974
Process 6974 attached - interrupt to quit
^CProcess 6974 detached
[root@examplehost /]#
[root@examplehost /]# vi /tmp/strace_ffox.out
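A raw trace file is long, so a per-call tally is often a better first look than reading it line by line. This is a minimal sketch; syscall_counts is a hypothetical helper name, and it assumes each traced call starts its line with the call name followed by an opening parenthesis, as in a trace written with -o and -p.

```shell
# syscall_counts FILE
# Hypothetical helper: print "name count" for each system call in an strace
# output file, most frequent first.
# Assumption: call lines begin with the call name and "(". Signal lines
# ("--- ... ---") and exit lines ("+++ ... +++") do not match and are skipped.
syscall_counts() {
    sed -n 's/^\([a-z_][a-z0-9_]*\)(.*/\1/p' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Example with the file captured above:
#   syscall_counts /tmp/strace_ffox.out | head
```

strace can also produce its own summary table with the -c option, which reports counts, errors, and time per system call instead of the individual calls.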
To trace a process or application from launch, prefix the
application's command with the strace command.
In the example below, we will start the gedit editor and trace the system calls of
this application using the strace command. The trace will be captured in the
/tmp/strace_gedit.out file. Examine the /tmp/strace_gedit.out file to see which
system calls were logged.
[root@examplehost /]# strace -o /tmp/strace_gedit.out gedit
[root@examplehost /]#
[root@examplehost /]# vi /tmp/strace_gedit.out
Top:
The top command provides a dynamic real time view of a Linux system. It is a very
good and useful tool to identify system performance related issues. Read the man
pages and then try out the top command on your Oracle Linux 6 systems.
[root@examplehost /]# which top
/usr/bin/top
[root@examplehost /]# rpm -qf /usr/bin/top
procps-3.2.8-23.el6.x86_64
[root@examplehost /]#
[root@examplehost /]# man top
Run the top command without any arguments and observe the system real time
view that it provides.
[root@examplehost /]# top
Running the top command with the -n option makes top exit after the given number of
iterations. Without this option, the top command keeps displaying the real time
system activity.
[root@examplehost /]# top -n 3
To see the processes or activity of a particular user on the system, you can run the
top command with the -u option. In the example below, student1 activity on the
system is being reported by the top command.
[root@examplehost /]# top -u student1
We will now simulate a high CPU utilization situation by running the yes command
as we did earlier in another lab. Run the yes command as student1 user on a
terminal window as shown below. In another terminal window, run the top
command and notice the high CPU utilization process and the user name.
[student1@examplehost /]# yes > /dev/null
[root@examplehost /]# top
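For a non-interactive version of the same check, the busiest process can be picked from a ps listing. This is a minimal sketch; top_cpu_process is a hypothetical helper name, and it assumes input rows of the form "PID USER %CPU COMMAND" with the CPU percentage in the third field.

```shell
# top_cpu_process
# Hypothetical helper: read "PID USER %CPU COMMAND" rows on stdin and print
# the row with the highest %CPU value (third field).
top_cpu_process() {
    sort -k3,3 -rn | head -n 1
}

# Example on a live system (procps ps):
#   ps -eo pid,user,pcpu,comm --no-headers | top_cpu_process
```

While the yes command is running as student1, the selected row would be expected to show that user and the yes command. top itself can also be run once in batch mode with top -b -n 1 if you want its full output in a file.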
There are some other interactive options for sorting. Some of them are listed below:
Sort command option    Sorted Field
A                      start time (non-display)
M                      %MEM
N                      PID
P                      %CPU
T                      TIME+
You can try using these interactive sort options in the top terminal window to see
the results.
Capture the network traffic for a host by specifying the hostname. For example, to
capture network traffic for host ludic.us.oracle.com, type the following command.
Note: Use a different hostname that is available on your network rather than
using the one that is listed in the examples below for this lab.
[root@examplehost /]# tcpdump host ludic.us.oracle.com
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
To capture network traffic for host ludic.us.oracle.com via network interface device
eth0, add the -i option to name the interface:
[root@examplehost /]# tcpdump -i eth0 host ludic.us.oracle.com
To capture HTTP protocol traffic via TCP port 80, use the following command:
[root@examplehost /]# tcpdump port 80
To print all IP packets between host ludic and any host except host coolwaters:
[root@examplehost /]# tcpdump ip host ludic and not coolwaters
To capture the network traffic output to a file, you can use the -w option and specify
the file name. In the example below, the output is captured in the
/tmp/tcpdump.out file.
[root@examplehost /]# tcpdump -w /tmp/tcpdump.out
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
Use Control-C to break out of the tcpdump command after a few seconds of
capture. Verify that the /tmp/tcpdump.out file exists and contains captured data.
To read the file in which network traffic data was captured, you can use the -r
option of the tcpdump command:
[root@examplehost /]#
[root@examplehost /]# tcpdump -r /tmp/tcpdump.out
reading from file /tmp/tcpdump.out, link-type EN10MB (Ethernet)
11:55:18.800306 IP 10.0.2.15.18632 >
....
..
You can increase the verbosity level of the data captured by adding the -v, -vv, or -vvv
option when running the above commands.
Ethereal/Wireshark:
Ethereal is now Wireshark. Wireshark (ethereal) is a network protocol analyzer
tool. It lets you capture and interactively browse the traffic running on a computer
network.
We will first install the Wireshark RPM package on our Oracle Linux 6.3 instance. To
install the package, run the yum install wireshark command. This will install the
command-line tool tethereal and other binaries in this package. If you want to install
the GUI, then you should also install the wireshark-gnome package. We
will install both these packages.
[root@examplehost /]# yum install wireshark
Loaded plugins: refresh-packagekit, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package wireshark.x86_64 0:1.2.15-2.0.1.el6_2.1 will be installed
--> Processing Dependency: libsmi.so.2()(64bit) for package: wireshark-1.2.15-2.0.1.el6_2.1.x86_64
--> Running transaction check
---> Package libsmi.x86_64 0:0.4.8-4.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package          Arch          Version          Repository          Size
================================================================================
Once the wireshark package has been installed, you can run the following
commands. Running the yum info command provides details of the package.
[root@examplehost /]# yum info wireshark
Loaded plugins: refresh-packagekit, security
Installed Packages
Name        : wireshark
Arch        : x86_64
Version     : 1.2.15
Release     : 2.0.1.el6_2.1
Size        : 57 M
Repo        : installed
From repo   : ol6_latest
Summary     : Network traffic analyzer
URL         : http://www.wireshark.org/
License     : GPL+
Description : Wireshark is a network traffic analyzer for Unix-ish operating systems.
            : This package lays base for libpcap, a packet capture and filtering
            : library, contains command-line utilities, contains plugins and
            : documentation for wireshark. A graphical user interface is packaged
            : separately to GTK+ package.
Using the rpm -ql option, we can check the files included in this package.
[root@examplehost /]# rpm -ql wireshark | grep bin
/usr/sbin/capinfos
/usr/sbin/dftest
/usr/sbin/dumpcap
/usr/sbin/editcap
/usr/sbin/mergecap
/usr/sbin/randpkt
/usr/sbin/rawshark
/usr/sbin/tethereal
/usr/sbin/text2pcap
/usr/sbin/tshark
/usr/share/wireshark/radius/dictionary.bintec
[root@examplehost /]#
To display the network interfaces available on the system, use the -D option of the
tethereal command.
[root@examplehost /]# tethereal -D
1. eth0
2. usbmon1 (USB bus number 1)
3. any (Pseudo-device that captures on all interfaces)
4. lo
[root@examplehost /]#
In this lab, we will use the GUI of the Wireshark application. If you are
more interested in the command-line interface, feel free to use the tethereal
command and other binaries included in the wireshark package.
To install the Wireshark GUI application, run the yum install wireshark-gnome
command as shown below.
Once the Wireshark GUI application has been installed, you can verify that it is
installed on your system and also read the man pages before you begin using this
application.
[root@examplehost /]# which wireshark
/usr/sbin/wireshark
[root@examplehost /]#
[root@examplehost /]# man wireshark
Alternatively, you can also find this application under the Applications -> Internet
->Wireshark Network Analyzer menu.
You should now see the Wireshark Network Analyzer application window similar to
what is shown below.
You can select an interface like eth0 and start the capture of network packets.
If you want to just capture the HTTP protocol traffic on the eth0 network interface,
you can click the Filter button and set the filter to HTTP.
After selecting the filter and network interface, you can begin the capture of this
network traffic. Examine the data being collected by this tool. You can see that it shows
the source IP address, destination IP address, and protocol for all packets. You can sort
on the destination IP address or any other field. Try sorting the data to
understand the information that this tool collects.
You can save the output in a file and open it later for reading the file. In the example
screenshot shown below, we capture the network data in MyWiresharkOutput file
and save it on the /tmp file system.
You can read the saved /tmp/MyWiresharkOutput file later at any time using the
Wireshark GUI application. In fact, the Wireshark tool can read/import network
data captured in several formats like libpcap, tcpdump, snoop, etc. There is no need to
tell Wireshark the type of file you are reading, as it determines the file format when
opening the file.
To open a file, click the blue file folder icon in the GUI as shown below.
There is a lot more to play around and explore with Wireshark tool. We will let you
explore and try out more things on your own. This concludes the lab for Wireshark
tool.
The System Monitor application enables you to display basic system information
and monitor system processes, usage of system resources, and file systems.
You can use Edit -> Preferences option to configure the information fields that you
want to be displayed for the Processes running on the system. Similarly, you can
also configure some settings under the Resources and File Systems tab.
Enclosed below is a screenshot of the Resources tab where you can see CPU,
Memory, and Network performance graphs. We will not spend too much time on
this tool as most of it is self-explanatory.
Click the Enable button to enable the kdump service. Remember to click Apply
after enabling the kdump service. You should see the following screen if kdump is
already enabled. If you see an error stating "Starting kdump: [FAILED]", then you
may have to reboot your system. This is because enabling kdump requires
the kdump initial ramdisk, and a reboot can create it.
After the reboot, run the tool again to confirm that the kdump service has
been enabled.
Review the Basic settings, Target settings, Filtering settings, Expert settings in the
Kernel Dump Configuration application window. Sample screenshots shown below.
The /etc/kdump.conf file is the configuration file for the kdump crash collection
service; it is where the kdump configuration information is stored. You can read
the kdump.conf man page for all the details.
Examine the /etc/kdump.conf file on your system by opening it using the view
command.
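Because the file consists mostly of comments, it can help to list only the settings that are actually in effect. The sketch below is one way to do that; the function name kdump_active_settings is our own and is not part of the kdump tools.

```shell
# Print only the active (non-comment, non-blank) lines of a kdump config file.
# kdump_active_settings is a hypothetical helper name used for illustration.
kdump_active_settings() {
    conf=${1:-/etc/kdump.conf}    # default to the system configuration file
    grep -Ev '^[[:space:]]*(#|$)' "$conf"
}

# Example: kdump_active_settings
```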
Check the status of the kdump service using the following command:
If the kdump service is not operational you can start this service as shown below:
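The status check and start described above can be sketched as follows. This assumes the standard Oracle Linux 6 service command ("service kdump status" and "service kdump start") and must be run as root; the wrapper function ensure_kdump_running is a hypothetical name of our own.

```shell
# Start kdump only when the status check reports it is not operational.
# ensure_kdump_running is an illustrative wrapper, not a system command.
ensure_kdump_running() {
    if service kdump status >/dev/null 2>&1; then
        echo "kdump is operational"
    else
        echo "kdump is not operational; starting it"
        service kdump start
    fi
}

# Example (as root): ensure_kdump_running
```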
Again, remember to reboot the system if you have to enable the kdump service or
have made some major configuration changes.
Netconsole:
The netconsole utility allows system console messages to be redirected across the
network to another server. When a Linux system experiences a critical or fatal
issue, the relevant console messages are often lost because the system crashes and
reboots. You are then unable to retrieve the logs of the incident from the affected
system, and without diagnostic logs you cannot easily debug the problem. To avoid
this, you can configure the netconsole utility to send system console messages over
the network to a designated server. It is a lightweight, non-intrusive capture tool
that helps prevent the loss of important console messages, especially those
produced during a failure.
There are two systems involved in setting up netconsole:
Netconsole Server system that receives the console messages
Netconsole Client system that sends the console messages to the configured
server
Messages received by the netconsole server may be logged to its system log.
We will not be doing a lab on the netconsole utility in this training. If you wish
to learn and explore this tool further, you can refer to the following My Oracle
Support document:
Doc ID 793684.1 - Netconsole: Dumping System Console Messages Across the
Network
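For orientation only, the client side can be pictured as loading the netconsole kernel module with a parameter that names the outgoing interface and the receiving server. The sketch below builds that parameter string in the kernel's documented form, [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]. The function name netconsole_param, the interface eth0, and the address 192.0.2.10 are assumptions made for illustration; the support document above remains the reference for the actual Oracle Linux setup.

```shell
# Build a netconsole module parameter string (illustrative sketch).
# netconsole_param is a hypothetical helper; addresses below are examples.
netconsole_param() {
    dev=$1                  # interface on the client that sends the messages
    server_ip=$2            # netconsole server that receives the messages
    server_port=${3:-6666}  # 6666 is the kernel's default target port
    # Source port, source IP, and target MAC are left empty (kernel defaults)
    printf 'netconsole=@/%s,%s@%s/\n' "$dev" "$server_port" "$server_ip"
}

# Example (as root, on the client):
#   modprobe netconsole "$(netconsole_param eth0 192.0.2.10)"
```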
To analyze crash dumps on Oracle Linux 6 systems, you will need the
kernel-debuginfo packages for the version of the kernel that you are running. In
the example below, we are running the 64-bit (x86_64) UEK R2 kernel with version
number 2.6.39-300.17.3.
Once you have downloaded the uek-debuginfo RPMs for your kernel, you can run
the following rpm commands to install these two RPM packages.
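As a rough sketch, the two package file names can be derived from the running kernel release. The naming pattern used here (kernel-uek-debuginfo and kernel-uek-debuginfo-common followed by the release string) is an assumption; check it against the files you actually downloaded. The function name uek_debuginfo_rpms is our own.

```shell
# Print the two debuginfo RPM file names expected for a UEK kernel release.
# The naming pattern is an assumption; verify it against your downloads.
uek_debuginfo_rpms() {
    release=${1:-$(uname -r)}   # e.g. 2.6.39-300.17.3.el6uek.x86_64
    echo "kernel-uek-debuginfo-${release}.rpm"
    echo "kernel-uek-debuginfo-common-${release}.rpm"
}

# Example (as root, in the directory holding the downloaded files):
#   rpm -ivh $(uek_debuginfo_rpms)
```

Installing both packages in a single rpm command lets rpm resolve the dependency between them.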
Now that we have the crash utility and the kernel-debuginfo RPM packages
installed, we are all set up. If your Oracle Linux 6.3 system experiences a
crash now, it should generate a vmcore file for the crash in the /var/crash
directory. The vmcore file will be inside a time-stamped directory under
/var/crash.
NOTE: Please do not try the remainder of this lab on any production or critical
Oracle Linux 6 system unless you are familiar with all these commands and
tools.
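A common way to force a test crash on Linux is the magic SysRq interface. The sketch below is an assumption about how such a crash could be triggered and is not necessarily the exact sequence used in this lab. The function trigger_test_crash is a hypothetical helper: by default it only prints the commands, and it crashes the system only when called as root with the --really-crash argument.

```shell
# DANGER: with --really-crash this panics the kernel immediately.
# trigger_test_crash is an illustrative helper, not a system command.
trigger_test_crash() {
    if [ "${1:-}" != "--really-crash" ]; then
        # Default is a dry run: show what would be executed
        echo "echo 1 > /proc/sys/kernel/sysrq"
        echo "echo c > /proc/sysrq-trigger"
        return 0
    fi
    echo 1 > /proc/sys/kernel/sysrq   # enable the magic SysRq interface
    sync                              # flush pending writes to disk first
    echo c > /proc/sysrq-trigger      # 'c' forces a kernel crash
}

# Example (dry run): trigger_test_crash
```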
After running the above commands, the system will crash and create a vmcore file in
the /var/crash directory. It will also reboot after the crash. Change directory and
locate the vmcore file on your system as shown in the example below.
[root@examplehost /]# cd /var/crash
[root@examplehost crash]# ls
127.0.0.1-2013-01-15-11:40:06 127.0.0.1-2013-01-15-13:10:06
[root@examplehost crash]# cd 127.0.0.1-2013-01-15-13\:10\:06/
[root@examplehost 127.0.0.1-2013-01-15-13:10:06]# ls -l
total 41204
-rw-------. 1 root root 42187698 Jan 15 13:10 vmcore
[root@examplehost 127.0.0.1-2013-01-15-13:10:06]#
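When /var/crash holds several time-stamped directories, a small helper can pick out the most recent vmcore file. This is a minimal sketch; the function name latest_vmcore is our own.

```shell
# Print the path of the most recently modified vmcore under a crash directory.
# latest_vmcore is a hypothetical helper name used for illustration.
latest_vmcore() {
    crash_dir=${1:-/var/crash}
    # ls -t sorts newest first; keep only the first match
    ls -t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1
}

# Example: latest_vmcore
```

It prints nothing if no vmcore file has been generated yet.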
Now that we have everything we need, we can examine the vmcore file to find the
cause of the crash. Debugging system crash files (vmcore files) is an advanced
topic that we will not cover in this boot camp. If you are interested, you can
open the vmcore file with the crash command as shown below and run the bt
command at the crash prompt to examine the stack trace. Refer to the crash man
page for the other commands available at the crash prompt.
[root@examplehost 127.0.0.1-2013-01-15-13:10:06]# crash /usr/lib/debug/lib/modules/2.6.39-300.17.3.el6uek.x86_64.debug/vmlinux /boot/System.map-2.6.39-300.17.3.el6uek.x86_64 ./vmcore
If crash is able to open and read the vmcore file properly, you should see
information similar to what is shown in the screenshot below.
At the crash prompt you can run commands such as bt to look at the stack trace
or ps to see process information. We will not go into the details of the crash
command or of debugging kernel crashes, as that is an advanced topic and requires
more time.
We learned how to enable crash dumps, installed the debug kernel RPM packages
needed for crash analysis, and looked at opening and reading a vmcore crash file.
This concludes the lab on the crash command.
Running the dtrace command without any flags also lists its command-line
options.
[root@examplehost /]# dtrace
Usage: dtrace [-32|-64] [-aACeFGhHlqSvVwZ] [-b bufsz] [-c cmd] [-D name[=def]]
        [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
        [-x opt[=val]] [-X a|c|s|t]
        [-P provider [[ predicate ] action ]]
        [-m [ provider: ] module [[ predicate ] action ]]
        [-f [[ provider: ] module: ] func [[ predicate ] action ]]
Because the DTrace packages are kernel dependent, you may have to reboot the
system after installing them and before you run the following commands. The
modprobe command is used to load the kernel modules as shown below. The example
shown below loads only two modules.
[root@examplehost
[root@examplehost
[root@examplehost
[root@examplehost
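As a hedged sketch of loading modules with modprobe, the helper below loads each module named on its command line and stops at the first failure. The function name load_modules is our own, and the module names dtrace and systrace in the example are assumptions about which DTrace modules apply to your kernel; adjust them to match your installation.

```shell
# Load each named kernel module and report the result (run as root).
# load_modules is a hypothetical helper name used for illustration.
load_modules() {
    for mod in "$@"; do
        if modprobe "$mod"; then
            echo "loaded: $mod"
        else
            echo "failed: $mod" >&2
            return 1            # stop at the first module that fails to load
        fi
    done
}

# Example (module names are assumptions):
#   load_modules dtrace systrace
```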
To verify that DTrace is working on your system, first try listing the probes
and providers using the dtrace command with the -l option.
[root@examplehost /]# dtrace -l
To list the probes for a particular provider (syscall in the example below), use
the dtrace command with the -P option as shown below.
[root@examplehost /]# dtrace -l -P syscall | more
Probes are enabled with the dtrace command. DTrace performs the associated
action when the probe fires. The default action indicates only that the probe fired.
No other data is recorded. You can enable (and list) probes by provider (-P), by
name (-n), by function (-f), and by module (-m).
Try the following syscall provider DTrace example. In the example below, we first
find the PID of the Firefox browser on the system and then run the dtrace
command with the -n option to list and count syscalls. The syntax for probes is as
follows:
provider:module:function:name
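To make the four fields concrete, the sketch below splits a probe description on its colons and prints each field. In DTrace, a field left empty matches any value, which the sketch shows as *. The function name describe_probe is our own and is not part of DTrace.

```shell
# Split a DTrace probe description into its four fields (illustrative sketch).
# describe_probe is a hypothetical helper; empty fields are printed as '*'.
describe_probe() {
    printf '%s\n' "$1" | {
        # Split on ':' so that empty fields between colons are preserved
        IFS=: read -r provider module func name
        echo "provider=${provider:-*} module=${module:-*} function=${func:-*} name=${name:-*}"
    }
}

# Example: describe_probe syscall:::entry
```

For syscall:::entry, only the provider (syscall) and the probe name (entry) are set, so the description matches the entry probe of every system call.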
Note: The PID on your system will be different from the one shown in this example.
[root@examplehost /]# ps -ef| grep firefox
root      2447     1 17 16:25 ?        00:00:01 /usr/lib64/firefox/firefox
root      2485  2253  0 16:25 pts/0    00:00:00 grep firefox
[root@examplehost /]#
Find the PID of your Firefox application and substitute it for the PID value
shown in the example below. Use Control-C to exit from the dtrace command. In
the example below, we use the syscall provider and the probe name is entry.
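One way to count system calls for a single process is a DTrace one-liner that aggregates on the syscall name. The sketch below wraps such a one-liner in a helper that checks the PID first. The function name count_syscalls is our own, and the DTrace program shown is a commonly used form that may differ from the exact command used in this lab.

```shell
# Count system calls made by one process until Control-C is pressed.
# count_syscalls is a hypothetical wrapper around a common DTrace one-liner.
count_syscalls() {
    pid=${1:-}
    case "$pid" in
        ''|*[!0-9]*) echo "usage: count_syscalls <pid>" >&2; return 2 ;;
    esac
    # Fire on every syscall entry for this PID and count by syscall name
    dtrace -n "syscall:::entry /pid == $pid/ { @[probefunc] = count(); }"
}

# Example (as root, using the PID found with ps): count_syscalls 2447
```

When interrupted, dtrace prints each system call name with its count, lowest counts first.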
[Sample output: per-system-call counts for the traced process, in ascending order from 1 to 3938]
This confirms that DTrace is installed and working correctly. For more details and
further practice with DTrace on Oracle Linux 6.3, refer to the following
documentation:
Oracle Linux DTrace Documentation
Oracle Linux Dynamic Tracing Guide
6 Lab Summary
In this lab you learned about some of the System Monitoring and Troubleshooting
tools, such as the OSWatcher tool, the Sosreport tool, and system tools (vmstat,
top, iostat, strace, etc.). We looked at Kdump and Crash to debug kernel crashes.
We also introduced you to DTrace. There's a lot more to troubleshooting and
monitoring in Oracle Linux 6. See the references section below to deepen your
knowledge and discover the powerful debugging and troubleshooting tools available
for Oracle Linux 6.
7 References
For more information and next steps, please consult the following additional
resources:
Performance Tuning Guide
Support Diagnostic Tools
Oracle Linux Dynamic Tracing Guide