
Administering Splunk 4.2
Ver. 1.0

Document usage guidelines


Should be used only for enrolled students
Not meant to be a self-paced document
Do not distribute

March 24, 2011


Operational Intelligence

Administering Splunk 4.2

Class Goals
Describe Splunk installation and server operations
Configure data inputs
Describe default processing and understand how to modify data inputs
Manage Splunk datastores
Add users, configure groups, and understand authentication
Describe alert configurations
Configure forwarding/receiving and clustering
Use Splunk's Deployment Server
Manage jobs and knowledge objects
Find out where to get help

Course Outline
1. Installing Splunk
2. Configuring Data Inputs
3. Modifying Data Inputs
4. Config File Precedence
5. Splunk's Data Stores
6. Users, Groups, and Authentication
7. Forwarding and Receiving
8. Distributed Environments
9. Licensing
10. Security
11. Jobs, Knowledge Objects, and Alerts
12. Troubleshooting

Section 1:
Installing Splunk


Section objectives
List Splunk's hardware/software requirements
Describe how to install Splunk
Perform server basics: starting, stopping, and restarting Splunk
Describe the Splunk license model
List the basic tools to configure Splunk: Manager, CLI, and editing config files
Describe apps
Upgrade to 4.2
List what's new in Splunk 4.2 for administrators

OS requirements
Splunk works on Windows, Linux, Solaris, FreeBSD, Mac OS X, AIX, and HP-UX
Check current documentation for specifics for each OS


Hardware Requirements

Platform        Recommended Configuration                  Minimum Configuration
Non-Windows OS  2x quad-core Xeon, 3 GHz, 8 GB RAM,        1x 1.4 GHz CPU, 1 GB RAM
                RAID 0 or 1+0, with a 64-bit OS installed
Windows         2x quad-core Xeon, 3 GHz, 8 GB RAM,        Pentium 4 or equivalent
                RAID 0 or 1+0, with a 64-bit OS installed  at 2 GHz, 2 GB RAM


Supported browsers
Firefox 2.x, 3.0.x, 3.5.x (3.5.x only supported on 4.0.6 and later)
Internet Explorer 6, 7, & 8
Safari 3
Chrome 9
All browsers need Flash 9 to render reports and display the Flash timeline


Download the bits


Download Splunk from www.splunk.com/download (login required)
- Online installation instructions are available from the download page
Obtain your enterprise license from sales or support


Download the right bits


There are 32- and 64-bit versions; get the right one
- The wrong version may install, but won't run
Various packages, tarballs, and installers are available for each OS


Install it!
For zipped tarballs, simply unpack the contents into the directory where you want to install Splunk
For Windows, just double-click the MSI file
See the docs for OS-specific packages and Windows command line install instructions
The Splunk install directory is referred to as $SPLUNK_HOME in both the docs and courseware
- UNIX default is /opt/splunk
- Windows default is C:\Program Files\splunk

Step by step instructions


www.splunk.com/base/Documentation/latest/Installation/Chooseyourplatform


UNIX: to be or not to be root?


Splunk can be installed as any user
If you do not install as root, remember:
- The Splunk account must be able to access the data sources
- /var/log is not typically open to non-root accounts
- Non-root accounts cannot access ports < 1024, so don't use them when you configure data sources
- Make sure the Splunk account can access scripts used for inputs and alerts


Windows: local or domain user?


2 choices in Windows: local user OR domain account
Local user will have full access ONLY to the local system
You must use a domain account for Splunk if you want to:
- Read Event Logs remotely
- Collect performance counters remotely
- Read network shares for log files
- Enumerate the Active Directory schema using Active Directory Monitoring

See the docs for details



Splunk subdirectories
Executables are located in $SPLUNK_HOME/bin
License and other important files are in $SPLUNK_HOME/etc
Indexes by default are in $SPLUNK_HOME/var/lib/splunk
Same directories in Windows, just different slashes


Splunk directory structure

$SPLUNK_HOME
  bin          executables
  etc          licenses, configs
    system
    apps
      search
      launcher
      <custom app>
    users
  var
    lib
      splunk   indexes

Windows: Starting Splunk


Upon successful installation, you can choose to add Splunk to the Start menu


Windows: controlling Splunk services
Splunk installs 2 services: splunkd and Splunk Web
- Start and stop them as you would any service
- Both are set to start up automatically
You can also control Splunk from the command line:

C:\Program Files\Splunk\bin>splunk start
C:\Program Files\Splunk\bin>splunk stop
C:\Program Files\Splunk\bin>splunk restart

UNIX: Starting Splunk


The command for using/managing Splunk is
$SPLUNK_HOME/bin/splunk
# pwd
/opt/splunk/bin
# ./splunk start

The first time you start Splunk, avoid the prompt to accept the license by using the command line flag --accept-license
# pwd
/opt/splunk/bin
# ./splunk start --accept-license

UNIX: controlling Splunk processes


Stopping/starting Splunk

# ./splunk start
# ./splunk stop

Restarting Splunk

# ./splunk restart

Is Splunk running?

# ./splunk status

or
# ps -ef | grep splunk


UNIX: run Splunk at boot


Splunk comes with a command to enable it to start at boot
# ./splunk enable boot-start

This modifies or adds a script in /etc/init.d that will automatically start Splunk when the OS starts
Even if you didn't install Splunk as root, this command must be run as root


Splunk processes: splunkd
Accesses, processes, and indexes incoming data
Handles all search requests and returns results
Runs a web server on port 8089 by default
Speaks SSL by default
Splunk helpers run as dependent process(es) of splunkd
- Splunk helpers run external scripts, for example:
  Scripted inputs
  Cold-to-frozen scripts


Splunk processes: Splunk Web
Python-based web server built on CherryPy
Provides both the search and management web front end for splunkd
Runs on port 8000 by default
Sets initial login to user: admin, password: changeme


Apps
Apps are configurations of a Splunk environment designed to meet a specific business need
- Manage a specific technology
  Splunk for Websphere
  Splunk for Cisco
  and many more . . .
- Manage a specific OS
  Splunk for Windows
  Splunk for UNIX/Linux
- Manage compliance
  PCI
  Enterprise Security Suite

splunkbase
Choose from hundreds of apps at splunkbase.splunk.com
- Apps developed by Splunk as well as the community are available
- The vast majority of apps are free, so don't be shy!


Managing a Splunk installation


Three ways to manage a Splunk installation:
1. Command Line Interface (CLI)
2. Directly editing config files
3. Splunk Manager interface in Splunk Web


Managing a Splunk installation - CLI
Command Line Interface (CLI)
- Shell access to the Splunk server and user access to the Splunk directory required
- Most commands require authentication and the admin role to run
- If you don't provide inline authentication credentials, Splunk will ask you

./splunk clean eventdata main -auth admin:myadminpass
(command: clean eventdata; object: main; inline authentication: -auth admin:myadminpass)

Command line interface (CLI)
Also requires authentication
- Enter auth as part of the command or wait for the prompt

#./splunk add monitor /var/log -host www1
Splunk username: admin
Password:

Inline help is available:

#./splunk help
Welcome to Splunk's Command Line Interface (CLI).
Try typing these commands for more help:
  help simple, cheatsheet   display a list of common commands with syntax
  help commands             display a full list of CLI commands
  help [command]            type a command name to access its help page

Managing a Splunk installation - config files
Directly editing config files
- Shell/console access to the Splunk server and sufficient user rights to edit files in the Splunk directory required
- Config files must be saved in UTF-8; be sure to use the right form for non-UTF-8 OSes
- Changes made this way more often require a restart


Direct editing of config files


Changes done this way sometimes require a restart or reload of Splunk


Managing a Splunk installation - Manager
Splunk Manager interface in Splunk Web
- Access to Splunk Web
- Admin role on the Splunk server
- Access from the main navigation Manager link


Splunk Manager general settings


Splunk Manager general settings (cont.)


Splunk Manager general settings (cont.)


Splunk Manager general settings (cont.)


Click Save when you are
done
- All changes to general settings will

require a restart


More Resources
Look on Splunkbase for additional Apps to help you manage your
Splunk servers
http://www.splunkbase.com/apps/All/4.x

There is a Troubleshooting section in the Splunk Admin manual:
http://www.splunk.com/base/Documentation/latest/Admin


Lab 1


Section 2:

Configuring Data Inputs



Section objectives
Set up data inputs
List Splunk's data input types and explain how they differ
Set input properties such as host, ports, index, source type, etc.


Specifying data inputs


There are a number of ways you can specify a data input:
Apps
- Preconfigured inputs for various types of data sources available on splunkbase

Splunk Web
- You can configure most inputs using the Splunk Web data input pages

CLI
- You can use the CLI (command line interface) to configure most types of inputs

inputs.conf
- When you use Splunk Web or CLI, configurations are saved to inputs.conf
- You can edit that file directly to handle advanced data requirements

Types of inputs
Files and directories: monitor physical files on disk
Network inputs: monitor network data feeds on specific ports
Scripted inputs: import from non-traditional sources, APIs, databases, etc.
Windows inputs: Windows-specific inputs: Windows event logs, performance monitoring, AD monitoring, and local registry monitoring
File system change monitoring: monitor the state (permissions, read-only, last changed, etc.) of key config or security files


Setting up new inputs Apps / Add-ons

(configure input through the app setup process)

Setting up new inputs - Manager
Admin role and access to Splunk Web
Changes written to inputs.conf
- Location of inputs.conf is determined by app context


Setting up new inputs - CLI
Admin role and shell/console access to the Splunk server required*
Useful for administering forwarders
Location of inputs added via the CLI is the Search app

#./splunk add monitor /var/log -hostname www1 -index webfarm
Your session is invalid. Please login.
Password:
Added monitor of /var/log

*Using the -uri flag you can send remote CLI commands from a local Splunk instance to a remote instance without shell access. See the docs for details.
http://www.splunk.com/base/Documentation/latest/Admin/AccessandusetheCLIonaremoteserver


Setting up new inputs - inputs.conf
Skip the middleman of Manager or the CLI and directly edit inputs.conf
Shell/console access to the Splunk server required
Changes made this way require a restart

[default]
host = mysplunkserver.mycompany.com

[monitor:///opt/secure]
disabled = false
followTail = 0
host_segment = 3
index = default
sourcetype = linux_secure

[monitor:///opt/tradelog]
disabled = false
sourcetype = trade_entries

inputs.conf (cont.)
Input path specifications in inputs.conf (monitor stanzas) use Splunk-defined wildcards (also used by props.conf, discussed in the next section)
- these are not REGEX-compliant expressions

Wildcard: ...  (regex equivalent: .*)
- The ellipsis wildcard recurses through directories and subdirectories to match.
- Example: /var/log/.../apache.log matches the files /var/log/www1/apache.log, /var/log/www2/apache.log, etc.

Wildcard: *  (regex equivalent: [^/]*)
- The asterisk wildcard matches anything in that specific directory path segment.
- Note: must be used in the last segment of the path.
- Example: /logs/*.log matches all files with the .log extension, such as /logs/apache.log. It does not match /logs/apache.txt.

inputs.conf (cont.)
So:
... matches any character(s), recursively
* matches anything 0 or more times, except the /
. is NOT a wildcard and simply matches the . literally

Syntax details:
$SPLUNK_HOME/etc/system/README/inputs.conf.spec
http://www.splunk.com/base/Documentation/latest/Admin/Inputsconf
http://www.splunk.com/base/Documentation/latest/Admin/Specifyinputpathswithwildcard

Setting source, sourcetype, and host
You can specify source, sourcetype, and host at the input level for most inputs
Source
- Should be left at the default
Sourcetype
- Most default processing for standard data types is based on sourcetype. Whenever possible use automatic sourcetyping, select from Splunk's list, or use the recipes
Host
- Opt for specific hostnames/FQDNs as much as possible, since the host field is a key search tool
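Sketching these recommendations as a single inputs.conf stanza (the host, index, and path shown are hypothetical examples):

```ini
# Hypothetical stanza: explicit metadata set at the input level
[monitor:///opt/secure]
host = www1.mycompany.com    # specific FQDN, since host is a key search field
sourcetype = linux_secure    # chosen from Splunk's list rather than invented
index = security             # the index must already exist
# source is deliberately not set; leave it at the default
```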

Data inputs monitor


Monitor eats data from specified file(s) or directory(ies)
Where
- Can be pointed to an individual file or the top of a complex directory hierarchy
- Recurses through specified directory
- Indexes any directory the Splunk server can reach, local or remote file systems

How
- Unzips compressed files automatically before indexing them
- Eats new data as it arrives
- Automatically detects and handles log rotation
- Remembers where it was in a file and picks up from that spot after restart

Data inputs monitor (cont.)


What
- Uses whitelists and blacklists to include or exclude files and directories
- Can be instructed to start only at the end of a large file (like tail -f)
- Can automatically assign a source type to events, even in directories containing multiple log files from different systems, processes, etc.


Monitor via Manager (called Files & Directories)

add new input
edit existing input


Monitor file or directory Manager


Monitor file or directory Manager: Source


Specify a file or
directory for
ongoing monitoring
Can also upload a
copy of a file
-

Useful for testing


and development


Monitor a file or directory Manager: Host


Specify a constant
value if all monitored
files in an input are from
the same host


Monitor a file or directory Manager: Host


When multiple hosts
write to the same
directory and the
host name appears
in the file name or
part of the path, use
REGEX on path to
extract the host
name

/var/log/www1.log
/var/log/www1.log will
will extract
extract www1
www1
/var/log/www_db1.log
/var/log/www_db1.log will
will extract
extract www_db1
www_db1


Monitor a file or directory Manager: Host


When multiple hosts
write to the same
directory and host
name appears as a
consistent
subdirectory in the
path, use segment in
path
/logs/www1/web.log
/logs/www1/web.log or
or /logs/www2/web.log
/logs/www2/web.log


Monitor a file or directory Manager: Sourcetype


Automatic
- Splunk automatically determines the source type for most major data types
- Useful for directories with many different types of log files
Manual
- Enter a name for a specific sourcetype
From list
- Choose the sourcetype from the dropdown list



Monitor a file or directory Manager: Index


Select the index where this
monitor input will be stored
If you want to put a new input in
a new index, you must create
the index before the input


Monitor a file or directory Manager: Follow tail


Follow tail works like tail -f: it starts at the end of the file and only eats new input from that point forward
Only applies to the very first time the new monitor input is added


Monitor a file or directory - Manager: Whitelist
If a file is whitelisted, Splunk consumes it and ignores all other files in the set
Use whitelist rules to tell Splunk which files to consume when monitoring directories
- This whitelist will only index files that end in .log
- Use a | to create OR statements: indexes files that end in query.log or my.log
- Add a leading slash to ensure an exact file match: only indexes query.log and my.log

Monitor a file or directory - Manager: Blacklist
If a file is blacklisted, Splunk ignores it and consumes all other files in the set
Use blacklist rules to tell Splunk which files not to consume when monitoring directories
- This blacklist won't index files that end in .txt
- Use a | and () to create OR statements: won't index files that end in .txt or .gz
- This blacklist avoids both archive and historical directories (as well as files named archive and historical)
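The whitelist and blacklist rules described above correspond to attributes of a monitor stanza in inputs.conf; a sketch with hypothetical patterns:

```ini
# Hypothetical monitor stanza combining whitelist and blacklist rules
[monitor:///var/log]
# only consume files ending in query.log or my.log ("|" creates an OR;
# the leading "/" ensures an exact file-name match)
whitelist = /query\.log$|/my\.log$
# ignore anything ending in .txt or .gz
blacklist = \.(txt|gz)$
```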

Scripted inputs
Splunk can run scripts periodically that generate input
- Scripts need to be shell (.sh) on *nix or batch (.bat) on Windows
- Or Python on any platform
- Can use any scripting language the OS will run if wrapped in a shell or batch wrapper

Splunk eats the standard output (stdout) of the script


Use them to run diagnostic commands such as top, netstat, vmstat, ps, etc.
Used in conjunction with many Splunk Apps to gather specialized
information from the OS or other systems running on the server
Also good for gathering data from APIs, message queues, or other custom
connections

Setting up a scripted input


1. Write or obtain the script
2. Copy it to your Splunk server's script directory
3. If possible, test your script from that directory to make sure it runs correctly
4. Set up the input in Manager
5. Click Save and wait for a few intervals to pass, then verify that the input is available in Search or its App
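As a concrete sketch of steps 1-2, here is a hypothetical scripted input (the script name and app path below are illustrative, not part of the course lab). Splunk indexes whatever the script writes to stdout:

```shell
#!/bin/sh
# conncount.sh - hypothetical scripted input
# Emits one timestamped key=value event per run; Splunk indexes stdout.
count=$(netstat -an 2>/dev/null | grep -c ESTABLISHED)
echo "$(date '+%Y-%m-%d %H:%M:%S') open_connections=${count:-0}"
```

Copied to $SPLUNK_HOME/etc/apps/<app_name>/bin/, it would be referenced from inputs.conf with a stanza such as [script://$SPLUNK_HOME/etc/apps/<app_name>/bin/conncount.sh] and an interval setting in seconds.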


Manager Scripted inputs


Manager Scripted inputs (cont.)


Splunk will only run scripts from specified bin directories:
$SPLUNK_HOME/etc/system/bin OR
$SPLUNK_HOME/etc/apps/<app_name>/bin

Interval is in seconds, though you can also specify a schedule using cron syntax
The interval is the time period between script executions

Manager Network inputs


Manager - Network inputs: Source port
TCP or UDP feeds from 3rd-party systems (not Splunk forwarders)
Splunk can be configured to listen to a specified UDP or TCP data feed and index the data
- Can be set to accept feeds from any host or just one host on that port
Can specify any unused network port (that is NOT splunkd's or Splunk Web's port)
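In inputs.conf terms, a network input is a [udp://...] or [tcp://...] stanza; a sketch for a syslog-style feed (the ports, host, and sourcetype names are illustrative):

```ini
# Hypothetical network input stanzas

# Listen for UDP syslog from any host on port 514
[udp://514]
sourcetype = syslog

# Restrict a TCP feed to a single sending host by prefixing host:port
[tcp://10.1.2.3:9001]
sourcetype = vendor_feed
```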

Manager - Network inputs: source and sourcetype
By default Splunk will set the source to host:port
- a syslog feed from a firewall named fw_01 would have fw_01:514 for its source
Only two options for sourcetype: from list or manual
- If there are multiple sourcetypes coming from a single network feed, you will need to configure further processing to handle it (covered in the next section)


Manager - Network inputs: Host
Three choices for host:
- IP: Splunk will use the IP address of the sender (default)
- DNS: Splunk will do a reverse DNS lookup for the host name
- Custom: allows you to specify a specific host name


File system change monitoring
fschange (must be set up in inputs.conf) monitors changes to files and directories
DOES NOT index the contents of the files and directories
Writes an event to an index when it detects a change or deletion
Monitors:
- modification date/time
- group ID
- user ID
- file mode (read/write attributes, etc.)
- optional SHA256 hash of file contents

Setting up fschange
Set up a stanza in inputs.conf:

[fschange:/etc/]
pollPeriod = 60
host = splunkserver.company.com

List the directory you want Splunk to monitor
- DO NOT use file system change monitoring on a directory that is being indexed using monitor
Default sourcetype = fs_notification
pollPeriod is the interval in seconds at which Splunk checks the files for changes

Windows inputs
Windows inputs must be set up on a Windows Splunk instance
UNIX indexers CAN and will index and search Windows inputs
Set up a Universal Forwarder or Light Forwarder to get Windows inputs to a UNIX indexer


Windows inputs Local or remote event logs


Local event logs can be collected from a Universal Forwarder or the local indexer
Remote event log collection requires proper domain account permissions on the remote machine


Windows inputs local event logs


Select the event logs you wish Splunk to index
For further settings, edit inputs.conf directly


Windows inputs remote event logs


Enter a host to choose logs
- Click "Find logs" to populate the available logs list
Optionally, you can collect the same set of logs from additional hosts
- Enter host names or IP addresses, separated by commas


Windows event log settings in inputs.conf


start_from: use this setting to tell Splunk to start with the newest events and then work its way back to the oldest; the default is oldest
current_only: if set to 1, Splunk will only index events starting from the day the input was set up and going forward; the default is 0
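A sketch of how these settings might appear in inputs.conf (the Security channel is just an example):

```ini
# Hypothetical Windows event log stanza
[WinEventLog:Security]
disabled = 0
start_from = oldest   # the default; "newest" reads backwards from the newest event
current_only = 1      # only index events arriving after the input was set up
```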


Windows inputs Performance monitor


Use Performance Monitor to collect data from a local machine (Forwarder or Indexer)


Windows inputs Performance monitor (cont.)


Select an object to monitor
Based on the object you select, the Counters section is populated with available counters


Windows inputs Performance monitor (cont.)


Select instances
Set the polling interval


Windows inputs Registry monitoring


Indexes the registry whole cloth, as well as any ongoing changes
See the docs for details on limiting what is actually monitored:
www.splunk.com/base/Documentation/latest/Admin/MonitorWindowsregistrydata


Windows inputs: AD monitoring
You can specify a domain controller or let Splunk discover the nearest one
You can then specify the highest node in the tree you want Splunk to monitor
- Splunk will move down the tree recursively
- If unchecked, Splunk will index the entire tree, including the schema
Use the permissions of the Splunk user to limit what it can monitor in AD
www.splunk.com/base/Documentation/latest/Admin/AuditActiveDirectory

Windows inputs Windows app


Installing the Windows app allows you to collect and monitor several common Windows input types


Lab 2 Data inputs


Section 3:
Modifying Data Inputs


Section objectives
Describe how data moves from input to index
Understand the default processing that occurs during indexing
List the config files that govern data processing
Learn how to override default data processing
Learn how to discard unwanted events
Learn how to mask sensitive data
Learn how to extract fields

Input to Index: Big Picture
(diagram: network inputs, Windows inputs, monitor inputs, and scripted inputs all flow into Splunk for indexing to disk)

Indexing phases
Input Phase: raw data from all forms of input is collected
Parsing Phase: raw data is broken down into events, then event-by-event processing is performed (the license meter is applied here)
Indexing Phase: the index is generated and data is written to disk

Inputs phase details
The inputs phase works with entire streams of data, not individual events. Overarching metadata is applied.
- inputs.conf: source, sourcetype, and host
- props.conf: CHARSET and sourcetyping based on source
- Windows inputs: wmi.conf and regmon-filters.conf

See www.splunk.com/wiki/Where_do_I_configure_my_Splunk_settings%3F for details


props.conf
props.conf is a config file that plays a role in all aspects of Splunk data processing
Governs most aspects of data processing; can also invoke settings in other config files
Uses the same stanza format as inputs.conf and other Splunk config files
See $SPLUNK_HOME/etc/system/README/props.conf.spec and props.conf.example for syntax and examples


props.conf specifications
props.conf stanzas use specifications to map configurations to data streams
The specification can be either host, source, or sourcetype

Pattern:                   Example:
[host::<hostname>]         [host::www1]
attribute = value          TZ = US/Pacific

[source::<source>]         [source::/var/log/trade.log]
attribute = value          sourcetype = trade_entries

[<sourcetype>]             [syslog]
attribute = value          TRANSFORMS-host = per_event_host

source and sourcetype specs are case sensitive; host is NOT


Inputs phase - props.conf
sourcetype can be set based on source during the inputs phase:

[source::/var/log/custom*]
sourcetype = mycustomsourcetype

[source::...\\web\\iis*]
sourcetype = iis_access

The CHARSET spec can be set at this time. The default is automatic; use this setting to override if auto is not working correctly. See the docs for the list of character sets:

[source::.../seoul/*]
CHARSET = EUC-KR

[source::h:\\web\\\\*]
CHARSET = Georgian-Academy

www.splunk.com/base/Documentation/base/Data/Configurecharactersetencoding

Parsing phase: big picture
Data from the inputs phase is broken up into individual events, and then any event-level processing is performed.
(chunks of data from inputs phase → broken into individual events → event-by-event processing)

Parsing phase details
A majority of data processing work is done during the parsing phase
- Actual event boundaries are decided, date/timestamps are extracted, and any type of per-event operation is performed

automatic: auto-sourcetyping, auto date/timestamping, auto-linebreaking, time zone
override: per-event REGEX-based sourcetype, host, or index settings, custom line breaking and date/timestamping
custom: REGEX/SEDCMD rewrites, per-event routing to other indexers, 3rd-party systems, or the null queue

Parsing phase: automatic
Switches data to UTF-8
By default Splunk will attempt to automatically:
- detect event boundaries (monitor and network inputs)
- extract date/timestamps (monitor and network inputs)
- assign sourcetypes (for monitor input only)
Default settings are in $SPLUNK_HOME/etc/system/default/props.conf
- in the parsing phase, props.conf can call stanzas in another config file, transforms.conf, located in the same directory


It's automatic . . .
The success rate of automatic processing will vary. For standard data types such as syslog, web logs, etc., Splunk does a great job. For custom or esoteric logs you'll need to test, though even then the odds are good it will get it right.
- Correct date/timestamping and linebreaking are key to subsequent processing and the ultimate searchability of data
Other types of automatic processing:
- Windows inputs
- syslog host extraction
www.splunk.com/base/Documentation/base/Data/Overviewofeventprocessing

Line breaking
If automatic event boundary detection is not working correctly:
- Bad event breaking is usually easy to detect in indexed test data, but be careful since bad line breaking can show up as bad timestamping
2 methods:
- SHOULD_LINEMERGE = false (most efficient)
  Using this method Splunk cuts the data stream directly into finished events using either the newline \n or carriage return \r characters (default) or a REGEX you specify with LINE_BREAKER
- SHOULD_LINEMERGE = true
  Splunk uses a configurable two-step process to split your data into individual events
Operational Intelligence

97

Administering Splunk 4.2

SHOULD_LINEMERGE = false

Already set for many standard types of data including syslog (including
snare), Windows inputs, and web data
- See $SPLUNK_HOME/etc/system or
  apps/<app_name>/default/props.conf for details
Should be set for custom data with one-event-per-line formats
- breaking on \n or \r characters
Or if possible use other pattern breakers, but be ready to sacrifice the
characters that make up the pattern from your raw data
- The characters that make up the pattern match aren't kept as part of the events
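As a sketch of the LINE_BREAKER approach described above (the sourcetype name and the "|END|" delimiter here are hypothetical, not from the course data):

```ini
# props.conf -- hypothetical feed where events are separated by a "|END|"
# marker instead of newlines. SHOULD_LINEMERGE = false tells Splunk to cut
# events directly on the LINE_BREAKER match; the capture group's characters
# are discarded and not kept as part of the events.
[my_piped_feed]
SHOULD_LINEMERGE = false
LINE_BREAKER = (\|END\|)
```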

SHOULD_LINEMERGE = true

The default if not specified
Splunk merges multiple lines of data into single events based on the
rule: a new line with a date at the start, or 256 total lines, marks an event
boundary
- BREAK_ONLY_BEFORE_DATE = true (the default)
- MAX_EVENTS = 256 (default)
Certain predefined data types like log4j and other application server
logs use BREAK_ONLY_BEFORE = <REGEX pattern> that, when
matching the start of a new line, marks the start of a new event
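A minimal sketch of the BREAK_ONLY_BEFORE approach, assuming a hypothetical application log where every event begins with a line like "TradeID=12345":

```ini
# props.conf -- hypothetical multiline sourcetype. SHOULD_LINEMERGE = true
# is the default and is shown only for clarity; the REGEX marks the start
# of each new event.
[trade_app_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^TradeID=
```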

Custom line merge

If your multiline data and default processing don't get along, beyond
the BREAK_ONLY_BEFORE setting there are many more REGEX
based settings to divide up your events
- www.splunk.com/base/Documentation/latest/Data/Indexmulti-lineevents for
  details
- see also $SPLUNK_HOME/etc/system/README/props.conf.spec
  or www.splunk.com/base/Documentation/latest/Admin/Propsconf

Date/timestamp extraction

Like event boundaries, correct date/timestamp extraction is key to
Splunking your data
Verify timestamping when setting up new data types
- Pay close attention to time stamping during testing/staging of custom or non-
  standard data types
- Convert UNIX time or other non-human-readable time stamps and compare
Well tuned for standard data types
- See props.conf in $SPLUNK_HOME/etc/system/default and
  www.splunk.com/base/Documentation/latest/Data/ConfigureTimestampRecognition
  for details

Custom date/timestamp props.conf

TIME_PREFIX = <REGEX> which matches the characters right BEFORE
the date/timestamp
- Use this for events with multiple timestamps to pinpoint the correct one, or with
  events containing data that looks like a timestamp but isn't and confuses the
  processor

Example data with a date-like code at the start of the line:

1989/12/31 16:00:00 Wed May 23 15:40:21 2011 ERROR UserManagerException thrown

[my_custom_source_or_sourcetype]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s

The TIME_PREFIX above matches everything through "Wed ", which is
where Splunk starts looking for the date/timestamp.

Custom date/timestamp props.conf (cont)

MAX_TIMESTAMP_LOOKAHEAD = <integer> specifying how many
characters to look beyond the start of the line for a timestamp
- works in conjunction with TIME_PREFIX if set, in which case it starts counting from
  the point where TIME_PREFIX indicates Splunk should start looking for the
  date/timestamp
- Improves efficiency of timestamp extraction
As with multiline event configs, see
$SPLUNK_HOME/etc/system/README/props.conf.spec and the
docs for even more options if necessary
www.splunk.com/base/Documentation/latest/Data/Handleeventtimestamps
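Continuing the hypothetical two-timestamp example from the previous slide, the two settings can be combined; the lookahead value of 25 is an assumption sized to cover "May 23 15:40:21 2011":

```ini
# props.conf -- TIME_PREFIX skips the date-like code at the start of the
# line; MAX_TIMESTAMP_LOOKAHEAD then limits how far past that point
# Splunk scans for the real timestamp.
[my_custom_source_or_sourcetype]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s
MAX_TIMESTAMP_LOOKAHEAD = 25
```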

Time zones

Splunk follows these default rules when it attaches a time zone to a
time stamp
1. It looks in the raw event data for a time zone indicator such as
   GMT+8 or PST and uses that
2. It looks in props.conf to see if a TZ attribute has been given for this
   data stream, based on the standard settings referenced here:
   en.wikipedia.org/wiki/List_of_zoneinfo_timezones
3. If all else fails it will apply the time zone of the indexer

[host::nyc*]
TZ = America/New_York

[source::/mnt/cn_east/*]
TZ = Asia/Shanghai

Time and Splunking

Splunk depends heavily on existing time infrastructure
Timestamps in Splunk are only as good as the time settings on the servers
and devices that feed into Splunk
A good enterprise time infrastructure makes for good timestamping,
which makes for good Splunking

Per event REGEX changes

Splunk can modify data in individual events based on REGEX pattern
matches
Requires invoking a second file, transforms.conf (see next slide)
Using props.conf and transforms.conf you can disable/modify
existing modifications, or add your own custom settings

transforms.conf

Config file whose stanzas are invoked by props.conf
- The all-caps TRANSFORMS = <transforms.conf_stanza> syntax is used to
  invoke index time changes
- Required for all REGEX pattern match processing
- Resides in the same directory(ies) as props.conf
- Can also be called at search time by REPORT or LOOKUP (search time section
  coming up)

$SPLUNK_HOME/etc/system/default/props.conf
[syslog]
TRANSFORMS = syslog-host

$SPLUNK_HOME/etc/system/default/transforms.conf
[syslog-host]
DEST_KEY = MetaData:Host
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
FORMAT = host::$1

transforms.conf (cont.)

Transforms uses standard settings to indicate what its REGEX will
match and what it will rewrite based on the match
The source and destination of these transformations are referred to as
keys
- SOURCE_KEY tells Splunk where to apply the REGEX (optional)
- DEST_KEY tells Splunk where to apply the data modified by the REGEX and
  FORMAT setting (required)
- REGEX is the regular expression and capture groups (if any) that operate on the
  SOURCE_KEY (required)
- FORMAT controls how REGEX writes the DEST_KEY (required)
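The keys above can be combined in a sketch like the following; the stanza names and the "prod" path convention are hypothetical. SOURCE_KEY defaults to _raw when omitted, so setting it to MetaData:Source makes the REGEX run against the source path instead:

```ini
# transforms.conf -- hypothetical: any event whose source path contains
# "prod" gets its host rewritten to prod-farm.
[host_from_source_path]
SOURCE_KEY = MetaData:Source
REGEX = prod
DEST_KEY = MetaData:Host
FORMAT = host::prod-farm

# props.conf -- invoke it at index time for a hypothetical sourcetype
[my_app_logs]
TRANSFORMS-sethost = host_from_source_path
```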

Keys in action

From the default syslog host extraction:

[syslog-host]
DEST_KEY = MetaData:Host
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
FORMAT = host::$1

The REGEX pattern here is looking for a host name embedded in syslog
data. Only one capture group is referenced here: the 2nd set of
parentheses. In this circumstance we would expect the host name to
appear within the 2nd set of parentheses.

FORMAT specifies what is written out to the DEST_KEY. Here host::$1
means host = 1st REGEX capture group.

We are updating the host field, so our DEST_KEY is MetaData:Host; for
sourcetype it would be MetaData:Sourcetype; for index it would be
_MetaData:Index (case matters, and for index the underscore counts!).
See transforms.conf.spec for details.

Setting sourcetype per event

You can configure Splunk to set sourcetype on a per-event basis
- This should be your sourcetyping of last resort since inputs.conf settings and
  source-based sourcetyping using just props.conf are less resource intensive

In props.conf:

[source::udp:514]
TRANSFORMS-1srct = custom_sourcetyper

The value after TRANSFORMS- gives this transformation a name space;
this comes into play for multiple transformations and provides
precedence if needed.

In transforms.conf:

[custom_sourcetyper]
DEST_KEY = MetaData:Sourcetype
REGEX = .*Custom$
FORMAT = sourcetype::custom_log

Any event from this source where the last word of the line is "Custom"
will get the sourcetype of custom_log.

Per event index routing

Like sourcetype, if at all possible specify the index for your inputs in
inputs.conf

props.conf:

[routed_sourcetype]
TRANSFORMS-1indx = custom_sourcetype_index

transforms.conf:

[custom_sourcetype_index]
DEST_KEY = _MetaData:Index
REGEX = .
FORMAT = custom_index

Note the use of _MetaData:Index.

We're using a wide-open REGEX since we want everything classified as
this sourcetype routed to a different index. More granular routing would
have a more complex REGEX.

For index routing, the FORMAT simply takes the name of the index you
are routing to.

Filtering unwanted events

You can route specific unwanted events to the null queue
- Events discarded at this point do NOT count against your daily license quota

props.conf:

[WinEventLog:System]
TRANSFORMS-1trash = null_queue_filter

transforms.conf:

[null_queue_filter]
DEST_KEY = queue
REGEX = (?m)^EventCode=(592|593)
FORMAT = nullQueue

Since Windows Event logs are multiline events, we need to use the
REGEX multiline indicator (?m). This applies to any multiline event and
REGEX, not just null queue.

Here our DEST_KEY is queue since we're routing these events outside
the data flow.

FORMAT indicating nullQueue means we are throwing away events that
match this pattern.
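The same pattern works for single-line data; as a hypothetical sketch, discarding DEBUG-level lines from a chatty sourcetype before they are indexed (and metered):

```ini
# props.conf -- hypothetical sourcetype name
[my_chatty_sourcetype]
TRANSFORMS-1noise = debug_to_null

# transforms.conf -- any event containing " DEBUG " is sent to the null
# queue and thrown away.
[debug_to_null]
REGEX = \sDEBUG\s
DEST_KEY = queue
FORMAT = nullQueue
```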

Other routing

Beyond routing to the nullQueue, you can also route data to:
- other Splunk indexers
- 3rd party systems
See www.splunk.com/base/Documentation/latest/Admin/Routeandfilterdata
for details

Modifying the raw data stream

Sometimes it's necessary to modify the underlying log data, especially
in the case of privacy concerns
Splunk provides 2 methods of doing this: REGEX and SEDCMD
The REGEX method uses transforms.conf and works on a per-event
level; the SEDCMD method uses only props.conf and operates on an
entire source, sourcetype, or host identified stream
Care should be taken when modifying _raw since, unlike all other
modifications discussed, this sort actually modifies the raw log data

Modifying _raw - REGEX

Works similarly to previous props.conf / transforms.conf modifications

props.conf:

[source::...\\store\\purchases.log]
TRANSFORMS-1ccnum = cc_num_anon

transforms.conf:

[cc_num_anon]
DEST_KEY = _raw
REGEX = (.*CC_Num:\s)\d{12}(\d{4}.*)
FORMAT = $1xxxxxxxxxxxx$2

DEST_KEY = _raw indicates we are modifying the actual log data.

$1 preserves all the data prior to the first 12 digits of the credit card
number. $2 grabs everything after, including the last 4 digits; we need
to do this since we are rewriting the raw data feed.

Modifying _raw SEDCMD

Splunk leverages a sed-like syntax for simplified data modifications
- Note that while sed is traditionally a UNIX command, this functionality works on Windows-
  based Splunk installs as well
It's all done with a single stanza in props.conf
- The REGEX syntax using s:
  SEDCMD-<name> = s/<REGEX>/<replacement>/<flags>
  flags are either g to replace all matches, or a number to replace just that number of matches
- The string match syntax using y:
  SEDCMD-<name> = y/<string1>/<string2>/
  String matches cannot be limited; all matches will be replaced
  string1 will be replaced with string2
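A sketch of the y (string match) syntax with a hypothetical source; y/ maps characters positionally, so here every "=" becomes ":" and every "," becomes ";":

```ini
# props.conf -- hypothetical character-for-character substitution across
# all events from this source (cannot be limited to N matches).
[source::.../key_value.log]
SEDCMD-punct = y/=,/:;/
```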

SEDCMD cont

An example SEDCMD REGEX-based replacement to overwrite the
first 5 digits of an account number anytime it appears in the
accounts.log source:

[source::.../accounts.log]
SEDCMD-1accn = s/id_num=\d{5}(\d{5})/id_num=xxxxx\1/g

\1 here works like a $1 back-reference in a transforms.conf REGEX.

This will replace id_num=1234567890 with id_num=xxxxx67890
You can put multiple replacement rules in a single props.conf stanza;
simply put a space and start again with s/
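Following the chaining rule above, a hypothetical stanza with two s/ replacement rules separated by a space (the field names are illustrative only):

```ini
# props.conf -- first rule masks a 9-digit id, second strips a trailing
# debug marker; both run against each event from this source.
[source::.../accounts.log]
SEDCMD-scrub = s/ssn=\d{9}/ssn=xxxxxxxxx/g s/\sDEBUG$//g
```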

Parsing phase: override

Splunk's automatic processing can be overridden/disabled
- Make your changes to files in $SPLUNK_HOME/etc/system/local or
  $SPLUNK_HOME/etc/apps/<app_name>/local
To disable, create/edit props.conf in $SPLUNK_HOME/etc/system/local
or the local directory of an app
- Turn off syslog host extraction for the syslog sourcetype:

$SPLUNK_HOME/etc/system/local/props.conf
[syslog]
TRANSFORMS =

overwrites

$SPLUNK_HOME/etc/system/default/props.conf
[syslog]
TRANSFORMS = syslog-host

Indexing phase details

After the parsing phase Splunk passes the fully processed data off to the
index processor

[Figure: in the indexing phase, _raw is metered for license usage
(license meter), then the keyword index is created, _raw is compressed,
and both are written to disk]

Persisted to disk

Once data reaches hard disk, all modifications and extractions are
written to disk along with _raw
- source, sourcetype, host, timestamp, and punct
Indexed data cannot be changed
Modifications to processing won't be retroactive without reindexing
For this reason it's recommended to test default and custom index time
processing on a staging instance prior to indexing in production

Search phase Big picture

[Figure: searches from users or alerts read buckets from disk, and
search time modifications are applied to the results]

Search phase Big picture RT search

Real time searches work similarly except they bypass disk
[Figure: real time searches from users or alerts tap the data stream at
the index phase, before it is written to disk, and search time
modifications are applied to the results]

Search time modifications

MANY different transformations/updates/modifications are available at
search time
- Data (usually sourcetype) dependent field extractions, whether custom, default,
  or from add-ons or apps
- Lookups, event types, tags, field aliases, and many more . . .
These changes only apply to search results; no modification to data
written to disk
- Fully retroactive and designed to be flexible
- Best way to customize data and build institutional knowledge into Splunk

Search time for admins

Splunk extends the ability to create most search time mods to the
Power and User roles
- Most are covered in the more user/knowledge-manager oriented classes
- Most can be fully administered through Splunk Web's Manager
Admins may be called on to
- install apps and add-ons (already covered)
  Remember, apps/add-ons are bundles of search time lookups, field extractions, tags, etc.,
  NOT just views and dashboards
- create custom field extractions and change/disable search time modifications
  using the file system

Default field extractions at search time

Most fields used in Splunk come from your data
For many common sourcetypes Splunk has default search time field
extractions in place
Additional default extractions are easy to add with add-ons and apps
- The *Nix app, for example, has many search time fields for standard UNIX-y logs
  like secure.log or messages.log, etc.
- The Windows app has similar defaults for Windows data
- For non-OS data, look for an app specifically designed for that data on
  www.splunkbase.com

3 ways to create a search time field

1. Editing config files
   - available only to admins, knowledge of REGEX required
2. Using the IFX in Splunk Web (covered in Using)
   - available to the admin and power roles, knowledge of REGEX helpful but not required
3. Using the rex command in the search language (covered in Search
   & Reporting)
   - all roles can use this command, knowledge of REGEX required

The usual suspects

Custom search time fields are created by stanzas in props.conf and
sometimes transforms.conf
2 methods
1. using just props.conf EXTRACT
   - Simple single field extractions
   - Available after Splunk 4.0
   - Recommended method, covered here
2. using props.conf REPORT and transforms.conf
   - Useful for reusing extractions across multiple sourcetypes
See www.splunk.com/base/Documentation/latest/Knowledge/Createandmaintainsearchtimefieldextractionsthroughconfigurationfiles for details

props.conf EXTRACT

A single stanza in props.conf using EXTRACT with a source,
sourcetype, or host spec (usually a sourcetype)
Use the EXTRACT setting with a name and the REGEX after the
equals sign

props.conf:

[tradelog]
EXTRACT-1type = .*type:\s(?<acct_type>personal|business)

Wrap parentheses around your field value to create a named capture,
and then embed your field name within those parentheses with
?<field_name>
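One EXTRACT can create several fields at once by using multiple named capture groups; a hypothetical extension of the tradelog example (the acct_id and region fields are assumptions, not part of the course data):

```ini
# props.conf -- two named captures in one REGEX yield two search time
# fields, acct_id and region.
[tradelog]
EXTRACT-2acct = acct_id:\s(?<acct_id>\d+)\s+region:\s(?<region>\w+)
```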

Other search time processing

Many other knowledge objects/search time processing are stored in
other config files
- macros.conf, tags.conf, eventtypes.conf, savedsearches.conf, etc.
When users create or modify these, Splunk Web simply writes to these
files for them
Admins can directly modify these files, though we recommend using
Manager if possible
See the .conf files in $SPLUNK_HOME/etc/system/README and the
docs for details on specific files

Lab 3


Section 4:
Config Precedence


Config files and precedence

UI or CLI changes also update config files
Splunk gathers up all of the various config files and combines them at
index and search time based on rules of precedence
Rules of precedence vary depending on whether configurations are being
applied at search time or index time
Index time precedence relies solely on the location of the files
Search time precedence also takes into account which user is logged
in and which app they are using

Index time precedence

At index time, Splunk applies precedence in the following order
1. $SPLUNK_HOME/etc/system/local
2. $SPLUNK_HOME/etc/apps/<app_name>/local**
3. $SPLUNK_HOME/etc/apps/<app_name>/default
4. $SPLUNK_HOME/etc/system/default

**Note that within the $SPLUNK_HOME/etc/apps directory individual apps get
precedence based on ASCII alphabetical order. So an app called aardvark
would have precedence over the windows app. But an app called 1windows
would have precedence over aardvark since numbers come before letters in
ASCII. Also note that ASCII order is not numerical order, so 1 would come before
2, but 10 would also come before 2!
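As a hypothetical illustration of this ordering, suppose the same attribute is set in locations 1 and 2; at index time the system/local value wins:

```ini
# $SPLUNK_HOME/etc/system/local/props.conf  (precedence 1 -- wins)
[syslog]
TRANSFORMS = syslog-host

# $SPLUNK_HOME/etc/apps/unix/local/props.conf  (precedence 2 -- ignored
# for this attribute, since system/local already defines it)
[syslog]
TRANSFORMS =
```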

Index time precedence

[Figure: directory tree of $SPLUNK_HOME/etc showing the index time
order. 1: system/local; 2 and 3: apps/unix/local and apps/search/local,
in ASCII order by app name; then apps/unix/default and
apps/search/default, also in ASCII order; 6 (last): system/default. The
users directories (joe, mary, admin) play no role at index time.]

Search time precedence

Search time has the following precedence order
1. $SPLUNK_HOME/etc/users/<username>/<app_context>/local**
2. $SPLUNK_HOME/etc/apps/<app_context>/local and default**
3. $SPLUNK_HOME/etc/system/local
4. $SPLUNK_HOME/etc/apps/<app_by_ASCII>/local***
5. $SPLUNK_HOME/etc/apps/<app_by_ASCII>/default
6. $SPLUNK_HOME/etc/system/default
- ** app_context is the app the user is currently in/using, and username refers to
  the actual user name the user logged in as
- *** app_by_ASCII refers to the ASCII order described in the previous slide

Search time precedence

Example: mary working in the unix app context

[Figure: the same $SPLUNK_HOME/etc directory tree, now showing the
search time order. 1: users/mary/unix/local; 2 and 3: apps/unix/local
and apps/unix/default; 4: system/local; after 3, the earlier index time
pattern applies (remaining apps' local and default directories in ASCII
order, then system/default last).]

Precedence is cumulative

At index time, if $SPLUNK_HOME/etc/system/local/props.conf contained this stanza:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog

And if $SPLUNK_HOME/etc/apps/tradeapp/local/props.conf contained:

[source::/opt/tradelog/trade.log]
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

The combined result becomes:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

However

At index time, if $SPLUNK_HOME/etc/system/local/props.conf contained the following stanza:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog

And if $SPLUNK_HOME/etc/apps/tradeapp/local/props.conf contained:

[source::/opt/tradelog/trade.log]
sourcetype=log_of_trade
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

The result becomes (system/local wins the sourcetype conflict):

[source::/opt/tradelog/trade.log]
sourcetype=tradelog
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

Section 5:
Splunk's Data Store

Section Objectives

Learn the index directory structure
Answer the question "What are buckets?" and describe how they
move from hot to cold
Describe how to configure aging and retention times
Show how to set up indexes
Learn how to set up volumes on hard disk
Describe backup strategies
Show how to clean out an entire index or selectively delete data

Splunk's default indexes

Splunk ships with several indexes already set up
- main - the default index; all inputs go here by default (called defaultdb in the
  file system)
- summary - default index for the summary indexing system
- _internal - Splunk indexes its own logs and metrics from its processing here
- _audit - Splunk stores its audit trails and other optional auditing information
- _thefishbucket - Splunk stores file information for its monitor function

Index locations in the file system

$SPLUNK_DB = $SPLUNK_HOME/var/lib/splunk
    defaultdb      (index=main)
        db         (hot / warm buckets)
        colddb     (cold buckets)
        thaweddb   (unarchived buckets)
    os
    _internaldb
    etc

Each index has three subdirectories

Index divisions

Splunk divides its indexes into 3 sections, plus a special restored-from-
archive section, for fastest searching and indexing
- Hot - most recently indexed events, multiple buckets, read and write, same
  directory as warm
- Warm - next step in the aging process, multiple buckets, read only, same
  directory as hot
- Cold - final step in the aging process, multiple buckets, read only, separate
  directory from warm and hot
- Thawed - restored-from-archive data, read only, separate directory from the
  rest

What are buckets?

Buckets are logical groupings of indexed data based on time range
- Starting in the hot section, Splunk divides its indexed data into buckets based
  on their time range
  Periodically, Splunk runs the optimize process on the hot section of the index to optimize
  the placement of events in the buckets
  Once a hot bucket reaches its size limit, it will be automatically rolled into warm
  Default bucket size is set automatically by Splunk at install based on OS type
- Once rolled into warm, each individual bucket is placed in a directory with 2 time
  stamps and an id number as the directory name
- Splunk uses buckets to limit its searches to the time range specified, pulling
  recent results from hot right away, then those from warm or cold after that

Bucket retention times

Hot buckets are segregated by date ranges
- Will roll from hot to warm once max size is met OR no data has been added to
  a particular hot bucket in 24 hours
Warm by default contains 300 buckets
- When bucket 301 is created, the oldest is rolled into cold
Cold will keep a bucket for six years (default)
- Once the youngest event in a bucket turns 6, the bucket will be moved to frozen
- Buckets in frozen are either archived or deleted (deleted is the default)

Configuring and adding indexes

You can configure existing indexes by using Splunk Web, the CLI,
or editing indexes.conf
You can add new indexes by Splunk Web, CLI, or editing
indexes.conf
- Certain parameters are only set in indexes.conf

Adding or editing indexes with Splunk Web

Max bucket size can be set manually
For daily indexing rates higher than 5 GB a day set it to
auto_high_volume
- This will give you 1 GB (32-bit) or 10 GB (64-bit) buckets
Set to auto, it will give you 750 MB buckets for both
Adding an index requires a restart

Set up and edit indexes indexes.conf

Indexes are controlled by indexes.conf
- Global settings like default database appear before the specific
  index stanzas
- Each index has its own stanza with the name of the index in [ ]

defaultDatabase = webfarm

[webfarm]
homePath = h:\splunk_index\db
coldPath = h:\splunk_index\colddb
thawedPath = h:\splunk_index\thawdb

Set up and edit indexes indexes.conf (cont)

Some per-index settings
- Change number of buckets in warm
- Max total data size (in MB)
  If data grows beyond this number, Splunk will automatically move cold
  buckets to frozen
  This setting takes precedence over all other time/retention settings
- frozenTimePeriodInSecs = time in seconds buckets will stay in cold

[webfarm]
homePath = h:\splunk_index\db
coldPath = h:\splunk_index\colddb
thawedPath = h:\splunk_index\thawdb
maxWarmDBCount = 150
maxTotalDataSizeMB = 850000
frozenTimePeriodInSecs = 2598000

Cold to frozen

Frozen is either archive or oblivion; default is deletion
To archive you must define:
- coldToFrozenPath - location where Splunk automatically archives frozen data
- Splunk will strip away the index data and only store the raw data in the frozen
  location
- Frozen can be slow, inexpensive storage: NAS, tape, etc.
- Older versions of Splunk used cold-to-frozen scripts; those are still supported,
  though if you specify both a coldToFrozenPath and a coldToFrozenScript, the
  path setting will take precedence
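A sketch of an archiving configuration, assuming a hypothetical NAS share; with coldToFrozenPath set, buckets aging out of cold are archived (raw data only) instead of deleted:

```ini
# indexes.conf -- hypothetical paths
[webfarm]
frozenTimePeriodInSecs = 2598000
coldToFrozenPath = \\slownas\splunk_archive\webfarm
```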

Editing index settings in Manager

Navigate to Manager >> Indexes
Select the index to view and change the settings

Storing cold in a separate location

Warm and hot live in the same directory
Cold is separate and can be moved to a different location
- Specify the new location for cold in indexes.conf or in Manager

[webfarm]
homePath = h:\splunk_index\db
coldPath = \\filer\splunk_cold\colddb
thawedPath = h:\splunk_index\thawdb
maxWarmDBCount = 150
maxTotalDataSizeMB = 850000
frozenTimePeriodInSecs = 2598000

Storage volumes

You can specify locations and maximum size for index partitions using
volume stanzas
- Handy way to group and control multiple indexes
- Volume size limits apply to all indexes that use the volume

Create volumes in indexes.conf:

[volume:hotNwarm]
path = g:\superRAID
maxVolumeDataSizeMB = 100000

[volume:cold]
path = \\slowNAS\splunk
maxVolumeDataSizeMB = 1000000

Use volumes in index definitions:

[network]
maxWarmDBCount = 150
frozenTimePeriodInSecs = 15778463
homePath = volume:hotNwarm\network
coldPath = volume:cold\network

Be sure to use subdirectories for your indexes to avoid collisions

Moving an entire index

To move an index requires 4 steps
1. Stop Splunk
2. Copy the entire index directory to the new location, being sure to preserve
   permissions and all subdirectories; verify the copy
3. Edit indexes.conf to indicate the new location
4. Restart Splunk
Use cp -rp on UNIX or robocopy on Windows

Backups: What to backup

3 main categories
Indexed event data
- Both the actual log data AND the Splunk index
- $SPLUNK_HOME/var/lib/splunk/
User data
- Things such as event types, saved searches, etc.
- $SPLUNK_HOME/etc/users/
Splunk configurations
- Configuration files updated either by hand or via Manager
- $SPLUNK_HOME/etc/system/local
- $SPLUNK_HOME/etc/apps/

Backups: How

Recommended method
Using the incremental backup tool of your choice, back up:
- Warm and cold sections of your indexes
- User files
- Archive or backup configuration files
- Hot cannot be backed up without stopping Splunk
Recommended methods of backing up hot
- Use the snapshot capability of the underlying file system to take a snapshot of
  hot, then back up the snapshot
- Schedule multiple daily backups of warm (works best for high data volumes)

Rolling hot into warm

Why?
- If your indexing rate is low, and as a result your hot doesn't roll into warm often
  enough, making you worried about losing data in hot between backups
How
- Roll the hot db into warm with a script right before backing up
- Restarting splunkd also forces a roll from hot to warm
- Example roll command for the CLI:
  ./splunk _internal call /data/indexes/<index_name>/roll-hot-buckets
  Be careful about too many forced rolls to warm; too many warm buckets can greatly
  impact search performance

Deleting data: who


The delete command can be used to permanently remove data from Splunk's data store
By default, even the admin role does not have the ability to run this command
- It is not recommended to give this ability to the admin role
- Instead, allow a few users to log in to a role specifically set up for deletions
- Create a user that's part of the can_delete role
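If you prefer a dedicated custom role over the built-in can_delete, a minimal authorize.conf sketch might look like the following (the role name is a made-up example; delete_by_keyword is the capability behind the delete command):

```ini
# Hypothetical custom role for deletions -- rename to suit your environment
[role_log_deleter]
importRoles = user
delete_by_keyword = enabled
```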


Deleting data: how


Log in to Splunk Web as a user of the can_delete role
Craft a search that identifies the data you wish to delete
- Double check that the search ONLY includes the data you wish to delete
- Pay special attention to which index you are using and the time range
- Once you're certain you've targeted only the data you want to delete, pipe the search to delete
- Note that this is a virtual delete. Splunk marks the events as deleted and they will never show in searches again, but they will continue to take up space on disk.
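For example, a deletion search might look like this (the index, sourcetype, and time range here are purely illustrative; adapt them to the data you actually want to remove):

```
index=web sourcetype=access_combined earliest=-30d@d latest=-7d@d
| delete
```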

Cleaning out an index


Splunk clean all will remove users, saved searches and alerts
Other options:
- clean [eventdata|userdata|all] [index name] [-f]
eventdata - indexed events and metadata on each event
userdata - user accounts - requires a Splunk license
all - everything on the server
If no index is specified, the default is to clean all indexes
SO ALWAYS SPECIFY AN INDEX TO AVOID TEARS


Restoring a frozen index


To thaw, move a copy of the bucket directory to an index directory
- ./splunk rebuild <bucket directory> will rebuild the index
- Will also work to recover a corrupted directory
- Does not count against license
- Must shut down splunkd before running the ./splunk rebuild command


Section 6:
Users, Groups, and
Authentication


Section Objectives
Understand user roles in Splunk
Create a custom role
Understand the methods of authentication in Splunk


Manage users and roles


User roles
There are three built-in user roles:
Admin, Power, User
(Can Delete is a special case already covered)
Administrators can configure custom roles
- Name the role
- Specify a default app
- Define the capabilities for the role
- Limit the time ranges the role can use
- Specify both default and accessible indexes

New roles are available via the Splunk Manager Access controls option



Custom user roles set restrictions


Give the role a name and select a default app
Set restrictions
- Search terms restrict searches on certain fields, sources, hosts, etc.
- Time range default is -1 (no restriction). Set time range in seconds


Custom user roles set limits


Set limits (optional)
- Limits are per-person


Custom user roles inherit


Custom roles can be based on standard roles
Administrators can then add or remove capabilities of the imported role


Custom user roles capabilities


Add or remove capabilities
See authorize.conf.spec or
http://www.splunk.com/base/Documentation/latest/Admin/authorizeconf
for details
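As a sketch, a custom role stanza in authorize.conf might combine inheritance, capabilities, and search restrictions like this (the role name, index names, and values are illustrative, not from the course):

```ini
# Hypothetical custom role inheriting from the built-in user role
[role_webteam]
importRoles = user
schedule_search = enabled
# default and allowed indexes for this role
srchIndexesDefault = web
srchIndexesAllowed = web;main
# limit searches to the last day (seconds)
srchTimeWin = 86400
```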


Custom user roles indexes


You can specify which indexes this role is allowed to search as well as which are searched by default


Splunk authentication users


Specify user name, email, and
default app


Splunk authentication users (cont.)


Assign a role and set password


LDAP authentication
Splunk can be configured to work with most LDAP servers, including Active Directory
LDAP can be configured from Splunk Manager
See the docs for details:
www.splunk.com/base/Documentation/latest/Admin/SetUpUserAuthenticationWithLDAP
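Under the hood, Manager writes authentication.conf. A rough sketch of an LDAP setup (the host, DNs, and attribute names are placeholders for your directory, not values from the course):

```ini
[authentication]
authType = LDAP
authSettings = corp_ldap

# Hypothetical LDAP strategy -- replace with your directory's details
[corp_ldap]
host = ldap.example.com
port = 389
bindDN = cn=splunkbind,ou=service,dc=example,dc=com
bindDNpassword = changeme
userBaseDN = ou=people,dc=example,dc=com
userNameAttribute = uid
realNameAttribute = cn
groupBaseDN = ou=groups,dc=example,dc=com
groupNameAttribute = cn
groupMemberAttribute = member
```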


Scripted Authentication
Leverage existing PAM or RADIUS authentication systems for Splunk
For the most up-to-date information on scripted authentication, see the
README file in
$SPLUNK_HOME/share/splunk/authScriptSamples/
There are also sample authentication scripts in that directory


Single Sign On
Authentication is moved to a web proxy which passes along
authentication to Splunk Web
[Diagram: SSO client, proxy server, auth server, Splunk server]
1. SSO client sends Splunk request to proxy
2. Proxy authorizes client against auth server
3. Proxy passes request with user name to Splunk Web
4. Splunk Web returns page to proxy
5. Proxy returns page to client

Lab


Section 7: Forwarding
and Receiving


Section objectives
Understand forwarders
Compare forwarder types
Examine topology examples
Deploy and configure forwarders


Splunk forwarder types


Universal forwarder
- Streamlined data-gathering agent version of Splunk with a separate installer
- Contains only the essential components needed to forward raw or unparsed data to receivers/indexers
- Cannot perform content-based routing
- In most cases, best tool for forwarding data
- Throughput limited to 256kbps

Light forwarder
- Full Splunk in Light forwarder mode (no separate install), otherwise works the same as Universal forwarder

Heavy forwarder
- Full Splunk instance does everything but write data to index
- Breaks data into events before forwarding
- Can handle content-based routing


Comparing forwarders
If you need to...                                            Use
Forward unparsed data to a receiver or indexer               Universal forwarder
Collect data on a forwarder that requires a
python-based scripted input                                  Light forwarder
Route collected data based on event info or filter
data prior to WAN/slower connection                          Heavy forwarder


Forwarder topology: data consolidation


Most common topology
Multiple forwarders send data to a central indexer


Forwarder topology: load balancing


Distributes data across multiple indexers
Forwarder routes data sequentially to different indexers at specified intervals with automatic failover*
* Requires distributed search - covered later in this section


Setting up forwarders big picture


1. Enable receiving on your indexer(s)
2. Install forwarders on production systems
3. Configure forwarders to send to receivers
4. Test connection with small amount of test data
5. Set up inputs on forwarders
6. Verify inputs are being received


Configure forwarding and receiving - Manager


You can set up basic forwarding
and receiving using Manager


Set up receiving port Splunk Web


Specify the TCP port you wish Splunk to listen on and click Save
- NOT the Splunk Web or splunkd ports


Enable Indexer to indexer forwarding/receiving


You can easily forward indexed data from one Splunk server to
another
Useful for replication across sites or forwarding one type of data to a
different indexer


Enable forwarding Splunk Web


Enter either the hostname or IP address with the port of the receiving server
- If multiple hosts are defined, you can optionally select Automatic Load Balancing
Restart required


Install universal forwarder: Windows


The Windows version of the Universal forwarder includes an InstallShield package that guides you through most of the forwarder's configuration
If the installer detects an earlier version of Splunk Forwarder you can:
- Automatically perform a migration during installation
- Fishbucket info is migrated, config files are NOT
- Install UF in a different location to preserve legacy forwarder


Install universal forwarder: Windows (cont.)


If using a deployment server, indicate the hostname or IP and port
- Deployment server is covered in a later module
Indicate the receiving indexer hostname or IP and port
- Must be listening port of indexer
- Skip if using deployment server


Install universal forwarder: Windows (cont.)


Choose to forward from local or remote
- If remote, enter domain, username and password for remote host on next screen


Install universal forwarder: Windows (cont.)


Enable Windows inputs
- Event logs
- Performance monitoring
- AD monitoring
Clicking next begins the installation
- You can update your universal forwarder's configuration post-install by directly editing its inputs.conf and outputs.conf

Install universal forwarder: Windows CLI


Use the CLI installation method when:
You want to install the universal forwarder across your enterprise via a deployment tool
You do not want the universal forwarder to start immediately after installation
- Include LAUNCHSPLUNK=0 in the install command
You want to prepare a system image for cloning that includes a Universal Forwarder


Install universal forwarder: Windows CLI (.cont)


Run as Local System user and request configuration from deploymentserver1
- For new deployments of the forwarder
- msiexec.exe /i splunkuniversalforwarder_x86.msi DEPLOYMENT_SERVER="deploymentserver1:8089" AGREETOLICENSE=Yes /quiet

Run as a domain user but don't launch immediately
- Prepare a sample host for cloning
- msiexec.exe /i splunkuniversalforwarder_x86.msi LOGON_USERNAME="AD\splunk" LOGON_PASSWORD="splunk123" DEPLOYMENT_SERVER="deploymentserver1:8089" LAUNCHSPLUNK=0 AGREETOLICENSE=Yes /quiet


Install universal forwarder: Windows CLI (.cont)


Enable indexing of the Windows security and system event logs - run installer in silent mode
- Collect just the Security and System event logs through a "fire-and-forget" installation
- msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" WINEVENTLOG_SEC_ENABLE=1 WINEVENTLOG_SYS_ENABLE=1 AGREETOLICENSE=Yes /quiet

Migrate from an existing forwarder - run installer in silent mode
- Migrate now and redefine your inputs later
- msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" MIGRATESPLUNK=1 AGREETOLICENSE=Yes /quiet


Install universal forwarder: *nix


Install as you would a full Splunk instance, replacing the package name
- rpm -i splunkuniversalforwarder_package_name.rpm
Start Splunk and accept the license
Configure the following options
- Auto start: splunk enable boot-start
- Deployment server: splunk set deploy-poll <host:port>
- Client without deployment server: splunk enable deploy-client
- Forward to an indexer: splunk add forward-server <host:port>
Configure inputs via inputs.conf
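A minimal inputs.conf sketch for a *nix universal forwarder (the paths, sourcetype, and index are illustrative, not from the course):

```ini
# Monitor a syslog file; explicit sourcetype is an example assumption
[monitor:///var/log/messages]
sourcetype = syslog

# Monitor a hypothetical web server log directory into a custom index
[monitor:///var/log/httpd]
index = web
```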



Migrate to universal forwarder: *nix


You can migrate checkpoint data from an existing *nix light forwarder (version 4.0 or later) to the universal forwarder
- Important: Migration can only occur the first time you start the universal forwarder, post-installation. You cannot migrate at any later point

1. Stop all services on the host
2. Install the universal forwarder - do not start
3. In the installation directory, create a file $SPLUNK_HOME/old_splunk.seed that contains a single line with the path of the old forwarder's $SPLUNK_HOME directory
4. Start the universal forwarder
5. Edit / add configurations

Migration process only copies checkpoint files - you should manually copy over the old forwarder's inputs.conf

Forwarding configurations
inputs.conf on the forwarder gathers the local logs/system info needed
- You can include input phase settings in props.conf on light forwarders
- Per-event processing must be done on the indexer
outputs.conf points the forwarder to the correct receiver(s)
- If you set up forwarding in Splunk Manager, it will reside in the app context you were in when you enabled it
- If creating by hand, best practice is to place it in $SPLUNK_HOME/etc/system/local


Outputs.conf basic example


Main [tcpout] stanza has global settings
[tcpout:web_indexers] stanza sets up receiving server
- Compression is turned on
- Server setting refers to either the IP or host name plus port of receiver

[tcpout]
# Global settings
defaultGroup=web_indexers
disabled=false

[tcpout:web_indexers]
# Receiving server
server=splunk1.company.com:9997
compressed=true

[tcpout-server://splunk1.company.com:9997]


Outputs.conf indexer to indexer clone


Main [tcpout] stanza has global settings such as whether to index a local copy
[tcpout:uk_clone] stanza sets up receiving server
- Compression is turned on
- Server setting refers to either the IP or host name plus port of receiver

[tcpout]
# Global settings
indexAndForward=true

[tcpout:uk_clone]
# Receiving server
compressed=true
server=uk_splunk.company.com:9997


Outputs.conf single indexer and SSL


Each forwarder would have a copy of outputs.conf with the following stanza
- Additionally the forwarders would be sending using SSL, using Splunk's self-signed certificates
[tcpout:indexer]
server=splunk.company.com:9997
sslPassword=ssl_for_m3
sslCertPath=$SPLUNK_HOME/etc/auth/server.pem
sslRootCAPath=$SPLUNK_HOME/etc/auth/cacert.pem


Outputs.conf clone indexers


Set multiple target groups to get forwarders to send exact copies to multiple indexers
[tcpout:indexer1]
server=splunk1.mycompany.com:9997
[tcpout:indexer2]
server=splunk2.mycompany.com:9997


Auto load balancing


Splunk also offers automatic load balancing, which switches from server to server in a list based on a time interval
Two options:
- static list in outputs.conf (see below)
- DNS list based on a series of A records for a single host name

[tcpout:list_LB]
autoLB=true
server=splunk1.company.com:9997,splunk2.company.com:9997


Auto load balancing DNS list


To set up DNS list load balancing, create multiple A records with the same name with the IP address of each indexer

From DNS zone file:
splunk1  A 10.20.30.40
splunk2  A 10.20.30.41
splunk1b A 10.20.30.40
splunk1b A 10.20.30.41

[tcpout:DNS_LB]
autoLB=true
server=splunk1b.mycompany.com:9997
autoLBFrequency=60

Caching/queue size in outputs.conf


maxQueueSize = 1000 (default) is the number of events the forwarder will queue if the target group cannot be reached
In load-balanced situations, if the forwarder can't reach one of the indexers, it will automatically switch to another, and will only queue if all are down/unreachable
See outputs.conf.spec for details and even more queue settings
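Putting queueing together with load balancing, an outputs.conf sketch (the group name, hosts, and size are illustrative assumptions):

```ini
# Hypothetical load-balanced target group with a larger queue
[tcpout:primary_indexers]
server = splunk1.company.com:9997,splunk2.company.com:9997
autoLB = true
# queue up to 5000 events if all indexers are unreachable
maxQueueSize = 5000
```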


Indexer Acknowledgement
Guards against loss of data when forwarding to an indexer
- Forwarder will re-send any data not acknowledged as "received" by the indexer

Disabled by default
Requires version 4.2 of both forwarder and receiver
Can also be used for forwarders sending to an intermediate forwarder


Indexer Acknowledgement process


As the forwarder sends data, it maintains a copy of each 64k block in memory in the wait queue until it gets an acknowledgment from the indexer
- While waiting, it continues to send more data blocks
The indexer receives a block of data, then parses and writes to disk
Once on disk, the indexer sends acknowledgment to the forwarder
Upon acknowledgment, the forwarder releases the block from memory
- If the wait queue is of sufficient size, it doesn't fill up while waiting for acknowledgments to arrive
- Wait queue size can be increased (covered in a later slide)

What happens when no ack is received?


If the forwarder doesn't get acknowledgment for a block within 300 seconds (by default), it closes the connection
- Change wait time by setting readTimeout in outputs.conf
- If auto load balancing is enabled, it opens a connection to the next indexer in the group and sends the data
- If auto load balancing is not enabled, it tries to open a connection to the same indexer as before and resend the data
Data block is kept in the wait queue until acknowledgment is received
- Once the wait queue fills, the forwarder stops sending until it receives acknowledgment for one of the blocks, at which point it can free up space in the queue

Handling duplicates
If there's a network problem that prevents an acknowledgment from reaching the forwarder, dupes may occur
- Example: indexer receives a data block, then generates the acknowledgment - network goes down before the forwarder gets the ack
- When the network comes back up, the forwarder resends the data block - the indexer parses and writes it again
A forwarder will record events to splunkd.log when it receives duplicate acks or resends due to no response


Enabling Indexer Acknowledgement


Enabled on the forwarder
- Both forwarder and indexer must be at version 4.2 or greater
Set useACK to true in outputs.conf
- Disabled by default

[tcpout:<target_group>]
server=<server1>,<server2>,...
useACK=true

You can set useACK either globally or by target group, at the [tcpout] or [tcpout:<target_group>] stanza levels
You cannot set it for individual servers at the [tcpout-server:...] stanza level

Increasing wait queue size


Max wait queue size is 3x the size of the in-memory output queue, which you set with the maxQueueSize attribute in outputs.conf
maxQueueSize=[<integer>|<integer>[KB|MB|GB]]
Wait queue and the output queues are configured by the same attribute but are separate queues
- Example: if you set maxQueueSize to 2MB, the maximum wait queue size will be 6MB
Specifying a lone integer - maxQueueSize=100 - sets max events for parsed data and max blocks (~64K) for unparsed data
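Combining acknowledgment with a larger queue, a sketch (the group name and values are illustrative); with maxQueueSize=2MB the wait queue would be allowed to grow to 6MB:

```ini
# Hypothetical target group with acknowledgment and a 2MB output queue
[tcpout:acked_indexers]
server = splunk1.company.com:9997
useACK = true
maxQueueSize = 2MB
```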


Forwarding to an intermediate forwarder


Two main possibilities to consider:
Originating forwarder and intermediate forwarder both have acknowledgment enabled
- Intermediate forwarder waits until it receives acknowledgment from the indexer and then sends acknowledgment back to the originating forwarder
Originating forwarder has acknowledgment enabled - intermediate forwarder does not
- Intermediate forwarder sends acknowledgment back to the originating forwarder as soon as it sends the data on to the indexer
- Because it doesn't have useACK enabled, the intermediate forwarder cannot verify delivery of the data to the indexer

Lab


Section 8: Distributed
Environments


Objectives
List Splunk server types
Understand Distributed search
Describe search head pooling
Understand Deployment server


Types of Splunk server


[Diagram: universal and heavy forwarders feed an indexer; a search head searches across indexers]
- Indexer: gathers data from inputs and forwarders, processes it and writes it to disk
- Universal forwarder: separate install; gathers data and forwards to indexer
- Search head: accessed by users; runs ad-hoc and scheduled searches/alerts; distributes searches out to all peers and combines results
- Heavy forwarder: gathers or receives data, processes it and then forwards on to indexer

Data lifecycle review


Four main phases in the data lifecycle
- Input: Splunk forwarder or full Splunk
- Parsing: Splunk heavy forwarder or indexer
- Indexing: Indexer
- Search: Search head

[Diagram: forwarders -> indexer -> search head]
- Forwarders collect raw data and send to indexer
- Parsing: line breaks, timestamps, index-time field extractions; save to disk and index
- Search: pull events from index, search-time field extractions, display events, reports, etc.


Distributed Environments Overview


The next three sections will introduce you to common topologies and tools used in distributed environments
Distributed Search
- Search across multiple indexes

Search Head Pooling


- Multiple search heads share configuration data

Deployment Server
- Manage multiple, varying Splunk instance configurations from a single server


Distributed Search


Distributed search overview


Search heads send search requests to multiple indexers and merge the results back to the user
In a typical scenario, one Splunk server searches indexes on several other servers
Used for
- Horizontal scaling across multiple indexers - used for high volume data scenarios
- Accessing geo-diverse indexers
- Access control
- High availability scenarios


Distributed search topology examples


Simple distributed search for horizontal scaling - one search head searching across three peers


Distributed search topology examples (cont.)


Access control example - department search head has access to all the indexing search peers
- Each search peer also has the ability to search its own data
- Department A search peer has access to both its data and the data of department B


Distributed search topology examples (cont.)


Load balancing example - provides high availability access to data


Distributed Search setup - Manager


Turn on Distributed search and optionally turn on auto-discovery
- Allows this Splunk server to automatically add other search peers it discovers on the network


Distributed Search Add Peers - Manager


Add individual peers manually
- Include authentication


Search Head Pooling


Search head pooling overview


Multiple search heads can share configuration data
Allows horizontal scaling for users searching across the same data
Also reduces the impact if a search head becomes unavailable
Shared resources are:
- .conf files
- Search artifacts - saved searches and other knowledge objects
- Scheduler state - only one search head in the pool runs a particular scheduled search
Makes all files in $SPLUNK_HOME/etc/{apps,users} available for sharing - .conf files, .meta files, view files, search scripts, lookup tables, etc.
All search heads in a pool should be running the same version of Splunk

Topology example with load balancer


[Diagram: users log in through a Layer 7 load balancer to a pool of search heads sharing NFS or other similar storage technology]


Topology example without load balancer


[Diagram: users log in directly to individual search heads in the pool, which share NFS storage]


Create a pool of search heads


Set up each search head individually in the same manner as configuring distributed search
1. Set up shared storage that each search head can access
- For *nix, use NFS mount
- For windows, use CIFS (SMB) share
- The Splunk user account needs read/write access to shared storage

2. Stop splunkd on all search heads in pool


Enable each search head


3. Use the pooling enable CLI command to enable pooling on a search head:
splunk pooling enable <path_to_shared_storage> [--debug]
- On NFS, <path_to_shared_storage> is the NFS mountpoint
- On Windows, <path_to_shared_storage> is the UNC path of the CIFS/SMB share
- Execute this command on each search head in the pool. The command:
  Sets values in the [pooling] stanza of the server.conf file in $SPLUNK_HOME/etc/system/local
  Creates user and app subdirectories
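The values written to server.conf look roughly like this (the storage path is an example):

```ini
# Written by `splunk pooling enable` in $SPLUNK_HOME/etc/system/local/server.conf
[pooling]
state = enabled
storage = /tmp/nfs
```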


Copy user and app directories to share


4. Copy the contents of the $SPLUNK_HOME/etc/apps and $SPLUNK_HOME/etc/users directories on existing search heads into the empty apps and users directories on the shared storage
- For example, if your NFS mount is at /tmp/nfs, copy the apps subdirectories into /tmp/nfs/apps
- Similarly, copy the user subdirectories: $SPLUNK_HOME/etc/users/ into /tmp/nfs/users
5. Restart each search head in the pool


Using a load balancer


Allows users to access the pool of search heads through a single
interface, without needing to specify a particular one
Ensures access to search artifacts and results if one of the search
heads goes down
When configuring the load balancer:
- The load balancer must employ layer-7 (application-level) processing
- Configure the load balancer so user sessions are "sticky" or "persistent" to ensure that a user remains on a single search head throughout a session


Search head management commands


splunk pooling validate
- Revalidate the search head's access to shared resources
splunk pooling disable
- Disables pooling for a given search head
splunk pooling display
- Displays / verifies current status of search head

$ splunk pooling enable /tmp/nfs
$ splunk pooling display
Search head pooling is enabled with shared storage at: /tmp/nfs
$ splunk pooling disable
$ splunk pooling display
Search head pooling is disabled


Configuration changes
Once pooling is enabled on a search head, you must notify the search head if you directly edit a .conf file
If you add a stanza to any config file in a local directory, you must run the following command:
splunk btool fix-dangling
Not necessary if you make changes via Splunk Web Manager or CLI


Deployment Server


Deployment server overview


The deployment server pushes out configurations and content packaged in deployment apps to distributed clients
Allows you to manage multiple Splunk instances from a single Splunk server
- Small environments - deployment server can also be a deployment client
- Greater than 30 deployment clients - deployment server should be its own instance


Deployment Terminology
Deployment server
- A Splunk instance that acts as a centralized configuration manager
- Supplies configurations to any number of Splunk instances
- Any Splunk instance can act as a deployment server

Deployment client
- Splunk instances that are remotely configured
- A Splunk instance can be both a deployment server and client at the same time

Server class
- A logical grouping of deployment clients based on need for the same configs

Deployment app
- Set of deployment content (including configuration files) deployed as a unit to clients of a server class

Deployment server uses


Distribute Apps and/or configurations
- Windows file servers
  Splunk for Windows App
  Collect event logs and WMI
- Database group
  Uptime, system health, access errors
- Web Hosting Group
  Analytics, business intelligence


Server Classes examples


Windows
- Windows Server 2003
- IIS

Database
- Solaris servers (sunos-sun4u)
- Oracle

Web hosting group


- Apache on Linux

Could also group clients by OS, Hardware type, location, etc.



Deployment server example


[Diagram: deployment server manages two server classes]
- www-forwarder server class: www1-forwarder, www2-forwarder, www3-forwarder
- db-logging-forwarder server class: db1-forwarder, db2-forwarder


Deployment server configuration overview


1. Designate a Splunk instance as deployment server
2. Create serverclass.conf on the deployment server at
$SPLUNK_HOME/etc/system/local
3. Create deployment apps on the deployment server and put the
content to be deployed into directories
4. Create deploymentclient.conf on the Deployment clients
5. Restart the deployment clients


Deployment serverclass.conf (cont.)


Server classes group clients that need the same configuration
If filters match the apps and configuration, content is deployed to the client
Stanzas in serverclass.conf go from general to more specific
All configuration information is evaluated from top to bottom in the configuration file, so order matters

[global]
# Applies to all server classes
# repositoryLocation: where apps are stored on the deployment server
# targetRepositoryLocation: where apps will be delivered on the client
repositoryLocation=$SPLUNK_HOME/etc/deploymentApps
targetRepositoryLocation=$SPLUNK_HOME/etc/apps

# Server-class specific settings
[serverClass:AppsByMachineType]
[serverClass:AppsByMachineType:app:win_eventlog]


Server classes example serverclass.conf


www-forwarder server class:

[serverClass:www-forwarder]
filterType=blacklist
blacklist.0=*
# Server class only applies to clients in the 10.1.1* IP range
whitelist.0=*.10.1.1*

# Deploy this app to clients that match
[serverClass:www-forwarder:app:webfarm-forwarders]
stateOnClient=enabled

db-logging-forwarder server class:

[serverClass:db-logging-forwarder]
filterType=blacklist
blacklist.0=*
# Server class only applies to clients in the 192.2* IP range
whitelist.0=*.192.2*

# Deploy this app to clients that match
[serverClass:db-logging-forwarder:app:db-forwarder]
stateOnClient=enabled

serverclass.conf group by machine type


You can create server classes that apply to specific machine types or OSs

# Deploy this app only to Windows machines
[serverClass:AppsByMachineType:app:SplunkDesktop]
machineTypes=Windows-Intel

# Deploy this app only to Linux 32 or 64 bit machines
[serverClass:AppsByMachineType:app:unix]
machineTypes=linux-i686, linux-x86_64


serverclass.conf client handling options


Optionally configure actions to take on the client after an app is deployed

restartSplunkWeb=<True or False>            # Defaults to false
restartSplunkd=<True or False>              # Defaults to true
stateOnClient=<enabled, disabled, noop>     # Enable or disable apps on the client after installation or change
change
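Put together, these options might appear in a serverclass.conf app stanza like the following sketch (the server class and app names reuse the earlier hypothetical example):

```ini
# serverclass.conf on the deployment server (names are illustrative)
[serverClass:wwwforwarder:app:webfarmforwarders]
stateOnClient = enabled
restartSplunkd = true
restartSplunkWeb = false
```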


Set up a Deployment Client


Install Splunk on the client machine
Run the following command
./splunk set deploy-poll <ip address/hostname of deployment server>:8089 -auth admin:changeme

This will create a file named deploymentclient.conf


[deployment-client]
disabled=false

[target-broker:deploymentServer]
targetUri=225.225.225.1:8089

URI of deployment server

Verify deployment server clients


From the deployment server, you can verify deployment clients from CLI with
the following command:
./splunk list deploy-clients

Command output

Deployment client: ip=192.168.2.4, dns=192.168.2.4, hostname=mycompanyPC64, mgmt=8089, build=64889, name=deploymentClient, id=connection_192.168.2.4_8089_192.168.2.4_deploymentClient, utsname=windows-unknown


Deployment actions
Default poll period is 30 seconds
- Specified in serverclass.conf
The deployment server instructs the client what it should retrieve
The deployment client then retrieves the new content
[Diagram: the client polls the deployment server, the server sends instructions, and the client gets the content]

Force-notify clients of changes


If you make changes to a deployment app on the deployment server,
you may want to immediately notify the clients of the change
Run ./splunk reload deploy-server to notify all clients
Run ./splunk reload deploy-server -class <class name> to notify a specific class


Section 9: Licensing


Section Objectives
Identify license types
Understand license violations
Define license groups
Define license pooling and stacking
Add and remove licenses


Splunk license types


Enterprise license
- Purchased from Splunk
- Allows for full functionality
- License limits indexing volume

Enterprise trial license downloads with product


- 500MB per day limit
- Otherwise same as enterprise, except that it expires 60 days after install


Splunk license types (cont.)


Forwarder license
- Applied to non-indexing forwarders, and deployment servers
- Allows authentication, but no indexing

Free license
- Activates automatically when 60 day trial enterprise license expires
- Can be activated before 60 days by using Manager
- Doesn't allow authentication, forwarding to non-Splunk servers, or alerts
- Does allow 500MB/day of indexing and forwarding to other Splunk instances


License warnings and violations


5th warning in a rolling 30 day period causes violation and search to be
disabled
- 3rd warning in Free version
- You must be violation-free for 30 consecutive days for the warning count to reset

Indexing will continue, only search is locked out


- Note that you can still search Splunk's internal indexes

Contact Splunk Support to unlock your license


License groups
License types are organized into groups
- Enterprise Group

Includes Enterprise, Enterprise Trial, and sales trial

- Free Group
- Forwarder Group

Licenses are stored in directories at


$SPLUNK_HOME/etc/licenses
- Each group is stored in a separate folder under that directory


License stacking and pooling overview


Licenses in the Enterprise group can be aggregated together, or
stacked
- Available license volume is the sum of the volumes of the individual licenses

Enterprise trial license that comes with the Splunk download cannot be
stacked
Free license cannot be stacked
Pools can be created for a given stack
- Specify Splunk indexing instances as members of a pool for the purpose of

volume usage and tracking


- Allows for insulation of license usage by group of indexers or data type
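On the license master, pools are defined in [lmpool] stanzas in server.conf. A hedged sketch (the pool name, quota value, and slave GUID placeholders are examples):

```ini
# server.conf on the license master
[lmpool:web_indexers]
description = Pool for the web-tier indexers
quota = 107374182400     # 100GB, expressed in bytes
slaves = <guid-of-indexer-1>,<guid-of-indexer-2>
stack_id = enterprise
```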

Topology example: single pool


Master has a stack of two licenses for a total of 500GB
All indexers in the pool share 500GB entitlement collectively
This should be the most common scenario
Enterprise Stack 500 GB Total Entitlement
Default License Pool
500 GB Shared Entitlement

[Diagram: 300GB License + 200GB License]


Topology example: multiple pools


Master has a stack of two licenses, totaling 500GB
Each pool has a specific entitlement amount
Enterprise Stack 500 GB Total Entitlement
[Diagram: 300GB License + 200GB License; Default Pool: 100GB local Entitlement, Pool 2: 100GB Entitlement, Pool 3: 200GB Entitlement, Pool 4: 100GB Entitlement]

Managing licenses: overview


You can manage license stacks
and pools via Manager
- Switch from master to slave
- Change license group
- View license alerts
- Add licenses and manage stacks
- Add and manage pools


Managing licenses: master/slave


By default, Splunk instances
are master license servers
Change an instance to slave
by entering the master license
server URI
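Behind the Manager screen, the same change can be made in server.conf on the slave instance; a minimal sketch (the master's host name is a placeholder):

```ini
# server.conf on a license slave
[license]
master_uri = https://<master-license-server>:8089
```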


Change license group


Each master can only
manage a single license
group
Select Enterprise,
Forwarder, or Free
- Forwarder and Free cannot be

stacked or used in Pools


- Enterprise is default


Adding a license
Any 4.x license can be added
- 4.2 licenses can be uploaded, or

XML can be copy/pasted


- 4.0 and 4.1 licenses must be
uploaded


License stacks

[Diagram: a 4.2 Enterprise license and a 4.1 Enterprise license combined into an Enterprise Stack]

License pools
For each stack, you can create
one or more additional license
pools
- Define a maximum volume for the

pool
- Select indexers for the pool


Viewing pool volume

[Screenshot: Default pool and Added pool]


Viewing alerts


Viewing license info: master


For each license installed on the
master, you can view specific
license info
- Exp. Date/time
- Features allowed
- Max violations
- Quota
- Stack name and type
- Status
- Violation window period

Viewing license info: slave


Displays local indexer name, master license server URI, last
successful connection
Messages link displays license alerts


Lab


Section 10: Security


Section objectives
Learn what you can secure in Splunk
Understand SSL and Splunk
Learn about user group and index security
Learn what is recorded in the audit log
Describe how to secure the audit log
Understand archive data signing


What you can secure in Splunk


SSL
- splunkd to Splunk Web
- Splunk Web to client
- forwarder to indexer

Audit
- user actions
- file system

Data Signing
- cold to frozen archive data
- audit data in Splunk

SSL
Already enabled between splunkd and Splunk Web
Can be enabled via Splunk Web > Manager or by editing web.conf
- Splunk will automatically generate self-signed certificates
- You can pay for certificates to avoid browser complaints

Forwarder to indexer communication can be secured


- Enabled in outputs.conf
- Adds to forwarder processor overhead

Can force Splunk to only use SSLv3 if required
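As a hedged sketch of forwarder-to-indexer SSL (the host name, port, and certificate paths below are examples, not defaults):

```ini
# outputs.conf on the forwarder
[tcpout:ssl_indexers]
server = indexer.example.com:9997
sslCertPath = $SPLUNK_HOME/etc/auth/server.pem
sslPassword = password
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
sslVerifyServerCert = true

# inputs.conf on the indexer
[splunktcp-ssl:9997]

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/server.pem
password = password
rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
```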


Data / Index Security


Securing sensitive data within Splunk is best achieved by segregating
the data by index
Index access is governed by user groups
Index level security is the best method to ensure users have access to
the data they need, while preventing them from seeing sensitive data
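For example, a role limited to a single index might be sketched in authorize.conf like this (the role and index names are hypothetical):

```ini
# authorize.conf
[role_webteam]
importRoles = user
srchIndexesAllowed = web
srchIndexesDefault = web
```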


Auditing
Splunk automatically creates an audit trail of Splunk user actions
- Stored in the _audit index
- Accessible only by administrators by default
- Useful for monitoring for prying eyes

Splunk also audits file systems (FS change monitor)


- Use it on /etc/passwd or on Splunk's own config files
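A hedged inputs.conf sketch of the FS change monitor (the paths and poll periods are examples from a default *nix install):

```ini
# inputs.conf
[fschange:/etc/passwd]
pollPeriod = 60
fullEvent = true

[fschange:/opt/splunk/etc]
recurse = true
pollPeriod = 600
```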


Signing audit data


Splunk has the ability to number and sign audit trail data
- Detects gaps
- Detects tampering
- Creates fields called validity and gap in the audit log
- Does not work in distributed environments
- See the Knowledge Base for details on setting this up

http://www.splunk.com/base/Documentation/latest/Admin/Signauditevents


Signing archive data


You can sign archive data when it moves from cold to frozen
You must specify a custom archiving script
- You cannot use it if you choose to have Splunk perform the archiving

automatically

Add signing to your script using signtool -s <archive_path>


Splunk verifies archived data signatures automatically when the
archive is restored
- Verify signatures manually by using signtool -v <archive_path>


Splunk Product Security Resources


The Splunk Product Security Portal provides a single location for:
- Splunk Product Security Announcements
- Splunk Product Security Policy
- Splunk Product Security Best Practices
- Reporting Splunk Product Security Vulnerabilities

This site is updated regularly with any security-related updates or


announcements
http://www.splunk.com/page/securityportal
splunk.com > Support > Security

Section 11:
Jobs, Knowledge Objects, and
Alerts


Section objectives
Understand jobs
Manage jobs
Understand alerts, and alert settings
Understand PDF server and alerts
Understand what knowledge objects are and how to set their
permissions


What are jobs?


Jobs are searches that users or the system runs
- A job is created when

You hit return in the search box


You load a dashboard with embedded saved searches
An alert is triggered or saved search runs

- Jobs create artifacts when they run

What are artifacts?


- Traces of jobs (such as search results) that are created on disk
- Persistence to disk allows users to recreate or resurrect jobs


Managing jobs: Splunk Web


Users can manage their own jobs
Administrators can manage all users' jobs
Click on Jobs in Splunk Web to manage, rerun, and resurrect jobs


Manage jobs: OS level (*nix only)


Search jobs run as processes at the OS level
View search jobs running
Included in the process description will be key information
- the actual search running
- who ran the search
- their role
- the search ID
ps -ef | grep "splunkd search"

502 3179 1662 0 ?? 0:00.26 splunkd search --id=rt_1297105108.42 --maxbuckets=0 --ttl=600 --maxout=10000 --maxtime=0 --lookups=1 --reduce_freq=10 --user=admin --pro --roles=admin:power:user

Manage jobs: OS level (continued)


There will be 2 jobs for each process
- 2nd job is the helper; it will die if you kill the 1st job

Running jobs will be writing data to


$SPLUNK_HOME/var/run/splunk/dispatch/<job_id>
- Saved searches will append the name of the saved search to the job_id
directory
- This directory exists for the TTL of the job
- You may need to delete artifact directories for jobs you kill by hand
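A hedged shell sketch for finding such leftovers; it only lists candidate directories so they can be reviewed before deletion (the dispatch path in the usage comment assumes a default *nix install):

```shell
# List (do not delete) dispatch artifact directories that have not been
# modified for more than the given number of minutes.
list_stale_dispatch() {
    dispatch_dir="$1"
    ttl_minutes="$2"
    find "$dispatch_dir" -mindepth 1 -maxdepth 1 -type d -mmin "+$ttl_minutes"
}

# Usage (path assumes a default install):
# list_stale_dispatch /opt/splunk/var/run/splunk/dispatch 600
```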


Alerts Review
Alerts are saved searches
that run on a schedule and
do something based on the
data that is returned
Alerts can send an email,
trigger a shell script, or
create an RSS feed


Email alert configuration


In the Email Subject field, $name$ is
replaced by the saved search name
You must first configure email alert
settings in Manager


PDF report server


Splunk offers the ability to print and email reports in PDF format
You must install the PDF print server add-on on a Linux-based
Splunk instance
- The Splunk instance doesn't have to be an indexer, but cannot be a light forwarder

See www.splunk.com/base/Documentation/latest/Installation/ConfigurePDFprintingforSplunkWeb for details

Scripted alerts
You can have an alert that activates a script
Scripts must be located in $SPLUNK_HOME/bin/scripts
Scripts can be in any language the underlying operating
system can run
Splunk passes a number of variables to the script
For details on variables etc., see the docs:
http://www.splunk.com/base/Documentation/latest/admin/ConfigureScriptedAlerts
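A minimal hypothetical alert script, assuming the positional arguments Splunk 4.x passes to scripted alerts (the log file path is arbitrary):

```shell
#!/bin/sh
# Hypothetical alert handler; place it in $SPLUNK_HOME/bin/scripts.
# Splunk passes positional arguments to the script, including:
#   $1 - number of events returned
#   $4 - name of the saved search
#   $8 - path to the file containing the raw results
LOG=/tmp/splunk_alert.log
echo "$(date) search='${4:-unknown}' events=${1:-0} results='${8:-none}'" >> "$LOG"
```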

Knowledge Objects
Knowledge objects are user-created things such as
- Eventtypes
- Saved Searches
- Field Extractions using IFX (Interactive Field Extractor)
- Tags

Knowledge objects initially are only available to the user who


created them
Permissions must be granted to allow other users/apps to use
them

Knowledge object permissions


Users only need read
permissions to use knowledge
objects
Use app context to segregate
app-specific knowledge objects


Section 12:
Troubleshooting

Section objectives
Learn how to set specific log levels using Manager
Learn basic troubleshooting steps to solve/identify common
issues
Learn how to get community help with Splunk
Understand how to contact Splunk Support


Splunk's log levels


Log levels from lowest to highest: crit, fatal, error, warn, info, debug
By default all subsystems are set to info or warn
All of Splunks logs can be set to debug by restarting Splunk in debug
mode
- Generally not recommended since it's burdensome on production systems and

creates lots of unwanted noise in the logs


- Better to set to debug granularly on the individual subsystem(s) you are
troubleshooting (see next slide)
- Splunk Support may ask for overall debug mode in certain cases


Set granular log levels


You can granularly adjust
subsystem log levels to
debug to troubleshoot
specific issues using
Manager
Can also set them using
log.cfg in
$SPLUNK_HOME/etc
(useful for light forwarders)
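For example, a couple of categories bumped to DEBUG in log.cfg might look like this (the category names shown are examples; the full list for your version appears in Manager):

```ini
# $SPLUNK_HOME/etc/log.cfg
category.TailingProcessor=DEBUG
category.DeploymentClient=DEBUG
```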

Troubleshooting: check your search


Many times input or forwarder problems are actually
misdiagnosed search problems
Before starting to troubleshoot a missing input or forwarder that
is not forwarding, double check your search
- Sometimes inputs wind up in unexpected indexes so try adding index=*

when searching for a missing input/forwarder


- Sometimes time stamps are extracted wrong on new inputs, try searching
All Time to help diagnose this
- Generally, use wildcards in other parts of your search to cast the widest net
for missing data
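Combining these tips, a deliberately wide search for a missing input might look like this (the host pattern is an example):

```
index=* host=web* earliest=0
```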

Deployment monitor
The Deployment Monitor is a collection of dashboards and drilldown
pages with information to help monitor the health of a system
- Index throughput over time
- Number of forwarders connecting to the indexer over time
- Indexer and forwarder abnormalities
- Details for individual forwarders and indexers, such as status and forwarding

volume over time


- Source types being indexed by the system
- License usage


Main index throughput and forwarders


Main indexer and forwarder warnings


Main sourcetype warnings


Viewing warning info


Click the arrow icon to view warning information


Configuring alerts
Click configure alerting to modify
the underlying saved
search/alert


Indexers: All Indexers


Number of current active
searches
MB indexed today
- Can select alternate time range

Table report of indexer(s) status,


last connection, and total GB
indexed in last 30 minutes
- Can select alternate time range


Indexer Properties
Data specific to a given indexer
- Drill-down from All Indexers view
- Can drill-down on any chart item to

show underlying events


All Sourcetypes
Shows MB Received by
sourcetype
Table display shows each
sourcetype, current status, last
received, and total MB received
Drill down on any item for
underlying events


Sourcetype info
Drill-down from All
sourcetypes shows info
for single sourcetype


License Usage
Cumulative MB per day by
Sourcetype
MB Received
- By sourcetype, source, host, forwarder,

indexer, license pool


- Drill-down shows underlying events in
Search view

Usage statistics
- By sourcetype, source, host, forwarder,

indexer, license pool


- Shows last received and total MB
received

Backfill data
Use backfill Summary Indexes to add two weeks' worth of data to the summary indexes (useful for a new Deployment Monitor installation on an existing Splunk instance)
Use Flush and Backfill to erase old data and re-populate


Community based support


Splunk docs are constantly being updated and improved, so be sure to
select your version of Splunk to make sure the doc you are reading
applies to your version
http://www.splunk.com/base/Documentation
Splunk Answers: post specific questions and get them answered by
Splunk experts (also makes for great and informative reading)
http://answers.splunk.com
IRC Channel: Splunk maintains a channel #splunk on the EFNet IRC
server. Support engineers and many well-informed Splunk users hang
out there

Splunk Support
Contact Splunk Support email: support@splunk.com
File a case online
http://www.splunk.com/index.php/submit_issue
24/7 phone depending on support contract


Thanks! Please take our survey.
