
Administering Splunk 4.2
Ver. 1.0

Document usage guidelines


Should be used only for enrolled students
Not meant to be a self-paced document
Do not distribute

March 24, 2011


Operational Intelligence

Administering Splunk 4.2

Class Goals
Describe Splunk installation and server operations
Configure data inputs
Describe default processing and understand how to modify data inputs
Manage Splunk datastores
Add users, configure groups, and understand authentication
Describe alert configurations
Configure forwarding/receiving and clustering
Use Splunk's Deployment Server
Manage jobs and knowledge objects
Find out where to get help

Course Outline
1. Installing Splunk
2. Configuring Data Inputs
3. Modifying Data Inputs
4. Config File Precedence
5. Splunk's Data Stores
6. Users, Groups, and Authentication
7. Forwarding and Receiving
8. Distributed Environments
9. Licensing
10. Security
11. Jobs, Knowledge Objects, and Alerts
12. Troubleshooting

Section 1:
Installing Splunk


Section objectives
List Splunk's hardware/software requirements
Describe how to install Splunk
Perform server basics: starting, stopping, and restarting Splunk
Describe the Splunk license model
List the basic tools to configure Splunk: Manager, CLI, and editing config files
Describe apps
Upgrade to 4.2
List what's new in Splunk 4.2 for administrators

OS requirements
Splunk works on Windows, Linux, Solaris, FreeBSD, Mac OS X, AIX, and HP-UX
Check current documentation for specifics for each OS


Hardware Requirements

Platform        Recommended Configuration                  Minimum Configuration
Non-Windows OS  2x quad-core Xeon, 3 GHz, 8 GB RAM,        1x 1.4 GHz CPU, 1 GB RAM
                RAID 0 or 1+0, with a 64-bit OS installed
Windows         2x quad-core Xeon, 3 GHz, 8 GB RAM,        Pentium 4 or equivalent
                RAID 0 or 1+0, with a 64-bit OS installed  at 2 GHz, 2 GB RAM


Supported browsers
Firefox 2.x, 3.0.x, 3.5.x (3.5.x only supported on 4.0.6 and later)
Internet Explorer 6, 7, & 8
Safari 3
Chrome 9
All browsers need Flash 9 to render reports and display the Flash timeline


Download the bits


Download Splunk from www.splunk.com/download (login required)
- Online installation instructions are available from the download page
Obtain your enterprise license from sales or support


Download the right bits


There are 32- and 64-bit versions; get the right one
- The wrong version may install, but won't run
Various packages, tarballs, and installers are available for each OS


Install it!
For zipped tarballs, simply unpack the contents into the directory where you want to install Splunk
For Windows, just double-click the MSI file
See the docs for OS-specific packages and Windows command line install instructions
The Splunk install directory is referred to as $SPLUNK_HOME in both the docs and courseware
- UNIX default is /opt/splunk
- Windows default is C:\Program Files\splunk

Step by step instructions


www.splunk.com/base/Documentation/latest/Installation/Chooseyourplatform


UNIX: to be or not to be root?


Splunk can be installed as any user
If you do not install as root, remember:
- The Splunk account must be able to access the data sources
- /var/log is not typically open to non-root accounts
- Non-root accounts cannot access ports < 1024, so don't use them when you configure data sources
- Make sure the Splunk account can access scripts used for inputs and alerts


Windows: local or domain user?


2 choices in Windows: local user OR domain account
Local user will have full access ONLY to the local system
You must use a domain account for Splunk if you want to:
- Read Event Logs remotely
- Collect performance counters remotely
- Read network shares for log files
- Enumerate the Active Directory schema using Active Directory Monitoring

See the docs for details



Splunk subdirectories
Executables are located in $SPLUNK_HOME/bin
License and other important files are in $SPLUNK_HOME/etc
Indexes by default are in $SPLUNK_HOME/var/lib/splunk
Same directories in Windows, just different slashes


Splunk directory structure

$SPLUNK_HOME
  bin          executables
  etc          licenses, configs
    system
    apps
      search
      launcher
      <custom app>
    users
  var
    lib
      splunk   indexes

Windows: Starting Splunk


Upon successful installation, you can choose to add Splunk to the Start menu


Windows: controlling Splunk services
Splunk installs 2 services: splunkd and Splunk Web
- Start and stop them as you would any service
- Both are set to start up automatically
You can also control Splunk from the command line:

C:\Program Files\Splunk\bin>splunk start
C:\Program Files\Splunk\bin>splunk stop
C:\Program Files\Splunk\bin>splunk restart

UNIX: Starting Splunk


The command for using/managing Splunk is
$SPLUNK_HOME/bin/splunk
# pwd
/opt/splunk/bin
# ./splunk start

The first time you start Splunk, avoid the prompt to accept the license by using the command line flag --accept-license
# pwd
/opt/splunk/bin
# ./splunk start --accept-license

UNIX: controlling Splunk processes


Stopping/starting Splunk

# ./splunk start
# ./splunk stop

Restarting Splunk

# ./splunk restart

Is Splunk running?

# ./splunk status

or
# ps -ef | grep splunk


UNIX: run Splunk at boot


Splunk comes with a command to enable it to start at boot
# ./splunk enable boot-start

This modifies or adds a script in /etc/init.d that will automatically start Splunk when the OS starts
Even if you didn't install Splunk as root, this command must be run as root


Splunk processes: splunkd
Accesses, processes, and indexes incoming data
Handles all search requests and returns results
Runs a web server on port 8089 by default
Speaks SSL by default
Splunk helpers run as dependent process(es) of splunkd
- Splunk helpers run external scripts, for example:
  Scripted inputs
  Cold-to-frozen scripts


Splunk processes: Splunk Web
Python-based web server built on CherryPy
Provides both the search and management web front end for splunkd
Runs on port 8000 by default
Sets initial login to user: admin, password: changeme


Apps
Apps are configurations of a Splunk environment designed to meet a specific business need
- Manage a specific technology
  Splunk for Websphere
  Splunk for Cisco
  and many more . . .
- Manage a specific OS
  Splunk for Windows
  Splunk for UNIX/Linux
- Manage compliance
  PCI
  Enterprise Security Suite

splunkbase
Choose from hundreds of apps at splunkbase.splunk.com
- Apps developed by Splunk as well as the community are available
- The vast majority of apps are free, so don't be shy!


Managing a Splunk installation


Three ways to manage a Splunk installation:
1. Command Line Interface (CLI)
2. Directly editing config files
3. Splunk Manager interface in Splunk Web


Managing a Splunk installation - CLI
Command Line Interface (CLI)
- Shell access to the Splunk server and user access to the Splunk directory required
- Most commands require authentication and the admin role to run
- If you don't provide inline authentication credentials, Splunk will ask you

./splunk clean eventdata main -auth admin:myadminpass
(command: clean eventdata; object: main; inline authentication: -auth admin:myadminpass)

Command line interface (CLI)
Also requires authentication
- Enter auth as part of the command or wait for the prompt

#./splunk add monitor /var/log -host www1
Splunk username: admin
Password:

Inline help is available:

#./splunk help
Welcome to Splunk's Command Line Interface (CLI).
Try typing these commands for more help:
  help simple, cheatsheet   display a list of common commands with syntax
  help commands             display a full list of CLI commands
  help [command]            type a command name to access its help page

Managing a Splunk installation - config files
Directly editing config files
- Shell/console access to the Splunk server and sufficient user rights to edit files in the Splunk directory required
- Config files must be saved in UTF-8; be sure to use the right form for non-UTF-8 OSes
- Changes made this way more often require a restart


Direct editing of config files


Changes done this way sometimes require a restart or reload of Splunk


Managing a Splunk installation - Manager
Splunk Manager interface in Splunk Web
- Access to Splunk Web
- Admin role on the Splunk server
- Access from the main navigation Manager link


Splunk Manager general settings


Splunk Manager general settings (cont.)


Splunk Manager general settings (cont.)


Splunk Manager general settings (cont.)


Click Save when you are
done
- All changes to general settings will

require a restart


More Resources
Look on Splunkbase for additional Apps to help you manage your
Splunk servers
http://www.splunkbase.com/apps/All/4.x

There is a Troubleshooting section in the Splunk Admin manual:
http://www.splunk.com/base/Documentation/latest/Admin


Lab 1


Section 2:

Configuring Data Inputs



Section objectives
Set up data inputs
List Splunk's data input types and explain how they differ
Set input properties such as host, ports, index, source type, etc.


Specifying data inputs


There are a number of ways you can specify a data input:
Apps
- Preconfigured inputs for various types of data sources available on splunkbase

Splunk Web
- You can configure most inputs using the Splunk Web data input pages

CLI
- You can use the CLI (command line interface) to configure most types of inputs

inputs.conf
- When you use Splunk Web or CLI, configurations are saved to inputs.conf
- You can edit that file directly to handle advanced data requirements

Types of inputs
Files and directories: monitor physical files on disk
Network inputs: monitor network data feeds on specific ports
Scripted inputs: import from non-traditional sources, APIs, databases, etc.
Windows inputs: Windows-specific inputs: Windows event logs, performance monitoring, AD monitoring, and local registry monitoring
File system change monitoring: monitor the state (permissions, read-only, last changed, etc.) of key config or security files


Setting up new inputs Apps / Add-ons

(configure input through the app setup process)

Setting up new inputs - Manager
Admin role and access to Splunk Web
Changes written to inputs.conf
- Location of inputs.conf is determined by app context


Setting up new inputs - CLI
Admin role and shell/console access to the Splunk server required*
Useful for administering forwarders
Location of inputs added via the CLI is the Search app

#./splunk add monitor /var/log -hostname www1 -index webfarm
Your session is invalid. Please login.
Password:
Added monitor of /var/log

*Using the -uri flag you can send remote CLI commands from a local Splunk instance to a remote instance without shell access. See the docs for details.
http://www.splunk.com/base/Documentation/latest/Admin/AccessandusetheCLIonaremoteserver


Setting up new inputs - inputs.conf
Skip the middleman of Manager or the CLI and directly edit inputs.conf
Shell/console access to the Splunk server required
Changes made this way require a restart

[default]
host = mysplunkserver.mycompany.com

[monitor:///opt/secure]
disabled = false
followTail = 0
host_segment = 3
index = default
sourcetype = linux_secure

[monitor:///opt/tradelog]
disabled = false
sourcetype = trade_entries

inputs.conf (cont.)
Input path specifications in inputs.conf (monitor stanzas) use Splunk-defined wildcards (also used by props.conf, discussed in the next section)
- these are not REGEX-compliant expressions

Wildcard: ...  (regex equivalent: .*)
- The ellipsis wildcard recurses through directories and subdirectories to match.
- Example: /var/log/.../apache.log matches the files /var/log/www1/apache.log, /var/log/www2/apache.log, etc.

Wildcard: *  (regex equivalent: [^/]*)
- The asterisk wildcard matches anything in that specific directory path segment.
- Note: must be used in the last segment of the path.
- Example: /logs/*.log matches all files with the .log extension, such as /logs/apache.log. It does not match /logs/apache.txt.

inputs.conf (cont.)
So:
... matches any character(s), recursively
* matches anything 0 or more times, except the /
. is NOT a wildcard and simply matches the . literally

Syntax details:
$SPLUNK_HOME/etc/system/README/inputs.conf.spec
http://www.splunk.com/base/Documentation/latest/Admin/Inputsconf
http://www.splunk.com/base/Documentation/latest/Admin/Specifyinputpathswithwildcard

Setting source, sourcetype, and host
You can specify source, sourcetype, and host at the input level for most inputs
Source
- Should be left at the default
Sourcetype
- Most default processing for standard data types is based on sourcetype. Whenever possible use automatic sourcetyping, select from Splunk's list, or use the recipes
Host
- Opt for specific hostnames/FQDNs as much as possible, since the host field is a key search tool
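Sketching these recommendations as a single inputs.conf stanza (the host, index, and path shown are hypothetical examples):

```ini
# Hypothetical stanza: explicit metadata set at the input level
[monitor:///opt/secure]
host = www1.mycompany.com    # specific FQDN, since host is a key search field
sourcetype = linux_secure    # chosen from Splunk's list rather than invented
index = security             # the index must already exist
# source is deliberately not set; leave it at the default
```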

Data inputs monitor


Monitor eats data from specified file(s) or directory(ies)
Where
- Can be pointed to an individual file or the top of a complex directory hierarchy
- Recurses through specified directory
- Indexes any directory the Splunk server can reach, local or remote file systems

How
- Unzips compressed files automatically before indexing them
- Eats new data as it arrives
- Automatically detects and handles log rotation
- Remembers where it was in a file and picks up from that spot after restart

Data inputs monitor (cont.)


What
- Uses whitelists and blacklists to include or exclude files and directories
- Can be instructed to start only at the end of a large file (like tail -f)
- Can automatically assign a source type to events, even in directories containing multiple log files from different systems, processes, etc.


Monitor via Manager (called Files & Directories)

add new input
edit existing input


Monitor file or directory Manager


Monitor file or directory Manager: Source


Specify a file or
directory for
ongoing monitoring
Can also upload a
copy of a file
-

Useful for testing


and development


Monitor a file or directory Manager: Host


Specify a constant
value if all monitored
files in an input are from
the same host


Monitor a file or directory Manager: Host


When multiple hosts
write to the same
directory and the
host name appears
in the file name or
part of the path, use
REGEX on path to
extract the host
name

/var/log/www1.log
/var/log/www1.log will
will extract
extract www1
www1
/var/log/www_db1.log
/var/log/www_db1.log will
will extract
extract www_db1
www_db1


Monitor a file or directory Manager: Host


When multiple hosts
write to the same
directory and host
name appears as a
consistent
subdirectory in the
path, use segment in
path
/logs/www1/web.log
/logs/www1/web.log or
or /logs/www2/web.log
/logs/www2/web.log


Monitor a file or directory Manager: Sourcetype


Automatic
- Splunk automatically determines the source type for most major data types
- Useful for directories with many different types of log files
Manual
- Enter a name for a specific sourcetype
From list
- Choose the sourcetype from the dropdown list



Monitor a file or directory Manager: Index


Select the index where this
monitor input will be stored
If you want to put a new input in
a new index, you must create
the index before the input


Monitor a file or directory Manager: Follow tail


Follow tail works like tail -f: it starts at the end of the file and only eats new input from that point forward
Only applies to the very first time the new monitor input is added


Monitor a file or directory - Manager: Whitelist
If a file is whitelisted, Splunk consumes it and ignores all other files in the set
Use whitelist rules to tell Splunk which files to consume when monitoring directories
- This whitelist will only index files that end in .log
- Use a | to create OR statements: indexes files that end in query.log or my.log
- Add a leading slash to ensure an exact file match: only indexes query.log and my.log

Monitor a file or directory - Manager: Blacklist
If a file is blacklisted, Splunk ignores it and consumes all other files in the set
Use blacklist rules to tell Splunk which files not to consume when monitoring directories
- This blacklist won't index files that end in .txt
- Use a | and () to create OR statements: won't index files that end in .txt or .gz
- This blacklist avoids both archive and historical directories (as well as files named archive and historical)
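The whitelist and blacklist rules described above correspond to attributes of a monitor stanza in inputs.conf; a sketch with hypothetical patterns:

```ini
# Hypothetical monitor stanza combining whitelist and blacklist rules
[monitor:///var/log]
# only consume files ending in query.log or my.log ("|" creates an OR;
# the leading "/" ensures an exact file-name match)
whitelist = /query\.log$|/my\.log$
# ignore anything ending in .txt or .gz
blacklist = \.(txt|gz)$
```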

Scripted inputs
Splunk can run scripts periodically that generate input
- Scripts need to be shell (.sh) on *nix or batch (.bat) on Windows
- Or Python on any platform
- Can use any scripting language the OS will run if wrapped in a shell or batch wrapper

Splunk eats the standard output (stdout) of the script


Use them to run diagnostic commands such as top, netstat, vmstat, ps, etc.
Used in conjunction with many Splunk Apps to gather specialized
information from the OS or other systems running on the server
Also good for gathering data from APIs, message queues, or other custom
connections

Setting up a scripted input


1. Write or obtain the script
2. Copy it to your Splunk server's script directory
3. If possible, test your script from that directory to make sure it runs correctly
4. Set up the input in Manager
5. Click Save and wait for a few intervals to pass, then verify that the input is available in Search or its App
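As a concrete sketch of steps 1-2, here is a hypothetical scripted input (the script name and app path below are illustrative, not part of the course lab). Splunk indexes whatever the script writes to stdout:

```shell
#!/bin/sh
# conncount.sh - hypothetical scripted input
# Emits one timestamped key=value event per run; Splunk indexes stdout.
count=$(netstat -an 2>/dev/null | grep -c ESTABLISHED)
echo "$(date '+%Y-%m-%d %H:%M:%S') open_connections=${count:-0}"
```

Copied to $SPLUNK_HOME/etc/apps/<app_name>/bin/, it would be referenced from inputs.conf with a stanza such as [script://$SPLUNK_HOME/etc/apps/<app_name>/bin/conncount.sh] and an interval setting in seconds.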


Manager Scripted inputs


Manager Scripted inputs (cont.)


Splunk will only run scripts from specified bin directories:
$SPLUNK_HOME/etc/system/bin OR
$SPLUNK_HOME/etc/apps/<app_name>/bin

Interval is in seconds, though you can also specify a schedule using cron syntax
The interval is the time period between script executions

Manager Network inputs


Manager - Network inputs: Source port
TCP or UDP feeds from 3rd-party systems (not Splunk forwarders)
Splunk can be configured to listen to a specified UDP or TCP data feed and index the data
- Can be set to accept feeds from any host or just one host on that port
Can specify any unused network port (that is NOT splunkd's or Splunk Web's port)
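In inputs.conf terms, a network input is a [udp://...] or [tcp://...] stanza; a sketch for a syslog-style feed (the ports, host, and sourcetype names are illustrative):

```ini
# Hypothetical network input stanzas

# Listen for UDP syslog from any host on port 514
[udp://514]
sourcetype = syslog

# Restrict a TCP feed to a single sending host by prefixing host:port
[tcp://10.1.2.3:9001]
sourcetype = vendor_feed
```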

Manager - Network inputs: source and sourcetype
By default Splunk will set the source to host:port
- a syslog feed from a firewall named fw_01 would have fw_01:514 for its source
Only two options for sourcetype: from list or manual
- If there are multiple sourcetypes coming from a single network feed, you will need to configure further processing to handle it (covered in the next section)


Manager - Network inputs: Host
Three choices for host:
- IP: Splunk will use the IP address of the sender (default)
- DNS: Splunk will do a reverse DNS lookup for the host name
- Custom: allows you to specify a specific host name


File system change monitoring
fschange (must be set up in inputs.conf) monitors changes to files and directories
DOES NOT index the contents of the files and directories
Writes an event to an index when it detects a change or deletion
Monitors:
- modification date/time
- group ID
- user ID
- file mode (read/write attributes, etc.)
- optional SHA256 hash of file contents

Setting up fschange
Set up a stanza in inputs.conf:

[fschange:/etc/]
pollPeriod = 60
host = splunkserver.company.com

List the directory you want Splunk to monitor
- DO NOT use file system change monitoring on a directory that is being indexed using monitor
Default sourcetype = fs_notification
pollPeriod is the interval in seconds at which Splunk checks the files for changes

Windows inputs
Windows inputs must be set up on a Windows Splunk instance
UNIX indexers CAN and will index and search Windows inputs
Set up a Universal Forwarder or Light Forwarder to get Windows inputs to a UNIX indexer


Windows inputs Local or remote event logs


Local event logs can be collected from a Universal Forwarder or the local indexer
Remote event log collection requires proper domain account permissions on the remote machine


Windows inputs local event logs


Select the event logs you wish Splunk to index
For further settings, edit inputs.conf directly


Windows inputs remote event logs


Enter a host to choose logs
- Click "Find logs" to populate the available logs list
Optionally, you can collect the same set of logs from additional hosts
- Enter host names or IP addresses, separated by commas


Windows event log settings in inputs.conf


start_from: use this setting to tell Splunk to start with the newest events and then work its way back to the oldest; the default is oldest
current_only: if set to 1, Splunk will only index events starting from the day the input was set up and going forward; the default is 0
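A sketch of how these settings might appear in inputs.conf (the Security channel is just an example):

```ini
# Hypothetical Windows event log stanza
[WinEventLog:Security]
disabled = 0
start_from = oldest   # the default; "newest" reads backwards from the newest event
current_only = 1      # only index events arriving after the input was set up
```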


Windows inputs Performance monitor


Use Performance Monitor to collect data from a local machine (Forwarder or Indexer)


Windows inputs Performance monitor (cont.)


Select an object to monitor
Based on the object you select, the Counters section is populated with available counters


Windows inputs Performance monitor (cont.)


Select instances
Set the polling interval


Windows inputs Registry monitoring


Indexes the registry whole cloth, as well as any ongoing changes
See the docs for details on limiting what is actually monitored:
www.splunk.com/base/Documentation/latest/Admin/MonitorWindowsregistrydata


Windows inputs: AD monitoring
You can specify a domain controller or let Splunk discover the nearest one
You can then specify the highest node in the tree you want Splunk to monitor
- Splunk will move down the tree recursively
- If unchecked, Splunk will index the entire tree, including the schema
Use the permissions of the Splunk user to limit what it can monitor in AD
www.splunk.com/base/Documentation/latest/Admin/AuditActiveDirectory

Windows inputs Windows app


Installing the Windows app allows you to collect and monitor several common Windows input types


Lab 2 Data inputs


Section 3:
Modifying Data Inputs


Section objectives
Describe how data moves from input to index
Understand the default processing that occurs during indexing
List the config files that govern data processing
Learn how to override default data processing
Learn how to discard unwanted events
Learn how to mask sensitive data
Learn how to extract fields

Input to Index: Big Picture
(diagram: network inputs, Windows inputs, monitor inputs, and scripted inputs all flow into Splunk for indexing to disk)

Indexing phases
Input Phase: raw data from all forms of input is collected
Parsing Phase: raw data is broken down into events, then event-by-event processing is performed (the license meter is applied here)
Indexing Phase: the index is generated and data is written to disk

Inputs phase details
The inputs phase works with entire streams of data, not individual events. Overarching metadata is applied.
- inputs.conf: source, sourcetype, and host
- props.conf: CHARSET and sourcetyping based on source
- Windows inputs: wmi.conf and regmon-filters.conf

See www.splunk.com/wiki/Where_do_I_configure_my_Splunk_settings%3F for details


props.conf
props.conf is a config file that plays a role in all aspects of Splunk data processing
Governs most aspects of data processing; can also invoke settings in other config files
Uses the same stanza format as inputs.conf and other Splunk config files
See $SPLUNK_HOME/etc/system/README/props.conf.spec and props.conf.example for syntax and examples


props.conf specifications
props.conf stanzas use specifications to map configurations to data streams
The specification can be either host, source, or sourcetype

Pattern:                   Example:
[host::<hostname>]         [host::www1]
attribute = value          TZ = US/Pacific

[source::<source>]         [source::/var/log/trade.log]
attribute = value          sourcetype = trade_entries

[<sourcetype>]             [syslog]
attribute = value          TRANSFORMS-host = per_event_host

source and sourcetype specs are case sensitive; host is NOT


Inputs phase - props.conf
sourcetype can be set based on source during the inputs phase:

[source::/var/log/custom*]
sourcetype = mycustomsourcetype

[source::...\\web\\iis*]
sourcetype = iis_access

The CHARSET spec can be set at this time. The default is automatic; use this setting to override if auto is not working correctly. See the docs for the list of character sets:

[source::.../seoul/*]
CHARSET = EUC-KR

[source::h:\\web\\\\*]
CHARSET = Georgian-Academy

www.splunk.com/base/Documentation/base/Data/Configurecharactersetencoding

Parsing phase: big picture
Data from the inputs phase is broken up into individual events, and then any event-level processing is performed.
(chunks of data from inputs phase → broken into individual events → event-by-event processing)

Parsing phase details
A majority of data processing work is done during the parsing phase
- Actual event boundaries are decided, date/timestamps are extracted, and any type of per-event operation is performed

automatic: auto-sourcetyping, auto date/timestamping, auto-linebreaking, time zone
override: per-event REGEX-based sourcetype, host, or index settings, custom line breaking and date/timestamping
custom: REGEX/SEDCMD rewrites, per-event routing to other indexers, 3rd-party systems, or the null queue

Parsing phase: automatic
Switches data to UTF-8
By default Splunk will attempt to automatically:
- detect event boundaries (monitor and network inputs)
- extract date/timestamps (monitor and network inputs)
- assign sourcetypes (for monitor input only)
Default settings are in $SPLUNK_HOME/etc/system/default/props.conf
- in the parsing phase, props.conf can call stanzas in another config file, transforms.conf, located in the same directory


It's automatic . . .
The success rate of automatic processing will vary. For standard data types such as syslog, web logs, etc., Splunk does a great job. For custom or esoteric logs you'll need to test, though even then the odds are good it will get it right.
- Correct date/timestamping and linebreaking are key to subsequent processing and the ultimate searchability of data
Other types of automatic processing:
- Windows inputs
- syslog host extraction
www.splunk.com/base/Documentation/base/Data/Overviewofeventprocessing

Line breaking
If automatic event boundary detection is not working correctly:
- Bad event breaking is usually easy to detect in indexed test data, but be careful since bad line breaking can show up as bad timestamping
2 methods:
- SHOULD_LINEMERGE = false (most efficient)
  Using this method Splunk cuts the data stream directly into finished events using either the newline \n or carriage return \r characters (default) or a REGEX you specify with LINE_BREAKER
- SHOULD_LINEMERGE = true
  Splunk uses a configurable two-step process to split your data into individual events
Operational Intelligence

97

Administering Splunk 4.2

SHOULD_LINEMERGE = false

Already set for many standard types of data including syslog (including
snare), Windows inputs, and web data
- See $SPLUNK_HOME/etc/system or
  apps/<app_name>/default/props.conf for details
Should be set for custom data with one-event-per-line formats
- breaking on \n or \r characters
Or if possible use other pattern breakers, but be ready to sacrifice the
characters that make up the pattern from your raw data
- The characters that make up the pattern match aren't kept as part of the events
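As a sketch of the LINE_BREAKER approach described above (the sourcetype name and the "|END|" delimiter here are hypothetical, not from the course data):

```ini
# props.conf -- hypothetical feed where events are separated by a "|END|"
# marker instead of newlines. SHOULD_LINEMERGE = false tells Splunk to cut
# events directly on the LINE_BREAKER match; the capture group's characters
# are discarded and not kept as part of the events.
[my_piped_feed]
SHOULD_LINEMERGE = false
LINE_BREAKER = (\|END\|)
```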

SHOULD_LINEMERGE = true

The default if not specified
Splunk merges multiple lines of data into single events based on the
rule: a new line with a date at the start, or 256 total lines, marks an event
boundary
- BREAK_ONLY_BEFORE_DATE = true (the default)
- MAX_EVENTS = 256 (default)
Certain predefined data types like log4j and other application server
logs use BREAK_ONLY_BEFORE = <REGEX pattern> that, when
matching the start of a new line, marks the start of a new event
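A minimal sketch of the BREAK_ONLY_BEFORE approach, assuming a hypothetical application log where every event begins with a line like "TradeID=12345":

```ini
# props.conf -- hypothetical multiline sourcetype. SHOULD_LINEMERGE = true
# is the default and is shown only for clarity; the REGEX marks the start
# of each new event.
[trade_app_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^TradeID=
```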

Custom line merge

If your multiline data and default processing don't get along, beyond
the BREAK_ONLY_BEFORE setting there are many more REGEX
based settings to divide up your events
- www.splunk.com/base/Documentation/latest/Data/Indexmulti-lineevents for
  details
- see also $SPLUNK_HOME/etc/system/README/props.conf.spec
  or www.splunk.com/base/Documentation/latest/Admin/Propsconf

Date/timestamp extraction

Like event boundaries, correct date/timestamp extraction is key to
Splunking your data
Verify timestamping when setting up new data types
- Pay close attention to time stamping during testing/staging of custom or non-
  standard data types
- Convert UNIX time or other non-human-readable time stamps and compare
Well tuned for standard data types
- See props.conf in $SPLUNK_HOME/etc/system/default and
  www.splunk.com/base/Documentation/latest/Data/ConfigureTimestampRecognition
  for details

Custom date/timestamp props.conf

TIME_PREFIX = <REGEX> which matches the characters right BEFORE
the date/timestamp
- Use this for events with multiple timestamps to pinpoint the correct one, or with
  events containing data that looks like a timestamp but isn't and confuses the
  processor

Example data with a date-like code at the start of the line:

1989/12/31 16:00:00 Wed May 23 15:40:21 2011 ERROR UserManagerException thrown

[my_custom_source_or_sourcetype]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s

The TIME_PREFIX above matches everything through "Wed ", which is
where Splunk starts looking for the date/timestamp.

Custom date/timestamp props.conf (cont)

MAX_TIMESTAMP_LOOKAHEAD = <integer> specifying how many
characters to look beyond the start of the line for a timestamp
- works in conjunction with TIME_PREFIX if set, in which case it starts counting from
  the point where TIME_PREFIX indicates Splunk should start looking for the
  date/timestamp
- Improves efficiency of timestamp extraction
As with multiline event configs, see
$SPLUNK_HOME/etc/system/README/props.conf.spec and the
docs for even more options if necessary
www.splunk.com/base/Documentation/latest/Data/Handleeventtimestamps
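Continuing the hypothetical two-timestamp example from the previous slide, the two settings can be combined; the lookahead value of 25 is an assumption sized to cover "May 23 15:40:21 2011":

```ini
# props.conf -- TIME_PREFIX skips the date-like code at the start of the
# line; MAX_TIMESTAMP_LOOKAHEAD then limits how far past that point
# Splunk scans for the real timestamp.
[my_custom_source_or_sourcetype]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s
MAX_TIMESTAMP_LOOKAHEAD = 25
```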

Time zones

Splunk follows these default rules when it attaches a time zone to a
time stamp
1. It looks in the raw event data for a time zone indicator such as
   GMT+8 or PST and uses that
2. It looks in props.conf to see if a TZ attribute has been given for this
   data stream, based on the standard settings referenced here:
   en.wikipedia.org/wiki/List_of_zoneinfo_timezones
3. If all else fails it will apply the time zone of the indexer

[host::nyc*]
TZ = America/New_York

[source::/mnt/cn_east/*]
TZ = Asia/Shanghai

Time and Splunking

Splunk depends heavily on existing time infrastructure
Timestamps in Splunk are only as good as the time settings on the servers
and devices that feed into Splunk
A good enterprise time infrastructure makes for good timestamping,
which makes for good Splunking

Per event REGEX changes

Splunk can modify data in individual events based on REGEX pattern
matches
Requires invoking a second file, transforms.conf (see next slide)
Using props.conf and transforms.conf you can disable/modify
existing modifications, or add your own custom settings

transforms.conf

Config file whose stanzas are invoked by props.conf
- The all-caps TRANSFORMS = <transforms.conf_stanza> syntax is used to
  invoke index time changes
- Required for all REGEX pattern match processing
- Resides in the same directory(ies) as props.conf
- Can also be called at search time by REPORT or LOOKUP (search time section
  coming up)

$SPLUNK_HOME/etc/system/default/props.conf
[syslog]
TRANSFORMS = syslog-host

$SPLUNK_HOME/etc/system/default/transforms.conf
[syslog-host]
DEST_KEY = MetaData:Host
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
FORMAT = host::$1

transforms.conf (cont.)

Transforms uses standard settings to indicate what its REGEX will
match and what it will rewrite based on the match
The source and destination of these transformations are referred to as
keys
- SOURCE_KEY tells Splunk where to apply the REGEX (optional)
- DEST_KEY tells Splunk where to apply the data modified by the REGEX and
  FORMAT setting (required)
- REGEX is the regular expression and capture groups (if any) that operate on the
  SOURCE_KEY (required)
- FORMAT controls how REGEX writes the DEST_KEY (required)
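The keys above can be combined in a sketch like the following; the stanza names and the "prod" path convention are hypothetical. SOURCE_KEY defaults to _raw when omitted, so setting it to MetaData:Source makes the REGEX run against the source path instead:

```ini
# transforms.conf -- hypothetical: any event whose source path contains
# "prod" gets its host rewritten to prod-farm.
[host_from_source_path]
SOURCE_KEY = MetaData:Source
REGEX = prod
DEST_KEY = MetaData:Host
FORMAT = host::prod-farm

# props.conf -- invoke it at index time for a hypothetical sourcetype
[my_app_logs]
TRANSFORMS-sethost = host_from_source_path
```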

Keys in action

From the default syslog host extraction:

[syslog-host]
DEST_KEY = MetaData:Host
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
FORMAT = host::$1

The REGEX pattern here is looking for a host name embedded in syslog
data. Only one capture group is referenced here: the 2nd set of
parentheses. In this circumstance we would expect the host name to
appear within the 2nd set of parentheses.

FORMAT specifies what is written out to the DEST_KEY. Here host::$1
means host = 1st REGEX capture group.

We are updating the host field, so our DEST_KEY is MetaData:Host; for
sourcetype it would be MetaData:Sourcetype; for index it would be
_MetaData:Index (case matters, and for index the underscore counts!).
See transforms.conf.spec for details.

Setting sourcetype per event

You can configure Splunk to set sourcetype on a per-event basis
- This should be your sourcetyping of last resort since inputs.conf settings and
  source-based sourcetyping using just props.conf are less resource intensive

In props.conf:

[source::udp:514]
TRANSFORMS-1srct = custom_sourcetyper

The value after TRANSFORMS- gives this transformation a name space;
this comes into play for multiple transformations and provides
precedence if needed.

In transforms.conf:

[custom_sourcetyper]
DEST_KEY = MetaData:Sourcetype
REGEX = .*Custom$
FORMAT = sourcetype::custom_log

Any event from this source where the last word of the line is "Custom"
will get the sourcetype of custom_log.

Per event index routing

Like sourcetype, if at all possible specify the index for your inputs in
inputs.conf

props.conf:

[routed_sourcetype]
TRANSFORMS-1indx = custom_sourcetype_index

transforms.conf:

[custom_sourcetype_index]
DEST_KEY = _MetaData:Index
REGEX = .
FORMAT = custom_index

Note the use of _MetaData:Index.

We're using a wide-open REGEX since we want everything classified as
this sourcetype routed to a different index. More granular routing would
have a more complex REGEX.

For index routing, the FORMAT simply takes the name of the index you
are routing to.

Filtering unwanted events

You can route specific unwanted events to the null queue
- Events discarded at this point do NOT count against your daily license quota

props.conf:

[WinEventLog:System]
TRANSFORMS-1trash = null_queue_filter

transforms.conf:

[null_queue_filter]
DEST_KEY = queue
REGEX = (?m)^EventCode=(592|593)
FORMAT = nullQueue

Since Windows Event logs are multiline events, we need to use the
REGEX multiline indicator (?m). This applies to any multiline event and
REGEX, not just null queue.

Here our DEST_KEY is queue since we're routing these events outside
the data flow.

FORMAT indicating nullQueue means we are throwing away events that
match this pattern.
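The same pattern works for single-line data; as a hypothetical sketch, discarding DEBUG-level lines from a chatty sourcetype before they are indexed (and metered):

```ini
# props.conf -- hypothetical sourcetype name
[my_chatty_sourcetype]
TRANSFORMS-1noise = debug_to_null

# transforms.conf -- any event containing " DEBUG " is sent to the null
# queue and thrown away.
[debug_to_null]
REGEX = \sDEBUG\s
DEST_KEY = queue
FORMAT = nullQueue
```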

Other routing

Beyond routing to the nullQueue, you can also route data to:
- other Splunk indexers
- 3rd party systems
See www.splunk.com/base/Documentation/latest/Admin/Routeandfilterdata
for details

Modifying the raw data stream

Sometimes it's necessary to modify the underlying log data, especially
in the case of privacy concerns
Splunk provides 2 methods of doing this: REGEX and SEDCMD
The REGEX method uses transforms.conf and works on a per-event
level; the SEDCMD method uses only props.conf and operates on an
entire source, sourcetype, or host identified stream
Care should be taken when modifying _raw since, unlike all other
modifications discussed, this sort actually modifies the raw log data

Modifying _raw - REGEX

Works similarly to previous props.conf / transforms.conf modifications

props.conf:

[source::...\\store\\purchases.log]
TRANSFORMS-1ccnum = cc_num_anon

transforms.conf:

[cc_num_anon]
DEST_KEY = _raw
REGEX = (.*CC_Num:\s)\d{12}(\d{4}.*)
FORMAT = $1xxxxxxxxxxxx$2

DEST_KEY = _raw indicates we are modifying the actual log data.

$1 preserves all the data prior to the first 12 digits of the credit card
number. $2 grabs everything after, including the last 4 digits; we need
to do this since we are rewriting the raw data feed.

Modifying _raw SEDCMD

Splunk leverages a sed-like syntax for simplified data modifications
- Note that while sed is traditionally a UNIX command, this functionality works on Windows-
  based Splunk installs as well
It's all done with a single stanza in props.conf
- The REGEX syntax using s:
  SEDCMD-<name> = s/<REGEX>/<replacement>/<flags>
  flags are either g to replace all matches, or a number to replace just that number of matches
- The string match syntax using y:
  SEDCMD-<name> = y/<string1>/<string2>/
  String matches cannot be limited; all matches will be replaced
  string1 will be replaced with string2
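A sketch of the y (string match) syntax with a hypothetical source; y/ maps characters positionally, so here every "=" becomes ":" and every "," becomes ";":

```ini
# props.conf -- hypothetical character-for-character substitution across
# all events from this source (cannot be limited to N matches).
[source::.../key_value.log]
SEDCMD-punct = y/=,/:;/
```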

SEDCMD cont

An example SEDCMD REGEX-based replacement to overwrite the
first 5 digits of an account number anytime it appears in the
accounts.log source:

[source::.../accounts.log]
SEDCMD-1accn = s/id_num=\d{5}(\d{5})/id_num=xxxxx\1/g

\1 here works like a $1 back-reference in a transforms.conf REGEX.

This will replace id_num=1234567890 with id_num=xxxxx67890
You can put multiple replacement rules in a single props.conf stanza;
simply put a space and start again with s/
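Following the chaining rule above, a hypothetical stanza with two s/ replacement rules separated by a space (the field names are illustrative only):

```ini
# props.conf -- first rule masks a 9-digit id, second strips a trailing
# debug marker; both run against each event from this source.
[source::.../accounts.log]
SEDCMD-scrub = s/ssn=\d{9}/ssn=xxxxxxxxx/g s/\sDEBUG$//g
```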

Parsing phase: override

Splunk's automatic processing can be overridden/disabled
- Make your changes to files in $SPLUNK_HOME/etc/system/local or
  $SPLUNK_HOME/etc/apps/<app_name>/local
To disable, create/edit props.conf in $SPLUNK_HOME/etc/system/local
or the local directory of an app
- Turn off syslog host extraction for the syslog sourcetype:

$SPLUNK_HOME/etc/system/local/props.conf
[syslog]
TRANSFORMS =

overwrites

$SPLUNK_HOME/etc/system/default/props.conf
[syslog]
TRANSFORMS = syslog-host

Indexing phase details

After the parsing phase Splunk passes the fully processed data off to the
index processor

[Figure: in the indexing phase, _raw is metered for license usage
(license meter), then the keyword index is created, _raw is compressed,
and both are written to disk]

Persisted to disk

Once data reaches hard disk, all modifications and extractions are
written to disk along with _raw
- source, sourcetype, host, timestamp, and punct
Indexed data cannot be changed
Modifications to processing won't be retroactive without reindexing
For this reason it's recommended to test default and custom index time
processing on a staging instance prior to indexing in production

Search phase Big picture

[Figure: searches from users or alerts read buckets from disk, and
search time modifications are applied to the results]

Search phase Big picture RT search

Real time searches work similarly except they bypass disk
[Figure: real time searches from users or alerts tap the data stream at
the index phase, before it is written to disk, and search time
modifications are applied to the results]

Search time modifications

MANY different transformations/updates/modifications are available at
search time
- Data (usually sourcetype) dependent field extractions, whether custom, default,
  or from add-ons or apps
- Lookups, event types, tags, field aliases, and many more . . .
These changes only apply to search results; no modification to data
written to disk
- Fully retroactive and designed to be flexible
- Best way to customize data and build institutional knowledge into Splunk

Search time for admins

Splunk extends the ability to create most search time mods to the
Power and User roles
- Most are covered in the more user/knowledge-manager oriented classes
- Most can be fully administered through Splunk Web's Manager
Admins may be called on to
- install apps and add-ons (already covered)
  Remember, apps/add-ons are bundles of search time lookups, field extractions, tags, etc.,
  NOT just views and dashboards
- create custom field extractions and change/disable search time modifications
  using the file system

Default field extractions at search time

Most fields used in Splunk come from your data
For many common sourcetypes Splunk has default search time field
extractions in place
Additional default extractions are easy to add with add-ons and apps
- The *Nix app, for example, has many search time fields for standard UNIX-y logs
  like secure.log or messages.log, etc.
- The Windows app has similar defaults for Windows data
- For non-OS data, look for an app specifically designed for that data on
  www.splunkbase.com

3 ways to create a search time field

1. Editing config files
   - available only to admins, knowledge of REGEX required
2. Using the IFX in Splunk Web (covered in Using)
   - available to the admin and power roles, knowledge of REGEX helpful but not required
3. Using the rex command in the search language (covered in Search
   & Reporting)
   - all roles can use this command, knowledge of REGEX required

The usual suspects

Custom search time fields are created by stanzas in props.conf and
sometimes transforms.conf
2 methods
1. using just props.conf EXTRACT
   - Simple single field extractions
   - Available after Splunk 4.0
   - Recommended method, covered here
2. using props.conf REPORT and transforms.conf
   - Useful for reusing extractions across multiple sourcetypes
See www.splunk.com/base/Documentation/latest/Knowledge/Createandmaintainsearchtimefieldextractionsthroughconfigurationfiles for details

props.conf EXTRACT

A single stanza in props.conf using EXTRACT with a source,
sourcetype, or host spec (usually a sourcetype)
Use the EXTRACT setting with a name and the REGEX after the
equals sign

props.conf:

[tradelog]
EXTRACT-1type = .*type:\s(?<acct_type>personal|business)

Wrap parentheses around your field value to create a named capture,
and then embed your field name within those parentheses with
?<field_name>
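One EXTRACT can create several fields at once by using multiple named capture groups; a hypothetical extension of the tradelog example (the acct_id and region fields are assumptions, not part of the course data):

```ini
# props.conf -- two named captures in one REGEX yield two search time
# fields, acct_id and region.
[tradelog]
EXTRACT-2acct = acct_id:\s(?<acct_id>\d+)\s+region:\s(?<region>\w+)
```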

Other search time processing

Many other knowledge objects/search time processing are stored in
other config files
- macros.conf, tags.conf, eventtypes.conf, savedsearches.conf, etc.
When users create or modify these, Splunk Web simply writes to these
files for them
Admins can directly modify these files, though we recommend using
Manager if possible
See the .conf files in $SPLUNK_HOME/etc/system/README and the
docs for details on specific files

Lab 3


Section 4:
Config Precedence


Config files and precedence

UI or CLI changes also update config files
Splunk gathers up all of the various config files and combines them at
index and search time based on rules of precedence
Rules of precedence vary depending on whether configurations are being
applied at search time or index time
Index time precedence relies solely on the location of the files
Search time precedence also takes into account which user is logged
in and which app they are using

Index time precedence

At index time, Splunk applies precedence in the following order
1. $SPLUNK_HOME/etc/system/local
2. $SPLUNK_HOME/etc/apps/<app_name>/local**
3. $SPLUNK_HOME/etc/apps/<app_name>/default
4. $SPLUNK_HOME/etc/system/default

**Note that within the $SPLUNK_HOME/etc/apps directory individual apps get
precedence based on ASCII alphabetical order. So an app called aardvark
would have precedence over the windows app. But an app called 1windows
would have precedence over aardvark since numbers come before letters in
ASCII. Also note that ASCII order is not numerical order, so 1 would come before
2, but 10 would also come before 2!
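As a hypothetical illustration of this ordering, suppose the same attribute is set in locations 1 and 2; at index time the system/local value wins:

```ini
# $SPLUNK_HOME/etc/system/local/props.conf  (precedence 1 -- wins)
[syslog]
TRANSFORMS = syslog-host

# $SPLUNK_HOME/etc/apps/unix/local/props.conf  (precedence 2 -- ignored
# for this attribute, since system/local already defines it)
[syslog]
TRANSFORMS =
```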

Index time precedence

[Figure: directory tree of $SPLUNK_HOME/etc showing the index time
order. 1: system/local; 2 and 3: apps/unix/local and apps/search/local,
in ASCII order by app name; then apps/unix/default and
apps/search/default, also in ASCII order; 6 (last): system/default. The
users directories (joe, mary, admin) play no role at index time.]

Search time precedence

Search time has the following precedence order
1. $SPLUNK_HOME/etc/users/<username>/<app_context>/local**
2. $SPLUNK_HOME/etc/apps/<app_context>/local and default**
3. $SPLUNK_HOME/etc/system/local
4. $SPLUNK_HOME/etc/apps/<app_by_ASCII>/local***
5. $SPLUNK_HOME/etc/apps/<app_by_ASCII>/default
6. $SPLUNK_HOME/etc/system/default
- ** app_context is the app the user is currently in/using, and username refers to
  the actual user name the user logged in as
- *** app_by_ASCII refers to the ASCII order described in the previous slide

Search time precedence

Example: mary working in the unix app context

[Figure: the same $SPLUNK_HOME/etc directory tree, now showing the
search time order. 1: users/mary/unix/local; 2 and 3: apps/unix/local
and apps/unix/default; 4: system/local; after 3, the earlier index time
pattern applies (remaining apps' local and default directories in ASCII
order, then system/default last).]

Precedence is cumulative

At index time, if $SPLUNK_HOME/etc/system/local/props.conf contained this stanza:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog

And if $SPLUNK_HOME/etc/apps/tradeapp/local/props.conf contained:

[source::/opt/tradelog/trade.log]
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

The combined result becomes:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

However

At index time, if $SPLUNK_HOME/etc/system/local/props.conf contained the following stanza:

[source::/opt/tradelog/trade.log]
sourcetype=tradelog

And if $SPLUNK_HOME/etc/apps/tradeapp/local/props.conf contained:

[source::/opt/tradelog/trade.log]
sourcetype=log_of_trade
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

The result becomes (system/local wins the sourcetype conflict):

[source::/opt/tradelog/trade.log]
sourcetype=tradelog
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=TradeID

Section 5:
Splunk's Data Store

Section Objectives

Learn the index directory structure
Answer the question "What are buckets?" and describe how they
move from hot to cold
Describe how to configure aging and retention times
Show how to set up indexes
Learn how to set up volumes on hard disk
Describe backup strategies
Show how to clean out an entire index or selectively delete data

Splunk's default indexes

Splunk ships with several indexes already set up
- main - the default index; all inputs go here by default (called defaultdb in the
  file system)
- summary - default index for the summary indexing system
- _internal - Splunk indexes its own logs and metrics from its processing here
- _audit - Splunk stores its audit trails and other optional auditing information
- _thefishbucket - Splunk stores file information for its monitor function

Index locations in the file system

$SPLUNK_DB = $SPLUNK_HOME/var/lib/splunk
    defaultdb      (index=main)
        db         (hot / warm buckets)
        colddb     (cold buckets)
        thaweddb   (unarchived buckets)
    os
    _internaldb
    etc

Each index has three subdirectories

Index divisions

Splunk divides its indexes into 3 sections, plus a special restored-from-
archive section, for fastest searching and indexing
- Hot - most recently indexed events, multiple buckets, read and write, same
  directory as warm
- Warm - next step in the aging process, multiple buckets, read only, same
  directory as hot
- Cold - final step in the aging process, multiple buckets, read only, separate
  directory from warm and hot
- Thawed - restored-from-archive data, read only, separate directory from the
  rest

What are buckets?

Buckets are logical groupings of indexed data based on time range
- Starting in the hot section, Splunk divides its indexed data into buckets based
  on their time range
  Periodically, Splunk runs the optimize process on the hot section of the index to optimize
  the placement of events in the buckets
  Once a hot bucket reaches its size limit, it will be automatically rolled into warm
  Default bucket size is set automatically by Splunk at install based on OS type
- Once rolled into warm, each individual bucket is placed in a directory with 2 time
  stamps and an id number as the directory name
- Splunk uses buckets to limit its searches to the time range specified, pulling
  recent results from hot right away, then those from warm or cold after that

Bucket retention times

Hot buckets are segregated by date ranges
- Will roll from hot to warm once max size is met OR no data has been added to
  a particular hot bucket in 24 hours
Warm by default contains 300 buckets
- When bucket 301 is created, the oldest is rolled into cold
Cold will keep a bucket for six years (default)
- Once the youngest event in a bucket turns 6, the bucket will be moved to frozen
- Buckets in frozen are either archived or deleted (deleted is the default)

Configuring and adding indexes

You can configure existing indexes by using Splunk Web, the CLI,
or editing indexes.conf
You can add new indexes by Splunk Web, CLI, or editing
indexes.conf
- Certain parameters are only set in indexes.conf

Adding or editing indexes with Splunk Web

Max bucket size can be set manually
For daily indexing rates higher than 5 GB a day set it to
auto_high_volume
- This will give you 1 GB (32-bit) or 10 GB (64-bit) buckets
Set to auto, it will give you 750 MB buckets for both
Adding an index requires a restart

Set up and edit indexes indexes.conf

Indexes are controlled by indexes.conf
- Global settings like default database appear before the specific
  index stanzas
- Each index has its own stanza with the name of the index in [ ]

defaultDatabase = webfarm

[webfarm]
homePath = h:\splunk_index\db
coldPath = h:\splunk_index\colddb
thawedPath = h:\splunk_index\thawdb

Set up and edit indexes indexes.conf (cont)

Some per-index settings
- Change number of buckets in warm
- Max total data size (in MB)
  If data grows beyond this number, Splunk will automatically move cold
  buckets to frozen
  This setting takes precedence over all other time/retention settings
- frozenTimePeriodInSecs = time in seconds buckets will stay in cold

[webfarm]
homePath = h:\splunk_index\db
coldPath = h:\splunk_index\colddb
thawedPath = h:\splunk_index\thawdb
maxWarmDBCount = 150
maxTotalDataSizeMB = 850000
frozenTimePeriodInSecs = 2598000

Cold to frozen

Frozen is either archive or oblivion; default is deletion
To archive you must define:
- coldToFrozenPath - location where Splunk automatically archives frozen data
- Splunk will strip away the index data and only store the raw data in the frozen
  location
- Frozen can be slow, inexpensive storage: NAS, tape, etc.
- Older versions of Splunk used cold-to-frozen scripts; those are still supported,
  though if you specify both a coldToFrozenPath and a coldToFrozenScript, the
  path setting will take precedence
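A sketch of an archiving configuration, assuming a hypothetical NAS share; with coldToFrozenPath set, buckets aging out of cold are archived (raw data only) instead of deleted:

```ini
# indexes.conf -- hypothetical paths
[webfarm]
frozenTimePeriodInSecs = 2598000
coldToFrozenPath = \\slownas\splunk_archive\webfarm
```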

Editing index settings in Manager

Navigate to Manager >> Indexes
Select the index to view and change the settings

Storing cold in a separate location

Warm and hot live in the same directory
Cold is separate and can be moved to a different location
- Specify the new location for cold in indexes.conf or in Manager

[webfarm]
homePath = h:\splunk_index\db
coldPath = \\filer\splunk_cold\colddb
thawedPath = h:\splunk_index\thawdb
maxWarmDBCount = 150
maxTotalDataSizeMB = 850000
frozenTimePeriodInSecs = 2598000

Storage volumes

You can specify locations and maximum size for index partitions using
volume stanzas
- Handy way to group and control multiple indexes
- Volume size limits apply to all indexes that use the volume

Create volumes in indexes.conf:

[volume:hotNwarm]
path = g:\superRAID
maxVolumeDataSizeMB = 100000

[volume:cold]
path = \\slowNAS\splunk
maxVolumeDataSizeMB = 1000000

Use volumes in index definitions:

[network]
maxWarmDBCount = 150
frozenTimePeriodInSecs = 15778463
homePath = volume:hotNwarm\network
coldPath = volume:cold\network

Be sure to use subdirectories for your indexes to avoid collisions

Moving an entire index

To move an index requires 4 steps
1. Stop Splunk
2. Copy the entire index directory to the new location, being sure to preserve
   permissions and all subdirectories; verify the copy
3. Edit indexes.conf to indicate the new location
4. Restart Splunk
Use cp -rp on UNIX or robocopy on Windows

Backups: What to backup

3 main categories
Indexed event data
- Both the actual log data AND the Splunk index
- $SPLUNK_HOME/var/lib/splunk/
User data
- Things such as event types, saved searches, etc.
- $SPLUNK_HOME/etc/users/
Splunk configurations
- Configuration files updated either by hand or via Manager
- $SPLUNK_HOME/etc/system/local
- $SPLUNK_HOME/etc/apps/

Backups: How

Recommended method
Using the incremental backup tool of your choice, back up:
- Warm and cold sections of your indexes
- User files
- Archive or backup configuration files
- Hot cannot be backed up without stopping Splunk
Recommended methods of backing up hot
- Use the snapshot capability of the underlying file system to take a snapshot of
  hot, then back up the snapshot
- Schedule multiple daily backups of warm (works best for high data volumes)

Rolling hot into warm

Why?
- If your indexing rate is low, and as a result your hot doesn't roll into warm often
  enough, making you worried about losing data in hot between backups
How
- Roll the hot db into warm with a script right before backing up
- Restarting splunkd also forces a roll from hot to warm
- Example roll command for the CLI:
  ./splunk _internal call /data/indexes/<index_name>/roll-hot-buckets
  Be careful about too many forced rolls to warm; too many warm buckets can greatly
  impact search performance

Deleting data: who


The delete command can be used to permanently remove data from Splunk's data store
By default, even the admin role does not have the ability to run this command
- It is not recommended to give this ability to the admin role
- Instead, allow a few users to log in to a role specifically set up for deletions
- Create a user that's part of the can_delete role
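If you prefer a dedicated custom role over the built-in can_delete, a minimal authorize.conf sketch might look like the following (the role name is a made-up example; delete_by_keyword is the capability behind the delete command):

```ini
# Hypothetical custom role for deletions -- rename to suit your environment
[role_log_deleter]
importRoles = user
delete_by_keyword = enabled
```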


Deleting data: how


Log in to Splunk Web as a user of the can_delete role
Craft a search that identifies the data you wish to delete
- Double check that the search ONLY includes the data you wish to delete
- Pay special attention to which index you are using and the time range
- Once you're certain you've targeted only the data you want to delete, pipe the search to delete
- Note that this is a virtual delete. Splunk marks the events as deleted and they will never show in searches again, but they will continue to take up space on disk.
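For example, a deletion search might look like this (the index, sourcetype, and time range here are purely illustrative; adapt them to the data you actually want to remove):

```
index=web sourcetype=access_combined earliest=-30d@d latest=-7d@d
| delete
```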

Cleaning out an index


Splunk clean all will remove users, saved searches and alerts
Other options:
- clean [eventdata|userdata|all] [index name] [-f]
eventdata - indexed events and metadata on each event
userdata - user accounts - requires a Splunk license
all - everything on the server
If no index is specified, the default is to clean all indexes
SO ALWAYS SPECIFY AN INDEX TO AVOID TEARS


Restoring a frozen index


To thaw, move a copy of the bucket directory to an index directory
- ./splunk rebuild <bucket directory> will rebuild the index
- Will also work to recover a corrupted directory
- Does not count against license
- Must shut down splunkd before running the ./splunk rebuild command


Section 6:
Users, Groups, and
Authentication


Section Objectives
Understand user roles in Splunk
Create a custom role
Understand the methods of authentication in Splunk


Manage users and roles


User roles
There are three built-in user roles:
Admin, Power, User
(Can Delete is a special case already covered)
Administrators can configure custom roles
- Name the role
- Specify a default app
- Define the capabilities for the role
- Limit the time ranges the role can use
- Specify both default and accessible indexes

New roles are available via the Splunk Manager Access controls option



Custom user roles set restrictions


Give the role a name and select a default app
Set restrictions
- Search terms restrict searches on certain fields, sources, hosts, etc.
- Time range default is -1 (no restriction). Set time range in seconds


Custom user roles set limits


Set limits (optional)
- Limits are per-person


Custom user roles inherit


Custom roles can be based on standard roles
Administrators can then add or remove capabilities of the imported role


Custom user roles capabilities


Add or remove capabilities
See authorize.conf.spec or
http://www.splunk.com/base/Documentation/latest/Admin/authorizeconf
for details
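As a sketch, a custom role stanza in authorize.conf might combine inheritance, capabilities, and search restrictions like this (the role name, index names, and values are illustrative, not from the course):

```ini
# Hypothetical custom role inheriting from the built-in user role
[role_webteam]
importRoles = user
schedule_search = enabled
# default and allowed indexes for this role
srchIndexesDefault = web
srchIndexesAllowed = web;main
# limit searches to the last day (seconds)
srchTimeWin = 86400
```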


Custom user roles indexes


You can specify which indexes this role is allowed to search as well as which are searched by default


Splunk authentication users


Specify user name, email, and
default app


Splunk authentication users (cont.)


Assign a role and set password


LDAP authentication
Splunk can be configured to work with most LDAP servers, including Active Directory
LDAP can be configured from Splunk Manager
See the docs for details:
www.splunk.com/base/Documentation/latest/Admin/SetUpUserAuthenticationWithLDAP
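Under the hood, Manager writes authentication.conf. A rough sketch of an LDAP setup (the host, DNs, and attribute names are placeholders for your directory, not values from the course):

```ini
[authentication]
authType = LDAP
authSettings = corp_ldap

# Hypothetical LDAP strategy -- replace with your directory's details
[corp_ldap]
host = ldap.example.com
port = 389
bindDN = cn=splunkbind,ou=service,dc=example,dc=com
bindDNpassword = changeme
userBaseDN = ou=people,dc=example,dc=com
userNameAttribute = uid
realNameAttribute = cn
groupBaseDN = ou=groups,dc=example,dc=com
groupNameAttribute = cn
groupMemberAttribute = member
```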


Scripted Authentication
Leverage existing PAM or RADIUS authentication systems for Splunk
For the most up-to-date information on scripted authentication, see the
README file in
$SPLUNK_HOME/share/splunk/authScriptSamples/
There are also sample authentication scripts in that directory


Single Sign On
Authentication is moved to a web proxy which passes along
authentication to Splunk Web
[Diagram: SSO client, proxy server, auth server, Splunk server]
1. SSO client sends Splunk request to proxy
2. Proxy authorizes client against auth server
3. Proxy passes request with user name to Splunk Web
4. Splunk Web returns page to proxy
5. Proxy returns page to client

Lab


Section 7: Forwarding
and Receiving


Section objectives
Understand forwarders
Compare forwarder types
Examine topology examples
Deploy and configure forwarders


Splunk forwarder types


Universal forwarder
- Streamlined data-gathering agent version of Splunk with a separate installer
- Contains only the essential components needed to forward raw or unparsed data to receivers/indexers
- Cannot perform content-based routing
- In most cases, best tool for forwarding data
- Throughput limited to 256kbps

Light forwarder
- Full Splunk in Light forwarder mode (no separate install), otherwise works the same as Universal forwarder

Heavy forwarder
- Full Splunk instance does everything but write data to index
- Breaks data into events before forwarding
- Can handle content-based routing


Comparing forwarders
If you need to...                                            Use
Forward unparsed data to a receiver or indexer               Universal forwarder
Collect data on a forwarder that requires a
python-based scripted input                                  Light forwarder
Route collected data based on event info or filter
data prior to WAN/slower connection                          Heavy forwarder


Forwarder topology: data consolidation


Most common topology
Multiple forwarders send data to a central indexer


Forwarder topology: load balancing


Distributes data across multiple indexers
Forwarder routes data sequentially to different indexers at specified intervals with automatic failover*
* Requires distributed search - covered later in this section


Setting up forwarders big picture


1. Enable receiving on your indexer(s)
2. Install forwarders on production systems
3. Configure forwarders to send to receivers
4. Test connection with small amount of test data
5. Set up inputs on forwarders
6. Verify inputs are being received


Configure forwarding and receiving - Manager


You can set up basic forwarding
and receiving using Manager


Set up receiving port Splunk Web


Specify the TCP port you wish Splunk to listen on and click Save
- NOT the Splunk Web or splunkd ports


Enable Indexer to indexer forwarding/receiving


You can easily forward indexed data from one Splunk server to
another
Useful for replication across sites or forwarding one type of data to a
different indexer


Enable forwarding Splunk Web


Enter either the hostname or IP address with the port of the receiving server
- If multiple hosts are defined, you can optionally select Automatic Load Balancing
Restart required


Install universal forwarder: Windows


The Windows version of the Universal forwarder includes an InstallShield package that guides you through most of the forwarder's configuration
If the installer detects an earlier version of Splunk Forwarder you can:
- Automatically perform a migration during installation
- Fishbucket info is migrated, config files are NOT
- Install UF in a different location to preserve legacy forwarder


Install universal forwarder: Windows (cont.)


If using a deployment server, indicate the hostname or IP and port
- Deployment server is covered in a later module
Indicate the receiving indexer hostname or IP and port
- Must be listening port of indexer
- Skip if using deployment server


Install universal forwarder: Windows (cont.)


Choose to forward from local or remote
- If remote, enter domain, username and password for remote host on next screen


Install universal forwarder: Windows (cont.)


Enable Windows inputs
- Event logs
- Performance monitoring
- AD monitoring
Clicking next begins the installation
- You can update your universal forwarder's configuration post-install by directly editing its inputs.conf and outputs.conf

Install universal forwarder: Windows CLI


Use the CLI installation method when:
You want to install the universal forwarder across your enterprise via a deployment tool
You do not want the universal forwarder to start immediately after installation
- Include LAUNCHSPLUNK=0 in the install command
You want to prepare a system image for cloning that includes a Universal Forwarder


Install universal forwarder: Windows CLI (.cont)


Run as Local System user and request configuration from deploymentserver1
- For new deployments of the forwarder
- msiexec.exe /i splunkuniversalforwarder_x86.msi DEPLOYMENT_SERVER="deploymentserver1:8089" AGREETOLICENSE=Yes /quiet

Run as a domain user but don't launch immediately
- Prepare a sample host for cloning
- msiexec.exe /i splunkuniversalforwarder_x86.msi LOGON_USERNAME="AD\splunk" LOGON_PASSWORD="splunk123" DEPLOYMENT_SERVER="deploymentserver1:8089" LAUNCHSPLUNK=0 AGREETOLICENSE=Yes /quiet


Install universal forwarder: Windows CLI (.cont)


Enable indexing of the Windows security and system event logs - run installer in silent mode
- Collect just the Security and System event logs through a "fire-and-forget" installation
- msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" WINEVENTLOG_SEC_ENABLE=1 WINEVENTLOG_SYS_ENABLE=1 AGREETOLICENSE=Yes /quiet

Migrate from an existing forwarder - run installer in silent mode
- Migrate now and redefine your inputs later
- msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" MIGRATESPLUNK=1 AGREETOLICENSE=Yes /quiet


Install universal forwarder: *nix


Install as you would a full Splunk instance, replacing the package name
- rpm -i splunkuniversalforwarder_package_name.rpm
Start Splunk and accept the license
Configure the following options
- Auto start: splunk enable boot-start
- Deployment server: splunk set deploy-poll <host:port>
- Client without deployment server: splunk enable deploy-client
- Forward to an indexer: splunk add forward-server <host:port>
Configure inputs via inputs.conf
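A minimal inputs.conf sketch for a *nix universal forwarder (the paths, sourcetype, and index are illustrative, not from the course):

```ini
# Monitor a syslog file; explicit sourcetype is an example assumption
[monitor:///var/log/messages]
sourcetype = syslog

# Monitor a hypothetical web server log directory into a custom index
[monitor:///var/log/httpd]
index = web
```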



Migrate to universal forwarder: *nix


You can migrate checkpoint data from an existing *nix light forwarder (version 4.0 or later) to the universal forwarder
- Important: Migration can only occur the first time you start the universal forwarder, post-installation. You cannot migrate at any later point

1. Stop all services on the host
2. Install the universal forwarder - do not start
3. In the installation directory, create a file $SPLUNK_HOME/old_splunk.seed that contains a single line with the path of the old forwarder's $SPLUNK_HOME directory
4. Start the universal forwarder
5. Edit / add configurations

Migration process only copies checkpoint files - you should manually copy over the old forwarder's inputs.conf

Forwarding configurations
inputs.conf on the forwarder gathers the local logs/system info needed
- You can include input phase settings in props.conf on light forwarders
- Per-event processing must be done on the indexer
outputs.conf points the forwarder to the correct receiver(s)
- If you set up forwarding in Splunk Manager, it will reside in the app context you were in when you enabled it
- If creating by hand, best practice is to place it in $SPLUNK_HOME/etc/system/local


Outputs.conf basic example


Main [tcpout] stanza has global settings
[tcpout:web_indexers] stanza sets up receiving server
- Compression is turned on
- Server setting refers to either the IP or host name plus port of receiver

[tcpout]
# Global settings
defaultGroup=web_indexers
disabled=false

[tcpout:web_indexers]
# Receiving server
server=splunk1.company.com:9997
compressed=true

[tcpout-server://splunk1.company.com:9997]


Outputs.conf indexer to indexer clone


Main [tcpout] stanza has global settings such as whether to index a local copy
[tcpout:uk_clone] stanza sets up receiving server
- Compression is turned on
- Server setting refers to either the IP or host name plus port of receiver

[tcpout]
# Global settings
indexAndForward=true

[tcpout:uk_clone]
# Receiving server
compressed=true
server=uk_splunk.company.com:9997


Outputs.conf single indexer and SSL


Each forwarder would have a copy of outputs.conf with the following stanza
- Additionally the forwarders would be sending using SSL, using Splunk's self-signed certificates
[tcpout:indexer]
server=splunk.company.com:9997
sslPassword=ssl_for_m3
sslCertPath=$SPLUNK_HOME/etc/auth/server.pem
sslRootCAPath=$SPLUNK_HOME/etc/auth/cacert.pem


Outputs.conf clone indexers


Set multiple target groups to get forwarders to send exact copies to multiple indexers
[tcpout:indexer1]
server=splunk1.mycompany.com:9997
[tcpout:indexer2]
server=splunk2.mycompany.com:9997


Auto load balancing


Splunk also offers automatic load balancing, which switches from server to server in a list based on a time interval
Two options:
- static list in outputs.conf (see below)
- DNS list based on a series of A records for a single host name

[tcpout:list_LB]
autoLB=true
server=splunk1.company.com:9997,splunk2.company.com:9997


Auto load balancing DNS list


To set up DNS list load balancing, create multiple A records with the same name with the IP address of each indexer

From DNS zone file:
splunk1  A 10.20.30.40
splunk2  A 10.20.30.41
splunk1b A 10.20.30.40
splunk1b A 10.20.30.41

[tcpout:DNS_LB]
autoLB=true
server=splunk1b.mycompany.com:9997
autoLBFrequency=60

Caching/queue size in outputs.conf


maxQueueSize = 1000 (default) is the number of events the forwarder will queue if the target group cannot be reached
In load-balanced situations, if the forwarder can't reach one of the indexers, it will automatically switch to another, and will only queue if all are down/unreachable
See outputs.conf.spec for details and even more queue settings
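Putting queueing together with load balancing, an outputs.conf sketch (the group name, hosts, and size are illustrative assumptions):

```ini
# Hypothetical load-balanced target group with a larger queue
[tcpout:primary_indexers]
server = splunk1.company.com:9997,splunk2.company.com:9997
autoLB = true
# queue up to 5000 events if all indexers are unreachable
maxQueueSize = 5000
```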


Indexer Acknowledgement
Guards against loss of data when forwarding to an indexer
- Forwarder will re-send any data not acknowledged as "received" by the indexer

Disabled by default
Requires version 4.2 of both forwarder and receiver
Can also be used for forwarders sending to an intermediate forwarder


Indexer Acknowledgement process


As the forwarder sends data, it maintains a copy of each 64k block in memory in the wait queue until it gets an acknowledgment from the indexer
- While waiting, it continues to send more data blocks
The indexer receives a block of data, then parses and writes to disk
Once on disk, the indexer sends acknowledgment to the forwarder
Upon acknowledgment, the forwarder releases the block from memory
- If the wait queue is of sufficient size, it doesn't fill up while waiting for acknowledgments to arrive
- Wait queue size can be increased (covered in a later slide)

What happens when no ack is received?


If the forwarder doesn't get acknowledgment for a block within 300 seconds (by default), it closes the connection
- Change wait time by setting readTimeout in outputs.conf
- If auto load balancing is enabled, it opens a connection to the next indexer in the group and sends the data
- If auto load balancing is not enabled, it tries to open a connection to the same indexer as before and resend the data
Data block is kept in the wait queue until acknowledgment is received
- Once the wait queue fills, the forwarder stops sending until it receives acknowledgment for one of the blocks, at which point it can free up space in the queue

Handling duplicates
If there's a network problem that prevents an acknowledgment from reaching the forwarder, dupes may occur
- Example: indexer receives a data block, then generates the acknowledgment - network goes down before the forwarder gets the ack
- When the network comes back up, the forwarder resends the data block - the indexer parses and writes it again
A forwarder will record events to splunkd.log when it receives duplicate acks or resends due to no response


Enabling Indexer Acknowledgement


Enabled on the forwarder
- Both forwarder and indexer must be at version 4.2 or greater
Set useACK to true in outputs.conf
- Disabled by default

[tcpout:<target_group>]
server=<server1>,<server2>,...
useACK=true

You can set useACK either globally or by target group, at the [tcpout] or [tcpout:<target_group>] stanza levels
You cannot set it for individual servers at the [tcpout-server:...] stanza level

Increasing wait queue size


Max wait queue size is 3x the size of the in-memory output queue, which you set with the maxQueueSize attribute in outputs.conf
maxQueueSize=[<integer>|<integer>[KB|MB|GB]]
Wait queue and the output queues are configured by the same attribute but are separate queues
- Example: if you set maxQueueSize to 2MB, the maximum wait queue size will be 6MB
Specifying a lone integer - maxQueueSize=100 - sets max events for parsed data and max blocks (~64K) for unparsed data
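Combining acknowledgment with a larger queue, a sketch (the group name and values are illustrative); with maxQueueSize=2MB the wait queue would be allowed to grow to 6MB:

```ini
# Hypothetical target group with acknowledgment and a 2MB output queue
[tcpout:acked_indexers]
server = splunk1.company.com:9997
useACK = true
maxQueueSize = 2MB
```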


Forwarding to an intermediate forwarder


Two main possibilities to consider:
Originating forwarder and intermediate forwarder both have acknowledgment enabled
- Intermediate forwarder waits until it receives acknowledgment from the indexer and then sends acknowledgment back to the originating forwarder
Originating forwarder has acknowledgment enabled - intermediate forwarder does not
- Intermediate forwarder sends acknowledgment back to the originating forwarder as soon as it sends the data on to the indexer
- Because it doesn't have useACK enabled, the intermediate forwarder cannot verify delivery of the data to the indexer

Lab


Section 8: Distributed
Environments


Objectives
List Splunk server types
Understand Distributed search
Describe search head pooling
Understand Deployment server


Types of Splunk server


[Diagram: universal and heavy forwarders feed an indexer; a search head searches across indexers]
- Indexer: gathers data from inputs and forwarders, processes it and writes it to disk
- Universal forwarder: separate install; gathers data and forwards to indexer
- Search head: accessed by users; runs ad-hoc and scheduled searches/alerts; distributes searches out to all peers and combines results
- Heavy forwarder: gathers or receives data, processes it and then forwards on to indexer

Data lifecycle review


Four main phases in the data lifecycle
- Input: Splunk forwarder or full Splunk
- Parsing: Splunk heavy forwarder or indexer
- Indexing: Indexer
- Search: Search head

[Diagram: forwarders -> indexer -> search head]
- Forwarders collect raw data and send to indexer
- Parsing: line breaks, timestamps, index-time field extractions; save to disk and index
- Search: pull events from index, search-time field extractions, display events, reports, etc.


Distributed Environments Overview


The next three sections will introduce you to common topologies and tools used in distributed environments
Distributed Search
- Search across multiple indexes

Search Head Pooling


- Multiple search heads share configuration data

Deployment Server
- Manage multiple, varying Splunk instance configurations from a single server


Distributed Search


Distributed search overview


Search heads send search requests to multiple indexers and merge the results back to the user
In a typical scenario, one Splunk server searches indexes on several other servers
Used for
- Horizontal scaling across multiple indexers - used for high volume data scenarios
- Accessing geo-diverse indexers
- Access control
- High availability scenarios


Distributed search topology examples


Simple distributed search for horizontal scaling - one search head searching across three peers


Distributed search topology examples (cont.)


Access control example - department search head has access to all the indexing search peers
- Each search peer also has the ability to search its own data
- Department A search peer has access to both its data and the data of department B


Distributed search topology examples (cont.)


Load balancing example - provides high availability access to data


Distributed Search setup - Manager


Turn on Distributed search and optionally turn on auto-discovery
- Allows this Splunk server to automatically add other search peers it discovers on the network


Distributed Search Add Peers - Manager


Add individual peers manually
- Include authentication


Search Head Pooling


Search head pooling overview


Multiple search heads can share configuration data
Allows horizontal scaling for users searching across the same data
Also reduces the impact if a search head becomes unavailable
Shared resources are:
- .conf files
- Search artifacts - saved searches and other knowledge objects
- Scheduler state - only one search head in the pool runs a particular scheduled search
Makes all files in $SPLUNK_HOME/etc/{apps,users} available for sharing - .conf files, .meta files, view files, search scripts, lookup tables, etc.
All search heads in a pool should be running the same version of Splunk

Topology example with load balancer


[Diagram: users log in through a Layer 7 load balancer to a pool of search heads sharing NFS or other similar storage technology]


Topology example without load balancer


[Diagram: users log in directly to individual search heads in the pool, which share NFS storage]


Create a pool of search heads


Set up each search head individually in the same manner as configuring distributed search
1. Set up shared storage that each search head can access
- For *nix, use NFS mount
- For windows, use CIFS (SMB) share
- The Splunk user account needs read/write access to shared storage

2. Stop splunkd on all search heads in pool


Enable each search head


3. Use the pooling enable CLI command to enable pooling on a search head:
splunk pooling enable <path_to_shared_storage> [--debug]
- On NFS, <path_to_shared_storage> is the NFS mountpoint
- On Windows, <path_to_shared_storage> is the UNC path of the CIFS/SMB share
- Execute this command on each search head in the pool. The command:
  Sets values in the [pooling] stanza of the server.conf file in $SPLUNK_HOME/etc/system/local
  Creates user and app subdirectories
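The values written to server.conf look roughly like this (the storage path is an example):

```ini
# Written by `splunk pooling enable` in $SPLUNK_HOME/etc/system/local/server.conf
[pooling]
state = enabled
storage = /tmp/nfs
```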


Copy user and app directories to share


4. Copy the contents of the $SPLUNK_HOME/etc/apps and $SPLUNK_HOME/etc/users directories on existing search heads into the empty apps and users directories on the shared storage
- For example, if your NFS mount is at /tmp/nfs, copy the apps subdirectories into /tmp/nfs/apps
- Similarly, copy the user subdirectories: $SPLUNK_HOME/etc/users/ into /tmp/nfs/users
5. Restart each search head in the pool


Using a load balancer


Allows users to access the pool of search heads through a single
interface, without needing to specify a particular one
Ensures access to search artifacts and results if one of the search
heads goes down
When configuring the load balancer:
- The load balancer must employ layer-7 (application-level) processing
- Configure the load balancer so user sessions are "sticky" or "persistent" to ensure that a user remains on a single search head throughout a session


Search head management commands


splunk pooling validate
- Revalidate the search head's access to shared resources
splunk pooling disable
- Disables pooling for a given search head
splunk pooling display
- Displays / verifies current status of search head

$ splunk pooling enable /tmp/nfs
$ splunk pooling display
Search head pooling is enabled with shared storage at: /tmp/nfs
$ splunk pooling disable
$ splunk pooling display
Search head pooling is disabled


Configuration changes
Once pooling is enabled on a search head, you must notify the search head if you directly edit a .conf file
If you add a stanza to any config file in a local directory, you must run the following command:
splunk btool fix-dangling
Not necessary if you make changes via Splunk Web Manager or CLI


Deployment Server


Deployment server overview


The deployment server pushes out configurations and content packaged in deployment apps to distributed clients
Allows you to manage multiple Splunk instances from a single Splunk server
- Small environments - deployment server can also be a deployment client
- Greater than 30 deployment clients - deployment server should be its own instance


Deployment Terminology
Deployment server
- A Splunk instance that acts as a centralized configuration manager
- Supplies configurations to any number of Splunk instances
- Any Splunk instance can act as a deployment server

Deployment client
- Splunk instances that are remotely configured
- A Splunk instance can be both a deployment server and client at the same time

Server class
- A logical grouping of deployment clients based on need for the same configs

Deployment app
- Set of deployment content (including configuration files) deployed as a unit to clients of a server class

Deployment server uses


Distribute Apps and/or configurations
- Windows file servers
  Splunk for Windows App
  Collect event logs and WMI
- Database group
  Uptime, system health, access errors
- Web Hosting Group
  Analytics, business intelligence


Server Classes examples


Windows
- Windows Server 2003
- IIS

Database
- Solaris servers (sunos-sun4u)
- Oracle

Web hosting group


- Apache on Linux

Could also group clients by OS, Hardware type, location, etc.



Deployment server example


[Diagram: deployment server manages two server classes]
- www-forwarder server class: www1-forwarder, www2-forwarder, www3-forwarder
- db-logging-forwarder server class: db1-forwarder, db2-forwarder


Deployment server configuration overview


1. Designate a Splunk instance as deployment server
2. Create serverclass.conf on the deployment server at
$SPLUNK_HOME/etc/system/local
3. Create deployment apps on the deployment server and put the
content to be deployed into directories
4. Create deploymentclient.conf on the Deployment clients
5. Restart the deployment clients


Deployment serverclass.conf (cont.)


Server classes group clients that need the same configuration
If filters match the apps and configuration, content is deployed to the client
Stanzas in serverclass.conf go from general to more specific
All configuration information is evaluated from top to bottom in the configuration file, so order matters

[global]
# Applies to all server classes
# repositoryLocation: where apps are stored on the deployment server
# targetRepositoryLocation: where apps will be delivered on the client
repositoryLocation=$SPLUNK_HOME/etc/deploymentApps
targetRepositoryLocation=$SPLUNK_HOME/etc/apps

# Server-class specific settings
[serverClass:AppsByMachineType]
[serverClass:AppsByMachineType:app:win_eventlog]


Server classes example serverclass.conf


www-forwarder server class:

[serverClass:www-forwarder]
filterType=blacklist
blacklist.0=*
# Server class only applies to clients in the 10.1.1* IP range
whitelist.0=*.10.1.1*

# Deploy this app to clients that match
[serverClass:www-forwarder:app:webfarm-forwarders]
stateOnClient=enabled

db-logging-forwarder server class:

[serverClass:db-logging-forwarder]
filterType=blacklist
blacklist.0=*
# Server class only applies to clients in the 192.2* IP range
whitelist.0=*.192.2*

# Deploy this app to clients that match
[serverClass:db-logging-forwarder:app:db-forwarder]
stateOnClient=enabled

serverclass.conf group by machine type


You can create server classes that apply to specific machine types or OSs

# Deploy this app only to Windows machines
[serverClass:AppsByMachineType:app:SplunkDesktop]
machineTypes=Windows-Intel

# Deploy this app only to Linux 32 or 64 bit machines
[serverClass:AppsByMachineType:app:unix]
machineTypes=linux-i686, linux-x86_64


serverclass.conf client handling options


Optionally configure actions to take on the client after an app is deployed

restartSplunkWeb=<True or False>            # Defaults to false
restartSplunkd=<True or False>              # Defaults to true
stateOnClient=<enabled, disabled, noop>     # Enable or disable apps on the client after installation or change
change
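Put together, these options might appear in a serverclass.conf app stanza like the following sketch (the server class and app names reuse the earlier hypothetical example):

```ini
# serverclass.conf on the deployment server (names are illustrative)
[serverClass:wwwforwarder:app:webfarmforwarders]
stateOnClient = enabled
restartSplunkd = true
restartSplunkWeb = false
```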


Set up a Deployment Client


Install Splunk on the client machine
Run the following command
./splunk set deploy-poll <ip address/hostname of deployment server>:8089 -auth admin:changeme

This will create a file named deploymentclient.conf


[deployment-client]
disabled=false

[target-broker:deploymentServer]
targetUri=225.225.225.1:8089

URI of deployment server

Verify deployment server clients


From the deployment server, you can verify deployment clients from CLI with
the following command:
./splunk list deploy-clients

Command output

Deployment client: ip=192.168.2.4, dns=192.168.2.4, hostname=mycompanyPC64, mgmt=8089, build=64889, name=deploymentClient, id=connection_192.168.2.4_8089_192.168.2.4_deploymentClient, utsname=windows-unknown


Deployment actions
Default poll period is 30 seconds
- Specified in serverclass.conf
The deployment server instructs the client what it should retrieve
The deployment client then retrieves the new content
[Diagram: the client polls the deployment server, the server sends instructions, and the client gets the content]

Force-notify clients of changes


If you make changes to a deployment app on the deployment server,
you may want to immediately notify the clients of the change
Run ./splunk reload deploy-server to notify all clients
Run ./splunk reload deploy-server -class <class name> to notify a specific class


Section 9: Licensing


Section Objectives
Identify license types
Understand license violations
Define license groups
Define license pooling and stacking
Add and remove licenses


Splunk license types


Enterprise license
- Purchased from Splunk
- Allows for full functionality
- License limits indexing volume

Enterprise trial license downloads with product


- 500MB per day limit
- Otherwise same as enterprise, except that it expires 60 days after install


Splunk license types (cont.)


Forwarder license
- Applied to non-indexing forwarders, and deployment servers
- Allows authentication, but no indexing

Free license
- Activates automatically when 60 day trial enterprise license expires
- Can be activated before 60 days by using Manager
- Doesn't allow authentication, forwarding to non-Splunk servers, or alerts
- Does allow 500MB/day of indexing and forwarding to other Splunk instances


License warnings and violations


5th warning in a rolling 30 day period causes violation and search to be
disabled
- 3rd warning in Free version
- You must be violation-free for 30 consecutive days for the warning count to reset

Indexing will continue, only search is locked out


- Note that you can still search Splunk's internal indexes

Contact Splunk Support to unlock your license


License groups
License types are organized into groups
- Enterprise Group

Includes Enterprise, Enterprise Trial, and sales trial

- Free Group
- Forwarder Group

Licenses are stored in directories at


$SPLUNK_HOME/etc/licenses
- Each group is stored in a separate folder under that directory


License stacking and pooling overview


Licenses in the Enterprise group can be aggregated together, or
stacked
- Available license volume is the sum of the volumes of the individual licenses

Enterprise trial license that comes with the Splunk download cannot be
stacked
Free license cannot be stacked
Pools can be created for a given stack
- Specify Splunk indexing instances as members of a pool for the purpose of

volume usage and tracking


- Allows for insulation of license usage by group of indexers or data type
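On the license master, pools are defined in [lmpool] stanzas in server.conf. A hedged sketch (the pool name, quota value, and slave GUID placeholders are examples):

```ini
# server.conf on the license master
[lmpool:web_indexers]
description = Pool for the web-tier indexers
quota = 107374182400     # 100GB, expressed in bytes
slaves = <guid-of-indexer-1>,<guid-of-indexer-2>
stack_id = enterprise
```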

Topology example: single pool


Master has a stack of two licenses for a total of 500GB
All indexers in the pool share 500GB entitlement collectively
This should be the most common scenario
Enterprise Stack 500 GB Total Entitlement
Default License Pool
500 GB Shared Entitlement

[Diagram: 300GB License + 200GB License]


Topology example: multiple pools


Master has a stack of two licenses, totaling 500GB
Each pool has a specific entitlement amount
Enterprise Stack 500 GB Total Entitlement
[Diagram: 300GB License + 200GB License; Default Pool: 100GB local Entitlement, Pool 2: 100GB Entitlement, Pool 3: 200GB Entitlement, Pool 4: 100GB Entitlement]

Managing licenses: overview


You can manage license stacks
and pools via Manager
- Switch from master to slave
- Change license group
- View license alerts
- Add licenses and manage stacks
- Add and manage pools


Managing licenses: master/slave


By default, Splunk instances
are master license servers
Change an instance to slave
by entering the master license
server URI
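Behind the Manager screen, the same change can be made in server.conf on the slave instance; a minimal sketch (the master's host name is a placeholder):

```ini
# server.conf on a license slave
[license]
master_uri = https://<master-license-server>:8089
```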


Change license group


Each master can only
manage a single license
group
Select Enterprise,
Forwarder, or Free
- Forwarder and Free cannot be

stacked or used in Pools


- Enterprise is default


Adding a license
Any 4.x license can be added
- 4.2 licenses can be uploaded, or

XML can be copy/pasted


- 4.0 and 4.1 licenses must be
uploaded


License stacks

[Diagram: a 4.2 Enterprise license and a 4.1 Enterprise license combined into an Enterprise Stack]

License pools
For each stack, you can create
one or more additional license
pools
- Define a maximum volume for the

pool
- Select indexers for the pool


Viewing pool volume

[Screenshot: Default pool and Added pool]


Viewing alerts


Viewing license info: master


For each license installed on the
master, you can view specific
license info
- Exp. Date/time
- Features allowed
- Max violations
- Quota
- Stack name and type
- Status
- Violation window period

Viewing license info: slave


Displays local indexer name, master license server URI, last
successful connection
Messages link displays license alerts


Lab


Section 10: Security


Section objectives
Learn what you can secure in Splunk
Understand SSL and Splunk
Learn about user group and index security
Learn what is recorded in the audit log
Describe how to secure the audit log
Understand archive data signing


What you can secure in Splunk


SSL
- splunkd to Splunk Web
- Splunk Web to client
- forwarder to indexer

Audit
- user actions
- file system

Data Signing
- cold to frozen archive data
- audit data in Splunk

SSL
Already enabled between splunkd and Splunk Web
Can be enabled via Splunk Web > Manager or by editing web.conf
- Splunk will automatically generate self-signed certificates
- You can pay for certificates to avoid browser complaints

Forwarder to indexer communication can be secured


- Enabled in outputs.conf
- Adds to forwarder processor overhead

Can force Splunk to only use SSLv3 if required
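As a hedged sketch of forwarder-to-indexer SSL (the host name, port, and certificate paths below are examples, not defaults):

```ini
# outputs.conf on the forwarder
[tcpout:ssl_indexers]
server = indexer.example.com:9997
sslCertPath = $SPLUNK_HOME/etc/auth/server.pem
sslPassword = password
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
sslVerifyServerCert = true

# inputs.conf on the indexer
[splunktcp-ssl:9997]

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/server.pem
password = password
rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
```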


Data / Index Security


Securing sensitive data within Splunk is best achieved by segregating
the data by index
Index access is governed by user groups
Index level security is the best method to ensure users have access to
the data they need, while preventing them from seeing sensitive data
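For example, a role limited to a single index might be sketched in authorize.conf like this (the role and index names are hypothetical):

```ini
# authorize.conf
[role_webteam]
importRoles = user
srchIndexesAllowed = web
srchIndexesDefault = web
```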


Auditing
Splunk automatically creates an audit trail of Splunk user actions
- Stored in the _audit index
- Accessible only by administrators by default
- Useful for monitoring for prying eyes

Splunk also audits file systems (FS change monitor)


- Use it on /etc/passwd or on Splunk's own config files
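A hedged inputs.conf sketch of the FS change monitor (the paths and poll periods are examples from a default *nix install):

```ini
# inputs.conf
[fschange:/etc/passwd]
pollPeriod = 60
fullEvent = true

[fschange:/opt/splunk/etc]
recurse = true
pollPeriod = 600
```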


Signing audit data


Splunk has the ability to number and sign audit trail data
- Detects gaps
- Detects tampering
- Creates fields called validity and gap in the audit log
- Does not work in distributed environments
- See the Knowledge Base for details on setting this up

http://www.splunk.com/base/Documentation/latest/Admin/Signauditevents


Signing archive data


You can sign archive data when it moves from cold to frozen
You must specify a custom archiving script
- You cannot use it if you choose to have Splunk perform the archiving

automatically

Add signing to your script using signtool -s <archive_path>


Splunk verifies archived data signatures automatically when the
archive is restored
- Verify signatures manually by using signtool -v <archive_path>


Splunk Product Security Resources


The Splunk Product Security Portal provides a single location for:
- Splunk Product Security Announcements
- Splunk Product Security Policy
- Splunk Product Security Best Practices
- Reporting Splunk Product Security Vulnerabilities

This site is updated regularly with any security-related updates or


announcements
http://www.splunk.com/page/securityportal
splunk.com > Support > Security

Section 11:
Jobs, Knowledge Objects, and
Alerts


Section objectives
Understand jobs
Manage jobs
Understand alerts, and alert settings
Understand PDF server and alerts
Understand what knowledge objects are and how to set their
permissions


What are jobs?


Jobs are searches that users or the system runs
- A job is created when

You hit return in the search box


You load a dashboard with embedded saved searches
An alert is triggered or saved search runs

- Jobs create artifacts when they run

What are artifacts?


- Traces of jobs (such as search results) that are created on disk
- Persistence to disk allows users to recreate or resurrect jobs


Managing jobs: Splunk Web


Users can manage their own jobs
Administrators can manage all users' jobs
Click on Jobs in Splunk Web to manage, rerun, and resurrect jobs


Manage jobs: OS level (*nix only)


Search jobs run as processes at the OS level
View search jobs running
Included in the process description will be key information
- the actual search running
- who ran the search
- their role
- the search ID
ps -ef | grep "splunkd search"

502 3179 1662 0 ?? 0:00.26 splunkd search --id=rt_1297105108.42 --maxbuckets=0 --ttl=600 --maxout=10000 --maxtime=0 --lookups=1 --reduce_freq=10 --user=admin --pro --roles=admin:power:user

Manage jobs: OS level (continued)


There will be 2 jobs for each process
- 2nd job is the helper; it will die if you kill the 1st job

Running jobs will be writing data to


$SPLUNK_HOME/var/run/splunk/dispatch/<job_id>
- Saved searches will append the name of the saved search to the job_id
directory
- This directory exists for the TTL of the job
- You may need to delete artifact directories for jobs you kill by hand
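A hedged shell sketch for finding such leftovers; it only lists candidate directories so they can be reviewed before deletion (the dispatch path in the usage comment assumes a default *nix install):

```shell
# List (do not delete) dispatch artifact directories that have not been
# modified for more than the given number of minutes.
list_stale_dispatch() {
    dispatch_dir="$1"
    ttl_minutes="$2"
    find "$dispatch_dir" -mindepth 1 -maxdepth 1 -type d -mmin "+$ttl_minutes"
}

# Usage (path assumes a default install):
# list_stale_dispatch /opt/splunk/var/run/splunk/dispatch 600
```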


Alerts Review
Alerts are saved searches
that run on a schedule and
do something based on the
data that is returned
Alerts can send an email,
trigger a shell script, or
create an RSS feed


Email alert configuration


In the Email Subject field, $name$ is
replaced by the saved search name
You must first configure email alert
settings in Manager


PDF report server


Splunk offers the ability to print and email reports in PDF format
You must install the PDF print server add-on on a Linux-based
Splunk instance
- The Splunk instance doesn't have to be an indexer, but cannot be a light forwarder

See www.splunk.com/base/Documentation/latest/Installation/ConfigurePDFprintingforSplunkWeb for details

Scripted alerts
You can have an alert that activates a script
Scripts must be located in $SPLUNK_HOME/bin/scripts
Scripts can be in any language the underlying operating
system can run
Splunk passes a number of variables to the script
For details on variables etc., see the docs:
http://www.splunk.com/base/Documentation/latest/admin/ConfigureScriptedAlerts
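A minimal hypothetical alert script, assuming the positional arguments Splunk 4.x passes to scripted alerts (the log file path is arbitrary):

```shell
#!/bin/sh
# Hypothetical alert handler; place it in $SPLUNK_HOME/bin/scripts.
# Splunk passes positional arguments to the script, including:
#   $1 - number of events returned
#   $4 - name of the saved search
#   $8 - path to the file containing the raw results
LOG=/tmp/splunk_alert.log
echo "$(date) search='${4:-unknown}' events=${1:-0} results='${8:-none}'" >> "$LOG"
```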

Knowledge Objects
Knowledge objects are user-created things such as
- Eventtypes
- Saved Searches
- Field Extractions using IFX (Interactive Field Extractor)
- Tags

Knowledge objects initially are only available to the user who


created them
Permissions must be granted to allow other users/apps to use
them

Knowledge object permissions


Users only need read
permissions to use knowledge
objects
Use app context to segregate
app-specific knowledge objects


Section 12:
Troubleshooting

Section objectives
Learn how to set specific log levels using Manager
Learn basic troubleshooting steps to solve/identify common
issues
Learn how to get community help with Splunk
Understand how to contact Splunk Support


Splunk's log levels


Log levels from lowest to highest: crit, fatal, error, warn, info, debug
By default all subsystems are set to info or warn
All of Splunks logs can be set to debug by restarting Splunk in debug
mode
- Generally not recommended since it's burdensome on production systems and

creates lots of unwanted noise in the logs


- Better to set to debug granularly on the individual subsystem(s) you are
troubleshooting (see next slide)
- Splunk Support may ask for overall debug mode in certain cases


Set granular log levels


You can granularly adjust
subsystem log levels to
debug to troubleshoot
specific issues using
Manager
Can also set them using
log.cfg in
$SPLUNK_HOME/etc
(useful for light forwarders)
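For example, a couple of categories bumped to DEBUG in log.cfg might look like this (the category names shown are examples; the full list for your version appears in Manager):

```ini
# $SPLUNK_HOME/etc/log.cfg
category.TailingProcessor=DEBUG
category.DeploymentClient=DEBUG
```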

Troubleshooting: check your search


Many times input or forwarder problems are actually
misdiagnosed search problems
Before starting to troubleshoot a missing input or forwarder that
is not forwarding, double check your search
- Sometimes inputs wind up in unexpected indexes so try adding index=*

when searching for a missing input/forwarder


- Sometimes time stamps are extracted wrong on new inputs, try searching
All Time to help diagnose this
- Generally, use wildcards in other parts of your search to cast the widest net
for missing data
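Combining these tips, a deliberately wide search for a missing input might look like this (the host pattern is an example):

```
index=* host=web* earliest=0
```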

Deployment monitor
The Deployment Monitor is a collection of dashboards and drilldown
pages with information to help monitor the health of a system
- Index throughput over time
- Number of forwarders connecting to the indexer over time
- Indexer and forwarder abnormalities
- Details for individual forwarders and indexers, such as status and forwarding

volume over time


- Source types being indexed by the system
- License usage


Main index throughput and forwarders


Main indexer and forwarder warnings


Main sourcetype warnings


Viewing warning info


Click the arrow icon to view warning information


Configuring alerts
Click configure alerting to modify
the underlying saved
search/alert


Indexers: All Indexers


Number of current active
searches
MB indexed today
- Can select alternate time range

Table report of indexer(s) status,


last connection, and total GB
indexed in last 30 minutes
- Can select alternate time range


Indexer Properties
Data specific to a given indexer
- Drill-down from All Indexers view
- Can drill-down on any chart item to

show underlying events


All Sourcetypes
Shows MB Received by
sourcetype
Table display shows each
sourcetype, current status, last
received, and total MB received
Drill down on any item for
underlying events


Sourcetype info
Drill-down from All
sourcetypes shows info
for single sourcetype


License Usage
Cumulative MB per day by
Sourcetype
MB Received
- By sourcetype, source, host, forwarder,

indexer, license pool


- Drill-down shows underlying events in
Search view

Usage statistics
- By sourcetype, source, host, forwarder,

indexer, license pool


- Shows last received and total MB
received

Backfill data
Use backfill Summary Indexes to add two weeks' worth of data to the summary indexes (useful for a new Deployment Monitor installation on an existing Splunk instance)
Use Flush and Backfill to erase old data and re-populate


Community based support


Splunk docs are constantly being updated and improved, so be sure to
select your version of Splunk to make sure the doc you are reading
applies to your version
http://www.splunk.com/base/Documentation
Splunk Answers: post specific questions and get them answered by
Splunk experts (also makes for great and informative reading)
http://answers.splunk.com
IRC Channel: Splunk maintains a channel #splunk on the EFNet IRC
server. Support engineers and many well-informed Splunk users hang
out there

Splunk Support
Contact Splunk Support email: support@splunk.com
File a case online
http://www.splunk.com/index.php/submit_issue
24/7 phone depending on support contract


Thanks! Please take our survey.
