
Splunk Admin Manual

Version: 3.4.6
Generated: 3/19/2010 06:37 am
Copyright Splunk, Inc. All Rights Reserved
Table of Contents
About the Splunk Admin Manual........................................................................................................1
What's in the Admin Manual?......................................................................................................1
How Splunk Works..............................................................................................................................2
Overview of Splunk....................................................................................................................2
Getting Started.....................................................................................................................................9
Start Splunk................................................................................................................................9
Administration basics...............................................................................................................10
Change defaults.......................................................................................................................13
Find and index data..................................................................................................................17
Add more users........................................................................................................................18
Start searching.........................................................................................................................20
Data Inputs.........................................................................................................................................21
How input configuration works.................................................................................................21
Files and directories.................................................................................................................25
Network ports...........................................................................................................................33
Encrypted Inputs......................................................................................................................37
FIFO inputs..............................................................................................................................38
Scripted inputs.........................................................................................................................41
Whitelist and blacklist rules......................................................................................................43
Crawl........................................................................................................................................45
Windows inputs........................................................................................................................47
Windows Management Instrumentation (WMI) input...............................................................49
Windows registry input.............................................................................................................52
Windows process monitoring...................................................................................................55
Data Distribution................................................................................................................................57
How data distribution works.....................................................................................................57
Enable forwarding and receiving..............................................................................................61
Configure target groups in outputs.conf...................................................................................64
Set up routing...........................................................................................................................68
Route specific events to different queues................................................................................70
Route specific events to an alternate index.............................................................................73
Set up SSL for forwarding and receiving..................................................................................75
Enable cloning..........................................................................................................................77
Set up data balancing..............................................................................................................77
Route data to third-party systems............................................................................................79
Indexing..............................................................................................................................................81
How indexing works.................................................................................................................81
Index multi-line events.............................................................................................................83
Configure segmentation...........................................................................................................85
Configure custom segmentation for a host, source, or source type.........................................87
Mask sensitive data in an event...............................................................................................88
Configure character set encoding............................................................................................90
Dynamic metadata assignment................................................................................................91
Timestamps........................................................................................................................................93
How Splunk extracts timestamps.............................................................................................93
Configure timestamp recognition..............................................................................................94
Apply timezone offsets.............................................................................................................97
Recognize European date format............................................................................................99
Configure positional timestamp extraction.............................................................................100
Tune timestamp extraction for better indexing performance..................................................101
Train Splunk to recognize a timestamp..................................................................................102
Fields.................................................................................................................................................107
How fields work......................................................................................................................107
Create fields via Splunk Web.................................................................................................108
Create fields via configuration files.........................................................................................109
Create indexed fields via configuration files...........................................................................113
Field actions...........................................................................................................................116
Configure fields.conf..............................................................................................................117
Configure multi-value fields....................................................................................................118
Configure tags........................................................................................................................119
Automatic header-based field extraction................................................................................122
Hosts.................................................................................................................................................127
How host works......................................................................................................................127
Set default host for a Splunk server.......................................................................................128
Define host assignment for an input.......................................................................................128
Tag hosts...............................................................................................................................130
Extract host per event............................................................................................................131
Source Types...................................................................................................................................134
How source types work..........................................................................................................134
Rule-based association of source types.................................................................................136
Set source type for an input...................................................................................................137
Set source type for a source..................................................................................................138
Train Splunk to recognize a source type................................................................................139
Source type settings in props.conf.........................................................................................139
Configure a source type alias.................................................................................................140
Event Types......................................................................................................................................142
How event types work............................................................................................................142
Save event types via Splunk Web..........................................................................................144
Configure eventtypes.conf.....................................................................................................144
Tag event types......................................................................................................................146
Event type discovery..............................................................................................................146
Event type templates..............................................................................................................147
Dynamic event rendering.......................................................................................................148
Transaction Types...........................................................................................................................151
How transactions work...........................................................................................................151
Transaction types via configuration files................................................................................151
Transaction search.................................................................................................................154
Search...............................................................................................................................................157
How search works..................................................................................................................157
Set up saved searches via Splunk Web.................................................................................158
Set up saved searches via savedsearches.conf....................................................................160
Create a form search.............................................................................................................162
Macro searches......................................................................................................................164
Configure summary indexing..................................................................................................165
Live tail...................................................................................................................................168
Distributed Search...........................................................................................................................171
How distributed search works................................................................................................171
Enable distributed search via Splunk Web.............................................................................172
Enable distributed search via the CLI....................................................................................172
Configure distributed search via distsearch.conf....................................................................173
Exclude specific Splunk servers from distributed searches...................................................175
Alerts.................................................................................................................................................177
How Alerts Work....................................................................................................................177
Set up alerts via Splunk Web.................................................................................................178
Set up alerts via savedsearches.conf....................................................................................181
Scripted Alerts........................................................................................................................185
Customize alert options..........................................................................................................186
Send SNMP traps..................................................................................................................188
Security.............................................................................................................................................191
Security options......................................................................................................................191
Enable HTTPS.......................................................................................................................192
SSL........................................................................................................................................194
Set up LDAP..........................................................................................................................196
Configure roles.......................................................................................................................204
Scripted authentication...........................................................................................................207
File system change monitor...................................................................................................210
Audit events...........................................................................................................................213
Audit event signing.................................................................................................................215
Event hashing........................................................................................................................217
IT data signing........................................................................................................................221
Archive signing.......................................................................................................................225
Data Management............................................................................................................................228
Splunk data management......................................................................................................228
Create an index......................................................................................................................229
Remove (delete) data.............................................................................................................231
Export event data...................................................................................................................235
Move the Splunk index...........................................................................................................236
Set a retirement and archiving policy.....................................................................................237
Automate archiving................................................................................................................239
Restore archived data............................................................................................................240
Back up your data..................................................................................................................241
Disk usage.............................................................................................................................244
Use separate partitions for Splunk's datastore.......................................................................245
Use WORM (Write Once Read Many) volumes for Splunk's datastore.................................247
Deployment Server..........................................................................................................................249
How the deployment server works.........................................................................................249
Configure a Splunk deployment server..................................................................................251
Configure server classes........................................................................................................254
Configure deployment clients.................................................................................................255
Sync the server and client......................................................................................................258
Performance Tuning........................................................................................................................260
Performance tuning Splunk....................................................................................................260
Indexing performance............................................................................................................261
Search performance...............................................................................................................264
Storage efficiency...................................................................................................................266
CPU and memory footprint.....................................................................................................268
Multi-CPU servers..................................................................................................................268
64-bit operating systems........................................................................................................269
Configuration Files..........................................................................................................................271
How do configuration files work?............................................................................................271
Configure application directories............................................................................................274
Configuration file list...............................................................................................................276
Applications.....................................................................................................................................278
About apps.............................................................................................................................278
About Splunk's app manager.................................................................................................279
Install Splunk apps.................................................................................................................280
Reference..........................................................................................................................................283
Pre-trained source types........................................................................................................283
Splunk log files.......................................................................................................................287
Work with metrics.log.............................................................................................................290
Log file rotation.......................................................................................................................293
Determine what files Splunk is monitoring.............................................................................294
Index SNMP events with Splunk............................................................................................294
log4j ........................................................................................................................................294
Strip syslog headers before processing.................................................................................295
Wildcards...............................................................................................................................296
alert_actions.conf...................................................................................................................296
app.conf.................................................................................................................................298
audit.conf................................................................................................................................300
authentication.conf.................................................................................................................302
authorize.conf.........................................................................................................................305
commands.conf......................................................................................................................309
crawl.conf...............................................................................................................................310
decorations.conf.....................................................................................................................312
deployment.conf.....................................................................................................................313
distsearch.conf.......................................................................................................................315
eventdiscoverer.conf..............................................................................................................317
eventtypes.conf......................................................................................................................318
field_actions.conf...................................................................................................................319
fields.conf...............................................................................................................................322
indexes.conf...........................................................................................................................323
inputs.conf..............................................................................................................................327
limits.conf...............................................................................................................................334
literals.conf.............................................................................................................................339
multikv.conf............................................................................................................................340
outputs.conf............................................................................................................................344
prefs.conf...............................................................................................................................348
props.conf..............................................................................................................................352
regmon-filters.conf.................................................................................................................359
restmap.conf..........................................................................................................................360
savedsearches.conf...............................................................................................................362
segmenters.conf.....................................................................................................................366
server.conf.............................................................................................................................368
setup.conf...............................................................................................................................370
source-classifier.conf.............................................................................................................371
sourcetypes.conf....................................................................................................................372
streams.conf...........................................................................................................................373
strings.conf.............................................................................................................................374
sysmon.conf...........................................................................................................................375
tags.conf.................................................................................................................................376
transactiontypes.conf.............................................................................................................377
transforms.conf......................................................................................................................379
user-seed.conf.......................................................................................................................382
web.conf.................................................................................................................................383
wmi.conf.................................................................................................................................386
Troubleshooting...............................................................................................................................389
Contact Support.....................................................................................................................389
Splunkd is down.....................................................................................................................392
License issues........................................................................................................................392
Anonymize data samples.......................................................................................................396
Unable to get a properly formatted response from the server................................................399
Command line tools...............................................................................................................399
About the Splunk Admin Manual
What's in the Admin Manual?
Everything you need to know to configure and manage Splunk can be found in this guide.
Start with an overview, and then get started with some administration basics.
Find what you need.
Use the table of contents to the left of this panel, or search for what you want by using the search box
located in the upper right.
If you're interested in more specific scenarios and best practices, you can visit the Splunk Community
Wiki to see how other users Splunk IT.
Need something a little more user-oriented?
Try the User Manual.
How Splunk Works
Overview of Splunk
Splunk is search software for any type of data. Learn more about how Splunk works by reading
through this introductory page. You'll find many links here for installing, configuring and customizing
your Splunk installation.
Configuration options
Splunk has several options for configuration: a Web interface (Splunk Web), a command line
interface (the CLI), and configuration files. Most of Splunk's configuration can be accomplished by
using the Admin page of Splunk Web, and the CLI. Configure advanced settings through
configuration files.
Installation and upgrade
Installing Splunk is easy and fast. These instructions show you how to install, upgrade, or back up an
existing copy of Splunk.
Installation
Installation instructions for all supported platforms are found in the Installation Manual.
On *nix platforms, use a tarball or RPM file.
On Mac, use a tarball or DMG file.
On Windows, download the .exe file and install. Instructions for Windows installation are
located here.


Upgrade
There are a lot of new features available in 3.3. You may want to consider an upgrade if
you are running an earlier version.

Upgrade instructions are here.

Important: It's a good idea to back up your current instance before you upgrade.
Data sources
Splunk is capable of receiving data in a variety of ways. Read on for a brief description of each input
type. For a more in-depth description of inputs, read how input configuration works.
Files and directories
Use monitor to stream live data into Splunk.

Or batch to upload a file directly to Splunk Web.
Network ports
Splunk supports UDP and TCP connections.
Configure syslog on UDP 514.
Use TCP connections for log4j.

Scripted inputs
Use scripted inputs to receive the outputs of command-line tools (such as vmstat, iostat,
netstat, top, etc.) or other programs.


Crawl
Use Splunk's new crawl feature to search for new data sources and files on your Splunk
server.


Distributed data
One Splunk Server can receive data from any number of other Splunk Servers via data
distribution (description below).

This port is configurable, but defaults to 9998.

Windows
Splunk for Windows comes with its own set of configuration files for setting up Windows-specific
inputs, including Windows registry and WMI. Read more about configuring Windows inputs.
Distributed data
Configure distributed inputs and outputs across your network. Send data between one Splunk
instance and another, or third party software. For an overview on all the available configuration
options, see How data distribution works.
Forwarding and receiving
A Splunk Server in forwarding mode can send data to one or more Splunk instances.
Any Splunk Server can receive data from one or more Splunk instances.
Learn more about forwarding and receiving.

3rd party systems
Splunk can also forward raw data to any other system or software.
You can set up Splunk to send or receive data from 3rd party systems. Learn how.

Indexing
Splunk takes all data from inputs and sends it to an indexing pipeline. Data is then broken up into
separate events via segmentation rules. Most data is segmented and timestamped correctly.
However, you may wish to configure Splunk to index your data in particular ways. Learn more about
how indexing works.
Here are some things you might want to consider:
Configure event boundaries
Configure segmentation
Mask sensitive data
Character set
Timestamp recognition
Configuration for indexing is set mostly through props.conf and transforms.conf.
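As a rough sketch, a props.conf stanza applied to a source type might look like the following. The source type name and attribute values here are hypothetical examples, not settings recommended by this manual:
[my_app_log]
SHOULD_LINEMERGE = true
CHARSET = UTF-8
TIME_FORMAT = %Y-%m-%d %H:%M:%S
See the props.conf reference later in this manual for the full list of attributes.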
Fields
Fields are a useful aspect of Splunk's search interface. You can use Splunk's built-in fields that are
enabled by default. Here's a list of Splunk's default fields, including links to more in-depth
documentation:
Source
The source field specifies the path to the original data input.
It is set automatically, but can be tagged.

Host
Host is the label for the device that originated the event.
Read more about host.

Source type
A source type refers to any common format of data produced by a group of sources,
such as weblogic or syslog.

Learn more about source types.

Event types
Event types are groups of common events.
Learn more about event types.

You can also create your own fields. Custom fields are useful for:
Customizing searches (see below for search options).
Creating field actions.
Enabling event type templates.
To learn more about creating custom fields, see how fields work.
Search
Splunk's search interface is useful for tracking down different aspects of your data. Here are a few
things you can do with your searches:
Search commands.
Splunk has a powerful search language.
Craft simple to sophisticated searches.

Save searches.
Any search can be saved and run at any time.
Save searches with variables to fill in at search time, including:
Form search.

Macro search.
LiveTail
Run a search to watch data as it's indexed.
Read more about Live Tail.

Alerts
Schedule Splunk to send search results via email or RSS.

Summary indexing
Save the output of any search to a special index.

Transactions
Search for transactions that occur across events, such as email threads or store purchases.


For a more detailed overview of search, see how search works.
Distributed search
In a distributed set up, you may want to search across multiple instances of Splunk. Enable
distributed search to federate searches across your entire Splunk deployment. Read more about how
distributed search works.
Security
Secure your Splunk server with the following security configuration options. Here's a brief overview of
the available features. For a more detailed overview, see security options.
Authentication
Splunk includes several authentication options, including:
Roles, which allow you to set up:
User role capabilities.
User-based access controls.

LDAP
Set up LDAP.

SSL
Enable HTTPS.
Or SSL for Splunk's back-end.

Audit
Use the following options to enable separate auditing configurations:
File system monitor
The file system change monitor watches any designated file system and sends an event
if files or directories are affected in any way.

By default, Splunk monitors its own $SPLUNK_HOME/etc/ directory for configuration
changes.

Audit events
Events generated by the file system change monitor as well as user activity within
Splunk.

Audit events are stored in a separate index, _audit.

Audit event signing
Set up cryptographic signing for audit events.

IT data signing
Enable cryptographic signing for all your events as they enter Splunk.

Archive signing
Sign your data as it is archived.

Event decorations
Mark your audit events with icons so they're more noticeable.

Data management
Splunk servers often index large amounts of data each day. You may want to enable advanced
settings to handle the following data management scenarios.
Index management, including:
Add or remove an index.
Delete data from the index.
Move an index.

Data archiving, including:
Set retirement policy.
Automate archiving.
Restore archived data.
Export event data.

Storage options:
Disk usage.
Use separate partitions for Splunk's data store.
Use WORM (Write Once Read Many) volumes for Splunk's data store.

Note: Many data management settings are enabled on a per-index basis, using indexes.conf. To
learn more about indexes, see how indexes work.
Deployment server
In a distributed set up, enable one or more Splunk instances as deployment servers. A deployment
server pushes out configuration changes to other Splunk instances.
For a complete overview of all deployment options, read the Deployment manual. For instructions on
configuring and enabling the deployment server and clients, read the Admin manual section on the
deployment server.
Performance tuning
The following options help you tune Splunk's performance for your environment. Depending on your
system and requirements, you may want to change one or more of the following settings:
Indexing
Change various configurations to speed up Splunk's intake of data.

Search
Settings for faster return of search results.

Storage efficiency
Cut down on the space of your Splunk index.

CPU and memory footprint
Tune Splunk's CPU usage and memory settings.

Backup
Back up your Splunk install.
Note: It is a good idea to back up Splunk before performing any migrations or upgrades.

A more in-depth overview of performance tuning options is available here.
Configuration files
Many of Splunk's advanced configurations and customizations are available only through
configuration files. Create configurations by copying files into a custom application directory. Learn
more about application directories and configuring application directories.
Applications
Applications are directories of configuration files with specific purposes. Configure your own
applications by following these instructions.
You can also share your configuration file directories as applications with the Splunk community on
SplunkBase.
Customization
Pimp your Splunk! Everybody's data is a little bit different. Maybe you want to set custom
configurations for the system you're running Splunk on. Here are options for personalizing your
Splunk instance.
Splunk Web appearance
Change various aspects of Splunk Web's appearance:
Dashboards
Configure user settings and dashboards via prefs.conf.

Decorations
Set icons for event types with dynamic event rendering.

Literals
Change the externalized strings in Splunk Web via literals.conf.

Skinning
Change the way your web interface looks.
Read the Developer's Guide for help with skinning Splunk.

Extend Splunk
Splunk includes a REST API. Read the Developer's Guide to learn more about the REST API. To
configure additional REST endpoints, use restmap.conf.
Troubleshooting
If there's something you need help with, even after reading the documentation, contact Splunk
support.
If there's a feature you don't see here that you want included, file an enhancement request with
Splunk support.
We're always interested in your feedback.
Splunk support.
Splunk forums.
Getting Started
Start Splunk
This topic serves only as a brief introduction to starting Splunk. If you are new to Splunk, we
recommend reviewing the User Manual first.
Before you start
Before starting Splunk, install the software. Refer to the Installation Manual for system requirements
and step-by-step instructions. Make sure you install the correct version of Splunk and that you are
installing on a supported filesystem.
Start Splunk on non-Windows platforms
Splunk's command line interface is located in $SPLUNK_HOME/bin/. $SPLUNK_HOME refers to the
path you installed under. Navigate to this location and run the following command:
# ./splunk start
You must accept Splunk's EULA the first time you start Splunk after a new installation. To bypass this
step, start Splunk and accept the license in one step:
# ./splunk start --accept-license
NOTE: There are two dashes before the accept-license option.
Start Splunk on Windows
On Windows, Splunk is installed by default into \Program Files\Splunk.
Start and stop the following Splunk processes via the Windows Services Manager:
Server daemon: splunkd
Web interface: splunkweb
You can also start, stop, and restart both processes at once by going to \Program
Files\Splunk\bin and typing
# splunk.exe [start|stop|restart]
Load Splunk Web in your browser
Navigate to:
http://mysplunkhost:8000
Use whatever host and port you chose during installation.
The first time you login to Splunk with an Enterprise license, use username admin and password
changeme. Splunk with a free license does not have access controls.
Administration basics
The $SPLUNK_HOME variable refers to the top level directory of your installation. By default, this is
/opt/splunk/.
Add Splunk to your shell path
To save a lot of typing, set a SPLUNK_HOME environment variable and add $SPLUNK_HOME/bin to
your shell's path.
This example works for Linux/BSD/Solaris users who accepted the default installation location:
# export SPLUNK_HOME=/opt/splunk
# export PATH=$SPLUNK_HOME/bin:$PATH
This example works for Mac users who accepted the default installation location:
# export SPLUNK_HOME=/Applications/Splunk
# export PATH=$SPLUNK_HOME/bin:$PATH
Alternatively, Splunk supplies a script that can be sourced to set up the Splunk environment, regardless
of where it has been installed. This performs the equivalent of the above steps and obeys the values in
etc/splunk-launch.conf.
# source <your splunk directory>/bin/setSplunkEnv
Splunk's CLI
Splunk's command line interface is located in $SPLUNK_HOME/bin/. If you have exported the path
and environment variables (as explained above), you can use the splunk command as follows:
# splunk [action] [object] [-parameter value] ....
If you haven't set an environment variable, navigate to $SPLUNK_HOME/bin/ and run commands as
follows:
#./splunk [action] [object] [-parameter value] ....
For general help, type:
# splunk help
For a list of commands and options, type:
# splunk help commands
For Splunk with an Enterprise license, administration commands must be authenticated with a
username and password. To authenticate for an entire session, type:
# splunk login
This command prompts you for a Splunk username and password. Use the same username and
password for the CLI and Splunk Web. By default, the login is set to admin and the password is
changeme.
Logout at any time by typing:
# splunk logout
To authenticate a single command, use the -auth parameter:
# splunk search foo -auth username:password
Note: the -auth string must be the last term in the CLI command.
Start/stop Splunk, check status
Ensure that you have added Splunk to your server host's path (as explained above, in "Add Splunk to
your shell path"). Otherwise you must use the ./splunk command.
Start the Server
From a shell prompt on the Splunk server host, run this command:
# splunk start
Alternately, start either splunkd (to load back-end configuration) or Splunk Web (to load web
configuration):
# splunk start splunkd
# splunk start splunkweb
Note: manually starting splunkweb will not override the startwebserver setting in
web.conf. If it is disabled in the configuration files, it will not start.
Or restart Splunk (splunkd or Splunk Web) by running:
# splunk restart
# splunk restart splunkd
# splunk restart splunkweb
Stop the Server
To shut down Splunk, run this command:
# splunk stop
Also available for splunkd and Splunk Web:
# splunk stop splunkd
# splunk stop splunkweb
Check if Splunk is running
To check if Splunk is running, type this command at the shell prompt on the server host:
# splunk status
You should see this output:
splunkd is running (PID: 3162).
splunk helpers are running (PIDs: 3164).
splunkweb is running (PID: 3216).
Or you can use ps to check for running Splunk processes:
# ps aux | grep splunk | grep -v grep
Solaris users, type -ef instead of aux:
# ps -ef | grep splunk | grep -v grep
Where to find help
Help is available in several forms.
From the CLI:
Type # splunk help

From Splunk Web:
Follow the help link in the upper right hand corner of Splunk Web.
Click the tutorial link from the Splunk Web landing page.

Contact Splunk Support:
Many options are available on the support portal.
Email Splunk support.

Change defaults
Changing the admin default password
Splunk with an Enterprise license has a default administration account and password. It is highly
recommended that you change the default. You can do this via Splunk's CLI or Splunk Web.
Note: CLI commands assume you have set a Splunk environment variable. If you have not, navigate
to $SPLUNK_HOME/bin and run the ./splunk command.
via Splunk Web
Log in as admin.
Click Admin in the top-right of the interface.
Click the Users tab.
Under the Action heading, click Edit.
Type in the new information and click Save.
via Splunk CLI
The Splunk CLI command is:
# splunk edit user
Note: You must authenticate with the existing password before it can be changed. Log into Splunk
via the CLI or use the -auth parameter.
For example:
# splunk edit user admin -password foo -auth admin:changeme
This command changes the admin password from changeme to foo.
Changing network ports
Splunk uses two ports. They default to:
8000 - HTTP or HTTPS socket for Splunk Web.
8089 - Splunkd management port. Used to communicate with the splunkd daemon. Splunk
Web talks to splunkd on this port, as does the command line interface and any distributed
connections from other servers.

via Splunk Web
To change the port settings via Splunk Web, click the Admin link in the upper right-hand corner.
Then, click the Server tab. Click Settings and change the port assignments.
via Splunk CLI
To change the port settings via the Splunk CLI, use the CLI command set.
# splunk set web-port 9000
This command sets the Splunk Web port to 9000.
# splunk set splunkd-port 9089
This command sets the splunkd port to 9089.
Changing the default Splunk server name
The Splunk server name setting controls both the name displayed within Splunk Web and the name
sent to other Splunk Servers in a distributed setting.
The default name is taken from either the DNS or IP address of the Splunk Server host.
via Splunk Web
To change this setting, click the Admin link in the upper right-hand corner.
Then, click the Server tab and modify the Splunk Server name variable under the Settings
tab.

via Splunk CLI
To change the server name via the CLI, type the following:
# splunk set servername foo
This command sets the servername to foo.
Changing the datastore location
The datastore is the top-level directory where the Splunk Server stores all indexed data, user
accounts, and working files.
Note: If you change this directory, the server does not migrate old datastore files. Instead, it starts
over again at the new location.
To migrate your data to another directory, follow the instructions in Move an index.
via Splunk Web
To change this setting, click the Admin link in the upper right-hand corner.
Then, click the Server tab and modify the Datastore path variable under the Settings tab.
via Splunk CLI
To change the datastore location via the CLI, type the following:
# splunk set datastore-dir /var/splunk/
This command sets the datastore directory to /var/splunk/.
Set minimum free disk space
The minimum free disk space setting controls how low disk space in the datastore location can fall
before Splunk stops indexing.
Splunk resumes indexing when more space becomes available. For detailed information on how to
manage Splunk server disk usage, see Disk usage.
via Splunk Web
To change this setting, click the Admin link in the upper right-hand corner.
Then, click the Server tab and modify the variable below Pause indexing if free disk space
falls below under the Settings tab.

via Splunk CLI
To change the minimum free disk space via the CLI, type the following:
# splunk set minfreemb 2000
This command sets the minimum free space to 2000 MB.
Find and index data
There are many ways to set up data inputs in Splunk. This section is a high-level description of these
techniques. For more detailed methods, see the data inputs section.
Here's a brief intro on getting data into Splunk.
Monitor a file
When you first log in to Splunk Web, you're provided a link to begin monitoring /var/log locally.
You can monitor other files and directories you're interested in. When you specify a file to monitor,
Splunk processes the entire file and then watches the file and processes additions to it. When you
monitor a directory, Splunk recursively searches all subdirectories looking for files resembling log
files. You can explicitly include or exclude files with whitelisting and blacklisting.
Monitor files via Splunk Web
Manage your indexed files and add new files to your index from the Admin > Data Inputs: Files &
Directories page.
1. To access the Admin page, click the Admin link in the upper right-hand corner.
The Admin page opens to the Server settings page.
2. From the navigation links on the left, click Data Inputs.
The Admin > Data Inputs: All page opens.
3. From the navigation links on the left or the table of input types, click Files & Directories.
The Admin > Data Inputs: Files & Directories page opens.
4. Click New Input.
The Admin > Data Inputs: Files & Directories: New Input page opens.
Monitor files via the CLI
Use the splunk add command. These commands assume you have set a Splunk environment
variable. If you have not, you must navigate to $SPLUNK_HOME/bin and run the ./splunk
command.
For example:
splunk add monitor /var/log/
This command monitors all files in /var/log/.
Crawl for inputs
Splunk 3.3 introduces the new crawl feature. Crawl your file system for potential logs and data to
index. Read more about Using crawl and Configuring crawl.
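Assuming the crawl search command is available in your version, one possible way to invoke it from the CLI (with placeholder credentials) is:
# splunk search "| crawl" -auth admin:changeme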
Add more users
There are three default user roles and three different authentication methods to choose from when
you set up Splunk with an Enterprise license. Users authenticate with Splunk's built-in system
(described below), LDAP, or scripted authentication (for third-party auth systems). Any of these
methods works with Splunk's roles system.
You must be logged in as a Splunk administrator to add or edit user accounts. The default Admin
account password is changeme.
Note: Splunk with a Free license does not contain access control features. To access this page, you
must run Splunk with an Enterprise license. For more information, read About Splunk licenses.
Lost admin password
If you lose the password to your admin account, contact Splunk Support for assistance.
Splunk local users
A Splunk Admin can create new users either via Splunk Web or Splunk's CLI. Users can be mapped
to Splunk's default roles or any custom roles via authorize.conf.
via Splunk Web
To manage user accounts, click the Admin link in the upper right-hand corner.
From the left hand navigation list, click Users.
To add a new user, click the New User button.
To edit existing accounts, click the Edit link under the Action heading.
Enter the new or changed information and then click Save.
via Splunk CLI
From the CLI, use the following commands to add, edit, remove, or list users.
add user [-parameter value] ...
edit user [-parameter value] ...
remove user [-parameter value] ...
list user
Required (default) Parameters:
username -- the name of the Splunk user account to manage.
full-name -- the full name of the user in quotes, for example "Nikola Tesla".
role -- either User, Power, or Admin.
Note: The role names are case sensitive.
Optional Parameters:
password -- the password to set for the account.
Examples
The following are examples of editing a user's properties and adding a new user. Only Admin roles
can modify user properties. To log in, use the splunk login command or the -auth parameter, as shown
in these examples.
Note: These examples assume you have set a Splunk environment variable. If you have not,
navigate to $SPLUNK_HOME/bin and run the ./splunk command.
Example 1
Let's say, as an admin on a Splunk server, you want to change the password for another user. The
syntax for this looks something like:
# splunk edit user <username> -password <newpassword> -auth <your_username>:<your_password>
Note: When editing a specific user's properties, you can specify the user without the -username
parameter.
Therefore, to authenticate as user admin to change the password for user newbie:
# splunk edit user newbie -password f8h2.$R -auth admin:adminpw
Example 2
Now, as an admin on a Splunk server, you want to add a new user with more than one role. The
syntax for this looks something like:
# splunk add user -username <username> -full-name "First Last" -role <role1> -role <role2> -password <password> -auth <your_username>:<your_password>
Therefore, to add a new user deep, with Everybody and Admin permissions:
# splunk add user -username deep -full-name "the deep" -role Everybody -role Admin -password foobar -auth admin:adminpw
Start searching
Now you're ready to start using Splunk's search capabilities. Here are a few pages to help you start
searching:
1. Search reference.
2. Search syntax.
3. Search tutorial.
Data Inputs
How input configuration works
Splunk consumes any data you point it at. Before indexing data, you must add your data source as
an input. The source is then listed as one of Splunk's default fields (whether it's a file, directory or
network port).
Note: Splunk looks for the inputs it is configured to monitor every 24 hours starting from the time it
was last restarted. This means that if you add a stanza to monitor a directory or file that doesn't exist
yet, it could take up to 24 hours for Splunk to start indexing its contents.
Data input methods
Specify data inputs via the following methods:
Splunk Web.
Splunk's CLI.
The inputs.conf configuration file.
Data distribution.
Most data sources can be specified via Splunk Web. For more extensive configuration options, use
inputs.conf. Changes made via Splunk Web or the Splunk CLI are written to
$SPLUNK_HOME/etc/system/local/inputs.conf. Configure Windows inputs via
inputs.conf as well.
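For example, a minimal monitor stanza in inputs.conf might look like the following sketch. The path, source type, and index shown are illustrative only:
[monitor:///var/log/httpd]
sourcetype = access_combined
index = main
Stanzas created through Splunk Web or the CLI are written in this same stanza format to the file named above.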
Sources
Splunk accepts data inputs from a wide range of sources. Here's a basic overview of your options.
Read on through the Data Inputs and Data Distribution sections of this manual for configuration
specifics.
Files and directories
Many data inputs come directly from files and directories. For the most part, you can use Splunk's
monitor processor to index data in files and directories. If you have a large archive of historical data,
you may want to use batch. Data sent via batch is loaded once and the original files are deleted
when Splunk is done indexing them. Keep this in mind when using batch input.
You can also configure Splunk's file system change monitor to watch for changes in your file
system. However, you cannot currently use both monitor and file system change monitor to follow
the same directory or file. If you want to see changes (eg. file edits, ownership changes) in a
directory, use file system change monitor. If you want to index new events (eg. from log files) in a
directory, use monitor.
To configure files and directories, see files and directories.
To configure file system change monitor, see the page on file system change monitor.
Monitor
Specify a path to a file or directory and Splunk's monitor processor consumes any new input. You
can also specify a mounted or shared directory, as long as the Splunk server can see the directory. If
the specified directory contains subdirectories, Splunk recursively examines them for new files.
Splunk only checks for files and directories each time the Splunk server starts/restarts, so be sure to
add new sources when they become available if you don't want to restart the server. You can also
use crawl to discover new sources.
When using monitor:
Files can be opened or closed for writing. Splunk consumes files even if they're still being
written to by the operating system.

Files or directories can be included or excluded via whitelists and blacklists. For more
information, see "Whitelist and blacklist rules" in this manual.

Upon restart, Splunk continues processing files where it left off.
Splunk unpacks compressed archive files before it reads them. Splunk can handle the
following common archive filetypes: tar, gz, bz2, tar.gz, tgz, tbz, tbz2, zip, and z, and it
processes compressed files according to their extension. Keep in mind that unpacking large
amounts of compressed files can cause performance issues, so you may want to store old
archive files where they are not monitored by Splunk.
Splunk detects log file rotation and does not process renamed files it has already
indexed, with the exception of archive filetypes such as .tar and .gz, which it will not recognize
as being the same as the uncompressed originals (you can exclude them with the blacklist
functionality mentioned above). For more information see "Log file rotation" in this manual.
The entire path dir/filename for a monitored file must not exceed 993 characters. Paths longer than this are indexed, but the source key is truncated.
Set the sourcetype to Automatic when you monitor a directory. If the directory contains
multiple files of different formats, do not set a value for the source type manually. Manually
setting a source type forces a single source type for all files in that directory.
Removing an input does not stop Splunk from indexing files right away. The input will be
disabled when the Splunk server is restarted. Additionally, some small amount of data already
read from these files may be indexed after the restart.
Note: Splunk rescans the inputs it is configured to monitor every 24 hours starting from the time it
was last restarted. This means that if you add a stanza to monitor a directory or file that doesn't exist
yet, it could take up to 24 hours for Splunk to start indexing its contents.
Important: To avoid performance issues, Splunk recommends that you set followTail=1 in
inputs.conf if you are deploying Splunk to systems containing significant quantities of historical
data. Setting followTail=1 for a monitor input means that any new incoming data is indexed when
it arrives, but anything already in files on the system when Splunk was first started will not be
indexed.
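A minimal sketch of such a stanza, assuming a hypothetical directory of existing historical logs, might be:
[monitor:///var/log/historical]
followTail = 1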
For the curious, some detail on How Splunk Reads Input Files is available on the Community wiki.
Upload files
Upload files directly through Splunk Web. If necessary, Splunk decompresses files before indexing.
Uploading files through Splunk Web places them in the spool directory
$SPLUNK_HOME/var/spool/splunk.
Use the batch processor at the CLI to load files once and destructively. By default, Splunk's batch
processor is located in $SPLUNK_HOME/var/spool/splunk. If you move a file into this directory,
Splunk indexes it and deletes it. You should only use this for large archives of historical data. For
most inputs, use monitor.
FIFO queues
Caution: Due to common issues with deadlock and data loss, the use of FIFOs is not recommended.
Monitor is a more reliable, stable method. Support for FIFO inputs is deprecated and will be removed
in a future release of Splunk.
A FIFO (also known as a named pipe) is a queue of data maintained in memory. Processes such as syslog can write log
messages directly to a FIFO. Splunk then accesses the FIFO as though it were a file. FIFO access is
very fast, but FIFOs are vulnerable when there are processing disruptions because the in-memory
data may be lost.
To configure FIFO queues, see "FIFO" in this manual.
Network ports
You can configure Splunk with an Enterprise license to listen on any network port. This is the best
method to send data to your Splunk server from any machine (see data distribution for more
information). When configuring network ports, keep in mind that you cannot use privileged ports (i.e.
any port below 1024) if you have not installed Splunk as root on Linux, Unix, Mac, or FreeBSD.
Windows does not implement privileged ports, so Splunk can bind to any port when running under
any user context.
To configure network ports, see "Network ports" in this manual.
UDP
UDP is a best effort protocol, so you may experience data loss under certain conditions such as high
network or system utilization. Use UDP inputs only when the sending device does not support TCP.
Splunk with an Enterprise license can listen for data on any UDP port. When configured to listen on
UDP port 514, Splunk eliminates the need to install and configure a syslog server to listen for syslog
data sent from remote hosts.
TCP
TCP is a reliable, connection-oriented protocol that should be used instead of UDP to transmit and
receive data whenever possible. Splunk with an Enterprise license can receive data on any TCP port,
allowing Splunk to receive remote data from syslog-ng and other applications that transmit via TCP.
TCP is the foundation of Splunk's data distribution architecture.
Scripted inputs
Configure Splunk to run shell commands on a schedule, and then index whatever the command
writes to standard output.
For example:
vmstat, iostat, netstat, and any other network or system status commands.
SQL DBI.
HTTP and HTTPS requests.
SNMP.
See configure scripted inputs for details on setting this up.
Windows data sources
By default, Splunk for Windows indexes the Windows Application, System, and Security event logs.
Splunk for Windows can also monitor and index changes to your registry and accept WMI data input.
For more information on configuring Splunk for Windows, see "Windows inputs" in this manual.
Crawl
Discover new inputs automatically. Crawl uses rules you configure to traverse any directory structure.
Splunk adds new inputs you find via crawl to inputs.conf.
Data processing
Once Splunk consumes data, it is sent to the universal processing pipeline. Splunk can
automatically learn event boundaries, classify events and sources, and extract timestamps. However,
you may want to manually override Splunk's automatic processing. Change processing settings and
indexing properties via props.conf.
Some attributes within props.conf can be customized by defining new stanzas in other
configuration files. For example, transforms.conf defines regex-based rules for extracting fields,
routing events, and performing other transformations. Segmenters.conf and outputs.conf can also
define attribute values referenced by props.conf.
Common use cases for custom indexing properties include:
Define additional indexed or extracted fields.
Override the value of host on a per-event basis, such as for syslog coming from multiple
servers.
Customize how Splunk recognizes timestamps.
Change how Splunk recognizes multi-line event boundaries (see the sample stanza after this list).
Mask sensitive data in an event, such as social security numbers.
Customize how Splunk segments events in its index.
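As a sketch of the timestamp and multi-line cases above, you might add a props.conf stanza like the following for one of your source types; the stanza name, the BREAK_ONLY_BEFORE regex, and the TIME_FORMAT value are hypothetical illustrations, not Splunk defaults:
[my_app_events]
BREAK_ONLY_BEFORE = ^\[\d{4}-\d{2}-\d{2}
TIME_FORMAT = %Y-%m-%d %H:%M:%S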
Files and directories
Point Splunk at a file or a directory. If you specify a directory, Splunk consumes everything in the
directory. Splunk has two different file input processors: monitor and batch. For the most part, use
monitor to input all your data sources from files and directories. The only time you should use batch
is to load a large archive of historical files. Read on for more specifics.
Monitor
Specify a path to a file or directory and Splunk's monitor processor consumes any new input. You can
also specify a mounted or shared directory, including network filesystems, as long as the Splunk
server can read from the directory. If the specified directory contains subdirectories, Splunk
recursively examines them for new files.
Splunk checks for the file or directory specified in a monitor configuration on Splunk server start and
restart. If the file or directory specified is not present on start, Splunk checks for it again in 24-hour
intervals from the time of the last restart. Subdirectories of monitored directories are scanned
continuously. To add new inputs without restarting Splunk, use Splunk Web or the command line
interface. If you want Splunk to find potential new inputs automatically, use crawl.
When using monitor:
On most operating systems, files can be opened or closed for writing. With the exception of
Windows, Splunk consumes files even if they're still being written to by the operating system.
Files or directories can be included or excluded via whitelists and blacklists.
Upon restart, Splunk continues processing files where it left off.
Splunk decompresses archive files before it indexes them. It can handle the following common
archive file types: .tar, .gz, .bz2, .tar.bz2, and .zip.
Splunk detects log file rotation and does not process renamed files it has already indexed (with
the exception of .tar and .gz archives; for more information see "Log file rotation" in this
manual).
The entire dir/filename path must not exceed 1024 characters.
Set the sourcetype for directories to Automatic. If the directory contains multiple files of
different formats, do not set a value for the source type manually. Manually setting a source
type forces a single source type for all files in that directory.
Removing an input does not stop the input's files from being indexed. Rather, it stops files
from being checked again, but all the initial content will be indexed. To stop all in-process data,
you must restart the Splunk server.
Note: You cannot currently use both monitor and file system change monitor to follow the same
directory or file. If you want to see changes in a directory, use file system change monitor. If you want
to index new events in a directory, use monitor.
Note: Monitor input stanzas may not overlap. That is, monitoring /a/path while also monitoring
/a/path/subdir will produce unreliable results. Similarly, monitor input stanzas which watch the
same directory with different whitelists, blacklists, and wildcard components are not supported.
Batch
Use the batch processor at the CLI or in inputs.conf to load files once and destructively. By
default, Splunk's batch processor is located in $SPLUNK_HOME/var/spool/splunk. If you move a
file into this directory, Splunk indexes it and then deletes it.
Note: Batch is most useful for loading in historical data, such as large archives of files. For best
practices on loading file archives, see "How to index different sized archives".
Splunk Web
Add inputs from files and directories via Splunk Web.
1. Click Admin in the upper right-hand corner of Splunk Web.
2. Then click Data Inputs.
3. Pick files and directories.
4. Click New Input to add an input.
5. Under Data access, pick Monitor a directory.
You can also:
Upload a local file from your local machine into Splunk.
Index a file on the Splunk server, which copies a file on the server into Splunk via the batch
directory.
6. Specify the pathname to the file or directory. If you select Upload, use the Browse... button.
To monitor a shared network drive, enter the following: <myhost>/<mypath> (or
\\<myhost>\<mypath> on Windows). Make sure your Splunk server has read access to the
mounted drive as well as the files you wish to monitor.
7. Under the Host heading, select the host name. You have several choices if you are using Monitor
or Batch methods. Learn more about setting host value.
Note: Host only sets the host field in Splunk. It does not direct Splunk to look on a specific host on
your network.
8. Now set the Source Type. Source type is a default field added to events. Source type is used to
determine processing characteristics such as timestamps and event boundaries. Learn more about
source type.
9. After specifying the source, host, and source type, click Submit.
CLI
Monitor files and directories via Splunk's Command Line Interface (CLI). To use Splunk's CLI,
navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk command from the UNIX or
Windows command prompt. Or add Splunk to your path and use the splunk command.
If you get stuck, Splunk's CLI has built-in help. Access the main CLI help by typing splunk help.
Individual commands have their own help pages as well -- type splunk help <command>.
The following commands are available for input configuration via the CLI:
Command   Command syntax                                Action
add       add monitor $SOURCE [-parameter value] ...    Add inputs from $SOURCE.
edit      edit monitor $SOURCE [-parameter value] ...   Edit a previously added input for $SOURCE.
remove    remove monitor $SOURCE                        Remove a previously added $SOURCE.
list      list monitor                                  List the currently configured monitor inputs.
spool     spool source                                  Copy a file into Splunk via the sinkhole directory.
Change the configuration of each data input type by setting additional parameters. Parameters are
set via the syntax: -parameter value.
Note: You can only set one -hostname, -hostregex or -hostsegmentnum per command.
Parameter        Required?   Description
source           Required    Path to the file or directory to monitor for new input.
sourcetype       Optional    Specify a sourcetype field value for events from the input source.
index            Optional    Specify the destination index for events from the input source.
hostname         Optional    Specify a host name to set as the host field value for events from the input source.
hostregex        Optional    Specify a regular expression on the source file path to set as the host field value for events from the input source.
hostsegmentnum   Optional    Set the number of segments of the source file path to set as the host field value for events from the input source.
follow-only      Optional    (T/F) True or False. Default False. When set to True, Splunk reads from the end of the source (like the "tail -f" Unix command).
Example: use the CLI to monitor /var/log/
The following example shows how to monitor files in /var/log/:
Add /var/log/ as a data input:
./splunk add monitor /var/log/
Example: use the CLI to monitor windowsupdate.log
The following example shows how to monitor the Windows Update log (where Windows logs
automatic updates):
Add C:\Windows\windowsupdate.log as a data input:
./splunk add monitor C:\Windows\windowsupdate.log
Example: use the CLI to monitor IIS logging
This example shows how to monitor the default location for Windows IIS logging: Add
C:\windows\system32\LogFiles\W3SVC as a data input:
./splunk add monitor c:\windows\system32\LogFiles\W3SVC
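You can combine the optional parameters from the table above in a single command. For example, assuming a hypothetical web server log directory, source type, and host name:
./splunk add monitor /var/log/httpd -sourcetype access_combined -index main -hostname web01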
inputs.conf
To add an input, add a stanza for it to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your
own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with
Splunk's configuration files before, read how configuration files work before you begin.
You can set any number of attributes and values following an input type. If you do not specify a value
for one or more attributes, Splunk uses the defaults that are preset in
$SPLUNK_HOME/etc/system/default/ (noted below).
Monitor
[monitor://<path>]
<attribute1> = <val1>
<attribute2> = <val2>
...
This type of input stanza (monitor) directs Splunk to watch all files in the <path> (or just <path>
itself if it represents a single file). You must specify the input type and then the path, so put three
slashes in your path if you're starting at root. You can use wildcards for the path. For more
information, see the "Wildcards" subsection, below.
Note: To ensure new events are indexed when you copy over an existing file with new contents, set
CHECK_METHOD = modtime in props.conf for the source. This checks the modtime of the file and
re-indexes when it changes. Note that the entire file is indexed, which can result in duplicate events.
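For example, a props.conf stanza that applies this check to a single (hypothetical) source might look like:
[source::/var/log/exported_report.csv]
CHECK_METHOD = modtime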
host = <string>
Set the host value of your input to a static value.
host= is automatically prepended to the value when this shortcut is used.
Defaults to the IP address or fully qualified domain name of the host where the data originated.
For more information about the host field, see "How host works," in this manual.
index = <string>
Set the index where events from this input will be stored.
index= is automatically prepended to the value when this shortcut is used.
Defaults to main (or whatever you have set as your default index).
For more information about the index field, see "Splunk data management," in this manual.
sourcetype = <string>
Set the sourcetype name of events from this input.
sourcetype= is automatically prepended to the value when this shortcut is used.
Splunk automatically picks a source type based on various aspects of your data. There is no
hard-coded default.
For more information about the sourcetype field, see "How source types work" in this manual.
source = <string>
Set the source name of events from this input.
Defaults to the file path.
source= is automatically prepended to the value when this shortcut is used.
queue = <string> (parsingQueue, indexQueue, etc)
Specify where the input processor should deposit the events that it reads.
Can be any valid, existing queue in the pipeline.
Defaults to parsingQueue.
host_regex = <regular expression>
If specified, the regex extracts host from the filename of each input.
Specifically, the first group of the regex is used as the host.
Defaults to the default host= attribute if the regex fails to match.
host_segment = <integer>
If specified, the '/' separated segment of the path is set as host.
Defaults to the default host:: attribute if the value is not an integer, or is less than 1.
crcSalt = <string>
If set, this string is added to the CRC.
Use this setting to force Splunk to consume files that have matching CRCs.
If set to crcSalt = <SOURCE> (note: This setting is case sensitive), then the full source path
is added to the CRC.
followTail = 0|1
If set to 1, monitoring begins at the end of the file (like tail -f).
This only applies to files the first time they are picked up.
After that, Splunk's internal file position records keep track of the file.
_whitelist = <regular expression>
If set, files from this path are monitored only if they match the specified regex.
_blacklist = <regular expression>
If set, files from this path are NOT monitored if they match the specified regex.
Wildcards
You can use wildcards to specify your input path for monitored input. Use ... for paths and * for
files.
... recurses through directories until the match is met. This means that /foo/.../bar will
match /foo/bar, /foo/1/bar, /foo/1/2/bar, etc., but only if bar is a file.
To recurse through a subdirectory, use another .... For example
/foo/.../bar/....
* matches anything in that specific path segment. It cannot be used inside of a directory path;
it must be used in the last segment of the path. For example /foo/*.log matches
/foo/bar.log but not /foo/bar.txt or /foo/bar/test.log.
Combine * and ... for more specific matches:
foo/.../bar/* matches any file in the bar directory within the specified path.
Note: In Windows, you must use two backslashes \\ to escape wildcards. Regexes with backslashes
in them are not currently supported for _whitelist and _blacklist in Windows.
Specifying wildcards results in an implicit _whitelist created for that stanza. The longest fully
qualified path is used as the monitor stanza, and the wildcards are translated into regular expressions
using the following map:
wildcard regex meaning
* [^/]* anything but /
... .* anything (greedy)
. \. literal .
Additionally, the converted expression is anchored to the right end of the file path, so that the entire
path must be matched.
For example, if you specify
[monitor:///foo/bar*.log]
Splunk translates this into
[monitor:///foo/]
_whitelist = bar[^/]*\.log$
As a consequence, you can't have multiple stanzas with wildcards for files in the same directory.
Also, you cannot use a _whitelist declaration in conjunction with wildcards.
For example:
[monitor:///foo/bar_baz*]
[monitor:///foo/bar_qux*]
This results in overlapping stanzas indexing the directory /foo/. Splunk takes the first one, so only
files starting with /foo/bar_baz will be indexed. To include both sources, manually specify a
_whitelist using regular expression syntax for "or":
[monitor:///foo]
_whitelist = (bar_baz[^/]*|bar_qux[^/]*)$
Note: To set any additional attributes (such as sourcetype) for multiple whitelisted/blacklisted inputs
that may have different attributes, use props.conf.
Examples
To load anything in /apache/foo/logs or /apache/bar/logs, etc.
[monitor:///apache/.../logs]
To load anything in /apache/ that ends in .log.
[monitor:///apache/*.log]
Batch
[batch://<path>]
move_policy = sinkhole
<attribute1> = <val1>
<attribute2> = <val2>
...
Use batch to set up a one time, destructive input of data from a source. For continuous,
non-destructive inputs, use monitor.
Note: You must set move_policy = sinkhole. This loads the file destructively. Do not use this
input type for files you do not want to consume destructively.
host = <string>
Set the host value of your input to a static value.
host= is automatically prepended to the value when this shortcut is used.
Defaults to the IP address or fully qualified domain name of the host where the data originated.
For more information about the host field, see the host section.
index = <string>
Set the index where events from this input will be stored.
index= is automatically prepended to the value when this shortcut is used.
Defaults to main (or whatever you have set as your default index).
For more information about the index field, see the data management section.
sourcetype = <string>
Set the sourcetype name of events from this input.
sourcetype= is automatically prepended to the value when this shortcut is used.
Splunk automatically picks a source type based on various aspects of your data. There is no
hard-coded default.
For more information about the sourcetype field, see the source type section.
source = <string>
Set the source name of events from this input.
Defaults to the file path.
source= is automatically prepended to the value when this shortcut is used.
queue = <string> (parsingQueue, indexQueue, etc)
Specify where the input processor should deposit the events that it reads.
Can be any valid, existing queue in the pipeline.
Defaults to parsingQueue.
host_regex = <regular expression>
If specified, the regex extracts host from the filename of each input.
Specifically, the first group of the regex is used as the host.
Defaults to the default host= attribute if the regex fails to match.
host_segment = <integer>
If specified, the '/' separated segment of the path is set as host.
Defaults to the default host:: attribute if the value is not an integer, or is less than 1.
Note: source = <string> and <KEY> = <string> are not used by batch.
Example
This example batch loads all files from the directory /system/flight815/.
[batch:///system/flight815/*]
move_policy = sinkhole
Network ports
You can enable Splunk to accept an input on any TCP or UDP port. Splunk consumes any data sent
on these ports. TCP is the protocol underlying Splunk's data distribution, which is the recommended
method for sending data from any remote machine to your Splunk server. Note that the user you run
Splunk as must have access to the port. On a Unix system you must run as root to access a port
under 1024.
Important: In version 3.3.3 of Splunk, default syslog processing via UDP does not correctly handle
line-breaks. To work around this issue, add _linebreaker = _linebreaker to the UDP stanza in
$SPLUNK_HOME/etc/system/local/inputs.conf. This issue was resolved in 3.3.4.
Splunk Web
Add inputs from network ports via Splunk Web.
1. Click Admin in the upper right-hand corner of Splunk Web.
2. Then click Data Inputs.
3. Pick Network Ports - Display and access configuration for UDP and TCP ports.
4. Click New Input to add an input.
5. Under the Source heading, select Protocol of UDP or TCP.
6. Accept the default port, 9998, or enter another port number.
7. Specify whether this port should accept connections from all hosts or one host. If you specify one
host, enter the IP address of the host.
8. Now set the Source Type. Source type is a default field added to events. Source type is used to
determine processing characteristics such as timestamps and event boundaries. Learn more about
setting source type. Choose:
From List
Select one of the pre-defined source types from the drop-down list.
Manual
Label your own source type in the text box.
9. After specifying the source, host, and source type, click Submit.
CLI
Configure network inputs via Splunk's Command Line Interface (CLI). To use Splunk's CLI,
navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk command. Or add Splunk to
your path and use the splunk command.
If you get stuck, Splunk's CLI has built-in help. Access the main CLI help by typing splunk help.
Individual commands have their own help pages as well -- type splunk help <command>.
The following commands are available for input configuration via the CLI:
Command   Command syntax                                  Action
add       add tcp | udp $SOURCE [-parameter value] ...    Add inputs from $SOURCE.
edit      edit tcp | udp $SOURCE [-parameter value] ...   Edit a previously added input for $SOURCE.
remove    remove tcp | udp $SOURCE                        Remove a previously added data input.
list      list tcp | udp                                  List the currently configured network inputs.
Change the configuration of each data input type by setting additional parameters. Parameters are
set via the syntax: -parameter value.
Parameter     Required?   Description
$SOURCE       Required    Port number to listen for data to index.
sourcetype    Optional    Specify a sourcetype field value for events from the input source.
index         Optional    Specify the destination index for events from the input source.
hostname      Optional    Specify a host name to set as the host field value for events from the input source.
remotehost    Optional    Specify an IP address to exclusively accept data from.
resolvehost   Optional    Set True or False (T | F). Default is False. Set True to use DNS to set the host field value for events from the input source.
Example
Configure a network input, then set the sourcetype:
Configure a UDP input to watch port 514 and set the sourcetype to "syslog".
Important: In version 3.3.3 of Splunk, default syslog processing via UDP does not correctly handle
line-breaks. To work around this issue, add _linebreaker = _linebreaker to the UDP stanza in
$SPLUNK_HOME/etc/system/local/inputs.conf. This issue was resolved in 3.3.4.
Check the Splunk Wiki for information about the best practices for using UDP when configuring
Syslog input.
./splunk add udp 514 -sourcetype syslog
Set the UDP input's host value via DNS. Use auth with your username and password.
./splunk edit udp 514 -resolvehost true -auth admin:changeme
Note: Splunk must be running as root to watch ports under 1024.
inputs.conf
To add an input, add a stanza for it to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your
own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with
Splunk's configuration files before, read how configuration files work before you begin.
You can set any number of attributes and values following an input type. If you do not specify a value
for one or more attributes, Splunk uses the defaults that are preset in
$SPLUNK_HOME/etc/system/default/ (noted below).
TCP
[tcp://<remote server>:<port>]
<attribute1> = <val1>
<attribute2> = <val2>
...
This type of input stanza tells Splunk to listen to <remote server> on <port>. If <remote server> is
blank, Splunk listens to all connections on the specified port.
host = <string>
Set the host value of your input to a static value.
host:: is automatically prepended to the value when this shortcut is used.
Defaults to the IP address or fully qualified domain name of the host where the data originated.
For more information about the host field, see the host section.
index = <string>
Set the index where events from this input will be stored.
index:: is automatically prepended to the value when this shortcut is used.
Defaults to main (or whatever you have set as your default index).
For more information about the index field, see the data management section.
sourcetype = <string>
Set the sourcetype name of events from this input.
sourcetype:: is automatically prepended to the value when this shortcut is used.
Splunk automatically picks a source type based on various aspects of your data. There is no
hard-coded default.
For more information about the sourcetype field, see the source type section.
source = <string>
Set the source name of events from this input.
Defaults to the file path.
source:: is automatically prepended to the value when this shortcut is used.
queue = <string> (parsingQueue, indexQueue, etc)
Specify where the input processor should deposit the events that it reads.
Can be any valid, existing queue in the pipeline.
Defaults to parsingQueue.
connection_host = [ip | dns]
If set to ip: the TCP input processor rewrites the host with the ip address of the remote server.
If set to dns: the host is rewritten with the DNS entry of the remote server.
Defaults to ip.
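Putting these attributes together, a sketch of a TCP stanza that listens on an arbitrary port for connections from any host might look like this; the port number and source type are placeholders:
[tcp://:9995]
sourcetype = remote_app_logs
connection_host = dns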
UDP
Important: In version 3.3.3 of Splunk, default syslog processing via UDP does not correctly handle
line-breaks. To work around this issue, add _linebreaker = _linebreaker to the UDP stanza in
$SPLUNK_HOME/etc/system/local/inputs.conf. This was resolved in 3.3.4.
[udp://<port>]
<attribute1> = <val1>
<attribute2> = <val2>
...
This type of input stanza is similar to the TCP type, except that it listens on a UDP port.
host = <string>
Set the host value of your input to a static value.
host= is automatically prepended to the value when this shortcut is used.
Defaults to the IP address or fully qualified domain name of the host where the data originated.
For more information about the host field, see the host section.
index = <string>
Set the index where events from this input will be stored.
index= is automatically prepended to the value when this shortcut is used.
Defaults to main (or whatever you have set as your default index).
For more information about the index field, see the data management section.
sourcetype = <string>
Set the sourcetype name of events from this input.
sourcetype= is automatically prepended to the value when this shortcut is used.
Splunk automatically picks a source type based on various aspects of your data. There is no
hard-coded default.
For more information about the sourcetype field, see the source type section.
source = <string>
Set the source name of events from this input.
Defaults to the file path.
source= is automatically prepended to the value when this shortcut is used.
queue = <string> (parsingQueue, indexQueue, etc)
Specify where the input processor should deposit the events that it reads.
Can be any valid, existing queue in the pipeline.
Defaults to parsingQueue.
_rcvbuf = <int>
Specify the receive buffer for the UDP port.
If the value is 0 or negative, it is ignored.
The default value for Splunk is 1MB (the default in the OS varies).
no_priority_stripping = true | false
If this attribute is set to true, then Splunk does NOT strip the <priority> syslog field from
received events.
Otherwise, Splunk strips syslog priority from events.
no_appending_timestamp = true
If this attribute is set to true, then Splunk does NOT append a timestamp and host to received
events.
Note: Do NOT include this key at all if you want to append timestamp and host to received
events.
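For example, a minimal UDP stanza that listens on the standard syslog port and assigns the syslog source type would look like:
[udp://514]
sourcetype = syslog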
Encrypted Inputs
You can add encrypted inputs to Splunk. Use this configuration if you want to send data to Splunk
from a third-party system. You can encrypt data via SSL and send it to Splunk over a TCP port.
Configure inputs.conf by adding this stanza to the version in $SPLUNK_HOME/etc/system/local,
or in your own custom application directory.
If you want to configure two instances of Splunk to talk to each other, see the section on data
distribution in this manual.
Define the TCP port
Add a tcp-ssl stanza to specify which TCP port receives the encrypted data:
[tcp-ssl:PORT]
Set PORT to the port on which your forwarder is sending raw (i.e., uncooked by Splunk), encrypted
data.
Encrypt the data with SSL
1. Use the SSL stanza to define the encryption:
[SSL]
2. Provide a path to the server certificate:
serverCert = <path>
3. If there is a server certificate password, specify it:
password = <string>
4. Provide the certificate authority list (root file).
rootCA = <string>
5. Toggle whether it is required for a client to authenticate.
requireClientCert = true | false
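Put together, a sketch of such a configuration might look like the following; the port, certificate paths, and password are placeholders that you must replace with your own values:
[tcp-ssl:9996]

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/server.pem
password = mycertpassword
rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
requireClientCert = false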
FIFO inputs
Caution: Data sent via FIFO is held only in memory and is not persisted, so FIFOs can be an unreliable method for data sources. To ensure your data is not lost, use monitor instead.
Splunk Web
Add inputs from FIFOs via Splunk Web.
1. Click Admin in the upper right-hand corner of Splunk Web.
2. Then click Data Inputs.
3. Pick files and directories.
4. Click New Input to add an input.
5. Under Source, type in the path to the FIFO.
6. Under the Host heading, accept the default host name or enter a new hostname/IP address. Learn
more about setting host value.
Note: Host only sets the host field in Splunk. It does not direct Splunk to look on a specific host on
your network.
7. Now set the Source Type. Source type is a default field added to events. Source type is used to
determine processing characteristics such as timestamps and event boundaries. Learn more about
setting source type. Choose:
From List
Select one of the pre-defined source types from the drop-down list.
Manual
Label your own source type in the text box.
8. After specifying the source, host, and source type, click Submit.
CLI
Add a FIFO via Splunk's Command Line Interface (CLI). To use Splunk's CLI, navigate to the
$SPLUNK_HOME/bin/ directory and use the ./splunk command. Or add Splunk to your path and
use the splunk command.
If you get stuck, Splunk's CLI has built-in help. Access the main CLI help by typing splunk help.
Individual commands have their own help pages as well -- type splunk help <command>.
The following commands are available for input configuration via the CLI:
Command   Command syntax                               Action
add       add fifo $SOURCE [-parameter value] ...      Add inputs from $SOURCE.
edit      edit fifo $SOURCE [-parameter value] ...     Edit a previously added input for $SOURCE.
remove    remove fifo $SOURCE                          Remove a previously added $SOURCE.
list      list fifo                                    List the currently configured FIFO inputs.
Parameter        Required?   Description
source           Required    Path to a FIFO or named pipe to index.
sourcetype       Optional    Specify a sourcetype field value for events from the input source.
index            Optional    Specify the destination index for events from the input source.
hostname         Optional    Specify a host name to set as the host field value for events from the input source.
hostregex        Optional    Specify a regular expression on the source file path to set as the host field value for events from the input source.
hostsegmentnum   Optional    Set the number of segments of the source file path to set as the host field value for events from the input source.
Example
This example shows how to enable a FIFO input, then set the host and sourcetype.
1. Add the FIFO /var/run/syslogfifo and set the sourcetype to linux_messages_syslog.
./splunk add fifo /var/run/syslogfifo -sourcetype linux_messages_syslog
2. Edit the input configuration to set the host to web01.
./splunk edit fifo /var/run/syslogfifo -hostname web01
inputs.conf
To add an input, add a stanza for it to inputs.conf in $SPLUNK_HOME/etc/system/local/, or your
own custom application directory in $SPLUNK_HOME/etc/apps/. If you have not worked with
Splunk's configuration files before, read how configuration files work before you begin.
You can set any number of attributes and values following an input type. If you do not specify a value
for one or more attributes, Splunk uses the defaults that are preset in
$SPLUNK_HOME/etc/system/default/ (noted below).
[fifo://<path>]
This input stanza type directs Splunk to read from a FIFO at the specified path.
host = <string>
Set the host value of your input to a static value.
host= is automatically prepended to the value when this shortcut is used.
Defaults to the IP address or fully qualified domain name of the host where the data originated.
For more information about the host field, see the host section.
index = <string>
Set the index where events from this input will be stored.
index= is automatically prepended to the value when this shortcut is used.
Defaults to main (or whatever you have set as your default index).
For more information about the index field, see the data management section.
sourcetype = <string>
Set the sourcetype name of events from this input.
sourcetype= is automatically prepended to the value when this shortcut is used.
Splunk automatically picks a source type based on various aspects of your data. There is no
hard-coded default.
For more information about the sourcetype field, see the source type section.
source = <string>
Set the source name of events from this input.
Defaults to the file path.
source= is automatically prepended to the value when this shortcut is used.
queue = <string> (parsingQueue, indexQueue, etc)
Specify where the input processor should deposit the events that it reads.
Can be any valid, existing queue in the pipeline.
Defaults to parsingQueue.
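A FIFO stanza built from these attributes, reusing the path and source type from the CLI example above, might look like:
[fifo:///var/run/syslogfifo]
sourcetype = linux_messages_syslog
host = web01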
Scripted inputs
By configuring inputs.conf, Splunk can accept events from scripts. Scripted input is useful for
command-line tools, such as vmstat, iostat, netstat, top, etc.
Note: Currently, scripted inputs do not get sent via the deployment server. In the future, Splunk will
support this behavior. For now, use your preferred configuration automation tool to push your script
directory to your server classes.
Note: On Windows platforms, text-based scripts, such as those written in Perl or Python, can be handled via an intermediary Windows batch (.bat) file.
Caution: Scripted input-launched scripts inherit Splunk's environment, so be sure to clear
environment variables which may affect your script's operation. The only environment variable that's
likely to cause problems is the library path (most commonly known as LD_LIBRARY_PATH on
linux/solaris/freebsd).
Configuration
Configure inputs.conf, using the following attributes:
[script://$SCRIPT]
interval = X
index = <index>
sourcetype = <iostat, vmstat, etc> OPTIONAL
source = <iostat, vmstat, etc> OPTIONAL
disabled = <true | false>
script is the fully-qualified path to the location of the script.
As a best practice, put your script in the bin/ directory nearest the inputs.conf
where your script is specified. So if you are configuring
$SPLUNK_HOME/etc/system/local/inputs.conf, place your script in
$SPLUNK_HOME/etc/system/bin/. If you're working on an application in
$SPLUNK_HOME/etc/apps/$APPLICATION/, put your script in
$SPLUNK_HOME/etc/apps/$APPLICATION/bin/.
interval is in seconds.
Splunk keeps one invocation of a script per instance. Intervals are based on when the
script completes. So if you have a script configured to run every ten minutes and the
script takes 20 minutes to complete, the next run will be 30 minutes after the first run.
for constant data streams, enter 1 (or a value smaller than the script's interval).
for one-shot data streams, enter -1.
Note: Setting interval to -1 will cause the script to re-run each time the splunk daemon
restarts.
index can be any index in your Splunk instance.
Default is main.
disabled is a boolean value that can be set to true if you want to disable the input.
Defaults to false.
sourcetype and source can be any value you'd like.
The value you specify is appended to data coming from your script in the
sourcetype= or source= fields.
These are optional settings.
If you want the script to run continuously, write the script to never exit and set it on a short interval.
This helps to ensure that if there is a problem the script gets restarted. Splunk keeps track of scripts it
has spawned and will shut them down upon exit.
Example
This example shows the use of the UNIX top command as a data input source.
Start by creating a new application directory. This example uses scripts/:
$ mkdir $SPLUNK_HOME/etc/apps/scripts
All scripts should be run out of a bin/ directory inside your application directory:
$ mkdir $SPLUNK_HOME/etc/apps/scripts/bin
This example uses a small shell script top.sh:
#!/bin/sh
top -bn 1 # Linux only - different OSes have different parameters
Make sure the script is executable:
chmod +x $SPLUNK_HOME/etc/apps/scripts/bin/top.sh
Test that the script works by running it via the shell:
$SPLUNK_HOME/etc/apps/scripts/bin/top.sh
The script should print one snapshot of top output.
Add the script entry to inputs.conf in $SPLUNK_HOME/etc/apps/scripts/default/:
[script:///opt/splunk/etc/apps/scripts/bin/top.sh]
interval = 5 # run every 5 seconds
sourcetype = top # set sourcetype to top
source = script://./bin/top.sh # set source to name of script
props.conf
You may need to modify props.conf:
By default Splunk breaks the single top entry into multiple events.
The easiest way to fix this problem is to tell the Splunk server to break only before something
that does not exist in the output.

For example, adding the following to
$SPLUNK_HOME/etc/apps/scripts/default/props.conf forces all lines into a single event:
[top]
BREAK_ONLY_BEFORE = <stuff>
Since there is no timestamp in the top output we need to tell Splunk to use the current time. This is
done in props.conf by setting:
DATETIME_CONFIG = CURRENT
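Taken together, the props.conf stanza for this scripted input might look like the following, where <stuff> stands in for any pattern that never occurs in the top output:
[top]
BREAK_ONLY_BEFORE = <stuff>
DATETIME_CONFIG = CURRENT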
Whitelist and blacklist rules
For monitored inputs enabled via inputs.conf, use whitelist and blacklist rules to explicitly tell Splunk
which files to consume. When you define a whitelist, Splunk indexes ONLY the files in that list.
Alternately, when you define a blacklist, Splunk ignores the files in that list and consumes everything
else. You don't have to define both a whitelist and a blacklist; they are independent settings. If you happen to have both, and a file matches both of them, that file WILL NOT be indexed, i.e., _blacklist overrides _whitelist.
Note - If you define an input using ... or * wildcards, that will create an implicit whitelist. Further
'_whitelist' settings will be ignored.
Whitelist and blacklist rules use regular expression syntax to define the match on the file name/path.
Also, your rules must be contained within a configuration stanza, for example
[monitor://<path>]; those outside a stanza (global entries) are ignored.
Instead of whitelisting or blacklisting your data inputs, you can filter specific events and send them to
different queues or indexes. Read more about filtering and routing events to different queues and
filtering and routing events to alternate indexes. You can also use the crawl feature to predefine files
you want Splunk to index or not index automatically when they are added to your filesystem.
Define whitelist and blacklist entries with exact regex syntax; the "..." wildcard is not supported.
Whitelist (allow) files
To define the files you want Splunk to exclusively index, add the following line to your monitor
stanza in $SPLUNK_HOME/etc/system/local/inputs.conf:
_whitelist = $YOUR_CUSTOM_REGEX
For example, if you want Splunk to monitor only files with the .log extension:
[monitor:///mnt/logs]
_whitelist = \.log$
You can whitelist multiple files in one line, using the "|" (OR) operator. For example, to whitelist
filenames that contain query.log OR my.log:
_whitelist = query\.log$|my\.log$
Or, to whitelist exact matches:
_whitelist = /query\.log$|/my\.log$
Note: The "$" anchors the regex to the end of the line. There is no space before or after the "|"
operator.
Blacklist (ignore) files
To define the files you want Splunk to exclude from indexing, add the following line to your monitor
stanza in $SPLUNK_HOME/etc/system/local/inputs.conf:
_blacklist = $YOUR_CUSTOM_REGEX
Important: If you create a _blacklist line for each file you want to ignore, Splunk activates only
the last filter.
If you want Splunk to ignore and not monitor only files with the .txt extension:
[monitor:///mnt/logs]
_blacklist = \.(txt)$
If you want Splunk to ignore and not monitor all files with either the .txt extension OR the .gz
extension (note that you use the "|" for this):
[monitor:///mnt/logs]
_blacklist = \.(txt|gz)$
If you want Splunk to ignore entire directories beneath a monitor input refer to this example:
[monitor:///mnt/logs]
_blacklist = (archive|historical|\.bak$)
The above example tells Splunk to ignore all files under /mnt/logs/ within the archive or historical directories, and to ignore all files ending in .bak.
If you want Splunk to ignore files that contain a specific string you could do something like this:
[monitor:///mnt/logs]
_blacklist = 2009022[89]file\.txt$
The above example will ignore the webserver20090228file.txt and webserver20090229file.txt files
under /mnt/logs/.
Verify your lists
To verify that your whitelist and blacklist rules are configured properly, run the listtails utility
found in your $SPLUNK_HOME/bin directory. listtails reads in the configuration of
inputs.conf in all application directories, scans the directories, and displays an exact list of files
that Splunk will monitor when you restart.
In your $SPLUNK_HOME/bin directory, run:
./splunk cmd listtails
Crawl
Use crawl to search your filesystem for new data sources to add to your index. Configure one or more
types of crawlers in crawl.conf to define the type of data sources to include in or exclude from your
results.
Configuration
Edit $SPLUNK_HOME/etc/system/local/crawl.conf to configure one or more crawlers that
browse your data sources when you run the crawl command. Define each crawler by specifying
values for each of the crawl attributes. Enable the crawler by adding it to crawlers_list.
Crawl logging
The crawl command produces a log of crawl activity that's stored in
$SPLUNK_HOME/var/log/splunk/crawl.log. Set the logging level with the logging key in the
[default] stanza of crawl.conf:
[default]
logging = <warn | error | info | debug>
Enable crawlers
Enable a crawler by listing the crawler specification stanza name in the crawlers_list key of the
[crawlers] stanza.
Use a comma-separated list to specify multiple crawlers.
Enable crawlers that are defined in the stanzas: [file_crawler], [port_crawler], and
[db_crawler].
[crawlers]
crawlers_list = file_crawler, port_crawler, db_crawler
Define crawlers
Define a crawler by adding a definition stanza in crawl.conf. Add additional crawler definitions by
adding additional stanzas.
Example crawler stanzas in crawl.conf:
[Example_crawler_name]
....
[Another_crawler_name]
....
Add key/value pairs to crawler definition stanzas to set a crawler's behavior. The following keys are
available for defining a file_crawler:
bad_directories_list: Specify directories to exclude.
bad_extensions_list: Specify file extensions to exclude.
bad_file_matches_list: Specify a string, or a comma-separated list of strings, that filenames must contain to be excluded. You can use wildcards (examples: foo*.*, foo*bar, *baz*).
packed_extensions_list: Specify extensions of common archive filetypes to include. Splunk unpacks compressed files before it reads them. It can handle tar, gz, bz2, tar.gz, tgz, tbz, tbz2, zip, and z files. Leave this empty if you don't want to add any archive filetypes.
collapse_threshold: Specify the minimum number of files a source must have to be considered a directory.
days_sizek_pairs_list: Specify a comma-separated list of age (days) and size (kb) pairs to constrain what files are crawled. For example, days_sizek_pairs_list = 7-0, 30-1000 tells Splunk to crawl only files last modified within 7 days and at least 0kb in size, or modified within the last 30 days and at least 1000kb in size.
big_dir_filecount: Set the maximum number of files a directory can have in order to be crawled. crawl excludes directories that contain more than the maximum number you specify.
index: Specify the name of the index to which you want to add crawled file and directory contents.
max_badfiles_per_dir: Specify how far to crawl into a directory for files. If Splunk crawls a directory and doesn't find valid files within the specified max_badfiles_per_dir, then Splunk excludes the directory.
root: Specify directories for a crawler to crawl through.
Example
Here's what an example crawler called simple_file_crawler might look like:
[simple_file_crawler]
bad_directories_list= bin, sbin, boot, mnt, proc, tmp, temp, home, mail, .thumbnails, cache, old
bad_extensions_list= mp3, mpg, jpeg, jpg, m4, mcp, mid
bad_file_matches_list= *example*, *makefile, core.*
packed_extensions_list= gz, tgz, tar, zip
collapse_threshold= 10
days_sizek_pairs_list= 3-0,7-1000, 30-10000
big_dir_filecount= 100
index=main
max_badfiles_per_dir=100
Windows inputs
Configure Splunk for Windows to index your Windows Application, System, and Security event logs.
Splunk for Windows can also monitor and index changes to your registry and accept WMI data input.
This functionality is not yet exposed in Splunk Web or the CLI.
In addition to the information in this topic and the subsequent topics on Windows inputs, you can
watch this step-by-step video that covers installing and configuring inputs for Splunk on Windows.
When you run the Splunk Windows installer, you are given the option to set up indexing and/or
monitoring for the event logs, the registry, and for WMI. If you choose to do this, the default values for
these settings are assumed. Once you have completed the installation, you can then make changes
to the default values set by the installation process.
If you want to make changes to the default values, edit a copy of inputs.conf in
$SPLUNK_HOME\etc\system\local\. You only have to provide values for the attributes you want
to change within the stanza. For more information about how to work with Splunk configuration files,
refer to How configuration files work.
At a high level, here are the basic steps to get data into Splunk on Windows. Use the more detailed
information in this and the next topics in this manual to proceed:
1. Copy inputs.conf from $SPLUNK_HOME\etc\system\default to $SPLUNK_HOME\etc\system\local.
2. Un-mark it "Read Only".
3. Open and enable the Windows Event Log inputs.
4. Enable the Registry and WMI scripted inputs.
5. Copy wmi.conf from $SPLUNK_HOME\etc\system\default to $SPLUNK_HOME\etc\system\local.
6. Un-mark it "Read Only".
7. Enable local WMI polling.
8. Restart Splunk.
Configure indexing for Windows event logs
Windows event logs are stored in binary *.evt files and cannot be monitored like flat files. The
settings for which event logs to index are in the following stanza in inputs.conf:
# Windows platform specific input processor.
[WinEventLog:Application]
disabled = 0
[WinEventLog:Security]
disabled = 0
[WinEventLog:System]
disabled = 0
You can configure Splunk to read non-default Windows event logs as well, but you must import them
to the Windows Event Viewer first, and then add them to your local copy of inputs.conf, (usually in
$SPLUNK_HOME\etc\system\local\inputs.conf) as follows:
[WinEventLog:DNS Server]
disabled = 0
[WinEventLog:Directory Service]
disabled = 0
[WinEventLog:File Replication Service]
disabled = 0
To disable indexing for an event log, add disabled = 1 below its listing in the stanza in
$SPLUNK_HOME\etc\system\local\inputs.conf.
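For example, to stop indexing the Security event log, you would add:
[WinEventLog:Security]
disabled = 1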
Configure Windows registry monitoring input
The global settings for Windows registry monitoring are in the following stanza in inputs.conf:
[script://$SPLUNK_HOME\bin\scripts\splunk-regmon.py]
interval = 60
sourcetype = WinRegistry
source = WinRegistry
disabled = 0
Note: The Splunk registry input monitoring script (splunk-regmon.py) is configured as a scripted
input. Do not change this value.
source: labels these events as coming from the registry.
sourcetype: assigns these events as registry events.
interval: specifies how frequently to poll the registry for changes, in seconds.
disabled: indicates whether the feature is enabled. Set this to 1 to disable this feature.
The Windows registry monitoring functionality uses two additional configuration files that are
described in Windows registry input. You may wish to review that page before proceeding.
Note: You must use two backslashes \\ to escape wildcards in stanza names in inputs.conf.
Regexes with backslashes in them are not currently supported when specifying paths to files.
Windows Management Instrumentation (WMI) input
Splunk supports WMI (Windows Management Instrumentation) data input for agentless access to
Windows performance data and event logs. This means you can pull event logs from all the Windows
servers and desktops in your environment without having to install anything on those machines.
The Splunk WMI data input can connect to multiple WMI providers and pull data from them. The WMI
data input runs as a separate process (splunk-wmi.exe) on the Splunk server. It is configured as a
scripted input in $SPLUNK_HOME\etc\system\default\inputs.conf. Do not edit this file.
Note: This feature is only available on the Windows versions of Splunk and is NOT enabled by
default. To enable it, add the following line to
$SPLUNK_HOME\etc\system\local\inputs.conf:
[script://$SPLUNK_HOME\bin\scripts\splunk-wmi.py]
disabled = 0
Important: There is an issue with stopping and restarting Splunk currently affecting users of remote
WMI polling. If one or more of your WMI sources is unavailable at the time that you stop Splunk,
Splunk will not come back up unless you wait for the splunk-wmi.exe process to exit, or kill it
manually. To avoid this issue, do not unnecessarily list non-existent/non-functioning machines in
wmi.conf.
Security and remote access considerations
Splunk requires privileged access to index many Windows data sources, including WMI, Event Log,
and the registry. This includes both the ability to connect to the box, as well as permissions to read
the appropriate data once connected. To access WMI data, Splunk must run as a user with
permissions to perform remote WMI connections. This user name must be a member of an Active
Directory domain and must have appropriate privileges to query WMI. Both the Splunk server making
the query and the target systems being queried must be part of this Active Directory domain.
Note: If you installed Splunk as the LOCAL SYSTEM user, WMI remote authentication will not work;
this user has null credentials and Windows servers normally disallow such connections.
There are several things to consider:
For remote data collection via WMI, the Splunk service must run as a user who has sufficient
OS privileges to access the WMI resources you wish to poll. At a minimum, Splunk requires
access to the following privileges on every machine you poll:
Profile System Performance
Access this Computer from the Network
The simplest way to ensure Splunk has access to these resources is to add Splunk's
user to the Performance Log Users and Distributed COM Users Domain groups. If
these additions fail to provide sufficient permissions, add Splunk's user to the remote
machine's Administrators group.
You must enable DCOM for remote machine access, and it must be accessible to Splunk's
user. See the Microsoft topic about Securing a Remote WMI Connection for more information.
Adding Splunk's user to the Distributed COM Users local group is the fastest way to enable
this permission. If this fails to provide sufficient permissions, add Splunk's user to the remote
The WMI namespace that Splunk accesses (most commonly root\cimv2) must have proper
permissions set. Enable the following permissions on the WMI tree at root for the Splunk user:
Execute Methods, Enable Account, Remote Enable, and Read Security.
See the Microsoft how-to HOW TO: Set WMI Namespace Security in Windows Server
2003 for more information.
If you have a firewall enabled, you must allow access for WMI. If you are using the Windows
Firewall, the exceptions list explicitly lists WMI. You must set this exception for both the
originating and the remote machine. See the Microsoft topic about Connecting to WMI
Remotely Starting with Vista for more details.
Test access to WMI
Follow these steps to test the configuration of the Splunk server and the remote machine:
1. Log into the machine Splunk runs on as the user Splunk runs as.
2. Click Start -> Run and type wbemtest. The wbemtest application starts.
3. Click Connect and type \\<server>\root\cimv2, replacing <server> with the name of the
remote server. Click Connect. If you are unable to connect, there is a problem with the authentication
between the machines.
4. If you are able to connect, click Query and type select * from win32_service. Click Apply.
After a short wait, you should see a list of running services. If this does not work, then the
authentication works, but the user Splunk is running as does not have enough privileges to run that
operation.
Configure WMI input
Look at wmi.conf to see the default values for the WMI input. If you want to make changes to the
default values, edit a copy of wmi.conf in $SPLUNK_HOME\etc\system\local\. Only set values
for the attributes you want to change for a given type of data input. Refer to How configuration files
work for more information about how Splunk uses configuration files.
[settings]
initial_backoff = 5
max_backoff = 20
max_retries_at_max_backoff = 2
result_queue_size = 1000
checkpoint_sync_interval = 2
heartbeat_interval = 500
[WMI:AppAndSys]
server = foo, bar
interval = 10
event_log_file = Application, System, Directory Service
disabled = 0
[WMI:LocalSplunkWmiProcess]
interval = 5
wql = select * from Win32_PerfFormattedData_PerfProc_Process where Name = "splunk-wmi"
disabled = 0
The [settings] stanza specifies runtime parameters. The entire stanza and every parameter
within it are optional. If the stanza is missing, Splunk assumes system defaults.
The following attributes control how the agent reconnects to a given WMI provider when an
error occurs. All times are in seconds:
initial_backoff: how much time to wait the first time after an error occurs before
trying to reconnect. Thereafter, if errors keep occurring, the wait time doubles, until it
reaches max_backoff.

max_backoff: the maximum amount of time to wait before invoking
max_retries_at_max_backoff.

max_retries_at_max_backoff: if the wait time reaches max_backoff, try this
many times at this wait time. If the error continues to occur, Splunk will not reconnect to
the WMI provider in question until the Splunk services are restarted.


result_queue_size: size of the queue that ensures that WMI providers don't end up
blocking while waiting for data to be written to the output. Results received from the WMI
providers are put into this queue.

checkpoint_sync_interval: minimum wait time for state data (event log checkpoint) to be
written to disk. In seconds.

heartbeat_interval: the thread that manages the connection to WMI providers will
execute at this interval. In milliseconds.
You can specify two types of data input: event log and raw WQL (WMI query language). The event
log input stanza contains the event_log_file parameter, and the WQL input stanza contains wql.
The common parameters for both types are:
server: a comma-separated list of servers from which to pull data. If this parameter is
missing, Splunk assumes the local machine.

interval: how often to poll for new data, in seconds. Required.
disabled: indicates whether this feature is enabled or disabled. Set this parameter to 1 to
disable WMI input into Splunk.

WQL-specific parameters:
namespace: specifies the path to the WMI provider. The local machine must be able to
connect to the remote machine using delegated authentication. This attribute is optional. If you
don't specify a path to a remote machine, Splunk will connect to the default local namespace
(root\cimv2), which is where most of the providers you are likely to query reside. Microsoft
provides a list of namespaces for Windows XP and later versions of Windows.

wql: provides the WQL query. The example above polls data about a running process named
splunk-wmi every 5 seconds.

Event log-specific parameter: event_log_file: specify a comma-separated list of log files to poll in
the event_log_file parameter. File names that include spaces are supported, as shown in the
example.
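As a sketch of how these parameters fit together, the following hypothetical WQL stanza polls the process
table on two remote servers every 30 seconds. The stanza name and server names are placeholders; add a
namespace attribute only if you need something other than the default root\cimv2:
[WMI:RemoteProcesses]
server = webserver1, webserver2
interval = 30
wql = select * from Win32_Process
disabled = 0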
Fields for WMI data
All events received from WMI have the source set to wmi.
For event log data, the source type is set to WinEventLog:<name of log file> (for
example WinEventLog:Application).

For WQL data, the source type is set to the name of the config stanza (for example, for a
stanza named [WMI:LocalSplunkdProcess], the field is set to
WMI:LocalSplunkdProcess).

The host is identified automatically from the data received.
Windows registry input
Windows registry input
Splunk supports the capture of Windows registry settings and lets you monitor changes to the
registry. You can know when registry entries are added, updated, and deleted. When a registry entry
is changed, Splunk captures the name of the process that made the change and the key path from
the hive to the entry being changed.
The Windows registry input monitor application runs as a process called splunk-regmon.exe.
Note: This feature is not currently supported on Windows 2000 due to an issue with a Windows 2000
dll (PSAPI.DLL).
Warning: Do not stop or kill the splunk-regmon.exe process manually; this could result in system
instability. To stop the process, stop the Splunk server process from the Windows Task Manager or
from within Splunk Web.
How it works
Windows registries can be extremely dynamic (thereby generating a great many events). Splunk
provides a two-tiered configuration for fine-tuning the filters that are applied to the registry event data
coming into Splunk.
Splunk Windows registry monitoring uses two configuration files to determine what to monitor on your
system, sysmon.conf and regmon-filters.conf, both located in
$SPLUNK_HOME\etc\system\local\. These configuration files work as a hierarchy:
sysmon.conf contains global settings for which event types (adds, deletes, renames, and so
on) to monitor, which regular expression filters from the regmon-filters.conf file to use,
and whether or not Windows registry events are monitored at all.

regmon-filters.conf contains the specific regular expressions you create to refine and
filter the hive key paths you want Splunk to monitor.

sysmon.conf contains only one stanza, where you specify:
event_types: the superset of registry event types you want to monitor. Can be delete,
set, create, rename, open, close, query.

active_filters: the list of regular expression filters you've defined in
regmon-filters.conf that specify exactly which processes and hive paths you want
Splunk to monitor. This is a comma-separated list of the stanza names from
regmon-filters.conf. You can use wildcards, which can be useful in case you want to
name and invoke groups of related or similar filters based on a naming convention. If a given
filter is not named in this list, it will not be used, even if it is present in
regmon-filters.conf. This means you can turn on and off monitoring for various filters or
groups of filters as desired.

disabled: whether or not to monitor registry settings changes. Set this to 1 to disable
Windows registry monitoring altogether.

Each stanza in regmon-filters.conf represents a particular filter whose definition includes:
proc: a regular expression containing the path to the process or processes you want to
monitor

hive: a regular expression containing the hive path to the entry or entries you want to monitor.
Splunk supports the root key value mappings predefined in Windows:
\\REGISTRY\\USER\\ maps to HKEY_USERS or HKU
\\REGISTRY\\USER\\ maps to HKEY_CURRENT_USER or HKCU
\\REGISTRY\\USER\\_Classes maps to HKEY_CLASSES_ROOT or HKCR
\\REGISTRY\\MACHINE maps to HKEY_LOCAL_MACHINE or HKLM
\\REGISTRY\\MACHINE\\SOFTWARE\\Classes maps to HKEY_CLASSES_ROOT or HKCR
\\REGISTRY\\MACHINE\\SYSTEM\\CurrentControlSet\\Hardware Profiles\\Current maps to HKEY_CURRENT_CONFIG or HKCC

type: the subset of event types to monitor. Can be delete, set, create, rename,
open, close, query. The values here must be a subset of the values for event_types
that you set in sysmon.conf.

baseline: whether or not to capture a baseline snapshot for that particular hive path. 0 for no
and 1 for yes.

baseline interval: how long Splunk has to have been down before re-taking the
snapshot, in seconds. The default value is 24 hours. (See the example sketch after this list.)
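Putting these attributes together, here is a sketch of a narrowly scoped registry monitoring setup. The
sysmon.conf stanza name, the filter name, and the event_types format are assumptions modeled on the
[ProcessMonitor] defaults shown in the Windows process monitoring topic that follows; check the files
shipped with your installation for the exact names:
# $SPLUNK_HOME\etc\system\local\sysmon.conf (stanza name is an assumption)
[RegistryMonitor]
event_types = set.*|create.*|delete.*
active_filters = run_key_filter
disabled = 0
# $SPLUNK_HOME\etc\system\local\regmon-filters.conf (filter name is a placeholder)
[run_key_filter]
proc = .*
hive = \\REGISTRY\\MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\\.*
type = set|create|delete
baseline = 1
baseline interval = 86400
With this filter active, only set, create, and delete events under the Run key are captured, and the baseline
snapshot for that key is retaken if Splunk has been down for more than 86400 seconds (24 hours).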

Get a baseline snapshot
When you install Splunk, you're given the option of recording a baseline snapshot of your registry
hives the next time Splunk starts. By default, the snapshot covers the entirety of the user keys and
machine keys hives. It also establishes a timeline for when to retake the snapshot; by default, if
Splunk has been down for more than 24 hours since the last checkpoint, it will retake the baseline
snapshot. You can customize this value for each of the filters in regmon-filters.conf by setting
the value of baseline interval.
Note: Executing a splunk clean all -f deletes the current baseline snapshot.
What to consider
When you install Splunk on a Windows machine and enable registry monitoring, you specify which
major hive paths to monitor: the user keys (HKEY_USERS) and/or the local machine keys (HKEY_LOCAL_MACHINE). Depending on how
dynamic you expect the registry to be on this machine, checking both could result in a great deal of
data for Splunk to monitor. If you're expecting a lot of registry events, you may want to specify some
filters in regmon-filters.conf to narrow the scope of your monitoring immediately after you
install Splunk and enable registry event monitoring but before you start Splunk up.
Similarly, you have the option of capturing a baseline snapshot of the current state of your Windows
registry when you first start Splunk, and again every time a specified amount of time has passed. The
baselining process can be somewhat processor-intensive, and may take several minutes. You can
postpone taking a baseline snapshot until you've edited regmon-filters.conf and narrowed the
scope of the registry entries to those you specifically want Splunk to monitor.
Configure Windows registry input
Look at inputs.conf to see the default values for Windows registry input. They are also shown below.
If you want to make changes to the default values, edit a copy of inputs.conf in
$SPLUNK_HOME\etc\system\local\. You only have to provide values for the parameters you
want to change within the stanza. For more information about how to work with Splunk configuration
files, refer to How do configuration files work?
[script://$SPLUNK_HOME\bin\scripts\splunk-regmon.py]
interval = 60
sourcetype = WinRegistry
source = WinRegistry
disabled = 0
source: labels these events as coming from the registry.
sourcetype: assigns these events as registry events.
interval: specifies how frequently to poll the registry for changes, in seconds.
disabled: indicates whether the feature is enabled. Set this to 1 to disable this feature.
Windows process monitoring
Windows process monitoring
Starting with version 3.4.2 of Splunk, you can enable native Windows process monitoring within
Splunk. Because this can generate a high volume of events, this is not enabled by default. If you
enable this feature, you can reduce the volume of events by creating regular expressions to filter out
data you do not want sent to Splunk using the information in this topic.
It works the same way as configuring registry monitoring:
sysmon.conf contains global settings for which event types to monitor, which regular
expression filters from the procmon-filters.conf file to use, and whether or not Windows
process events are monitored at all.

procmon-filters.conf contains the specific regular expressions you create to refine and
filter the process events you want Splunk to monitor.

sysmon.conf contains a stanza called [ProcessMonitor], where you specify:
event_types: the superset of process event types you want to monitor. Can be create,
exit, image.

active_filters: the list of regular expression filters you've defined in
procmon-filters.conf that specify exactly which processes you want Splunk to monitor.
This is a comma-separated list of the stanza names from procmon-filters.conf. You can
use wildcards, which can be useful in case you want to name and invoke groups of related or
similar filters based on a naming convention. If a given filter is not named in this list, it will not
be used, even if it is present in procmon-filters.conf. This means you can turn on and off
monitoring for various filters or groups of filters as desired.

disabled: whether or not to monitor process events. Set this to 1 to disable Windows process
monitoring altogether.

inclusive: whether the filters for this monitor are inclusive or exclusive. Values can be
0 or 1; the default value is 1 (inclusive).

filter_file_name: specifies the name of the file containing the filters for this monitor,
which should be procmon-filters.conf by default.

Each stanza in procmon-filters.conf represents a particular filter whose definition includes:
proc: a regular expression containing the path to the process or processes you want to
monitor

hive: not used in this context; it is specific to the registry monitor and should always be set to ".*"
(dot star).

type: the subset of event types to monitor. The values here must be a subset of the values for
event_types that you set in sysmon.conf.

Note: You must restart Splunk if you change these configuration files.
The following are the default settings if you enable process monitoring:
The stanza in sysmon.conf:
[ProcessMonitor]
filter_file_name = procmon-filters
event_types = create.*|exit.*|image.*
active_filters = "not-splunk-optimize"
inclusive = 0
disabled = 1
The corresponding individual filter stanzas in procmon-filters.conf:
[default]
hive = .*
[not-splunk-optimize]
proc = splunk-optimize.exe
type = create|exit|image
Data Distribution
How data distribution works
How data distribution works
Splunk servers running on any supported OS platform can forward data to one another (as well as to
other systems) in real time. This setup allows data inputs gathered on one Splunk server in a specific
environment to be sent to another Splunk server for indexing and search. Also, Splunk servers can
forward data to groups of other Splunk servers, to enable horizontal scaling via clustered indexing.
Splunk servers can also clone data to multiple groups of other Splunk servers to provide for data
redundancy in high availability environments.
Data distribution covers all configurations in which one Splunk server (the forwarder) is sending data
to one or more Splunk servers (the receivers) prior to being indexed. The forwarder can also index
data locally.
Note: All Splunk instances in a distributed cluster must be running the same version of Splunk,
although they can be running on any of the supported OSes.
Important: Beginning with 3.4.2, users running Splunk with the Free license can set their instance to
receive data from a forwarder. For earlier versions of Splunk, users must have an Enterprise license to
change this distributed setting on each receiving Splunk instance.
Forwarding
Forwarding is the simplest forwarding-and-receiving setup: a forwarder is any server that
sends its data to another server for indexing.
Learn how to enable forwarding and receiving.
Routing
With routing enabled, the forwarder matches conditions based on patterns in the events themselves
to selectively send some events to one other server and other events to another server.
Learn how to enable data routing.
Cloning
Cloning refers specifically to a forwarder sending every event to two or more other Splunk servers to
provide for data redundancy.
Learn how to enable cloning.
Data balancing
Data balancing refers to data that is sent in a balanced fashion to groups of servers. This setup
supports large volumes of data: each forwarder sends data to some number of receivers in a
round-robin fashion, and each receiver indexes the data it gets.
Data balanced target groups are made up of multiple servers. Learn how to set up data balancing.
Buffering during data balancing
If a server becomes inaccessible during data balancing, Splunk continues to send events to all
accessible servers.
Eventually, Splunk stops trying to send to an unresponsive server, and notes that the server has
gone off line. If all servers are inaccessible, Splunk writes to a buffer on the forwarder's side.
Data buffering values are set in outputs.conf on the forwarding side.
Target groups
Rather than output data to one receiver, forwarders can send to target groups. Target groups are
made of one or more receiving servers:
[target group 1]
server 1, server 2
[target group 2]
server 3
[target group 3]
server 4, server 5, server 6
Cloning sends every event to all target groups; routing sends specific events to one target group and
different events to other target groups. You can also set up default groups, which receive all the data
not sent to target groups. If more than one group is specified, Splunk clones events to all listed
default groups.
defaultGroup=<groupname1>,<groupname2>...
Learn more about target group configuration.
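In outputs.conf terms (covered in detail later in this manual), target groups and a default group are
declared roughly as in this minimal sketch; the group names and addresses are placeholders:
[tcpout]
defaultGroup=groupA
[tcpout:groupA]
server=10.1.1.10:9997
[tcpout:groupB]
server=10.1.1.20:9997, 10.1.1.21:9997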
Security
Any Splunk server can route some or all of its incoming data in real time to other Splunk servers and
to other systems via TCP, either in clear text or via SSL. Learn how to set up SSL.
Send to 3rd party systems
By default, data is routed between Splunk servers as cooked data -- meaning events have been
parsed and tagged. However, Splunk can be configured to either receive or send raw data in order to
interact with third party systems.
Learn how to configure Splunk to send to or receive from third party software.
Distributed search
Splunk servers can distribute search requests to other Splunk servers and merge the results back to
the user. Distributed search combines with balanced indexing to provide horizontal scaling, allowing
you to search and index hundreds of gigabytes or terabytes per day. Additionally, distributed search
allows select users to correlate data across different data silos.
Learn more about distributed search.
Configuration files for data distribution
The forwarder uses the TCP output processor, configured by outputs.conf.
Configure the receiver via inputs.conf.
Conditions for routing are established in transforms.conf and linked to specific sources, source
types or hosts in props.conf.

Enable forwarding and receiving
Enable forwarding and receiving
Version 3.4 of Splunk includes the Splunk forwarder and light forwarder configurations, packaged as
Splunk applications. You can enable and disable these configurations as desired, in conjunction with
the information and procedures described in this topic.
For a general overview of how forwarding and receiving work, please read the introduction to
forwarding and receiving.
Important: If you are configuring forwarding and receiving, your receiving Splunk instance must run
the same version or a later version of Splunk as your forwarders.
Important: Beginning with 3.4.2, users running Splunk with the Free license can set their instance to
receive data from a forwarder. In earlier versions of Splunk, users needed an Enterprise license to
change this distributed setting.
Read this before you enable Splunk forwarder or light forwarder
Splunk Web is turned off in the forwarder and light forwarder to reduce the footprint of Splunk on the
forwarding host. Therefore, if you want to use Splunk Web to configure your forwarding Splunk
instance, do this before you enable forwarding. After you enable forwarding, you can only configure
your forwarder via the Splunk CLI.
You must configure a receiver before setting up forwarding. This way, the Splunk receiving host is
prepared for the forwarded data. Then, configure your forwarder(s). Follow these general steps to
deploy Splunk forwarders and light forwarders effectively.
First, enable a Splunk server to receive data:
1. Decide which machine to use as a receiver.
2. Configure it to receive data using these instructions.
Note: Your receiving Splunk instance must be running the same version of Splunk as your
forwarders, or a later version.
Then, on the forwarding Splunk instance:
1. Install Splunk on the machine that will be forwarding data.
2. Point your forwarder at the receiver using these instructions. You have the option of enabling local
indexing at this time, which means that any data that is forwarded is also indexed locally. This applies
to any pre-existing data on the forwarder as well.
3. Use Splunk Web or the CLI to add inputs as described here. Data from these inputs is sent via the
forwarder to the receiver as soon as you do this (and is also indexed locally if you've configured local
indexing).
4. Then, use Splunk Web or the CLI to enable Splunk forwarder or light forwarder.
5. Install applications on your light forwarder. Specifically, install any applications that you're running
on your receiver that also contain inputs.conf.
After you configure a Splunk instance to forward data, add any additional settings, such as routing,
cloning, filtering or data balancing. Configuration changes are done on the forwarder side, on the host
that is reading the data input.
Note: If you are running a version of Splunk that is older than 3.4.2, you must have an Enterprise
license on the receiver. Splunk instances before 3.4.2 running with the default license can forward but
cannot receive data. Customers who require Enterprise features (such as authentication) on
forwarding instances of Splunk can enable the
$SPLUNK_HOME/etc/splunk-forwarder.license file. Alternately, you can upgrade to 3.4.2 or
later and enable receiving without an Enterprise license.
Receiving
Follow these instructions to configure a Splunk instance as a receiver.
Note: Your receiving Splunk instance must be running the same (or later) version of Splunk as your
forwarders. For example, a 3.3 receiver can accept traffic from forwarders running earlier versions. A
3.2 receiver cannot accept connections from a 3.3 forwarder.
via Splunk Web
Enable receiving via Splunk Web.
Navigate to Splunk Web on the server that will receive data for indexing.
Click the Admin link in the upper right hand corner of Splunk Web.
Select the Distributed tab.
Click Receive Data.
To begin receiving data:
Set the radio button to Yes.
Specify the port that you want Splunk to listen on. This is also the port that Splunk
instances use to forward data to this server.

Click the Save button to commit the configuration. Restart Splunk for your changes to
take effect.


via Splunk CLI
Enable receiving from Splunk's CLI. To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/
directory and use the ./splunk command. Alternatively, add Splunk to your path and use the splunk
command.
To log in:
./splunk login
Splunk username: admin
Password:
To enable receiving:
# ./splunk enable listen 42099 -auth admin:changeme
Listening for Splunk data on TCP port 42099.
To disable receiving:
# ./splunk disable listen -auth admin:changeme
No longer listening for Splunk TCP data.
You need to restart the Splunk Server for your changes to take effect.
Forwarding
You must first configure your receiving Splunk host using the instructions above before configuring
forwarders.
via Splunk Web
Enable forwarding via Splunk Web.
Navigate to Splunk Web on the server that will be forwarding data for indexing.
Click the Admin link in the upper right-hand corner of Splunk Web.
Select the Distributed tab.
Click Forward Data.
To begin forwarding data:
Set the Forward data to other Splunk Servers? radio button to Yes.
Specify whether you want to keep a copy of the data local in addition to forwarding or just
forward. Keeping a local copy allows you to search from the local server, but requires more
space and memory.

Specify the Splunk server(s) and port number to forward data to. The port number should be
the same one that you specified when you configured receiving.

Click the Save button to commit the configuration. Restart Splunk for your changes to take
effect.

via Splunk CLI
Enable forwarding from the Splunk CLI. Navigate to your $SPLUNK_HOME/bin directory on the
forwarding server and log in to the CLI. Alternatively, add Splunk to your path and use the splunk command.
./splunk login
Splunk username: admin
Password:
To enable forwarding:
# ./splunk add forward-server <host:port> -auth admin:changeme
where <host:port> are the hostname and port of the Splunk server to which this forwarder or light
forwarder should send data.
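For example, to point a forwarder at a hypothetical receiver named indexer1.example.com listening on
port 9997 (hostname and port are placeholders):
# ./splunk add forward-server indexer1.example.com:9997 -auth admin:changeme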
To disable forwarding:
# ./splunk remove forward-server <host:port> -auth admin:changeme
where <host:port> are the hostname and port of the Splunk server to which this forwarder or light
forwarder is currently sending data.
Note: Although this command disables the forwarding activity, this machine will still be configured as
a Splunk forwarder or light forwarder.
Configure target groups in outputs.conf
Configure target groups in outputs.conf
Configure outputs.conf to send to multiple groups of one or more servers, called target groups. Also,
you can set up a default group, made up of one or more target groups, which receives all the data
not sent to target groups. If there is more than one group specified in the default group, Splunk clones
events to all listed default groups.
Note: While forwarding, events are stored in memory. If any receiver goes down, Splunk buffers the
events in memory on the forwarder. Also, by default, time extraction is based on the timestamp in the
event, not when Splunk receives the event. If you want to change this default behavior while
forwarding, configure your forwarder to turn off timestamping, in which case Splunk uses the time the
forwarder saw the event.
Configuration
Default group and global settings
Add your default group stanza to $SPLUNK_HOME/etc/system/local/outputs.conf on the
forwarding server.
[tcpout]
defaultGroup= Group1, Group2, ...
attribute1 = val1
attribute2 = val2
...
If you have no default group, set global settings in the [tcpout] stanza.
Note: Settings for your default group are global and inherited by all target groups. Override these
settings by creating explicit rules for each target group.
Target groups
Add any number of target group stanzas to $SPLUNK_HOME/etc/system/local/outputs.conf
on the forwarding server.
[tcpout:$TARGET_GROUP]
server=$IP:$PORT, $IP2:$PORT2...
attribute1 = val1
attribute2 = val2
...
Note: If your target group is made up of more than one $IP:$PORT, the forwarder sends events in a
round robin between these URIs.
Optional attributes
There are a number of optional attributes you can set in outputs.conf.
sendCookedData=true/false
If true, events are cooked (have been processed by Splunk and are not raw)
If false, events are raw and untouched prior to sending
Defaults to true

heartbeatFrequency=60
How often in seconds to send a heartbeat packet to the receiver
Heartbeats are only sent if sendCookedData=true
Defaults to 30 seconds

Queue settings
Your data stream enters a queue as it leaves the forwarder. There are a few queue settings you can
tweak in outputs.conf.
maxQueueSize=20000
The maximum number of queued events (queue size)
Defaults to 1000

usePersistentQueue=false
If set to true and the queue is full, write events to the disk
Directory is specified with persistentQueuePath
Defaults to false

maxPersistentQueueSizeInMegs=1000
The maximum size in megabytes of the disk file where the persistent queue stores its events
Defaults to 1000
dropEventsOnQueueFull=10
Wait N * 5 seconds before throwing out all new events until the queue has space.
Setting this to -1 or 0 causes the queue to block when it gets full, which in turn blocks the
processor chain.
When any target group's queue is blocked, no more data will reach any other target
group.
Using load-balanced groups is the best way to alleviate this condition, because multiple
receivers must be down (or jammed up) before queue blocking occurs.
Defaults to -1 (do not drop events)
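As a sketch of how these queue settings combine, the following hypothetical target group buffers up to
20,000 events in memory and spills to a persistent queue on disk when the in-memory queue fills. The group
name, server address, and queue path are placeholders:
[tcpout:bufferedGroup]
server=10.1.1.5:9997
maxQueueSize=20000
usePersistentQueue=true
persistentQueuePath=/opt/splunk/var/spool/queue
maxPersistentQueueSizeInMegs=500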

Single server
Add any number of single server stanzas to $SPLUNK_HOME/etc/system/local/outputs.conf
on the forwarding server. Use single server configuration to set up SSL and backoff settings (see
below). Servers indicated in single server stanzas must also be a part of a target group in order to
send data.
[tcpout-server://$IP:$PORT]
attribute1 = val1
attribute2 = val2
...
Backoff settings
Backoff settings are server specific, meaning they must be set in a
[tcpout-server://$IP:$PORT] stanza. They cannot be set for a target or default group.
If one of the target group servers becomes unreachable, you can configure the forwarder to retry the
connection. If a connection needs to be retried, the forwarder uses backoffAtStartup or
initialBackoff as the number of seconds to wait. After this time expires, the forwarder doubles
the number of seconds over and over again until reaching maxBackoff. When this is reached, the
forwarder stops doubling the number of seconds in between retries and uses the same maxBackoff
seconds. It retries at this frequency maxNumberOfRetriesAtHighestBackoff times or forever if
that value is -1.
backoffAtStartup=N
Defines how many seconds to wait until retrying the first time a retry is needed
Defaults to 5 seconds

initialBackoff=N
Defines how many seconds to wait until retrying every time other than the first time a
retry is needed

Defaults to 2 seconds

maxBackoff=N
Specifies the number of seconds before reaching the maximum backoff frequency.
Defaults to 20

maxNumberOfRetriesAtHighestBackoff=N
Specifies the number of times the system should retry after reaching the highest backoff
period before stopping completely.
-1 means to try forever.
It is suggested that you never change this from the default; otherwise, the forwarder will
eventually stop forwarding to a downed URI completely.
Defaults to -1 (forever)
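For example, a sketch of a per-server stanza that waits 10 seconds before the first retry and caps the wait
between retries at 60 seconds (the address is a placeholder):
[tcpout-server://10.1.1.197:9997]
backoffAtStartup=10
initialBackoff=2
maxBackoff=60
maxNumberOfRetriesAtHighestBackoff=-1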
Example
Specify a target group for an IP:PORT which consists of a single receiver. This is the simplest
possible configuration; it sends data to the host at 10.1.1.197 on port 9997.
[tcpout:group1]
server=10.1.1.197:9997
Specify a target group for a hostname which consists of a single receiver.
[tcpout:group2]
server=myhost.Splunk.com:9997
Specify a target group made up of two receivers. In this case, the data is balanced (round-robin)
between these two receivers. Specify as many receivers as you wish here. Combine host name and
IP if you wish.
[tcpout:group3]
server=myhost.Splunk.com:9997,10.1.1.197:6666
Send every event to a receiver at foo.Splunk.com:9997 and send heartbeats every 45 seconds with a
maximum queue size of 100,500 events.
[tcpout:group4]
server=foo.Splunk.com:9997
heartbeatFrequency=45
maxQueueSize=100500
Set the heartbeat frequency to 15 for each group and clone the events to groups indexer1 and
indexer2. Also, index all this data locally as well.
[tcpout]
heartbeatFrequency=15
indexAndForward=true
[tcpout:indexer1]
server=Y.Y.Y.Y:9997
[tcpout:indexer2]
server=X.X.X.X:6666
Data balance between Y.Y.Y.Y and X.X.X.X.
[tcpout:indexerGroup]
server=Y.Y.Y.Y:9997, X.X.X.X:6666
Clone events between two data balanced groups.
[tcpout:indexer1]
server=A.A.A.A:1111, B.B.B.B:2222
[tcpout:indexer2]
server=C.C.C.C:3333, D.D.D.D:4444
Set up routing
Set up routing
Enable routing to forward data from one Splunk server to another based on content. For example,
data may be routed to systems based on sourcetype, a custom indexed field, or the content of the
raw event. Routing allows you to specifically distribute events to any system.
Configuration
To set up routing:
First, decide which events to route to which servers.
Then, edit the props.conf, transforms.conf, and outputs.conf files on the forwarding servers.
props.conf
Edit $SPLUNK_HOME/etc/system/local/props.conf and set a TRANSFORMS-routing=
attribute:
[<spec>]
TRANSFORMS-routing=$UNIQUE_STANZA_NAME
<spec> can be:
<sourcetype>, the sourcetype of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
Use the $UNIQUE_STANZA_NAME when creating your entry in transforms.conf (below).
transforms.conf
Edit $SPLUNK_HOME/etc/system/local/transforms.conf and set rules to match your
props.conf stanza:
[$UNIQUE_STANZA_NAME]
REGEX=$YOUR_REGEX
DEST_KEY=_TCP_ROUTING
FORMAT=$UNIQUE_GROUP_NAME
$UNIQUE_STANZA_NAME must match the name you created in props.conf.
Enter the regex rules in $YOUR_REGEX to determine which events get conditionally routed.
DEST_KEY should be set to _TCP_ROUTING to send events via TCP
Set FORMAT to $UNIQUE_GROUP_NAME. This should match the group name you create in
outputs.conf

outputs.conf
Edit $SPLUNK_HOME/etc/system/local/outputs.conf and set which tcpout outputs go to
which servers or groups:
[tcpout:$UNIQUE_GROUP_NAME]
server=$IP:$PORT
Set $UNIQUE_GROUP_NAME to match the name you created in transforms.conf.
Set the IP address and port to match the receiving server.
Basic example
The following example sends all events with sourcetype="syslog" to one target group, all events
that contain the word error to another target group, and everything else to a third target group.
1. Edit $SPLUNK_HOME/etc/system/local/props.conf and set a TRANSFORMS-routing=
attribute:
[default]
TRANSFORMS-routing=errorRouting
[syslog]
TRANSFORMS-routing=syslogRouting
2. Edit $SPLUNK_HOME/etc/system/local/transforms.conf and set errorRouting and
syslogRouting rules:
[errorRouting]
REGEX=error
DEST_KEY=_TCP_ROUTING
FORMAT=errorGroup
[syslogRouting]
REGEX=.
DEST_KEY=_TCP_ROUTING
FORMAT=syslogGroup
3. Edit $SPLUNK_HOME/etc/system/local/outputs.conf and set which tcpout outputs go to
which servers or groups:
[tcpout]
defaultGroup=everythingElseGroup
[tcpout:syslogGroup]
server=10.1.1.197:9997
[tcpout:errorGroup]
server=10.1.1.200:9999
[tcpout:everythingElseGroup]
server=10.1.1.250:6666
Advanced example
This example combines routing, data balancing, and target-group-specific parameters. This
outputs.conf sends all events with sourcetype="syslog" to one balanced target group, all
events that contain the word error to a different target group, and clones everything else to two
target groups. The syslogGroup uses a persistent queue which lives in the /tmp directory and is
capped at a maximum on disk size of 100MB. The heartbeat frequency for all target groups is dialed
down to 10 seconds.
Note: Steps 1 and 2, props.conf and transforms.conf, are the same as the example above.
3. Edit $SPLUNK_HOME/etc/system/local/outputs.conf and set which tcpout outputs go to
which servers or groups:
[tcpout]
defaultGroup=everythingElseGroup1, everythingElseGroup2
heartbeatFrequency=10
[tcpout:syslogGroup]
server=10.1.1.197:9997, 10.1.1.198:7777
usePersistentQueue=true
persistentQueuePath=/tmp
maxPersistentQueueSizeInMegs=100
blockOnQueueFull=true
[tcpout:errorGroup]
server=10.1.1.200:9999
[tcpout:everythingElseGroup1]
server=10.1.1.240:6666
[tcpout:everythingElseGroup2]
server=10.1.1.245:5555
Route specific events to different queues
Route specific events to different queues
In a distributed Splunk setup, you can send specific data to different queues for further processing.
This topic discusses how to filter your data and send it specifically to nullQueue, Splunk's
equivalent of /dev/null.
To filter certain events out before your data is indexed, use the instructions below to send those
events to nullQueue.
Read more about how to filter and route to an alternate index.
Important: Where you choose to filter your data depends on your distributed setup. However, the
filtering needs to occur on the Splunk instance that parses the data; this may be either the indexer or
the forwarder instance.
Configuration
To filter out specific events:
1. Identify an attribute of the event that can be used to separate it from others.
2. Create an entry in props.conf for the source, source type or host and specify a TRANSFORMS
class and a TRANSFORMS name. The class name refers to a regular expression stanza you will
place in transforms.conf.
3. Create an entry in transforms.conf with a regular expression that matches the identified
attributes (from Step 1) and sets the DEST_KEY to queue and the FORMAT key to a specific queue
(indexQueue, parsingQueue, nullQueue, etc).
Use the $SPLUNK_HOME/etc/system/README/props.conf.example and
../transforms.conf.example as examples, or create your own props.conf and
transforms.conf. Make any changes in $SPLUNK_HOME/etc/system/local/, or your own
custom application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration
files in general, see how configuration files work.
props.conf
In $SPLUNK_HOME/etc/system/local/props.conf add the following stanza:
[<spec>]
TRANSFORMS-$NAME=$UNIQUE_STANZA_NAME
<spec> can be:
<sourcetype>, the sourcetype of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
$NAME is whatever unique identifier you want to give to your transform.
$UNIQUE_STANZA_NAME must match the stanza name of the transform you create in
transforms.conf.
transforms.conf
In $SPLUNK_HOME/etc/system/local/transforms.conf add the following stanza:
[$UNIQUE_STANZA_NAME]
REGEX = $YOUR_CUSTOM_REGEX
DEST_KEY = queue
FORMAT = nullQueue
Name your stanza with $UNIQUE_STANZA_NAME to match the name you specified in props.conf.
Add $YOUR_CUSTOM_REGEX based on the attribute you've identified; it should specify the key
term that identifies the events you want to remove.
Leave DEST_KEY and FORMAT with the above values to send identified events to the nullQueue
(delete them before indexing).
Send matching events to nullQueue
This example sends all sshd events from /var/log/messages to nullQueue.
In props.conf:
[source::/var/log/messages]
TRANSFORMS-null= setnull
In transforms.conf:
[setnull]
REGEX = \[sshd\]
DEST_KEY = queue
FORMAT = nullQueue
Send matching WMI events to nullQueue
For those using WMI to capture events from Windows machines, the props.conf syntax is specific to
the WMI source. This example filters out two different event codes (592 or 593) using an
"or" statement in regex.
In props.conf:
[wmi]
TRANSFORMS-foo=wminull
In transforms.conf:
[wminull]
REGEX=(?m)^EventCode=(592|593)
DEST_KEY=queue
FORMAT=nullQueue
Send matching events to indexQueue, everything else to nullQueue
This example is the reverse of the previous. The user wants to keep only sshd events from
/var/log/messages; everything else goes to nullQueue. In this case, you need to define two
transforms.
In props.conf:
[source::/var/log/messages]
TRANSFORMS-set= setnull,setparsing
In transforms.conf
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = \[sshd\]
DEST_KEY = queue
FORMAT = indexQueue
Route specific events to an alternate index
Route specific events to an alternate index
By default, all events are sent to an index called main. However, you may wish to send specific
events to other indexes: for example, to segment data, or to send sizable event
data from a noisy source to an index that is dedicated to receiving it. You can route data locally or
route data you are receiving from remote sources or Splunk instances.
Note: When you place data in an alternate index, you must specify the index in your search with the
index= key:
index=foo
To configure routing for all events from a particular data input to an alternate index, add the following
to the appropriate stanza in inputs.conf.
index = myindex
Example
The following example inputs.conf entry routes data to index = hatch:
[monitor:///var/log]
disabled = false
index = hatch
If you specify different indexes on the forwarder, when the events reach the indexing instance they
will be routed to the named index, which must already exist.
To configure routing for certain events to an alternate index, edit props.conf and transforms.conf on
the local Splunk instance.
Configuration
1. Identify an attribute of the event that can be used to separate it from others.
2. Create an entry in props.conf for the source, source type or host and specify a TRANSFORMS
class and a TRANSFORMS name. The class name refers to a regular expression stanza you will
place in transforms.conf.
In this example, the TRANSFORMS class name is index and the TRANSFORMS name is
AppRedirect.
3. Create an entry in transforms.conf with a regular expression that matches the identified attributes
(from step 1) and writes the alternate index name (in this example, Verbose) to the FORMAT key and
sets the DEST_KEY to specify the index attribute _MetaData:Index.
props.conf
Add the following stanza to $SPLUNK_HOME/etc/system/local/props.conf:
[<spec>]
TRANSFORMS-$NAME = $UNIQUE_STANZA_NAME
<spec> can be:
<sourcetype>, the sourcetype of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
$NAME is whatever unique identifier you want to give to your transform.
transforms.conf
Add the following stanza to $SPLUNK_HOME/etc/system/local/transforms.conf:
[$UNIQUE_STANZA_NAME]
REGEX = $YOUR_CUSTOM_REGEX
DEST_KEY = _MetaData:Index
FORMAT = Verbose
Name your stanza with $UNIQUE_STANZA_NAME to match the name you specified in props.conf.
Add $YOUR_CUSTOM_REGEX based on the attribute you've identified.
Example
Identify an attribute
web1.example.com MSWinEventLog 1 Application 721 Wed Sep 06 17:05:31 2006
4156 MSDTC Unknown User N/A Information WEB1 Printers String
message: Session idle timeout over, tearing down the session. 179
web1.example.com MSWinEventLog 1 Security 722 Wed Sep 06 17:59:08 2006
576 Security SYSTEM User Success Audit WEB1 Privilege Use
Special privileges assigned to new logon: User Name: Domain: Logon
ID: (0x0,0x4F3C5880) Assigned: SeBackupPrivilege SeRestorePrivilege
SeDebugPrivilege SeChangeNotifyPrivilege SeAssignPrimaryTokenPrivilege 525
For this example, we use the Application field as our trigger. A match on "Application" in the
events from the windows_snare_syslog sourcetype causes the value assignments in the transforms
stanza, AppRedirect. One assignment is the index name, Verbose.
props.conf
Add the following stanza to $SPLUNK_HOME/etc/system/local/props.conf:
[windows_snare_syslog]
TRANSFORMS-index = AppRedirect
transforms.conf
Add the following stanza to $SPLUNK_HOME/etc/system/local/transforms.conf:
[AppRedirect]
REGEX = Application
DEST_KEY = _MetaData:Index
FORMAT = Verbose
Set up SSL for forwarding and receiving
Set up SSL for forwarding and receiving
Each forwarder and receiver can be configured to use SSL. Set up SSL in inputs.conf for the
receiver and in outputs.conf for the forwarder. Use SSL for both authentication and encryption, or simply
for encryption.
Note: SSL configurations for distributed data are separate from SSL/HTTPS configuration for Splunk
Web.
Forwarder
To set up SSL on the forwarder, edit $SPLUNK_HOME/etc/system/local/outputs.conf. If you
want to use SSL for authentication, add a stanza for each receiver that needs to be certified.
[tcpout-server://$IP:$PORT]
sslCertPath=<full path to client certificate>
sslPassword=<password for cert>
sslRootCAPath=<optional path to root certificate authority file>
sslVerifyServerCert=<true|false>
sslCommonNameToCheck=<server's common name, set only if sslVerifyServerCert is set to true>
altCommonNameToCheck=<server's alternate name, set only if sslVerifyServerCert is set to true>
sslCertPath specifies the full path to the client certificate file.
sslRootCAPath specifies the local path to the root certificate authority file. Optional; set this if the
root CA is local.
sslPassword is the password for the certificate. Defaults to password.
sslVerifyServerCert, if set to true, makes sure that the server you are connecting to is a valid
one (authenticated). Both the common name and the alternate name of the server are then checked
for a match. Defaults to false.
sslCommonNameToCheck checks the common name of the server's certificate against this name. If
there is no match, assume that we aren't authenticated against this server. You must specify this
key/value pair if 'sslVerifyServerCert' is true.
altCommonNameToCheck checks the alternate name of the server's certificate against this name. If
there is no match, assume that we aren't authenticated against this server. You must specify this
key/value pair if 'sslVerifyServerCert' is true.
encryption only
To send with encryption only, configure your SSL stanza in
$SPLUNK_HOME/etc/system/local/outputs.conf as follows:
[tcpout-server://$IP:$PORT]
sslCertPath=/home/myhome/certs/foo.pem
sslPassword=password
sslRootCAPath=/home/myhome/certs/root.pem
sslVerifyServerCert=false
Note: you can set up the stanza only for a specific [tcpout-server://$IP:$PORT]. You cannot
set up SSL for a server group or a default group.
encryption and authentication
To set up SSL for authentication as well as encryption, configure your SSL stanza in
$SPLUNK_HOME/etc/system/local/outputs.conf as follows:
[tcpout-server://$IP:$PORT]
sslCertPath=<full path to client certificate>
sslRootCAPath=<optional path to root certificate authority file>
sslVerifyServerCert=<true|false>
sslCommonNameToCheck=<server's common name, set only if sslVerifyServerCert is set to true>
altCommonNameToCheck=<server's alternate name, set only if sslVerifyServerCert is set to true>
Note: You will have to write a stanza for each unique outbound connection that authenticates via
SSL.
Receiver
In order to use SSL for receiving you must include a stanza called [SSL] in
$SPLUNK_HOME/etc/system/local/inputs.conf:
[SSL]
serverCert=<full path to the server certificate>
password=<server certificate password, if any>
rootCA=<certificate authority list (root file)>
dhfile=<optional path to the dhfile.pem>
requireClientCert=<true|false> - set to true if you are setting up authentication
The serverCert key/value pair is used to specify the path to the server certificate file.
password is used if the certificate uses a password. Optional.
The rootCA key/value pair is used to specify the path to the root certificate authority file.
If you want the system to require a valid certificate from the client in order to complete the connection,
set requireClientCert to 'true'; otherwise, set it to 'false'.
If you wish, you can use different certificates on different ports, thus allowing different sets of clients
to connect to different ports.
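For example, a sketch of an [SSL] stanza for a receiver, with all certificate paths as placeholders:
[SSL]
serverCert=/opt/splunk/etc/auth/server.pem
password=password
rootCA=/opt/splunk/etc/auth/cacert.pem
requireClientCert=false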
configuration
You will also have to add a listener stanza in $SPLUNK_HOME/etc/system/local/inputs.conf:
[splunktcp-ssl:9996]
queue=indexQueue
The above stanza will start a listener for another Splunk server's encrypted cooked data on port
9996.
[tcp-ssl:9995]
queue=parsingQueue
The above stanza will start a listener for raw encrypted data on port 9995.
Enable cloning
Enable cloning
With cloning enabled, a Splunk forwarder sends its data to two or more other Splunk instances.
Configure cloning in outputs.conf on the forwarding server. Set up a target group of receiving servers
to which the forwarder sends all its data.
On the forwarding server, add the following to
$SPLUNK_HOME/etc/system/local/outputs.conf:
[tcpout]
defaultGroup = indexer1, indexer2
heartbeatFrequency=10
maxQueueSize=10000
[tcpout:indexer1]
server=10.1.1.197:9997
[tcpout:indexer2]
server=10.1.1.200:9999
This configuration will send every event to both 10.1.1.197:9997 and 10.1.1.200:9999. Make sure you
enable receiving on all the servers you are sending cloned data to.
Set up data balancing
Set up data balancing
Set up your forwarding servers to balance outputs by sending events in a round-robin fashion to
separate Splunk servers. To set up data balancing, add a stanza to outputs.conf.
Note: Data balancing is an advanced feature that you can configure in outputs.conf. However,
the topology and input configuration do not display properly in Splunk Web's Admin > Distributed
> View Topology page. This issue will be resolved in a future release.
Caution: Ensure that all instances of Splunk that are indexing data in a round-robin configuration
have plenty of disk space. A current limitation of Splunk exists such that if a Splunk indexer runs out
of disk space, all forwarders involved in the round-robin configuration will stop forwarding data to all
Splunk indexers.
Configuration
Edit $SPLUNK_HOME/etc/system/local/outputs.conf:
[tcpout:FooGroup]
server=$IP:$PORT, $IP2:$PORT2, etc...
Specify the $IP:$PORT of the Splunk servers that will receive the forwarded data. You can enter any
number of servers for Splunk to round-robin between. If one of the receiving servers goes down, the
forwarder sends all events to the one that is still up, while simultaneously retrying the one that is
down. If all servers are down, the forwarder goes into retry loops, and the queue fills according to the
queue configuration parameters.
Also, you can optionally specify back off and queue settings in outputs.conf. For more
information, read Configure outputs.conf.
Example
[tcpout:SwanGroup]
server=10.1.1.197:9997, 10.1.1.200:9999
[tcpout:PearlGroup]
server=10.1.1.220:9997, 10.1.1.300:9999
With this configuration, Splunk clones every event into 2 round-robin target groups.
Route data to third-party systems
Route data to third-party systems
Splunk can be configured to route data to non-Splunk systems. To do this, configure a Splunk server
to send raw data over TCP to a server and port via outputs.conf. The receiving server should be
expecting to receive the data stream on that port.
Additionally, enable conditional routing with props.conf and transforms.conf to be more specific about
which data gets routed to third party systems.
Configuration
To configure data routing, you need to edit props.conf, transforms.conf, and outputs.conf.
These files are located in $SPLUNK_HOME/etc/system/local/ on the Splunk server.
Note: If these files are not located in $SPLUNK_HOME/etc/system/local/, copy them from
$SPLUNK_HOME/etc/system/default/.
In props.conf, specify the host, source, or source type of your data stream. Specify a transform to
perform on the input.
In transforms.conf, define the transforms and specify the TCP_ROUTING to apply. You can also
use REGEX if you wish to be more selective on the input.
In outputs.conf:
Define the target groups that will receive the data.
Specify the IP address and TCP port, $IP:$PORT, for the third party system to receive data.
Set sendCookedData to false so that your Splunk server forwards raw data.
Note: List any single server as a part of a target group or default group to send data. Read more
about configuring target groups in outputs.conf.
Example
Send a subset of data
This example shows how to forward a subset of your data from Splunk.
1. First, edit props.conf and transforms.conf to specify which data to send to the non-Splunk
system.
In props.conf, apply the bigmoney transform to all hostnames beginning with nyc:
[host::nyc*]
TRANSFORMS-nyc = bigmoney
In transforms.conf, define the bigmoney transform and set its TCP routing to the
bigmoneyreader group, which points to the non-Splunk system:
[bigmoney]
DEST_KEY=_TCP_ROUTING
FORMAT=bigmoneyreader
2. Next, define the target groups in outputs.conf:
[tcpout]
defaultGroup = default-clone-group-192_168_1_104_9997
[tcpout:default-clone-group-192_168_1_104_9997]
disabled = false
server = 192.168.1.104:9997
[tcpout:bigmoneyreader]
disabled = false
server=10.1.1.197:7999
sendCookedData=false
Send all data
This example shows how to forward all of your data from Splunk.
Since you are sending all of your data simply edit outputs.conf to specify that all data will be sent
to the non-Splunk system.
[tcpout]
defaultGroup = fastlane
disabled = false
indexAndForward = true
[tcpout:fastlane]
disabled = false
server = 10.1.1.35:6996
sendCookedData = false
Indexing
How indexing works
How indexing works
All data that comes into Splunk is indexed through the universal pipeline. Data enters the universal
pipeline as large (10,000 bytes) chunks. As part of pipeline processing, these chunks are broken into
events. Initially, newline characters signal an event boundary. In the next stage of processing, Splunk
applies line merging rules specified in props.conf.
As part of indexing, events are broken into sections called segments. Splunk uses a list of breaking
characters and other rules (such as the maximum number of characters per segment) that are
configurable through segmenters.conf.
The splunk-optimize process
While Splunk is indexing data, one or more instances of the splunk-optimize process will run
intermittently, merging index files together to optimize performance when searching the data. The
splunk-optimize process can use a significant amount of CPU, but it should consume it only for
short amounts of time, not indefinitely. You can alter the number of concurrent instances of
splunk-optimize by changing the value set for maxConcurrentOptimizes in indexes.conf,
but this is not typically necessary.
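If you do need to adjust this, a sketch of the change in indexes.conf might look like the following; the
stanza placement and the value 3 are assumptions, so check indexes.conf.spec for where the attribute
applies in your version:
# $SPLUNK_HOME/etc/system/local/indexes.conf
[default]
maxConcurrentOptimizes = 3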
splunk-optimize should run only on db-hot. You can run it manually on warm databases if you
find one with a large number of .tsidx files (more than 25): ./splunk-optimize <directory>
If splunk-optimize does not run often enough, search efficiency will be affected.
How events work
Events are a single record of activity within a log file. An event typically includes a timestamp (for
more information about timestamp configuration, read how timestamps work). Events also provide
information about the system that Splunk is monitoring.
Here's a sample event:
172.26.34.223 - - [01/Jul/2005:12:05:27 -0700] "GET
/trade/app?action=logout HTTP/1.1" 200 2953
Event or event type
Events differ from event types. Event types are a classification system and can be made up of any
number of events. Events are single instances of data -- a single log entry, for example.
Change Splunk's default line-breaking behavior in multi-line events. Learn more here.
Note: Before manually modifying any configuration file, read about configuration files.
Lines over 10,000 bytes
Splunk breaks lines over 10,000 bytes into multiple lines of 10,000 bytes each when indexing them. It
appends the field meta::truncated to the end of each truncated section. However, Splunk still groups
these lines into a single event.
Events over 100,000 bytes
Segments after the first 100,000 bytes of a very long line are searchable, but Splunk does not display
them in search results. It only displays the first 100,000 bytes.
Events over 1,000 segments
Splunk only displays the first 1,000 individual segments of an event as segments separated by
whitespace and highlighted on mouseover. It displays the rest of the event as raw text without
interactive formatting.
How segmentation works
There are two types of segments: major and minor. Major segments are words, phrases or terms in
your data that are surrounded by breaking characters -- such as a blank space. By default, major
breakers are set to most characters and blank spaces.
Minor segments are breaks within a major segment. For example, the IP address 192.168.1.254 is
indexed entirely as a major segment and then broken up into the following minor segments: 192,
192.168, and 192.168.1.
Splunk stores each minor segment in addition to each major segment. Therefore, enabling more
minor breakers generally increases index size. However, minor segments provide more flexibility
when searching in Splunk Web. With minor breakers enabled, you can search for a term you know is
part of a minor segment without using a wildcard. For example, with "." set as a minor breaker, the
search "10.2" will return the same as the search "10.2*". Minor breakers also allow you to drag and
select parts of search terms from within Splunk Web. Use segmentation configurations to reduce both
indexing density and the time it takes to index by changing minor breakers to major.
To configure segmentation, first decide what type of segmentation works best for your data. Then,
use segmenters.conf to create segmentation rules. Finally, tie your custom segmentation rules to
a host, source or sourcetype via props.conf.
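The sketch below shows the general shape of this, assuming the MAJOR/MINOR attributes and the
SEGMENTATION key documented in segmenters.conf.spec and props.conf.spec; treat the stanza name,
the breaker list, and the exact key names as assumptions and verify them against the .spec files shipped
with your version:
# $SPLUNK_HOME/etc/system/local/segmenters.conf
[my_segmentation]
MINOR = / : = @ . - $ # % _
# $SPLUNK_HOME/etc/system/local/props.conf
[syslog]
SEGMENTATION = my_segmentation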
Index multi-line events
Index multi-line events
Many event logs have a strict one-line-per-event format, but some do not. Usually, Splunk can figure
out where event boundaries are automatically. However, if event boundary recognition is not working
as desired, set custom rules by configuring props.conf.
Configuration
To configure multi-line events, examine the format of the events. Determine a pattern in the events to
set as the start or end of an event. Then, edit $SPLUNK_HOME/etc/system/local/props.conf,
and set the necessary attributes for your data handling.
There are two ways to handle multi-line events:
1) Break the event stream into real events. This is recommended, as it increases indexing speed
significantly. Use LINE_BREAKER (see below).
2) Break the event stream into lines, and reassemble. This is slower, but affords more robust
configuration options. Use any line-breaking attribute besides LINE_BREAKER (see below).
Here are possible attributes to set for line-breaking rules from
$SPLUNK_HOME/etc/system/README/props.conf.spec:
TRUNCATE = <non-negative integer>
* Change the default maximum line length.
* Set to 0 if you do not want truncation ever (very long lines are, however, often a sign of
garbage data).
* Defaults to 10000.
LINE_BREAKER = <regular expression>
* If not set, the raw stream will be broken into an event for each line delimited by \r or \n.
* If set, the given regex will be used to break the raw stream into events.
* The regex must contain a matching group.
* Wherever the regex matches, the start of the first matched group is considered the first text NOT in the
previous event.
* The end of the first matched group is considered the end of the delimiter and the next
character is considered the beginning of the next event.
* For example, "LINE_BREAKER = ([\r\n]+)" is equivalent to the default rule.
* The contents of the first matching group will not occur in either the previous or next events.
* NOTE: There is a significant speed boost by using the LINE_BREAKER to delimit multiline events
rather than using line merging to reassemble individual lines into events.
LINE_BREAKER_LOOKBEHIND = <integer> (100)
* Change the default lookbehind for the regex based linebreaker.
* When there is leftover data from a previous raw chunk, this is how far before the end of
the raw chunk (with the next chunk concatenated) we should begin applying
the regex.
SHOULD_LINEMERGE = <true/false>
* When set to true, Splunk combines several input lines into a single event, based on the
following configuration attributes.
* Defaults to true.
# The following are used only when SHOULD_LINEMERGE = True
AUTO_LINEMERGE = <true/false>
* Directs Splunk to use automatic learning methods to determine where to break lines in events.
* Defaults to true.
BREAK_ONLY_BEFORE_DATE = <true/false>
* When set to true, Splunk will create a new event if and only if it encounters
a new line with a date.
* Defaults to false.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk will create a new event if and only if it encounters
a new line that matches the regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set, and the regular expression matches the current line,
Splunk is guaranteed to create a new event for the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
* When set and the current line matches the regular expression, Splunk will
not break on any subsequent lines until the MUST_BREAK_AFTER expression
matches.
* Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
* When set and the current line matches the regular expression, Splunk will not break the last
event before the current line.
* Defaults to empty.
MAX_EVENTS = <integer>
* Specifies the maximum number of input lines that will be added to any event.
* Splunk will break after the specified number of lines are read.
* Defaults to 256.
Examples
[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$
This example instructs Splunk to divide events in a file or stream by presuming any line that consists
of all digits is the start of a new event, for any source whose source type was configured or
determined by Splunk to be sourcetype::my_custom_sourcetype.
Another example:
The following log event contains several lines that are part of the same request. The differentiator
between requests is "Path". The customer would like all these lines shown as one event entry.
{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}
To index this multiple line event properly, use the Path differentiator in your configuration. Add the
following to your $SPLUNK_HOME/etc/system/local/props.conf:
[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=
This code tells Splunk to merge the lines of the event, and only break before the term Path=.
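If indexing speed is the priority, the same data can instead be broken apart with LINE_BREAKER, as
recommended above. The following is a sketch under the assumption that every request begins with a
line containing Path= (the stanza name is carried over from the example above; verify the pattern
against your own data before using it):
[source::source-to-break]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=[^\r\n]*Path=)
Because only the newline run is inside the capturing group, Splunk consumes just the line delimiter;
lines that do not contain Path= stay attached to the preceding event.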
Configure segmentation
Segmentation rules can be tweaked to provide better index compression or improve the usability for a
particular data source. If you want to change Splunk's default segmentation behavior, edit
segmenters.conf. Once you have set up rules in segmenters.conf, tie them to a specific source,
host, or source type via props.conf. Segmentation modes other than inner and full are not
recommended.
Edit all configuration files in $SPLUNK_HOME/etc/system/local, or your own custom application
directory in $SPLUNK_HOME/etc/apps/.
Note: You can enable any number of segmentation rules applied to different hosts, sources and/or
source types in this manner.
There are many different ways you can configure segmenters.conf, and you should figure out
what works best for your data. Specify which segmentation rules to use for specific hosts, sources or
sourcetypes by using props.conf and segmentation. Here are a few general examples of
configuration changes you can make:
Full segmentation
Splunk is set to use full segmentation by default. Full segmentation is the combination of both inner
and outer segmentation.
Inner segmentation
Inner segmentation is the most efficient segmentation setting, for both search and indexing, while still
retaining the most search functionality. It does, however, make typeahead less comprehensive.
Switching to inner segmentation at indexing time does not change search behavior at all.
To configure inner segmentation at index time, set SEGMENTATION = inner for your source,
sourcetype or host in props.conf. Under these settings, Splunk indexes smaller chunks of data.
For example, user.id=foo is indexed as user id foo.
Outer segmentation
Outer segmentation is the opposite of inner segmentation. Instead of indexing only the small tokens
individually, outer segmentation indexes entire terms, yielding fewer, larger tokens. For example,
"10.1.2.5" is indexed as "10.1.2.5," meaning you cannot search on individual pieces of the phrase.
You can still use wildcards, however, to search for pieces of a phrase. For example, you can search
for "10.1*" and you will get any events that have IP addresses that start with "10.1". Also, outer
segmentation disables the ability to click on different segments of search results, such as the 48.15
segment of the IP address 48.15.16.23. Outer segmentation tends to be marginally more efficient
than full segmentation, while inner segmentation tends to be much more efficient.
To enable outer segmentation at index time, set SEGMENTATION = outer for your source,
sourcetype or host in props.conf. Also for search to behave properly, add the following lines to
$SPLUNK_HOME/etc/system/local/segmenters.conf, so that the search system knows to
search for larger tokens:
[search]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =
This is what's known as tuning "search segmentation". Note that the '.' has been removed from the
list of breakers here, so that a search for an IP address for example, will now perform much quicker.
The downside of this is that a search for a partial IP address must now include the '*' wildcard, because
your search will no longer look at the individual octets in the index, but will be searching for a
complete string. If you implement this scenario, make sure your users are aware of the '*'
requirement.
Note: Changes to search segmentation affect all searches across all indexes--it is not a per-index
setting. Before you make search segmentation changes, ensure that tuning for one use-case does
not negatively impact other indexes.
No segmentation
The most expedient segmentation setting is to disable segmentation completely. There are significant
implications for search, however. For example, setting Splunk to index with no segmentation restricts
your searches to time, source, host and source type. Only use this setting if you do not need any
advanced search capabilities.
To enable this configuration, set SEGMENTATION = none for your source, source type or host in
props.conf. Searches for keywords in this source, source type or host will return no results. You
can still search for indexed fields.
No segmentation is the most space efficient configuration, but makes searching very difficult. You
must pipe your searches through the search command in order to further restrict results. This type of
configuration is useful if you value storage efficiency over search performance.
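As a minimal sketch (the source path is hypothetical), the following props.conf stanza disables
segmentation for a high-volume feed that is only ever retrieved by time, host, source, source type, and
indexed fields:
[source::/var/log/metrics/...]
SEGMENTATION = none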
Splunk Web segmentation
Splunk Web also has settings for segmentation. These have nothing to do with indexing
segmentation. Splunk Web segmentation affects browser interaction and may speed up search
results. To configure Splunk Web segmentation, refer to the User Manual topic, Change Splunk Web
preferences.
Click on the Preferences tab in the upper right-hand corner of Splunk Web.
Configure custom segmentation for a host, source, or source type
By default, Splunk fully segments events to allow for the most flexible searching. To learn more about
segmentation in general, see this page. If you know how you want to search or process events from a
specific host, source, or source type, configure custom segmentation for that specific type of event.
Configuring custom segmentation for a given host, source, or source type improves indexing and
search performance and can reduce index size (on disk).
Via props.conf
Configure custom segmentation for events of a host, source, or source type by adding the
SEGMENTATION and SEGMENTATION-<segment selection> attributes to a host, source, or
source type stanza in props.conf. Assign values to the attributes using rules for index time and search
time (Splunk Web) segmentation that are defined in segmenters.conf.
Add your stanza to $SPLUNK_HOME/etc/system/local/props.conf. Specify the following
attribute/value pairs:
[<spec>]
SEGMENTATION = $SEG_RULE
SEGMENTATION-<segment selection> = $SEG_RULE
[<spec>] can be:
<sourcetype>: A source type in your event data.
host::<host>: A host value in your event data.
source::<source>: A source of your event data.
SEGMENTATION = $SEG_RULE
Specify the segmentation to use at index time.
Set $SEG_RULE to inner, outer, none, or full.
SEGMENTATION-<segment selection> = $SEG_RULE
Specify the type of segmentation to use at search time.
This only applies to the appearance in Splunk Web.
<segment selection> refers to the radio buttons in Splunk Web preferences panel. Map
these radio buttons to your custom $SEG_RULE.
<segment selection> can be one of the following: all, inner, outer, raw.
$SEG_RULE
A segmentation rule defined in segmenters.conf
Defaults are inner, outer, none, full.
Create your own custom rule by editing
$SPLUNK_HOME/etc/system/local/segmenters.conf.
For more information on configuring segmenters.conf, see this page.
Example
The following example can increase search performance (in Splunk Web) and reduce the index size
of your syslog events.
Add the following to the [syslog] source type stanza in props.conf:
[syslog]
SEGMENTATION = inner
SEGMENTATION-all = inner
This example changes the segmentation of all events that have sourcetype=syslog to inner
segmentation at index time (using the SEGMENTATION attribute), and in Splunk Web (using the
SEGMENTATION-<segment selection> attribute).
Note: You must restart Splunk to apply changes to Splunk Web segmentation, and you must re-index
your data to apply changes to index time segmentation.
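If you want to control each Splunk Web radio button explicitly, a fuller sketch (still for
sourcetype=syslog, using only the built-in rules; the mapping shown is illustrative) might look like:
[syslog]
SEGMENTATION = inner
SEGMENTATION-all = inner
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none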
Mask sensitive data in an event
You may want to mask sensitive personal data that goes into logs. Credit card numbers and social
security numbers are two examples of data that you may not want to index in Splunk. This page
shows how to mask part of a confidential field so that privacy is protected but enough of the data
remains to trace events.
This example masks all but the last four characters of fields SessionId and Ticket number in an
application server log.
An example of the desired output:
SessionId=###########7BEA&Ticket=############96EE
A sample input:
"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&
Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""
"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""
"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabcUserId=
p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=
&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""
Configuration
To mask the data, modify the props.conf and transforms.conf files in your
$SPLUNK_HOME/etc/system/local/ directory.
props.conf
Edit $SPLUNK_HOME/etc/system/local/props.conf and add the following:
[<spec>]
TRANSFORMS-anonymize = session-anonymizer, ticket-anonymizer
<spec> can be:
<sourcetype>, the sourcetype of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
session-anonymizer and ticket-anonymizer are TRANSFORMS class names whose actions
are defined in transforms.conf. For your data, use the class names you create in
transforms.conf.
transforms.conf
In $SPLUNK_HOME/etc/system/local/transforms.conf, add your TRANSFORMS:
[session-anonymizer]
REGEX = (?m)^(.*)SessionId=\w+(\w{4}[&"].*)$
FORMAT = $1SessionId=########$2
DEST_KEY = _raw
[ticket-anonymizer]
REGEX = (?m)^(.*)Ticket=\w+(\w{4}&.*)$
FORMAT = $1Ticket=########$2
DEST_KEY = _raw
REGEX should specify the regular expression that will point to the string in the event you want to
anonymize.
Note: The regex processor can't handle multi-line events. To get around this, specify in
transforms.conf that the event is multi-line by placing (?m) at the beginning of the regular expression.
FORMAT specifies the masked values. $1 is all the text leading up to the regex and $2 is all the text of
the event after the regex.
DEST_KEY = _raw specifies to write the value from FORMAT to the raw value in the log - thus
modifying the event.
Configure character set encoding
Splunk allows you to configure character set encoding for your data sources. Splunk has built-in
character set specifications to support internationalization of your Splunk deployment. Splunk
supports 71 languages (including 20 that aren't UTF-8 encoded). You can retrieve a list of Splunk's
valid character encoding specifications using the iconv -l command on most *nix systems.
Splunk attempts to apply UTF-8 encoding to your sources by default. If a source doesn't use UTF-8
encoding or is a non-ASCII file, Splunk tries to convert data from the source to UTF-8 encoding
unless you specify a character set to use by setting the CHARSET key in props.conf.
Note: If the character set you specify is valid but the source contains characters that are not valid in
that encoding, Splunk escapes the invalid characters as hex values (for example: "\xF3").
Supported character sets
Language Code
Arabic CP1256
Arabic ISO-8859-6
Armenian ARMSCII-8
Belarus CP1251
Bulgarian ISO-8859-5
Czech ISO-8859-2
Georgian Georgian-Academy
Greek ISO-8859-7
Hebrew ISO-8859-8
Japanese EUC-JP
Japanese SHIFT-JIS
Korean EUC-KR
Russian CP1251
Russian ISO-8859-5
Russian KOI8-R
Slovak CP1250
Slovenian ISO-8859-2
Thai TIS-620
Ukrainian KOI8-U
Vietnamese VISCII
Manually specify a character set
Manually specify a character set to apply to a source by setting the CHARSET key for a source in
props.conf.
For example, if you have a source that is in Greek and uses ISO-8859-7 encoding, set
CHARSET=ISO-8859-7 for that source in props.conf.
[source::$SOURCE]
CHARSET=ISO-8859-7
Automatically specify a character set
Splunk can automatically detect languages and proper character sets using its sophisticated
character set encoding algorithm.
Configure Splunk to automatically detect the proper language and character set encoding for a source
by setting CHARSET=AUTO for that source in props.conf. For example, if you want Splunk to
automatically detect character set encoding for the source "my-foreign-docs", then set
CHARSET=AUTO for that source in props.conf.
[my-foreign-docs]
CHARSET=AUTO
If Splunk doesn't recognize a character set
Train Splunk to recognize a character set if you want to use an encoding that Splunk doesn't
recognize, by adding a character set sample file to the following directory:
SPLUNK_HOME/etc/ngram-models/_<language>-<encoding>.txt
Once you add the character set specification, then restart Splunk. After you restart, Splunk can
recognize sources that use the new character set, and will automatically convert them to UTF-8
format at index time.
For example, if you want to use the "vulcan-ISO-12345" character set, copy the specification file to
the following path:
/SPLUNK_HOME/etc/ngram-models/_vulcan-ISO-12345.txt
Dynamic metadata assignment
Dynamically assign metadata to files as they are being consumed by Splunk. Append the dynamic
input header to your file and set any metadata fields you'd like. You can see the available pipeline
metadata fields in transforms.conf.spec.
Use this feature for any incoming data streams that might have different sourcetypes, hosts or other
metadata that you would like to indicate dynamically. Set any metadata in this manner, as opposed to
using inputs.conf, props.conf and transforms.conf.
Configuration
Edit any file to add the dynamic input header.
Add the following header to your file:
***SPLUNK*** $ATTR1=$VAL1 $ATTR2=$VAL2 etc
Set $ATTR1=$VAL1 to the values you wish.
For example, set sourcetype=log4j host=swan.
Add the header anywhere in your file.
Note: Any data following the header is assigned the attributes and values you
set, until the end of the file is reached.
Add your file to $SPLUNK_HOME/var/spool/splunk or any other directory being monitored
by Splunk.
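For example, a file dropped into the spool directory might begin like the following sketch (the header
values are the ones from the step above; the log lines themselves are illustrative):
***SPLUNK*** sourcetype=log4j host=swan
2006-09-21 02:57:11,580 INFO LoginUser - session started
2006-09-21 02:57:11,600 INFO LoginUser - cookie issued
Every line after the header is indexed with sourcetype=log4j and host=swan.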
Set values with a script
Write a script to automatically add the dynamic input header to your incoming data streams. Your
script can also set attributes dynamically based on the contents of your file.
For example, Splunk's report caching script takes an index as a variable and automatically assigns
that index to incoming data streams.
Timestamps
How Splunk extracts timestamps
Splunk uses timestamps to correlate events by time, to create the histogram in Splunk Web, and to set
time ranges for searches. Timestamps are assigned to events at index time. Most events get a
timestamp value assigned to them based on information in the raw event data. If an event doesn't
contain timestamp information, Splunk attempts to assign a timestamp value to the event as it's
indexed. Splunk stores timestamp values in the _time field (in UTC time format).
Precedence rules for timestamp assignment
Splunk uses the following precedence to assign timestamps to events:
1. Look for a time or date in the event itself using an explicit TIME_FORMAT if provided.
Use positional timestamp extraction for events that have more than one timestamp value in the raw
data, or when the desired timestamp is not at the start of the line.
2. If no TIME_FORMAT is provided, or no match is found, attempt to automatically identify a time or
date in the event itself.
Use positional timestamp extraction for events that have more than one timestamp value in the raw
data.
3. If an event doesn't have a time or date, use the timestamp from the most recent previous event of
the same source.
4. If no events in a source have a date, look for a date in the source (or file) name. (The events must
still contain a time.)
5. For file sources, if no date can be identified in the file name, use the modification time on the file.
6. If no other timestamp is found, set the timestamp to the current system time (at the event's index
time).
Configure timestamps
Most events don't require any special timestamp handling. For some sources and distributed
deployments, you may have to configure timestamp formatting to extract timestamps from events.
Configure Splunk's timestamp extraction processor by editing props.conf. For a complete discussion
of the timestamp configurations available in props.conf, see this overview.
You can also configure Splunk's timestamp extraction processor to:
Apply timezone offsets.
Recognize European date format.
Pull the correct timestamp from events with more than one timestamp.
Improve indexing performance.
Finally, train Splunk to recognize new timestamp formats.
Configure timestamp recognition
Configure how Splunk recognizes timestamps by editing props.conf. Splunk uses strptime()
formatting to identify timestamp values in your events. Specify what Splunk recognizes as a
timestamp by setting a strptime() format in the TIME_FORMAT= key.
When forwarding data using the SplunkLightForwarder application, all timestamp
recognition/extraction will take place on the receiving/indexing instance.
Learn about Splunk's enhanced strptime() format support.
Note: If your event has more than one timestamp, set Splunk to recognize the correct timestamp with
positional timestamp extraction.
Configure timestamp extraction in props.conf
Use $SPLUNK_HOME/etc/system/README/props.conf.example as an example, or create
your own props.conf. Make any configuration changes to a copy of props.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Configure any of the following attributes in props.conf to set Splunk's timestamp recognition. Refer
to $SPLUNK_HOME/etc/system/README/props.conf.spec for full specification of the keys.
[<spec>]
DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
MAX_TIMESTAMP_LOOKAHEAD = <integer>
TIME_PREFIX = <regular expression>
TIME_FORMAT = <strptime-style format>
TZ = <posix timezone string>
MAX_DAYS_AGO = <integer>
MAX_DAYS_HENCE = <integer>
[<spec>]
<spec> indicates what to apply timestamp extraction to. Can be one of the following:
<sourcetype>, the sourcetype of an event.
host::<host>, where <host> is the host of an event.
source::<source>, where <source> is the source of an event.
If an event contains data that matches the value of <spec>, then the timestamp rules
specified in the stanza apply to that event.
Add additional stanzas to customize timestamp recognition for any type of event.
DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
Specify a file to use to configure Splunk's timestamp processor (by default Splunk uses
$SPLUNK_HOME/etc/datetime.xml).

To use a custom datetime.xml, specify the correct path to your custom file in all keys that
refer to datetime.xml.
Set DATETIME_CONFIG = NONE to prevent the timestamp processor from running.
Set DATETIME_CONFIG = CURRENT to assign current system time to each event as it's
indexed.
MAX_TIMESTAMP_LOOKAHEAD = <integer>
Specify how far (how many characters) into an event Splunk should look for a timestamp.
Default is 150 characters.
Set to 0 to assign current system time at an event's index time.
TIME_PREFIX = <regular expression>
Use a regular expression that points to the space exactly before your event's timestamp.
For example, if your timestamp follows the phrase Time=, your regular expression
should match this part of the event.
The timestamp processor only looks for a timestamp after the TIME_PREFIX in an event.
Default is none (empty).
TIME_FORMAT = <strptime-style format>
Specify a strptime() format string to extract the date.
Set strptime() values in the order that matches the order of the elements in the timestamp you
want to extract.
Splunk's timestamp processor starts processing TIME_FORMAT immediately after a matching
TIME_PREFIX value.
TIME_FORMAT starts reading after a matching TIME_PREFIX.
The <strptime-style format> value must contain the hour, minute, month, and day.
Default is empty.
Learn what strptime() formats Splunk supports.
TZ = <timezone string>
Specify a time-zone setting using a value from the zoneinfo TZID database.
For more details and examples learn how to configure timezone offsets.
Default is empty.
MAX_DAYS_AGO = <integer>
Specify the maximum number of days in the past (from the current date) for an extracted date
to be valid.
For example, if MAX_DAYS_AGO = 10 then dates that are older than 10 days ago are ignored.
Default is 1000.
Note: You must configure this setting if your data is more than 1000 days old.
MAX_DAYS_HENCE = <integer>
Specify the maximum number of days in the future (from the current date) for an extracted
date to be valid.
For example, if MAX_DAYS_HENCE = 3 then dates that are more than 3 days in the future are
ignored.
The default value (2) allows dates that are tomorrow.
Note: If your machines have the wrong date set or are in a timezone that is one day ahead, set this
value to at least 3.
Enhanced strptime support
Configure timestamp parsing in props.conf with the TIME_FORMAT= key. Splunk implements an
enhanced version of Unix strptime() that supports additional formats (allowing for microsecond,
millisecond, any time width format, and some additional time formats for compatibility). See the table
below for a list of the additionally supported strptime() formats.
In previous versions, Splunk parsed timestamps using only the standard Linux strptime() conversion
specifications. Now, in addition to standard Unix strptime() formats, Splunk's strptime()
implementation supports recognition of the following date-time formats:
%N
For GNU date-time nanoseconds. Specify any sub-second parsing by
providing the width: %3N = milliseconds, %6N = microseconds, %9N =
nanoseconds.
%Q,%q
For milliseconds, microseconds for Apache Tomcat. %Q and %q can
format any time resolution if the width is specified.
%I
For hours on a 12-hour clock format. If %I appears after %S or %s (like
"%H:%M:%S.%l") it takes on the log4cpp meaning of milliseconds.
%+ For standard UNIX date format timestamps.
%v For BSD and OSX standard date format.
%z, %:z, %::z, %:::z
GNU libc support for RFC 822 / SMTP header timezones. %z maps to
-0800 for US pacific time. %:z -08:00, %::z -08:00:00, %:::z
-08
%o For AIX timestamp support (%o used as an alias for %Y).
%p The locale's equivalent of AM or PM. (Note: there may be none.)
strptime() format expression examples
Below are some sample date formats with strptime() expressions that handle them.
1998-12-31 %Y-%m-%d
98-12-31 %y-%m-%d
1998 years, 312 days %Y years, %j days
Jan 24, 2003 %b %d, %Y
January 24, 2003 %B %d, %Y
2007年1月22日 午後03時25分26秒 %Y年%m月%d日 午後%H時%M分%S秒
q|25 Feb '03 = 2003-02-25| q|%d %b '%y = %Y-%m-%d|
Examples
Your data might contain an easily recognizable timestamp to extract such as:
...FOR: 04/24/07 PAGE 01...
The entry in props.conf is:
[host::foo]
TIME_PREFIX = FOR:
TIME_FORMAT = %m/%d/%y
Your data might contain other information that Splunk parses as timestamps, for example:
...1989/12/31 16:00:00 ed May 23 15:40:21 2007...
Splunk extracts the date as Dec 31, 1989, which is not useful. In this case, configure props.conf to
extract the correct timestamp from events from host::foo:
[host::foo]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s
TIME_FORMAT = %b %d %H:%M:%S %Y
This configuration assumes that all timestamps from host::foo are in the same format. Configure
your props.conf stanza to be as granular as possible to avoid potential timestamping errors.
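You can also combine several timestamp keys in one stanza. The following is a hedged sketch (the
source path, format, and values are hypothetical) for an archive of application logs that are several
years old and were written in Eastern time:
[source::/archive/app_logs/...]
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TZ = US/Eastern
MAX_DAYS_AGO = 3000
MAX_DAYS_AGO is raised here because the default of 1000 days would otherwise cause the older
timestamps to be ignored.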
Apply timezone offsets
Important: If you have configured timestamp offsets using pre-Splunk 3.2 POSIX instructions, you
must reconfigure them using the information on this page. If you do not do this, your timestamp
information will be incorrect.
Use timezone offsets to correctly correlate events from different timezones. Configure timezone
offsets for events based on host, source, or sourcetype. Configure timezone offsets in props.conf. By
default, Splunk applies timezone offsets using these rules, in the following order:
1. Use the timezone in raw event data (for example, PST, -0800).
2. Use TZ if it is set in a stanza in props.conf and the event matches the host, source, or
sourcetype specified by a stanza.
3. Use the timezone offset of the Splunk server that indexes the event.
Configure timezone offsets in props.conf
Use $SPLUNK_HOME/etc/system/README/props.conf.example as an example, or create
your own props.conf. Make any configuration changes to a copy of props.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Configure timezone offsets by adding a TZ = key to a timestamp configuration stanza for a host,
source, or sourcetype in props.conf. The Splunk TZ = key recognizes zoneinfo TZIDs (see all the
timezone TZIDs in the zoneinfo (TZ) database). Set the TZ = value to the TZID of the desired
timezone for any host, source, or sourcetype.
Examples
This example sets the timezone offset of events from host names that match the regular expression
nyc* to the Eastern time zone.
[host::nyc*]
TZ = US/Eastern
This example sets the timezone offset of events from sources in the path /mnt/ca/... to the
Pacific time zone.
[source::/mnt/ca/...]
TZ = US/Pacific
zoneinfo (TZ) database
The zoneinfo database is a publicly maintained database of timezone values.
UNIX versions of Splunk rely on a TZ database included with the UNIX distribution you're
installing on. Most UNIX distributions store the database in the directory:
/usr/share/zoneinfo.
Solaris versions of Splunk store TZ information in this directory:
/usr/share/lib/zoneinfo.
Windows versions of Splunk ship with a copy of the TZ database.
Refer to the zoneinfo (TZ) database for values you can set as TZ = in props.conf.
Configure timezone offsets for Splunk versions before 3.2
If you're running a version of Splunk that is older than 3.2, you must use POSIX values for the value
of TZ =. See man tzset for help with POSIX formatting.
Important: Prior to version 3.2, Splunk used an external timezone utility to parse POSIX timezones.
The external utility has a bug that causes it to parse POSIX TZ values as east of Greenwich Mean
Time (for example PST is "-0800"). Here is the thread describing the bug.
Examples
Timezone: pre-Splunk 3.2 value -> Splunk 3.2 and newer value
US Eastern: TZ=EST-5EDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Eastern
US Central: TZ=CST+6CDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Central
US Mountain: TZ=MST-7EDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Mountain
US Pacific: TZ=PST-8PDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Pacific
US Alaska: TZ=AKST-9PDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Alaska
US Hawaii: TZ=HST-10HDT01:00:00,M3.2.0/02:00:00,M11.1.0/02:00:00 -> TZ=US/Hawaii
Western Europe - UK and Ireland: TZ=GMT+0BST01:00:00,M3.5.0/01:00:00,M10.5.0/02:00:00 -> TZ=Europe/Dublin
Central Europe - Netherlands and Germany: TZ=CET-1CEST01:00:00,M3.5.0/02:00:00,M10.5.0/03:00:00 -> TZ=Europe/Berlin
UTC: TZ=UTC
Recognize European date format
By default, timestamps in Splunk follow the convention of MM/DD/YYYY:HH:MM:SS. Configure
Splunk to use the European date format for timestamps, either permanently (by editing literals.conf)
or temporarily (search-by-search basis) by using the timeformat search modifier.
Note: The only European date format that Splunk currently supports swaps %m and %d
(DD/MM/YYYY:HH:MM:SS). Any other changes to the date string format may cause significant errors
in Splunk Web.
Configure European date format in literals.conf
Configure the date format in literals.conf using the SEARCH_TERM_TIME_FORMAT key. This
key changes the format used by search modifiers, search terms, and Splunk Web. Configure your
timestamps permanently by changing the string value of the SEARCH_TERM_TIME_FORMAT key.
Use $SPLUNK_HOME/etc/system/README/literals.conf.example as an example, or create
your own literals.conf. Make any configuration changes to a copy of literals.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Default:
[ui]
SEARCH_TERM_TIME_FORMAT=%m/%d/%Y:%H:%M:%S
SEARCH_RESULTS_TIME_FORMAT = %m/%d/%Y %H:%M:%S
European date format:
[ui]
SEARCH_TERM_TIME_FORMAT= %d/%m/%Y:%H:%M:%S
SEARCH_RESULTS_TIME_FORMAT = %d/%m/%Y %H:%M:%S
Note: You may have to clear your browser's cache to see the result of this change.
Use the timeformat modifier
Use the timeformat search modifier to set timestamps to European format for a single search. Splunk
timestamps have the format timeformat=%m/%d/%Y:%H:%M:%S by default. Set European date
format by swapping %m and %d in the argument string.
Note: timeformat temporarily overrides the SEARCH_TERM_TIME_FORMAT= setting in
literals.conf.
Example
Use timeformat as an argument to the search command or in Splunk Web's search bar.
timeformat=%d/%m/%Y:%H:%M:%S
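For instance, the following sketch of a complete search string (the sourcetype and date are illustrative,
and the starttime modifier is assumed to be given in the same format) uses the European format to
interpret an explicit start time:
sourcetype=syslog timeformat=%d/%m/%Y:%H:%M:%S starttime=25/12/2009:00:00:00
With the modifier in place, 25/12/2009 is read as 25 December 2009 rather than being rejected as an
invalid month.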
Configure positional timestamp extraction
Set Splunk to use a particular timestamp if an event contains more than one recognizable timestamp.
This is especially useful when indexing events that contain syslog host-chaining data.
Configure positional timestamp extraction by editing props.conf.
Configure positional timestamp extraction in props.conf
Note: Use $SPLUNK_HOME/etc/system/README/props.conf.example as an example, or
create your own props.conf. Make any configuration changes to a copy of props.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Configure Splunk to recognize a timestamp anywhere in an event by adding TIME_PREFIX = and
MAX_TIMESTAMP_LOOKAHEAD = keys to a [<spec>] stanza in props.conf. Set a value for
MAX_TIMESTAMP_LOOKAHEAD = to tell Splunk how far into an event to look for the timestamp. Set a
value for TIME_PREFIX = to tell Splunk what pattern of characters to look for to indicate the
beginning of the timestamp.
Example:
If an event looks like:
1989/12/31 16:00:00 ed May 23 15:40:21 2007 ERROR UserManager - Exception thrown Ignoring unsupported search for eventtype: /doc sourcetype="access_combined" NOT eventtypetag=bot
To identify the timestamp: May 23 15:40:21 2007
Configure props.conf:
[source::/Applications/splunk/var/spool/splunk]
TIME_PREFIX = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \w+\s
MAX_TIMESTAMP_LOOKAHEAD = 44
Note: Optimize the speed of timestamp extraction by setting the value of
MAX_TIMESTAMP_LOOKAHEAD = to look only as far into an event as needed for the timestamp you
want to extract. In this example MAX_TIMESTAMP_LOOKAHEAD = is optimized to look 44 characters
into the event.
Tune timestamp extraction for better indexing performance
Tune Splunk's timestamp extraction by editing props.conf. Adjust how far Splunk's timestamp
processor looks into events, or turn off the timestamp processor to make indexing faster.
Note: Use $SPLUNK_HOME/etc/system/README/props.conf.example as an example, or
create your own props.conf. Make any configuration changes to a copy of props.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Adjust timestamp lookahead
Timestamp lookahead determines how far (how many characters) into an event the timestamp
processor looks for a timestamp. Adjust how far the timestamp processor looks by setting a value
(the number of characters) for the MAX_TIMESTAMP_LOOKAHEAD = key in any timestamp stanza.
Note: You can set MAX_TIMESTAMP_LOOKAHEAD = to different values for each timestamp stanza.
The default number of characters that the timestamp processor looks into an event is 150. Set
MAX_TIMESTAMP_LOOKAHEAD = to a lower value to speed up how fast events are indexed. You
should do this if your timestamps occur in the first part of your event.
If your events are indexed in real time, increase Splunk's overall indexing performance by turning off
timestamp lookahead (set MAX_TIMESTAMP_LOOKAHEAD = 0). This causes Splunk to not look into
events for a timestamp, and sets an event's timestamp to be its indexing time (using current system
time).
Example:
This example tells the timestamp processor to look 20 characters into events from source foo.
[source::foo]
MAX_TIMESTAMP_LOOKAHEAD = 20
...
Turn off the timestamp processor
Turn off the timestamp processor entirely to significantly improve indexing performance. Turn off
timestamp processing for events matching a host, source, sourcetype specified by a timestamp
stanza by adding a DATETIME_CONFIG = key to a stanza and setting the value to NONE. When
timestamp processing is off, Splunk won't look for timestamps to extract from event data. Splunk will
instead set an event's timestamp to be its indexing time (using current system time).
Example:
This example turns off timestamp extraction for events that come from the source foo.
[source::foo]
DATETIME_CONFIG = NONE
...
Train Splunk to recognize a timestamp
Splunk recognizes most timestamps by default; for more information read How Splunk extracts
timestamps. If Splunk doesn't recognize a particular timestamp, you can use the train dates
command to teach Splunk the pattern. The output of train dates is a regular expression that you
can add to datetime.xml and props.conf to configure the unique timestamp extraction.
The train command lets you interactively teach Splunk new patterns for timestamps, fields, and
sourcetypes. For more information about train and the different arguments you can use with it, refer
to the train help page:
./splunk help train
Important: Use train dates only when you can't configure the timestamp with props.conf.
Steps to configure timestamps with train dates
To teach Splunk a new timestamp pattern, complete the following steps:
1. Copy a sampling of your timestamp data into a plain text file.
Splunk learns the pattern of the timestamp based on the patterns in this text file.
2. Run the train dates command.
This feature is interactive. When prompted, provide the path to the text file containing your timestamp
data. The command produces a regular expression for your timestamp.
3. Create a custom datetime.xml.
Copy the output of the train command into a copy of datetime.xml file.
Note: The default datetime.xml file is located in $SPLUNK_HOME/etc/datetime.xml. Do not
modify this file; instead, copy the default datetime.xml into a custom application directory in
$SPLUNK_HOME/etc/apps/ or $SPLUNK_HOME/etc/system/local/. Refer to the User Manual
topic about applications for more information.
4. Edit your local props.conf.
Include the path to your custom datetime.xml file in the relevant stanzas.
Note: The following instructions assume that you have set a Splunk environment variable. Otherwise,
navigate to SPLUNK_HOME/bin and run Splunk CLI commands with:
./splunk [command]
Run the train dates command
The train command is an interactive CLI tool. For Splunk to learn a new date format, you need to
explicitly provide a file and pattern. Afterwards, Splunk returns a string for you to add to
datetime.xml.
1. To begin training Splunk to recognize a new timestamp, type:
./splunk train dates
Splunk prompts you for an action:
------------------------------------------------------
What operation do you want to perform? (default=learn)
------------------------------------------------------
Enter choice: [Learn]/Test/Quit >
The default action is Learn.
2. To perform the training operation, type "L", "l", or "learn" and press Enter.
Splunk prompts you to give it the sample file you want to use to train it:
Enter full filename from which to learn dates > sampling.txt
3. Enter the path of the file on your Splunk server (this step does not support tab completion).
Splunk displays the first line of your sample and asks you to teach it the values for the timestamp:
------------------------------------
Interactively learning date formats.
------------------------------------
INSTRUCTIONS: If a sample line does not have a timestamp, hit Enter.
If it does have a timestamp, enter the timestamp separated by commas
in this order: month, day, year, hour, minute, second, ampm, timezone.
Use a comma as a placeholder for missing values. For example, for a
sample line like this "[Jan/1/08 11:56:45 GMT] login", the input
should be: "Jan, 1, 08, 11, 56, 45, , GMT" (note missing AM/PM).
Spaces are optional.
SAMPLE LINE 1:
Tue Jul 10 21:23:06 PDT 2007 Received Trade 330 with detail user: user3456 date: date: 10Jul200721:
23:06 action: sell 3583 MNAG @ 42
--------------------------------------------------------------------------------
Enter timestamp values as: month, day, year, hour, minute, second, ampm, timezone.
> 7, 10, 2007, 9, 23, 06, pm, PDT
4. Enter values for month, day, year, hour, minute, second, ampm, and timezone (as shown above).
This trains Splunk to recognize the values you enter as the designated portions of the timestamp.
If the values are sufficient, Splunk displays:
Learned pattern.
----------------------------------------------------------------------------------
If you are satisfied that the timestamps formats have been learned, hit control-c.
----------------------------------------------------------------------------------
5. After you hit control-c, Splunk displays:
Patterns Learned.
It is highly recommended that you make changes to a copy of the default datetime.xml file.
For example, copy "/Applications/splunk/etc/datetime.xml" to "/Applications/splunk/etc/system/local/datetime.xml", and work with that file.
In that custom file, add the below timestamp definitions, and add the pattern names
to timePatterns and datePatterns list.
For more details, see http://www.splunk.com/doc/latest/admin/TrainTimestampRecognition
--------------------------------------------------------------------------------
<define name="trainwreck_1_date" extract="day,litmonth,year,">
<text><![CDATA[:\d+\s\w+\s(\d+)\s(\w+)\s(\d+)]]></text>
</define>
<define name="trainwreck_1_time" extract="hour,minute,second,ampm,">
<text><![CDATA[(\d+):(\d+):(\d+)\s(\w+)]]></text>
</define>
------------------------------------------------------
What operation do you want to perform? (default=learn)
------------------------------------------------------
Enter choice: [Learn]/Test/Quit > q
6. Check the output.
If it's correct, quit. Then, copy the output and continue to the next section.
If it's not correct, enter the Learn choice to re-train Splunk.
Create a custom datetime.xml
After running train, Splunk outputs a string describing the new timestamp pattern.
In your custom datetime.xml file:
1. Paste the string returned from train before the <timePatterns> and <datePatterns>
stanzas.
2. Within both <timePatterns> and <datePatterns>, add a <use name="..."/> element whose
name matches the name given in the pasted <define name="..."> element.
Example:
For the following train dates output:
<define name="_utcepoch" extract="utcepoch">
<text><![CDATA[((?<=^|[\s#,"=\(\[\|\{])(?:1[01]|9)\d{8}|^@[\da-fA-F]{16,24})(?:\d{3})?(?![\d\(])]]></text>
</define>
The modified datetime.xml file might look something like:
<define name="_utcepoch" extract="utcepoch">
<text><![CDATA[((?<=^|[\s#,"=\(\[\|\{])(?:1[01]|9)\d{8}|^@[\da-fA-F]{16,24})(?:\d{3})?(?![\d\(])]]></text>
</define>
<timePatterns>
<use name="_time"/>
<use name="_hmtime"/>
<use name="_hmtime"/>
<use name="_dottime"/>
<use name="_combdatetime"/>
<use name="_utcepoch"/>
</timePatterns>
<define name="_utcepoch" extract="utcepoch">
<text><![CDATA[((?<=^|[\s#,"=\(\[\|\{])(?:1[01]|9)\d{8}|^@[\da-fA-F]{16,24})(?:\d{3})?(?![\d\(])]]></text>
</define>
<datePatterns>
<use name="_usdate"/>
<use name="_isodate"/>
<use name="_eurodate"/>
<use name="_bareurlitdate"/>
<use name="_orddate"/>
<use name="_combdatetime"/>
<use name="_masheddate"/>
<use name="_masheddate2"/>
<use name="_utcepoch"/>
</datePatterns>
Edit your local props.conf
To apply your custom timestamp, Splunk needs to know where to find your new datetime.xml.
Modify props.conf to:
1. Add a DATETIME_CONFIG key to the timestamp configuration stanzas.
2. Set the value of DATETIME_CONFIG to the path of your custom datetime.xml.
Note: See all of the keys you can set in a stanza to configure timestamp recognition.
Example:
This example applies a custom datetime.xml to events from the host, "london".
[host::london]
DATETIME_CONFIG = /etc/system/local/datetime.xml
You can set custom timestamp extraction patterns for any host, source, or sourcetype by editing
props.conf.
Fields
How fields work
A field is any searchable name/value pair. A field is distinguished from the free-form indexed
segments of an event in that fields are labeled and can be searched by label. For example,
host=foo is a field with the name host and value foo. Search for any field name or specific value
of a field.
The majority of fields are created at search time. Splunk picks out obvious name/value pairs in search
results, such as user_id or client_ip. This dynamic extracted field list can be used in filters and
reports. Configure Splunk to recognize new fields.
When creating field names, Splunk uses the following rules:
1. All characters that are not in the a-z, A-Z, or 0-9 ranges are replaced with an underscore (_).
2. All leading underscores are removed (since they're reserved for internal variables).
For example, a pair extracted as client-ip=10.1.1.1 yields the field name client_ip. These rules
apply to all extracted fields, whether they are extracted automatically or custom configured.
Add custom fields
Define your own custom fields in Splunk Web with interactive field extraction. Or create fields using
configuration files. Use props.conf and transforms.conf.
To make new fields via configuration files, use the following process:
1. Determine a pattern to identify the field in the event.
2. Write a regular expression to extract the field from the event.
3. Edit your custom props.conf and transforms.conf files. (Note: DO NOT edit the copy in
$SPLUNK_HOME/etc/system/default/.)
4. In props.conf, specify either the source, source type or host containing the events and assign a
name to identify the transform in transforms.conf.
5. In transforms.conf, create the named transform stanza, and supply the regex to extract the
field.
Disable automatically extracted fields
Splunk automatically extracts fields from your data and adds them to the Fields drop-down menu in
Splunk Web. Disable this feature via props.conf. You can turn off extracted fields for a specific
source, sourcetype or host. Add the attribute/value pair KV_MODE = none for the appropriate
[<spec>] in $SPLUNK_HOME/etc/system/local/props.conf:
[<spec>]
KV_MODE = none
<spec> can be:
<sourcetype>, the sourcetype of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
Indexed fields
Indexed fields are captured as events are processed and indexed by Splunk. Splunk's input
processor extracts information on where the event came from, what type of event it is, source type,
etc. In general, indexed fields are not recommended unless you notice a significant impact on search
performance with your extracted fields. This may happen if you search for expressions like
foo!="bar" or NOT foo="bar" and the field foo nearly always takes on the value bar. Also, you
may want to use indexed fields if the value of the field exists outside of the field more often than not.
For example, if you commonly search for foo="1", but 1 occurs in many events that do not have
foo="1", you may want to index foo.
Fields extracted at index time have a negative impact on indexing performance. They may also affect
search times, as each indexed field increases the size of the searchable index. Indexed fields are
also less flexible -- if you want to make changes to indexed fields you must re-index the entire
dataset.
To configure indexed fields, see this page. You may also configure fields.conf to set additional
processing information. Read more about how to configure fields.conf.
Create fields via Splunk Web
Use interactive field extraction to create new fields dynamically via Splunk Web. Any search can be
turned into one or more fields. You can use interactive field extraction on the local indexer; it is not
supported when attempting to extract from a non-local event (in a distributed search environment).
Note: You cannot use a field you've extracted based on event types to define another event type or
field.
To extract fields with Splunk Web:
1. Run a search in Splunk Web:
host=pearl
2. Each event has a drop-down arrow under the timestamp. Click the drop-down arrow under the
timestamp of any interesting event.
3. Choose Extract field. A dialog box pops up, allowing you to configure your field extraction rules:
View the Sample Event dialog to see the event that you chose to extract fields from.
4. Enter values in the Example Value(s) dialog to tell Splunk what you want to extract as a field.
5. From the Rules section, select an event type, host, source, or sourcetype to restrict events you're
extracting from.
6. Click Preview to show the rules (regular expressions under Generated rules) that Splunk uses to
extract the example values. View the events Splunk extracted values from via the Preview window.
7. Select or de-select rules (Generated rules) or Preview extractions to alter the field extraction rule
you want to create.
8. When you are satisfied with the results, click Save to save and name the field.
Important: Do not include spaces in your field name. Splunk may not format the regex (in
transforms.conf) properly for field names that contain spaces. Also, if you include
non-alphanumeric characters in your field name, Splunk:
Trims leading non-alphanumeric characters.
Replaces other non-alphanumeric characters with an underscore.
You can now use the extracted field you just created in a search.
The field extraction above is stored in props.conf and transforms.conf in the
$SPLUNK_HOME/etc/system/local directory. To undo the field extraction, comment out the respective
stanzas in props.conf and transforms.conf and restart Splunk.
Create fields via configuration files
Splunk automatically extracts fields during searches using known keywords for the source type and
name/value pairs in the events. Examine the fields in Splunk Web by clicking the Fields... link above
the event display:
You can add your own custom fields. Use the instructions below to create fields. The basic steps are:
1. Determine a pattern to identify the field in the event.
2. Write a regular expression to extract the field from the event.
3. Add your regex to transforms.conf.
4. In props.conf, link your regex to the source, source type or host containing the events.
5. If your field value is a portion of a word, you must also add an entry to fields.conf. See the example
"create a field from a subtoken" below.
Configuration
To create additional fields, edit transforms.conf and props.conf. Edit these files in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Note: DO NOT edit files in $SPLUNK_HOME/etc/system/default/.
transforms.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/transforms.conf:
[<unique_stanza_name>]
REGEX = <your_regex>
FORMAT = <your_custom_field_name>::$1
<unique_stanza_name> = name your stanza. Use this name later in configuring
props.conf.
<your_regex> = create a regex that recognizes your custom field value.
FORMAT = <your_custom_field_name>::$1 is the name of your field; $1 is the value
specified by the regular expression.
In order to properly display field values containing whitespace in Splunk Web, you must
apply quotes to the FORMAT key.
FORMAT = <your_custom_field_name>::"$1"
Note: In order to preserve previous matching extractions, include a $0 in the FORMAT key. If you
don't include $0, the previously extracted fields will be erased and only the last matching extraction
specified in transforms.conf will be kept.
Note: Unlike configuring indexed fields, transforms.conf requires no DEST_KEY since nothing is
being written to the index. The field is extracted at search time and is not persisted in the index as a
key.
props.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/props.conf:
[<spec>]
REPORT-<value> = <unique_stanza_name>
<spec> can be:
<sourcetype>, the sourcetype of an event.
host::<host>, where <host> is the host for an event.
source::<source>, where <source> is the source for an event.
<unique_stanza_name> is the name of your stanza from transforms.conf.
<value> is any value you want to give to your stanza to identify its name-space.
To display only your explicitly configured extracted fields and not the automatically recognized ones,
add KV_MODE = none to your stanza in props.conf.
Note: For extracted fields, props.conf uses REPORT-<value>, as opposed to the TRANSFORMS-<value>
used in configuring indexed fields.
Examples
Add a new field
This example shows how to create a new "error" field. The field can be identified by the occurrence
of device_id= followed by a word within brackets and a text string terminating with a colon. The
sourcetype of the events is testlog.
In transforms.conf add:
[netscreen-error]
REGEX = device_id=[^ ]+\s+\[w+\](.*)(?
FORMAT = err_code::$1
In props.conf add:
[testlog]
REPORT-netscreen = netscreen-error
Extract fields from multi-line events
This example shows how to anonymize fields in multi-line events.
Below is a sample event from an application log:
"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser
Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com,
Method=GET, IP=209.51.249.195, Content=", ""
"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc"
UserId="p12345678"><EntitlementList></EntitlementList></User>", ""
"2006-09-21, 02:57:11.60", 122, 15, "New Cookie:
SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man,
MANUser:
Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&",
""
The administrator wants to protect some of the information, specifically the fields SessionId and
Ticket. This example masks these IDs except the last 4 characters, for example:
SessionId=###########7BEA&Ticket=############96EE
To anonymize the data, modify props.conf and transforms.conf in
$SPLUNK_HOME/etc/system/local/.
Add the following to props.conf:
[source::source-to-anonymize]
TRANSFORMS-anonymize = session-anonymizer, ticket-anonymizer
Now, configure transforms.conf to recognize multi-line data. To extract fields from multi-line
events, you must enable the multi-line mode of Splunk's regular expression processor. Turn on
multi-line mode by including (?m) at the beginning of a regular expression.
Add the following to transforms.conf:
[session-anonymizer]
REGEX = (?m)^(.*)SessionId=\w+(\w{4}[&"].*)$
FORMAT = $1SessionId=########$2
DEST_KEY = _raw
[ticket-anonymizer]
REGEX = (?m)^(.*)Ticket=\w+(\w{4}&.*)$
FORMAT = $1Ticket=########$2
DEST_KEY = _raw
When the regular expression processor is in multi-line mode ((?m) at the start of a regex pattern),
the ^ and $ characters denote the beginning and ending of lines instead of the beginning and ending
of the entire string.
Create a field from a subtoken
If your field value is a smaller part of a token, you must add an entry to fields.conf. For example, your
field's value is "123" but it occurs as "foo123" in your event.
Configure props.conf and transforms.conf as explained above. Then, add an entry to
fields.conf:
[<fieldname>]
INDEXED = False
INDEXED_VALUE = False
Fill in <fieldname> with the name of your field.
For example, [url] if you've configured a field named "url."
Set INDEXED and INDEXED_VALUE to false.
This tells Splunk that the value you're searching for is not a token in the index.
For more information on using fields.conf, see the page on "configuring fields.conf".
Create indexed fields via configuration files
Splunk automatically adds indexed fields such as host, source, source type, event type,
etc. Create your own custom indexed fields. Once you have created a new indexed field, it appears in
the Fields drop-down menu in Splunk Web. You can also search on it by typing
$CUSTOM_FIELD=foo in your search.
Note: Indexed fields have performance implications. Read about how fields work for more
information. It is rarely necessary to create indexed fields (versus extracted fields). You may want to
use indexed fields if you search for expressions like foo!="bar" or NOT foo="bar" and the field
foo nearly always takes on the value bar. Another common reason to use indexed fields is if the
value of the field exists outside of the field more often than not. For example, if you commonly search
for foo="1", but 1 occurs in many events that do not have foo="1", you may want to index foo.
Configuration
Define additional indexed fields by editing props.conf, transforms.conf and fields.conf.
Edit these files in $SPLUNK_HOME/etc/system/local/, or your own custom application directory
in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
transforms.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/transforms.conf:
[<unique_stanza_name>]
REGEX = <your_regex>
FORMAT = <your_custom_field_name>::"$1"
WRITE_META = true
<unique_stanza_name> = the name of your stanza. Use this name later to configure
props.conf.
REGEX = create a regex that recognizes your custom field value.
FORMAT = inserts <your_custom_field_name> before the value you've extracted via regex as
$1.
To properly display field values containing whitespace in Splunk Web, apply quotes to the FORMAT key:
FORMAT = <your_custom_field_name>::"$1"
Multiple fields can be extracted using a single regex that contains multiple match groups
FORMAT = <your_first_field>::"$1" <your_second_field>::"$2"
WRITE_META = set this to true to write your field name and value to meta. This is where
indexed fields are stored.
props.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/props.conf:
[<spec>]
TRANSFORMS-<value> = <unique_stanza_name>
<spec> can be:
<sourcetype>, the sourcetype of an event.
host::<host>, where <host> is the host for an event.
source::<source>, where <source> is the source for an event.
<unique_stanza_name> is the name of your stanza from transforms.conf.
<value> is any value you want to give to your stanza to identify its name-space.
fields.conf
Add an entry to fields.conf for your new indexed field.
[<your_custom_field_name>]
INDEXED=true
<your_custom_field_name> is the name of the custom field you set in
transforms.conf.
Set INDEXED=true to indicate that the field is indexed.
If a field of the same name is extracted in other data (rather than indexed), you must not set
INDEXED=true. In this case, you must also set INDEXED_VALUE=false if events exist that
have values of that field which are not indexed. An example of this case would be a regex like:
A(\d+)B, where the string A1234B would yield the value 1234 for the field, but the event
cannot be found by searching for 1234.
Examples
Example 1
This example creates an indexed field called err_code.
transforms.conf
In $SPLUNK_HOME/etc/system/local/transforms.conf add:
[netscreen-error]
REGEX = device_id=[^ ]+\s+\[\w+\](.*)(?=:)
FORMAT = err_code::"$1"
WRITE_META = true
This stanza takes 'device_id=' followed by a word within brackets and a text string terminating
with a colon. The source type of the events is testlog.
Comments:
The FORMAT = line contains the following values:
err_code:: is the name of the field.
$1 refers to the new field written to the index. It is the value extracted by REGEX.
WRITE_META = true is an instruction to write the content of FORMAT to the index.
props.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/props.conf:
[testlog]
TRANSFORMS-netscreen = netscreen-error
fields.conf
Add the following lines to $SPLUNK_HOME/etc/system/local/fields.conf:
[err_code]
INDEXED=true
Example 2
This example creates two indexed fields called username and login_result.
transforms.conf
[ftpd-login]
REGEX = Attempt to login by user: (.*): login (.*)\.
FORMAT = username::"$1" login_result::"$2"
WRITE_META = true
This stanza finds the literal text Attempt to login by user: , extracts a username, followed by
a colon, and then the result, which is followed by a period. A line might look like
2008-10-30 14:15:21 mightyhost awesomeftpd INFO Attempt to login by user:
root: login FAILED.
props.conf
[ftpd-log]
TRANSFORMS-login = ftpd-login
fields.conf
[username]
INDEXED=true
[login_result]
INDEXED=true
How indexed fields work in detail
Splunk builds indexed fields by writing to _meta. Here's how it works:
_meta is modified by all matching transforms that contain either DEST_KEY = meta or
WRITE_META = true.
Each transform can overwrite _meta, so use WRITE_META = true to append to _meta.
If you don't use WRITE_META, then start your FORMAT with $0.
After _meta is fully built during parsing, the text is interpreted in the following way.
The text is broken into units; each unit is separated by whitespace.
Quotation marks (" ") group characters into larger units, regardless of whitespace.
Backslashes ( \ ) immediately preceding quotation marks disable the grouping
properties of quotation marks.
Backslashes preceding a backslash disable that backslash.
Units of text that contain a double colon (::) are turned into extracted fields. The text on
the left side of the double colon becomes the field name, and the right side becomes
the value.
Note: Indexed fields with regex-extracted values containing quotation marks will generally not work,
and backslashes may also have problems. Extracted fields do not have these limitations.
Quoting example
WRITE_META = true
FORMAT = field1::value field2::"value 2" field3::"a field with a \" quotation mark" field4::"a field which
ends with a backslash\\"
Field actions
Enable interactions between your indexed fields and other web resources via field_actions.conf. For
example, enable a reverse lookup of an IP address. Edit field_actions.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
NOTE: You must both restart your Splunk server and clear your browser's cache before any changes
take place. Some versions of Firefox may not clear the cache completely when instructed, so you
may have to completely restart your browser to see your changes.
Configuration
Add a stanza to specify which host, uri, and label to use for your custom field action. Once this is
enabled, your label is added to the drop-down menu next to the field specified by the metaKeys
attribute. If two or more metaKeys are specified, the label appears in the drop-down menu under
the timestamp. Other attribute/value pairs are available for stanzas in field_actions.conf.
Show source is a type of field action. If the host or source fields are not present then Show source
is not available from the drop-down menu next to the timestamp. If your field action does not appear,
ensure the correct fields are visible by selecting them from the Fields menu.
Example
[googleExample]
metaKeys=clientip
uri=http://google.com/search?q={$clientip}
label=Google this ip
method=GET
This example enables you to look up the clientip= field via Google. Once you have set up the
clientip field through the fields drop-down menu, you can select the new Google this IP link from
the drop down next to the clientip field.
[some_custom_search]
metaKeys = ruser,rhost
term=authentication failure | filter ruser={$ruser} rhost={$rhost}
label=Search for other break in attempts by this user
alwaysReplace=true
This example enables you to run another search for authentication failures on the ruser and rhost
fields.
Learn more about field_actions.conf, including which other attribute/value pairs are available.
Configure fields.conf
Use fields.conf to configure how Splunk handles user-defined fields at index time and search time.
Edit fields.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application
directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see
how configuration files work.
Configure fields.conf to:
Tell Splunk how to handle multi-value fields.
Distinguish indexed and extracted fields.
Improve search performance by telling the search processor how to handle field values.
Configuration
[<field name>]
TOKENIZER = <regex>
INDEXED = true | false
INDEXED_VALUE = true | false
[<field name>]
Name of the field you're configuring.
Follow this stanza name with any number of the following attribute/value pairs.
TOKENIZER = <regular expression>
A regular expression that indicates how the field can take on multiple values at the same time.
Use this setting to configure multi-value fields.
If empty, the field can only take on a single value.
Otherwise, the first group is taken from each match to form the set of values.
This setting is used by search/where (the search command), the summary and XML outputs of
the asynchronous search API, and by the top, timeline and stats commands.
Defaults to empty.
INDEXED = true | false
Indicate whether a field is indexed or not.
Set to true if the field is indexed.
Set to false for fields extracted at search time (the majority of fields).
Defaults to false.
INDEXED_VALUE = true | false
Set indexed_value to true if the value is in the raw text of the event.
Set it to false if the value is not in the raw text of the event.
Setting this to true expands any search for key=value into a search of value AND key=value
(since value is indexed).
Defaults to true.
Note: You only need to set indexed_value if indexed = false.
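As an illustrative sketch (the field name is hypothetical; the regex is the email tokenizer used in the example file cited in the next topic), a fields.conf stanza for a search-time field that can hold multiple email addresses might look like this:
[recipients]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
INDEXED = false
INDEXED = false marks the field as extracted at search time, and TOKENIZER splits each extracted value into one value per email address. You would add INDEXED_VALUE = false only if the field's values do not appear as tokens in the raw event text.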
Configure multi-value fields
Configure multi-value fields in fields.conf to tell Splunk how to recognize more than one field
value in a single extracted field value. Edit fields.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Splunk parses multi-value fields at search time, and allows you to process the values in the search
pipeline. Learn which search commands support multi-value fields.
Learn more about using multi-value fields.
Configure multi-value fields via fields.conf
Define a multi-value field by adding a stanza for it in
$SPLUNK_HOME/etc/system/local/fields.conf. Tell Splunk how to parse values from a field
value by defining a regular expression with the tokenizer key.
Note: If you have other attributes to set for a field, set them in the same stanza underneath
tokenizer. See configure fields.conf for more information.
[<field name>]
tokenizer = $REGEX
[<field name>]
Set this to the name of the field you've defined in props.conf and transforms.conf.
Add indexed or extracted fields.
tokenizer
Define a regular expression to tell Splunk how to parse the field into multiple values.
Example
The following examples from $SPLUNK_HOME/etc/system/README/fields.conf.example
break the email fields To, From, and Cc into multiple values.
[To]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
[From]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
[Cc]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
Configure tags
Splunk stores tag information in the tags.conf configuration file. The tags.conf file enables you to
define tags directly in the configuration file. You can also use it to access and edit any tags you've
created through Splunk Web. The tags.conf file is located in
$SPLUNK_HOME/etc/system/local/. (For more information about managing tags through Splunk
Web, see the section on tags in the User Manual.)
With tags.conf, you can:
Edit the file to add and remove tags
Share tags among Splunk servers by copying tags.conf from one server to another
Use the deployment server to push tags to deployment clients
Back up your tags when you back up your configuration files
Disable default tags from applications without editing the applications
Note: Splunk doesn't allow the use of wildcards in any part of tags.conf. If you want to include
more than one host for tagging, save a search as an event type and tag it.
Configure tags with tags.conf files
When you first create tags in Splunk Web for your Splunk server, Splunk automatically creates a
tags.conf file in $SPLUNK_HOME/etc/system/local/. Any tags you create through Splunk
Web will show up in this primary tags.conf file.
If you use a Splunk application, you may want to define a separate set of tags that are specific to that
application. If that is the case, you need to manually create a tags.conf file in the folder for that
application in $SPLUNK_HOME/etc/apps/, and define the tags specific to that application within it.
Each Splunk application you use can have its own separate tags.conf file. Keep in mind that even
when you are using Splunk applications, tags you create through Splunk Web will always be added
by Splunk to the primary tags.conf file in $SPLUNK_HOME/etc/system/local/.
For more information on configuration files in general, see how configuration files work.
In the tags.conf file:
Stanzas group values for specific fields together, and tags are then associated with these
values
Each stanza line can contain only one tag, but you can use the same tag for multiple values
within a stanza
There can be any number of stanzas, but each stanza refers to just one field in your system
Each tag in the stanza must be either enabled or disabled
A stanza can contain any number of tags as long as there is only one tag per line
So the basic syntax of a tags.conf stanza is as follows:
[<field name>]
tag::<value>::<tag> = <enabled|disabled>
The following syntax example shows how you can apply multiple tags to a single field value and
associate specific tags with multiple field values:
[<field name>]
tag::<value1>::<tag1> = <enabled|disabled>
tag::<value1>::<tag2> = <enabled|disabled>
tag::<value2>::<tag2> = <enabled|disabled>
tag::<value2>::<tag3> = <enabled|disabled>
In the above syntax example, note that:
value1 and value2 are each associated with two tags
tag2 is associated with both value1 and value2
Examples
These examples illustrate how to create, edit, and disable tags in a tags.conf file.
Note: After you make changes to a tags.conf file you must restart Splunk to apply those changes.
Create or edit tags
To create a group of tags for the host field:
host="localhost" with tags local and dharma
host="hulk" with tags remote and linuxhost
All active tags for host are enabled.
[host]
tag::localhost::local= enabled
tag::localhost::dharma= enabled
tag::hulk::remote = enabled
tag::hulk::linuxhost = enabled
Note: You can also create tags using the tagcreate function in Splunk Web. For more information,
see the topic Manage tags with tagcreate and tagdelete.
Disable tags
To disable the local and dharma tags, change their entries from enabled to disabled:
[host]
tag::localhost::local = disabled
tag::localhost::dharma = disabled
tag::hulk::remote = enabled
tag::hulk::linuxhost=enabled
Note: You can also disable tags using the tagdelete function in Splunk Web. For more information,
see the topic Manage tags with tagcreate and tagdelete.
Automatic header-based field extraction
You can configure Splunk to extract fields automatically from data sources that contain headers.
Examples of sources that have headers are: CSV, TM3, or MS Exchange log files. To do this, use
automatic header-based field extraction.
How automatic header-based field extraction works
If you enable automatic header-based field extraction for a source or source type, Splunk scans that
source or source type for header information to use to extract fields. If a source has the necessary
information, Splunk extracts fields using delimiter-based key/value extraction.
Splunk does this by creating an entry in transforms.conf for the source, and populating it with
transforms to extract the fields. Splunk also adds a source type stanza to props.conf to tie the field
extraction transforms to the source. Splunk then applies the transforms to events from the source at
search time.
Note: Automatic header-based field extraction doesn't impact index size or indexing performance
because it occurs during source typing (before index time).
Once Splunk has extracted fields, you can use them for filtering and reporting just like any other field
by selecting them from the Fields picker in Splunk Web.
Configure automatic header-based field extraction
Configure automatic header-based field extraction for any source or source type by editing
props.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own custom
application directory in $SPLUNK_HOME/etc/apps/.
For more information on configuration files in general, see how configuration files work.
To turn on automatic header-based field extraction for a source or source type, add
CHECK_FOR_HEADER=TRUE under that source or source type's stanza in props.conf.
Important: If you have already defined a source type for the source for which you want to enable
automatic header-based field extraction, you must edit the stanza in inputs.conf and remove the
sourcetype = [name] before you set CHECK_FOR_HEADER=TRUE in props.conf so that it
doesn't conflict with the value that is generated by the automatic extraction.
Example props.conf entry for an MS Exchange source:
[MSExchange]
CHECK_FOR_HEADER=TRUE
...
Note: Set CHECK_FOR_HEADER=FALSE to turn off automatic header-based field extraction for a
source or source type.
Changes Splunk makes to configuration files
If you enable automatic header-based field extraction for a source or sourcetype, Splunk adds
information to copies of transforms.conf and props.conf in
$SPLUNK_HOME/etc/apps/learned/ when it extracts fields for that source or sourcetype.
Important: Don't edit this information afterward, or the extracted fields will not work.
Splunk creates a stanza in transforms.conf for each source type with unique header information
that matches a source type defined in props.conf. Splunk names each stanza it creates as
[AutoHeader-M], where M is an integer that increments sequentially for each source that has a
unique header ([AutoHeader-1], [AutoHeader-2],...,[AutoHeader-M]). Splunk
populates each stanza with transforms to extract the fields (using header information).
Example of a transforms.conf entry made automatically by Splunk for the MS Exchange source
mentioned above:
...
[AutoHeader-1]
FIELDS="time", "client-ip", "cs-method", "sc-status"
DELIMS=" "
...
Splunk then adds new source type stanzas to props.conf for each unique source. Splunk names
the stanzas as [yoursource-N], where yoursource is the source type configured with automatic
header-based field extraction, and N is an integer that increments sequentially for each transform in
transforms.conf.
Example props.conf entry using the MS Exchange file from the introduction:
# the original source you configured
[MSExchange]
CHECK_FOR_HEADER=TRUE
...
# source type that Splunk added to handle transforms for automatic header-based field extraction for the same source
[MSExchange-1]
REPORT-AutoHeader = AutoHeader-1
...
Note about search and header-based field extraction
To return all events that Splunk has typed with a source type it generated while running automatic
header-based field extraction, use a wildcard to search for all events of that source type.
A search for sourcetype="yoursource" looks like this:
sourcetype=yoursource*
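For instance, with the MS Exchange configuration shown in the examples below, a search like this returns events typed as both MSExchange and the generated MSExchange-1:
sourcetype=MSExchange*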
Examples
These examples show how header-based field extraction works with common source types.
MS Exchange source file
This example shows how Splunk extracts fields from an MS Exchange file using automatic
header-based field extraction.
This sample MS Exchange log file has a header containing a list of field names, delimited by spaces:
# Message Tracking Log File
# Exchange System Attendant Version 6.5.7638.1
# Fields: time client-ip cs-method sc-status
14:13:11 10.1.1.9 HELO 250
14:13:13 10.1.1.9 MAIL 250
14:13:19 10.1.1.9 RCPT 250
14:13:29 10.1.1.9 DATA 250
14:13:31 10.1.1.9 QUIT 240
Splunk creates a header and transform in transforms.conf:
[AutoHeader-1]
FIELDS="time", "client-ip", "cs-method", "sc-status"
DELIMS=" "
Splunk then ties the transform to the source by adding this to the source type stanza in props.conf:
# Original source type stanza you create
[MSExchange]
CHECK_FOR_HEADER=TRUE
...
# source type stanza that Splunk creates
[MSExchange-1]
REPORT-AutoHeader = AutoHeader-1
...
Splunk automatically extracts the following fields from each event:
14:13:11 10.1.1.9 HELO 250
time="14:13:11" client-ip="10.1.1.9" cs-method="HELO" sc-status="250"
14:13:13 10.1.1.9 MAIL 250
time="14:13:13" client-ip="10.1.1.9" cs-method="MAIL" sc-status="250"
14:13:19 10.1.1.9 RCPT 250
time="14:13:19" client-ip="10.1.1.9" cs-method="RCPT" sc-status="250"
14:13:29 10.1.1.9 DATA 250
time="14:13:29" client-ip="10.1.1.9" cs-method="DATA" sc-status="250"
14:13:31 10.1.1.9 QUIT 240
time="14:13:31" client-ip="10.1.1.9" cs-method="QUIT" sc-status="240"
CSV file
This example shows how Splunk extracts fields from a CSV file using automatic header-based field
extraction.
Example CSV file contents:
foo,bar,anotherfoo,anotherbar
100,21,this is a long file,nomore
200,22,wow,o rly?
300,12,ya rly!,no wai!
Splunk creates a header and transform in transforms.conf (located in:
$SPLUNK_HOME/etc/apps/learned/transforms.conf):
# Some previous automatic header-based field extraction
[AutoHeader-1]
...
# source type stanza that Splunk creates
[AutoHeader-2]
FIELDS="foo", "bar", "anotherfoo", "anotherbar"
DELIMS=","
Splunk then ties the transform to the source by adding this to a new source type stanza in
props.conf:
...
[CSV-1]
REPORT-AutoHeader = AutoHeader-2
...
Splunk extracts the following fields from each event:
100,21,this is a long file,nomore
foo="100" bar="21" anotherfoo="this is a long file"
anotherbar="nomore"
200,22,wow,o rly?
foo="200" bar="22" anotherfoo="wow" anotherbar="o rly?"
300,12,ya rly!,no wai!
foo="300" bar="12" anotherfoo="ya rly!" anotherbar="no wai!"
Hosts
How host works
An event's host value is the name of the physical device on the network where the event originates.
Host provides an easy way to find all data originating from a given device. Tagging hosts lets you find
data from a group of hosts with a common function or configuration. The value of host may be an IP
address, hostname, or fully qualified domain name. Splunk indexes and stores a host value for
every event it indexes.
How host is assigned
Default assignment
If no other host rules are specified for a source, host will be set to a default host value that applies to
all data coming via inputs on a given Splunk server. The default host value is the hostname or IP
address of the network host. When Splunk is running on the server where the event occurred (which
is the most common case) this is correct and no manual intervention is required.
Learn how to set a default host for a Splunk server.
Override host for remote archive files
If you are running Splunk on a central log archive, or you are working with files copied from other
hosts in the environment, you may need to override the default assignment. You can define host
assignment for an input based on either a custom host value for all data for that input or matching a
portion of the path or filename of a source, such as when you have a directory structure that
segregates the log archive for each host in a different subdirectory.
Centralized log server environment
In the case where there is a centralized log host sending events to Splunk, there may be many
servers involved. The central log server is called the reporting host. The system where the event
occurred is called the originating host (or just the host). In this case you will need to define rules to
extract host per event.
Host tagging
Tag a value of a host field to provide extra information to help you search. This helps you execute
more robust searches by allowing you to cluster multiple hosts into useful categories.
Configuration files for host
Set the values for host in inputs.conf. More advanced host extraction configurations require
changes to transforms.conf and props.conf. Before manually modifying any configuration file,
read about configuration files.
Set default host for a Splunk server
The host value of an event is the hostname or IP address of the network host which originated the
event. When Splunk is running on the server where the event occurred, the assignment of host is
straightforward. The default name is the host of the Splunk server. Host is added as a tag to all
events in Splunk's index.
via Splunk Web
Change the default host value via Splunk Web. Click on the Admin button in the upper right hand
corner. Select Server: View Settings. Change the Default host name under the Datastore section.
This sets the host tag for all events that don't receive any other host name.
via configuration files
This host assignment is written in inputs.conf during installation. Modify the host entry by editing
$SPLUNK_HOME/etc/system/local/inputs.conf.
This is the format of the host assignment in inputs.conf:
host = <string>
* This is a shortcut for MetaData:Host = <string>. It sets the host of
events from this input to be the specified string. "host::" is
automatically prepended to the value when this shortcut is used.
Set your own host value by changing the entry for <string>.
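For example, a minimal sketch (the hostname is hypothetical; this assumes the entry sits in the [default] stanza that Splunk writes at installation):
[default]
host = loghost01.example.com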
Define host assignment for an input
Use these instructions if you want to explicitly set a host value for all data coming in via a specific
configured input. Set host statically for every event in the same input, or dynamically with regex or
segment on the full path of the source. To assign a different host for different sources or sourcetypes
in the same input, extract host per event.
Statically
This method assigns the same host for every event for the input.
Also, this will only impact new data coming in via the input. If you need to correct the host displayed
in Splunk Web for data that has already been indexed, you will need to tag hosts instead.
via Splunk Web
Set host whenever you add a data input through the Data Inputs section of Splunk Web's Admin
interface.
Choose Constant value to assign a static value as host for each event that comes from your data
source. Enter the value for host in the DNS name or IP address box.
via configuration files
Edit inputs.conf to specify a host value. Include a host = attribute within the appropriate stanza in
$SPLUNK_HOME/etc/system/local/inputs.conf. Edit inputs.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Configuration
[<inputtype>://<path>]
host = $YOUR_HOST
sourcetype = $YOUR_SOURCETYPE
source = $YOUR_SOURCE
Learn more about input types.
Example
[tcp://10.1.1.10:9995]
host = webhead-1
sourcetype = access_common
source = //10.1.1.10/var/log/apache/access.log
This will set the host as "webhead-1" for any events coming from 10.1.1.10, on TCP port 9995.
Dynamically
Use this method if you want to extract the host name from a segment of the source input. For
example, if you have an archived directory you want to index, and the name of each file in the
directory contains relevant host information, you can use Splunk to extract this information and assign
it to the host field.
via Splunk Web
Follow the steps outlined above. However, instead of choosing Constant value, you can choose
either:
Regex on path: Choose this option if you want to extract the host name via a regular expression.
Enter the regular expression for host extraction in the regular expression box.
Segment in path: Choose this option if you want to extract the host name from a segment in your
data source's path. Enter the segment number in the segment # box.
via configuration files
You can set up dynamic host extraction rules when you are configuring inputs.conf. You can add
the following attribute/value pairs to override the host field.
host_regex = <regular expression>
If specified, the regular expression extracts the host from the filename of each input.
Specifically the first group of the regex is used as the host.
If the regex fails to match, the default host = attribute is set as the host.
host_segment = <integer>
If specified, the specified '/' separated segment of the path is set as the host of each input.
If the value is not an integer, or is less than 1, the default host = attribute is set as the host.
Examples
This example uses a regex on the file path to set the host:
[monitor:///var/log]
host_regex = /var/log/(\w+)
Events from /var/log/foo.log are given the hostname "foo".
This example uses a segment of the path to set the host:
[tail://apache/logs/]
host_segment = 3
sourcetype = access_common
This extracts the host name as the third segment in the path apache/logs.
Tag hosts
Tagging hosts is useful for knowledge capture and sharing, and for crafting more precise searches.
Hosts can be tagged with one or more words describing their function or type, enabling users to
easily search for all activity on a group of similar servers.
via Splunk Web
Use the drop down arrow next to the host field in Splunk Web to tag your hosts. Choose Edit tags for
this host.
Then, enter your tags, separated by commas.
Host tags vs host names
Host name is extracted at indexing time. Host tags can be added to any host for additional
information during searches. Each event can have only one host name, but multiple host tags.
For example, if your Splunk server is receiving compliance data from a specific host, tagging that host
with compliance will help your compliance searches. With host tags, you can create a loose
grouping of data without masking the underlying host name.
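As a sketch (the tag name comes from the compliance example above), a search restricted to hosts carrying that tag might look like:
hosttag=compliance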
Extract host per event
Use these instructions if you want to override the default host name that is assigned to your events.
Configuration
Configure a dynamically extracted host name for any source or sourcetype via transforms.conf and
props.conf. Edit these files in $SPLUNK_HOME/etc/system/local/, or your own custom
application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in
general, see how configuration files work.
transforms.conf
Add your custom stanza to $SPLUNK_HOME/etc/system/local/transforms.conf. Configure
your stanza as follows:
[$UNIQUE_STANZA_NAME]
DEST_KEY = MetaData:Host
REGEX = $YOUR_REGEX
FORMAT = host::$1
Fill in the stanza name and the regex fields with the correct values for your data.
Leave DEST_KEY = MetaData:Host to write a value to the host:: field. FORMAT = host::$1
writes the REGEX value into the host:: field.
Note: Name your stanza with a unique identifier (so it is not confused with a stanza in
$SPLUNK_HOME/etc/system/default/transforms.conf).
props.conf
Create a stanza in $SPLUNK_HOME/etc/system/local/props.conf to map the
transforms.conf regex to the source type in props.conf.
[<spec>]
TRANSFORMS-$name=$UNIQUE_STANZA_NAME
<spec> can be:
<sourcetype>, the sourcetype of an event.
host::<host>, where <host> is the host for an event.
source::<source>, where <source> is the source for an event.
$name is whatever unique identifier you want to give to your transform.
$UNIQUE_STANZA_NAME must match the stanza name of the transform you just created in
transforms.conf.
Note: Optionally add any other valid attribute/value pairs from props.conf when defining your stanza.
This assigns the attributes to the <spec> you have set. For example, if you have custom
line-breaking rules to set for the same <spec>, append those attributes to your stanza.
Example
The following logs contain the host in the third position.
41602046:53 accepted pearl
41602050:29 accepted swan
41602052:17 accepted pearl
Create a regex to extract the host value and add it to a new stanza in
$SPLUNK_HOME/etc/system/local/transforms.conf:
[station]
DEST_KEY = MetaData:Host
REGEX = \s(\w*)$
FORMAT = host::$1
Now, link your transforms.conf stanza to $SPLUNK_HOME/etc/system/local/props.conf
so your transforms are called. For example, the above transform works with the following stanza in
props.conf:
[source::.../hatch.log]
TRANSFORMS-dharma=station
SHOULD_LINEMERGE = false
The above stanza has the additional attribute/value pair SHOULD_LINEMERGE = false. This
specifies that Splunk should not merge lines and should instead create a new event at each newline.
Note: Optionally add any additional attribute/value pairs from props.conf as needed.
The events now appear in Splunk Web with the host value extracted per event.
Source Types
How source types work
A source type is any common format of data. sourcetype is one of Splunk's default fields (it's
indexed and stored with every event). It provides an easy way to find similar types of data from any
input. For example, you might search sourcetype=weblogic_stdout even though weblogic
might be logging from two different domains.
Source vs source type
Source is also one of Splunk's default fields, indexed and stored with every event as source. It
refers to any file, stream, or other input sending data to Splunk. For data coming from files and
directories, the value of source is the full path, such as
/archive/server1/var/log/messages.0 or /var/log/. The value of source for
network-based data sources is the protocol and port, such as UDP:514.
Different sources can have the same source type. For example, you may monitor
source=/var/log/messages and receive direct syslog input from udp:514. Find both by
searching for sourcetype=linux_syslog.
How Splunk can set sourcetype field values
Automatic source type classification
During indexing, Splunk classifies source types automatically by calculating signatures for patterns in
the first few thousand lines of any file or stream of network input. These signatures pick up things like
repeating patterns of words, punctuation patterns, line length, etc. Once Splunk has calculated a
signature, it compares the signature to previously seen signatures - if it's a radically new pattern,
Splunk creates a new source type. Learned pattern information is stored in sourcetypes.conf.
To configure your own automatic source type recognition, use Splunk's rule-based source type
feature. Rule-based source types are automatically assigned based on regular expressions you
specify in props.conf. Learn more about how to configure rule-based source types.
Rename source types
To assign new source type names, edit sourcetypes.conf. However, this only changes the name
of future data inputs. To change the source type for events that have already been indexed, create an
alias for a source type. Aliasing source types is a cosmetic change that allows users to search for
source type values that make sense.
Note: If you set indexing properties for a source type in props.conf, you must use the actual stored
source type value from sourcetypes.conf.
Train the source type auto-classifier
To customize source type names, use Splunk's auto-classifier with a set of representative example
files. If you train it with a wide enough range of files that you'd like to share the same source type, it
learns better rules. Then, Splunk's recognition improves for new indexed files of that source
type. Pre-training is how Splunk ships with the ability to assign sourcetype=syslog to most syslog
files.
You can bypass Splunk's auto-classification by skipping the training step and simply hardcoding a sourcetype for each
data input. However, training may still be more effective if you plan to have Splunk index entire
directories of mixed sourcetypes (such as /var/log). Learn how to train Splunk to recognize source
types.
If Splunk fails to recognize a common format, or applies an incorrect source type value, you should
report the problem to Splunk support and send us a sample file.
You can also anonymize your file using Splunk's built-in anonymizer.
Hard-coded source type assignment
Bypass automatic source type classification entirely and set a source type when you configure a data
input (see the topic on setting a source type for an input). However, this method is not very granular --
all data from the same host or source is assigned the same source type name.
If you need to give different sources within a single directory input different names, try setting source
type for a source.
How Splunk applies source type values (precedence)
You can either configure how Splunk applies source type values to events, or you can let Splunk
automatically apply them. The following list shows the methods Splunk uses to
apply source type values to events, in order of precedence:
1. Explicit specification of source type per input stanza in inputs.conf:
[monitor://$PATH]
sourcetype=$SOURCETYPE
2. Explicit specification of source type per source by creating a stanza in props.conf:
[$SOURCE]
sourcetype=$SOURCETYPE
3. Rule-based association of source types:
Allows you to match sources to source types using classification rules specified in rule:: stanzas in
props.conf.
4. Intelligent matching:
Matches similar-looking files and creates a source type.
5. Delayed rules:
Works like rule-based associations, except you create a [delayedrule:: ] stanza in
props.conf. This is a useful "catch-all" for source types, in case Splunk missed any.
6. Automatic source type learning:
Splunk creates new source types based on sources that don't already have source types associated
with them.
Configuration files for source types
Set source type for a source in inputs.conf. Configure custom indexing properties and rule-based
associations of source types via props.conf. Before manually modifying any configuration file, read
about configuration files.
Rule-based association of source types
Create rules to automatically assign source types in Splunk. Use props.conf to set source typing
rules. Edit props.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application
directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see
how configuration files work.
Configuration
Create a rule by adding a rule:: or delayedrule:: stanza to props.conf. Under the rule
stanza, declare the name of the source type. After the source type declaration, list the rules to assign
the source type. Rules are created based on a series of MORE_THAN, and LESS_THAN statements
that must be matched. The statements are regular expressions that must be matched by the specified
percentage of lines that match the regular expression. Any number of statements can be specified,
and all statements must match in order for the source to fit the source type rule.
Add the following to $SPLUNK_HOME/etc/system/local/props.conf:
[rule::$RULE_NAME] OR [delayedrule::$RULE_NAME]
sourcetype=$SOURCETYPE
MORE_THAN_<percentage> = $REGEX
LESS_THAN_<percentage> = $REGEX
Note: A rule can have many MORE_THAN and LESS_THAN patterns. All must be met in order for the
rule to match.
Each statement is evaluated against the percentage of lines that match its regular expression. A
statement is satisfied when that percentage is MORE_THAN or LESS_THAN the specified threshold.
Examples
The following examples come from $SPLUNK_HOME/etc/system/default.
postfix syslog files
# postfix_syslog sourcetype rule
[rule::postfix_syslog]
sourcetype = postfix_syslog
# If 80% of lines match this regex, then it must be this type
MORE_THAN_80=^\w{3} +\d+ \d\d:\d\d:\d\d .* postfix(/\w+)?\[\d+\]:
delayed rule for breakable text
# breaks text on ascii art and blanklines if more than 10% of lines have
# ascii art or blanklines, and less than 10% have timestamps
[delayedrule::breakable_text]
sourcetype = breakable_text
MORE_THAN_10 = (^(?:---|===|\*\*\*|___|=+=))|^\s*$
LESS_THAN_10 = [: ][012]?[0-9]:[0-5][0-9]
Set source type for an input
Use these instructions to explicitly set a source type for all data coming in via an input.
If you have a directory input (such as monitoring /var/log/), this method assigns the same source
type for every file in the directory. To assign different source types for each discrete source in the
same input directory, set source type for a source instead.
Note: This configuration only impacts new data coming in. To correct the source type displayed in
Splunk Web for data that has already been indexed, create an alias instead.
via Splunk Web
When you configure your data inputs through Splunk Web, you can hardcode a sourcetype.
Pick from a list of sourcetypes
If your source is one of Splunk's pre-trained source types, it's a good idea to pick the same name
Splunk would try to assign automatically. For a description of Splunk's pre-trained source types, see
the sourcetype reference page.
Choose From list from the set source type drop down.
Use a new source type name
Select Manual from the drop down menu at the bottom of the data input screen.
Input your source type name in the Source Type box.
Your events now have that sourcetype= value.
via configuration files
When you configure inputs via inputs.conf, you can set a sourcetype as well. Include a sourcetype
= attribute within the appropriate stanza in $SPLUNK_HOME/etc/system/local/inputs.conf:
[tcp://:9995]
connection_host = dns
sourcetype = log4j
source = tcp:9995
This sets any events coming from your TCP input on port 9995 as sourcetype=log4j.
Set source type for a source
Use these instructions to assign a source type based on a source via props.conf. Edit props.conf
in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Note: This only impacts new data coming in following your configuration change. If you want to
correct the source type displayed in Splunk Web for data that has already been indexed, create an
alias instead.
via configuration files
Add a stanza for your source in $SPLUNK_HOME/etc/system/local/props.conf and set a
sourcetype = attribute:
[source::.../var/log/anaconda.log(.\d+)?]
sourcetype = anaconda
This sets any events from sources containing the string /var/log/anaconda.log followed by any
number of numeric characters to sourcetype=anaconda.
Splunk recommends that your stanza source path regexes (such as
[source::.../web/....log]) be as specific as possible and it is HIGHLY recommended that
the regex does not end in "...". For example, don't do this:
[source::/home/fflanda/...]
sourcetype = mytype
This is dangerous because, with this configuration, gzip files in /home/fflanda will be processed as
mytype files rather than as gzip files.
It would be much better to write:
[source::/home/fflanda/....log(.\d+)?]
sourcetype = mytype
Learn more about props.conf.
Train Splunk to recognize a source type
Use these instructions to train Splunk to recognize a new source type, or give it new samples to
better recognize a pre-trained sourcetype. This enables Splunk to classify future files with similar
patterns as a specific source type.
Bypass auto-classification in favor of hardcoded configurations, and just set a sourcetype for an input,
or set a sourcetype for a source. Or set your own rules for source type association.
via the CLI
These commands assume that the splunk executable is in your PATH. If it is not, navigate to
$SPLUNK_HOME/bin and run the ./splunk command from there.
# splunk train sourcetype $FILE_NAME $SOURCETYPE_NAME
Fill in $FILE_NAME with the entire path to your file. $SOURCETYPE_NAME is the custom source type
you wish to create.
It's usually a good idea to train on a few different samples for any new source type so that Splunk
learns how varied a source type can be.
Source type settings in props.conf
There are source type specific settings in props.conf. Specify settings for a source type using the
following attribute/value pairs. Add a sourcetype stanza to props.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files, see how configuration files
work.
Note: The following attribute/value pairs can only be set for a stanza that begins with
[<$SOURCETYPE>]:
invalid_cause = <string>
Can only be set for a [<sourcetype>] stanza.
Splunk will not index any data with invalid_cause set.
Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
Set to any other string to throw an error in the splunkd.log if running Splunklogger in debug
mode.
Defaults to empty.
unarchive_cmd = <string>
Only called if invalid_cause is set to "archive".
<string> specifies the shell command to run to extract an archived source.
Must be a shell command that takes input on stdin and produces output on stdout.
DOES NOT WORK ON BATCH PROCESSED FILES. Use preprocessing_script.
Defaults to empty.
LEARN_MODEL = <true/false>
For known sourcetypes, the fileclassifier will add a model file to the learned directory.
To disable this behavior for diverse sourcetypes (such as sourcecode, where there is no good
exemplar to make a sourcetype) set LEARN_MODEL = false.
More specifically, set LEARN_MODEL to false if you can easily classify your source by
its name or a rule and there's nothing gained from trying to analyze the content.
Defaults to empty.
maxDist = <integer>
Determines how different a sourcetype model may be from the current file.
The larger the value, the more forgiving.
For example, if the value is very small (e.g., 10), then files of the specified sourcetype should
not vary much.

A larger value indicates that files of the given sourcetype vary quite a bit.
Defaults to 300.
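As an illustrative sketch (the stanza names and values are hypothetical), these settings might be combined in props.conf stanzas like the following:
[my_custom_type]
LEARN_MODEL = false
maxDist = 100

[my_archive_type]
invalid_cause = archive
unarchive_cmd = gzip -cd -
Here the second stanza routes matching sources to the archive processor, and gzip -cd - is a shell command that reads compressed data on stdin and writes the decompressed output to stdout.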
Configure a source type alias
Think of a source type alias as a tag for a value of the sourcetype field. Besides aliasing a source
type via Splunk Web, you can configure a source type alias in tags.conf the same way you
configure tags for a field.
In tags.conf you can:
Add new source type aliases by adding
tag::<sourcetype_value>::<sourcetype_alias>=enabled in the [sourcetype]
stanza (there should only be one such stanza in the tags.conf file--if it doesn't already exist
you can create it manually).
Enable and disable source type aliases by changing their values to enabled or disabled.
Note: You can only enter one source type alias (or tag) per line in a tags.conf stanza.
The following example shows a sample configuration of source type aliases (tags for values of the
sourcetype field). In this example, events from the access_common, cups_access, and syslog
source types are all aliased as FAIL. The syslog alias for the syslog source type is disabled.
[sourcetype]
tag::syslog::syslog = disabled
tag::access_common::FAIL = enabled
tag::cups_access::FAIL = enabled
tag::syslog::FAIL = enabled
If you search for sourcetype=FAIL with this configuration, your search will return events from the
access_common, cups_access, and syslog source types.
Event Types
How event types work
Event types are a categorization system to help you make sense of your data. Event types let you sift
through huge amounts of data, find similar patterns, and create alerts and reports.
Events versus event types
An event is a single record of activity within a log file. An event typically includes a timestamp and
provides information about what occurred on the system being monitored or logged.
An event type is a user-defined field that simplifies search by letting you categorize events. Event
types let you classify events that have common characteristics. When your search results come back,
they're checked against known event types. An event type is applied to an event at search time if that
event matches the event type definition in eventtypes.conf. Tag or save event types after
indexing your data.
Event type classification
There are several ways to create your own event types. Define event types via Splunk Web or
through configuration files, or you can save any search as an event type. When saving a search as
an event type, you may want to use the punct field to craft your searches. The punct field helps you
narrow down searches based on the structure of the event.
punct field
Because the format of an event is often unique to an event type, Splunk indexes the punctuation
characters of events as a field called punct. The punct field stores the first 30 punctuation
characters in the first line of the event. This field is useful for finding similar events quickly.
When you use punct, keep in mind:
Quotes and backslashes are escaped.
Spaces are replaced with an underscore (_).
Tabs are replaced with a "t".
Dashes that follow alphanumeric characters are ignored.
Interesting punctuation characters are:
",;-#$%&+./:=?@\\'|*\n\r\"(){}<>[]^!"
The punct field is not available for events in the _audit index because those events are signed
using PKI at the time they are generated.
Also see the Splunk Tutorial section about punct for a quick introduction.
punct examples
This event:
####<Jun 3, 2005 5:38:22 PM MDT> <Notice> <WebLogicServer> <bea03>
<asiAdminServer> <WrapperStartStopAppMain> <>WLS Kernel<> <> <BEA-000360>
<Server started in RUNNING mode>
Produces this punctuation:
####<_,__::__>_<>_<>_<>_<>_<>_
This event:
172.26.34.223 - - [01/Jul/2005:12:05:27 -0700] "GET
/trade/app?action=logout HTTP/1.1" 200 2953
Produces this punctuation:
..._-_-_[:::_-]_\"_?=_/.\"__
Event type discovery
Pipe any search to the new typelearner command and create event types directly from Splunk Web.
The file eventdiscoverer.conf is mostly deprecated, although you can still specify terms to ignore
when learning new event types in Splunk Web.
Learn more about event type discovery.
Create new event types
The simplest way to create a new event type is through Splunk Web. Save an event type much in the
same way you save a search. Learn more about saving event types.
Create new event types by modifying eventtypes.conf. Learn more about creating new event
types.
Event type tags
Tag event types to organize your data into categories. There can be multiple tags per event. Learn
more about tagging event types
Configuration files for event types
Event types are stored in eventtypes.conf.
Terms for event type discovery are set in eventdiscoverer.conf.
Save event types via Splunk Web
Most searches can be saved as an event type. There can be multiple event types for an event. You
cannot create an event type from a search that specifies an index, hosttag, eventtypetag, sourcetype, or
the pipe operator. Any event types you create through Splunk Web are automatically added to
$SPLUNK_HOME/etc/system/local/eventtypes.conf.
Configuration
To save a search as an event type:
Type the search in the search box.
Click the arrow to the left of the search box.
Click Save as event type...
The Save Event Type dialog box will pop up, pre-populated with your search terms.
Name the event type.
Optionally add an event type tag (you can add more than one tag, comma-separated).
Click the Save button.
You can now use your event type in searches:
eventtype=foo
Configure eventtypes.conf
Add your own event types by configuring eventtypes.conf. There are a few default event types
defined in $SPLUNK_HOME/etc/system/default/eventtypes.conf. Any event types you
create through Splunk Web are automatically added to
$SPLUNK_HOME/etc/system/local/eventtypes.conf.
Configuration
Make changes to event types in eventtypes.conf. Use
$SPLUNK_HOME/etc/system/README/eventtypes.conf.example as an example, or create
your own eventtypes.conf. Edit eventtypes.conf in $SPLUNK_HOME/etc/system/local/,
or your own custom application directory in $SPLUNK_HOME/etc/apps/. For more information on
configuration files in general, see how configuration files work.
[$EVENTTYPE]
Header for the event type
$EVENTTYPE is the name of your event type.
You can have any number of event types, each represented by a stanza and any
number of the following attribute/value pairs.
Note: If the name of the event type includes field names surrounded by the percent character
(e.g. "%$FIELD%") then the value of $FIELD is substituted into the event type name for that
event. For example, an event type with the header [cisco-%code%] that has "code=432"
becomes labeled "cisco-432".
search = <string>
Search terms for this event type.
For example: error OR warn.
tags = <string>
Space separated words that are used to tag an event type.
isglobal = <1 or 0>
Toggle whether event type is shared.
If isglobal is set to 1, everyone can see/use this event type.
Defaults to 1.
disabled = <1 or 0>
Toggle event type on or off.
Set to 1 to disable.
Example
[web]
search = html OR http OR https OR css OR htm OR html OR shtml OR xls OR cgi
[fatal]
search = FATAL
Disable event types
Disable specific event types by adding the following tag to
$SPLUNK_HOME/etc/system/local/eventtypes.conf:
[$EVENTTYPE]
disabled = 1
$EVENTTYPE is the name of the event type you wish to disable.
So if you want to disable the [web] event type, add the following entry to
../local/eventtypes.conf:
[web]
disabled = 1
Tag event types
Event types can be tagged to add information to your data. Any event type can have multiple tags.
For example, you can tag all firewall events as firewall, tag a subset as deny and tag another subset
as allow. Once an event type is tagged, any event type matching the tagged pattern will also be
tagged.
Note: An event type must be configured via eventtypes.conf or saved in order to be tagged.
Configuration
Tag an event type from within Splunk Web. Note: Make sure you have enabled the eventtype field
from the fields drop down menu.
Click on the drop-down arrow next to the eventtype field.
Select Tag event type.
The Tag This Field dialog box pops up.
Enter your tags and click save.
Once you have tagged an event type, you can search for it in the search bar using the eventtypetag
field.
eventtypetag=foo
Event type discovery
Instead of using auto-discovery at index time, use Splunk's new event type discovery at search time.
Use this feature to create custom event types directly in Splunk Web.
Configure event types
Configure event types with the typelearner command or by choosing Discover event types from
Splunk's drop-down menu.
Pipe any search to typelearner:
user=Hume | typelearner
Or choose the Discover event types... option from the Splunk drop-down menu (to the left of
the search box).

Now pick Add Event Type underneath the event you want to classify as a new event type.
This will launch a new window where you can label and tag your event type.
Event type templates
Create an event type based on a field via eventtypes.conf. Edit eventtypes.conf in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
For example:
[$NAME %$FIELD%]
search = $SEARCH_QUERY
Event type templates work a lot like macro searches: %$FIELD% gets filled in at search time with
field=foo or field=bar, etc -- whatever the search query yields for that event type's field.
Configuration
When setting the name in eventtypes.conf, follow these specifications:
[$EVENTTYPE]
Header for the event type
$EVENTTYPE is the name of your event type.
You can have any number of event types, each represented by a stanza and any number of
the following attribute/value pairs.
NOTE: If the name of the event type includes field names surrounded by the percent
character (e.g. "%$FIELD%") then the value of $FIELD is substituted into the event type
name for that event.
Example
[cisco-%code%]
search = cisco
If "code=432", this event type becomes "cisco-432".
Dynamic event rendering
Dynamic event rendering, or decoration, uses CSS to set how different types of events (including
audit events) are displayed in Splunk Web based on criteria that you define. Add a text label or
change the background color of an event in Splunk Web.
How dynamic event rendering works
Events displayed in Splunk Web as search results are decorated with CSS styles based on what
audit event they represent, or what event type they are. If you have enabled auditing, Splunk
identifies different audit events by default and populates the field _decorations with a string that
represents the type of audit event that occurred.
The following is a list of audit event types:
audit_valid: the event is valid.
audit_gap: there is a gap between events that may indicate tampering.
audit_tampered: an event that has been tampered with.
audit_cantvalidate: tags events that can't be validated.
If you have not enabled auditing, the _decorations field is empty. Use any criteria you want to
decorate an event by setting decorations for event types.
Event decorations
To set how events are decorated, edit the relevant CSS in
$SPLUNK_HOME/share/splunk/search_oxiclean/static/css/default.css. If you want
unique decorations for events displayed in the Splunk basic and black skins (or your custom skin),
specify decorations in the respective CSS files in
$SPLUNK_HOME/share/splunk/search_oxiclean/static/css/skins/ as well. If
you use either the basic or black skins and don't specify a decoration for a given event type, Splunk
Web uses the value from default.css.
Splunk comes with CSS styles predefined to:
add a text box to an event
change the background color behind the text of an event
Define any number of additional styles in the relevant Splunk Web style sheet.
Configure dynamic event rendering
Once you have defined CSS, specify which audit events and event types you want to decorate by
configuring prefs.conf. Create your own prefs.conf and copy it into your own custom
application in $SPLUNK_HOME/etc/apps/. Do not edit the copy in
$SPLUNK_HOME/etc/system/default/.
Enable or disable dynamic event rendering
Turn dynamic event rendering on or off using the decoration_enabled key in prefs.conf. This
key is boolean; when set to true, dynamic event rendering is turned ON.
decoration_enabled = True
Specify events to be rendered
You don't have to put decoration entries into prefs.conf in any specific order or stanza structure.
Entries are identified by the keys themselves. So any entry in prefs.conf that begins with
decoration_$EVENT is read as a key for an event decoration. To specify what you want to
decorate, set $EVENT to match your audit event or event type name.
The following is an example of an audit event decoration for valid events. It uses the classes defined
in the CSS stylesheet to display a text label for all valid events:
decoration_audit_valid = {"align": "top", "wrapperclass": "defaultDecorationWrapperclass", "textclass": "auditValidTextclass", "text": "Valid." }
This is a similar audit event decoration for events that were tampered with:
decoration_audit_tampered = {"align": "top", "wrapperclass": "defaultDecorationWrapperclass", "textclass": "auditTamperedTextclass", "text": "Tampered!"}
This is an event decoration for adding a text label to a diff event:
decoration_diff = {"align": "top", "wrapperclass": "diffHeaderWrapperclass", "text": "<pre>diff x y compares x to y<br/>- indicates a line present in x but missing in y<br/>+ indicates a line present in y but missing in x<br/>! indicates a line that exists in both x and y, but contains different information
" }
</pre>
Example
Here is a step-by-step example for configuring a custom decoration for a new event type. (To create
decorations for an existing eventtype, skip the first step.)
1. Add to $SPLUNK_HOME/etc/system/local/eventtypes.conf:
[non-auth]
search = * Failed authentication
2. Add the following to
$SPLUNK_HOME/share/splunk/search_oxiclean/static/css/default.css (make a
backup first):
.iErrorTextclass {
padding-left: 20px;
padding-top: 3px;
padding-bottom: 3px;
color: #A22;
}
.iErrorRowclass {
background-color: #FAA !important;
}
3. Add to $SPLUNK_HOME/etc/system/local/prefs.conf:
decoration_non-auth={"align":"top","wrapperclass":"defaultDecorationWrapperclass","textclass":"iErrorTextclass", "text":"Intruder Alert."}
4. Add to $SPLUNK_HOME/etc/system/local/decorations.conf
non-auth=decoration_non-auth
5. Restart Splunk and clear your browser cache.
6. To see your new custom events, make sure you have the eventtype field enabled.
Transaction Types
How transactions work
A transaction is any group of conceptually related events that spans time. A transaction type is a
configured transaction, saved as a field in Splunk. Any number of data sources can generate
transactions over multiple log entries.
For example, a customer shopping in an on-line store could generate a transaction across multiple
sources. Web access events might share a session ID with the event in the application server log; the
application server log might contain the account ID, transaction ID, and product ID; the transaction ID
may live in the message queue with a message ID, and the fulfillment application may log the
message ID along with the shipping status. All of this data represents a single user transaction.
Here are some other examples of transactions:
Web access events
Application server events
Business transactions
E-mails
Security violations
System failures
Transaction search
Transaction search is useful for a single observation of any physical event stretching over multiple
logged events. Use the transaction command to define a transaction or override transaction
options specified in transactiontypes.conf.
To learn more, read about transaction search.
Configure transaction types
You may want to persist the transaction search you've created. Or you might want to create a lasting
transaction type. You can save transactions by editing transactiontypes.conf. Define transactions by
creating a stanza and listing specifications.
To learn more, read about configuring transaction types.
Transaction types via configuration files
Any series of events can be turned into a transaction type. Read more about use cases in how
transaction types work.
You can create transaction types via transactiontypes.conf. See below for configuration details. For
more information on configuration files in general, see how configuration files work.
Configuration
1. Create a transactiontypes.conf file in $SPLUNK_HOME/etc/system/local/, or your own
custom application directory in $SPLUNK_HOME/etc/apps/.
2. Define transactions by creating a stanza and listing specifications for each transaction within its
stanza. Use the following attributes:
[<transactiontype>]
maxspan = [<integer> s|m|h|d]
maxpause = [<integer> s|m|h|d]
maxrepeats = <integer>
fields = <comma-separated list of fields>
exclusive = <true | false>
aliases = <comma-separated list of alias=event_type>
pattern = <ordered pattern of named aliases>
match = closest
[<TRANSACTIONTYPE>]
Create any number of transaction types, each represented by a stanza name and any number
of the following attribute/value pairs.

Use the stanza name, [<TRANSACTIONTYPE>], to search for the transaction in Splunk Web.
If you do not specify an entry for each of the following attributes, Splunk uses the default value.
maxspan = [<integer> s|m|h|d]
Set the maximum time span for the transaction.
Can be in seconds, minutes, hours or days.
For example: 5s, 6m, 12h or 30d.

If there is no "pattern" set (below), defaults to 5m. Otherwise, defaults to -1 (unlimited).
maxpause = [<integer> s|m|h|d]
Set the maximum pause between the events in a transaction.
Can be in seconds, minutes, hours or days.
For example: 5s, 6m, 12h or 30d.

If there is no "pattern" set (below), defaults to 2s. Otherwise, defaults to -1 (unlimited).
maxrepeats = <integer>
Set the maximum number of repeated event types to match against pattern (see below).
For example, if maxrepeats is 10, and there are 100 events in a row, all with the same
eventtype, only the first and last 10 are matched against pattern.

A negative value means no limit on repeats, but can possibly cause memory problems.
Defaults to 10.
fields = <comma-separated list of fields>
If set, each event must have the same field(s) to be considered part of the same transaction.
Defaults to "".
exclusive = <true | false>
Toggle whether events can be in multiple transactions, or 'exclusive' to a single transaction.
Applies to 'fields' (above).
For example, if fields=url,cookie, and exclusive=false, then an event with a 'cookie',
but not a 'url' value could be in multiple transactions that share the same 'cookie', but have
different URLs.

Setting to 'false' causes the matcher to look for multiple matches for each event and
approximately doubles the processing time.

Defaults to "true".
aliases = <comma-separated list of alias=event_type>
Define a short-hand alias for an eventtype to be used in pattern (below).
For example, A=login, B=purchase, C=logout means "A" is equal to eventtype=login,
"B" to "purchase", "C" to "logout".

Defaults to "".
pattern = <regular expression-like pattern>
Defines the pattern of event types in events making up the transaction.
Uses aliases to refer to eventtypes.
For example, "A, B, B, C" means this transaction consists of a "login" event, followed by a two
"purchase" events, and followed by a "logout" event.

Defaults to "".
match = closest
Specify the match type to use.
Currently, the only value supported is "closest."
Defaults to "closest."
3. Use the transaction command in Splunk Web to call your defined transaction (by its transaction
type name). You can override configuration specifics during search. Read more about transaction
search.
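For example, here is a sketch of a transactiontypes.conf stanza that groups the login, purchase, and
logout event types used in the aliases example above into a single transaction type (the stanza name
and the userid field are illustrative only):
[purchase-flow]
maxspan = 1h
maxpause = 10m
fields = userid
aliases = A=login, B=purchase, C=logout
pattern = A, B, C
match = closest
You could then pipe a search to transaction purchase-flow in Splunk Web to return these transactions.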
Transaction search
Search for transactions using the transaction search command either in Splunk Web or at the CLI.
The transaction command yields groupings of events which may then be used in reports. To use
transaction, either call a transaction type (that you configured via transactiontypes.conf), or define
transaction constraints in your search by setting the search options of transaction.
Search options
Transactions returned at search time consist of the raw text of each event, the shared event types,
and the field values. Transactions also have additional data that is stored in the fields: duration and
transactiontype. duration contains the duration of the transaction (the difference between the
timestamps of the first and last events of the transaction). transactiontype is the name of the
transaction (defined in transactiontypes.conf by the transaction's stanza name).
You may add transaction to any search. For best search performance, craft your search and then
pipe it to the transaction command.
Follow the transaction command with the following options. Note: Some options do not work with
others.
aliases=<comma-separated list of alias=event_type>
Define a short-hand alias for eventtypes to be used in pattern (below).
For example, aliases="A=sendmail-from, B=sendmail-to".
This means A stands for eventtype=sendmail-from.
Read more about eventtypes.

Note: You cannot use startswith and endswith (below) when using aliases.
pattern=<quoted regular expression-like pattern>
Defines the pattern of event types in events making up the transaction.
Use the aliases you defined (above).
For example, aliases="A=sendmail-from, B=sendmail-to" pattern="A, B"
fields=<quoted comma-separated list of fields>
If set, each event must have the same field(s) to be considered part of the same transaction.
Specify multiple fields in quotes, e.g. fields="field1, field2"
Events with common field names and different values will not be grouped.
For example, if fields=host, then a search result that has "host=mylaptop" can never be
in the same transaction as a search result with "host=myserver".

A search result that has no "host" value can be in a transaction with a result that has
"host=mylaptop".


Note: When specifying more than one field, you must quote all the fields, like this:
transaction fields="host,thread"

match=closest
Specify the matching type to use with a transaction definition.
The only value supported currently is closest.
maxspan=[<integer> s|m|h|d]
Set the maximum total time span for the transaction.
Can be in seconds, minutes, hours or days.
For example: 5s, 6m, 12h or 30d.

If there is no "pattern" set (below), defaults to 5m. Otherwise, defaults to -1 (unlimited).
maxpause=[<integer> s|m|h|d]
Specifies the maximum pause between the events in a transaction.
Requires there be no pause between a transaction's events greater than maxpause.
If the value is negative, the maxpause constraint is disabled.
The default maxpause is 2 seconds. If a pattern constraint is specified, the default maxpause
is -1 (disabled).

startswith=<string>
Specify a SQLite expression that must be true to begin a transaction.
Strings must be quoted with " ".
You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term.
This syntax refers to an event type name, not an event string itself
endswith=<quoted string>
Specify a SQLite expression that must be true to end a transaction.
Strings must be quoted with " ".
You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term.
This syntax refers to an event type name, not an event string itself
Transactions and macro search
Transactions and macro search are a powerful combination that allow substitution into your
transaction searches. Make a transaction search and then save it with $field$ to allow substitution.
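For example, a sketch of a macro-style transaction search (the saved search name clienttrans and the
clientip value are illustrative):
sourcetype=access_combined clientip="$clientip$" | transaction fields=clientip maxpause=5m
Save this search as clienttrans, then call it with the savedsearch command and substitute a value for
the macro field:
|savedsearch clienttrans clientip=10.1.1.1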
Example transaction searches
Run a search that groups together all of the pages a single user (or client IP address) looked
at over a time range.
This search takes events from the access logs, and creates a transaction from events that share the
same clientip value that occurred within 5 minutes of each other (within a 3 hour time span).
sourcetype=access_combined | transaction fields=clientip maxpause=5m maxspan=3h
Search
How search works
Splunk includes a powerful search language for crafting simple to sophisticated searches. To learn
more about Splunk's search syntax, see the User Manual search reference section. This section
describes how to administer searches, including various configuration options for saved searches.
Saved searches
Once you have set up a search, you can save it for reuse as a saved search. Splunk ships with a
few pre-configured saved searches. These are listed on the bottom of the landing page in Splunk
Web.
Splunk administrators can create saved searches to distribute to all their Splunk users. Learn more
about creating saved searches, either via Splunk Web or via savedsearches.conf.
Saving searches allows for knowledge capture and sharing. You can share any saved search or save
it as private. Shared and personally owned private saved searches appear by default on the bottom
of the user's landing page.
Form search and Macro search
Form searches and macro searches are wrappers for saved searches. They work just like saved
searches, but take variables at search time. So you can set up a search for others to use. They won't
have to see the search at all -- they simply input the variable(s) they're interested in finding.
Macro searches are saved searches with variables. Fill in the variables at search time.
Form searches work just like macro searches, but include an additional interface for searching.
Alerting
Set any saved search to run on a specific schedule, trigger alerts, send emails or RSS feeds. Read
more about alerts here
Live tail
Use Live tail to watch data streaming into Splunk. Live tail works just like tail -f in *nix systems.
Learn more about live tail.
Summary indexing
Summary indexing provides support for greater efficiency when running reports on large datasets
over large time spans. Summary indexing saves the results of a scheduled search into a special
summary index that you designate. You can then search and run reports on this smaller, restricted
index instead of working with the much larger original data set.
Use summary indexing to:
Aggregate results.
Generate statistics.
Index rare original events into a smaller index for more efficient reporting.
For example, you may want to run a report at the end of every month that tells you how many page
views and visitors each of your Web sites had, broken out by site. If you just run this report at the end
of the month, it could take a very long time to run because Splunk has to look through a great deal of
data to extract the information you want. However, if you use summary indexing, you schedule a
saved search that runs periodically over smaller slices of time and Splunk saves the results (since the
last time the report was run) into a special (summary) index. You can then run an "end of the month"
report on the data indexed in this much smaller index.
Or, you may want to run a report that shows a running count of a statistic over a long period of time.
For example, you may want a running count of downloads of a file from a Web site you manage.
Schedule a saved search to return the total number of downloads over a specified slice of time. Use
summary indexing to have Splunk save the results into a summary index. You can then run a report
any time you want on the data in the summary index to obtain the latest count of the total number of
downloads.
Learn more about Summary indexing.
Set up saved searches via Splunk Web
Turn any search into a saved search via Splunk Web. Just craft a search and use the built-in Save
search screen to set values for the search. You can also create saved searches via
savedsearches.conf.
Note: Running many complex, long-running searches may slow down your Splunk instance. Make sure you
optimize your searches before saving them in a saved search. You can also use summary indexing to
optimize long running searches.
Save your search
Refine the search until you consider it worthy. If you want to limit your search to a specific time
period, add a modifier such as daysago::1 or hoursago::4. See the search reference.
1. Click on the drop-down arrow next to the search bar:
2. Select Save search...
3. Then, fill in the options presented on the save search screen.
4. Give your saved search a name.
5. Pick a role to share your search with, or leave the drop down as Don't share.
6. Optionally add the saved search to any existing dashboard.
7. Click the Save button.
Note: All admin level users see all saved searches, whether the user who created it explicitly shared
it or not.
Edit saved searches at any time by clicking on the Admin link in the upper right hand corner. Select
the Saved Searches link.
Schedule a saved search
Optionally schedule your Saved Search to run on a schedule by clicking the Schedules & Alerts link.
Note: Too many searches running too often can slow down the server.
1. Check the box Run this search on a schedule.
2. Choose either Basic or Cron scheduling:
Basic lets you choose from predefined schedule options.
Use Cron to specify cron-style scheduling.
Caution: Splunk implements cron differently than standard POSIX cron. Use the */n
as "divide by n" (instead of crontab's "every n").

For example, enter */3 * * * 1-5 to run your search every twenty minutes, Monday
through Friday.


Here are some other Splunk cron examples:
"*/12 * * * *" : "Every 5 minutes"
"10,40 * * * *" : "Every 30 minutes, at 10 and 40 minutes after the hour"
"0 0,12 * * *" : "Every 12 hours, at midnight and noon"
After you've scheduled your search, you can configure it to send alerts. To turn your search into an
alert, see set up alerts via Splunk Web.
Set up saved searches via savedsearches.conf
Configure saved searches with savedsearches.conf. Use the
$SPLUNK_HOME/etc/system/README/savedsearches.conf.example as an example, or
create your own savedsearches.conf. Make any changes in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
To turn your saved search into an alert, see set up alerts via savedsearches.conf.
Configuration
Edit $SPLUNK_HOME/etc/system/local/savedsearches.conf to create a saved search. A
savedsearches.conf stanza looks like:
[<Splunk name>]
attribute1 = val1
attribute2 = val2
There are several attribute/value pairs available for savedsearches.conf. The following pairs may
be used to create a saved search.
search = <string>
Actual query terms of the saved search.
For example index::sampledata http NOT 500.

Your query can include macro searches for substitution.
To create a macro search, read the documentation on macro search.
role = <string>
Role (from authorize.conf) that this saved search is shared with.
Anyone that is a member of that role will see the saved search in their dashboard.
userid = <integer>
UserId of the user who created this saved search.
Splunk needs this information to log who ran the search, and create editing capabilities
in Splunk Web.


Possible values: Any Splunk user ID.
User IDs are found in $SPLUNK_HOME/etc/passwd.
Look for the first number on each line, right before the username.
For example 2:penelope....

Example
This example search is called j_client_ip and runs the search host="j_apache" | top
limit=100 clientip. It's shared with the Admin role -- role is set to 'Admin.'
[j_client_ip]
search = host="j_apache" | top limit=100 clientip
role = Admin
userid = 1
Note: In versions 3.2 and above, saved searches set to run on a schedule don't show a nextrun time
in savedsearches.conf.
Create a form search
Create a form search the same way you create a saved search, with these additional steps:
Decide which parts of the search to turn into variables.
Edit the search so that the parts you want to turn into form fields are surrounded with dollar signs ($).
For example, the variable foo appears in the saved search as:
$foo$
When the saved search is opened, $foo$ is presented as a form field for the user to fill in.
Form searches with fields
Create form searches for indexed and extracted fields.
Preface your form field with the field name and surround the form field with quotes.
For example:
index=_internal AND sourcetype=splunkd
can be made into a general (form) search for any sourcetype by adding sourcetype after the
indexed field name and surrounding it with dollar signs:
index=_internal AND sourcetype="$sourcetype$"
Save this search as Daily indexing volume; a user running the search sees a form field for entering a sourcetype.
Form searches with predefined values
You can also specify form searches that have a list of valid values. The form generated will show a
drop-down list. For example, the search
sourcetype=_trade_entry AND TradeID="$Trade ID$" AND TradeType $TradeType=Accepted,Rejected,Hold$
This search limits TradeType to three values and presents them in a drop-down:
Valid values can also come from an external source. For example:
$user={/static/html/imap.users}$
Note: The external source must be accessible as a URL from the local domain. The file should live in
$SPLUNK_HOME/share/splunk/search_oxiclean/static/html, should be a plain text file
and contain the values that you want to show in the drop-down list in the following format:
['value1','value2','value3','value4']
Share your form search
Once you have refined your search, you can distribute it to your users.
Save it
Save your search via the drop-down arrow next to the search box.
From within the form search interface, click show as text to return to the search
box.

You can share your saved search with all users.

Permalink it
Once you have saved a search, you can permalink to the form search box.
View the saved search in the form view mode, and click the permalink option above the
form search box. This creates a permalink URL that you can send to other Splunk
users.


Macro searches
Macro searches are a powerful new feature for saved searches. Save searches with macro fields,
which are values you set at search time. You can create a sophisticated saved search with as many
macro fields as you like.
Use macro searches in Splunk Web or in Splunk's CLI. Macro searches work similarly to form
searches, except there is no graphical user interface component.
Configure a macro search
Create a saved search. Use $TERM$ to specify a macro field for substitution. You can specify
any number of macro fields.

host=swan OR host=pearl $user$ $trans$
Save the search and name it. The following example calls the search usertrans.
Call your saved search with the savedsearch command. Enter the values to substitute for
the macro fields specified in the saved search usertrans. You can specify key value pairs from

search or extracted fields, or any other value in your data.
|savedsearch usertrans user=KateAusten trans=query
Note: Use the "I" (pipe) operator before the savedsearch command.
The macro search above is equivalent to this search:
host=swan OR host=pearl user=KateAusten trans=query
Configure summary indexing
For a general overview of summary indexing and instructions for setting up summary indexing
through Splunk Web, see the topic Increase reporting efficiency with summary indexing in the Users
Manual.
You can't manually configure a summary index for a search in savedsearches.conf until the
search:
Is saved
Is scheduled
Has an alert configured for it
Is enabled
Note: You must configure an alert for your saved search if you want to use it in conjunction with a
summary index; if you do not, the search will run but it won't populate the summary index.
When you perform these steps through Splunk Web, the system generates an index for you when
you enable the summary index for the saved, scheduled, alarm-set search. The index will have the
same name as the saved search. At this point you can manually configure summary indexing for the
saved search.
For details about using Splunk Web to perform these actions for searches, see the Save, schedule,
set alerts, and enable summary indexing topic in the User Manual.
Alternatively, you can use the addinfo and collect search commands to create a search that will
be saved and scheduled, and which will populate a pre-created summary index. For more information
about that method, see "Manually populate the summary index" in this topic.
Note: Indexing events in a summary index counts against your license volume. We recommend that
you not index more events in your summary indexes than you really need. Consult Splunk support for
specific information on license volume impact.
Customize summary indexing for a saved, scheduled, alert-configured search
When you use Splunk Web to enable summary indexing for a scheduled saved search, Splunk
automatically generates a stanza in
$SPLUNK_HOME/etc/system/local/savedsearches.conf. You can customize summary
indexing for the saved search by editing this stanza.
[ < name > ]
action.summary_index = < 1 | 0 >
action.summary_index._name = <string>
action.summary_index.<field> = <string>
[<name>]: Splunk names the stanza based on the name of the saved, scheduled,
alert-configured search for which you enabled summary indexing.

action.summary_index =: Set to 1 to enable summary indexing. Set to 0 to disable
summary indexing.

action.summary_index._name =: The name of the summary index to which the search results are added.

action.summary_index.<field> = <string>: Specify a field/string pair to add to every
search result indexed in the summary index.

Note: This field/string pair acts as a "tag" of sorts that makes it easier for you to identify the
events that go into the summary index when you are performing searches amongst the greater
population of event data. This key is optional but we recommend that you never set up a
summary index without at least one field/string pair.

Manually populate a manually created summary index
If you want to configure summary indexing without using the search options dialog in Splunk Web,
you must first configure a summary index just like you would any other index via indexes.conf. For
more information about manual index configuration, see the topic How indexing works in this
manual.
Important: You must restart Splunk for changes in indexes.conf to take effect.
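For example, here is a sketch of an indexes.conf stanza that defines a summary index named summary.
The homePath, coldPath, and thawedPath attributes and the paths shown are assumptions based on how the
default index stanzas are defined; adjust them for your deployment:
[summary]
homePath = $SPLUNK_DB/summarydb/db
coldPath = $SPLUNK_DB/summarydb/colddb
thawedPath = $SPLUNK_DB/summarydb/thaweddb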
1. Run a search that you want to summarize results from in the Splunk Web search bar.
Be sure to limit the time range of your search. The number of results that your search
generates needs to fit within the maximum search result limits you have set for searching.

Make sure to choose a time interval that works for your data, such as 10 minutes, 2 hours, or 1
day. (For more information about setting intervals through the search bar, see the Schedule a
search subtopic in the User Manual.)

2. Use the addinfo search command. Append | addinfo to the end of your search.
This command adds information about the search to events that the collect command requires
in order to place them into a summary index.

You can always add | addinfo to any search to preview what the results of a search will
look like in a summary index.

3. Add the collect search command. Append |collect index=<index_name> addtime
marker="info_search_name=\"<summary_search_name>\"" to the end of the search.
Replace index_name with the name of the summary index
Replace summary_search_name with a key to find the results of this search in the index.
A summary_search_name *must* be set if you wish to use the overlap search command on
the generated events.
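Putting these steps together, a sketch of a complete search that populates a summary index named
summary (the search terms and the info_search_name value are illustrative):
sourcetype=access_combined startminutesago=30 | stats count by host | addinfo | collect index=summary addtime marker="info_search_name=\"hourly host counts\""
Save and schedule this search so that it runs over each new slice of time.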

Note: For the general case we recommend that you use the provided summary_index alert action.
Configuring via addinfo and collect requires some redundant steps that are not needed when
generating summary index events from scheduled searches. Manual configuration remains
necessary when backfilling a summary index for time ranges that have already transpired.
Manually configure a search to populate a summary index
If you've used Splunk Web to save, schedule, and configure an alert for a search, but haven't used
Splunk Web to enable the summary index for the search, you can easily enable summary indexing for
the saved search through savedsearches.conf as long as you have a new index for it to populate.
For more information about manual index configuration, see the topic How indexing works in this
manual.
Add the following keys to $SPLUNK_HOME/etc/system/local/savedsearches.conf:
action.summary_index = <1 | 0>: Set to 1 to enable summary indexing for a saved search.
action.summary_index._name = <string>: Add the name of the summary index you created in
step 1.

Add additional data to your events going into a summary index using this key:
action.summary_index.<field> = <string>: Add additional field/value pairs to events going
into your summary index. Add as many as you like.

Example of a summary index configuration
This example shows a configuration for a summary index of Web statistics as it might appear in
savedsearches.conf. The keys listed below enable summary indexing for the saved search
"Apache Method Summary", and append the field report with a value of "count by method" to every
event going into the summary index.
#name of the saved search = Apache Method Summary
[Apache Method Summary]
# sets the search to run at each search interval
counttype = always
# enable the search schedule
enableSched = 1
# search interval in cron notation (this means "every 5 minutes")
schedule = */12 * * * *
# id of user for saved search
userid = jsmith
# search string for summary index
search = index=apache_raw startminutesago=30 endminutesago=25 | extract auto=false | stats count by method
# enable summary indexing
action.summary_index = 1
#name of summary index to which search results are added
action.summary_index._name = summary
# add these keys to each event
action.summary_index.report = "count by method"
Other configuration files affected by summary indexing
In addition to the settings you configure in savedsearches.conf, there are also settings for summary
indexing in indexes.conf and alert_actions.conf.
Indexes.conf specifies index configuration for the summary index. Alert_actions.conf controls the alert
actions (including summary indexing) associated with saved searches.
Caution: Do not edit settings in alert_actions.conf without explicit instructions from Splunk
staff.
Live tail
Live tail for Splunk Web lets you watch data streaming into Splunk. Search for any text in data as it is
indexed into Splunk. Live tail streams data to the browser based on a simple text search.
Live tail has a variety of uses. Some of the more common use cases are:
Passive monitoring
If you want to know the moment specific events occur in your environment.

Troubleshooting
Set up live tail to search for a particular type of event and set it to monitor your
environment.

Change your environment and monitor the effects in the live tail stream.
For example, send an email and see whether it passes your spam filter.

Use live tail in Splunk Web
Live tail launches in a new window (or new tab - depending on your browser configuration). The live
tail processor takes the search terms you input (before they get piped to data processing commands),
creates a search based on those, and streams search results to your browser.
To start live tail, select View in live tail menu item in the search bar drop-down menu.
The live tail interface
Overview of controls in the live tail window:
The search box:
Enter your search terms here.

The green button:
Clicking on the green button opens a new stream based on the search terms you
entered in the search box.

Each time you click on the green button, you launch a new stream based on your
search terms.


ctrl-c: Pressing ctrl-c terminates the current stream (just like with tail -f in a Linux or Unix
shell).
Note: Currently, ctrl-c is the only implemented tail -f Linux/Unix shell feature.

Wrap results check box:
Wraps the search results.
Functions similarly to the wrap results check box in the main window of Splunk
Web.


Pressing the Enter key anywhere outside the search box inserts a new line in the displayed
stream.

Use ctrl + shift + b to pause or un-pause live tail.
On a Mac, use cmd + shift + b.

Note: To increase the text size of live tail, increase your browser's text display size.
Start live tail from the Splunk CLI
1. Log into Splunk: ./splunk login
2. Use the live-tail CLI command to start live tail.
3. Type: ./splunk live-tail "your search string", where "your search string" is
whatever simple search terms you want to search for (surrounded by quotes).
Current limitations
The following are the current limitations of live tail:
You can only perform a simple text search while using live tail. You can't use any Splunk
search commands or any data extractions in a search.

If the client is overloaded by the volume of the data coming in to the processor, it will arbitrarily
omit chunks of data. This means that with a very high volume of data, some events may never
be displayed on screen for live tail.

There are REST endpoints on both splunkd and SplunkWeb. Application developers are free
to use these APIs to access the streams directly and bypass the client.
To configure the REST endpoints, use restmaps.conf and streams.conf.

LiveTail doesn't work in IE 6.
LiveTail doesn't work in distributed search.
By default, Livetail is only enabled for users assigned Admin and Power roles. To allow the
User role access, the allow_livetail capability must be enabled in authorize.conf.

Distributed Search
How distributed search works
Distributed search is a peer-to-peer configuration that enables one Splunk server to send searches
across many other Splunk instances. Upon login, authentication attempts are federated across all
other included servers. Users who want to search across distributed Splunk hosts must have the
exact same credentials (username and password) on all the included servers. You can propagate
user credentials using the information in this Community wiki topic.
Users can restrict any search to explicitly search only a subset of the servers.
Each Splunk server in a distributed search configuration must have an Enterprise license.
Distributed search is typically used to:
enable correlation among multiple silos of data for a subset of users.
provide a single view of data across multiple indexing servers.
provide a single view across Splunk servers that are indexing data locally on production hosts,
where network bandwidth favors centralizing data at search time rather than index time.

Note: Distributed search uses the management port (default 8089), so SSL must be either off or on
for all servers. By default, SSL is turned on for the management port. If you turn it off for one server,
you must turn it off for all servers.
Known issues with distributed search
You can mix 3.3.x with 3.2.x, but mixing 3.1.x and 3.2.x nodes in a distributed search cluster is
not supported; you must upgrade all your Splunk servers to at least 3.2 in order to use
distributed search across versions.

Network speed affects distributed search speed. If you're searching over a VPN, you may
notice distributed search taking longer, depending on your connection speed.

Search time field extraction configuration must be configured on each of the servers providing
the distributed search results.

Each instance in the distributed search cluster must have a unique server name. The server
name is specified in $SPLUNK_HOME/etc/myinstall/splunkd.xml

The savedsearch search command is not supported when searching across distributed
Splunk systems.

Dynamic field extraction (the interactive field extractor) is not supported when attempting to
extract from a non-local event (in a distributed search environment).

Enable distributed search via Splunk Web
Follow these instructions to enable distributed search via Splunk Web. You can also enable
distributed search via the CLI or distsearch.conf.
Click the Admin link in the upper right hand corner:
Select the Distributed tab, and click Distributed Search.
To turn on distributed search:
Set the Distributed Searches to other Splunk servers? radio button to Yes.
If you want other Splunk instances to automatically find this instance, set the
Auto-Discoverable? radio button to Yes.

Note: Discovered servers are not displayed until you commit the change and restart Splunk.
Add the IP address and port number of any other Splunk instances you want to include in the
distributed search cluster. This port number must match the same splunkd port number as in
the Admin / Server / Settings on the remote instance. Defaults to 8089.

Note: If you enabled Auto discoverable on other Splunk instances, they are displayed in the
Discovered Servers column. Each server has an Add button next to it. Click Add to add the servers
to the cluster.
Click the Save button to commit the changes.
Enable distributed search via the CLI
Follow these instructions to enable distributed search via Splunk's CLI. . You can also enable
distributed search via Splunk Web or distsearch.conf.
Configuration
To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. You can also add Splunk to your path and use the splunk command.
Enable distributed search
splunk enable dist-search -auth admin:changeme
Distributed search enabled.
You need to restart the Splunk Server for your changes to take effect.
Enable auto-discovery
splunk enable discoverable -auth admin:changeme
Discoverable mode is now enabled.
You need to restart the Splunk Server for your changes to take effect.
Add a search server
splunk add search-server -host 10.10.10.10 -port 8888 -auth admin:changeme
Success.
You need to restart the Splunk Server for your changes to take effect.
Search via the CLI
Use the dispatch command to send out searches via Splunk's CLI.
splunk dispatch "source::/var/log/tomcat55/catalina.out minutesago::5" -auth admin:changeme
Configure distributed search via distsearch.conf
The most advanced specifications for distributed search are available in distsearch.conf. Edit this file in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Configuration
[distributedSearch]
Set distributed search configuration options under this stanza name.
Follow this stanza name with any number of the following attribute/value pairs.
If you do not set any attribute, Splunk uses the default value (if there is one listed).
disabled = <true | false>
Toggle distributed search off and on.
Defaults to false (your distributed search stanza is enabled by default).
heartbeatFrequency = <in seconds>
Heartbeat in seconds.
0 disables all heartbeats.
If the heartbeat is disabled, no other Splunk Server is able to auto-discover this instance.
Defaults to 2.
heartbeatMcastAddr = <IP address>
Set a multicast address.
Defaults to 255.0.0.37.
heartbeatPort = <port>
Set heartbeat port.
Defaults to 60.
serverTimeout = <in seconds>
How long to wait for a connection to a server.
If a connection occurs, a search times out in 10x this value.
For example, if set to 10 seconds, the maximum search allowed is 100 seconds.

This setting works in tandem with 'removedTimedOutServers.'
Defaults to 10.
statusTimeout = <in seconds>
Set how long to wait for a server to return its status.
Up this number if your peered servers are slow or if the server name disappears from the
SplunkWeb widget.

removedTimedOutServers = <true | false>
If true, remove a server connection that cannot be made within 'serverTimeout.'
If false, every call to that server attempts to connect.
NOTE: This may result in a slow user interface.

checkTimedOutServersFrequency = <in seconds>
This tag is ONLY relevant if 'removedTimedOutServers' is set to true.
If 'removedTimedOutServers' is false, this attribute is ignored.

Rechecks servers at this frequency (in seconds).
If this is set to 0, then no recheck will occur.
Defaults to 60.
autoAddServers = [True | False]
If this tag is set to 'true', this node will automatically add all discovered servers.
skipOurselves = [True | False]
If this is set to 'true', then this server will NOT participate as a server in any search or other
call.

This is used for building a node that does nothing but merge the results from other servers.
Defaults to 'false.'
allowDescent = [True | False]
ttl = <integer>
Time To Live.
Increasing this number allows the UDP multicast packets to spread beyond the current subnet
to the specified number of hops.

NOTE: This only will work if routers along the way are configured to pass UDP multicast
packets.

Defaults to 1 (this subnet).
servers =
Initial list of servers.
If operating completely in 'autoAddServers' mode (discovering all servers), there is no need to
have any servers listed here.

blacklistNames =
List of server names that you do not want to peer with.
Server names are the 'server name' that is created for you at startup time.
blacklistURLs =
Comma-delimited lists of blacklisted discovered servers.
You can black list on server name (above) or server URI (x.x.x.x:port).
Example
[distributedSearch]
heartbeatFrequency = 10
servers = 192.168.1.1:8059,192.168.1.2:8059
blacklistNames = the-others,them
blacklistURLs = 192.168.1.3:8059,192.168.1.4:8059
This entry distributes searches to 192.168.1.1:8059,192.168.1.2:8059.
The server sends a heartbeat every 10 seconds.
There are four blacklisted instances, listed across blacklistNames and blacklistURLs.
Attributes not set here use the defaults listed in distsearch.conf.spec.
Exclude specific Splunk servers from distributed searches
To exclude certain Splunk servers from distributed searches (also referred to as blacklisting), add a
comma-delimited list of servers to distsearch.conf. Add IP addresses or fully-qualified domain names
in your stanza:
blacklistNames = the-others,them
blacklistURLs = 192.168.1.3:8059,192.168.1.4:8059
blacklistNames
the names you have defined in Splunk for your distributed search servers.
blacklistURLs
full URL paths to your distributed search servers.
Alerts
How Alerts Work
Alerts are searches you've configured to run on a schedule and send you their results. Use alerts to
notify you of changes in your data, network infrastructure, file system or other devices you're
monitoring. Alerts can be sent via email or RSS, or trigger a shell script. You can turn any saved
search into an alert.
An alert is comprised of:
a schedule for performing the search
conditions for triggering an alert
actions to perform when the triggering conditions are met
Enable alerts
Set up an alert at the time you create a saved search, or enable an alert on any existing saved
search you have permission to edit. Configure alerts via:
Splunk Web
savedsearches.conf
Scripted alerts
Alerts can also trigger shell scripts. When you configure an alert, specify a script you've written. You
can use this feature to send alerts to other applications. Learn more about configuring scripted alerts.
You can use scripted alerts to send syslog events, or SNMP traps.
Customize alerts
Use the alert_actions.conf file to customize alert settings. For example, change email configuration
(mail server, subject line, etc). Learn more about customizing alert options.
Considerations
When configuring alerts, keep the following in mind:
Too many alerts/saved searches running at once may slow down your system -- depending on
the hardware, 20-30 alerts running at once should be OK. If the searches your alerts are
based on are complex, you should make the interval longer and spread the searches out
more.

Set a time frame for alerts that makes sense -- if the search takes longer than 4-5 minutes to
run, don't set it to run every five minutes.

You must have a mail server running on the LAN that the Splunk server can connect to.
Splunk does not authenticate against the mail server.

Best practices for alert configuration are located here.
Set up alerts via Splunk Web
Use Splunk Web to set up alerts. Follow these steps:
1. Create a saved search.
2. Schedule the search.
3. Define alert conditions.
4. Configure alert actions.
You can set up an alert at the time you create a saved search, or you can enable an alert on any
existing saved search you have permission to edit.
Note: You must have email enabled on your Splunk server for alerts to be sent out. Alternately, your
Splunk server must be able to contact your email server. Configure email settings by customizing
alerts.
Create a saved search
First, set up a saved search:
Enter your search terms into the search bar and choose Save search... from the drop-down
menu to the left of the search bar.

Fill in the fields to save your search and then click the Schedule & Output link at the top of
the Save Search pop up.

Schedule the search
Next, schedule your search. This means your search runs on the specified schedule. For example,
Splunk runs your search every hour or at midnight. If your search meets the alert conditions, then
Splunk alerts you.
Check the box run this search on a schedule.
Choose either basic or cron scheduling.
Note: Too many searches running every minute can slow down the server.
Time ranges in a search
To get all the results from a set window of time, you may include a specific time range in your search,
for example hoursago=1. Especially in distributed setups, data may not reach the indexer exactly
when it is generated. Thus, it is a good idea to run your searches with a few minutes of delay.
For example, you want all the results from an hour time window, such as 4 PM to 5 PM.
Add the terms startminutesago=90 and endminutesago=30 to your search.
Then, schedule your search to run on the half hour using cron notation.
This ensures that you get all the results from the specified time period.
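For example, to reliably capture a full 4 PM to 5 PM window, a sketch of such a search (the term error
is illustrative) is:
error startminutesago=90 endminutesago=30
scheduled with the cron expression 0,30 * * * * so that it runs on the hour and on the half hour.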
Define alert conditions
Now define alert conditions. Alert conditions tell Splunk whether or not to send you an alert. Enter a
threshold number of events, sources, or hosts in your results. If the alert conditions are met, Splunk
notifies you via email or RSS feed or triggers a shell script.
1. In the first drop-down menu under Alert when choose:
always
Splunk will always send you alerts when your search runs.
If you choose this option, all other conditions are grayed out in the second drop-down
menu.


number of events
Splunk sends alerts only if the number of events your search returns matches the rest
of the alert conditions.


number of sources
Splunk sends alerts only if the number of sources your search returns matches the rest
of the alert conditions.


number of hosts
Splunk sends alerts only if the number of hosts your search returns matches the rest of
the alert conditions.


2. In the second drop-down menu under Alert when choose a comparison operation:
greater than
less than
equal to
rises by
drops by
3. In the text field under Alert when, enter a value.
For example, you may want to "Alert when number of events [is] greater than 10".
Configure alert actions
Tell Splunk what to do once an alert is triggered.
1. Now set up how you want Splunk to notify you. You can combine any of these options:
Create an RSS feed
This creates a link to an RSS feed of alerts.

Send email
Enter one or more email addresses. Separate multiple addresses with a comma.

2. Next, if you want to include the search results in your alert, check Include results.
3. Finally, if you want to run a shell command when an alert triggers, enter the command under
Trigger shell script. For example, you may want to trigger a script to generate an SNMP trap or call
an API to send the event to another system. For more details see the page on scripted alerts.
Set up an alert on an existing saved search
You can take a saved search you've already created and turn it into an alert.
1. From the drop-down menu to the left of the search bar, choose Saved searches > Manage saved
searches. This will launch the saved searches window.
2. In the table, locate the saved search that you want to turn into an alert.
3. Click enable in the Running column.
If you do not have permission to edit this search, the Running column will show No.
If there is already an alert defined for this saved search, it will either be Running or give the
option to start it if you have the proper permissions.

4. To set up an alert, click the box next to Run this search on a schedule under Alert properties.
The options under Alert properties are the same described above for Schedule & Output.
Specify which fields to show
When you receive alerts, any fields included in your search are also displayed. Edit the saved search
to change which fields show up in your alert.
To remove a field, pipe your search to fields - <field>. For example:
error starthoursago::01 | fields - sourcetype
This search keeps the sourcetype field from appearing in your alerts.
To add a field, pipe your search to fields + <field>. For example:
error starthoursago::01 | fields + clientIP
This search adds the clientip field to your alerts.
You can add or subtract any number of fields -- just separate them with a comma: fields -
<field1>, <field2> + <field3>, <field4>.
View alert history
The alert history page shows which alerts have been triggered since Splunk's last reboot. To access,
click the Admin link in the upper right hand corner and select the Saved Searches tab. Your alerts
show up in the Alert History column.
Set up alerts via savedsearches.conf
Configure alerts with savedsearches.conf. Use the
$SPLUNK_HOME/etc/system/README/savedsearches.conf.example as an example, or
create your own savedsearches.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or
your own custom application directory in $SPLUNK_HOME/etc/apps/. For more information on
configuration files in general, see how configuration files work.
Follow these steps:
1. Create a saved search.
2. Schedule the search.
3. Define alert conditions.
4. Configure alert actions.
You can set up an alert at the time you create a saved search, or add the alert configurations to your
saved search stanza later.
Note: You must have email enabled on your Splunk server for alerts to be sent out. Alternately, your
Splunk server must be able to contact your email server. Configure email settings by customizing
alerts.
Create a saved search
First, set up a saved search:
Enter your search terms into the search bar and choose Save search... from the drop-down
menu to the left of the search bar.

Fill in the fields to save your search and then click the Schedule & Output link at the top of
the Save Search pop up.

You can also set up a saved search via savedsearches.conf.
Schedule the search
Next, schedule your search. This means your search runs on the specified schedule. For example,
Splunk runs your search every hour or at midnight. If your search meets the alert conditions, then
Splunk alerts you.
Add the following attribute/value pairs to your saved search stanza to run the search on a schedule:
userid = <integer>
UserId of the user who created this saved search.
Splunk needs this information to log who ran the search, and create editing capabilities
in Splunk Web.


Possible values: Any Splunk user ID.
User IDs are found in $SPLUNK_HOME/etc/passwd.
Look for the first number on each line, right before the username.
For example 2:penelope....

enableSched = < 0 | 1 >
Set this to 1 to enable schedule for search
Defaults to 0.
schedule = <string>
Cron style schedule.
For example, */12 * * * *.
execDelay = <integer>
Amount of time (in seconds) from most recent event to the execution of the scheduled search
query.

Defaults to 0.
Alert conditions
Now define alert conditions. Alert conditions tell Splunk whether or not to send you an alert. Enter a
threshold number of events, sources, or hosts in your results. If the alert conditions are met, Splunk
notifies you via email or RSS feed or triggers a shell script.
counttype = <string>
Set the type of count for alerting.
Possible values: number of events, number of hosts, number of sources, number of
sourcetypes.

relation = <string>
How to compare against counttype.
Possible values: greater than, less than, equal to, drops by, rises by.
quantity = <integer>
Number to compare against the given counttype.
So if you have the following:
counttype = number of events
relation = rises by
quantity = 25
Splunk alerts you if your search results have risen by 25 since the last time the search ran.
Configure alert actions
Tell Splunk what to do once an alert is triggered. You can either:
Enable RSS
action_rss = < 0 | 1 >
Toggle whether or not to create an RSS link.
1 to send, 0 to disable.
Send Email
action_email = <string>
Comma separated list of email addresses to send alerts to.
sendresults = < 0 | 1 >
Whether or not to send the results along with the email/shell script.
1 to send, 0 to disable.
maxresults = <integer>
The maximum number of results the entire query pipeline can generate.
Defaults to 50000.
Note: This is different from specifying maxresults via prefs.conf or during a search
(maxresults: search modifier in older versions, or -maxresults in the CLI in versions 3.2
and above).

Example
This example runs a search for events containing the term "sudo" on a schedule, and sends the
results via an RSS feed.
[sudoalert]
action_rss = 1
counttype = number of events
enableSched = 1
quantity = 10
search = sudo
relation = greater than
schedule = */12 * * * *
sendresults = 0
role = Admin
Scripted Alerts
Configure scripted alerts with savedsearches.conf. Use the
$SPLUNK_HOME/etc/system/README/savedsearches.conf.example as an example, or
create your own savedsearches.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or
your own custom application directory in $SPLUNK_HOME/etc/apps/. For more information on
configuration files in general, see how configuration files work.
Script options
Your alert can trigger a shell script, which must be located in $SPLUNK_HOME/bin/scripts. Use the
following attribute/value pairs:
action_script = <string>
Your search can trigger a shell script.
Specify the name of the shell script to run.
Place the script in $SPLUNK_HOME/bin/scripts.
Command line arguments passed to the script are:
$0 = script name.
$1 = number of events returned.
$2 = search terms.
$3 = fully qualified query string.
$4 = name of saved splunk.
$5 = trigger reason (i.e. "The number of events was greater than 1").
$6 = link to saved search.
$7 = This option has been deprecated and is no longer used as of Splunk 3.4.6.
$8 = file where the results for this search are stored (contains raw results).
Note: If there are no saved tags, $7 becomes the name of the file containing the search results ($8).
This note is applicable to Splunk versions 3.3-3.5
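As an illustration, here is a minimal sketch of a script that writes a few of these arguments to a log
file (the file path and file name are arbitrary). Place it in $SPLUNK_HOME/bin/scripts and reference it
from your alert:
#!/bin/sh
# $4 = name of the saved search, $5 = trigger reason, $8 = file containing the raw results
echo "`date` alert fired: $4 ($5), results in $8" >> /tmp/splunk_alerts.log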
If you want to run a script written in a different language (e.g. Perl, Python, VBScript) you must
specify the interpreter you want Splunk to use in the first line of your script, following the #!. For
example:
to run a Perl script:
---- myscript.pl ----
#!/path/to/perl
......
......
to use Python to interpret the script file:
---- myscript.py -----
#!/path/to/python
.....
.....
For an example of how scripts can be configured to work with alerts, see Send SNMP traps.
Example
You can configure Splunk to send alerts to syslog. This is useful if you already have syslog set up to
send alerts to other applications, and you want Splunk's alerts to be included.
Check the Splunk Wiki for information about the best practices for using UDP when configuring
Syslog input.
Write a script that calls logger (or any other program that writes to syslog). Your script can call any
number of the variables your alert returns.
Create the following script and make it executable:
logger $5
Put your script in $SPLUNK_HOME/bin/scripts.
Now write an alert that calls your script. See Set Up Alerts for information on alert configuration.
Configure the alert to call your script by specifying the path in the Trigger shell script field of the
alert.
Edit your saved search to call the script. If your script is in $SPLUNK_HOME/bin/scripts you don't
have to specify the full path.
This logs the trigger reason to syslog:
Aug 15 15:01:40 localhost logger: Saved Search [j_myadmin]: The number of events(65) was greater than 10
Customize alert options
Edit alert_actions.conf to specify the message subject and from address used for alert emails. For
more information on configuration files in general, see how configuration files work.
Note: Email must be enabled on your Splunk server to send alerts. Or you can specify another email
server, but your Splunk server must be able to connect to it.
Configuration
Add a stanza to alert_actions.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or
your own custom application directory in $SPLUNK_HOME/etc/apps/.
Global settings
Global options: these settings do not need to be prefaced by a stanza name. If you do not specify an
entry for each attribute, Splunk will use the default value.
maxresults = <int>
Set the global maximum number of search results sent via alerts.
Defaults to 100.
hostname = <string>
Set the hostname that is displayed in the link sent in alerts.
This is useful when the machine sending the alerts does not have a FQDN.
Defaults to current hostname (set in Splunk) or localhost (if none is set).
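For example, to raise the global result limit and pin the hostname used in alert links on a machine
without an FQDN, you could add lines like these at the top of alert_actions.conf, outside any
stanza (the hostname shown is illustrative):
maxresults = 500
hostname = splunk.example.com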
Email
Configure email options for alerts. Preface email settings with the [email] stanza name.
[email]
Set email notification options under this stanza name.
Follow this stanza name with any number of the following attribute/value pairs.
If you do not specify an entry for each attribute, Splunk uses the default value.
from = <string>
Email address originating alert.
Defaults to splunk@<splunk-hostname>.
subject = <string>
Specify an alternate email subject.
Defaults to SplunkAlert-<savedsearchname>.
format = <string>
Specify the format of text in the email.
Possible values: plain, html, raw and csv.
This value will also apply to any attachments.
inline = <true | false | auto>
Specify whether the search results are contained in the body of the alert email.
Defaults to false.
mailserver = <string>
The SMTP mail server to use when sending emails.
Defaults to localhost.
Example
The following example alert_actions.conf sets e-mail options for alerts.
[email]
from = alert@mysplunk.com
subject = daily log review
format = plain
RSS
[rss]
Set rss notification options under this stanza name.
Follow this stanza name with any number of the following attribute/value pairs.
If you do not specify an entry for each attribute, Splunk uses the default value.
items_count = <number>
Number of saved RSS feeds.
Cannot be more than maxresults (in [email] stanza).
Defaults to 30.
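For example, to limit each feed to the ten most recent saved results, you could add the following to
alert_actions.conf:
[rss]
items_count = 10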
Send SNMP traps
You can use Splunk as a monitoring tool to send SNMP alerts to other systems such as a Network
Systems Management console.
Configuration
Requirements
Perl is required to run the script below.
Net-SNMP package is required in order to use the /usr/bin/snmptrap command - if you
have another way of sending an SNMP trap from a shell script then modify as needed.

Admin access to the $SPLUNK_HOME/bin/scripts directory of your Splunk install.
For security reasons, scripts must reside in $SPLUNK_HOME/bin/scripts.
Create shell script
Create traphosts.pl script in your $SPLUNK_HOME/bin/scripts directory.
For security reasons, scripts must reside in this directory. Create the directory if it
doesn't already exist.

Copy the code below into traphosts.pl.

chmod +x traphosts.pl to make it executable.
Change the Host:Port of the SNMP trap handler, paths to external commands splunk and
snmptrap, and the user/password if necessary.

#!/usr/bin/perl
#
# traphosts.pl: A script for Splunk alerts to send an SNMP trap.
#
# Modify the following as necessary for your local environment
#
$hostPortSNMP = "qa-tm1:162"; # Host:Port of snmpd or other SNMP trap handler
$snmpTrapCmd = "/usr/bin/snmptrap"; # Path to snmptrap, from http://www.net-snmp.org
$OID = "1.3.6.1.4.1.27389.1.1"; # Object IDentifier for an alert, Splunk Enterprise OID is 27389
# Parameters passed in from the alert.
# $1-$9 is the positional parameter list. $ARGV[0] starts at $1 in Perl.
$searchCount = $ARGV[0]; # $1 - Number of events returned
$searchTerms = $ARGV[1]; # $2 - Search terms
$searchQuery = $ARGV[2]; # $3 - Fully qualified query string
$searchName = $ARGV[3]; # $4 - Name of saved search
$searchReason = $ARGV[4]; # $5 - Reason saved search triggered
$searchURL = $ARGV[5]; # $6 - URL/Permalink of saved search
if ( $ARGV[7] ) { # We received tags
$searchTags = $ARGV[6]; # $7 - Tags, if any, otherwise $7 is $8
$searchPath = $ARGV[7]; # $8 - Path to raw saved results in Splunk instance (advanced)
} else { # We didn't receive tags
$searchPath = $ARGV[6]; # $7 - Path to raw saved results in Splunk instance (advanced)
}
# Send trap, with the parameter list above mapping down into the OID.
if ( $ARGV[7] ) { # We received tags
$cmd = qq/$snmpTrapCmd -v 1 -c public $hostPortSNMP $OID '' 1 0 ''
$OID.1 i $searchCount $OID.2 s "$searchTerms" $OID.3 s "$searchQuery" $OID.4 s
"$searchName" $OID.5 s "$searchReason" $OID.6 s "$searchURL" $OID.7 s
"$searchTags" $OID.8 s "$searchPath"/;
system($cmd);
} else { # We didn't receive tags
$cmd = qq/$snmpTrapCmd -v 1 -c public $hostPortSNMP $OID '' 1 0 ''
$OID.1 i $searchCount $OID.2 s "$searchTerms" $OID.3 s "$searchQuery" $OID.4 s
"$searchName" $OID.5 s "$searchReason" $OID.6 s "$searchURL" $OID.8 s
"$searchPath"/;
system($cmd);
}
Configure your alert to call a shell script
Create a saved search. See Set Up Saved Searches for more information.
Turn your saved search into an alert. See Set up Alerts for more information.
Set up your alert so that it calls your shell script by specifying the name of your script which
resides in $SPLUNK_HOME/bin/scripts:
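If you prefer to edit savedsearches.conf directly instead of using Splunk Web, a stanza along the
following lines would schedule the search and call the script. The schedule and threshold shown are
illustrative; the search name matches the sample output shown below:
[SyslogEventsLast24]
search = sourcetype::syslog
enableSched = 1
schedule = */15 * * * *
counttype = number of hosts
relation = greater than
quantity = 1
action_script = traphosts.pl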

Here is an example of the script running, including what it returns:
[root@qa-tm1 ~]# snmptrapd -f -Lo
2007-08-13 16:13:07 NET-SNMP version 5.2.1.2 Started.
2007-08-13 16:14:03 qa-el4.splunk.com [172.16.0.121] (via UDP: [172.16.0.121]:32883) TRAP, SNMP v1, community public
SNMPv2-SMI::enterprises.27389.1 Warm Start Trap (0) Uptime: 96 days, 20:45:08.35
SNMPv2-SMI::enterprises.27389.1.1 = INTEGER: 7 SNMPv2-
SMI::enterprises.27389.1.2 = STRING: "sourcetype::syslog" SNMPv2-
SMI::enterprises.27389.1.3 = STRING: "search sourcetype::syslog starttime:12/31
/1969:16:00:00 endtime::08/13/2007:16:14:01" SNMPv2-SMI::enterprises.27389.1.4
= STRING: "SyslogEventsLast24" SNMPv2-SMI::enterprises.27389.1.5 = STRING:
"Saved Search [SyslogEventsLast24]: The number of hosts(7) was greater than 1"
SNMPv2-SMI::enterprises.27389.1.6 = STRING: "http://qa-el4:18000/?q=sourcetype
%3a%3asyslog%20starttimeu%3a%3a0%20endtimeu%3a%3a1187046841" SNMPv2-
SMI::enterprises.27389.1.7 = STRING: "/home/tet/inst/splunk/var/run/splunk
/SyslogEventsLast24"
2007-08-13 16:14:15 NET-SNMP version 5.2.1.2 Stopped.
Security
Security options
Splunk includes several options for securing your data. Authentication options allow you to secure
your Splunk Server. Audit configurations enable data security, including cryptographic signing and
event hashing.
Authentication
Authentication includes SSL and HTTPS, user-based access controls (known as roles) and LDAP.
SSL/HTTPS
You can configure SSL for both Splunk's back-end (the splunkd management port, used for communication
among Splunk servers and with the CLI) and the front-end (HTTPS when logging into Splunk Web). To set up SSL for Splunk's back-end, see instructions here.
To enable HTTPS for Splunk Web, follow these instructions.
Configure roles
You no longer have to use Splunk's default roles of Admin, Power or User. While these roles remain
built into Splunk, you can now define your own roles out of a list of capabilities. Create flexible roles
for Splunk users via authorize.conf.
Learn more about configuring roles.
LDAP
Splunk supports authentication via its internal authentication services or your existing LDAP server.
Learn more about configuring LDAP.
Scripted authentication
Use scripted authentication to tie Splunk's authentication into an external authentication system, such
as RADIUS or PAM.
Learn more about scripted auth.
Audit
Splunk includes audit features to allow you to track the reliability of your data. Watch files and
directories with the file system change monitor, monitor activities within Splunk (such as searches
or configuration changes) with audit events, cryptographically sign audit events with audit
event signing, and block sign any data entering your Splunk index with IT data signing.
File system change monitor
You can use the file system change monitor to watch any directory or file. Splunk
indexes an event any time the file system undergoes any sort of change or someone edits the
watched files. The file system change monitor's behavior is completely configurable through
inputs.conf.
Learn more about how to configure the file system change monitor.
Audit events
Watch your Splunk instance by monitoring audit events. Audit events are generated whenever
anyone accesses any of your Splunk instances -- including any searches, configuration changes or
administrative activities. Each audit event contains information that shows you what changed where
and when and who implemented the change. Audit events are especially useful in distributed Splunk
configurations for detecting configuration and access control changes across many Splunk Servers.
Learn more about how audit events work.
Audit event signing
If you are using Splunk with an Enterprise license, you can configure audit events to be
cryptographically signed. Audit event signing adds a sequential number (for detecting gaps in data to
reveal tampering), and appends an encrypted hash signature to each audit event.
Configure auditing by setting stanzas in audit.conf, decorations.conf, and inputs.conf.
Learn more about audit event signing.
IT data signing
If you are using Splunk with an Enterprise license, you can configure Splunk to verify the integrity of
IT data as it is indexed. If IT data signing is enabled, Splunk creates a signature for blocks of data as
it is indexed. Signatures allow you to detect gaps in data or tampered data.
Learn more about IT data signing.
Customize audit decorations
Customize how different audit events appear in Splunk Web. Decorate events with unique CSS
based on audit information contained in the event. For example, valid events show with a green
check mark, while tampered events show a yellow caution symbol.
Learn more about Dynamic event rendering.
Enable HTTPS
You can enable HTTPS via Splunk Web or web.conf.
Note: Your Splunk server can listen on either HTTP or HTTPS. It cannot listen on both.
You can also enable SSL through separate configurations.
Important: If you are using Firefox 3, enabling SSL for a Splunk deployment may result in an "invalid
security exception" being displayed in the browser. Refer to this workaround documentation for more
information.
Configuration
In Splunk Web
To enable HTTPS in Splunk Web, click the Admin link in the upper right hand corner. Then, click
Server and choose View Settings. Under Web interface, change the radio button to Yes for Enable
SSL (HTTPS) in Splunk Web?
Note: You must restart Splunk to enable the new settings. Also, you must now append "https://" to
the URI you use to access Splunk Web.
In web.conf
In order to enable HTTPS, modify web.conf. Edit this file in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
[settings]
httpport = <port number>
enableSplunkWebSSL = true
httpport
Set the port number to your HTTPS port.

enableSplunkWebSSL
Set this key to true to enable SSL for Splunk Web.

Once you have made the changes to web.conf, restart your Splunk server so they take effect.
Certificates
The certificates used for SSL between Splunk Web and the client browser are located in
$SPLUNK_HOME/share/splunk/certs/. You can replace the self-signed default certificate with
your own.
The certificates for SSL are specified in web.conf. You can change the defaults to your own certificate
names.
privKeyPath = /certs/privkey.pem
caCertPath = /certs/cert.pem
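Putting the HTTPS settings together, a web.conf [settings] stanza that enables SSL and points to
your own certificate files might look like the following sketch; the port number and file names are
illustrative:
[settings]
httpport = 8443
enableSplunkWebSSL = true
privKeyPath = /certs/myprivkey.pem
caCertPath = /certs/mycert.pem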
Restart Splunk Web from the CLI for your changes to take effect. To use Splunk's CLI, navigate to
the $SPLUNK_HOME/bin/ directory and use the ./splunk command. You can also add Splunk to
your path and use the splunk command.
./splunk restart splunkweb
If your self-signed certificate for Splunk Web expires, you can generate a new one by deleting
cert.pem and privkey.pem in $SPLUNK_HOME/share/splunk/certs/.
SSL
The Splunk management port (default 8089) supports both SSL and plain text connections. SSL is
turned on by default for communications among Splunk servers. To make changes to SSL settings,
edit server.conf.
Important: If you are using Firefox 3, enabling SSL for a Splunk deployment may result in an "invalid
security exception" being displayed in the browser. Refer to this workaround documentation for more
information.
Note: This only enables SSL for Splunk's back-end communication. To turn on SSL for the browser,
see enable HTTPS.
Configuration
When the Splunk server is turned on for the first time, the server generates a certificate for that
instance. This certificate is stored in the $SPLUNK_HOME/etc/auth/ directory by default.
Change SSL settings by editing $SPLUNK_HOME/etc/system/local/server.conf. Edit this file
in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
[sslConfig]
enableSplunkdSSL = true
keyfile = server.pem
keyfilePassword = password
caCertFile = cacert.pem
caPath = $SPLUNK_HOME/etc/auth
certCreateScript = $SPLUNK_HOME/bin/genSignedServerCert.py
enableSplunkdSSL = Setting this boolean key to true enables SSL in Splunk.
keyfile = Certificate for this Splunk instance (created on Splunk start-up by default - if the
certCreateScript tag is present).

Note: The path to the keyfile is relative to the caPath setting. If your keyfile is kept outside $SPLUNK_HOME, you must specify a full (absolute) path outside of $SPLUNK_HOME to reach it.
keyfilePassword = Password for the pem file store, is set to password by default.
caCertFile = This is the name of the certificate authority file.
caPath = Path where the Splunk certificates are stored. Default is
$SPLUNK_HOME/etc/auth.

certCreateScript = Script for creating & signing server certificates.
With the default script enabled, on startup, Splunk will generate a certificate in the caPath directory.
Deactivate SSL
To deactivate SSL, set enableSplunkdSSL to false in server.conf.
Note: Running splunkd without SSL is not generally recommended, although distributed search will
often perform better with SSL disabled.
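For example, in the [sslConfig] stanza of server.conf:
[sslConfig]
enableSplunkdSSL = false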
Generate signed certificates
By default, all Splunk servers use the same self-signed certificate. The certificate's public and private
keys are distributed with Splunk, which allows Splunk instances to connect to each other out of the box.
You can change this default behavior: there are scripts located in $SPLUNK_HOME/bin that you can
use to regenerate and self-sign your own server certificates.
genRootCA.sh
Run this script when you want to regenerate the certificates Splunk uses. It generates cacerts.pem
(public key) and ca.pem (public/private password protected PEM). When you run it, it checks to see
if certs are already in place, and if they are, prompts you to overwrite them. It then wraps these files
into an X509-formatted cert. Distribute cacerts.pem to clients as desired and keep ca.pem in a
secure location.
genSignedServerCert.sh
This shell script is a wrapper for the Python script that Splunk runs to generate certificates when you
start it for the first time. This script creates a CSR (certificate signing request), self-signs it, and
outputs a signed server.pem that you can distribute to your Splunk servers.
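A minimal invocation sketch, run from the Splunk bin directory. Check the comments at the top of each
script before running it; the scripts may prompt for, or require, additional arguments such as a
destination directory:
cd $SPLUNK_HOME/bin
./genRootCA.sh
./genSignedServerCert.sh
Distribute the resulting cacerts.pem and server.pem files as described above.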
Generate a CSR (Certificate Signing Request)
If your organization requires that your Splunk deployment use a certificate signed by an external CA,
you can use the following procedure to generate the CSR to send to the CA:
openssl req -new -key [certificate name].pem -out [certificate name].csr
You are prompted for the following X.509 attributes of the certificate:
Country Name: Use the two-letter code without punctuation for country, for example: US or
GB.

State or Province: Spell out the state completely; do not abbreviate the state or province
name, for example: California

Locality or City: The Locality is the city or town name, for example: Oakland. Do not
abbreviate. For example: Los Angeles, not LA, Saint Louis, not St. Louis.

Company: If your company or department contains an &, @, or any other non-alphanumeric
symbol that requires you to use the shift key, you must spell out the symbol or omit it. For
example, Fflanda & Rhallen Corporation would be Fflanda Rhallen Corporation or Fflanda and
Rhallen Corporation.

Organizational Unit: This field is optional; but you can specify it to help identify certificates
registered to an organization. The Organizational Unit (OU) field is the name of the department
or organization unit making the request. To skip the OU field, press Enter.

Common Name: The Common Name is the Host + Domain Name, for example
www.company.com or company.com. This must match the host name of the server where you
intend to deploy the certificate exactly.

This creates a private key ([certificate name].key), which is stored locally on your server, and a CSR
([certificate name].csr), which contains the public key associated with the private key. You can then
use this information to request a signed certificate from an external CA.
To copy and paste the information into your CA's enrollment form, open the .csr file in a text editor
and save it as a .txt file.
Note: Do not use Microsoft Word; it can insert extra hidden characters that alter the contents of the
CSR.
Set up LDAP
Splunk supports authentication via its internal authentication services or your existing LDAP server.
Notes:
You must add a CA when connecting to AD via secure LDAP. Read the section below entitled
Import your CA for more information.

Splunk is unable to follow LDAP referrals. Check the Splunk Wiki for information about ways to
authenticate against an LDAP server that returns referrals.

Be sure to read the section called "Known issues with LDAP" at the end of this topic before
proceeding.

User Management
Important: Once you have switched Splunk into LDAP mode, no user administration is done within
Splunk. Instead, you must administer users within your LDAP server and reload authentication
configuration within Splunk. For example:
To add an LDAP user to a Splunk role, add the user to the LDAP group on your LDAP server.
Then in Splunk go to Server > Control > Reload Authentication Configuration.

To change a user's role membership, change the LDAP group that the user is a member of on
your LDAP server. Then in Splunk go to Server > Control > Reload Authentication
Configuration.

To remove a user from a Splunk role, remove the user from the LDAP group on your LDAP
server. Then in Splunk go to Server > Control > Reload Authentication Configuration.

Configure LDAP
Configure LDAP through Splunk Web or via authentication.conf. If you are configuring authentication
via the conf file and wish to switch back to the default Splunk auth, the simplest way is to move the
existing authentication.conf file out of the way (rename to *.disabled is fine) and restart Splunk. This
will retain your previous configuration unchanged if you expect to return to it later.
Determine your User and Group Base DN
Before you map your LDAP settings in Splunk, figure out your user and group base DNs, or
distinguished names. The DN is the location in the directory where authentication information is stored.
If all information is contained in each user's entry, then these DNs must be the same. If group
membership information for users is kept in a separate entry, enter a separate DN identifying the
subtree in the directory where the group information is stored.
If you are unable to get this information, please contact your LDAP Administrator for assistance.
Set up LDAP via Splunk Web
First, set LDAP as your authentication strategy:
1. Click the Admin link in the upper right-hand corner.
2. Click the Server tab then select Authentication Configuration.
3. Select LDAP from the Set Authentication method drop-down.
Next, fill in your LDAP settings:
4. Define an LDAP strategy name for your configuration. The name cannot be LDAP, cannot start
with a number and it must not contain spaces.
5. The strategy name is added to the Set Authentication Strategy drop-down once you save your
LDAP configurations.
6. Specify the Host name of your LDAP server. Be sure that your Splunk Server can resolve the host
name.
7. Specify the Port that Splunk should use to connect to your LDAP server.
By default LDAP servers listen on TCP port 389.
LDAPS (LDAP with SSL) defaults to port 636.
8. Turn on SSL by checking SSL enabled.
Note: You must also have SSL enabled on your LDAP server.
9. Enter the Bind DN
This is the distinguished name to bind to the LDAP server with.
This is typically the administrator or manager user.
This user needs to have access to all LDAP users you wish to add to Splunk.
10. Enter and confirm the Bind DN password for the binding user.
11. Specify the User base DN.
Splunk uses this attribute to locate user information.
You can specify multiple user base DN entries by separating them with a semicolon.
Note: You must set this attribute or your authentication will not work.
12. Specify the User base filter for the object class you want to filter your users on.
Default value is objectclass=*, which should work for most configurations.
13. Specify the Group base DN
Location of the user groups in LDAP.
You can specify multiple group base DN entries by separating them with a semicolon.
14. Input the Group base filter.
This attribute defines the group name.
Default value is objectclass=*, which should work for most configurations.
Splunk can also accept a GID as a group base filter.
15. Enter the User name attribute that defines the user name.
Note: The username attribute cannot contain whitespace. The username is case sensitive.
In Active Directory, this is sAMAccountName.
The value uid should work for most configurations.
16. Specify the Real name attribute (also referred to as the common name) of the user.
The value displayName or cn should work for most configurations.
17. Input the Group name attribute.
Set this only if users and groups are defined in the same tree.
This is usually cn.
18. Specify the Group member attribute.
This is usually member or memberOf, depending on whether the memberships are listed in the
group entry or the user entry.

19. Enter the Group mapping attribute.
Specify this value only if your member entries don't contain dn strings. In most cases,
however, you can leave this field blank.

If you enter this field, the value is usually dn.
20. Enter a value for pageSize.
This determines how many records to return at one time.
Enter 0 to disable paging and revert to LDAPv2. pageSize must be set to 0 in order to connect
to Sun LDAP.

21. Specify a Failsafe user name.
This allows you to authenticate into Splunk in the event that your LDAP server is unreachable.
Note: This user has admin privileges within Splunk.
22. Enter and confirm a Failsafe password for your failsafe user.
Import your CA
To configure Splunk's LDAP to work with your own CA, follow these steps:
1. Export your root CA cert in Base-64 encoded X.509 format.
2. Add these lines to $SPLUNK_HOME/etc/openldap/ldap.conf:
TLS_CACERT $SPLUNK_HOME/etc/openldap/certs/$YOUR_CERT_NAME
TLS_CACERTDIR $SPLUNK_HOME/etc/openldap/certs
3. Create the directory $SPLUNK_HOME/etc/openldap/certs.
4. Place the exported CA cert at $SPLUNK_HOME/etc/openldap/certs/$YOUR_CERT_NAME.
5. Restart Splunk.
6. In Splunk Web, navigate to Admin > Server > Authentication Configuration.
Click Save at the bottom of the page.
7. You can now map the designated AD groups to the respective roles in Splunk.
Map existing LDAP groups to Splunk roles
Once you have configured Splunk to authenticate via your LDAP server, map your existing LDAP
groups to any roles you have created. If you do not use groups, you can map your LDAP users
individually to Splunk roles. To do this you'll need to set userBaseDN = groupBaseDN. Please refer to
the example below on how to do this.
Note: You can either map users or map groups but not both. If you are using groups, all the users
you wish to have access to Splunk must be members of an appropriate group. Groups inherit
capabilities from the highest level role they're a member of.
All users and groups are visible under the Users tab in the Splunk Web Admin section. Click the Edit
link next to the appropriate user or group to define User Roles.
Important: If you change (and save) an existing user/group role LDAP mapping from within Splunk
Web, all users currently logged in to Splunk Web will be automatically logged out of Splunk Web
immediately and must log back in to proceed. This is done to ensure that any users who should no
longer have access as a result of the role mapping change are indeed denied access.
Test your LDAP configuration
If you find that your Splunk install is not able to successfully connect to your LDAP server, try these
troubleshooting steps:
1. Remove any custom values you've added for userBaseFilter and groupBaseFilter.
2. Check $SPLUNK_HOME/var/log/splunk/splunkd.log for any authentication errors.
3. Perform an ldapsearch to test that the variables you are specifying work:
ldapsearch -h "<host>" -p "<port>" -b "<userBaseDN>" -x -D "<bindDN>" -W
ldapsearch -h "<host>" -p "<port>" -b "<groupBaseDN>" -x -D "<bindDN>" -W
Note: On Solaris you have to add the filter to the search.
ldapsearch -h "<host>" -p "<port>" -b "<groupBaseDN>" -x -D "<bindDN>" "<groupBaseFilter>" -W
Example
This example steps you through obtaining LDIFs and setting up authentication.conf. You can
also enter these settings in Splunk Web, as described above.
Note: The particulars of your LDAP server may be different. Check your LDAP server settings and
adapt authentication.conf attributes to your environment.
Get LDIFs
You should have both the user and group LDIFs to set up authentication.conf.
User LDIF
Note: On Windows systems you can extract LDIFs with the ldifde command from the AD server:
ldifde -f output.ldif
The ldifde command will export all entries in AD. You should then open the file in a simple text editor
and find the appropriate entries.
Get the user LDIF by running the following command (use your own ou and dc):
# ldapsearch -h ldaphost -p 389 -x -b "ou=People,dc=splunk,dc=com" -D
"cn=bind_user" -W
On Solaris:
# ldapsearch -h ldaphost -p 389 -x -b "ou=People,dc=splunk,dc=com" -D
"cn=bind_user" "(objectclass=*)" -W
This returns:
# splunkadmin, People, splunk.com
dn: uid=splunkadmin,ou=People, dc=splunk,dc=com
uid: splunkadmin
givenName: Splunk
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
sn: Admin
cn: Splunk Admin
Group LDIF
Get the group LDIF by running the following command (use your own ou and dc):
# ldapsearch -h ldaphost -p 389 -x -b "ou=groups,dc=splunk,dc=com" -D
"cn=bind_user" -W
This returns:
# SplunkAdmins, Groups, splunk.com
dn: cn=SplunkAdmins,ou=Groups, dc=splunk,dc=com
description: Splunk Admins
objectClass: top
objectClass: groupofuniquenames
cn: SplunkAdmins
uniqueMember: uid=splunkadmin,ou=People, dc=splunk,dc=com
Configure authentication.conf
Use the following instructions to set up authentication.conf. Edit this file in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
To set up LDAP via Splunk Web, see the instructions above.
Set authentication type
By default, Splunk uses its own authentication type. Change that in the [authentication] stanza.
[authentication]
authType = LDAP
authSettings = ldaphost
Turn on LDAP by setting authType = LDAP.
Map authSettings to your LDAP configuration stanza (below).
Map to LDAP server entries
Now, map your LDIFs to the attribute/values in authentication.conf.
[ldaphost]
host = ldaphost.domain.com
pageSize = 0
port = 389
SSLEnabled = 0
failsafeLogin = admin
failsafePassword = admin_password
bindDN = cn=bind user
bindDNpassword = bind_user_password
groupBaseDN = ou=Groups,dc=splunk,dc=com;
groupBaseFilter = (objectclass=*)
groupMappingAttribute = dn
groupMemberAttribute = uniqueMember
groupNameAttribute = cn
realNameAttribute = displayName
userBaseDN = ou=People,dc=splunk,dc=com;
userBaseFilter = (objectclass=*)
userNameAttribute = uid
Map roles
You can set up a stanza to map any custom roles you have created in authorize.conf to LDAP groups
you have enabled for Splunk access in authentication.conf.
[roleMap]
Admin = SplunkAdmins;
ITUsers = ITAdmins;
Map users directly
If you need to map users directly to a Splunk role, you can do so by setting
groupBaseDN = userBaseDN. Example:
[supportLDAP]
SSLEnabled = 0
bindDN = cn=Directory Manager
bindDNpassword = #########
failsafeLogin = failsafe
failsafePassword = ########
groupBaseDN = ou=People,dc=splunksupport,dc=com;
groupBaseFilter = (objectclass=*)
groupMappingAttribute = dn
groupMemberAttribute = uniqueMember
groupNameAttribute = cn
host = supportldap.splunksupport.com
pageSize = 0
port = 389
realNameAttribute = cn
userBaseDN = ou=People,dc=splunksupport,dc=com;
userBaseFilter = (objectclass=*)
userNameAttribute = uid
[roleMap]
Admin = Tina Phi;
Convert saved searches to LDAP
If you have already configured saved searches and want to convert them to work with your new
LDAP configuration, follow these steps:
1. Identify the user IDs at the Splunk CLI by typing:
./splunk list user
2. Then, modify $SPLUNK_HOME/etc/system/local/savedsearches.conf and swap the
userid= field in each stanza to be the ldap userid.
3. To test that this works, create one saved search as an LDAP user so you can verify that you have
the correct format for the LDAP userid, and then make the changes to the existing saved searches.
4. Once you finish modifying savedsearches.conf, you must restart Splunk.
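For example, if a stanza in savedsearches.conf was created under Splunk's built-in admin account,
the change amounts to swapping the userid value for the LDAP user's ID as reported by
./splunk list user; the user IDs below are illustrative:
# Before (Splunk internal user)
userid = admin
# After (LDAP user ID, as reported by ./splunk list user)
userid = jsmith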
Known issues with LDAP
When configuring Splunk to work with your LDAP instance, note the following:
LDAP authentication will not work if your LDAP server has no groups.
Entries in Splunk Web and authentication.conf are case sensitive.
Splunk currently supports LDAP v2 and v3; v3 allows for paging and is the default protocol
used.

Splunk does not support scrolling. LDAP servers that use scrolling, such as SUN/iPlanet
Directory Server (versions 5.x and 6.x), should disable paging by setting pageSize to 0.

Splunk only works with one LDAP server at a time.
Splunk does not support (end user) anonymous bind. You may wish to create a user with
minimal privileges for this purpose.

Splunk Web can display a maximum of 499 LDAP groups.
To view and configure more than 499 groups, configure them manually by editing
authentication.conf.

If you want to use a group that is not displayed in the UI, add its dn to the
appropriate role in authentication.conf:

user = cn=splunk,ou=splunkgroups,ou=groups,o=company

LDAP referrals are currently not supported.
The LDAP strategy name can not be [LDAP], can't begin with a number and can't contain any
whitespace.

You must restart to be able to log in after switching from LDAP back to Splunk's auth.
For situations where users and groups reside in the same base, the value of userBaseDN
can't be the same as groupBaseDN. Workaround is to remove one level from the
groupBaseDN (or vice versa). Example: if the userBaseDN = cn=Users,dc=domain,dc=com,
set groupBaseDN = dc=domain,dc=com.

Splunk's authentication module does not work with Domino LDAP or Apache Directory.
If your LDAP group names contain an ampersand ('&'), you will not be able to Edit Mappings via
Splunk Web. The workaround is to map groups to roles directly in local/authentication.conf.

If your ldapBindPassword contains '&', or other unsafe XML, bind will fail. (SPL-18170).
Workaround is to modify the ldap bind password so that it does not contain unsafe XML
characters.

When using LDAP with distributed search, the failsafe user/password should be synchronized
across your distributed search nodes, along with the splunk.secret file and the hashed passwords
in authentication.conf, which must match splunk.secret.

When using LDAP with distributed search, users must exist on all search nodes. This means
that you must perform a reload auth or splunk restart on all the search nodes to acquire new
users.

In order for Splunk to recognize LDAP membership changes, you must reload the
authentication configuration. This includes adding or removing users.

Configure roles
Configure flexible roles by editing authorize.conf. Roles are defined by lists of capabilities. You can
also use roles to create fine-grained access controls by setting a search filter for each role.
Caution: Do not edit or delete any roles in
$SPLUNK_HOME/etc/system/default/authorize.conf. This could break your admin
capabilities. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own custom application
directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see
how configuration files work.
Default Roles
There are three default roles provided with Splunk:
User
Power User
Admin
The User role is the most limited role and is intended to provide access to search and customization
that is unlikely to have high impact on the Splunk environment. Users can, by default:
Run searches and use the normal features of the event investigation and reporting in Splunk
Web

Create saved searches for their own use
Create event types
The Power User role adds access to resource-intensive abilities and advanced searches to all the
capabilities of the User role. Power users can:
Do everything that Users can do.
Create saved searches for use by other users.
Create and modify tags
Schedule saved searches, and create and modify alerts on these scheduled saved searches.
Set up scheduled searches to populate pre-existing summary indexes.
Use Live Tail
The Admin role is set up to maintain, configure, and administrate the Splunk deployment. The Admin
role adds everything else, including:
General administration: modifying inputs, forwarding, users, authentication methods, index
configuration, roles. Generally everything accessible via the Admin page.

Modification of saved searches owned by other users, including schedules and deleting them.
Access to the special delete search command to hide data.
Access to the special search commands to modify splunkd logging.
Access to the crawl search feature.
Access to some debugging features.
Configuration
Add the following attribute/value pairs to $SPLUNK_HOME/etc/system/local/authorize.conf.
[role_$ROLE_NAME]
$CAPABILITY1 = enabled
$CAPABILITY2 = enabled
...
importRoles = $OTHER_ROLE
srchFilter = $SEARCH_STRING
role_$ROLE_NAME:
the name you want to give your role, for example security, compliance, ninja.

$CAPABILITY1:
any capability from the list below. You can have any number of capabilities for a role.
importRoles = <role>:
when set, the current role will inherit all the capabilities from <role>.

srchFilter = <search>:
use this field for fine-grained access controls. Searches for this role will be filtered by
this expression.


srchTimeWin = <string>
maximum time span of a search

Valid search strings
The srchFilter field can include any of the following search terms:
source=
host= and host tags
eventtype= and event type tags
sourcetype=
search fields
wildcards
use OR to use multiple terms, or AND to make searches more restrictive
Note: Members of multiple roles inherit capabilities from the role with the loosest permissions. In the
case of search filters, if a user is assigned to roles with different search filters, they are all applied.
The search terms cannot include:
indexes
saved searches
time operators
regular expressions
any fields or modifiers Splunk Web can overwrite
Map a role to a user
Once you've created a role in authorize.conf, map it to a user via Splunk Web.
Click on the admin link in the upper right-hand corner.
Then, select the Users tab.
Enter the username, password and full name.
Choose which role to map to from the Role list.
Any custom roles you have created via authorize.conf should be listed here.

Important: If you change (and save) an existing user/group role LDAP mapping from within Splunk
Web, all users currently logged in to Splunk Web will be automatically logged out of Splunk Web
immediately and must log back in to proceed. This is done to ensure that any users who should no
longer have access as a result of the role mapping change are indeed denied access.
Note: You must restart Splunk after making changes to authorize.conf. Otherwise, your new
roles will not appear in the Role list.
Prevent persistent dashboard changes by role
You can prevent persistent dashboard changes on a per-role basis via web.conf.
In $SPLUNK_HOME/etc/system/local/web.conf add the following:
disablePersistedPrefs = <role>
This prevents any changes a role makes from being written to prefs.conf.
Example
The following example creates the role of Ninja. This user can do everything listed as capabilities
(e.g., edit_input). Also, the Ninja role imports the capabilities of the Security and Compliance roles
-- meaning Ninja can do everything (and more) that Security and Compliance can do.
Additionally, there is a search filter which means that Ninja can only run searches on hosts swan or
pearl.
[role_Ninja]
edit_input = enabled
delete_input = enabled
edit_global_save_search = enabled
delete_global_save_search = enabled
create_alert = enabled
start_alert = enabled
start_global_alert = enabled
stop_alert = enabled
stop_global_alert = enabled
save_local_eventtype = enabled
edit_role_search = enabled
edit_local_search = enabled
edit_saved_search = enabled
savesearch_tab = enabled
allow_livetail = enabled
importRoles = Security;Compliance
srchFilter = host=swan OR host=pearl
Scripted authentication
Splunk ships with support for three authentication systems: Splunk's built-in system, LDAP and a new
scripted authentication API. The scripted authentication system allows you to set up Splunk to
interface with an authentication system you already have in place -- such as PAM or RADIUS. Set up
authentication using authentication.conf.
For the most up-to-date information on scripted authentication, see the README file in
$SPLUNK_HOME/share/splunk/authScriptSamples/. There are sample scripts in this
directory for PAM and RADIUS, as well as a sample authentication.conf for each auth system.
Note: These scripts are samples, and must be edited to work in your specific environment.
Known issues with scripted authentication
Scripted authentication does not currently work with distributed search.
Everybody gets User-level privileges. Use the admin section of Splunk Web to map your users
to the correct Splunk role.
There is also a sample user-mapping script in
$SPLUNK_HOME/share/splunk/authScriptSamples/. To use it, you must adapt
the script to suit your environment; it is not designed to work without customization.


Configuration
Configure scripted auth via authentication.conf. If you're using PAM, you may also need to edit your
system's pamauth file, /etc/pam.d/pamauth.
Authentication.conf
Add the following settings to authentication.conf in $SPLUNK_HOME/etc/system/local/
(or your custom app directory) to enable your specific script. You can also copy the sample
authentication.conf from $SPLUNK_HOME/share/splunk/authScriptSamples/.
Specify scripted as your authentication type under the [authentication] stanza heading:
[authentication]
authType = Scripted
authSettings = script
Set script variables under the [script] stanza heading:
[script]
scriptPath = $SPLUNK_HOME/bin/python $SPLUNK_HOME/share/splunk/authScriptSamples/<scriptname>
scriptSearchFilters = 1
Set scriptSearchFilters to 1 if you want to enable search filters for roles mapped to users. Set
to 0 to disable.
Optionally, add a [cacheTiming] stanza if needed for your script. Use these settings to adjust the
frequency at which Splunk calls your application. Each call has its own timeout specified in seconds.
Caching does not occur if not specified.
[cacheTiming]
userLoginTTL = 1
searchFilterTTL = 1
getUserInfoTTL = 1
getUserTypeTTL = 1
getUsersTTL = 1
Script commands
Scripted authentication includes the following commands to use in your script. Here is a descriptive
list of these commands, including their inputs and outputs.
userLogin: login with username/password pair
in: --username=<username> --password=<password> (passed over stdin)
out: --status=<status_bit> (success or fail) --search_filter=<search_filter> (optional)
--authToken=<tok> (optional)

getUserType: this command corresponds to the role within Splunk (for example Admin,
Power or User)
in: --username=<username> --authToken=<tok> (optional)
out: --status=<status_bit> --role=<role> (eg Admin)

getUserInfo: get user information
in: --username=<username> --authToken=<tok> (optional)
out: --status=<status_bit>
--userInfo=<userId>;<username>;<realname>;<role>


Supplemental calls:
getUsers
in: --authToken=<tok> (optional)
out: --status=<status_bit>
--userInfo=<userId>;<username>;<realname>;<role>
--userInfo=<userId>;<username>;<realname>;<role>....


Advanced calls:
checkSession
in: --authToken=<tok> (optional)
out: --status=<status_bit>

getSearchFilter = <role>
This command corresponds to the role within Splunk (for example Admin, Power or
User).

in: --username=<username> --authToken=<tok> (optional)
out: --status=<status_bit> --search_filter=<filter> (you can have
one or more --search_filter)


Every out starts with a <status_bit> which is one of the following:
success
The command succeeded correctly.

tmp_fail
Temporary failure of auth plugin. Attempt to just go on.

auth_fail
Failure to authenticate. Terminate the user's session.

PAM auth
If you're using PAM and you're unable to auth after following the steps in the README, make sure
you've added an entry to the system to support pamauth config. Edit /etc/pam.d/pamauth and put
this line in:
auth sufficient pam_unix.so
File system change monitor
Splunk's file system change monitor is useful for tracking changes in your file system. The file system
change monitor watches any directory you specify and generates an event (in Splunk) when that
directory undergoes any change. It is completely configurable and can detect when any file on the
system is edited, deleted or added (not just Splunk-specific files). For example, you can tell the file
system change monitor to watch /etc/sysconfig/ and alert you any time the system's
configurations are changed.
Configure file system change monitor in inputs.conf.
Note: If you're interested in auditing file reads on Windows, check out this topic on the Splunk
Community best practices Wiki. Some users might find it more straightforward to use Windows native
auditing tools.
How the file system change monitor works
The file system change monitor detects changes using:
modification date/time
group ID
user ID
file mode (read/write attributes, etc.)
optional SHA256 hash of file contents
You can configure the following features of the file system change monitor:
white listing using regular expressions
specify files that will be checked no matter what

black listing using regular expressions
specify files to skip

directory recursion
including symbolic link traversal
scanning multiple directories, each with their own polling frequency

cryptographic signing
creates a distributed audit-trail of file system changes

indexing entire file as an event on add/change
size cutoffs for sending entire file and/or hashing

all change events indexed by and searchable through Splunk
Configure the file system change monitor
By default, the file system change monitor will generate events whenever the contents of
$SPLUNK_HOME/etc/ are changed, deleted, or added to. When you start Splunk for the first time,
an add audit event will be generated for each file in the $SPLUNK_HOME/etc/ directory and all
sub-directories. Any time after that, any change in configuration (regardless of origin) will generate an
audit event for the affected file(s). If you have signedaudit=true, the file system change audit
event will be indexed into the audit index (index=_audit). If signedaudit is not turned on, by default
the events are written to the main index unless you specify another index.
Note: The file system change monitor does not track the user name of the account executing the
change, only that a change has occurred. For user-level monitoring consider using native operating
system audit tools, which have access to this information.
You can use the file system change monitor to watch any directory by adding a stanza to
inputs.conf.
Create your own inputs.conf in $SPLUNK_HOME/etc/system/local/. Edit this files in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Edit the [fschange] stanza to configure the file system change monitor. Every setting is optional
except the stanza name fschange:<directory or file to monitor>.
Note: You must restart Splunk any time you make changes to the [fschange] stanza.
[fschange:<directory or file to monitor>]
index=<indexname>
recurse=<true | false>
followLinks=<true | false>
pollPeriod=N
hashMaxSize=N
fullEvent=<true | false>
sendEventMaxSize=N
signedaudit=<true | false>
filters=<filter1>,<filter2>,...<filterN>
Possible attribute/value pairs
[fschange:<directory or file to monitor>]
The system will monitor all adds/updates/deletes to this directory and sub-directories.
Any changes will generate an event that is indexed by Splunk.
Defaults to $SPLUNK_HOME/etc/.
index=<indexname>
The index to store all events generated.
Defaults to main (unless you have turned on audit event signing).
recurse=<true | false>
If true, recurse directories within the directory specified in [fschange].
Defaults to true.
followLinks=<true | false>
If true, the file system change monitor will follow symbolic links.
Defaults to false.
Caution: If you are not careful with setting followLinks, file system loops may occur.
pollPeriod=N
Check this directory for changes every N seconds.
Defaults to 3600.
If you make a change, the file system audit events could take anywhere between 1 and
3600 seconds to be generated and become available in audit search.


hashMaxSize=N
Calculate a SHA1 hash for every file that is less than or equal to N size in bytes.
This hash can be used as an additional method for detecting change in the file/directory.
Defaults to -1 (no hashing used for change detection).
signedaudit=<true | false>
Send cryptographically signed add/update/delete events.
Defaults to false.
Setting to true will generate events in the _audit index.
This should be deliberately set to false if you wish to set the index.
Note: When setting signedaudit to true, make sure auditing is enabled in audit.conf.
fullEvent=<true | false>
Send the full event if an add or update change is detected.
Further qualified by the sendEventMaxSize attribute.
Defaults to false.
sendEventMaxSize=N
Only send the full event if the size of the event is less than or equal to N bytes.
This limits the size of indexed file data.
Defaults to -1, which is unlimited.
sourcetype = <string>
Set the sourcetype for events from this input.
"sourcetype=" is automatically prepended to <string>.
sourcetype = fs_notification by default.
filesPerDelay = <integer>
Injects a delay specified by 'delayInMills' after processing <integer> files.
This is used to throttle file system monitoring so it doesn't consume as much CPU.
delayInMills = <integer>
The delay in milliseconds to use after processing every <integer> files as specified in
'filesPerDelay'.

This is used to throttle file system monitoring so it doesn't consume as much CPU.
filters=<filter1>,<filter2>,...<filterN>
Each of these filters will apply from left to right for each file or directory that is found during the
monitor's poll cycle.
To define a filter, add a [filter...] stanza as follows:
[filter:blacklist:backups]
regex1 = .*bak
regex2 = .*bk
[filter:blacklist:code]
regex1 = .*\.c
regex2 = .*\.h
[fschange:/etc]
filters = backups,code
Fschange whitelist/blacklist logic is handled quite similarly to typical firewalls. The events pass down the
list of filters until they reach their first match. If the first filter to match an event is a whitelist, the event
will be indexed. If the first filter to match an event is a blacklist, the event will not be indexed. If an
event reaches the end of the chain with no matches, it will be indexed. To create a no-index default,
end the chain with a blacklist that matches all events. For example:
...
filters = <filter1>, <filter2>, ... terminal-blacklist
[filter:blacklist:terminal-blacklist]
regex1 = .
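Putting these attributes together, here is a minimal sketch of a stanza that watches /etc/sysconfig
(the example directory mentioned at the top of this topic), polling every 60 seconds and indexing the
full contents of changed files up to 10,000 bytes; adjust the index, polling interval, and size cutoff for
your environment:
[fschange:/etc/sysconfig]
index = main
recurse = true
pollPeriod = 60
fullEvent = true
sendEventMaxSize = 10000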
Audit events
With auditing enabled, Splunk logs distinct events to the audit index (index=_audit). Every
interaction with Splunk -- search, configuration changes, etc -- generates an audit event. Directories
monitored by file change monitoring create audit events as well. This page outlines the composition
and generation of audit events.
Note: The punct field is not available for events in the _audit index because those events are signed
using PKI at the time they are generated.
Audit event composition
Timestamp:
date and time of the event.

User information:
the user who generated the event.
If the event contains no user information, Splunk sets the user to whoever is currently
logged in.


Additional information:
available event details -- what file, success/denial, etc.

ID (only if audit event signing is turned on):
a sequential number assigned to the event for detecting gaps in data.

Hash signature:
PKI encrypted SHA256 hash signature, including the timestamp and ID.

Additional attribute/value pairs specific to the type of event.
Example
The following is a sample signed audit log entry:
11-01-2007 09:23:59.581 INFO AuditLogger - Audit:[timestamp=Thu Nov 1 09:23:59 2007, id=1, user=admin, action=splunkStarting, info=n/a][NSsJkuZZNn1dKaH3tjgxN/RbGeKaQ/dXArIdK2M97E0Ckv6xqMurYbUVqC6YoICLjW/H113u6FDTPMBGdk29J95X1SecazMf+H1tRqfc+vcJPZH1RcQaiVCcJwRTJuXD4Z5JidyvjVIECIdrhPSAGj7CSEhTdYx4tOEfl5yMckU=]
The information within the first set of brackets ([ ]) is the hashed and signed data. The string in the
second set of brackets is the hash signature.
Audit event generation
Audit events are generated from monitoring:
all files in Splunk's configuration directory $SPLUNK_HOME/etc/*
files are monitored for add/change/delete using the file system change monitor.

system start and stop.
users logging in and out.
adding / removing a new user.
changing a user's information (password, role, etc).
execution of any capability in the system.
capabilities are listed in authorize.conf

Audit event storage
Splunk stores audit events locally in the audit index (index=_audit). Audit events are logged in the
log file: $SPLUNK_HOME/var/log/splunk/audit.log.
If you have configured Splunk as a forwarder in a distributed setting, audit events are forwarded like
any other event. Signing can happen on the forwarder, or on the receiving Splunk instance.
Audit event processing
The file audit.conf tells the audit processor whether or not to encrypt audit events. As audit events are
generated, Splunk's auditing processor assigns a sequence number to the event and stores the event
information in a SQLite database. If there is no user information specified when the event is
generated, Splunk uses the currently logged-in user's information. Finally, if audit event signing is set,
Splunk hashes and encrypts the event.
Search for audit events
Search audit events in Splunk Web or in Splunk's CLI. To do this, pipe your searches to the new
audit command. The audit search command is most useful if audit event signing has been configured.
However, if you want to search for all audit events where audit event signing has not been configured
(or to skip integrity validation) you may search the whole audit index.
To search for all audit events, specify the _audit index:
index=_audit
This search returns all audit events.
Pipe your search to the audit command:
index=_audit | audit
This search returns the entire audit index, and processes the audit events it finds through the audit
command. Events piped to audit show up with decorations.
Narrow your search before piping to the audit command. However, you can only narrow the time
range, or constrain by a single host. This is because each host has its own ID number sequence.
Since sequential IDs exist to enable detection of gaps in audit events, narrowing a search across
multiple hosts causes false gap detection and decoration in the audit event trail.
Audit event signing
Splunk creates audit trail information (by creating and signing audit events) when you have auditing
enabled. Audit event signing is only available if you are running Splunk with an Enterprise license.
How audit event signing works
The audit processor signs audit events by applying a sequence number ID to the event, and by
creating a hash signature from the sequence ID and event's timestamp. Configurable settings for
audit event signing are explained in Configure audit event signing, below.
Sequence numbering
The sequence number ID is useful to detect gaps in data which often identify tampering with the
system. When a gap in data is discovered, the gap is "decorated" according to the decoration
specification in decorations.conf (which ties to CSS style settings in prefs.conf).
Note: Decoration adds the name of the decoration to the _decoration metadata in the event. The
name of this decoration is derived by looking in decorations.conf and mapping the right side of
each attribute/value pair to the appropriate key in prefs.conf.
Hash encryption
For each processed audit event, Splunk's auditing processor computes an SHA256 hash on all of the
data. The processor then encrypts the hash value and applies Base64 encoding to it. Splunk then
compares this value to whatever key (your private key, or the default keys) you specify in audit.conf.
Configure audit event signing
Configure the following settings of Splunk's auditing feature through audit.conf:
Turn on and off audit event signing.
Set default public and private keys.
Configure audit.conf
Create your own audit.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own
custom application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration
files in general, see how configuration files work.
Generate your own keys using genAuditKeys.py in $SPLUNK_HOME/bin/:
# python genAuditKeys.py
Note: You may need to set environment variables by running "source setSplunkEnv"
This creates your private and public keys, $SPLUNK_HOME/etc/auth/audit/private.pem and
$SPLUNK_HOME/etc/auth/audit/public.pem. To use these keys, set privateKey and
publicKey to the path to your keys in your $SPLUNK_HOME/etc/system/local/audit.conf:
[auditTrail]
privateKey = $PATH_TO_PRIVATE_KEY
publicKey = $PATH_TO_PUBLIC_KEY
Note: If the [auditTrail] stanza is missing, audit events are still generated, but not signed. If the
publicKey or privateKey values are missing, audit events will be generated but not signed.
Event hashing
Event hashing is a lightweight alternative to IT data signing. It provides a simple way to detect if
events have been tampered with between index time and search time.
Event hashes aren't cryptographically secure. Someone could tamper with an event if they have
physical access to a machine's file system. You should use event hashing only if you don't have the
capability to run Splunk's IT data signing feature.
How event hashing works
When event hashing is enabled, Splunk hashes events with a SHA256 hash just before index time.
When each event is displayed at search time, a hash is calculated and compared to that event's
index time hash. If the hashes match, the event is decorated in the search results as "valid". If the
hashes don't match, the event is decorated as "tampered". (In the CLI, the value of the decoration is
stored in the _decoration field.)
Configure event hashing by editing $SPLUNK_HOME/etc/system/local/audit.conf. Set up
event hashing filters that whitelist or blacklist events based on host, source, or sourcetype.
A whitelist is a set of criteria that events must match to be hashed. If events don't match, they
aren't hashed.
A blacklist is a set of criteria that events must match to NOT be hashed. If events don't match,
then they are hashed.
See more on configuring event hashing below.
Event hashing in search results
Splunk provides different visual indicators for your search results depending on the interface you use.
In Splunk Web
Search results are decorated in Splunk Web with decorations showing whether an event is valid or
has been tampered with.
If an event is valid, a validity decoration appears above the raw event data. If an event has been
tampered with, a tampering decoration appears there instead.
In the CLI
Search results in the CLI return the value for the event hash result in the _decoration field.
Manipulate and run reports on the _decoration field the same way as you'd do for any other field.
Example:
./splunk search '* | top _decoration'
The resulting output:
_decoration count percent
-------------- ----- ---------
audit_valid 50 50.000000
audit_tampered 50 50.000000
Configure custom decorations for Splunk Web
Splunk event hashing uses the standard decorations described in the above section. Configuring
custom decorations is optional.
Event hashing decorations are controlled by the same "valid" and "tampered" keys used by audit
decorations. Configure the "valid" and "tampered" keys in
$SPLUNK_HOME/etc/system/local/decorations.conf to change event hashing decorations.
Follow the instructions on the customize audit decorations page for detailed instructions.
Configure event hashing
Turn on event hashing by adding an [eventHashing] stanza to audit.conf. If you want to add
filters to event hashing, list each filter for which you have a filterSpec stanza in a
comma-separated list in the filters = key.
Configure filtering
Set up filters for event hashing in audit.conf. Create a stanza after the [eventHashing] stanza
to define a filter. Specify the details of each filter using comma-separated lists of hosts, sources, and
sourcetypes.
[filterSpec:FilterType:NameOfFilter]
host=<comma separated list of hosts>
source=<comma separated list of sources>
sourcetype=<comma separated list of sourcetypes>
Next, turn on specific filters by adding a filters= key under the [eventHashing] stanza with a list
of the names of the filters you want enabled.
[eventHashing]
filters=filter1,filter2,...
Note: The filter list is an OR list that is evaluated left to right. Currently, there is no support for an
AND list of filters.
Event hashing filter precedence
1. Filters are evaluated from left to right.
2. Whitelist filters are evaluated before blacklist filters.
3. If an event doesn't match a filter and no more filters exist, then it will be hashed.
Configure a whitelist filter
Create a whitelist filter by changing the filter type in the filterSpec stanza to event_whitelist.
[filterSpec:event_whitelist:<specname>]
host=<comma separated list of hosts>
source=<comma separated list of sources>
sourcetype=<comma separated list of sourcetypes>
Configure a blacklist filter
Create a blacklist filter by changing the filter type in the filterSpec stanza to event_blacklist.
[filterSpec:event_blacklist:<specname>]
host=<comma separated list of hosts>
source=<comma separated list of sources>
sourcetype=<comma separated list of sourcetypes>
Example filter configurations
Turn on hashing for all events:
[eventHashing]
(Yes, just one line.)
Simple blacklisting:
Doesn't hash events from any of the listed hosts. Events that do not come from the listed hosts
will be hashed.
[filterSpec:event_blacklist:myblacklist]
host=foo.bigcompany.com, 45.46.1.2, 45.46.1.3
[eventHashing]
filters=myblacklist
Multiple type blacklisting:
Doesn't hash events from any of the listed hosts, sources, or sourcetypes. Events from all other
hosts, sources, or sourcetypes will be hashed.
[filterSpec:event_blacklist:myblacklist]
host=somehost.splunk.com, 46.45.32.1
source=/some/source
sourcetype=syslog, apache.error
[eventHashing]
filters=myblacklist
Simple whitelisting:
Hashes only events that contain the specified sourcetype. Events from any other sourcetype
won't be hashed.
(Note the use of the "all" tag in the blacklist specification.)
[filterSpec:event_whitelist:allow_syslog]
sourcetype=syslog
[filterSpec:event_blacklist:denyall]
#"all" is a special tag that matches all events
all=True
[eventHashing]
filters=allow_syslog, denyall
IT data signing
IT data signing helps you certify the integrity of your IT data. If you enable IT data signing and index
some data, Splunk tells you if that data is ever subsequently tampered with at the source. For
example, if you have enabled IT data signing and index a log file in Splunk, Splunk will warn you if
anyone removes or edits some entries from that log file on the original host. You can thus use Splunk
to confirm whether your data has been tampered with.
Note: Signing IT data is different than signing Splunk audit events. IT data signing refers to signing
external IT data while it is indexed by Splunk; audit events are events that Splunk's auditing feature
generates and stores in the audit index.
How IT data signatures work
Splunk takes external IT data (typically in the form of log files), and applies digital signatures and
signature verification to show whether indexed or archived data has been modified since the index
was initially created.
A signature for a block of IT data involves three things:
A hash is generated for each individual event.
The events are grouped into blocks of a size you specify.
A digital signature is generated and applied to each block of events.
Note: Splunk can encrypt the hash to create a digital signature if you have configured the public and
private keys in audit.conf. See Configure audit event signing for details.
This digital signature is stored in a database you specify and can be validated as needed. Splunk can
demonstrate data tampering or gaps in the data by validating the digital signature at a later date. If
the signature does not match the data, an unexpected change has been made.
Configure IT data signing
This section explains how to enable and configure IT data signing. You enable and configure IT data
signing for each index individually, and then specify one central database for all the signing data.
Set configurations in indexes.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or
your own custom application directory in $SPLUNK_HOME/etc/apps/. For more information on
configuration files in general, see how configuration files work. Do not edit the copy in default.
Then, configure IT data signing by editing the indexes.conf you created.
Enable or disable IT data signing.
Specify the number of events contained in your IT data signatures.
Specify the database to store signing data in.
Note: You must configure audit event signing by editing audit.conf to have Splunk encrypt the
hash signature of the entire data block.
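For example, a minimal sketch that combines the two settings (the key paths and block size are
illustrative assumptions, not requirements):
In $SPLUNK_HOME/etc/system/local/audit.conf:
[auditTrail]
privateKey = $SPLUNK_HOME/etc/auth/audit/private.pem
publicKey = $SPLUNK_HOME/etc/auth/audit/public.pem
In $SPLUNK_HOME/etc/system/local/indexes.conf:
[main]
blockSignSize = 100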
Enable or disable IT data signing
Enable and disable IT data signing by changing the value of the blockSignSize= key. This key
specifies the number of events that make up a block of data to apply a signature to. By default, IT
data signing is turned off on all indexes.
blockSignSize=<integer> (default = 0)
To enable IT data signing, set the blockSignSize= key to any integer value.
This example turns IT data signing ON in index=main, and sets the number of events per each
signature block to 100.
[main]
blockSignSize=100
...
To disable IT data signing, set the blockSignSize= key equal to 0.
This example turns IT data signing OFF for index=main.
[main]
blockSignSize=0
...
Specify the number of events in an IT data signature
Specify the number of events in an IT data signature by setting the value of the blockSignSize=
key. The default value for all indexes is 0. Set this key to a value greater than 0 to both turn on IT
data signing and set the number of events per IT signature block. You must set this key for each
index using IT data signing.
Note: the maximum number of events for the blockSignSize key is 2000.
This example sets the number of events in each IT data signature to 100 in index=main.
[main]
blockSignSize=100
...
Define the signature database
The IT data signature information from each index for which you have configured IT data signing is
stored in the signature database. Set the value of the blockSignatureDatabase= key to the
name of the database where Splunk should store IT signature data. This is a global setting that
applies to all indexes.
blockSignatureDatabase = <database name string> (default = _blocksignature)
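For example, a minimal sketch, assuming the setting is placed at the top of your custom indexes.conf
(outside any index stanza) so that it applies globally; the database name shown is simply the default:
blockSignatureDatabase = _blocksignature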
View the integrity of IT data
To view the integrity of indexed data at search time, open the Show source window for results of a
search. To bring up the Show source window, click the drop-down box at the left of any search
result. Select Show source and a window will open displaying the raw data for each search result.
The Show source window displays decorations correlating with whether the block of IT data has
gaps, has been tampered with, or is valid (no gaps or tampering).
The default decorations mark each block of events as valid, tampered with, or as having gaps in data.
You can customize the decorations by configuring the CSS style associated with the event type.
Learn how to configure dynamic event rendering to customize the decorations.
Performance implications
Because of the additional processing overhead, indexing with IT data signing enabled can negatively
affect indexing performance. Smaller blocks mean more blocks to sign and larger blocks require more
work on display. Experiment with block size to determine optimal performance, as small events can
effectively use slightly larger blocks. The block size setting is a maximum, you may have smaller
blocks if you are not indexing enough events to fill a block in a few seconds. This allows incoming
events to be signed even when the indexing rate is very slow.
224
Turning IT data signing ON slows indexing.
Setting the blockSignSize= key to high integer values (for example, 1000) slows indexing
performance.
For best performance, set blockSignSize= to a value near 100.
Archive signing
Use archive signing to sign your data as it is archived (moved from colddb to frozen). This lets you
verify integrity when you restore an archive. You can see whether your data was tampered with by
comparing the hash signatures, and you can also encrypt your signatures to further prevent tampering.
Configure the size of the slice by setting your automated archiving policies.
How archive signing works
Data is archived from colddb to frozen when either:
the size of colddb reaches a maximum that you specify, or
data in colddb reaches a certain age.
Specify automated archiving policies to define how your data is archived.
Data is archived from colddb to frozen with a coldToFrozen script that you specify. Splunk ships with
two standard scripts, but you can use your own. The coldToFrozen script tells Splunk
how to format your data (gz, raw, etc..), and where to archive it. Archive signing happens after the
coldToFrozen script formats your data into its archive format, and then the data is moved to the
archive location that you specified according to your archive policy.
An archive signature is a hash signature of all the data in the data slice. Splunk can encrypt the hash
signature if you have audit event signing configured.
To invoke archive signing, use the standalone signtool utility. Add signtool -s
<path_of_archive> to the coldToFrozen script anywhere after the data formatting lines, but
before the lines that copy your data to your archive. See the section below on configuring
coldToFrozen scripts.
Verify archived data signatures
Splunk verifies archived data signatures automatically upon restoring. You can verify signatures
manually by using signtool -v <path_to_archive>.
Note: If your archive signatures are encrypted, you can only verify them in Splunk instances that
have a public key corresponding to the private key that the data was archived from (set when
configuring audit event signing).
Configure coldToFrozen scripts
Configure any coldToFrozen script by adding a line for the signtool utility.
Note: If you use a standard Splunk archiving script, either rename the script or move it to another
location (and specify that location in indexes.conf) to avoid having changes overwritten when you
upgrade Splunk.
Standard Splunk archiving scripts
The two standard archiving scripts that are shipped with Splunk are shown below with archive
signing.
Splunk's two archiving scripts are:
compressedExport.sh
This script exports files with the tsidx files compressed as gz.
#!/bin/sh
gzip $1/*.tsidx
signtool -s <path_to_archive> # replace this with the path to the archive you want signed
cp -r $1 /opt/tmp/myarchive #replace this with your archive directory
flatfileExport.sh
This script exports files as a flat text file.
#!/bin/sh
exporttool $1 ${1}/index.export meta::all
rm -rf ${1}/*.data
rm -rf ${1}/rawdata
rm -rf ${1}/*.tsidx
signtool -s <path_to_archive> # replace this with the path to the archive you want signed
cp -r $1 /opt/tmp/myarchive #replace this with your archive directory
Your own custom scripts
You can also use your own scripts to move data from cold to frozen.
Sign or verify your data slices
Use signtool, located in $SPLUNK_HOME/etc/bin, to sign data slices as they are archived or
verify the integrity of an archive.
Syntax
To sign:
signtool [-s | --sign] archive_path
To verify:
signtool [-v | --verify] archive_path
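For example, a sketch of signing an archive and then verifying it later (the archive path is the
hypothetical one used in the scripts above):
signtool -s /opt/tmp/myarchive
signtool -v /opt/tmp/myarchive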
Data Management
Splunk data management
Splunk stores all processed data in indexes. Splunk ships with preconfigured indexes in
$SPLUNK_HOME/etc/system/default/indexes.conf. The following is a list of the indexes and
what they contain:
main: All processed data. Unless specified otherwise, this is the default index for all your data.
history: All search history.
splunklogger: Internal logs.
summary: Summary indexing searches.
_audit: Events from the file system change monitor and auditing.
_blocksignature: Event block signatures.
_internal: Metrics from Splunk's processors.
_thefishbucket: Internal information on file processing.
Each index is a collection of databases located in $SPLUNK_HOME/var/lib/splunk. Databases
are named as db_<starttime>_<endtime>_<seq_num>.
By default, Splunk searches through the main index. If you want to restrict your search to an index
other than main, use index= to specify the index in your search. For example, to search for
userid=henry.gale only in the hatch index:
index=hatch userid=henry.gale
Index management
You can add and remove indexes or move existing indexes.
Manage your indexes by configuring:
Disk usage settings
Data retirement policies
How Splunk archives data
Configure Splunk to use multiple partitions for its datastore, or use a write once, read many storage
device.
Configuration files for index management
Splunk's indexes are managed through indexes.conf. Edit this file in
$SPLUNK_HOME/etc/system/local/, or your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work.
Note: Settings in indexes.conf are configured per index (rather than a global server setting).
Before making changes to how Splunk manages data consider:
Your data retention policies.
How much data your Splunk deployment will consume (for example: 50GB/day).
Where your Splunk index datastores will live.
Create an index
Splunk ships with an index called main that, by default, holds all your events. Splunk with an
Enterprise license lets you add an unlimited number of additional indexes. The main index serves as
the default index for any input and search command that doesn't specify an index, although you can
change the default. You can add indexes via Splunk Web, Splunk's CLI or indexes.conf.
Splunk searches automatically look through the default index (by default, main) unless otherwise
specified. If you have created a new index, or want to search in any index that is not default, you
must specify the index in your search:
index=hatch userid=henry.gale
This searches in the hatch index for the userid=henry.gale.
via Splunk Web
Note: To apply any changes that you make to the indexes, such as editing properties or adding a
new index, you must restart Splunk. In Splunk Web, you can restart the Splunk server from Admin >
Server: Control Server. Just click Restart Now.
Create a new index
The Admin > Indexes: Create Index page lets you define the properties for a new index. To create a
new index, enter:
A name for the index.
The maximum size (in MB) of the hot database.
The maximum size (in MB) of the index.
Note: Index names must consist of only numbers, letters, periods, underscores, and dashes.
If you check Advanced settings, the list of properties expands. Advanced properties include:
The maximum number of search results.
The maximum number of warm database directories.
The maximum number of cold databases open at any given time.
The frequency at which new hot databases are to be created.
The frequency at which cold databases are to be frozen.
The script and directory to archive the index's data.
The number of concurrently running optimize processes.
Whether to wait for optimize processes to finish or just kill them.
The number of extra threads to use during indexing.
The amount of memory (in MB) to allocate for indexing.
The number of events to make up a block for block signatures.
After setting the index's properties, click Add. Then, restart Splunk to save and apply your changes.
You can also edit an index at any time by clicking on the index name within the Indexes tab of the
Admin section of Splunk Web. Properties that you cannot change are grayed out. To change these
properties, use indexes.conf.
via Splunk's CLI
To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. You can also add Splunk to your path and use the splunk command.
To add an index, first shut down Splunk with splunk stop. Then, from Splunk's CLI, type:
./splunk add index [name]
Note: Do not use capital letters in your index name; this is a known problem that will be fixed.
The add index command brings you to a dialog session. Specify the configurations of your new
index:
./splunk add index hatch
Hit enter to accept the default values in parentheses, or enter your own values.
Note: To apply any changes that you make to the indexes, such as editing properties or adding a
new index, you must restart Splunk. If you restart Splunk from the CLI, you are prompted to approve
the creation of the new index. To restart without requiring a response (to automatically respond
"yes"), type:
./splunk restart --answer-yes
via indexes.conf
Add a stanza to indexes.conf in $SPLUNK_HOME/etc/system/local. See configuration details
and examples in indexes.conf.spec.
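For example, a minimal sketch of a custom index stanza (the index name and paths are assumptions;
see indexes.conf.spec for the full set of attributes and their defaults):
[hatch]
homePath = $SPLUNK_HOME/var/lib/splunk/hatch/db
coldPath = $SPLUNK_HOME/var/lib/splunk/hatch/colddb
thawedPath = $SPLUNK_HOME/var/lib/splunk/hatch/thaweddb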
Delete an index
To remove any indexes you don't want, use indexes.conf or Splunk's CLI.
via indexes.conf
Remove the index stanza from indexes.conf. Custom indexes are in
$SPLUNK_HOME/etc/system/local, or your own custom application directory in
$SPLUNK_HOME/etc/apps/.
via the CLI
You can also delete an index through the CLI.
./splunk remove index [name]
This command deletes the index from your Splunk instance.
Remove (delete) data
Use Splunk's tools to remove various types of data from your Splunk installation. With Splunk's tools,
you can remove:
Event data from an index.
User account data (all of your created user accounts).
Events from searches.
Note: You must have admin privileges to remove data.
You have two options when removing data from Splunk:
Use the clean command in the CLI to completely remove data (event and user data) from the
index. Typically, you do this before re-indexing all your data. Note: You must shut down
splunkd to delete data in this manner.
Use the delete:: modifier to specify that certain events not appear in search results.
Because delete:: is slower than clean, use it only if you wish to re-index a small subset of
your data sources--perhaps you want to reconfigure time stamp recognition for a single data
source before re-indexing it. For example, delete events from the source "foo"
(delete::source::foo) if you wish to re-index the source "foo".
Caution: Removing data is irreversible. Use caution when choosing what events to remove from
searches, or what data to remove from your Splunk installation. If you want to get your data back, you
must re-index the applicable data source(s).
Important: The CLI delete modifier was inadvertently disabled in versions 3.3.3 and 3.3.4 of
Splunk, and was reinstated in version 3.4. If you are running version 3.3.3 or 3.3.4, you can reinstate
it by hand. To reinstate it, add the following XML snippet after the domain finder module (at around
line 364) in $SPLUNK_HOME/etc/searchLanguage.xml:
<module>
<name>delete</name>
<requiredArgs>
<arg>delete</arg>
</requiredArgs>
<optionalArgs>
<arg>deleterestrict</arg>
</optionalArgs>
<defaults>
<delete>typeahead_suppress</delete>
<deleterestrict>typeahead_suppress</deleterestrict>
</defaults>
</module>
Then, restart Splunk.
The CLI command: clean
The clean CLI command deletes event data and user account data from your Splunk installation.
clean takes the following arguments: eventdata, userdata, and all.
Add the -f parameter to force clean to skip its confirmation prompts.
Note: From the Splunk CLI, type ./splunk help clean to access the help page for clean.
Remove event data from an index
Permanently remove event data from an index on your Splunk installation by typing ./splunk
clean followed by the eventdata argument. Specify an index to delete event data from a specific
index. If you don't specify an index, Splunk deletes all event data from all indexes.
Examples
Note: You must first stop Splunk before you can run any of these commands:
./splunk stop
This example tells Splunk to remove event data in all indexes (because no index argument is
specified).
./splunk clean eventdata
This example removes indexed event data from the internal index and forces Splunk to skip the
confirmation prompt.
./splunk clean eventdata internal -f
Remove user data
Remove user data (user accounts you've created) from your Splunk installation by typing ./splunk
clean followed by the userdata argument.
Examples
This example removes all of the user accounts you've created.
./splunk clean userdata
This example removes the user accounts you've created and forces Splunk to skip the confirmation
prompt.
./splunk clean userdata -f
Remove all data
Remove all user and indexed event data to return Splunk to its original installation state by typing
./splunk clean followed by the all argument.
Examples
This example removes all user and indexed event data.
./splunk clean all
This example removes all user and indexed event data you've created and forces Splunk to skip the
confirmation prompt.
./splunk clean all -f
Remove events from search results
This ONLY works in the CLI.
Use the delete:: modifier to remove events from your index based on an indexed field value, or by
matching a string. Access the delete:: modifier by using the oldsearch command in a CLI
search.
The delete:: modifier doesn't delete events from the index; it masks events from being displayed
in search results by flagging them with a value that makes them unsearchable.
Caution: Removing data is irreversible. Use caution when choosing what events to remove from
searches, or what data to remove from your Splunk installation. If you want to get your data back, you
must re-index the applicable data source(s).
Note: oldsearch is the deprecated version of the search command that you need to use to access
the delete:: modifier.
Syntax
In the CLI:
./splunk search ' | oldsearch delete::(host | source | sourcetype)::value '
Enter all fields and values in lowercase.
You can remove events based on values of any indexed field.
You can also remove events that match a string (delete::<string>) instead of matching a
field::value pair. The strings can't contain any spaces or commas, and you can't specify
multiple strings in a single argument.

Note: You need to authenticate when using oldsearch delete::xxx. Use the -auth search
parameter.
Examples
This example removes events of sourcetype=bar from the search results.
./splunk search ' | oldsearch delete::sourcetype::bar' -auth admin:changeme
This example removes events from the host "webserver1".
./splunk search ' | oldsearch delete::host::webserver1' -auth admin:changeme
NB: On Windows machines, use double-quotes (") instead of single-quotes (').
Export event data
Use the export CLI command to copy or archive events from Splunk's indexes. The export
command does not remove any data -- it just makes a copy.
Important: Because the export command runs on active index files, you may lose data unless you
stop Splunk before using it. You can run this command while Splunk is running, however.
via the CLI
Note: To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and preface CLI
commands with ./splunk.
At minimum, specify an index from which to export data and a directory to copy the exported data
into:
./splunk export eventdata <index> -dir <directory to copy into>
Optionally add search restrictions with -host, -source, -sourcetype or -terms:
./splunk export eventdata <index> -dir <directory to copy into> -host <host> -sourcetype <sourcetype> -source <source> -terms <search terms>
The exported data is recreated in directories and files matching the original sources in the destination
directory.
For example:
./splunk export eventdata my_apache_data index -dir /tmp/apache_raw_404_logs -host localhost -terms "404 html"
Export to CSV
You can export search results to CSV with the following commands.
To export the results of a search:
./splunk search '<search criteria>' -maxresults 200 -format csv >/splunk/export.csv
To export the results of a dispatched search:
./splunk dispatch '<search criteria>' -maxout 200 -format csv >/splunk/export.csv
Note: Type: ./splunk help export to see all of the export command's available arguments
and parameters.
via Splunk Web
To export data via Splunk Web, run your search and choose Export from the drop-down menu to the
left of the search box.
Select the format of the results (txt or CSV) and the number of events that should be exported.
Move the Splunk index
Move your Splunk indexed data from one location to another.
Caution: Do not try to break up and move parts of an index data store manually. If you need to
subdivide an existing index, contact Splunk Support for assistance.
Configuration
1. Make sure the target filesystem has enough space - at least 1.2 times the size of the total amount
of raw data you plan to index.
2. Make sure the target directory has the correct permissions so that the splunkd process can write
to files there.
# mkdir /foo/bar
# chown splunk /foo/bar/
# chmod 755 /foo/bar/
3. When the new index home is ready, stop the server (if it is running) from Splunk's CLI.
To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. Or add Splunk to your path and use the splunk command.
# ./splunk stop
4. Copy the existing index filesystem to its new home.
# cp -r $SPLUNK_DB/* /foo/bar/
5. Edit ./etc/splunk-launch.conf to reflect the new index directory.
6. Inside ./etc/splunk-launch.conf, change the SPLUNK_DB variable to point to your new
index directory.
SPLUNK_DB=/foo/bar
Note: Ensure that the path $SPLUNK_HOME/var/lib/splunk/searches exists. Splunk saves a
small amount of index data here and without it your index may appear to vanish.
7. Start the server.
# ./splunk start
The Splunk Server picks up where it left off, reading from and writing to the new copy of its old index
filesystem.
Set a retirement and archiving policy
Configure data retirement and archiving policy by controlling the size of indexes or the age of data in
indexes.
For a discussion of the best practices for backing up your Splunk data, see "Best practices for
backing up" on the Deployment Wiki. For a related discussion of "buckets", and how Splunk uses
them, see "Understanding buckets" on the Deployment Wiki.
Caution: Whenever you change your data retirement and archiving policy settings, Splunk deletes
old data without prompting you.
Note: All index locations must be writable in order to configure data retirement and archiving policies.
Splunk indexes go through four stages of retirement. When an index reaches a frozen state, Splunk
deletes ALL frozen data by default. You must specify a valid coldToFrozenScript in
$SPLUNK_HOME/etc/system/local/indexes.conf (or your own custom app directory in
$SPLUNK_HOME/etc/apps/) to avoid losing your data.
Retirement stage   Description                                      Searchable?
----------------   ----------------------------------------------   -----------------------------------------
Hot                Open for writing. Only one of these for each     Yes.
                   index.
Warm               Data rolled from hot. There are many warm        Yes.
                   indexes.
Cold               Data rolled from warm. There are many cold       Only when a search's time range applies
                   indexes.                                         to data in the Cold stage.
Frozen             Data rolled from cold. Eligible for deletion.    Splunk deletes frozen data by default.
Splunk defines the sizes, locations, and ages of indexes in indexes.conf.
Note: Edit indexes.conf in $SPLUNK_HOME/etc/system/local/, or your own custom
application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in
general, see how configuration files work. Do not edit the copy in default.
Remove files beyond a certain size
If an index grows bigger than a specified maximum size, the oldest data is archived into frozen. To
set this maximum size, add the following line to your custom indexes.conf.
maxTotalDataSizeMB = <non-negative number> (500000)
Example:
[main]
maxTotalDataSizeMB = 2500000
Restart Splunk for the new setting to take effect. It may take up to 40 minutes for Splunk to move
events out of the index to conform to the new policy. You may see high CPU usage during this time.
Note: Make sure that the data size you specify for maxTotalDataSizeMB = is expressed in
Megabytes.
Remove files beyond a certain age
Splunk ages out data by buckets. Specifically, when the most recent data in a particular bucket
reaches the configured age, the entire bucket is rolled. If you are indexing a large volume of events,
bucket size is less of a concern for retirement policy because buckets fill quickly. You can adjust the
bucket size by setting maxDataSize in indexes.conf to a smaller value so buckets roll faster. But more,
smaller buckets take more time to search than fewer, larger buckets. To get the results you are after,
you will have to experiment a bit to find the right size. Due to the structure of the index, there isn't a
direct relationship between time and data size.
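For example, a minimal sketch that lowers the bucket size for the main index so buckets roll (and age
out) sooner; the value is an illustrative assumption, and you should check indexes.conf.spec for the
units and default:
[main]
maxDataSize = 750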
Set the variable frozenTimePeriodInSecs in indexes.conf to the number of seconds after which
indexed data should be erased. The example below configures Splunk to cull old events from its
index when they become more than 180 days old. The default value is approximately 6 years.
[main]
frozenTimePeriodInSecs = 15552000
Restart the server for the new setting to take effect.
Note: Make sure that the time you specify for frozenTimePeriodInSecs = is expressed in
seconds.
Automate archiving
Set up Splunk to archive your data automatically as it ages. To do this, configure indexes.conf to call
archiving scripts located in $SPLUNK_HOME/bin. Edit this file in
$SPLUNK_HOME/etc/system/local/, or in your own custom application directory in
$SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see how
configuration files work. Do not edit the copy in default.
Note: By default, Splunk deletes ALL frozen data. To avoid losing your data, you must specify a
valid coldToFrozenScript in $SPLUNK_HOME/etc/system/local/indexes.conf (or your
own custom app directory in $SPLUNK_HOME/etc/apps/).
Use Splunk's index aging policy to archive
Splunk rotates old data out of the index based on your data retirement policy. Data moves through
several stages, which correspond to file directory locations. Data starts out in the hot database
$SPLUNK_HOME/var/lib/splunk/defaultdb/db/db_hot. Then, data moves through the
warm database $SPLUNK_HOME/var/lib/splunk/defaultdb/db. Eventually, data is aged into
the cold database $SPLUNK_HOME/var/lib/splunk/defaultdb/colddb.
Finally, data reaches the frozen state. Splunk erases frozen index data once it is older than
frozenTimePeriodInSecs in indexes.conf. The coldToFrozenScript (also specified in
indexes.conf) runs just before the frozen data is erased. The default script simply writes the name
of the directory being retired, e.g. /opt/splunk/var/lib/splunk/defaultdb/colddb, to the
log file $SPLUNK_HOME/var/log/splunk/splunkd_stdout.log.
Add the following to $SPLUNK_HOME/etc/system/local/indexes.conf:
[<index>]
coldToFrozenScript = <script>
[<index>]
Specify which index to archive.
coldToFrozenScript = <script>
Specify the archiving script to use by changing <script>.
Define <script> paths relative to $SPLUNK_HOME/bin.
Splunk ships with two default archiving scripts that you can use.
Note: Rename and then modify these scripts to set the archive location for your
installation. By default, the location is set to /opt/tmp/myarchive.
compressedExport.sh: Export with tsidx files compressed as gz.
flatfileExport.sh: Export as a flat text file.
Note: Either rename the script you use or move it to another location (and specify that location in
indexes.conf) to avoid having changes overwritten when you upgrade Splunk.
Windows users use this notation: coldToFrozenScript = <script> "$DIR"
<script> can be either:
compressedExport.bat (download the script here).
flatfileExport.bat (download the script here).
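For example, a minimal sketch, assuming you have copied and renamed compressedExport.sh to
myCompressedExport.sh in $SPLUNK_HOME/bin (the script name and index are assumptions):
[main]
coldToFrozenScript = myCompressedExport.sh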
Restore archived data
Archived data can be restored by moving the archive into the thawed directory,
$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb. An archive can be restored to any Splunk server
regardless of platform. Data in thaweddb is not subject to the server's index aging scheme (hot >
warm > cold > frozen). You can put old archived data in thawed for as long as you need. When the
data is no longer needed, simply delete it or move it out of thawed.
The details of how to restore archived data depend on how it was archived.
Note: you can restore archived data to any index or instance of Splunk. Archived data does not need
to be restored to its pre-archival location.
Restore with resurrect
The resurrect command can be used from Splunk's CLI to selectively restore events from an archive.
You specify the archive location, the index to hold the restored events, and the time range for the
restore.
Syntax of the command is:
resurrect archive_directory index from_time end_time
Note: It is not necessary to stop and start the server when adding or removing from thaweddb.
To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. You can also add Splunk to your path and use the splunk command.
For example:
./splunk resurrect /tmp/myarchive oldstuff 01/01/2000:00:00:00 01/01/2001:00:00:00
This command will restore the events from the year 2000 that are found in the archive in
/tmp/myarchive. The events will be placed in the oldstuff index. If you archived with
compressed indexes, Splunk will uncompress them. If you archived without indexes, Splunk will
rebuild the indexes.
When you are through using the archived data, you can remove it with unresurrect. Unresurrect
can also be used to remove some events from a restored archive. For example:
./splunk unresurrect oldstuff 07/01/2000:00:00:00 08/01/2000:00:00:00
This removes events from the month of July from the oldstuff index.
Restore a copied index archive
You can also copy or move in a previously saved archive to thawed. Use cp if you want to move the
entire db file instead of specifying the time and index.
# cp -r db_1181756465_1162600547_0 $SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb
Back up your data
Back up your configurations or all your indexed data.
Back up your Splunk configurations only
To back up your configurations, make an archive or copy of $SPLUNK_HOME/etc/ (where
$SPLUNK_HOME is the directory into which you installed Splunk, /opt/splunk by default). This
directory contains all the default and custom settings for your Splunk install, including your saved
searches, user accounts, tags, custom source type names and configuration files.
Copy this directory to a new Splunk instance to restore. You don't have to stop Splunk to do this.
Back up your indexed data
This topic discusses some considerations for planning your Splunk index backup strategy. It first
gives an overview of how your indexed data moves through Splunk, then makes recommendations
for planning your backup strategy based on this information.
For specific details on changing the default values mentioned in this topic, refer to this topic about
setting up data retirement policies. For a discussion of the best practices for backing up your
Splunk data, see "Best practices for backing up" on the Deployment Wiki. For a related discussion of
"buckets", and how Splunk uses them, see "Understanding buckets" on the Deployment Wiki.
How data moves through Splunk
When Splunk is indexing, the data moves through a series of stages based on policies that you
define. At a high level, the default behavior is as follows:
When data is first indexed, it is put into the hot database, also known as the hot db.
The data remains in the hot db until the policy conditions are met for it to be reclassified as warm
data. This is called rolling the data into the warm db. By default, this happens when the hot db
reaches a specified size, but you can set up a saved search to force it to happen on a schedule, or
better still, write a script to force it to happen on a schedule from the Splunk CLI. Some details on
doing this are given a little later in this topic.
When the hot db is rolled, its directory is renamed to be a bucket in the warm db, and a new hot db is
created immediately to receive the new data being indexed. At this point, it is safe to back up the
warm db buckets.
Next, when you get to a specified number of warm buckets (the default value is 300 buckets), buckets
are renamed to be cold buckets to maintain 300 warm buckets. (If your cold db is located on another
fileshare, the warm buckets are moved to it and then deleted from the warm db directory.) Be aware
that the more warm buckets you have, the more places Splunk has to look to execute searches, so
adjust this setting accordingly.
Finally, when your data meets the defined policy requirements, it is frozen. The default setting for this
is to delete the frozen data. If you need to save data indefinitely, you must change this setting.
Summary:
hot db: being written to currently, non-incrementally changing; don't back this up, back up the
warm db instead
warm db: being added to incrementally, can be safely backed up, made up of multiple warm
'buckets'
cold db: based on your policy (default 300 buckets), when you get to that many buckets,
buckets are either renamed (like from hot) or copied (if on another filesystem) to cold (and
deleted from the warm directory)
frozen: default policy is to delete.
Choose your backup strategy
The general recommendation is to schedule backups of your warm db buckets regularly using the
incremental backup utility of your choice.
Splunk's default policy is to roll the hot db to the warm db based on the policy you define. By default,
this policy is set to roll the data when your hot db reaches a certain size. If your indexing volume is
fairly low, Splunk's default 'rolling' policy means that your hot db will be rolled to warm very
infrequently. This means that if you experience a disk failure in which you lose your hot db, you could
lose a lot of un-backed-up data from your hot db.
If you're concerned about losing data in this fashion, you can configure Splunk to force a roll from hot
to warm on whatever schedule you're comfortable with, and then schedule your backup utility to back
up the warm db immediately after that.
You should note, however, that if you roll too frequently, you might experience a degradation in
search speed, as well as use more disk space than you otherwise would. Every time data is rolled
from hot to warm, a new 'bucket' is created, which means that searches have to look in more buckets
to see all the data. As a result, Splunk recommends that you roll no more frequently than once a day.
Tune this to suit your particular data retention, search performance, and backup needs.
If your environment requires that you back up more than once a day, you can deploy Splunk in an HA
configuration where forwarders are configured to send all your data to two different Splunk indexers,
and use the second one as your hot backup.
Rolling from the CLI
You can use the following syntax to force a roll of the hot db to warm:
./splunk search '| oldsearch !++cmd++::roll' -auth admin:changeme
This will roll the default index, which is typically main.
You can specify an index to be rolled like this:
./splunk search ' | oldsearch index=_internal !++cmd++::roll' -auth admin:changeme
You'll always see an error about Search Execute failed because Hot db rolled out to
warm right afterwards; you can safely ignore it. You'll also need to provide the admin password to
execute this CLI command.
If you want to roll more than one index, you have to do them each separately. To list out your
indexes, use ./splunk list index
Recommendations for recovery
If you experience a non-catastrophic disk failure (for example you still have some of your data, but
Splunk won't run), Splunk recommends that you move the index directory aside and restore from the
backup rather than restoring on top of a partially corrupted datastore. Splunk will automatically create
the db-hot directories on startup and resume indexing. Monitored files and directories will pick up
where they were at the time of the backup.
Before you Restore
If you restore a full /opt/splunk backup, check these two items before starting the new instance.
License key (Splunk Professional)
Your backup may include an expired license key in $SPLUNK_HOME/etc/splunk.license. Install
a current one or get a temporary evaluation key from splunk.com if you don't have one.
Active input configurations
If you don't want your restored Splunk Server to instantly begin adding new data to its index, move
any active inputs.conf files out of the way before starting the server. This is useful if you want to
revisit an old index without having new events added to it.
# mv $SPLUNK_HOME/etc/system/local/inputs.conf $SPLUNK_HOME/etc/system/local/inputs.conf.disabled
# mv $SPLUNK_HOME/etc/system/default/inputs.conf $SPLUNK_HOME/etc/system/default/inputs.conf.disabled
# splunk start
Disk usage
There are several methods for controlling disk space used by Splunk. Most disk space will be used by
Splunk's indexes and compressed log files (collectively called the database). If you run out of disk
space, Splunk will stop indexing. You can set a minimum free space limit to control how low you will
let free disk space fall before indexing stops. Indexing will resume once your free space exceeds the
minimum.
Set minimum free disk space
Use settings in Splunk Web to set a minimum amount of disk space to keep free on the disk where
indexed data is stored. If the limit is reached, the server stops indexing data until more space is
available.
Note:
The Splunk server will not clear any of its own disk space under this method. It will simply wait
for more space to become available.
Some events may be lost if they are not written to a file during the paused period.
In Splunk Web
Click Admin in the upper right corner of the web interface.
Click the Server tab.
Click on Settings heading.
Under the Datastore section, find Pause indexing if free disk space falls below ___ MB:
Enter your desired minimum free disk space in megabytes.
Click Save at the bottom of the page.
Restart Splunk for your changes to take effect.
From the Command line interface (CLI)
You can set the minimum free megabytes via Splunk's CLI. To use the CLI, navigate to the
$SPLUNK_HOME/bin/ directory and use the ./splunk command. You can also add Splunk to your
path and use the splunk command:
# splunk set minfreemb 20000 # set minfree to 20GB
# splunk restart
Set database size
Controls for indexes are in indexes.conf. You can control disk storage usage by controlling total index
size, age of data in the database, and aging policy. When one of these limits is reached, data will be
removed. You can archive the data using one of Splunk's predefined archive scripts or create your
own. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own custom application
directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see
how configuration files work. Do not edit the copy in default.
Set the following attribute in indexes.conf to its new value (in megabytes):
maxTotalDataSizeMB = <non-negative number> (500000)
* The maximum size of an index. If an index grows bigger than this, the oldest data is frozen out.
Example:
[main]
maxTotalDataSizeMB = 2500000
Restart Splunk for your changes to take effect. It may take some time, up to 30 or 40 minutes, for
Splunk to move events out of the index to conform to the new policy, during which you may see high
CPU usage.
Use separate partitions for Splunk's datastore
Splunk can use separate disks and partitions for its datastore. It's possible to configure Splunk to use
many disks/partitions/filesystems on the basis of indexes and warm/cold, so long as you mount them
correctly and configure the DB rolling. However, we recommend that you use a single high
performance file system to hold your Splunk data for the best experience.
Splunk indexes roll through four DB stages:
Hot - open for writing. Only one of these for each index. Searchable.
Warm - data rolled from hot. There are many warm DBs. Searchable.
Cold - data rolled from warm. There are many cold DBs. Searched only when the search
specifies a time range included in these files.
Frozen - buckets entering the frozen state are immediately deleted.
If you do use separate partitions, the most common way to arrange Splunk's datastore is to keep the
hot and warm databases on the local machine, and to keep the cold database on a separate array or
disks (for longer term storage). You want to run your hot and warm databases on a machine with
partitions that read and write fast (since you'll be doing a majority of your search operations on hot
and warm). Cold should be on a reliable array of disks.
Bucket flow:
The single hot bucket rolls to warm when it reaches the specified size (maxDataSize)
Buckets roll from warm to cold when the number of warm buckets exceeds the configured
maximum count (maxWarmDBCount)
Buckets stay in cold (or warm) until they are selected for archiving
In the default splunk configuration, you may experience pauses in indexing and searching when you
use separate partitions for the datastore. While buckets are being transferred from one partition to
another, searches will not run. To alleviate this, you should contact Splunk Support for a
warmToColdScript which allows the bucket to be transferred with very minimal pausing.
Set up separate partitions
Set up partitions just as you'd normally set them up in any operating system. Mount the
disks/partitions, and make sure Splunk points to the correct path in indexes.conf.
First, add the correct paths in $SPLUNK_HOME/etc/system/local/indexes.conf. Set paths on
a per-index basis -- under an [$INDEX] entry.
homePath = <path on server>
The path that contains the hot and warm databases and fields for the index.
Databases that are warm have a handle open to them at all times in splunkd.
CAUTION: Path MUST be writable.
coldPath = <path on server>
The path that contains the cold databases for the index.
Cold databases are opened as needed when searching.
CAUTION: Path MUST be writable.
thawedPath = <path on server>
The path that contains the thawed (resurrected) databases for the index.
If you put your cold DB on a separate partition, you should set a warmToColdScript in
indexes.conf. Set up a script to move your warm DBs from one partition to the partition where you
store your cold DBs.
warmToColdScript = <script>
Specify a script to run when moving data from warm to cold.
The script must accept two variables:
first, the warm directory to be rolled to cold.
second, the destination in the cold path.
You only need to set this if you store warm and cold dbs on separate partitions.
Contact Splunk Support before configuring this setting.
Since buckets in the Splunk db directories must be complete and coherent, a simple copy will
cause many problems.
Defaults to empty.
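For example, a minimal sketch of per-index paths for a split datastore (the mount points and script
name are assumptions; contact Splunk Support before actually setting warmToColdScript):
[main]
homePath = /fastdisk/splunk/main/db
coldPath = /coldarray/splunk/main/colddb
thawedPath = /coldarray/splunk/main/thaweddb
warmToColdScript = myWarmToCold.sh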
Use WORM (Write Once Read Many) volumes for Splunk's datastore
Configure Splunk to use WORM (Write Once Read Many) volumes for its indexes by editing
indexes.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own custom
application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in
general, see how configuration files work.
Note: To use WORM volumes for indexes, you must configure Splunk to push data to its warm and
cold databases differently.
In a typical Splunk index configuration (with multiple-write disks), Splunk manages its indexes by
reading and writing into the hot database. It then pushes data to the warm database, where it is
written and read multiple times. Finally, it pushes data to the cold database, where it is written once
and stored until it is pushed to frozen.
In a write-once setup, data from the index never goes to the warm database. Data goes from hot
directly to the cold database because it is written once and never required to be written again.
Configuration
Determine data retention specifications.
Next, figure out how much data you will be passing into Splunk. 500MB/day? 50GB/day?
Use that information to determine the size and number of buckets in your indexes (example:
20GB/day retained for 30 days = 60 buckets). This is how many buckets you will need in your
cold database.
Next, edit the following attributes/values in indexes.conf:
[<index name>]
maxWarmDBCount = 0
maxColdDBCount = <number of buckets>
Set maxWarmDBCount = 0 to keep data from going into the warm database (failure to do so
will cause Splunk to crash in a WORM configuration).
Set maxColdDBCount to a number greater than the anticipated number of buckets.
Mount your WORM volume at the location of the cold database. Set the path to:
$SPLUNK_HOME/var/lib/splunk/defaultdb/cold.
Deployment Server
How the deployment server works
This section describes the Splunk deployment server and its features. If you want an overview of the
different ways you can design a Splunk deployment in your organization, check out the deployment
information in the Community Wiki.
A deployment server is a Splunk instance that acts as a centralized configuration manager, grouping
together and collectively managing any number of Splunk servers. Any Splunk instance -- even one
indexing data locally -- can act as a deployment server. Splunk servers that are remotely configured
by deployment servers are called deployment clients. A Splunk instance can be both a deployment
server and client at the same time.
To change the configuration of one or more of your Splunk deployment clients, you push out updated
configuration information for a given server class (defined below) from the deployment server to the
deployment clients that belong to the server class.
Note: If you're running Splunk 3.4 or later and looking for instructions on enabling the Splunk light
forwarder via the deployment server, go here.
Note: The Splunk desktop configuration (available in version 3.4 and later) disables deployment
server functionality, but supports running as a deployment client. To run the Splunk deployment
server, you must disable the desktop configuration app. Refer to the topic about the Splunk desktop
app for more information.
Server classes
A server class is a Splunk deployment client configuration that you have defined. To manage client
configurations, assign a Splunk deployment client to one or more server classes. Then, you can
manage all the Splunk deployment clients that belong to a given server class as a unit. A deployment
client may be a member of multiple server classes at once. You can group clients by application, OS,
data type to be indexed, or any other feature of your Splunk deployment.
In most cases, server class membership information is located on the deployment server in
deployment.conf. By default, the deployment server stores the configuration information for each
server class in $SPLUNK_HOME/etc/modules/distributedDeployment/classes, arranged in
subdirectories by the name of the server class. For example, if you have defined the following server
classes: web, apache, and OS, your
$SPLUNK_HOME/etc/modules/distributedDeployment/classes directory must contain three
subdirectories: ../web/, ../apache/, and ../OS/.
To set up server classes, read about configuring server classes.
How configuration files are deployed
To make changes to configurations for a server class, edit the specific configuration file within the
server class subdirectory. Then, reload the server configurations. Clients that are members of the
affected server class receive the configurations in a tarball. The client stores the tarball in its
$SPLUNK_HOME/etc/bundles/ directory, named by server class and timestamp. Finally, the client
restarts Splunk to reload configuration changes.
Note: Although in version 3.3 the Splunk configuration directory structure has changed (to
$SPLUNK_HOME/etc/system, as described here), the Deployment server still uses the old
configuration directory structure for backwards compatibility purposes.
The configuration fetch procedure generates an event in the Splunk logs. Splunk administrators can
search for the event to check that all deployments worked properly. If the configuration fetch and kick
procedure did not work properly, the previous configurations are restored and an event detailing this
is sent to the indexer.
For details on configuration changes, see configuration changes after initial deployment.
Note: Currently, the deployment server does not push scripted inputs. In the future, Splunk will
support this behavior. For now, use your preferred configuration automation tool to push your script
directory to clients.
Location of configuration files for deployment server
Both deployment servers and clients are configured in deployment.conf.
Communication between deployment server and clients
Splunk deployment server currently supports two different methods of client/server communication:
multicast or polling. Multicast only works on a LAN. Polling, however, can broadcast across subnets.
If your Splunk deployment spans multiple subnets, you must use polling. If you're just setting up
deployment within a LAN, multicast is the recommended method. All communication is sent via SSL.
Multicast:
The deployment server sends multicast packets.
Multicast packets contain the server class name and a CRC (checksum) for the configuration information files.
Deployment clients listen on the deployment server's multicast URI.
Deployment clients can also specify the interface if needed; otherwise the kernel will select one as the default.
Polling:
The deployment client periodically polls the deployment server, requesting applicable server class/CRC pairs.
Configurations for the deployment server and clients are dependent on one another; if you use multicast, you must use it on both client and server.
Where to go next
To set up a deployment server, read about configuring the deployment server. To configure a client,
see this page.
Configure a Splunk deployment server
A Splunk deployment server sends configuration changes to deployment clients. Configurations are
stored in directories divided by server class. To configure server classes, read configuring server
classes.
Any Splunk instance can be a deployment server. First, install Splunk on the server. Then, configure
settings via deployment.conf.
Edit deployment.conf
First, create a deployment.conf in $SPLUNK_HOME/etc/system/local/ (or your own custom
directory).
Include the following as the first stanza in deployment.conf:
[distributedDeployment]
Include this stanza header to load the deployment server module.
Note: You must include this stanza header, even if you don't specify a serverClassPath.
Optionally specify the path to the server class configurations:
[distributedDeployment]
serverClassPath=$SPLUNK_HOME/etc/modules/distributedDeployment/classes
This is the path to server class configurations.
Defaults to $SPLUNK_HOME/etc/modules/distributedDeployment/classes.
Do not change the default, unless you decide to store your server class configurations in a different directory.
Next, configure server classes. The server class stanza looks like:
[distributedDeployment-classMaps]
$IP_RANGE1 | $DNS1 = $SERVER_CLASSA, $SERVER_CLASSB
$IP_RANGE2 | $DNS2 = $SERVER_CLASSC
Finally, set server parameters for either multicast or polling. You must stick with either multicast or
polling on both the client and server side.
Specify communication over multicast
If your deployment server and all clients are on the same LAN, use multicast for communication
among them.
A stanza for multicast looks like this:
[distributedDeployment-multicast]
sendMulticast=true
multicastUri=<IP:Port>
interfaceIP=<IP>
frequency=<integer>
useDNS=<true/false>
[distributedDeployment-multicast]
Set multicast configuration options under this stanza name.
Follow this stanza name with any number of the following attribute/value pairs.
If you do not specify an entry for each attribute, Splunk will use the default value.
sendMulticast = <true/false>
To use multicast, set this to true.
Defaults to false.
multicastUri = <IP:Port>
What multicast group to send to.
Only used if 'sendMulticast = true'.
Multicast is disabled if this field is not set.
No default.
interfaceIP = <IP Address>
Optional setting.
The IP address of the interface to send multicast packets on.
Defaults to whatever the kernel picks (usually sufficient).
frequency = <integer>
How often (in seconds) to send multicast packets.
Defaults to 30 seconds.
useDNS = <true/false>
Optional setting.
Look up host name.
Defaults to false.
Specify communication by polling
If your deployment server and its clients are across multiple subnets, you must use polling for
communication among them.
A stanza for polling looks like this:
[distributedDeployment-multicast]
sendMulticast=false
sendMulticast=false
Set this to false to enable polling.
NOTE: With polling, most configurations are set on the client side.
Example multicast configuration
Configure your deployment.conf and place it in $SPLUNK_HOME/etc/system/local/ or your
own custom configuration directory.
Here's a basic config, enabled for multicast:
[distributedDeployment]
serverClassPath=/opt/splunk/etc/modules/distributedDeployment/classes
[distributedDeployment-multicast]
sendMulticast=true
multicastUri=225.0.0.39:9999
[distributedDeployment-classMaps]
www.* = web,apache
10.1.1.2* = osx
Important: The multicastUri port, shown here as 9999, should be set to your splunkd or
management port.
Example polling configuration
Configure your deployment.conf and place it in $SPLUNK_HOME/etc/system/local/ or your
own custom configuration directory.
Here's the same basic config, but enabled for polling:
[distributedDeployment]
serverClassPath=/opt/splunk/etc/modules/distributedDeployment/classes
[distributedDeployment-multicast]
sendMulticast=false
[distributedDeployment-classMaps]
www.* = web,apache
10.1.1.2* = osx
Configure server classes
Deployment clients receive configurations via server class membership. A deployment client can be a member of multiple server classes at once, depending on which configuration updates it should be aware of. Group clients by application, OS, data type to be indexed, or any other feature of your Splunk deployment. Server class configurations are kept on the deployment server in subdirectories of $SPLUNK_HOME/etc/modules/distributedDeployment/classes/. Each subdirectory stores unique configurations for each server class.
For example, if you have a class called syslog, its configurations are stored in
$SPLUNK_HOME/etc/modules/distributedDeployment/classes/syslog. The
../syslog/ directory contains an inputs.conf that specifies only syslog input. It also contains an
outputs.conf, forwarding data to a centralized Splunk instance (see data distribution for more
information). Each server class directory can contain any number of applicable configuration files.
Every deployment client is automatically a member of two server classes:
Default:
The _default server class lets the administrator target all deployment clients without having to specifically set a name for each individual client.
Hostname:
The 'hostname' server class is determined at startup time by the hostname of the deployment client. For example, if the deployment client is www01, the hostname server class becomes _www01 for that machine.
This server class allows for deploying configurations to individual machines.
If the word 'localhost' appears anywhere in the name, the host name class is not set for this server.
Configuration
Set up server class maps in deployment.conf on the server side. Edit deployment.conf in
$SPLUNK_HOME/etc/system/local/ (or create a deployment.conf for the server using
instructions here). Specify server class settings under the
[distributedDeployment-classMaps] stanza heading.
[distributedDeployment-classMaps]
$IP_RANGE1 | $DNS1 = $SERVER_CLASSA, $SERVER_CLASSB
$IP_RANGE2 | $DNS2 = $SERVER_CLASSC
...
Map IP addresses or DNS entries to server classes.
You can put a wildcard (*) anywhere in the string.
Important: Only one list of classes will be applied per host. Multiple entries for the same host will overwrite each other; see how configuration files work. If both a numeric IP range and a DNS expression match a host, only the numeric IP range will be used.
Example
Add the following stanza to $SPLUNK_HOME/etc/system/local/deployment.conf.
[distributedDeployment-classMaps]
10.2.*.5 = apache, security
192.*.*.* = syslog
www.others* = the_others, web, apache
Configure deployment clients
Before you configure deployment clients, determine whether you are using polling or multicast. Also,
make a map of server classes. A server class map includes which configuration files and clients
belong to each class. Learn more about server classes by reading this page.
Note: The Splunk desktop configuration (available in version 3.4 and later) disables deployment
server functionality, but supports running as a deployment client. To run the Splunk deployment
server, you must disable the desktop configuration app. Refer to the topic about the Splunk desktop
app for more information.
Next, install Splunk on each client machine and configure it as a deployment client.
Do a normal Splunk install.
You can run a script to install Splunk on each machine.
Enable clients with the instructions below.
Follow either polling or multicast, depending on which you have chosen.
Set up clients via the CLI or deployment.conf.
If you are uncertain if a previously installed client is currently enabled or disabled, check the deployment.conf file for disabled = <true/false>.
Enable/disable clients via the CLI
To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. You can also add Splunk to your path and use the splunk command.
Note: Use only one of the following methods. The method for the client must match the server's.
polling
With polling enabled, deployment clients check for configurations from the deployment server and pull
configurations as needed.
Run the following command from the Splunk CLI on each client:
./splunk set deploy-poll x.x.x.x:pppp
Substitute the IP and management port of the deployment server (the management port is typically
8089).
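For example, if your deployment server's address is 10.1.1.100 (a placeholder) and it listens on the default management port:
./splunk set deploy-poll 10.1.1.100:8089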
Disable clients with the following command:
./splunk disable deploy-client
multicast
Deployment clients set with multicast receive notice of configurations from the deployment server.
Run the following command from the Splunk CLI on each client:
./splunk set deploy-multicast x.x.x.x:pppp
Substitute the multicast group IP/Port.
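For example, using the multicast group from the server-side examples in this chapter:
./splunk set deploy-multicast 225.0.0.39:9999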
Disable clients with the following command:
./splunk disable deploy-client
Enable clients via deployment.conf
Alternately, configure clients via deployment.conf. Create a deployment.conf file in
$SPLUNK_HOME/etc/system/local/ or your custom configuration directory.
Note: Use only one of the following methods. The method for the client must match the server's.
multicast
[deployment-client]
multicastUri = <IP:Port, IP2:Port2, etc>
mcastInterfaceIp = <IP Address>
disabled = <true/false>
multicastUri = <IP:Port, IP2:Port2, etc>
A comma-separated list of multicast addresses for deployment server instructions.
Each deployment server needs to be in control over a unique set of server classes.
Typically there is one entry in this list, or it is left blank.
mcastInterfaceIp = <IP Address>
Use the physical interface bound to this IP address to listen to multicasts.
Only set this if using multicast.
disabled = <true/false>
Turn the client on or off.
polling
[deployment-client]
deploymentServerUri = <IP:Port, IP2:Port2, etc>
pollingFrequency = <integer>
maxBackoff = <integer>
serverClasses = <comma separated list>
disabled = <true/false>
deploymentServerUri = <IP:Port, IP2:Port2, etc>
List of deployment servers to poll.
pollingFrequency = <integer>
Optional setting.
How often (in seconds) to poll each deployment server listed in 'deploymentServerUri'.
Only used if deploymentServerUri is specified.
Defaults to 60 seconds.
maxBackoff = <integer>
Back off polling the deployment server by a random number of seconds between 0 and <integer>.
The more deployment clients controlled by a single deployment server, the higher this number should be.
maxBackoff effectively "smooths" the number of concurrent requests on the server.
Defaults to 15 seconds.
serverClasses = <comma separated list>
Optional setting.
List of server classes that this deployment client is a member of.
Usually set on the server side, under [distributedDeployment-classMaps].
If not set on the server side, specify membership here.
disabled = <true/false>
Turn the client on or off.
Example
Here are two different example deployment.conf files. Configure your deployment.conf and
place it in $SPLUNK_HOME/etc/system/local/ or your own custom configuration directory (on
the client side).
multicast
Here's a basic config, enabled for multicast:
[deployment-client]
multicastUri=225.0.0.39:9999
polling
Here's a config, enabled for polling:
[deployment-client]
deploymentServerUri = 72.28.11.182:8089
pollingFrequency = 15
Sync the server and client
To sync the server with the clients, start the deployment server from the Splunk CLI:
./splunk start
Start the deployment clients from their Splunk CLIs. Use this command to start Splunk without a
license prompt:
./splunk start --accept-license
After syncing, each client either picks up the multicast packet as sent out by the deployment server or
polls the deployment server for configurations. Once the client receives configurations, it restarts
splunkd.
To confirm that all clients received the configuration correctly, check from the deployment server by
using ./splunk list deploy-clients . This lists all the deployment clients and the last time
they were successfully synced.
Note: If you're running Splunk 3.4 or later and looking for instructions on enabling the Splunk light
forwarder via the deployment server, go here.
Configuration changes
To make a configuration change after initial deployment, edit or add the appropriate file to the
applicable server class subdirectory. By default, the server class directory is located on the
deployment server in $SPLUNK_HOME/etc/modules/distributedDeployment/classes/. For
a list of configuration files, see the configuration file reference.
After you finish editing configuration files, alert the deployment server of the change via the Splunk
CLI. To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk
command. You can also add Splunk to your path and use the splunk command.
./splunk reload deploy-server
This command checks all server classes for a change, and notifies clients.
-or-
./splunk reload deploy-server -class $SERVER_CLASS
This command notifies and updates only the specified server class. For example:
./splunk reload deploy-server -class www
This command notifies and updates only www server class clients. Once a client receives
configurations, it restarts splunkd.
Confirm the sync
To confirm that all clients received the configuration correctly, check from the deployment server by
using ./splunk list deploy-clients . This lists all the deployment clients and the last time
they were successfully synced.
Performance Tuning
Performance tuning Splunk
By default, Splunk delivers high indexing throughput, fast search speeds, and dense storage.
However, each system is unique, and you may find that tuning Splunk produces significant
performance boosts. This section shows a summary of performance tuning recommendations to help
boost Splunk's performance in your environment.
Hardware considerations
Splunk's performance is affected by the quality of hardware in the system. Provide the best
performance possible for your Splunk Server by maximizing the quality of hardware you use. Different
hardware components have different impacts on performance:
Splunk can use up to 4 cores (not hyper-threaded) for indexing, and up to 4 more cores for each concurrent search.
Run Splunk on an 8-core server for a significant search performance gain (+30-40%) when using multiple indexes.
Run Splunk on a 64-bit platform to increase the scaling and speed of searching. Running on a 64-bit platform allows you to search 12x the amount of data (10GB buckets instead of 800MB in 32-bit) in equivalent time and memory as 32-bit platforms running Splunk.
Use faster hard drives to improve search speeds. Fast SCSI drives with a quality RAID controller can increase indexing speed up to 1.6x, and search speed up to 10x during long-running, complex searches.
Use a networking controller or a dedicated TCP card to off-load networking operations from the CPU, improving searching and indexing speeds as well as network performance.
Splunk can run on a virtual machine. Virtual machines allow Splunk to run in a dedicated environment that is not native to the system. However, virtual environments may degrade performance.
Hardware considerations grow more complex when working with Splunk distributed deployments.
Increase indexing performance
Improve indexing performance by tuning Splunk's time stamp extraction settings, segmentation, and
other indexing properties. These settings are controlled in Splunk's various configuration files. Learn
more about how to tune indexing here.
Increase search speed
Tuning your search speed also involves tuning settings in Splunk's configuration files. Segmentation,
timestamping settings, and settings in Splunk Web affect your search speed. Learn more about how
to tune your search speed here.
Improve storage efficiency
Splunk comes configured out-of-the-box, able to compress raw data by approximately 40-50%. In
some cases, it is possible to tune Splunk's storage compression to 12% of raw data size. Tune
Splunk's storage ratio by configuring segmentation settings within configuration files. In some cases,
storage ratio is inversely proportional to search convenience. Learn how to configure your storage
efficiency here.
Reduce the CPU and memory footprint
Searching massive amounts of data efficiently may require tuning Splunk's CPU and memory usage.
Learn how to improve CPU and memory usage and increase overall throughput here.
Utilize multiple CPUs or cores
Increasing the number of CPUs and active cores in your system can improve indexing and search performance. Splunk uses cores for true index threading (not hyper-threading). We expect Splunk to perform better with more cores when the cache is shared, because memory accessed by two threads is closer to both of them. Learn more about how to make use of a multi-CPU/core system here.
64-bit operating systems
64-bit platforms improve Splunk's ability to scale search and index operations. The increased memory results in an order of magnitude more data that can be searched in the same amount of time and memory as a 32-bit system. Learn how to tweak a 64-bit system here.
Indexing performance
Splunk's indexing performance can be maximized by tweaking settings in Splunk's configuration files.
Here are some basic tweaks you can implement to improve indexing performance:
Change Splunk's time stamp extraction settings in props.conf:
Set Splunk to look fewer characters into an event for a time stamp (or turn off time stamp extraction).
Use strptime formatting for timestamps (%d/%m/%Y %H:%M:%S).
Edit Splunk's aggregator function to turn off line merging.
Reduce segmentation of events by altering the MAJOR and MINOR breakers.
Turn off some of Splunk's advanced features.
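As a minimal sketch, a props.conf stanza that applies several of these tweaks to a hypothetical sourcetype named mylog might look like this (the sourcetype name and values are illustrative only; adjust them to your data):
[mylog]
TIME_FORMAT = %d/%m/%Y %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30
SHOULD_LINEMERGE = False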
Negative impact on indexing performance
The more regexes you configure in transforms.conf, the longer indexing takes. Make sure all of
your regexes are necessary.
Custom processing.
Using many fields extracted during indexing (see indexed fields).
Using your own C/C++ modules.
Processors
Splunk has several internal processors. If you notice that Splunk isn't indexing your data as you like,
you can track down exactly which processor is responsible for the delay by running the following
search:
index::_internal NOT sendout group=pipeline | timechart sum(cpu_seconds)
by processor
This search shows you a chart of Splunk's internal processors. If one processor in particular is taking up more CPU time than another, you can tweak settings to reduce this.
Below are some tuning parameters in Splunk's configuration files that affect indexing performance.
indexes.conf
indexes.conf controls how Splunk's indexes are configured. You can change the following entries to
improve indexing performance.
indexThreads = <non-negative number> (default: 0)
The number of extra threads to use for a specific index. Turning up the number of index threads may improve indexing, but is dependent on the capability of your hardware.
Important: This indexThreads value should not be increased unless you have evidence of a specific bottleneck in the indexing processor. Tuning this value incorrectly can have a negative effect on your indexing performance. For extremely busy indexes (>100 GB per day), the maximum value we have seen a benefit from is 3.
Caution: Only increase this value if Splunk is running on a multi-processor, multi-core platform.
maxMemMB = <non-negative number> (default: 50)
Amount of memory to allocate for indexing. This amount is allocated in an escalating amount per index thread; each thread beyond the first allocates N * maxMemMB. For example, if you have indexThreads set to 2 and maxMemMB set to 100, the first thread will use 100MB and the second thread will use 200MB, for a total of 300MB of memory.
Note: Increasing this value by small amounts may improve indexing throughput. Increasing this value by large amounts will have significant negative performance consequences across all Splunk activities by wasting memory that would be better allocated to other data.
maxDataSize = <non-negative number> (default: 750)
The maximum amount of data in MB that the hot db can grow to. On 32-bit systems we recommend the value 750. On 64-bit systems we recommend the value 10000. These are the defaults for the appropriate downloads.
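For illustration only, an indexes.conf stanza that sets these attributes for a hypothetical index named main on a busy, multi-core, 64-bit host might look like the following (example values, not recommendations; see the cautions above before changing indexThreads or maxMemMB):
[main]
indexThreads = 2
maxMemMB = 50
maxDataSize = 10000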
props.conf
props.conf controls what parameters apply to events during indexing based on settings tied to each
event's source, host, or sourcetype.
DATETIME_CONFIG = <filename relative to SPLUNK_HOME> (default: /etc/datetime.xml)
Specifies the file to configure the timestamp extractor. This configuration may also be set to "NONE" to prevent the timestamp extractor from running, or "CURRENT" to assign the current system time to each event.
TIME_FORMAT = <strptime-style format> (default: empty)
Specifies a strptime format to extract the date. Specifying a strptime format for date extraction accelerates event indexing.
MAX_TIMESTAMP_LOOKAHEAD = <integer> (default: 150)
Specifies how far into an event Splunk should look for a timestamp. If you know your timestamp is in the first n characters of the event, set this to n. This will increase the speed of indexing.
segmenters.conf
segmenters.conf defines schemes for how events will be tokenized in Splunk's index.
MAJOR = <space separated list of strings>
Move MINOR breakers into the MAJOR breaker list, or remove breakers in the MAJOR breaker list to change the size and amount of raw data events.
MINOR = <space separated list of strings>
Remove the MINOR= string of characters that represent tokens to index by in addition to the MAJOR breaker list. Reduce or remove this list to increase indexing performance.
Read more about how to configure custom segmentation.
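As a rough sketch, a custom segmentation stanza might look like the following (the stanza name and breaker lists are purely illustrative, not the shipped defaults):
[mysegmentation]
MAJOR = [ ] < > ( ) { } | ! ; , ' "
MINOR = / : = @ . - $ # %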
Search performance
Splunk is optimized for text-based searching of raw event data. By default, Splunk indexes some
components of each event (default fields: host, source, sourcetype). Splunk can be configured to
extract and index additional components as you see fit. Performance may be affected if Splunk is:
Indexing or extracting additional fields.
Accessing compressed raw data.
Accessing a large number of events (you can change this by altering your time range, or
maximum results you search for).
You can improve Splunk's search performance by changing indexing properties such as time
stamping and segmentation. Here are some general guidelines to help you tune your search
performance:
Set the size of your hot db to the maximum size that your system can support. This is dependent on the amount of RAM your system contains.
Reduce or eliminate segmentation by removing MINOR breakers, or turning some MINOR breakers into MAJOR breakers. Play with the breakers to optimize your searches based on the contents of the events particular to your scenario.
Separate data into different indexes. This is an advanced technique that is only applicable if you are adding archived data while your Splunk server is indexing current data.
Make sure that time stamping is correct on events.
Below are some of the parameters in various configuration files that may improve your search
performance.
Configure system memory access
Determine how Splunk accesses system memory via indexes.conf.
maxDataSize = <non-negative number>
The maximum size in MB of the hot DB. The hot DB will grow to this size before it is rolled out to warm. Defaults to 750 on a 32-bit system, 10000 on a 64-bit system. Do not change these values unless specifically advised to do so by a Splunk Engineer.
Configure indexing properties
Configure indexing properties via props.conf. Control indexing properties based on settings tied to
each event's source, host, or source type.
DATETIME_CONFIG = <filename relative to SPLUNK_HOME>
Specifies the file to configure the timestamp extractor. This configuration may also be set to "NONE" to prevent the timestamp extractor from running, or "CURRENT" to assign the current system time to each event. Defaults to /etc/datetime.xml (eg $SPLUNK_HOME/etc/datetime.xml).
TIME_FORMAT = <strptime-style format> (default: empty)
Specifies a strptime format to extract the date. Specifying a strptime format for date extraction accelerates event indexing.
Configure Splunk Web settings
Configure many of Splunk Web's settings via web.conf. You can configure the following attributes to
make searching faster.
numberOfEventsPerCard = <integer>
Configuration for the number of events that the Endless Scroller asks the server for with each request. Defaults to 10.
numberOfCardsPerDeck = <integer>
Configuration for the number of requests that the Endless Scroller will make before it starts to recycle space occupied by prior pages. Defaults to 7.
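A minimal sketch lowering both values (example numbers only, and assuming these attributes live under web.conf's [settings] stanza):
[settings]
numberOfEventsPerCard = 5
numberOfCardsPerDeck = 5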
Configure indexed fields
In some situations, you can increase search performance by extracting fields at index time. Review
the documentation on creating indexed fields, particularly the Note regarding performance to
determine whether they are likely to help in your environment.
Configure Splunk Web
You can increase search performance by changing various configuration settings in the Preferences
menu of Splunk Web.
Disable typeahead
Typeahead is not restricted to your current time range. If you have large datasets of days, months or
years, typeahead can be very slow and load the server. This can be especially problematic in a
distributed search environment.
You can disable typeahead altogether using a role capability in authorize.conf.
By default the typeahead capability is added to the User role in etc/system/default/authorize.conf, and
is inherited by the Power and Admin roles. Thus, disabling it for the user role will disable it for all
roles, and all users.
In $SPLUNK_HOME/etc/system/local/authorize.conf add the following settings.
[role_User]
get_typeahead = disabled
If you have a different role scheme, you will have to interpret these instructions within that scheme.
Set segmentation in Splunk Web
Change segmentation settings in the Preferences tab in Splunk Web. For example, raw
segmentation produces faster searching, but doesn't give you the ability to add search terms to your search by clicking on parts of any event. Play around with the different segmentation settings to find
which one is the best for your data.
Storage efficiency
Tuning Splunk's use of storage involves the same principles as tuning for indexing performance. The less data Splunk has to write to disk, the better the storage efficiency.
Reduce index density
You can reduce your index size by tuning segmentation. In segmenters.conf, change some MINOR
breakers to MAJOR breakers to decrease index size.
Note: This changes search performance. Read about segmentation before making any changes to
segmenters.conf.
Configure your data inputs to not index data locally (by editing inputs.conf). Set Splunk to gather data
through network mounts rather than through tailing or watching. Or, set tailing and watching to collect
local data and copy it to the index. Data coming from network mounts is copied to the index, but is not
stored locally and so takes up less space.
As a last resort, you can also configure Splunk to not index raw data at all (extreme version of
lowering the density of indexing).
Note: Not indexing raw data will significantly increase your storage efficiency, but will require your
users to perform more complex operations at search time. Users will have to search for data by
searching timestamps, and core fields. They will then have to filter the results by using where and
regex commands. Furthermore, these searches can only regex 10k results at a time.
Tuning inputs.conf
inputs.conf configures all inputs to Splunk including file and directory tailing and watching, network
ports and scripted inputs.
You can add and edit sources to input into Splunk. Configuring Splunk to gather data through the
network versus through tailing local files is the most efficient way to use storage.
Tuning props.conf
props.conf controls what parameters apply to events during indexing based on settings tied to each
event's source, host, or sourcetype.
TRUNCATE = <non-negative integer> (default: 10000)
Change the default maximum line length. Set to 0 if you don't want truncation ever (very long lines are often a sign of garbage data).
MAX_EVENTS = <integer> (default: 256)
Specifies the maximum number of input lines that will be added to any event. Splunk will break after the specified number of lines are read.
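For example, a props.conf stanza for a hypothetical chatty source (the source path is a placeholder) that truncates very long lines and caps the number of lines per event might look like this:
[source::/var/log/myapp/debug.log]
TRUNCATE = 5000
MAX_EVENTS = 100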
Tuning segmenters.conf
segmenters.conf defines schemes for how events will be tokenized in Splunk's index.
MAJOR = <space separated list of strings>
Move MINOR breakers into the MAJOR breaker list, or remove breakers in the MAJOR breaker list to change the size and amount of raw data events.
MINOR = <space separated list of strings>
Remove the MINOR= string of characters that represent tokens to index by in addition to the MAJOR breaker list. Reduce or remove this list to increase indexing performance.
MINOR_LEN = <integer> (default: -1)
If set and non-negative, specifies how long a minor token can be. Longer minor tokens are discarded without prejudice.
MAJOR_LEN = <integer> (default: -1)
If set and non-negative, specifies how long a major token can be. Longer major tokens are discarded without prejudice.
FILTER = <regular expression>
Set a regular expression to only segment data that matches the regular expression.
LOOKAHEAD = <integer> (default: -1)
Set how far (in characters) Splunk looks into an event for segmentation. If FILTER is set, this applies to filtering too. Set to 0 to turn off segmentation entirely.
Read more about how segmentation works, including how to configure custom segmentation.
CPU and memory footprint
Improve CPU usage
Splunk's CPU usage mostly depends on how you configure indexing. Maximize CPU throughput by
tuning indexing, or disabling features (like event type discovery). Splunk has approximately a 3-4
MBps throughput (on a commodity dual-core/dual-CPU system) out-of-the-box. Tuning indexing can
increase that to the range of 4-5 MBps.
Improve CPU usage for better throughput:
Disable or tune down various steps in processing.
Turn off event type discovery.
Tune timestamp recognition.
(If you have a lot of data from a single source) configure Splunk to use a strptime timestamp instead of letting it guess the timestamp (on by default).
Turn off timestamping altogether (set MAX_TIMESTAMP_LOOKAHEAD to 0).
Improve memory usage
Splunk always uses the maximum amount of memory that is available to it to process searches. You can increase Splunk's memory usage efficiency, and prevent it from running out of memory while searching, by tuning your searches' memory usage:
Reduce unnecessary use of AND and OR conditions.
Reduce the complexity of regular expressions.
Avoid passing results of a very non-selective search into another command that runs in memory, like search or top.
Example: Instead of: * | search sourceip="192.1.1.1" Use: 192.1.1.1 | search sourceip="192.1.1.1"
Reduce the number of fields that are extracted to avoid running out of memory during a search.
Narrow the time range of your search to avoid running out of memory during a search.
Select only the host, source, and sourcetype fields using the fields picker. This prevents time- and memory-consuming field extraction from running.
Multi-CPU servers
Additional CPUs allow for an increased number of indexing threads which improves Splunk's indexing
and search performance.
Important: This value should not be increased unless you have evidence of a specific bottleneck in
the indexing processor. Tuning this value incorrectly can have a negative effect on your indexing
performance. For extremely busy indexes (>100 GB per day), the maximum value we have seen a
benefit from is 3.
Caution: Only increase this value if Splunk is running on a multi-processor, multi-core platform.
indexes.conf
Set the number of indexing threads by adding/editing this attribute in indexes.conf.
indexThreads = <non-negative number> (default: 0)
The number of extra threads to use for indexing for this index. This number should not be set higher than the number of processors in the box. If splunkd is also doing parsing and aggregation, the number should be 2 less than the number of processors in the box.
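As an illustration, on a hypothetical quad-processor box where splunkd also handles parsing and aggregation, the guideline above (2 less than the number of processors) gives the following for an index named main (the index name is just an example):
[main]
indexThreads = 2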
64-bit operating systems
A 64-bit operating system will allow you to increase various settings in Splunk to improve
performance.
Indexing can be improved by increasing the maximum number of queries that a search will attempt to resolve.
Index size can be increased in 64-bit mode.
indexes.conf
The following are the default values for 32-bit and 64-bit systems:
maxDataSize
The number of MB the hot db is allowed to grow to before it is rolled out to warm. This number should not be increased unless Splunk is running in 64-bit mode.
64-bit: 10000
32-bit: 750
32-bit Windows: 400
maxQueryIds
Note: This setting was removed in 3.4.4.
The maximum number of IDs a search will attempt to resolve in a single query. This is a good value
for 32-bit systems. It can be raised for 64-bit installations with large amounts of RAM.
32-bit: 10000000
savedsearches.conf
maxResults
The maximum number of results the entire search can generate. NOTE: This is different from the
deprecated search command "maxresults" and the maxresults setting in prefs.conf.
64-bit: 50000
32-bit: 10000
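To illustrate where each of these values lives, here is a sketch of the corresponding settings on a 64-bit system (these are already the 64-bit defaults, so you normally do not need to set them explicitly):
In indexes.conf:
maxDataSize = 10000
In savedsearches.conf:
maxResults = 50000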
Configuration Files
How do configuration files work?
Splunk's configuration is controlled via configuration files. Even configurations set up through Splunk Web or the CLI are written out to configuration files. Set up more advanced configurations in
configuration files, or make an application. Learn more about application configuration, including best
practices.
Once you have created a working application for a single Splunk server, you can then distribute it to
target servers through the Splunk deployment server or share them with others through SplunkBase.
Changes to how Splunk processes index data do not affect data that is already indexed.
Configuration files must be created in ASCII or UTF8 character sets.
Restarting after configuration changes
Many configuration file changes require you to restart Splunk. Check the configuration file and/or its
documentation reference topic to see if a change you make requires you to restart Splunk.
The following changes require additional or different actions before they will take effect:
Enable search-time configuration changes made to transforms.conf and props.conf by
typing the following search in Splunk Web:
| extract reload=T
Bounce authentication.conf via the Admin -> Server section of Splunk Web.
Configuration file directory structure
There are two general configuration file directories in $SPLUNK_HOME/etc/:
$SPLUNK_HOME/etc/system/
The system directory contains default shipped Splunk configurations and user-created content to override these settings.
$SPLUNK_HOME/etc/apps/
The apps directory stores downloaded and custom built applications. Store custom configurations in custom directories underneath ../apps/.
NOTE: There is also a legacy directory in $SPLUNK_HOME/etc/bundles to support prior versions' configurations and the deployment server.
Both system/ and the application directories in apps/ have the same directory structure:
default/
Settings in default/ should not be changed.
local/
Make all custom edits here, including overriding settings in default/.
readme/
Supporting documentation.
bin/
Scripts that support the application, such as search scripts, custom web scripts, and REST endpoint handlers.
static/
Files served by the HTTP server and other static files (non-executable).
For example:
apps/
    myapp1/
        default/
        local/
        static/
        bin/
    myapp2/
        default/
        local/
        static/
        bin/
Your Splunk server ships with several such directories, including:
default - contains the pre-configured configuration files. Do not modify the files in default. Note: Not all configuration files appear in default/.
local - stores modifications you make through the web interface or command line. You can make file edits here, or in a custom application directory. Note: If you edit files that are also written to by Splunk Web, your edits may be overridden if someone else is editing Splunk Web at the same time.
learned - these configurations are settings created by the Splunk Server as it trains on incoming data.
README - this directory contains example and spec configuration files that can help you create your own configuration files. For each configuration file, there are two reference files: .spec and .example. For example, inputs.conf.spec and inputs.conf.example. The .spec file is a specification of syntax, including which attributes and variables are available. The .example files are helpful examples of real-world usage. These files are all found in the $SPLUNK_HOME/etc/system/README directory.
Configuration file precedence
Configuration files live in multiple places: default, local and any custom application directories you
create. Configuration files are evaluated in the following order:
local -- local changes and preferences are evaluated first.
user-created directories -- these are evaluated in alphabetical order.
default -- Splunk's default settings are evaluated last.
NOTE: Any configurations set in $SPLUNK_HOME/etc/bundles take precedence over
configurations in $SPLUNK_HOME/etc/apps.
Example
Directories are evaluated in the following order:
$SPLUNK_HOME/etc/system/local/*
$SPLUNK_HOME/etc/bundles/local/*
$SPLUNK_HOME/etc/bundles/A/*
...
$SPLUNK_HOME/etc/bundles/Z/*
$SPLUNK_HOME/etc/apps/A/local/*
...
$SPLUNK_HOME/etc/apps/Z/local/*
$SPLUNK_HOME/etc/apps/A/default/*
...
$SPLUNK_HOME/etc/apps/Z/default/*
$SPLUNK_HOME/etc/system/default/*
Numbered directories are evaluated in the following order:
$SPLUNK_HOME/etc/apps/myapp1
$SPLUNK_HOME/etc/apps/myapp10
$SPLUNK_HOME/etc/apps/myapp2
$SPLUNK_HOME/etc/apps/myapp20
...
Attribute precedence
Precedence is applied attribute-by-attribute. That is, if the file props.conf exists in local and a
user created configuration file directory, the props.conf file in local does not override or replace
the entire props.conf file. If the same attribute/specification exists in both the local props.conf
and the user-created props.conf, the local props.conf overrides the attribute.
For example, if $SPLUNK_HOME/etc/system/local/props.conf contains this stanza:
[source::/opt/Locke/Logs/error*]
sourcetype = t2rss-error
And $SPLUNK_HOME/etc/apps/t2rss/props.conf contains this stanza:
[source::/opt/Locke/Logs/error*]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE_DATE = True
Both the sourcetype assignment in local and the line merging attributes in t2rss apply. However, if
both local and t2rss have a sourcetype assignment for source::/opt/Locke/Logs/error*,
the assignment in local overrides t2rss.
Precedence rules for events with multiple attribute assignments
Beyond the above rules for precedence, there is an additional precedence issue that affects only
props.conf. props.conf sets attributes for processing individual events by host, source or
sourcetype (and sometimes eventtype). So it's possible for one event to have the same attribute set
differently for the default fields: host, source or sourcetype. The precedence order is:
source
host
sourcetype
Settings higher in the list will override settings lower in the list for settings with the same name.
Note: only one stanza of each type (one source, one host, and one sourcetype stanza) will apply to a given event. Therefore, to achieve a defined result, you should not create overlapping source expressions.
You may want to override default props.conf settings. For example, you are tailing
mylogfile.xml, which by default is labeled sourcetype = xml_file. This configuration will
re-index the entire file whenever it changes, even if you manually specify another sourcetype,
because the property is set by source. To override this, add the explicit configuration by source:
[source::/var/log/mylogfile.xml]
CHECK_METHOD = endpoint_md5
Configure application directories
Application directories are individual directories placed in $SPLUNK_HOME/etc/system/ or
$SPLUNK_HOME/etc/apps/. Each directory must contain at least one configuration file to be
considered an application directory. Examples and spec files for every configuration file live in
$SPLUNK_HOME/etc/system/README/.
Each Splunk application can have a setup.conf file to specify how that application interacts with other
Splunk applications.
Note: Restart your Splunk server to apply any changes you make to the configuration files. Changes
to how Splunk processes index data do not affect data that is already indexed.
Make an application directory
Make configuration changes in the local directory ($SPLUNK_HOME/etc/system/local). To create
a new application, make a directory in $SPLUNK_HOME/etc/apps/. Name the directory anything
you like, but it is a good idea to make the name functionally descriptive. There can be many
application directories on a server.
To get started with configuration changes, use example configuration files from
$SPLUNK_HOME/etc/system/README/. Copy the sample configuration file into your target
directory. It's a good idea to try out configuration changes on a test system (see best practices
section).
Step-by-step configuration file changes
1. Copy the .example configuration file from ../README to your test location.
2. Edit the file to fit your data -- double-check file syntax and logic.
3. When you are ready, change the file extension to .conf (eg remove the .example).
4. Restart Splunk.
5. If the modifications you just did involve re-indexing data, you should run the following CLI commands:
# ./splunk stop
# ./splunk clean eventdata (only if this is a test system!)
# ./splunk start
6. Check that your changes had the desired effect.
Best practice
For a single Splunk server, it is easiest to keep all configuration files in the
$SPLUNK_HOME/etc/system/local directory.
Caution: Splunk Web writes to ../local/. So if you edit configuration files in ../local/, your
edits may be overwritten if someone else edits Splunk Web at the same time. Thus, if you have many
users who make changes in Splunk Web, it is a good idea to create a custom directory for any
configuration files you edit directly.
Also, you may want to create different directories for different configurations. For example, create one
application for inputs. To do this, create a directory in $SPLUNK_HOME/etc/apps/ called inputs
and copy in your own inputs.conf.
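From the CLI, this could look like the following sketch (assuming your edited inputs.conf is in your home directory and that you keep custom edits in the application's local/ subdirectory):
mkdir -p $SPLUNK_HOME/etc/apps/inputs/local
cp ~/inputs.conf $SPLUNK_HOME/etc/apps/inputs/local/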
For a distributed Splunk deployment, you can copy existing configurations on your local Splunk
server to any remote Splunk server. This is most easily achieved using the Splunk deployment
server. However, if you just make a few simple changes and have a small number of servers, you can
simply copy your configurations to each of your instances.
Never make configuration changes in $SPLUNK_HOME/etc/system/default. These changes will
be overwritten during an upgrade.
It is a good idea to make a back up of the original before making any changes. If your configuration
does not work as expected, you can reinstate the back up.
Test configurations
As with any application, it is unwise to make changes on a production server without testing. When
you have a change to make to a configuration, test it on another server which has a sample of the
data you are configuring.
Configuration file list
Here is a list of all Splunk's configuration files with descriptions. Descriptions link to configuration
instructions. Examples and specifications for each configuration file are contained in
$SPLUNK_HOME/etc/system/README/.
alert_actions.conf - Customize Splunk's global alerting actions.
app.conf - Set up fields for your custom application.
audit.conf - Configure auditing and event hashing.
authentication.conf - Toggle between Splunk's built-in authentication or LDAP. Configure LDAP.
authorize.conf - Configure roles, including granular access controls.
commands.conf - Connect search commands to any custom search script.
deployment_server.conf - Set up deployment servers and clients.
decorations.conf - Customize dynamic event rendering.
eventdiscoverer.conf - Set terms to ignore for typelearner (event discovery).
eventtypes.conf - Create event type definitions.
field.conf - Create multivalue fields and add search capability for indexed fields.
field_actions.conf - Enable clickable actions on fields in SplunkWeb.
indexes.conf - Manage and configure index settings.
inputs.conf - Set up data inputs.
limits.conf - Set various limits (such as maximum result size) for search commands.
literals.conf - Customize the text displayed in Splunk Web.
multikv.conf - Configure extraction rules for table-like events (eg ps, netstat, ls).
outputs.conf - Set up forwarding, routing, cloning and data balancing.
prefs.conf - Specify user preferences and dashboards for Splunk Web.
props.conf - Set indexing property configurations, including timezone offset and custom sourcetype rules. Also map transforms to event properties.
restmap.conf - Configure REST endpoints.
regmonfilters.conf - Create filters for Windows registry monitoring.
savedsearches.conf - Define saved searches and their associated schedules and alerts.
segmenters.conf - Customize segmentation rules for indexed events.
server.conf - Enable SSL for Splunk's back-end and specify certification locations.
setup.conf - Configure a Splunk application's interaction with other Splunk applications.
sourceclassifier.conf - Terms to ignore (such as sensitive data) when creating a sourcetype.
sourcetypes.conf - Machine-generated file that stores sourcetype learning rules created by sourcetype training.
streams.conf - Configure additional streams for Live tail.
strings.conf - Configure externalized product text strings.
sysmon.conf - Set up Windows registry monitoring.
tags.conf - Configure tags for extracted and indexed fields.
transactiontypes.conf - Add additional transaction types for transaction search.
transforms.conf - Configure regex transformations to perform on data inputs. Use in tandem with props.conf.
user_seed.conf - Set a default user and password.
web.conf - Configure Splunk Web, enable HTTPs.
wmi.conf - Set up Windows management instrumentation (WMI) inputs.
Applications
About apps
This topic provides an overview of applications and how you can use them. For Splunk applications
and information on how to build your own, refer to the Apps Wiki. For existing applications, use
Splunk's App Manager to browse SplunkBase.
What are applications?
A Splunk application can be as simple as a collection of one or more event type definitions, searches,
and/or saved searches. Or, it can be as complex as an entirely new program using Splunk's REST
API.
Where can you find them?
When you install Splunk, a number of applications are installed by default (but not necessarily
enabled, we'll get to that later). You can see them by launching Splunk Web and navigating to the
Admin > Applications page. In particular, the Splunk forwarder, light forwarder, and desktop
configuration applications are listed here. You can find and install more Splunk applications from this
page.
How can I tell what applications are installed?
You can navigate to the Admin > Applications page in Splunk Web and see what applications are
enabled for your Splunk installation, or you can use the CLI to check to see if a particular application
is installed by going to $SPLUNK_HOME/bin and typing:
./splunk display <application name>
Where do they fit into Splunk?
Each Splunk application that is listed in the Admin>Applications page has its own directory under
$SPLUNK_HOME/etc/apps/, where SPLUNK_HOME is the directory into which you installed Splunk.
Each Splunk application can have a setup.conf file to specify how that application interacts with other
Splunk applications.
How do you install them?
For general installation instructions refer to the Install Splunk applications topic.
Important: Splunk's directory structure changed between versions 3.2 and 3.3. If you are
downloading an application from SplunkBase, you may have to upgrade to 3.3. Contact Splunk
support for guidance.
About Splunk's app manager
Use Splunk Web's Admin interface to manage existing applications and browse Splunkbase for new
applications. For Splunk applications and information on how to build your own, refer to the Apps
Wiki.
View and manage applications
The Applications: View/Manage page displays a table of the applications currently installed on your
system, each application's status, and actions you can perform on each application. By clicking on
the option in the Actions column, you can Uninstall, Configure, and Enable/Disable your applications.
Uninstall
Uninstall deletes the application from the list and removes all the files associated with the
application. When you click Uninstall a dialog window opens and asks if you wish to continue or
cancel the uninstall action. You must restart Splunk to complete the uninstall process.
Configure
Configure redirects you to the applications View/Manage page, where you can edit the application's
configuration stanzas (if they exist).
Enable/Disable
Enable/Disable turns your application on or off and updates the Status column to reflect the change.
If your application is already enabled, you will see the Disable option and vice-versa.
Use the Enable and Disable buttons, located above the table, to quickly enable and disable all
checked items.
Browse SplunkBase
Use Splunk Web's Applications: Browse SplunkBase page to view and install any of the
applications available on SplunkBase.
Application summary
Scroll through all the applications or use the category links to view groups of applications. Each
application has a summary describing its use and a list of information that includes the number of
downloads and the application's price.
Install applications
Each application has an install button: Install Free for free applications or Install 30-day Free Trial
for applications that have an associated price. When you click on an install button for an application,
Splunk Web redirects you to a SplunkBase login page. You have to log in to download the
application.
After you install an application, restart Splunk. Now, you can view and manage your new application
from the Applications: View/Manage page.
Note: To upgrade an application, we recommend that you remove the old version and do a fresh
install of the application (with the new version).
Install Splunk apps
This topic discusses the ways you can download and install applications from SplunkBase.
SplunkBase is the repository for all apps created by Splunk and the Splunk community. You can
either access applications directly from SplunkBase or you can use Splunk's App manager.
Read the About Apps topic for general information. For information about specific Splunk applications
and building your own application, refer to the Apps Wiki.
via Splunk's App Manager
Splunk Web's Admin interface includes an app manager, the Applications page. You can use this page to view and manage all your existing applications and browse SplunkBase for more
applications. The Applications: Browse SplunkBase page lets you view and install any of the
applications available on SplunkBase.
To install an application using Splunk Web:
1. From the Admin > Applications page, click Browse SplunkBase.
2. Scroll through the list of applications to find the one you want.
You can use the category links to narrow your search. Each application listed includes a brief summary and some statistics, such as last update, number of downloads, and price.
Note: For more information about an application, click its name to open its SplunkBase page in
another window or tab. If you want to install the application, return to Splunk Web before continuing.
3. Each application has an install button: Install Free for free applications or Install 30-day Free
Trial for applications that have an associated price. Click the install button.
Splunk Web redirects you to a SplunkBase login page.
4. Use your Splunk.com username and password to log into SplunkBase.
After you click Login, Splunk downloads and installs the application and redirects you to the
Applications: View/Manage Applications page. You should now see the new application in the list.
5. To complete the installation process, restart Splunk.
You can restart Splunk from the Admin > Server: Control Server page. Just click Restart Now.
after download from SplunkBase
1. Download your application from SplunkBase.
Note: Apps come packaged with the Splunk extension .spl.
2. Rename the SPL file to replace the .spl extension with .tar.gz.
For example, if you downloaded myApp.spl, rename it myApp.tar.gz.
3. Move and expand the application file into the $SPLUNK_HOME/etc/apps/ directory.
4. Review the contents of the application.
You may need to configure the app before you continue. For more information, refer to Configuring
your new application.
5. After you configure the application, restart your Splunk server.
From the CLI, you can restart Splunk with:
./splunk restart
The application is now installed and can be viewed and managed from Splunk's App manager.
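As a concrete sketch of steps 2 and 3 above, using the hypothetical file name myApp.spl:
mv myApp.spl myApp.tar.gz
tar -xzf myApp.tar.gz -C $SPLUNK_HOME/etc/apps/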
on an isolated network or machine
If your Splunk server cannot reach the WAN, Splunk Web will not be able to access SplunkBase. To
install an application in this situation, download the app using another machine that can access the
WAN. Then, transfer the file to the desired machine and expand the file into the proper directory to
install the application.
1. From a machine that can access the WAN, download an application from SplunkBase.
2. Rename the SPL file to replace the .spl extension with .tar.gz.
3. Transfer the file to the $SPLUNK_HOME/etc/apps/ directory on the desired machine.
Note: Depending on the accessibility of this isolated machine, this may require SCP/FTP or a USB
drive.
4. Expand the application file into the $SPLUNK_HOME/etc/apps/ directory.
5. Configure the application.
6. Restart your Splunk server.
Enable app via CLI
To enable a Splunk application via the CLI, you must first download the application from SplunkBase and unpack it as directed above:
./splunk enable app <appname> -auth <username>:<password>
Note: If you are running Splunk with a free license, you do not have to provide a username and
password.
Disable app via CLI
To disable a Splunk application via the CLI:
./splunk disable app <appname> -auth <username>:<password>
Note: If you are running Splunk with a free license, you do not have to provide a username and
password.
Reference
Pre-trained source types
Splunk ships pre-trained to recognize many different source types. A number of source types are
automatically recognized, tagged and parsed appropriately. Splunk also contains a significant number
of pre-trained source types that are not automatically recognized but can be assigned via SplunkWeb
or inputs.conf.
It's a good idea to use a pre-trained source type if it matches your data, as Splunk contains optimized
indexing properties for pre-trained source types. However, if your data does not fit any pre-trained source type, Splunk can still index virtually any format of data without custom properties.
Learn more about source types and how they work.
Automatically recognized source types
Source type name, origin, and an example event for each:

access_combined: NCSA combined format http web server logs (can be generated by apache or other web servers).
Example: 10.1.1.43 - webdev [08/Aug/2005:13:18:16 -0700] "GET / HTTP/1.0" 200 0442 "-" "check_http/1.10 (nagios-plugins 1.4)"

access_combined_wcookie: NCSA combined format http web server logs (can be generated by apache or other web servers), with cookie field added at end.
Example: "66.249.66.102.1124471045570513" 59.92.110.121 - - [19/Aug/2005:10:04:07 -0700] "GET /themes/splunk_com/images/logo_splunk.png HTTP/1.1" 200 994 "http://www.splunk.org/index.php/docs" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050524 Fedora/1.0.4-4 Firefox/1.0.4" "61.3.110.148.1124404439914689"

access_common: NCSA common format http web server logs (can be generated by apache or other web servers).
Example: 10.1.1.140 - - [16/May/2005:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png HTTP/1.1" 404 304

apache_error: Standard Apache web server error log.
Example: [Sun Aug 7 12:17:35 2005] [error] [client 10.1.1.015] File does not exist: /home/reba/public_html/images/bullet_image.gif

asterisk_cdr: Standard Asterisk IP PBX call detail record.
Example: "","5106435249","1234","default","""James Jesse""<5106435249>","SIP/5249-1ce3","","VoiceMail","u1234","2005-05-26 15:19:25","2005-05-26 15:19:25","2005-05-26 15:19:42",17,17,"ANSWERED","DOCUMENTATION"

asterisk_event: Standard Asterisk event log (management events).
Example: Aug 24 14:08:05 asterisk[14287]: Manager 'randy' logged on from 127.0.0.1

asterisk_messages: Standard Asterisk messages log (errors and warnings).
Example: Aug 24 14:48:27 WARNING[14287]: Channel 'Zap/1-1' sent into invalid extension 's' in context 'default', but no invalid handler

asterisk_queue: Standard Asterisk queue log.
Example: NONE|NONE|NONE|CONFIGRELOAD|

cisco_syslog: Standard Cisco syslog produced by all Cisco network devices including PIX firewalls, routers, ACS, etc., usually via remote syslog to a central log host.
Example: Sep 14 10:51:11 stage-test.splunk.com Aug 24 2005 00:08:49: %PIX-2-106001: Inbound TCP connection denied from IP_addr/port to IP_addr/port flags TCP_flags on interface int_name Inbound TCP connection denied from 144.1.10.222/9876 to 10.0.253.252/6161 flags SYN on interface outside

db2_diag: Standard IBM DB2 database administrative and error log.
Example: 2005-07-01-14.08.15.304000-420 I27231H328 LEVEL: Event PID : 2120 TID : 4760 PROC : db2fmp.exe INSTANCE: DB2 NODE : 000 FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900 STOP : Automatic Runstats: evaluation has finished on database TRADEDB

exim_main: Exim MTA main log.
Example: 2005-08-19 09:02:43 1E69KN-0001u6-8E => support-notifications@splunk.com R=send_to_relay T=remote_smtp H=mail.int.splunk.com [10.2.1.10]

exim_reject: Exim reject log.
Example: 2005-08-08 12:24:57 SMTP protocol violation: synchronization error (input sent without waiting for greeting): rejected connection from H=gate.int.splunk.com [10.2.1.254]

linux_messages_syslog: Standard linux syslog (/var/log/messages on most platforms).
Example: Aug 19 10:04:28 db1 sshd(pam_unix)[15979]: session opened for user root by (uid=0)

linux_secure: Linux secure log.
Example: Aug 18 16:19:27 db1 sshd[29330]: Accepted publickey for root from ::ffff:10.2.1.5 port 40892 ssh2

log4j: Log4j standard output produced by any J2EE server using log4j.
Example: 2005-03-07 16:44:03,110 53223013 [PoolThread-0] INFO [STDOUT] got some property...

mysqld_error: Standard mysql error log.
Example: 050818 16:19:29 InnoDB: Started; log sequence number 0 43644 /usr/libexec/mysqld: ready for connections. Version: '4.1.10a-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution

mysqld: Standard mysql query log; also matches mysql's binary log following conversion to text.
Example: 53 Query SELECT xar_dd_itemid, xar_dd_propid, xar_dd_value FROM xar_dynamic_data WHERE xar_dd_propid IN (27) AND xar_dd_itemid = 2

postfix_syslog: Standard Postfix MTA log reported via the Unix/Linux syslog facility.
Example: Mar 1 00:01:43 avas postfix/smtpd[1822]: 0141A61A83: client=host76-117.pool80180.interbusiness.it[80.180.117.76]

sendmail_syslog: Standard Sendmail MTA log reported via the Unix/Linux syslog facility.
Example: Aug 6 04:03:32 nmrjl00 sendmail[5200]: q64F01Vr001110: to=root, ctladdr=root (0/0), delay=00:00:01, xdelay=00:00:00, mailer=relay, min=00026, relay=[101.0.0.1] [101.0.0.1], dsn=2.0.0, stat=Sent (v00F3HmX004301 Message accepted for delivery)

sugarcrm_log4php: Standard Sugarcrm activity log reported using the log4php utility.
Example: Fri Aug 5 12:39:55 2005,244 [28666] FATAL layout_utils - Unable to load the application list language file for the selected language(en_us) or the default language(en_us)

weblogic_stdout: Weblogic server log in the standard native BEA format.
Example: ####<Sep 26, 2005 7:27:24 PM MDT> <Warning> <WebLogicServer> <bea03> <asiAdminServer> <ListenThread.Default> <<WLS Kernel>> <> <BEA-000372> <HostName: 0.0.0.0, maps to multiple IP addresses:169.254.25.129,169.254.193.219>

websphere_activity: Websphere activity log, also often referred to as the service log.
Example: --------------------------------------------------------------- ComponentId: Application Server ProcessId: 2580 ThreadId: 0000001c ThreadName: Non-deferrable Alarm : 3 SourceId: com.ibm.ws.channel.framework.impl. WSChannelFrameworkImpl ClassName: MethodName: Manufacturer: IBM Product: WebSphere Version: Platform 6.0 [BASE 6.0.1.0 o0510.18] ServerName: nd6Cell01\was1Node01\TradeServer1 TimeStamp: 2005-07-01 13:04:55.187000000 UnitOfWork: Severity: 3 Category: AUDIT PrimaryMessage: CHFW0020I: The Transport Channel Service has stopped the Chain labeled SOAPAcceptorChain2 ExtendedMessage: ---------------------------------------------------------------

websphere_core: Corefile export from Websphere.
Example: NULL------------------------------------------------------------------------ 0SECTION TITLE subcomponent dump routine NULL=============================== 1TISIGINFO signal 0 received 1TIDATETIME Date: 2005/08/02 at 10:19:24 1TIFILENAME Javacore filename: /kmbcc/javacore95014.1122945564.txt NULL ------------------------------------------------------------------------ 0SECTION XHPI subcomponent dump routine NULL ============================== 1XHTIME Tue Aug 2 10:19:24 20051XHSIGRECV SIGNONE received at 0x0 in <unknown>. Processing terminated. 1XHFULLVERSION J2RE 1.3.1 IBM AIX build ca131-20031105 NULL

websphere_trlog_syserr: Standard Websphere system error log in IBM's native trlog format.
Example: [7/1/05 13:41:00:516 PDT] 000003ae SystemErr R at com.ibm.ws.http.channel. inbound.impl.HttpICLReadCallback.complete (HttpICLReadCallback.java(Compiled Code)) (truncated)

websphere_trlog_sysout: Standard Websphere system out log in IBM's native trlog format; similar to the log4j server log for Resin and Jboss, same format as the system error log but containing lower severity and informational events.
Example: [7/1/05 13:44:28:172 PDT] 0000082d SystemOut O Fri Jul 01 13:44:28 PDT 2005 TradeStreamerMDB: 100 Trade stock prices updated: Current Statistics Total update Quote Price message count = 4400 Time to receive stock update alerts messages (in seconds): min: -0.013 max: 527.347 avg: 1.0365270454545454 The current price update is: Update Stock price for s:393 old price = 15.47 new price = 21.50

windows_snare_syslog: Standard windows event log reported through a 3rd party Intersect Alliance Snare agent to remote syslog on a Unix or Linux server.
Example: 0050818050818 Sep 14 10:49:46 stage-test.splunk.com Windows_Host MSWinEventLog 0 Security 3030 Day Aug 24 00:16:29 2005 560 Security admin4 User Success Audit Test_Host Object Open: Object Server: Security Object Type: File Object Name: C:\Directory\secrets1.doc New Handle ID: 1220 Operation ID: {0,117792} Process ID: 924 Primary User Name: admin4 Primary Domain: FLAME Primary Logon ID: (0x0,0x8F9F) Client User Name: - Client Domain: - Client Logon ID: - Accesses SYNCHRONIZE ReadData (or ListDirectory) Privileges -Sep
Pre-trained source types
This list contains both automatically recognized source types and pre-trained source types that are
not automatically recognized.
Category and source type(s):

Application servers: log4j, log4php, weblogic_stdout, websphere_activity, websphere_core, websphere_trlog
Databases: mysqld, mysqld_error, mysqld_bin
E-mail: exim_main, exim_reject, postfix_syslog, sendmail_syslog, procmail
Operating systems: linux_messages_syslog, linux_secure, linux_audit, linux_bootlog, anaconda, anaconda_syslog, osx_asl, osx_crashreporter, osx_crash_log, osx_install, osx_secure, osx_daily, osx_weekly, osx_monthly, osx_window_server, windows_snare_syslog, dmesg, ftp, ssl_error, syslog, sar, rpmpkgs
Network: novell_groupwise, tcp
Printers: cups_access, cups_error, spooler
Routers and firewalls: cisco_cdr, cisco_syslog, clavister
VoIP: asterisk_cdr, asterisk_event, asterisk_messages, asterisk_queue
Web servers: access_combined, access_combined_wcookie, access_common, apache_error, iis
Miscellaneous: snort
Splunk log files
Splunk keeps track of its activity by logging to various files in $SPLUNK_HOME/var/log/splunk.
Splunk's internal log files are rolled based on size. You can change the default log rotation size by
editing $SPLUNK_HOME/etc/log.cfg.
Search these files in Splunk Web by typing:
index::_internal
Internal logs
Here is a complete list with descriptions of the internal logs in $SPLUNK_HOME/var/log/splunk.
Splunk's internal logs are useful for troubleshooting or metric analysis.
audit.log
Log of audit events.
crawl.log
Log of crawl activities.
inputs.log
license_audit.log
Continuous audit of license violations.
metrics.log
Contains information about CPU usage and Splunk's data processing. The metrics.log file is a
sampling of the top ten items in each category, in 30-second intervals, based on the size of _raw. It
can be used for limited analysis of volume trends for data inputs. For more information about what's
in metrics.log, refer to Work with metrics.log as well as this developer blog post about Splunk
forwarder and indexer metrics.
migration.log
A log of events during install and migration. Specifies which files were altered during upgrade.
python.log
A log of python events within Splunk. Useful for debugging REST endpoints and communication with
splunkd.
searchhistory.log
A log of all searches performed on the server since installation or the most recent splunk clean
command.
splunkd_stdout.log
The Unix standard output device for the server.
splunkd_stderr.log
The Unix standard error device for the server.
splunklogger.log
A subset of the Splunk server's own log events since installation or the most recent splunk clean
command. This file is sent to index::splunklogger and can be searched through Splunk Web.
splunkd.log
A record of actions made by the Splunk server. May be requested by Splunk Support for
troubleshooting purposes.
splunkmon.log
Log of splunk's watchdog process.
web_access.log
A record of actions made by Splunk Web, in an Apache access_log format.
web_service.log
A record of actions made by Splunk Web.
wmi.log
Only relevant on Windows machines. Logs attempts from Splunk to connect to WMI.
debug
Splunk has a debugging parameter (--debug) you can add when starting Splunk from the CLI (with
./splunk start).
./splunk start --debug
Note: Navigate to Splunk's CLI $SPLUNK_HOME/bin and use the ./splunk command. You can
also add Splunk to your path.
This command outputs logs to $SPLUNK_HOME/var/log/splunk/splunkd.log. To turn off
debugging, stop or restart Splunk.
Note: Running Splunk with debugging turned on outputs a large amount of information. Make sure
you do not leave debugging on for any significant length of time.
To dynamically enable debugging messages for a particular category, you can use these searches:
From the UI:
| debug cmd=logchange param1=FileInputTracker param2=DEBUG
| debug cmd=logchange param1=selectProcessor param2=DEBUG
You will get a message "Error in 'DebugCommand'..." This is normal and can be ignored.
From the CLI:
$ ./splunk search "| debug cmd=logchange param1=FileInputTracker
param2=DEBUG" -auth admin:changeme
You will get the message "FATAL: Error in 'DebugCommand': Setting priority of..." This is normal and
can be ignored.
The log.cfg file is not changed and the original settings from this file will be restored on next restart.
log.cfg
For more granular debugging messages, you can change log levels by editing
$SPLUNK_HOME/etc/log.cfg. This affects Splunk's internal logs.
You can change the following categories in log.cfg. Set the category you wish to debug from WARN
or INFO to DEBUG.
The message levels, in order from least to most urgent are:
DEBUG
INFO
WARN
ERROR
FATAL
CRIT
rootCategory=WARN,A1
category.LicenseManager=INFO
category.TcpOutputProc=INFO
category.TcpInputProc=INFO
category.UDPInputProcessor=INFO
category.SavedSplunker=INFO
category.DistributedMgr=INFO
category.DistributedExecutionContext=INFO
category.DistributedDeploymentProcessor=INFO
category.DistributedDeploymentClientProcessor=INFO
category.DistributedDeploymentClientMgr=INFO
category.DistributedDeploymentMgr=INFO
category.ThruputProcessor=WARN
category.ShutdownHandler=WARN
# leave loader at INFO! this is what gives us our build + system info...
category.loader=INFO
category.ulimit=INFO
category.SearchPerformance=INFO
category.SearchPipelinePerformance=WARN
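For example, to get debug-level messages from one of the categories listed above, change its level from INFO or WARN to DEBUG (the SavedSplunker category shown here is just one possible choice):
category.SavedSplunker=DEBUG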
To change the maximum size of a log file before it rolls, change the maxFileSize value (in bytes)
for the desired file:
appender.A1=RollingFileAppender
appender.A1.fileName=${SPLUNK_HOME}/var/log/splunk/splunkd.log
appender.A1.maxFileSize=250000000
appender.A1.maxBackupIndex=5
appender.A1.layout=PatternLayout
appender.A1.layout.ConversionPattern=%d{%m-%d-%Y %H:%M:%S.%l} %-5p %c - %m%n
If you modify logging settings with a logchange search or through the UI, these are not persisted in
log.cfg. Your original settings will be restored on the next restart.
Work with metrics.log
Splunk's internal metrics.log file is a sampling of the top ten items in each category, in
30-second intervals, based on the size of _raw. It does not give you an exact accounting of all your
inputs, just the top 10 hot data sources. You can examine its contents for limited analysis of volume
trends for data inputs. It is different from the numbers reported by LicenseManager, which include the
indexed fields. Also, the default configuration only maintains the metrics data in the internal index a
few days, but by going to the files you can see trends over a period of months if your rolled files go
that far back.
Note: You can change the number of series that metrics.log tracks from the default of 10 by
editing the value of maxseries in the [metrics] stanza in limits.conf.
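For example, to have metrics.log track the top 20 series instead (20 is an arbitrary value), you could add this to a limits.conf in $SPLUNK_HOME/etc/system/local/:
[metrics]
maxseries = 20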
You can find more information about metrics.log in this developer blog posting about forwarder
and indexer metrics.
Use metrics.log to troubleshoot issues with data inputs
You might want to identify a data input that has suddenly begun to generate uncharacteristically large
numbers of events. If this input is hidden in a large quantity of similar data, it can be difficult to
determine which one is actually the problem. You can find it by searching the internal index (add
index=_internal to your search) or just look in metrics.log itself in
$SPLUNK_HOME/var/log/splunk.
A typical metrics.log has stuff like this:
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=tail, processor=tail, cpu_seconds=0.000000, executes=31, cumulative_hits=73399
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=annotator, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=clusterer, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=readerin, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=pipeline, name=typing, processor=sendout, cpu_seconds=0.000000, executes=63, cumulative_hits=134912
03-13-2008 10:48:55.620 INFO Metrics - group=thruput, name=index_thruput, instantaneous_kbps=0.302766, instantaneous_eps=2.129032, average_kbps=0.000000, total_k_processed=19757, load_average=0.124023
03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="fthost", kbps=0.019563, eps=0.096774, kb=0.606445
03-13-2008 10:48:55.620 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.283203, eps=2.032258, kb=8.779297
03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="_internal", kbps=0.275328, eps=1.903226, kb=8.535156
03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="_thefishbucket", kbps=0.019563, eps=0.096774, kb=0.606445
03-13-2008 10:48:55.620 INFO Metrics - group=per_index_thruput, series="default", kbps=0.007876, eps=0.129032, kb=0.244141
03-13-2008 10:48:55.620 INFO Metrics - group=per_source_thruput, series="/applications/splunk3.2/var/log/splunk/metrics.log", kbps=0.272114, eps=1.870968, kb=8.435547
03-13-2008 10:48:55.620 INFO Metrics - group=per_source_thruput, series="/applications/splunk3.2/var/log/splunk/splunkd.log", kbps=0.003213, eps=0.032258, kb=0.099609
03-13-2008 10:48:55.620 INFO Metrics - group=per_source_thruput, series="/var/log/apache2/somedomain_access_log", kbps=0.007876, eps=0.096774, kb=0.244141
03-13-2008 10:48:55.620 INFO Metrics - group=per_source_thruput, series="filetracker", kbps=0.019563, eps=0.096774, kb=0.606445
03-13-2008 10:48:55.620 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.007876, eps=0.129032, kb=0.244141
03-13-2008 10:48:55.620 INFO Metrics - group=per_sourcetype_thruput, series="filetrackercrclog", kbps=0.019563, eps=0.096774, kb=0.606445
03-13-2008 10:48:55.620 INFO Metrics - group=per_sourcetype_thruput, series="splunkd", kbps=0.275328, eps=1.903226, kb=8.535156
03-13-2008 10:48:55.620 INFO Metrics - group=queue, name=aeq, max_size=10, filled_count=0, empty_count=0, current_size=0, largest_size=0, smallest_size=0
03-13-2008 10:48:55.620 INFO Metrics - group=queue, name=aq, max_size=10, filled_count=0, empty_count=0, current_size=0, largest_size=0, smallest_size=0
03-13-2008 10:48:55.620 INFO Metrics - group=queue, name=tailingq, current_size=0, largest_size=0, smallest_size=0
03-13-2008 10:48:55.620 INFO Metrics - group=queue, name=udp_queue, max_size=1000, filled_count=0, empty_count=0, current_size=0, largest_size=0, smallest_size=0
There's a lot more there than just volume data, but for now let's focus on investigating data inputs.
group identifies what type of thing is being reported on and series gives the particular item.
For incoming events, the amount of data processed is in the thruput group, as in
per_host_thruput. In this example, you're only indexing data from one host, so
per_host_thruput actually can tell us something useful: that right now host "grumpy" indexes
around 8k in a 30-second period. Since there is only one host, you can add it all up and get a good
picture of what you're indexing, but if you had more than 10 hosts you would only get a sample.
03-13-2008 10:49:57.634 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.245401, eps=1.774194, kb=7.607422
03-13-2008 10:50:28.642 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.237053, eps=1.612903, kb=7.348633
03-13-2008 10:50:59.648 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.217584, eps=1.548387, kb=6.745117
03-13-2008 10:51:30.656 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.245621, eps=1.741935, kb=7.614258
03-13-2008 10:52:01.661 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.311051, eps=2.290323, kb=9.642578
03-13-2008 10:52:32.669 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.296938, eps=2.322581, kb=9.205078
03-13-2008 10:53:03.677 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.261593, eps=1.838710, kb=8.109375
03-13-2008 10:53:34.686 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.263136, eps=2.032258, kb=8.157227
03-13-2008 10:54:05.692 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.261530, eps=1.806452, kb=8.107422
03-13-2008 10:54:36.699 INFO Metrics - group=per_host_thruput, series="grumpy", kbps=0.313855, eps=2.354839, kb=9.729492
For example, you might know that access_common is a popular sourcetype for events on this Web
server, so it would give you a good idea of what was happening:
03-13-2008 10:51:30.656 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.022587, eps=0.193548, kb=0.700195
03-13-2008 10:52:01.661 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.053585, eps=0.451613, kb=1.661133
03-13-2008 10:52:32.670 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.031786, eps=0.419355, kb=0.985352
03-13-2008 10:53:34.686 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.030998, eps=0.387097, kb=0.960938
03-13-2008 10:54:36.700 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.070092, eps=0.612903, kb=2.172852
03-13-2008 10:56:09.722 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.023564, eps=0.290323, kb=0.730469
03-13-2008 10:56:40.730 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.006048, eps=0.096774, kb=0.187500
03-13-2008 10:57:11.736 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.017578, eps=0.161290, kb=0.544922
03-13-2008 10:58:13.748 INFO Metrics - group=per_sourcetype_thruput, series="access_common", kbps=0.025611, eps=0.225806, kb=0.793945
But you probably have more than 10 sourcetypes, so at any particular time some other one could
spike and access_common wouldn't be reported. per_index_thruput and
per_source_thruput work similarly.
With this in mind, let's examine the standard saved search "KB indexed per hour last 24 hours".
index::_internal metrics group=per_index_thruput NOT debug NOT sourcetype::splunk_web_access | timechart fixedrange=t span=1h sum(kb) | rename sum(kb) as totalKB
This means: look in the internal index for metrics data of group per_index_thruput, ignore some
internal stuff and make a report showing the sum of the kb values. For cleverness, we'll also rename
the output to something meaningful, "totalKB". The result looks like this:
sum of kb vs. time for results in the past day
_time totalKB
1 03/12/2008 11:00:00 922.466802
2 03/12/2008 12:00:00 1144.674811
3 03/12/2008 13:00:00 1074.541995
4 03/12/2008 14:00:00 2695.178730
5 03/12/2008 15:00:00 1032.747082
6 03/12/2008 16:00:00 898.662123
Those totalKB values just come from the sum of kb over a one hour interval. If you like, you can
change the search and get just the ones from grumpy:
index::_internal metrics grumpy group=per_host_thruput | timechart fixedrange=t span=1h sum(kb) | rename sum(kb) as totalKB
sum of kb vs. time for results in the past day
_time totalKB
1 03/12/2008 11:00:00 746.471681
2 03/12/2008 12:00:00 988.568358
3 03/12/2008 13:00:00 936.092772
4 03/12/2008 14:00:00 2529.226566
5 03/12/2008 15:00:00 914.945313
6 03/12/2008 16:00:00 825.353518
Similarly, you can report on just the access_common source type:
index::_internal metrics access_common group=per_sourcetype_thruput | timechart fixedrange=t span=1h sum(kb) | rename sum(kb) as totalKB
sum of kb vs. time for results in the past day
_time totalKB
1 03/12/2008 11:00:00 65.696285
2 03/12/2008 12:00:00 112.035162
3 03/12/2008 13:00:00 59.775395
4 03/12/2008 14:00:00 35.008788
5 03/12/2008 15:00:00 62.478514
6 03/12/2008 16:00:00 14.173828
Log file rotation
Splunk recognizes when a file that it is monitoring (such as /var/log/messages) has been rolled
(/var/log/messages1) and will not read in the rolled file a second time.
Note: Splunk does not recognize archive files produced by logrotate (such as tar or gzip) as the
same as the uncompressed originals. This can lead to a duplication of data if these files are then
monitored by Splunk. You can explicitly set blacklist rules for archive filetypes to prevent Splunk from
reading these files as new logfiles, or you can configure logrotate to move these files into a directory
you have not told Splunk to read.
Splunk recognizes the following archive filetypes: tar, gz, bz2, tar.gz, tgz, tbz, tbz2, zip, and z.
For more information on setting blacklist rules see "Whitelist and blacklist rules" in this manual.
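As a sketch of the first approach, a blacklist stanza in inputs.conf that excludes common archive extensions from a monitored directory might look roughly like this (the monitored path and the attribute name are only examples; see "Whitelist and blacklist rules" for the exact syntax):
[blacklist:/var/log]
archives = \.(tar|gz|bz2|tgz|tbz2?|zip|z)$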
How log rotation works
The monitoring processor picks up new files and reads the first and last 256 bytes of the file. This
data is hashed into a begin and end cyclic redundancy check (CRC). Splunk checks new CRCs
against a database that contains all the CRCs of files Splunk has seen before. The location Splunk
last read in the file is also stored.
There are three possible outcomes of a CRC check:
1. There is no begin and end CRC matching this file in the database. This is a new file and will be
picked up and consumed from the start. Splunk updates the database with new CRCs and seekptrs
as the file is being consumed.
2. The begin and end CRCs are both present, but the size of the file is larger than the seekPtr Splunk
stored. This means that, while Splunk has seen the file before, information has been added to it since
it was last read. Splunk opens the file, seeks to the previous end of the file, and starts reading from
there, so it only grabs the new data and not anything it has read before.
3. The begin CRC is present but the end CRC does not match. This means the file has changed since
Splunk last read it, and some of the content Splunk has already read is now different. Because the
previously read data has changed, Splunk has no choice but to read the whole file again.
Determine what files Splunk is monitoring
When you configure inputs, you may want to know what specific files Splunk is monitoring prior to
starting Splunk for indexing. This is especially true when configuring whitelisting/blacklisting rules.
Splunk includes a listtails utility which reads the inputs.conf configuration from all applications,
scans your directories, and shows you the exact list of files that Splunk will monitor when you restart.
This allows you to make changes to inputs.conf and verify that the blacklist/whitelist filtering is
correct.
Run listtails
To use the listtails utility:
1. Navigate to $SPLUNK_HOME/bin/.
2. Run the command ./splunk cmd listtails.
Index SNMP events with Splunk
The most effective way to index SNMP events is to use snmptrapd to write them to a file.
First, configure snmptrapd to write to a file on disk.
# touch /var/run/snmp-traps
# snmptrapd -Lf /var/run/snmp-traps
Then, configure the Splunk server to add the file as a data input.
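For example, a minimal monitor stanza in inputs.conf for that file might look like this (the sourcetype name is just an example; you can also assign the input through Splunk Web):
[monitor:///var/run/snmp-traps]
sourcetype = snmp_traps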
log4j
The best way to index log4j files is to set up a standard log4j-syslog appender on your log4j host.
Then configure the Splunk server's properties to strip the syslog header prior to other processing, so
Splunk doesn't think the logs are single-line syslog entries.
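As a rough sketch, a log4j 1.x properties configuration that sends events to syslog on the Splunk host might look like this (the host name, facility, and pattern are only examples):
log4j.rootLogger=INFO, SYSLOG
log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
log4j.appender.SYSLOG.syslogHost=splunkhost
log4j.appender.SYSLOG.facility=LOCAL0
log4j.appender.SYSLOG.layout=org.apache.log4j.PatternLayout
log4j.appender.SYSLOG.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c - %m%n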
See the entry on stripping syslog headers for instructions on stripping the syslog headers.
Strip syslog headers before processing
Remove syslog headers from non-syslog events that have been passed through syslog to Splunk,
such as log4j events from a log4j-to-syslog appender. Splunk ships with a regex to do this for you in
$SPLUNK_HOME/etc/system/default/transforms.conf. Overwrite or change any of the
default attributes and values by creating a transforms.conf in
$SPLUNK_HOME/etc/system/local/ or your own custom bundle directory. For more information
on configuration files in general, see how configuration files work.
Configuration
transforms.conf
In $SPLUNK_HOME/etc/system/default/transforms.conf:
# This will strip out date stamp, host, process with pid and just get the
# actual message
[syslog-header-stripper-ts-host-proc]
REGEX = ^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s.*?:\s(.*)$
FORMAT = $1
DEST_KEY = _raw
Additional strippers found in this file include:
syslog-header-stripper-ts-host-proc: strips out the date stamp, host, and process with PID, leaving just the actual message.
syslog-header-stripper-ts-host: strips the syslog header (date stamp and host) from a syslog event. This is especially useful in allowing Splunk to extract the correct hostname if you are using hostname chaining.
syslog-header-stripper-ts: strips just the time stamp.
props.conf
In $SPLUNK_HOME/etc/system/local/props.conf:
[syslog]
TRANSFORMS-strip-syslog = syslog-header-stripper-ts-host-proc
This example turns on the built-in regex for remote syslog inputs.
Append any name to the TRANSFORMS declaration; there are no special keywords.
TRANSFORMS-the-cake-is-a-lie works just as well.
Example
If you have a central syslog server (syslog1.idkfa.kom) receiving events from multiple servers,
you can forward the events to a Splunk Server and index them based on the original host
(doom1.idkfa.kom) and original timestamp (07:37:15). For this example the events come to
Splunk via UDP port 514 and look like this:
Mar 30 14:29:35 syslog1.idkfa.kom Mar 30 07:37:15 doom1.idkfa.kom
sshd[7728]: Connection closed by ::ffff:192.168.1.101
Create this configuration stanza in props.conf:
[syslog]
TIME_PREFIX = ^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s[^\s]*\s
TRANSFORMS-strip-syslog= syslog-header-stripper-ts-host
Wildcards
A wildcard is a character that you can substitute for any of a class of characters.
Note: This is a work in progress.
Search
Configuration files
alert_actions.conf
alert_actions.conf controls parameters for the alerting actions available to scheduled searches.
alert_actions.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values for configuring global saved search actions and
# in alert_actions.conf. Saved searches are configured in savedsearches.conf.
#
# There is an alert_actions.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an alert_actions.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# alert_actions.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
################################################################################
# Global options: these settings do not need to be prefaced by a stanza name
# If you do not specify an entry for each attribute, Splunk will use the default value.
################################################################################
maxresults = <int>
* Set the global maximum number of search results sent via alerts.
* Defaults to 100.
hostname = <string>
* Set the hostname that is displayed in the link sent in alerts.
* This is useful when the machine sending the alerts does not have a FQDN.
* Defaults to current hostname (set in Splunk) or localhost (if none is set).
################################################################################
# EMAIL: these settings are prefaced by the [email] stanza name
################################################################################
[email]
* Set email notification options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
from = <string>
* Email address originating alert.
* Defaults to splunk@$LOCALHOST.
subject = <string>
* Specify an alternate email subject.
* Defaults to SplunkAlert-<savedsearchname>.
format = <string>
* Specify the format of text in the email.
* Possible values: plain, html, raw and csv.
* This value will also apply to any attachments.
inline = <true | false | auto>
* Specify whether the search results are contained in the body of the alert email.
* Defaults to false.
mailserver = <string>
* The SMTP mail server to use when sending emails.
* Defaults to $LOCALHOST.
################################################################################
# RSS: these settings are prefaced by the [rss] stanza
################################################################################
[rss]
* Set rss notification options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
items_count = <number>
* Number of saved RSS feeds.
* Cannot be more than maxresults (in [email] stanza).
* Defaults to 30.
################################################################################
# summary_index: these settings are prefaced by the [summary_index] stanza
################################################################################
[summary_index]
* Set summary index options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* NOTE: For best practice, set up summary indexing via Splunk Web.
command = <string>
* Command that triggers the summary indexing action.
* CAUTION: This attribute is set automatically when you configure summary indexing in Splunk Web.
* Configure summary indexing via Splunk Web to avoid any mistakes.
alert_actions.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example alert_actions.conf. Use this file to configure alert actions for saved searches.
#
# To use one or more of these configurations, copy the configuration block into alert_actions.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[email]
from = <email address>
# Set a custom from email address.
subject = <custom subject>
# By default, the subject is SplunkAlert-<splunk-name>, but you can set a custom subject here.
format = <html, plain, csv>
# Specify the format of the text in the email.
# Possible values: html, plain, csv.
[rss]
items_count=30
# Set the threshold of rss feeds.
app.conf
Configure app.conf to create dynamic fields for user entry in an application.
app.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for creating user entry fields for your custom application.
# Configure available fields for user entry via app.conf.
# There is no default app.conf. To set custom configurations, place an app.conf in
# $SPLUNK_HOME/etc/system/local/. For examples, see app.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[config:$STRING]
* Name your stanza.
* Preface with config:.
* Set $STRING to any arbitrary identifier.
targetconf = <$CONFIG_FILE>
* Target configuration file for changes.
* There can be only one.
* Any configuration file that is included in the application.
* For example indexes, for indexes.conf.
targetstanza = <$STANZA_NAME>
* Stanza name from application.
targetkey = <$ATTRIBUTE>
* Attribute to set.
targetkeydefault = <$VALUE>
* Default setting for attribute.
* Can be empty for no default.
conflabel = <$LABEL>
* Short description of configuration to display in Splunk Web.
app.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# The following are example app.conf configurations. Configure properties for your custom application.
#
# There is NO DEFAULT app.conf.
#
# To use one or more of these configurations, copy the configuration block into
# props.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[config:coldindexpath]
targetconf=indexes
targetstanza=sampledata
targetkey=coldPath
targetkeydefault=$SPLUNK_DB/sampledata/colddb
conflabel=Cold DB Path for Sample Data Index
[config:thawedindexpath]
targetconf=indexes
targetstanza=sampledata
targetkey=thawedPath
targetkeydefault=$SPLUNK_DB/sampledata/thaweddb
conflabel=Thawed DB Path for Sample Data Index
[config:homeindexpath]
targetconf=indexes
targetstanza=sampledata
targetkey=homePath
targetkeydefault=$SPLUNK_DB/sampledata/db
conflabel=Home DB Path for Sample Data Index
audit.conf
audit.conf controls settings for auditing and event signing.
audit.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure auditing
# and event signing in audit.conf.
#
# There is NO DEFAULT audit.conf. To set custom configurations, place an audit.conf in
# $SPLUNK_HOME/etc/system/local/. For examples, see audit.conf.example. You must restart
# Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#########################################################################################
# KEYS: specify your public and private keys for encryption.
#########################################################################################
[auditTrail]
* This stanza turns on cryptographic signing for audit trail events (set in inputs.conf)
and hashed events (if event hashing is enabled).
privateKey=/some/path/to/your/private/key/private_key.pem
publicKey=/some/path/to/your/public/key/public_key.pem
* Set a path to your keys.
* You must have a private key to encrypt the signatures and a public key to decrypt them.
* Generate your own keys using genAuditKeys.py in $SPLUNK_HOME/bin/.
queueing=<true | false>
* Turn off sending audit events to the indexQueue -- tail the audit events instead.
* If this is set to 'false', you MUST add an inputs.conf stanza to tail the audit log.
* Defaults to 'true.'
#########################################################################################
# EVENT HASHING: turn on SHA256 event hashing.
#########################################################################################
[eventHashing]
* This stanza turns on event hashing -- every event is SHA256 hashed.
* The indexer will encrypt all the signatures in a block.
* Follow this stanza name with any number of the following attribute/value pairs.
filters=mywhitelist,myblacklist...
* (Optional) Filter which events are hashed.
* Specify filtername values to apply to events.
* NOTE: The order of precedence is left to right.
# FILTER SPECIFICATIONS FOR EVENT HASHING
[filterSpec:<event_whitelist | event_blacklist>:<filtername>]
* This stanza turns on whitelisting or blacklisting for events.
* Use filternames in "filters" entry (above).
* For example [filterSpec:event_whitelist:foofilter].
* Follow the filterSpec stanza with an optional list of blacklisted/whitelisted sources,
hosts or sourcetypes (in order from left to right).
* For example:
source=s1,s2,s3...
host=h1,h2,h3...
sourcetype=st1,st2,st3...
all=<true | false>
* The 'all' tag tells the blacklist to stop 'all' events.
* Defaults to 'false.'
audit.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example audit.conf. Use this file to configure auditing and event hashing.
#
# There is NO DEFAULT audit.conf.
#
# To use one or more of these configurations, copy the configuration block into audit.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
###################################
# Audit heading
# If this stanza exists, audit events are cryptographically signed.
# You must have a private key to encrypt the signatures and a public key to decrypt them.
# Generate your own keys using genAuditKeys.py in $SPLUNK_HOME/bin/.
[auditTrail]
privateKey=/some/path/to/your/private/key/private_key.pem
publicKey=/some/path/to/your/public/key/public_key.pem
###################################
# EXAMPLE 1 - Hash all events
# This performs an SHA256 hash on every event other than ones in the audit index.
# NOTE: All you need to enable hashing is the presence of the stanza 'eventHashing'.
[eventHashing]
###################################
# EXAMPLE 2 - Simple blacklisting
# Splunk does NOT hash any events from the hosts listed - they are 'blacklisted'. Hash all other
# events.
[filterSpec:event_blacklist:myblacklist]
host=somehost.splunk.com, 45.2.4.6, 45.3.5.4
[eventHashing]
filters=myblacklist
###################################
# EXAMPLE 3 - Multiple blacklisting
# DO NOT hash any events with the following, sources, sourcetypes and hosts - they are all
# blacklisted. All other events are hashed.
[filterSpec:event_blacklist:myblacklist]
host=somehost.splunk.com, 46.45.32.1
source=/some/source
sourcetype=syslog, apache.error
[eventHashing]
filters=myblacklist
###################################
# EXAMPLE 4 - Whitelisting
# Hash ONLY those events which are sourcetype 'syslog'. All other events are NOT hashed.
# Note that filters are executed from left to right for every event.
# If an event passes a whitelist, the rest of the filters do not execute. Thus, placing
# the whitelist filter before the 'all' blacklist filter says "only hash those events which
# match the whitelist".
[filterspec:event_whitelist:mywhitelist]
sourcetype=syslog
source=/var/log
host=foo
[filterspec:event_blacklist:nothingelse]
#The 'all' tag is a special boolean (defaults to false) that says match *all* events
all=True
[eventSigning]
filters=mywhitelist, nothingelse
authentication.conf
authentication.conf controls which authentication method is used (LDAP or native Splunk
authentication) and contains settings for LDAP configuration. This file is written to when you use
SplunkWeb to set up server authentication (Admin > Server > Authentication Configuration) and can
also be configured manually.
When you wish to test changes to authentication.conf, you do not need to restart the Splunk
server. You can reload the file by using SplunkWeb > Admin > Server > Control > Reload
Authentication Configuration.
authentication.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values for configuring authentication via
# authentication.conf.
#
# There is an authentication.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an authentication.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# authentication.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[authentication]
* Follow this stanza name with any number of the following attribute/value pairs.
authType = <string>
* Specify which authentication system to use.
* Currently available: Splunk, LDAP, Scripted.
* Defaults to Splunk.
authSettings = <string>
* Key to look up the specific configurations of chosen authentication system.
* <string> is the name of the stanza header [<authSettingsKey>].
* This is used by LDAP and Scripted Authentication.
#####################
# LDAP settings
#####################
[<authSettings-key>]
* Follow this stanza name with any number of the following attribute/value pairs.
host = <string>
* Hostname of LDAP server.
* Be sure that your Splunk server can resolve the host name.
port = <integer>
* Specify the port that Splunk should use to connect to your LDAP server.
* By default, LDAP servers listen on TCP port 389.
pageSize = <integer>
* Determines how many records to return at one time.
* Enter 0 to disable and revert to LDAPv2.
* Defaults to 800.
SSLEnabled = <integer>
* 0 for disabled.
* 1 for enabled.
* See the file $SPLUNK_HOME/etc/openldap/openldap.conf for SSL LDAP settings.
bindDN = <string>
* Bind string for the manager that will be retrieving the LDAP records.
* This user needs to have access to all LDAP users you wish to add to Splunk.
bindDNpassword = <string>
* Password for bindDN user.
groupBaseDN = <string>
* Location of the user groups in LDAP.
* You may provide a ';' delimited list to search multiple trees.
groupBaseFilter = <string>
* This attribute defines the group name.
* Default value is objectclass=*, which should work for most configurations.
* Splunk can also accept a POSIX-style GID as a group base filter.
groupMappingAttribute = <string>
* Name of LDAP group mapping when the list of users in a group do not match the dn of the user.
* Sometimes this is a list of uid attributes and not dn attributes.
* In most cases, you can leave this field blank.
groupMemberAttribute = <string>
* This is usually member or memberOf, depending on whether the memberships are listed in the group entry or the user entry.
* The standard POSIX value is member.
groupNameAttribute = <string>
* Set this only if users and groups are defined in the same tree.
* This is usually cn.
realNameAttribute = <string>
* Name of LDAP user field to map to Splunk's realname field.
* For example, cn.
userBaseDN = <string>
* Location of user records in LDAP.
* Enter a ';' delimited list to search multiple trees.
userBaseFilter = <string>
* The object class you want to filter users on.
* Default value is objectclass=*, which should work for most configurations.
* Or set a specific filter for users:
* For example
userBaseFilter = (|(department=IT)(department=HR))
matches users who are in the IT department or HR department
userNameAttribute = <string>
* NOTE: The username attribute cannot contain whitespace. The username is case sensitive.
* In Active Directory, this is sAMAccountName.
* The value uid should work for most configurations.
failsafeLogin = <string>
* This login allows you to log into Splunk in the event that your LDAP server is unreachable.
* IMPORTANT: This user has admin privileges on the Splunk install.
failsafePassword = <string>
* Default password for your failsafe user.
#####################
# Map roles
#####################
[roleMap]
* Follow this stanza name with the following attribute/value pair.
<RoleName> = <string>
* Map LDAP roles to Splunk role (as defined in authorize.conf).
* This list is semi-colon delimited (no spaces).
#####################
# Scripted authentication
#####################
[<authSettings-key>]
* Follow this stanza name with any number of the following attribute/value pairs.
scriptPath = <string>
* Full path to the script.
* eg $SPLUNK_HOME/etc/system/bin/$MY_SCRIPT.
scriptSearchFilters = 0|1
* Set to 1 to call the script to add search filters.
* 0 disables.
# Cache timing:
# Use these settings to adjust the frequency at which Splunk calls your application.
# Each call has its own timeout specified in seconds. Caching does not occur if not specified.
[cacheTiming]
getUserInfoTTL = <integer>
* Timeout for getUserInfo in seconds.
getUserTypeTTL = <integer>
* Timeout for getUsertype in seconds.
getUsersTTL = <integer>
* Timeout for getUsers in seconds.
userLoginTTL = <integer>
* Timeout for userLogin calls.
getSearchFilterTTL = <integer>
* Timeout for search filters.
authentication.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example authentication.conf. Use this file to configure LDAP or toggle between LDAP
# and Splunk's native authentication system.
#
# To use one or more of these configurations, copy the configuration block into authentication.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# Use Splunk's built-in authentication:
[auth]
authType = Splunk
# Use LDAP
[authentication]
authType = LDAP
authSettings = ldaphost
[ldaphost]
host = ldaphost.domain.com
pageSize = 0
port = 389
SSLEnabled = 0
failsafeLogin = failsafe
failsafePassword = fail
bindDN = cn=Directory Manager
bindDNpassword = password
groupBaseDN = ou=Groups,dc=splunk,dc=com;
groupBaseFilter = (objectclass=*)
groupMappingAttribute = dn
groupMemberAttribute = uniqueMember
groupNameAttribute = cn
realNameAttribute = givenName
userBaseDN = ou=People,dc=splunk,dc=com;
userBaseFilter = (objectclass=*)
userNameAttribute = uid
# You can also set a stanza to map roles you have created in authorize.conf to users in authentication.conf.
[roleMap]
Admin = SplunkAdmins
# Scripted Auth examples
# The following example is for RADIUS authentication:
[authentication]
authType = Scripted
authSettings = script
[script]
scriptPath = $SPLUNK_HOME/bin/python $SPLUNK_HOME/share/splunk/authScriptSamples/radiusScripted.py
scriptSearchFilters = 1
# The following example works with PAM authentication:
[authentication]
authType = Scripted
authSettings = script
[script]
scriptPath = $SPLUNK_HOME/bin/python $SPLUNK_HOME/share/splunk/authScriptSamples/pamScripted.py
scriptSearchFilters = 1
[cacheTiming]
userLoginTTL = 1
searchFilterTTL = 1
getUserInfoTTL = 1
getUserTypeTTL = 1
getUsersTTL = 1
authorize.conf
Use this file to configure roles and granular access controls.
New in version 3.4.8, the 'change_own_password' capability is available to disable password
changes per role. The ability to change one's own password is enabled by default.
authorize.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for creating roles in authorize.conf.
# You can configure roles and granular access controls by creating your own authorize.conf.
# There is an authorize.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an authorize.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# authorize.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[capability::<capability>]
* Define a capability in Splunk.
* This can also be added dynamically by software registering in the system (see restmap.conf.spec).
* Splunk adds most of its capabilities this way so they are enumerated at the end of the file for reference.
* See below for the default list of capabilities.
[role_<roleName>]
<capability_name> = <enabled|disabled>
* Capability attached to this role.
* You can list many of these.
importRoles = <string>
* Semicolon delimited list of other role capabilities that should be imported.
srchFilter = <string>
* Semicolon delimited list of search filters for this Role.
srchTimeWin = <string>
* Maximum time span of a search.
* In seconds.
# The following is a list of Splunk's capabilities. NOTE: This list is subject to change as
# new capabilities are added and old ones are deprecated. If you encounter problems while
# configuring authorize.conf, please contact support@splunk.com.
[role_Admin]
edit_user = change user information in CLI/UI.
edit_search_server = gives you the ability to write any xml config file in $SPLUNK_HOME/etc.
delete_user = delete users in UI/CLI.
user_tab = access users in Splunk Web.
edit_authen = edit authentication configurations.
delete_authen = delete authentication configurations.
sync_auth = sync your auth system with Splunk's settings.
edit_server_config = edit server configurations.
delete_eventtype_tag = delete eventtype tags.
delete_global_search = delete a saved search.
config_management = manage configurations.
access_datastore = allows access to tagging info and license usage info.
change_authentication = this allows you to save authentication settings.
bounce_authentication = reload authentication in the UI/CLI.
target_processor = save settings to Splunk's internal processors
admin_operator = run the admin operator while searching.
delete_by_keyword = access delete search operator.
allow_shutdown = shutdown Splunk.
write_config_splunkd = narrows write config to splunkd.xml, for server tab in Splunk Web.
server_settings_tab = access server settings tab in Splunk Web.
server_control_tab = access server control tab in Splunk Web.
server_auth_config_tab = access server authentication configurations in Splunk Web.
distributed_all_tab = enables the distributed search tab in Splunk Web.
distributed_receive_tab = enables the distributed search receive tab in Splunk Web.
distributed_forward_tab = enables the distributed search forwarding tab in Splunk Web.
distributed_search_tab = enables the distributed search tab in Splunk Web.
license_tab = access license tab.
search_admin_index = search the admin index or any index prefaced with a _.
edit_alert_action = change alert actions.
edit_applications = access the applications section of Splunk Web Admin page.
edit_audit = change audit settings.
edit_roles = change user mappings to roles.
edit_deployment_server = change deployment server settings.
edit_deployment_class_mapping = edit deployment classes.
edit_deployment_client = change deployment client settings.
edit_event_discoverer = change event discovery settings.
edit_field_actions = change field action settings.
edit_index = change index settings.
edit_input_defaults = change default input settings.
edit_batch = change watch/batch input settings.
edit_fifo = change FIFO settings.
edit_filter = configure filter for fschange monitor.
edit_fschange = change file system monitor settings.
edit_monitor = change monitor input settings.
edit_scripted = change scripted input settings.
edit_splunktcp = set distributed data settings over tcp.
edit_splunktcp_ssl = set tcp ssl settings.
edit_ssl = set ssl settings.
edit_tcp = change tcp input settings.
edit_udp = change udp input settings.
edit_prefs = edit prefs.conf.
edit_props = edit props.conf.
edit_transaction_types = edit transactiontypes.conf
edit_transform = edit transforms.conf.
edit_segmenter = edit segmenters.conf.
edit_server = change server settings in server.conf.
edit_source_classifier = change source classification as sourcetype.
edit_admin_tabs = controls editing admin tabs stanza in web.conf.
edit_web_settings = change the web.conf settings.
edit_forward_server = change settings on the forwarding side.
run_script_crawl = run the crawl script.
run_script_input = run input script.
run_script_idxprobe = run idxprobe script
use_file_operator = use the file operator to search of your file system.
request_auth_token = get auth token for other users.
edit_user_searches = edit any saved search.
rest_apps_management = manage applications via the REST endpoint.
rest_properties_get = read REST services/properties.
rest_properties_set = write REST services/properties.
importRoles = Power;User;Everybody
srchFilter =
[role_Power]
edit_global_save_search = edit a shared saved search.
schedule_search = schedule a search.
delete_global_save_search = delete a shared saved search.
create_alert = schedule an alert for a scheduled search.
start_alert = run alerts for a scheduled search.
start_global_alert = run a shared alert for a scheduled search.
stop_alert = disable an alert.
stop_global_alert = disable a shared alert.
edit_role_search = save a search to a specific role.
allow_livetail = display live tail in the UI.
edit_tags = set tags for events.
run_script_collect = run collect script.
importRoles = User;Everybody
srchFilter =
[role_User]
edit_local_search = change only your own searches.
savesearch_tab = access saved searches via Splunk Web.
get_metadata = access metadata for metadata search processor.
get_typeahead = allow typeahead.
edit_eventtype = configure eventtypes via eventtype.conf.
get_user_prefs = retrieve your own user prefs.
set_user_prefs = write your own prefs.
get_property_map = lets you write to a conf file.
access_datamap = export and import global data via the CLI.
get_config_by_type = access configurations.
get_config_file = access any configuration file.
search = run a search.
# Script running capabilities
list_inputs = list inputs.
list_saved_searches = list saved searches -- see your own and those shared with your role.
run_web_script_fields = Interactive field extraction script.
run_web_script_surrounding_events = enabled
# These scripts are located in $SPLUNK_HOME/etc/searchscripts/
run_script_createrss = enabled
run_script_diff = enabled
run_script_gentimes = enabled
run_script_head = enabled
run_script_iplocation = enabled
run_script_loglady = enabled
run_script_marklar = enabled
run_script_overlap = enabled
run_script_reportcache = enabled
run_script_runshellscript = enabled
run_script_sendemail = enabled
run_script_transpose = enabled
run_script_uniq = enabled
run_script_windbag = enabled
run_script_mocknodegraph = enabled
run_script_xmlkv = enabled
run_script_xmlunescape = enabled
importRoles = Everybody
srchFilter =
[role_Everybody]
srchFilter =
authorize.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example authorize.conf. Use this file to configure roles and capabilities.
#
# To use one or more of these configurations, copy the configuration block into authorize.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[role_Ninja]
edit_save_search = enabled
schedule_search = enabled
edit_eventtype = enabled
edit_role_search = enabled
edit_local_search = enabled
savesearch_tab = enabled
edit_tags = enabled
importRoles = User;Everybody
srchFilter = host=foo
# This creates the role Ninja, which inherits capabilities from the default roles User and Everybody.
# Ninja has almost the same capabilities as Power, except cannot create alerts (only saved searches).
# Also, Ninja is limited to searching on host=foo.
commands.conf
commands.conf
Use this file to create custom search commands.
commands.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for creating search commands for
# any custom search scripts created. Add your custom search script to $SPLUNK_HOME/etc/searchscripts/
# or $SPLUNK_HOME/apps/MY_APP/bin/. For the latter, put a custom commands.conf in
# $SPLUNK_HOME/apps/MY_APP. For the former, put your custom commands.conf
# in $SPLUNK_HOME/etc/system/local/.
# There is a commands.conf in $SPLUNK_HOME/etc/system/default/. For examples, see
# commands.conf.example. You must restart Splunk to enable configurations.
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[$STANZA_NAME]
* Each stanza represents a search command; the command is the stanza name.
* The stanza name invokes the command in the search language.
* Set the following attributes/values for the command. Otherwise, Splunk uses the defaults.
type = <string>
* Type of script: python, perl
* Defaults to python.
filename = <string>
* Name of script file for command.
* <stanza-name>.pl for perl.
* <stanza-name>.py for python.
streaming = <true/false>
* Specifies whether the command is streamable.
* Defaults to false.
maxinputs = <integer>
* Maximum number of events that can be passed to the command for each invocation.
* 0 for no limit.
* Defaults to 50000.
passauth = <true/false>
* If set to true, passes an authentication token on the start of input.
* Defaults to false.
enableheader = <true/false>
* Indicates whether or not your script expects header information.
* Currently, the only thing in the header information is an auth token.
* Defaults to true.
commands.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
# This is an example commands.conf. Use this to configure custom search commands.
#
# To use one or more of these configurations, copy the configuration block into commands.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# NOTE: Add your custom search script to $SPLUNK_HOME/etc/searchscripts/
# or $SPLUNK_HOME/apps/MY_APP/bin/. For the latter, put a custom commands.conf in
# $SPLUNK_HOME/apps/MY_APP/. For the former, put your custom commands.conf
# in $SPLUNK_HOME/etc/system/local/.
[foo]
FILENAME = foo.pl
type = perl
[black_smoke]
FILENAME = black_smoke.py
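Beyond the shipped examples above, a streaming command spelled out with the optional attributes from commands.conf.spec might look like the following minimal sketch; the command name and script file are assumptions:
# Hypothetical streaming Python command; place mystream.py in $SPLUNK_HOME/etc/searchscripts/.
[mystream]
type = python
filename = mystream.py
streaming = true
maxinputs = 0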
crawl.conf
crawl.conf
crawl.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# The following are example crawl.conf configurations. Configure properties for crawl.
#
# To use one or more of these configurations, copy the configuration block into
# crawl.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[simple_file_crawler]
bad_directories_list = bin, sbin, boot, mnt, proc, tmp, temp, home, mail, .thumbnails, cache, old
bad_extensions_list = mp3, mpg, jpeg, jpg, m4, mcp, mid
bad_file_matches_list = *example*, *makefile, core.*
packed_extensions_list = gz, tgz, tar, zip
collapse_threshold = 10
days_sizek_pairs_list = 3-0,7-1000, 30-10000
big_dir_filecount = 100
index = main
max_badfiles_per_dir = 100
crawl.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring crawl.
#
# There is a crawl.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a crawl.conf in $SPLUNK_HOME/etc/system/local/. For help, see
# crawl.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# Set of attribute-values used by crawl.
#
# If an attribute ends in _list, the form is:
#
# attr = val, val, val, etc.
#
# The space after each separating comma is necessary so that "," itself can appear within a value, as in the "*,v" entry of bad_file_matches_list.
[default]
logging = <warn | error | info | debug>
* Set crawl's logging level -- affects the logs in
* Defaults to warn.
[crawlers]
* This stanza enumerates all the available crawlers.
* Follow this stanza name with a list of crawlers.
crawlers_list = <comma-separated list of crawlers>
* Create the crawlers below, in a stanza with the crawler name as the stanza header.
[file_crawler]
* Set crawler-specific attributes under this stanza header.
* Follow this stanza name with any of the following attributes.
* The stanza name is the crawler name for crawlers_list (above).
root = <semi-colon separated list of directories>
* Set a list of directories this crawler should search through.
* Defaults to /;/Library/Logs
bad_directories_list = <comma-separated list of bad directories>
* List any directories you don't want to crawl.
* Defaults to:
bin, sbin, boot, mnt, proc, tmp, temp, dev, initrd, help, driver, drivers, share, bak, old, lib, include, doc, docs, man, html, images, tests, js,
dtd, org, com, net, class, java, resource, locale, static, testing, src, sys, icons, css, dist, cache, users, system, resources, examples, gdm, manual,
spool, lock, kerberos, .thumbnails, libs, old, manuals, splunk, splunkpreview, mail, resources, documentation, applications, library, network,
automount, mount, cores, lost\+found, fonts, extensions, components, printers, caches, findlogs, music, volumes, libexec,
bad_extensions_list = <comma-separated list of file extensions to skip>
* List any file extensions and crawl will skip files that end in those extensions.
* Defaults to:
0t, a, adb, ads, ali, am, asa, asm, asp, au, bak, bas, bat, bmp, c, cache, cc,
cg, cgi, class, clp, com, conf, config, cpp, cs, css, csv, cxx, dat,
doc, dot, dvi, dylib, ec, elc, eps, exe, f, f77, f90, for, ftn, gif, h, hh,
hlp, hpp, hqx, hs, htm, html, hxx, icns, ico, ics, in, inc, jar, java, jin,
jpeg, jpg, js, jsp, kml, la, lai, lhs, lib, license, lo, m, m4, mcp, mid, mp3,
mpg, msf, nib, nsmap, o, obj, odt, ogg, old, ook, opt, os, os2, pal, pbm, pdf,
pdf, pem, pgm, php, php3, php4, pl, plex, plist, plo, plx, pm, png, po, pod,
ppd, ppm, ppt, prc, presets, ps, psd, psym, py, pyc, pyd, pyw, rast, rb, rc,
rde, rdf, rdr, res, rgb, ro, rsrc, s, sgml, sh, shtml, so, soap, sql, ss, stg,
strings, tcl, tdt, template, tif, tiff, tk, uue, v, vhd, wsdl, xbm, xlb, xls,
xlw, xml, xsd, xsl, xslt, jame, d, ac, properties, pid, del, lock, md5, rpm,
pp, deb, iso, vim, lng, list
bad_file_matches_list = <comma-separated list of regex>
* Crawl applies the specified regex and skips files that match the patterns.
* There is an implied "$" (end of file name) after each pattern.
* Defaults to:
*~, *#, *,v, *readme*, *install, (/|^).*, *passwd*, *example*, *makefile, core.*
packed_extensions_list = <comma-separated list of extensions>
* Specify extensions of compressed files to include.
* Defaults to:
bz, bz2, tbz, tbz2, Z, gz, tgz, tar, zip
collapse_threshold = <integer>
* Specify the minimum number of files a source must have to be considered a directory.
* Defaults to 1000.
days_sizek_pairs_list = <comma-separated hyphenated pairs of integers>
* Specify a comma-separated list of age (days) and size (kb) pairs to constrain what files are crawled.
* For example: days_sizek_pairs_list = 7-0, 30-1000 tells Splunk to crawl only files last
modified within 7 days and at least 0kb in size, or modified within the last 30 days and at least 1000kb in size.
* Defaults to 30-0.
big_dir_filecount = <integer>
* Skip directories containing more than <integer> files.
* Defaults to 10000.
index = <$INDEX>
* Specify index to add crawled files to.
* Defaults to main.
max_badfiles_per_dir = <integer>
* Specify how far to crawl into a directory for files.
* Crawl excludes a directory if it doesn't find valid files within the specified max_badfiles_per_dir.
* Defaults to 100.
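To tie these stanzas together, a minimal sketch that registers a single crawler and points it at custom root directories might look like the following; the directories are illustrative assumptions:
# Hypothetical crawl.conf fragment; root directories are example values.
[crawlers]
crawlers_list = file_crawler
[file_crawler]
root = /var/log;/opt/app/logs
index = main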
decorations.conf
decorations.conf
Use this file to configure event decorations in Splunk Web.
decorations.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure decorating audit events
# in decorations.conf.
#
# NOTE: You can only decorate audit events with this file. To configure decorations for
# other events, please see prefs.conf.spec.
#
# There is a decorations.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a decorations.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# decorations.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[audittrail]
* This stanza turns on decorations.
* Follow this stanza name with any number of the following attribute/value pairs.
* Each attribute maps to any tag in 'prefs.conf' that starts with the word 'decoration_'.
valid = decoration_$PREFSTAGv
* Maps to the decoration tag for an audit event that is in sequence and has not been tampered with.
* $PREFSTAGv is the name of the tag configured for valid events in prefs.conf.
gap = decoration_$PREFSTAGg
* Maps to the decoration tag for an audit event that has an event before it that is out of
sequence or missing.
* $PREFSTAGg is the name of the tag configured for gap events in prefs.conf.
tampered = decoration_$PREFSTAGt
* Maps to the decoration tag for an audit event that has been changed such that the
cryptographic signature does not match.
* $PREFSTAGt is the name of the tag configured for tampered events in prefs.conf.
cantValidate = decoration_$PREFSTAGc
* Maps to events where no signature exists, or the signature is corrupt and cannot be decrypted,
so it cannot be validated.
* $PREFSTAGc is the name of the tag configured for cantValidate events in prefs.conf.
decorations.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example decorations.conf. Use this file to configure audit event decorations.
# NOTE: You can only decorate audit events with this file. To configure decorations for
# other events, please see prefs.conf.spec.
#
# To use one or more of these configurations, copy the configuration block into decorations.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# The left side must be these values.
# The right side maps to decorations in prefs.conf.
[audittrail]
valid = decoration_my_valid
gap = decoration_my_gap
tampered = decoration_my_tampered
cantValidate = decoration_my_cantvalidate
deployment.conf
deployment.conf
Deployment.conf contains settings for configuring both deployment servers and clients.
deployment.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values for configuring deployment settings for both
# a Deployment Server and Deployment Clients in deployment.conf.
#
# There is NO DEFAULT deployment.conf.
# To set custom configurations, place a deployment.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see deployment.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#******************************************************************************
# SERVER SETTINGS: These settings are for the Deployment Server (not client).
# The Deployment Server is the Splunk instance that pushes configs to Deployment Clients.
#******************************************************************************
[distributedDeployment]
serverClassPath=/opt/splunk/etc/modules/distributedDeployment/classes
* The directory which contains the base of your server class configuration directories.
* Defaults to $SPLUNK_HOME/etc/modules/distributedDeployment/classes
[distributedDeployment-classMaps]
www.* = web,apache
10.1.*.1 = osx
* Map IP ranges or DNS entries to server classes.
* You can put a wildcard (*) anywhere in the string.
#******************************************************************************
# MULTICAST: The following settings are only for multicast configuration (as opposed to polling).
# Only use this group of tags if you are using multicast to notify deployment clients.
[distributedDeployment-multicast]
* Set multicast configuration options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
sendMulticast = <true/false>
* To use multicast, set this to true.
* Defaults to false.
multicastUri = <IP:Port>
* Which multicast group to send to.
* Only used if 'sendMulticast = true'.
* Multicast is disabled if this field is not set.
* No default.
interfaceIP = <IP Address>
* The IP address of the interface to send multicast packets on.
* Defaults to whatever the kernel picks (usually sufficient).
frequency = <integer>
* How often (in seconds) to send multicast packets.
* Defaults to 30 seconds.
useDNS = <true/false>
* Look up host name.
* Defaults to false.
#******************************************************************************
# CLIENT SETTINGS: These settings are for Deployment Clients.
# Deployment Clients are Splunk instances that receive configs from the Deployment Server.
#******************************************************************************
# NOTE: You do not need to write these settings in deployment.conf if you enable the client
# from the CLI using ./splunk set deploy. These settings are written out automatically.
[deployment-client]
deploymentServerUri = <IP:Port, IP2:Port2, etc>
* List of deployment servers to poll.
* Use this setting concurrently with multicast, but typically it is used instead of multicast.
pollingFrequency = <integer>
* How often (in seconds) to poll each deployment server listed in 'deploymentServerUri'.
* Only used if deploymentServerUri is specified.
maxBackoff = <integer>
* Back off polling the deployment server at a random number between 0 and <integer> (in seconds).
* The more deployment clients controlled by a single deployment server, the higher this number should be.
* maxBackoff effectively "smooths" the number of concurrent requests on the server.
serverClasses = <comma separated list>
* List of server classes that this deployment client is a member of.
* Usually set on the server side, under [distributedDeployment-classMaps].
* If not set on the server side, specify membership here.
#******************************************************************************
# Multicast configuration -- these settings are only necessary if you are using multicast.
multicastUri = <IP:Port, IP2:Port2, etc>
* A comma-separated list of multicast addresses for deployment server instructions.
* Each deployment server needs to be in control over a unique set of server classes.
* Typically there is one entry in this list, or it is left blank.
mcastInterfaceIp = <IP Address>
* Use the physical interface bound to this IP address to listen to multicasts.
* Only set this if using multicast.
deployment.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example deployment.conf. Use this file to configure deployment.
#
# There is NO DEFAULT deployment.conf.
#
# To use one or more of these configurations, copy the configuration block into deployment.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#################
# Server settings
[distributedDeployment]
serverClassPath=/opt/splunk/etc/modules/distributedDeployment/classes
# Path to the server class configuration files.
# By default, this example directory does not exist -- you must create your own.
# Multicast configuration
[distributedDeployment-multicast]
sendMulticast=true
multicastUri=225.0.0.39:9999
# Enable multicast
# Class map settings
[distributedDeployment-classMaps]
www.* = web,apache
10.1.1.2* = osx
# Map server classes to host names and IP addresses
#################
# Client settings
[deployment-client]
multicastUri=225.0.0.39:9999
# This stanza enables a distributed deployment client which listens on the same
# multicast IP as the above server sends.
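The example above uses multicast. A polling-based client, sketched from the client attributes in deployment.conf.spec, would instead list the deployment servers to poll; the server address and intervals below are illustrative assumptions:
# Hypothetical polling client -- server address, polling frequency, and backoff are example values.
[deployment-client]
deploymentServerUri = 10.1.1.5:8089
pollingFrequency = 300
maxBackoff = 60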
distsearch.conf
distsearch.conf
Use distsearch.conf to configure distributed search.
distsearch.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure distributed search.
#
# There is NO DEFAULT distsearch.conf.
#
# To set custom configurations, place a distsearch.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see distsearch.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[distributedSearch]
* Set distributed search configuration options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not set any attribute, Splunk uses the default value (if there is one listed).
disabled = <true | false>
* Toggle distributed search off and on.
* Defaults to false (your distributed search stanza is enabled by default).
heartbeatFrequency = <in seconds>
* Heartbeat in seconds.
* 0 disables all heartbeats.
* If the heartbeat is disabled, no other Splunk server is able to auto-discover this instance.
* Defaults to 2.
heartbeatMcastAddr = <IP address>
* Set a multicast address.
* Defaults to 224.0.0.37.
heartbeatPort = <port>
* Set heartbeat port.
* Defaults to 60.
serverTimeout = <in seconds>
* How long to wait for a connection to a server.
* If a connection occurs, a search times out in 10x this value.
* For example, if set to 10 seconds, the maximum search allowed is 100 seconds.
* This setting works in tandem with 'removedTimedOutServers.'
* Defaults to 10.
statusTimeout = <in seconds>
* Set how long to wait for a server to return its status.
* Increase this number if your peered servers are slow or if the server name disappears from the
SplunkWeb widget.
removedTimedOutServers = <true | false>
* If true, remove a server connection that cannot be made within 'serverTimeout.'
* If false, every call to that server attempts to connect.
* NOTE: This may result in a slow user interface.
checkTimedOutServersFrequency = <in seconds>
* This tag is ONLY relevant if 'removedTimedOutServers' is set to true.
* If 'removedTimedOutServers' is false, this attribute is ignored.
* Rechecks servers at this frequency (in seconds).
* If this is set to 0, then no recheck will occur.
* Defaults to 60.
autoAddServers = [True | False]
* If this tag is set to 'true', this node will automatically add all discovered servers.
skipOurselves = [True | False]
* If this is set to 'true', then this server will NOT participate as a server in any search or
other call.
* This is used for building a node that does nothing but merge the results from other servers.
* Defaults to 'false.'
ttl = <integer>
* Time To Live.
* Increasing this number allows the UDP multicast packets to spread beyond the current subnet
to the specified number of hops.
* NOTE: This only will work if routers along the way are configured to pass UDP multicast packets.
* Defaults to 1 (this subnet).
servers =
* Initial list of servers.
* If operating completely in 'autoAddServers' mode (discovering all servers), there is no need
to have any servers listed here.
blacklistNames =
* List of server names that you do not want to peer with.
* Server names are the 'server name' that is created for you at startup time.
blacklistURLs =
* Comma-delimited list of blacklisted discovered servers.
* You can black list on server name (above) or server URI (x.x.x.x:port).
distsearch.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example distsearch.conf. Use this file to configure distributed search. For all
# available attribute/value pairs, see distsearch.conf.spec.
#
# There is NO DEFAULT distsearch.conf.
#
# To use one or more of these configurations, copy the configuration block into distsearch.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[distributedSearch]
heartbeatFrequency = 10
servers = 192.168.1.1:8059,192.168.1.2:8059
blacklistNames = the-others,them
blacklistURLs = 192.168.1.3:8059,192.168.1.4:8059
# This entry distributes searches to 192.168.1.1:8059,192.168.1.2:8059.
# The server sends a heartbeat every 10 seconds.
# There are four blacklisted instances, listed across blacklistNames and blacklistURLs.
# Attributes not set here will use the defaults listed in distsearch.conf.spec.
eventdiscoverer.conf
eventdiscoverer.conf
eventdiscoverer.conf controls whether and how Splunk attempts to automatically learn new event
types.
eventdiscoverer.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
# This file contains possible attributes and values you can use to configure event discovery through
# the search command "typelearner."
#
# There is an eventdiscoverer.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an eventdiscoverer.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# eventdiscoverer.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
ignored_keywords = <comma-separated list of terms>
* Terms in this list are never considered for defining an event type.
* If you find that eventtypes have terms you do not want considered (e.g., "mylaptopname"), add
that term to this list.
* Default = "sun, mon, tue,..." (see $SPLUNK_HOME/etc/system/default/eventdiscover.conf).
ignored_fields = <comma-separated list of fields>
* Similar to ignored_keywords, except fields as defined in Splunk.
* Defaults include time-related fields that would not be useful for defining an event type.
eventdiscoverer.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example eventdiscoverer.conf. These settings are used to control the discovery of
# common eventtypes used by the typelearner search command.
#
# To use one or more of these configurations, copy the configuration block into eventdiscoverer.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# Terms in this list are never considered for defining an eventtype.
ignored_keywords = foo, bar, application, kate, charlie
# Fields in this list are never considered for defining an eventtype.
ignored_fields = pid, others, directory
eventtypes.conf
eventtypes.conf
eventtypes.conf stores definitions and tags for event types, whether they were discovered by
Splunk automatically or defined by users in Splunk Web.
Note: To disable an event type in eventtypes.conf, do one of the following (see the sketch after this note):
Delete the event type from $SPLUNK_HOME/etc/system/default/eventtypes.conf.
Add the tag disabled = 1 to the event type's entry.
Set disabled = 1 in $SPLUNK_HOME/etc/system/local/eventtypes.conf for
any entry in ../default/eventtypes.conf to override the default entry.
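For example, the local override approach might look like the following minimal sketch; the event type name "webaccess-error" is an assumption, not a shipped default:
# Hypothetical entry in $SPLUNK_HOME/etc/system/local/eventtypes.conf.
# Overrides and disables a default event type assumed to be named "webaccess-error".
[webaccess-error]
disabled = 1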


eventtypes.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains all possible attributes and value pairs for an eventtypes.conf file.
# Use this file to configure event types and their properties. You can also pipe any search
# to the "typelearner" command to create event types. Event types created this way will be written
# to $SPLUNK_HOME/etc/system/local/eventtypes.conf.
#
# There is an eventtypes.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an eventtypes.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# eventtypes.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[$EVENTTYPE]
* Header for the event type
* $EVENTTYPE is the name of your event type.
* You can have any number of event types, each represented by a stanza and any number
of the following attribute/value pairs.
* NOTE: If the name of the event type includes field names surrounded by the percent
character (e.g. "%$FIELD%") then the value of $FIELD is substituted into the event type
name for that event. For example, an event type with the header [cisco-%code%] that has
"code=432" becomes labeled "cisco-432".
disabled = <1 or 0>
* Toggle event type on or off.
* Set to 1 to disable.
search = <string>
* Search terms for this event type.
* For example: error OR warn.
tags = <string>
* Space separated words that are used to tag an event type.
isglobal = <1 or 0>
* Toggle whether event type is shared.
* If isglobal is set to 1, everyone can see/use this event type.
* Defaults to 1.
eventtypes.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example eventtypes.conf. Use this file to configure custom eventtypes.
#
# To use one or more of these configurations, copy the configuration block into eventtypes.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# The following example makes an eventtype called "error" based on the search "error OR fatal."
[error]
search = error OR fatal
tags = error problem alert important
# The following example makes an eventtype template because it includes a field name
# surrounded by the percent character (in this case "%code%").
# The value of "%code%" is substituted into the event type name for that event.
# For example, if the following example event type is instantiated on an event that has a
# "code=432," it becomes "cisco-432".
[cisco-%code%]
search = cisco
field_actions.conf
field_actions.conf
field_actions.conf controls what actions are available in SplunkWeb inline with events.
field_actions.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute and value pairs for creating field actions: drop-down
# actions in SplunkWeb. You can configure field actions by creating your own field_actions.conf.
#
# There is a field_actions.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a field_actions.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# field_actions.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# NOTE: You must clear your browser cache.
# In Firefox, go to Tools > Clear Private Data >
# If you are experiencing errors, check $SPLUNK_HOME/var/log/splunk/web_service.log.
[<field_action_name>]
* Set field action options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
metaKeys = <string>
* Comma-separated list of metadata keys that are required for the action to display in Splunk Web.
* Keys listed in metaKeys are then usable in the uri field.
uri = <string>
* URI, either beginning with http:// or https://.
* Alternately, for URLs in SplunkWeb, beginning with "/".
* This URI will load when the user clicks on the action in SplunkWeb.
target = <string>
* Only meaningful if URI is present.
* If set to _self, the URI loads in the current window.
* If set to _blank, URI opens in a new window.
* If set to fooWindow, the URI opens in any window named fooWindow or in a new window if none exists.
method = <string>
* The HTTP method that should be used with the given URI.
* Can be set to either GET or POST.
* Only meaningful if URI is present.
payload = <string>
* Only meaningful if method is set to POST.
* This attribute allows the user to customize the values passed.
* IMPORTANT: Key value pairs are separated with an &
* For example, event={$_raw}&myhost={$host}.
term = <string>
* An alternative to URI.
* If present, the action becomes a search in Splunk.
* Assuming you have metaKeys rhost and ruser, you can search term=<string> {$rhost} {$ruser}.
* The search string will run whenever a user clicks the field action.
alwaysReplace = <true/false>
* For use with the term field.
* If present and set to true, term will replace the current search instead of appending to it.
field_actions.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example field_actions.conf. Use this file to configure field actions.
#
# To use one or more of these configurations, copy the configuration block into
# fields_actions.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# NOTE: Splunk Web must be restarted when you make changes to this file.
# Additionally, you must clear your browser cache. In Firefox this is Tools > Clear Private Data >
# If you are experiencing errors, check $SPLUNK_HOME/var/log/splunk/web_service.log.
# This example searches an IP on Google:
[googleExample]
metaKeys=ip
uri=http://google.com/search?q={$ip}
label=Google this ip
method=GET
# This example does a reverse look up on an IP address:
[WAN_ReverseLookup]
metaKeys=ip
uri=http://www.networksolutions.com/enhancedWHOIS.do?queryString={$ip}&method-submit=&successPage=%2Fwhois%2Fresults.jsp&errorPage=%2Fwhois%2Findex.jsp&fatalErrorPage=%2Fcommon%2Ferror.jsp&queryType=ip&STRING2.x=26&STRING2.y=12&currentPage=%2Fwhois%2Findex.jsp
label=Reverse look up this IP
# This example jumps to a bug in Jira:
[Jira]
metaKeys=jira
uri=http://10.1.1.10:8080/browse/SPL-{$jira}
label=Go to Bug in Jira
target=_blank
# This example goes to commit in Perforce web:
[P4Web]
metaKeys=p4
uri=http://perforce:8800/@md=d&cd=//&c=dmm@/{$p4}?ac=10
label=Go to commit in P4Web
# This example performs a geolocation on an IP address:
[IP2Location]
metaKeys=ip
uri=http://www.ip2location.com/{$ip}
label=Geolocate this IP
# This example runs a custom search in SplunkWeb:
[some_custom_search]
metaKeys = ruser,rhost
term=authentication failure | filter ruser={$ruser} rhost={$rhost}
label=Search for other breakin attempts by this user
alwaysReplace=true
# This example looks up your event on SplunkBase
[SplunkBaseLookup]
metaKeys=_raw, host
uri=http://www.splunkbase.com/
label=Search SplunkBase
target=splunkbase
method=POST
payload= event={$_raw}&myhost={$host}
# Links for other useful field actions:
#-- IP ADDRESS LINKS
#http://www.dnsstuff.com/tools/ptr.ch?ip={$ip}
#http://www.dnsstuff.com/tools/tracert.ch?ip={$ip}
#http://www.completewhois.org/cgi-bin/whois.cgi?query_type=auto&ip_whoislookup_cyberabuse=ON&ip_nameservers_hostlookup=ON&query={$ip}
#http://www.senderbase.org/search?oOrder=lastday%20desc&searchString={$ip}%2F24
#http://spamcop.net/w3m?action=checkblock&ip={$ip}
#http://www.google.com/search?q={$ip} -- sometimes useful to do a quick search on an IP address on Google
#http://groups.google.com/groups?q={$ip} -- you can search groups, blogs, whatever...
#http://spamcop.net/sc?track={$ip}
#http://clez.net/net.whois?ip={$ip}&t=ip
#http://www.melissadata.com/Lookups/iplocation.asp?ipaddress{$ip}
#-- HOST LINKS
#http://www.statsaholic.com/nagios.org?y=r&r=1y&z=10
#-- OTHER IDEAS
#windows eventID link http://www.eventid.net/display.asp?eventid=5781&source=netlogon
#IP2Location http://www.ip2location.com/demo.aspx
fields.conf
fields.conf
Add entries to fields.conf for any fields you create.
fields.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.2
#
# This file contains possible attribute and value pairs for configuring additional information for fields.
# Use this file if you are creating a field at index time (not advised).
# Also, use this file to indicate that your configured field's value is a smaller part of a token.
# For example, your field's value is "123" but it occurs as "foo123" in your event.
#
# There is a fields.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a fields.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# fields.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<field name>]
* Name of the field you're configuring.
* Follow this stanza name with any number of the following attribute/value pairs.
TOKENIZER = <regular expression>
* A regular expression that indicates how the field can take on multiple values at the same time.
* Use this setting to configure multi-value fields (http://www.splunk.com/doc/current/admin/MultivalueFields).
* If empty, the field can only take on a single value.
* Otherwise, the first group is taken from each match to form the set of values.
* This setting is used by search/where (the search command), the summary and XML outputs of the asynchronous search API, and by the top, timeline and stats commands.
* Default to empty.
INDEXED = true | false
* Indicate whether a field is indexed or not.
* Set to true if the field is indexed.
* Set to false for fields extracted at search time (the majority of fields).
* Defaults to false.
INDEXED_VALUE = true | false
* Set INDEXED_VALUE to true if the value is in the raw text of the event.
* Set it to false if the value is not in the raw text of the event.
* Setting this to true expands any search for key=value into a search of value AND key=value (since value is indexed).
* Defaults to true.
* NOTE: You only need to set INDEXED_VALUE if INDEXED = false.
fields.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example fields.conf. Use this file to configure dynamic field extractions.
#
# To use one or more of these configurations, copy the configuration block into
# fields.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# The following example can be used with the IMAP bundle (found here:
# http://www.splunkbase.com/addons/All/Technologies/Mail/addon:IMAP+Addon .)
# These tokenizers result in the values of To, From and Cc treated as a list,
# where each list element is an email address found in the raw string of data.
[To]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
[From]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
[Cc]
TOKENIZER = (\w[\w.\-]*@[\w.\-]*\w)
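The spec above also describes INDEXED and INDEXED_VALUE. A minimal sketch for a hypothetical search-time field whose value occurs only inside a larger token (the "foo123" case from the spec) might look like this; the field name err_code is an assumption:
# Hypothetical field whose value (e.g. "123") only appears inside a larger token such as "foo123",
# so searches on the value alone are not expanded against the index.
[err_code]
INDEXED = false
INDEXED_VALUE = false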
indexes.conf
indexes.conf
Indexes.conf controls index settings including archiving, retirement, path and tuning parameters.
To edit configurations for your local Splunk server, use
$SPLUNK_HOME/etc/system/local/indexes.conf.
You can create this file by copying examples from
$SPLUNK_HOME/etc/system/README/indexes.conf.example.
Never edit files in the default directory $SPLUNK_HOME/etc/system/default, as your changes
may be overwritten in an upgrade.
indexes.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains all possible options for an indexes.conf file. Use this file to configure
# Splunk's indexes and their properties.
#
# There is an indexes.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an indexes.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# indexes.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# CAUTION: You can drastically affect your Splunk installation by changing these settings.
# Consult technical support (support@splunk.com) if you are not sure how to configure this file.
#
# DO NOT change the attribute QueryLanguageDefinition without consulting technical support.
#******************************************************************************
# GLOBAL OPTIONS
# These options affect every index
#******************************************************************************
sync = <integer>
* The index processor syncs events every <integer> number of events.
* Must be non-negative.
* Set to 0 to disable.
* Defaults to 0.
defaultDatabase = <database name>
* If no index is specified during search, Splunk searches default database.
* Also the database displays by default on the homepage.
* Defaults to main.
queryLanguageDefinition = <path to file>
* The path to the search language definition file.
* DO NOT EDIT THIS SETTING.
* Defaults to $SPLUNK_HOME/etc/searchLanguage.xml.
blockSignatureDatabase = <database name>
* This is the database that stores block signatures of events.
* Defaults to _blocksignature.
#******************************************************************************
# PER INDEX OPTIONS
# These options may be set under an [$INDEX] entry
#******************************************************************************
disabled = true | false
* Toggle your index entry off and on.
* Set to true to disable an index.
* Defaults to false.
homePath = <path on server>
* The path that contains the hot and warm databases and fields for the index.
* Splunkd keeps a file handle open for warm databases at all times.
* CAUTION: Path MUST be writable.
coldPath = <path on server>
* The path that contains the cold databases for the index.
* Cold databases are opened as needed when searching.
* CAUTION: Path MUST be writable.
thawedPath = <path on server>
* The path that contains the thawed (resurrected) databases for the index.
# The following options can be set either per index or at the top of the file as defaults for all indexes.
# Defaults set at the top of the file are overridden if set on a per-index basis.
maxWarmDBCount = <integer>
* The maximum number of warm DB_N_N_N directories.
* All warm DBs are in the <homePath> for the index.
* Warm DBs are kept in open state.
* Defaults to 300.
maxColdDBCount = <integer>
* The maximum number of open cold databases at any given time.
* THIS IS NOT the total number of cold databases.
* During search, splunkd keeps an LRU cache of all open cold DBs; this number controls the size of that cache.
* Defaults to 10.
maxTotalDataSizeMB = <integer>
* The maximum size of an index (in MB).
* If an index grows larger, the oldest data is frozen.
* Defaults to 500000.
rotatePeriodInSecs = <integer>
* Frequency (in seconds) to check if a new hot DB needs to be created.
* Also the frequency to check if there are any cold DBs that need to be frozen.
* Defaults to 60.
frozenTimePeriodInSecs = <integer>
* Number of seconds after which indexed data rolls to frozen.
* If you do not specify a coldToFrozenScript, this data is erased.
* IMPORTANT: Every event in the DB must be older than frozenTimePeriodInSecs before the DB will roll.
* Once that is true, the DB is frozen the next time splunkd checks.
* Defaults to 188697600.
warmToColdScript = <script>
* Specify a script to run when moving data from warm to cold.
* The script must accept two variables:
* First: the warm directory to be rolled to cold.
* Second: the destination in the cold path.
* You only need to set this if you store warm and cold DBs on separate partitions.
* Please contact Splunk Support if you need help configuring this setting.
* Defaults to empty.
coldToFrozenScript = <script>
* Specify an archiving script by changing <script>.
* Splunk ships with two default archiving scripts (or create your own):
* compressedExport.sh - Export with tsidx files compressed as gz.
* flatfileExport.sh - Export as a flat text file.
* Define <$script> paths relative to $SPLUNK_HOME/bin
* WINDOWS users use this notation:
coldToFrozenScript = <script> "$DIR"
* <script> can be either compressedExport.bat or flatfileExport.bat
compressRawdata = true | false
* If set to true, Splunk writes raw data out as compressed gz files.
* If set to false, Splunk will write data to an uncompressed raw file.
* Defaults to true.
maxConcurrentOptimizes = <integer>
* The number of concurrent optimize processes that can be run against the hot DB.
* This number should be increased if:
1. There are always many small tsidx files in the hot DB.
2. After rolling, there are many tsidx files in warm or cold DB.
waitForOptimize = <integer>
* Wait to roll until optimize processes finish.
* If you set to 0, Splunk does not wait for the optimize to finish before rolling.
* If you are seeing a big pause in indexing or searching during rolling set this to 0.
maxDataSize = <integer>
* The maximum size in MBs of the hot DB.
* The hot DB will grow to this size before it is rolled out to warm.
* Do not increase the default setting unless Splunk is running in 64bit mode.
* Defaults to 750.
indexThreads = <integer>
* The number of extra threads to use during indexing.
* This number should not be set higher than the number of processors in the box minus one.
* If splunkd is also doing parsing and aggregation, the number should be lower than the total number of processors minus two.
* Defaults to 0.
maxMemMB = <integer>
* The amount of memory to allocate for indexing.
* This amount of memory will be allocated PER INDEX THREAD.
* OR If indexThreads is set to 0, once per index.
* IMPORTANT: Calculate this number carefully.
* splunkd will crash if you set this number higher than what is available.
* Defaults to 50.
blockSignSize = <integer>
* Controls how many events make up a block for block signatures.
* If it is set to 0 block signing is disabled for this index.
* Defaults to 0.
#******************************************************************************
# Advanced memory tuning parameters. Do not alter these without contacting Splunk Support.
# Use maxMemMB to control memory usage.
#******************************************************************************
maxTermChars = <integer>
* Defaults to 2097152.
maxTerms = <integer>
* Defaults to 131072.
maxPostings = <integer>
* Defaults to 2097152.
maxValues = <integer>
* Defaults to 65536.
indexes.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example indexes.conf. Use this file to configure indexing properties.
#
# To use one or more of these configurations, copy the configuration block into
# indexes.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# The following example sets up a new default index, called "hatch."
defaultDatabase = hatch
[hatch]
homePath = $SPLUNK_DB/hatchdb/db
coldPath = $SPLUNK_DB/hatchdb/colddb
thawedPath = $SPLUNK_DB/hatchdb/thaweddb
indexThreads = 1
# Max amount of physical memory (in megabytes) to use for a given index
maxMemMB = 200
maxDataSize = 10000
# The following example changes the default amount of space and memory Splunk's indexes use.
maxTotalDataSizeMB = 650000
maxMemMB = 75
# The following example changes the time data is kept around by default.
# It also sets an export script. NOTE: You must edit this script to set export location before
# running it.
maxWarmDBCount = 200
maxColdDBCount = 5
frozenTimePeriodInSecs = 432000
rotatePeriodInSecs = 30
coldToFrozenScript = /opt/bin/compressedExport.sh
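On Windows, per the notation in indexes.conf.spec, the archiving line from the example above would take roughly the following form; which export script to use is an assumption:
# Hypothetical Windows equivalent of the archiving line above.
coldToFrozenScript = compressedExport.bat "$DIR"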
inputs.conf
inputs.conf
inputs.conf configures all inputs to Splunk including file and directory tailing and watching,
network ports and scripted inputs.
For help configuring inputs, see how input configuration works.
Note: Splunk looks for the inputs it is configured to monitor every 24 hours starting from the time it
was last restarted. This means that if you add a stanza to monitor a directory or file that doesn't exist
yet, it could take up to 24 hours for Splunk to start indexing its contents. To ensure that your input is
immediately recognized and indexed, add the input via Splunk Web or by using the add command in
the CLI.
Important: To avoid performance issues, Splunk recommends that you set followTail=1 if you
are deploying Splunk to systems containing significant quantities of historical data. Setting
followTail=1 for a monitor input means that any new incoming data is indexed when it arrives, but
anything already in files on the system when Splunk was first started will not be processed.
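For instance, a monitor stanza with followTail enabled might look like the following minimal sketch; the log path is an assumption:
# Hypothetical monitor input: skips data already in the file the first time Splunk sees it.
[monitor:///var/log/messages]
followTail = 1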
inputs.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
# This file contains possible attributes and values you can use to configure inputs,
# distributed inputs and file system monitoring in inputs.conf.
#
# There is an inputs.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place an inputs.conf in $SPLUNK_HOME/etc/system/local/. For examples, see inputs.conf.example.
# You must restart Splunk to enable new configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
#*******
# WINDOWS INPUTS:
#*******
* Windows platform specific input processor.
* These inputs are enabled by default. To disable an input type, comment it out in $SPLUNK_HOME\etc\system\local\inputs.conf.
[WinEventLog:Application]
[WinEventLog:Security]
[WinEventLog:System]
#*******
# GENERAL SETTINGS:
# The following attributes/value pairs are valid for ALL input types (except file system change monitor).
# You must first enter a stanza header, specifying the input type.
# Then, use any of the following attribute/value pairs.
#*******
host = <string>
* Set the default host to a static value.
* "host=" is automatically prepended to <string>.
* Defaults to the IP address or fully qualified domain name of the host where the data originated.
index = <string>
* Set the index to store events from this input.
* "index=" is automatically prepended to <string>.
* Defaults to "index=main" (or whatever you have set as your default index).
source = <string>
* Set the source for events from this input.
* "source=" is automatically prepended to <string>.
* Defaults to the file path.
sourcetype = <string>
* Set the sourcetype for events from this input.
* "sourcetype=" is automatically prepended to <string>.
* Splunk automatically picks a source type based on various aspects of your data. There is no hard-coded default.
queue = parsingQueue | indexQueue
* Specify where the input processor should deposit the events it reads.
* Set to "parsingQueue" to apply props.conf and other parsing rules to your data.
* Set to "indexQueue" to send your data directly into the index.
* Defaults to parsingQueue.
#*******
# Valid input types follow, with input-specific attributes listed as well:
#*******
#*******
# MONITOR:
#*******
[monitor://<path>]
* This directs Splunk to watch all files in <path>.
* <path> can be an entire directory or just a single file.
* You must specify the input type and then the path, so put three slashes in your path if you're starting at the root.
# Additional attributes:
host_regex = <regular expression>
* If specified, <regular expression> extracts host from the filename of each input.
* Specifically, the first group of the regex is used as the host.
* If the regex fails to match, the default "host =" attribute is used.
host_segment = <integer>
* If specified, the <integer>th '/'-separated segment of the path is set as the host.
* If the value is not an integer, or is less than 1, the default "host =" attribute is used.
_whitelist = <regular expression>
* If set, files from this path are monitored only if they match the specified regex.
_blacklist = <regular expression>
* If set, files from this path are NOT monitored if they match the specified regex.
Note: Wildcards and monitor:
* You can use wildcards to specify your input path for monitored input. Use ... for paths and * for files.
* ... recurses through directories until the match is met. This means that /foo/.../bar will match foo/bar, foo/1/bar, foo/1/2/bar, etc. but only if bar is a file.
* To recurse through a subdirectory, use another .... For example /foo/.../bar/....
* * matches anything in that specific path segment. It cannot be used inside of a directory path; it must be used in the last segment of the path. For example /foo/*.log matches /foo/bar.log but not /foo/bar.txt or /foo/bar/test.log.
* Combine * and ... for more specific matches:
* foo/.../bar/* matches any file in the bar directory within the specified path.
crcSalt = <string>
* Use this to force Splunk to consume files with matching CRCs.
* Set any string to add to the CRC.
* If set to "crcSalt = <SOURCE>" (the actual string <SOURCE>), then the full source path is added to the CRC.
followTail = 0 | 1
* If set to 1, monitoring begins at the end of the file (like tail -f).
* This only applies to files the first time Splunk sees them.
* After that, Splunk's internal file position records keep track of the file.
dedicatedFD = 0 | 1
* Dedicates a file descriptor to the input.
* Only accepted if monitor path points directly to a file (as opposed to a directory) and does not use the * or ... wildcards.
* Set the available number of FDs in limits.conf.
* Make sure you don't use up all the FDs as this may cause other data to be ignored.
* WARNING: This setting can drastically affect your Splunk install, as well as the server it is running on. Do NOT set unless you know exactly what you're doing.
alwaysOpenFile = 0 | 1
* Opens a file to check if it has already been indexed.
* Only useful for files that don't update modtime.
* Should only be used for monitoring files on Windows, and mostly for IIS logs.
* NOTE: This flag should only be used as a last resort, as it increases load and slows down indexing.
#*******
# BATCH:
#*******
[batch://<path>]
* One time, destructive input.
* For continuous, non-destructive inputs, use monitor.
# Additional attributes:
move_policy = sinkhole
* IMPORTANT: You must set move_policy = sinkhole.
* This loads the file destructively.
* Do not use this input type for files you do not want to consume destructively.
host_regex (see MONITOR, above)
host_segment (see MONITOR, above)
# IMPORTANT: The following are not used by batch:
source = <string>
<KEY> = <string>
#*******
# TCP:
#*******
[tcp://<remote server>:<port>]
* Configure Splunk to listen on a specific port.
* If a connection is made from <remote server>, this stanza is used to configure the input.
* If <remote server> is blank, this stanza matches all connections on the specified port.
# Additional attributes:
connection_host = ip | dns
* Set to either "ip" or "dns."
* "ip" sets the TCP input processor to rewrite the host with the IP address of the remote server.
* "dns" sets the host to the DNS entry of the remote server.
* Defaults to ip.
#*******
# Data distribution:
#*******
[splunktcp://<remote server>:<port>]
* This is the same as TCP, except the remote server is assumed to be a Splunk server.
* For SplunkTCP, the host or connection_host will be used if the remote Splunk server does not set a host, or if the host is set to host::localhost.
* See documentation (http://www.splunk.com/doc/latest/admin/ForwardingReceiving) for help.
[splunktcp]
route = has_key | absent_key:<key>:<queueName>;...
* Settings for the light forwarder.
* Splunk sets these parameters automatically -- you DO NOT need to set them.
* The property route is composed of rules delimited by ';'.
* Splunk checks each incoming data payload via cooked tcp port against the route rules.
* If a matching rule is found, Splunk sends the payload to the specified <queueName>.
* If no matching rule is found, Splunk sends the payload to the default queue
specified by any queue= for this stanza. If no queue= key is set in
the stanza or globally, the events will be sent to the parsingQueue.
# SSL settings for data distribution:
[splunktcp-ssl:PORT]
* Use this stanza name if you are sending encrypted, cooked data from Splunk.
* Set PORT to the port on which your forwarder is sending cooked, encrypted data.
* Forwarder settings are set in outputs.conf on the forwarder-side.
[tcp-ssl:PORT]
* Use this stanza name if you are sending encrypted, raw data from a third-party system.
* Set PORT to the port on which your forwarder is sending raw, encrypted data.
[SSL]
* Set the following specifications for SSL underneath this stanza name:
serverCert = <path>
* Full path to the server certificate.
password = <string>
* Server certificate password, if any.
rootCA = <string>
* Certificate authority list (root file).
requireClientCert = true | false
* Toggle whether it is required for a client to authenticate.
supportSSLV3Only = <true|false>
* If true, tells the inputproc to only accept connections from SSLv3 clients.
* Default is false.
cipherSuite = <cipher suite string>
* If set, uses the specified cipher string for the input processors.
* If not set, uses the default cipher string provided by OpenSSL.
* This is used to ensure that the server does not accept connections using weak encryption protocols.
#*******
# UDP:
#*******
[udp://<port>]
* Similar to TCP, except that it listens on a UDP port.
# Additional attributes:
_rcvbuf = <integer>
* Specify the receive buffer for the UDP port (in bytes).
* If the value is 0 or negative, it is ignored.
* Defaults to 1,000,000 (or 1 MB).
* Note: The default in the OS varies.
no_priority_stripping = true
* If this attribute is set to true, then Splunk does NOT strip the <priority> syslog field from received events.
* NOTE: Do NOT include this key if you want to strip <priority>.
no_appending_timestamp = true
* If this attribute is set to true, then Splunk does NOT append a timestamp and host to received events.
* NOTE: Do NOT include this key if you want to append timestamp and host to received events.
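# For example, a sketch of a syslog input over UDP; port 514 is the conventional
# syslog port, and the sourcetype assignment is illustrative. The optional keys
# above can be added if your deployment calls for them.
[udp://514]
sourcetype = syslog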
#*******
# FIFO:
#*******
[fifo://<path>]
* This directs Splunk to read from a FIFO at the specified path.
#*******
# Scripted Input:
#*******
[script://<cmd>]
* Runs <cmd> at a configured interval (below) and indexes the output.
* The command must reside in $SPLUNK_HOME/etc/system/bin/ or $SPLUNK_HOME/etc/apps/$YOUR_APP/bin/.
interval = <integer>
* How often to execute the specified command (in seconds).
* Defaults to 60 seconds.
passAuth = <username>
* User to run the script under.
* If you provide a username, Splunk generates an auth token for that user and passes it to the script via stdin.
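# For example, a sketch that runs a hypothetical script every five minutes;
# myscript.sh stands in for your own script in $SPLUNK_HOME/etc/system/bin/.
[script://$SPLUNK_HOME/etc/system/bin/myscript.sh]
interval = 300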
#*******
# File system change monitor
#*******
[fschange:<path>]
* Monitors all add/update/deletes to this directory and sub directories.
* NOTE: <path> is the direct path. You do not need to preface it with // like other inputs.
* Sends an event for every change.
* NOTE: You cannot simultaneously watch a directory using the file system change monitor and monitor (above).
# Additional attributes:
# NOTE: fschange does not use the same attributes as other input types (above). Use only the following attributes.
filters = <filter1>,<filter2>,...<filterN>
* Each filter is applied left to right for each file or directory found during the monitor's poll cycle.
* See "File System Monitoring Filters" below for help defining a filter.
recurse = true | false
* If true, recurse directories within the directory specified in [fschange].
* Defaults to true.
followLinks = true | false
* Follow symbolic links if true.
* It is recommended that you do not set this to true or file system loops may occur.
* Defaults to false.
pollPeriod = <integer>
* Check this directory for changes every <integer> seconds.
* Defaults to 3600.
hashMaxSize = <integer>
* Calculate a SHA256 hash for every file that is less than or equal to <integer> bytes.
* This hash is used as an additional method for detecting changes to the file/directory.
* Defaults to -1 (disabled).
fullEvent = true | false
* Set to true to send the full event if an add or update change is detected.
* Further qualified by the 'sendEventMaxSize' attribute.
* Defaults to false.
sendEventMaxSize = <integer>
* Only send the full event if the size of the event is less than or equal to <integer> bytes.
* This limits the size of indexed file data.
* Defaults to -1, which is unlimited.
signedaudit = true | false
* Send cryptographically signed add/update/delete events.
* If set to true, events are *always* sent to the '_audit' index and will *always* have the sourcetype 'audittrail'.
* If set to false, events are placed in the main index and the source type is whatever you specify (or 'fs_notification' by default).
* NOTE: You MUST also enable auditing in audit.conf.
* Defaults to false.
index = <index name>
* The index to store all events generated.
* Defaults to _audit.
sourcetype = <string>
* Set the sourcetype for events from this input.
* "sourcetype=" is automatically prepended to <string>.
filesPerDelay = <integer>
* Injects a delay specified by 'delayInMills' after processing <integer> files.
* This is used to throttle file system monitoring so it doesn't consume as much CPU.
delayInMills = <integer>
* The delay in milliseconds to use after processing every <integer> files as specified in 'filesPerDelay'.
* This is used to throttle file system monitoring so it doesn't consume as much CPU.
#*******
# File system monitoring filters:
#*******
[filter:<filtertype>:<filtername>]
* Define a filter of type <filtertype> and name it <filtername>.
<filtertype>
* Filter types are either 'blacklist' or 'whitelist.'
* A whitelist filter processes all file names that match the regex list.
* A blacklist filter skips all file names that match the regex list.
<filtername>
* The filter name is used in the comma-separated list when defining a file system monitor.
regex<integer> = <regex>
* Blacklist and whitelist filters can include a set of regexes.
* The name of each regex MUST be 'regex<integer>', where <integer> starts at 1 and increments.
* Splunk applies each regex in numeric order:
regex1=<regex>
regex2=<regex>
...
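# For example, a sketch of a blacklist filter applied to an fschange input;
# the filter name and regexes here are illustrative.
[filter:blacklist:backups]
regex1 = .*\.bak$
regex2 = .*\.tmp$
[fschange:/etc/]
filters = backups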
inputs.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example inputs.conf. Use this file to configure data inputs.
#
# To use one or more of these configurations, copy the configuration block into
# inputs.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# The following configuration directs Splunk to read all the files in the directory /var/log.
[monitor:///var/log]
# The following configuration directs Splunk to read all the files under /var/log/httpd and classify them
# as sourcetype::access_common.
[monitor:///var/log/httpd]
sourcetype = access_common
# The following configuration directs Splunk to read all the files under /mnt/logs. When the path is
# /mnt/logs/<host>/... it sets the hostname (by file) to <host>.
[monitor:///mnt/logs]
host_segment = 3
# The following configuration directs Splunk to listen on TCP port 9997 for raw data from ANY remote server
# (not just a Splunk instance). The host of the data is set to the IP address of the remote server.
[tcp://:9997]
# The following configuration directs Splunk to listen on TCP port 9995 for raw data from ANY remote server.
# The host of the data is set as the host name of the remote server. All data will also be
# assigned the sourcetype "log4j" and the source "tcp:9995."
[tcp://:9995]
connection_host = dns
sourcetype = log4j
source = tcp:9995
# The following configuration directs Splunk to listen on TCP port 9995 for raw data from 10.1.1.10.
# All data is assigned the host "webhead-1", the sourcetype "access_common" and the
# source "//10.1.1.10/var/log/apache/access.log."
[tcp://10.1.1.10:9995]
host = webhead-1
sourcetype = access_common
source = //10.1.1.10/var/log/apache/access.log
# The following configuration sets a global default for data payloads sent from the light forwarder.
# The route parameter is an ordered set of rules that is evaluated in order for each payload of cooked data.
[splunktcp]
route=has_key:_utf8:indexQueue;has_key:_linebreaker:indexQueue;absent_key:_utf8:parsingQueue;absent_key:_linebreaker:parsingQueue;
# The following configuration directs Splunk to listen on TCP port 9996 for distributed search data from ANY
# remote server. The data is delivered directly to the indexer on the local machine without any
# further processing. The host of the data is set to the host name of the remote server ONLY
# IF the remote data has no host set, or if it is set to "localhost."
[splunktcp://:9996]
queue = indexQueue
connection_host = dns
# The following configuration directs Splunk to listen on TCP port 9996 for distributed search data from
# 10.1.1.100. The data is processed the same as locally indexed data.
[splunktcp://10.1.1.100:9996]
# The following configuration directs Splunk to listen on TCP port 514 for data from
# syslog.corp.company.net. The data is assigned the sourcetype "syslog" and the host
# is set to the host name of the remote server.
[tcp://syslog.corp.company.net:514]
sourcetype = syslog
connection_host = dns
# Set up SSL:
[SSL]
serverCert=$SPLUNK_HOME/etc/auth/server.pem
password=password
rootCA=$SPLUNK_HOME/etc/auth/CAcert.pem
requireClientCert=false
[splunktcp-ssl:9996]
# Use file system change monitor:
[fschange:/etc/]
fullEvent=true
pollPeriod=60
recurse=true
sendEventMaxSize=100000
index=main
limits.conf
limits.conf
Use limits.conf to configure limits for search commands.
limits.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring limits for search commands.
#
# There is a limits.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a limits.conf in $SPLUNK_HOME/etc/system/local/. For examples, see
# limits.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# CAUTION: Do not alter the settings in limits.conf unless you know what you are doing.
# Improperly configured limits may result in splunkd crashes and/or memory overuse.
* Each stanza controls different parameters of search commands.
[searchresults]
* This stanza controls search results.
maxresultrows = <integer>
* Configures the maximum number of events that can be present in memory at one time.
* Defaults to 50000.
tocsv_maxretry = <integer>
* Maximum number of times to try in the atomic write operation.
* 1 = no retries.
* Defaults to 5.
tocsv_retryperiod_ms = <integer>
* Retry period.
* Defaults to 500.
[subsearch]
* This stanza controls subsearch results.
maxout = <integer>
* Maximum number of results to return from a subsearch.
* Defaults to 100.
maxtime = <integer>
* Maximum number of seconds to run a subsearch before finalizing
* Defaults to 10.
timeout = <integer>
* Maximum time to wait for an already running subsearch.
* Defaults to 30.
ttl = <integer>
* Time to cache a given subsearch's results.
* Defaults to 300.
[anomalousvalue]
maxresultrows = <integer>
* Configures the maximum number of events that can be present in memory at one time.
* Defaults to searchresults::maxresultrows (eg 50000).
maxvalues = <integer>
* Maximum number of distinct values for a field.
* Defaults to 100000.
maxvaluesize = <integer>
* Maximum size in bytes of any single value (truncated to this size if larger)
* Defaults to 1000.
[associate]
maxfields = <integer>
* Maximum number of fields to analyze.
* Defaults to 10000.
maxvalues = <integer>
* Maximum number of values for any field to keep track of.
* Defaults to 10000.
maxvaluesize = <integer>
* Maximum length of a single value to consider.
* Defaults to 1000.
[ctable]
* This stanza controls the contingency, ctable, and counttable commands.
maxvalues = <integer>
* Maximum number of columns/rows to generate (i.e. the maximum distinct values for the row field and column field)
* Defaults to 1000.
[correlate]
maxfields = <integer>
* Maximum number of fields to correlate.
* Defaults to 1000.
[discretize]
* This stanza set attributes for bin/bucket/discretize.
maxbins = <integer>
* Maximum number of buckets to discretize into.
* If maxbins is not specified or = 0, it defaults to searchresults::maxresultrows (eg 50000).
[inputcsv]
mkdir_max_retries = <integer>
* Maximum number of retries for creating a tmp directory (with random name as subdir of SPLUNK_HOME/var/run/splunk)
* Defaults to 100.
[kmeans]
maxdatapoints = <integer>
* Maximum data points to do kmeans clusterings for.
* Defaults to 100000000
[kv]
maxcols = <integer>
* When non-zero, the point at which kv should stop creating new fields.
* Defaults to 512.
[metrics]
maxseries = <integer>
* The number of series to include in the per_x_thruput reports in metrics.log.
* Defaults to 10.
[rare]
maxresultrows = <integer>
* Maximum number of result rows to create.
* If not specified, defaults to searchresults::maxresultrows (eg 50000).
maxvalues = <integer>
* Maximum number of distinct field vector values to keep track of.
* Defaults to 100000.
maxvaluesize = <integer>
* Maximum length of a single value to consider.
* Defaults to 1000.
[report]
maxresultrows = <integer>
* Maximum number of result rows to create.
* Defaults to 300.
[restapi]
maxresultrows = <integer>
* Maximum result rows to be returned by the /events or /results getters from the REST API.
* Defaults to 50000.
[search]
ttl = <integer>
* How long searches should be stored on disk once completed.
* Defaults to 86400.
status_buckets = 300
* The approximate maximum number of timeline buckets to maintain.
* Defaults to 300.
max_count = <integer>
* The last accessible event in a call that takes a base and bounds.
* Defaults to 10000.
min_prefix_len = <integer>
* The minimum length of a prefix before a * to ask the index about.
* Defaults to 1.
max_results_raw_size = <integer>
* The largest "_raw" volume that should be read in memory.
* Defaults to 100000000.
cache_ttl = <integer>
* The length of time to persist search cache entries (in seconds).
* Defaults to 300.
[slc]
maxclusters = <integer>
* Maximum number of clusters to create.
* Defaults to 10000.
[stats]
maxresultrows = <integer>
* Maximum number of result rows to create.
* If not specified, defaults to searchresults::maxresultrows (eg 50000).
maxvalues = <integer>
* Maximum number of values for any field to keep track of.
* Defaults to 10000.
maxvaluesize = <integer>
* Maximum length of a single value to consider.
* Defaults to 1000.
[thruput]
maxKBps = <integer>
* If specified and not zero, this limits the speed through the thruput processor to the specified rate in kilobytes per second.
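# For example, a sketch that caps indexing throughput at roughly 10 MB per second
# (10240 KB per second):
[thruput]
maxKBps = 10240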
[top]
maxresultrows = <integer>
* Maximum number of result rows to create.
* If not specified, defaults to searchresults::maxresultrows (eg 50000).
maxvalues = <integer>
* Maximum number of distinct field vector values to keep track of.
* Defaults to 100000.
maxvaluesize = <integer>
* Maximum length of a single value to consider.
* Defaults to 1000.
[inputproc]
max_fd = <integer>
* Maximum number of file descriptors that Splunk can use in the Select Processor.
* The maximum allowed value is half the per-process file descriptor limit.
* Defaults to 32.
time_before_close = <integer>
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past <integer> seconds.
* Defaults to 5.
fishbucketSyncTime = <integer>
* Frequency at which the fishbucket DB queue is flushed to disk.
* Default is 10 seconds.
tailing_proc_speed = <integer>
* Number of non-input directory entries Splunk will traverse before sleeping.
* Controls how actively Splunk will traverse blacklisted files, directories, and other excluded files.
* Increasing this setting increases Splunk's use of CPU and speeds up the location of included/non-blacklisted files for indexing.
* Defaults to 1. Contact Splunk Support for guidance before setting this value any higher.
* This setting is only available in 3.4.11 and later.
limits.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example limits.conf.
#
# CAUTION: Do not alter the settings in limits.conf unless you know what you are doing.
# Improperly configured limits may result in splunkd crashes and/or memory overuse.
#
# To use one or more of these configurations, copy the configuration block into
# limits.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[searchresults]
maxresultrows = 50000
# maximum number of times to try in the atomic write operation (1 = no retries)
tocsv_maxretry = 5
# retry period is 1/2 second (500 milliseconds)
tocsv_retryperiod_ms = 500
[subsearch]
# maximum number of results to return from a subsearch
maxout = 100
# maximum number of seconds to run a subsearch before finalizing
maxtime = 10
# maximum time to wait for an already running subsearch
timeout = 30
# time to cache a given subsearch's results
ttl = 300
[anomalousvalue]
maxresultrows = 50000
# maximum number of distinct values for a field
maxvalues = 100000
# maximum size in bytes of any single value (truncated to this size if larger)
maxvaluesize = 1000
[associate]
maxfields = 10000
maxvalues = 10000
maxvaluesize = 1000
# for the contingency, ctable, and counttable commands
[ctable]
maxvalues = 1000
[correlate]
maxfields = 1000
# for bin/bucket/discretize
[discretize]
maxbins = 50000
# if maxbins not specified or = 0, defaults to searchresults::maxresultrows
[inputcsv]
# maximum number of retries for creating a tmp directory (with random name in SPLUNK_HOME/var/run/splunk)
mkdir_max_retries = 100
[kmeans]
maxdatapoints = 100000000
[kv]
# when non-zero, the point at which kv should stop creating new columns
maxcols = 512
[rare]
maxresultrows = 50000
# maximum distinct value vectors to keep track of
maxvalues = 100000
maxvaluesize = 1000
[report]
maxresultrows = 300
[restapi]
# maximum result rows to be return by /events or /results getters from REST API
maxresultrows = 50000
[search]
# how long searches should be stored on disk once completed
ttl = 86400
# the approximate maximum number of timeline buckets to maintain
status_buckets = 300
# the last accessible event in a call that takes a base and bounds
max_count = 10000
# the minimum length of a prefix before a * to ask the index about
min_prefix_len = 1
# the largest "_raw" volume that should be read in memory
max_results_raw_size = 100000000
# the length of time to persist search cache entries (in seconds)
cache_ttl = 300
[slc]
# maximum number of clusters to create
maxclusters = 10000
[stats]
maxresultrows = 50000
maxvalues = 10000
maxvaluesize = 1000
[top]
maxresultrows = 50000
# maximum distinct value vectors to keep track of
maxvalues = 100000
maxvaluesize = 1000
[inputproc]
max_fd = 32
time_before_close = 5
literals.conf
literals.conf
literals.conf controls what literal text appears in Splunk Web and how it is displayed.
EuroDate formatting is configured here. See
$SPLUNK_HOME/etc/system/default/literals.conf for all possible configurable strings.
literals.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example literals.conf. Use this file to configure the externalized strings
# in Splunk.
#
# To use one or more of these configurations, copy the configuration block into
# literals.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# For the full list of all literals that can be overwritten, consult the far longer list in
# $SPLUNK_HOME/etc/system/default/literals.conf
#
# NOTE: When strings contain "%s", do not add or remove any occurrences of %s, or reorder their positions.
# NOTE: When strings contain HTML tags, take special care to make sure that all tags and quoted
# attributes are properly closed, and that all entities such as & are escaped.
# //////////////////////////////////////////////////////////////////////////////
# UI/Appserver embedded strings
# //////////////////////////////////////////////////////////////////////////////
[ui]
# -------------
# to switch to european date format, use these values instead
SEARCH_TERM_TIME_FORMAT = %d/%m/%Y:%H:%M:%S
SEARCH_RESULTS_TIME_FORMAT = %d/%m/%Y %H:%M:%S
# -------------
PRO_SERVER_LOGIN_HEADER = Login to Splunk (guest/guest)
INSUFFICIENT_DISK_SPACE_ERROR = The server's free disk space is too low. Indexing will temporarily pause until more disk space becomes available.
SERVER_RESTART_MESSAGE_ADMIN = You need to restart the Splunk Server for your changes to take effect. <span class="divider">|</span> To restart, go to the <a href="/admin/settings/control">Server > Control</a> screen.
SERVER_RESTART_MESSAGE = This Splunk Server's configuration has been changed. The server needs to be restarted by an administrator.
UNABLE_TO_CONNECT_MESSAGE = Could not connect to splunkd at %s.
multikv.conf
multikv.conf
Use multikv.conf to configure extracted fields from table-like events.
multikv.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.2
#
# This file contains possible attribute and value pairs for creating multikv rules.
# Multikv is the process of extracting events from table-like events, such as the output of top, ps,
# ls, netstat, etc.
#
# There is NO DEFAULT multikv.conf. To set custom configurations, place a multikv.conf in
# $SPLUNK_HOME/etc/system/local/. For examples, see multikv.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# NOTE: Configure multikv.conf only if you are unhappy with Splunk's automatic multikv
# behavior. If you use the multikv search command with successful outcome, there is no reason to
# create this file.
# A table-like event includes a table, which in turn consists of four parts or sections:
#
#---------------------------------------------------------------------------------------
# Section Name | Description
#---------------------------------------------------------------------------------------
# pre | optional: info/description (eg the system summary output in top)
# header | optional: if not defined, fields are named Column_N
# body | required: this is the body of the table from which child events are constructed
# post | optional: info/description
#---------------------------------------------------------------------------------------
# NOTE: Each section up to and including the section for processing must have both a section
# definition (below) and processing (also below) set.
[multikv_config_name]
* Name your stanza to use with the multikv search command:
ex: '.... | multikv conf=$STANZA_NAME rmorig=f | ....'
* Follow this stanza name with any number of the following attribute/value pairs.
#####################
# Section Definition
#####################
# Define where each section begins and ends.
#
section_$NAME.start = <regex>
* A line matching this regex denotes the start of this section (inclusive).
OR
section_$NAME.start_offset = <int>
* Line offset from event-start or end of previous section where this section starts (inclusive).
* Use this if you cannot define a regex for the start of the section.
section_$NAME.member = <regex>
* A line membership test.
* Member iff lines match the regex.
section_$NAME.end = <regex>
* A line matching this regex denotes the end of this section (exclusive).
OR
section_$NAME.linecount = <int>
* Specify the number of lines in this section.
* Use this if you cannot specify a regex for the end of the section.
#####################
# Section processing
#####################
# Set processing for each section.
#
section_$NAME.ignore = <string-matcher>
* Member lines matching this string matcher will be ignored and thus NOT processed further
* <string-matcher> = _all_ | _none_ | _regex_ <regex-list>
section_$NAME.replace = <quoted-str> = <quoted-str>, <quoted-str> = <quoted-str>....
* List of the form toReplace = replaceWith.
* Can have any number of toReplace = replaceWith.
* Example: "%" = "_", "#" = "_"
section_$NAME.tokens = <chopper> | <tokenizer> | <aligner> | <token-list>
* See below for definitions of each possible $VAL.
<chopper> = _chop_, <int-list>
* Transform each string into a list of tokens specified by <int-list>.
* <int-list> is a list of (offset, length) tuples.
<tokenizer> = _tokenize_ <max_tokens (int)> <delims>
* <delims> = comma-separated list of delimiting chars.
* Tokenize the string using the delim characters.
* This generates at most max_tokens tokens
* Set max_tokens to:
* -1 for complete tokenization
* 0 to inherit from previous section (usually header)
* Or to a non-zero number for a specific token count
* If tokenization is limited by max_tokens, the rest of the string is added onto the last token.
* Note: consecutive delimiters are treated as an empty field.
<aligner> = _align_, <header_string>, <side>, <max_width>
* Generates tokens by extracting text aligned to the specified header fields.
* header_string: a complete or partial header field value the columns are aligned with.
* side: either L or R (for left or right align, respectively).
* max_width: the maximum width of the extracted field.
* Set max_width to -1 for automatic width (this expands the field until any of the following delimiters are found: " ", "\t").
<token_list> = _token_list_ <comma-separated list>
* Defines a list of static tokens in a section.
* This is useful for tables with no header, for example the output of 'ls -lah', which lacks a header altogether.
multikv.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains example multi key/value extraction configurations.
#
# To use one or more of these configurations, copy the configuration block into
# multikv.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# This example breaks up the output from top:
# Sample output:
# Processes: 56 total, 2 running, 54 sleeping... 221 threads 10:14:07
#.....
#
# PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
# 29960 mdimport 0.0% 0:00.29 3 60 50 1.10M 2.55M 3.54M 38.7M
# 29905 pickup 0.0% 0:00.01 1 16 17 164K 832K 764K 26.7M
#....
[top_mkv]
# pre table starts at "Process..." and ends at line containing "PID"
pre.start = "Process"
pre.end = "PID"
pre.ignore = _all_
# specify table header location and processing
header.start = "PID"
header.linecount = 1
header.replace = "%" = "_", "#" = "_"
header.tokens = _tokenize_, -1," "
# table body ends at the next "Process" line (i.e. the start of another top); tokenize
# and inherit the number of tokens from the previous section (header)
body.end = "Process"
body.tokens = _tokenize_, 0, " "
## This example handles the output of 'ls -lah' command:
#
# total 2150528
# drwxr-xr-x 88 john john 2K Jan 30 07:56 .
# drwxr-xr-x 15 john john 510B Jan 30 07:49 ..
# -rw------- 1 john john 2K Jan 28 11:25 .hiden_file
# drwxr-xr-x 20 john john 680B Jan 30 07:49 my_dir
# -r--r--r-- 1 john john 3K Jan 11 09:00 my_file.txt
[ls-lah]
pre.start = "total"
pre.linecount = 1
# the header is missing, so list the column names
header.tokens = _token_list_, mode, links, user, group, size, date, name
body.end = "^\s*$"
body.member = "\.cpp"
# concatenates the date into a single unbreakable item
body.replace = "(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2})" ="\1_\2_\3"
# ignore dirs
body.ignore = _regex_ "^drwx.*",
body.tokens = _tokenize_, 0, " "
outputs.conf
outputs.conf
outputs.conf controls the destination and configuration for routing and cloning data to other
servers over TCP.
outputs.conf.spec
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values for configuring outputs.conf. Configure
# Splunk's data forwarding actions by creating your own outputs.conf.
#
# There is NO DEFAULT outputs.conf. To set custom configurations, place an outputs.conf
# in $SPLUNK_HOME/etc/system/local/. For examples, see outputs.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# NOTE: Place outputs.conf on the forwarding side of any distributed Splunk deployment.
# To learn more about distributed configurations, see the documentation at
# http://www.splunk.com/doc/latest/admin/ForwardingReceiving.
#########################################################################################
#----GLOBAL CONFIGURATION-----
#########################################################################################
# These configurations will be used if they are not overwritten in specific target groups.
# All events that do not have target group metadata will be sent to this group.
# If there is more than one group specified, the events will be cloned to all listed.
[tcpout]
defaultGroup= Group1, Group2, ...
attribute1 = val1
attribute2 = val2
...
#NOTE: This is not for typical use:
#This configuration item looks in the event for <key>. If the event contains this
#key, the value is prepended to the raw data that is sent out to the destination
#server. Note that this ONLY works if 'sendCookedData = false'. The key/value pair
#and how it is derived is set in props.conf and transforms.conf.
#Use case: appending <priority> to a syslog event which has been obtained by monitoring
#a syslog file and sending it out to a syslog server.
prependKeyToRaw = key
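# For example, a sketch of the syslog use case above; the key name _syslog_prio is
# hypothetical and must be populated by your own props.conf/transforms.conf
# configuration, and raw forwarding must be enabled.
[tcpout]
sendCookedData = false
prependKeyToRaw = _syslog_prio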
#########################################################################################
#----TARGET GROUP CONFIGURATION-----
#########################################################################################
# You can have as many target groups as you wish.
# If more than one is specified, the forwarder will clone every event into each target group.
[tcpout:$TARGET_GROUP]
server=$IP:$PORT, $IP2:$PORT2...
attribute1 = val1
attribute2 = val2
...
#########################################################################################
#----SINGLE SERVER CONFIGURATION-----
#########################################################################################
# NOTE: Single server configuration is necessary for implementing SSL and back-off settings
# (listed below). However, you must list any single server as a part of a target group or
# default group to send data.
[tcpout-server://$IP:$PORT]
attribute1 = val1
attribute2 = val2
...
#########################################################################################
#----OPTIONAL SETTINGS----
#########################################################################################
# There are a number of optional attributes you can set in outputs.conf.
sendCookedData = true | false
* If true, events are cooked (have been processed by Splunk and are not raw).
* If false, events are raw and untouched prior to sending.
* Set to false if you are sending to a third-party system.
* Defaults to true.
heartbeatFrequency = <integer>
* How often (in seconds) to send a heartbeat packet to the receiving server.
* Heartbeats are only sent if 'sendCookedData' is true.
* Defaults to 30 seconds.
blockOnCloning = true | false
* If true, TcpOutputProcessor blocks until at least one of the cloned groups gets events. Events are not dropped when all the cloned groups are down.
* If false, TcpOutputProcessor drops events when all the cloned groups are down and the queues for each of the cloned groups are full. When at least one of the cloned groups is up and its queue is not full, the event is not dropped.
* Defaults to true.
#########################################################################################
#----QUEUE SETTINGS----
#########################################################################################
maxQueueSize = <integer>
* The maximum number of queued events (queue size) on the forwarding server.
* Defaults to 1000.
dropEventsOnQueueFull = <integer>
* If set to a positive number N, wait N * 5 seconds before throwing out all new events until the queue has space.
* Setting this to -1 or 0 causes the queue to block when it gets full, which in turn blocks the processor chain.
* When any target group's queue is blocked, no more data will reach any other target group.
* Using load balanced groups is the best way to alleviate this condition because multiple
receivers must be down (or jammed up) before queue blocking occurs.
* Defaults to -1 (do not drop events).
* DO NOT SET THIS VALUE TO A POSITIVE INTEGER IF YOU ARE MONITORING FILES!
indexAndForward = true | false
* In addition to other actions, index all this data locally as well as forwarding it.
* This is known as an index and forward configuration.
* Defaults to false.
#########################################################################################
#----BACKOFF SETTINGS----
#########################################################################################
# Backoff settings are server specific, meaning they must be set in a [tcpout-server://$IP:$PORT] stanza.
# They cannot be set for a target or default group.
# These are optional, and there are no global overrides for these.
backoffAtStartup = <integer>
* Set how long (in seconds) to wait until retrying the first time a retry is needed.
* Defaults to 5.
initialBackoff = <integer>
* Set how long (in seconds) to wait until retrying every time after the first retry.
* Defaults to 2.
maxNumberOfRetriesAtHighestBackoff = <integer>
* Specifies the number of times the system should retry after reaching the highest back-off period before stopping completely.
* -1 means to try forever.
* It is suggested that you never change this from the default, or the forwarder will completely stop forwarding to a downed URI at some point.
* Defaults to -1 (forever).
maxBackoff = <integer>
* Specifies the number of seconds before reaching the maximum backoff frequency.
* Defaults to 20.
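# For example, a sketch of per-server back-off settings; the IP address and port
# are placeholders.
[tcpout-server://10.1.1.197:9997]
backoffAtStartup = 10
initialBackoff = 2
maxBackoff = 20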
#########################################################################################
#----SSL SETTINGS----
#########################################################################################
# To set up SSL on the forwarder, set the following attribute/value pairs.
# If you want to use SSL for authentication, add a stanza for each receiver that needs to be certified.
sslPassword = <password>
* The password associated with the CAcert.
* The default splunk CAcert uses the password "password".
sslCertPath = <path>
* If specified, this connection will use SSL.
* This is the path to the client certificate.
sslRootCAPath = <path>
* The path to the root certificate authority file (optional).
sslVerifyServerCert = true | false
* If true, make sure that the server you are connecting to is a valid one (authenticated).
* Both the common name and the alternate name of the server are then checked for a match.
* Defaults to false.
sslCommonNameToCheck = <string>
* Check the common name of the server's certificate against this name.
* If there is no match, assume that Splunk is not authenticated against this server.
* You must specify this setting if 'sslVerifyServerCert' is true.
altCommonNameToCheck = <string>
* Check the alternate name of the server's certificate against this name.
* If there is no match, assume that Splunk is not authenticated against this server.
* You must specify this setting if 'sslVerifyServerCert' is true.
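# For example, a sketch of an SSL-enabled receiver stanza, assuming the default
# certificates shipped under $SPLUNK_HOME/etc/auth; the target IP and port are
# placeholders.
[tcpout-server://10.1.1.197:9996]
sslCertPath = $SPLUNK_HOME/etc/auth/server.pem
sslPassword = password
sslRootCAPath = $SPLUNK_HOME/etc/auth/CAcert.pem
sslVerifyServerCert = false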
outputs.conf.example
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example outputs.conf. Use this file to configure forwarding in a distributed
# set up.
#
# To use one or more of these configurations, copy the configuration block into
# outputs.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# Specify a target group for an IP:PORT which consists of a single receiver.
# This is the simplest possible configuration; it sends data to the host at 10.1.1.197 on port 9997.
[tcpout:group1]
server=10.1.1.197:9997
# Specify a target group for a hostname which consists of a single receiver.
[tcpout:group2]
server=myhost.Splunk.com:9997
# Specify a target group made up of two receivers. In this case, the data will be
# balanced (round-robin) between these two receivers. You can specify as many
# receivers as you wish here. You can combine host name and IP if you wish.
[tcpout:group3]
server=myhost.Splunk.com:9997,10.1.1.197:6666
# You can override any of the global configuration values on a per-target group basis.
# All target groups that do not override a global config will inherit the global config.
# Send every event to a receiver at foo.Splunk.com:9997 and send heartbeats every
# 45 seconds with a maximum queue size of 100,500 events.
[tcpout:group4]
server=foo.Splunk.com:9997
heartbeatFrequency=45
maxQueueSize=100500
# Set the heartbeat frequency to 15 for each group and clone the events to
# groups indexer1 and indexer2. Also, index all this data locally as well.
[tcpout]
heartbeatFrequency=15
indexAndForward=true
[tcpout:indexer1]
server=Y.Y.Y.Y:9997
[tcpout:indexer2]
server=X.X.X.X:6666
# Data balance between Y.Y.Y.Y and X.X.X.X.
[tcpout:indexerGroup]
server=Y.Y.Y.Y:9997, X.X.X.X:6666
# Clone events between two data balanced groups.
[tcpout:indexer1]
server=A.A.A.A:1111, B.B.B.B:2222
[tcpout:indexer2]
server=C.C.C.C:3333, D.D.D.D:4444
prefs.conf
prefs.conf
prefs.conf controls per-user settings including SplunkWeb search and result display preferences
and dashboard layout.
prefs.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains all possible attributes and value pairs for a prefs.conf
# file. Use this file to configure display preferences in Splunk Web.
#
# There is a prefs.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a prefs.conf in $SPLUNK_HOME/etc/system/local/. For help, see
# prefs.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# Global default preferences are specified at the top of the file
# without a stanza name.
#
# Subsequent stanzas are organized by user name, and hold user-specific settings.
# The user settings override any global preferences.
selectedKeys = <space-separated string>
* This value represents the default arguments to the Splunk Web select processor.
* Whenever any of these keys are present in the data, they appear in the filtering bar, just below the timeline, and just above the events returned by the search.
* If a key in the list is not present in the data, it will not appear in the filtering bar.
* Defaults to source host sourcetype.
skin = <string>
* This value represents the name of the skin CSS file that should be loaded by default.
* Splunk ships with 'basic' and 'black' and defaults to 'basic.'
* You are free to create your own files and activate them by placing them in the share/splunk/search_oxiclean/static/css/skins/ directory.
* For instance, placing a foo.css file in the skins dir will make 'foo' appear as a third option in the Splunk Web theme pulldown, as well as make 'foo' a valid value for <string>.
* Defaults to 'basic'.
dashboard_activeset = <string>
* Represents the name of the currently loaded dashboard panel set.
* The value here is linked to a 'dashboardset_*' key name that exists as a prefs.conf key.
* For example, a value of 'foo' means that another key named 'dashboardset_foo' MUST exist.
dashboardset_<setname> = <JS array literal>
* Represents a list of saved search names to load as a unit on the Splunk Web home page.
* The second part of this keyname is linked to the 'dashboard_activeset' key.
* It is expected that there will be multiple versions of this key, i.e. 'dashboardset_default', 'dashboardset_admin', 'dashboardset_noc', etc.
* The <JS array literal> is a JSON array format: ['web_errors','failed_logins','db_exceptions']
* Set to SPLUNK-DELETED-DASHBOARD to hide the dashboard and remove from the dashboard dropdown in Splunk Web.
dashboard_customList = <comma separated list of custom list modules>
* Define custom list modules in dashboard_customlist_NAME_OF_CUSTOM_LIST_MODULE.
dashboard_customlist_NAME_OF_CUSTOM_LIST_MODULE_searches = <any valid search>
* Set a search to appear in your dashboard.
* Note: You must also use the *_labels attribute (below).
dashboard_customlist_NAME_OF_CUSTOM_LIST_MODULE_labels = <label your searches>
* Add a label to your searches.
* Note: You must use this attribute if you are using *_searches, even if you don't want to label your searches. In that case, leave it blank.
dashboard_customlist_NAME_OF_CUSTOM_LIST_MODULE_text = <html>
* Any valid html.
* Use the *_text attribute instead of *_searches and *_labels.
* Each line must end with a \ to mark a newline.
saved_<saved_search_name>_panelIsOpen = true | false
* Indicates the panel state of a particular saved search when displayed in a dashboard set.
* If 'true', then the full panel is shown.
* If 'false', then only a summary line is shown.
* The <saved_search_name> is the full search string of the saved search with all non-alpha characters removed.
saved_<saved_search_name>_panelMode = <string>
* Indicates the view state of a saved search when displayed in a dashboard set.
* The values for this correspond to the available panels that can be shown on a given search.
* Typical values are: 'Timeline', 'Chart', and 'Table'.
* The <saved_search_name> is the full search string of the saved search with all non-alpha characters removed.
showMeta = true | false
* Toggle the following on and off:
* fields
* dividers between events
* timestamp at the left of the event
* the colored time boundary bars between events
* Defaults to true.
softWrap = true | false
* Toggle on and off softWrap.
* If set to true, events softwrap at the browser window edge.
* If set to false, events will go offscreen and trigger horizontal scrollbars.
* Defaults to true.
showTimeline = true | false
* Toggle on and off the timeline chart in search results view.
* Please note: reporting has its own timechart graph, and this setting is unrelated.
* Defaults to true.
format = Inner | Outer | Raw | Full
* Set the segmentation display options.
* Set to Inner, Outer, Raw, or Full.
* To configure segmentation in events, use segmenters.conf.
* Defaults to Full.
maxResults = <integer>
* Set the number of events that the search language should load when doing processing, field extraction, charting, etc.
* NOTE: This setting is different from maxresults in savedsearches.conf.
* Defaults to 50000.
prefs.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example prefs.conf. Use this file to configure display preferences in Splunk Web.
#
# To use one or more of these configurations, copy the configuration block into
# prefs.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# The following example sets default settings for all users of a single instance.
selectedKeys = "source host punct ip sourcetype eventtype"
format = "Inner"
skin = "Basic"
defaultTimeRange = startminutesago::60
maxResults = 50000
# The following example sets display preferences for user Admin.
[user:admin]
format = "Outer"
skin = "Basic"
showMeta = false
softWrap = true
showTimeline = false
maxResults = 50000
# The following example sets display preferences for user Bob.
[user:bob]
format = "Full"
skin = "Black"
showMeta = true
softWrap = true
showTimeline = true
maxResults = 5000
# Mask all dashboards
# The following example masks all the default dashboards in ../default/prefs.conf.
# Splunk starts with a blank dashboard that each user can customize.
dashboardset_getting_started = SPLUNK-DELETED-DASHBOARD
dashboardset_admin = SPLUNK-DELETED-DASHBOARD
dashboardset_main = SPLUNK-DELETED-DASHBOARD
dashboard_activeset = test
dashboardset_test = null
dashboard_intro_getting_started =
# ADVANCED EXAMPLE
# Advanced custom search dashboard example using Twiki. Edit the searches and display options to
# customize this example for your own dataset.
#This defines the modules for the Twiki dashboard. The first module is a custom _text module,
#the 2nd, 3rd, 4th are all custom 'columns of blue links' modules. And the last one is a Flash chart.
dashboardset_twiki = TwikiIntro,Twiki saved searches,Twiki activity last 24 hours,Twiki activity
last 7 days,Users editing in the last 24 hours,Pages edited in the last 24 hours
# The $+ is important, as we don't want to blow away the custom list, but rather append to existing ones.
dashboard_customList = Twiki activity last 7 days,Twiki activity last 24 hours,TwikiIntro,Twiki saved searches,$+
# Custom list entries have to have a _searches and a _labels entry (even if the _labels one is empty).
# If you have only one search in the _searches list, you can let it return as many as you want, and
# it will split the rendering up into 2 and 3 columns past certain thresholds.
dashboard_customList_Twiki_saved_searches_searches = ['| admin mysavedsearches | where stanza LIKE
"Twiki%" | rename stanza as name query as term | sort name']
dashboard_customList_Twiki_saved_searches_labels =
# If you have more than one search in _searches, you MUST limit the results to 15 by whatever
# means you choose. This is to defeat the auto-column-splitting feature referred to above,
# which renders poorly.
# You must use _labels when there is more than one search in the _searches key.
# They appear as subheaders above the respective results.
dashboard_customList_Twiki_activity_last_24_hours_searches = ['sourcetype="twiki" ( save OR edit )
starthoursago="24" | top limit=15 twikiuser | eval term="( save OR edit ) ".twikiuser | rename
twikiuser as name | rename count as rowCount', 'sourcetype="twiki" ( attach OR upload )
starthoursago="24" | top limit=15 twikiuser | eval term="(attach OR upload) ".twikiuser | rename
twikiuser as name | rename count as rowCount']
dashboard_customList_Twiki_activity_last_24_hours_labels = Edits, Uploads
dashboard_customList_Twiki_activity_last_7_days_searches = ['sourcetype::twiki edit
startdaysago::7 | where date_hour>20 OR date_hour<5 | top limit=15 twikiuser |
eval term="edit ".twikiuser." | where date_hour>20 OR date_hour<5" | rename twikiuser as name |
rename count as rowCount', 'host::twiki view | where twikiuser=twikipage | top limit=15 twikiuser |
rename twikiuser as name | rename count as rowCount | eval term="host::twiki view ".name." |
where twikiuser=twikipage"','host::twiki *kickoff* save startdaysago::7 | top limit=15 twikipage |
rename twikipage as name count as rowCount | eval term="host::twiki \"*kickoff*\" | where
twikipage=\".twikipage.\""' ]
dashboard_customList_Twiki_activity_last_7_days_labels=Insomnia,Profile updates,Edited pages with
'kickoff' in the title. (replace kickoff with anything you want to keep an eye on)
dashboard_customList_TwikiIntro_text = \
With this application enabled, you'll get \
<ul> \
<li>some extracted fields like twikiuser, twikipage, twikiaction</li> \
<li>some event types, like twikiViews, twikiEdits, twikiUploads</li> \
<li>some field actions, some that go to the live twiki, some that launch 'show source' style viewers within Splunk </li> \
<li>Some shared dashboard charts, as you see here</li> \
<li>Some custom 'blue link' modules that show various useful little searches and breakdowns</li> \
<li>Also there's a <a href="http://spacecake:28000/?s=Twiki%20-%20template%20for%20Twiki%20homepage%20by%20hour%20of%20day"
target="_top">Form Search</a> template for viewing distribution of classes of events split by hour of the day. </li> \
</ul>
props.conf
props.conf
props.conf controls what parameters apply to events during indexing based on settings tied to
each event's source, host, or sourcetype.
Note: You can only use wildcards for host or source. Use ... for paths and * for files.
props.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# The following are example props.conf configurations. Configure properties for your data.
#
# To use one or more of these configurations, copy the configuration block into
# props.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
########
# Line merging settings
########
# The following example linemerges source data into multi-line events for apache_error sourcetype.
[apache_error]
SHOULD_LINEMERGE = True
########
# Settings for tuning
########
# The following example limits the amount of characters indexed per event from host::small_events.
[host::small_events]
TRUNCATE = 256
# The following example turns off DATETIME_CONFIG (which can speed up indexing) from any path
# that ends in /mylogs/*.log.
[source::.../mylogs/*.log]
DATETIME_CONFIG = NONE
########
# Timestamp extraction configuration
########
# The following example sets Eastern Time Zone if host matches nyc*.
[host::nyc*]
TZ = US/Eastern
# The following example uses a custom datetime.xml that has been created and placed in a custom app
# directory. This sets all events coming in from hosts starting with dharma to use this custom file.
[host::dharma*]
DATETIME_CONFIG = <etc/apps/custom_time/datetime.xml>
########
# Transform configuration
########
# The following example creates a search field for host::foo if tied to a stanza in transforms.conf.
[host::foo]
TRANSFORMS-foo=foobar
# The following example creates an extracted field for sourcetype access_combined
# if tied to a stanza in transforms.conf.
[eventtype::my_custom_eventtype]
REPORT-baz = foobaz
########
# Sourcetype configuration
########
# The following example sets a sourcetype for the file web_access.log.
[source::.../web_access.log]
sourcetype = splunk_web_access
# The following example uncompresses gzip-archived syslog events.
[syslog]
invalid_cause = archive
unarchive_cmd = gzip -cd -
# The following example learns a custom sourcetype and limits the range between different examples
# with a smaller than default maxDist.
[custom_sourcetype]
LEARN_MODEL = true
maxDist = 30
# rule:: and delayedrule:: configuration
# The following examples create sourcetype rules for custom sourcetypes with regex.
[rule::bar_some]
sourcetype = source_with_lots_of_bars
MORE_THAN_80 = ----
[delayedrule::baz_some]
sourcetype = my_sourcetype
LESS_THAN_70 = ####
########
# File configuration
########
# Binary file configuration
# The following example eats binary files from any file that matches source::.../mylogs/*.log.
[source::.../mylogs/*.log]
NO_BINARY_CHECK = true
# File checksum configuration
# The following example checks the entirety of every file in the web_access dir rather than
# skipping files that appear to be the same.
[source::.../web_access/*]
CHECK_METHOD = entire_md5
props.conf.spec
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring Splunk's processing properties
# via props.conf.
#
# There is a props.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a props.conf in $SPLUNK_HOME/etc/system/local/. For help, see
# props.conf.example.
# You can enable configurations changes made to props.conf by typing the following search string
# in Splunk Web:
#
# | extract reload=T
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<spec>]
* This stanza enables properties for a given <spec>.
* A props.conf file can contain multiple stanzas for any number of different <spec>.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not set an attribute for a given <spec>, the default is used.
<spec> can be:
1. <sourcetype>, the sourcetype of an event.
2. host::<host>, where <host> is the host for an event.
3. source::<source>, where <source> is the source for an event.
4. rule::<rulename>, where <rulename> is a unique name of a sourcetype classification rule.
5. delayedrule::<rulename>, where <rulename> is a unique name of a delayed sourcetype classification rule.
These are only considered as a last resort before generating a new sourcetype based on the source seen.
Precedence:
For settings that are specified in multiple categories of matching stanzas,
host:: spec settings override sourcetype:: spec settings.
Additionally, source:: spec settings override both host:: and
sourcetype:: settings.
For a given event, only one [sourcetype], one [source::...] and one [host::...]
stanza will apply. When multiple stanzas might match, which is applied is not
defined.
NOTE: When setting a <spec> (EXCEPT sourcetype), you can use the following regex-type syntax:
... = recurses through directories until the match is met.
* = matches anything but / 0 or more times.
| = or
( ) = used to limit scope of |.
Example: [source::....(?<!tar.)(gz|tgz)]
Match language:
These match expressions must match the entire key value, not just a substring.
For those familiar with regular expressions, these are a full implementation
(PCRE) with the translation of ..., * and .
Thus . matches a period, * matches non-directory separators, and ... matches any number of any characters.
For more information see the wildcards section at:
http://www.splunk.com/base/Documentation/latest/Admin/FilesAndDirectories#Inputs.conf
#******************************************************************************
# The possible attributes/value pairs for props.conf, and their default values, are:
#******************************************************************************
# International characters
CHARSET = <string>
* When set, Splunk assumes the input from the given <spec> is in the specified encoding.
* A list of valid encodings can be retrieved using the command "iconv -l" on most *nix systems.
* If an invalid encoding is specified, a warning is logged during initial configuration and further input from that <spec> is discarded.
* If the source encoding is valid, but some characters from the <spec> are not valid in the specified encoding, then the characters are escaped as hex (e.g. "\xF3").
* When set to "AUTO", Splunk attempts to automatically determine the character encoding and convert text from that encoding to UTF-8.
* For a complete list of the character sets Splunk automatically detects, see the online documentation.
* Defaults to AUTO on Windows and UTF-8 on non-Windows OSes.
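# For example, a sketch that declares a Shift-JIS encoding for hosts whose names
# begin with "tokyo"; the host pattern is illustrative, and the encoding name must
# be one reported by "iconv -l".
[host::tokyo*]
CHARSET = SHIFT-JIS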
#******************************************************************************
# Line breaking
#******************************************************************************
# Use the following attributes to define the length of events.
TRUNCATE = <non-negative integer>
* Change the default maximum line length.
* Set to 0 if you do not want truncation ever (very long lines are, however, often a sign of garbage data).
* Defaults to 10000.
LINE_BREAKER = <regular expression>
* If not set, data is broken into an event for each line, delimited by \r or \n.
* The contents of the first matching group do not occur in either the previous or next events.
* NOTE: There is a significant speed boost by using the LINE_BREAKER to delimit multiline events rather than using line merging to reassemble individual lines into events.
* If set, the given regex is used to break the raw stream into events.
* Wherever the regex matches, the start of the first match is considered the start of the next event.
* The regex must contain a matching group.
* Defaults to ([\r\n]+).
LINE_BREAKER_LOOKBEHIND = <integer>
* Change the default lookbehind for the regex based linebreaker.
* When there is leftover data from a previous raw chunk, this is how far before the end of the raw chunk (with the next chunk concatenated) Splunk begins applying the regex.
* Defaults to 100.
# Use the following attribute to define multi-line events with additional attributes and values.
SHOULD_LINEMERGE = true | false
* When set to true, Splunk combines several lines of data into a single event, based on the following configuration attributes.
* Defaults to true.
# When SHOULD_LINEMERGE = True, use the following attributes to define the multi-line events.
AUTO_LINEMERGE = true | false
* Directs Splunk to use automatic learning methods to determine where to break lines in events.
* Defaults to true.
BREAK_ONLY_BEFORE_DATE = true | false
* When set to true, Splunk creates a new event if and only if it encounters a new line with a date.
* Defaults to false.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event if and only if it encounters a new line that matches the regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set, and the regular expression matches the current line, Splunk creates a new event for the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
* When set and the current line matches the regular expression, Splunk does not break on any subsequent lines until the MUST_BREAK_AFTER expression matches.
* Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
* When set and the current line matches the regular expression, Splunk does not break the last event before the current line.
* Defaults to empty.
MAX_EVENTS = <integer>
* Specifies the maximum number of input lines to add to any event.
* Splunk breaks after the specified number of lines are read.
* Defaults to 256.
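An illustrative multi-line stanza (the sourcetype name and regex below are hypothetical): events start only on lines that begin with an ISO-style date, and an event may grow to at most 500 lines.
[my_multiline_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
MAX_EVENTS = 500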
#******************************************************************************
# Timestamp extraction configuration
#******************************************************************************
DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
* Specifies which file configures the timestamp extractor.
* This configuration may also be set to "NONE" to prevent the timestamp extractor from running or "CURRENT" to assign the current system time to each event.
* Defaults to /etc/datetime.xml (eg $SPLUNK_HOME/etc/datetime.xml).
MAX_TIMESTAMP_LOOKAHEAD = <integer>
* Specifies how far (in characters) into an event Splunk should look for a timestamp.
* Defaults to 150.
TIME_PREFIX = <regular expression>
* Specifies the necessary condition for timestamp extraction.
* The timestamping algorithm only looks for a timestamp after the first regex match.
* Defaults to empty.
TIME_FORMAT = <strptime-style format>
* Specifies a strptime format string to extract the date.
* This method of date extraction does not support in-event timezones.
* TIME_FORMAT starts reading after the TIME_PREFIX.
* The <strptime-style format> must contain the hour, minute, month, and day.
* Defaults to empty.
* NOTE: If you use TIME_FORMAT Splunk assumes your strptime is correctly formatted, and all MAX* settings (below) are ignored.
TZ = <timezone identifier>
* The algorithm for determining the time zone for a particular event is as follows:
* If the event has a timezone in its raw text (e.g., UTC, -08:00), use that.
* If TZ is set to a valid timezone string, use that.
* Otherwise, use the timezone of the system that is running splunkd.
* Defaults to empty.
MAX_DAYS_AGO = <integer>
* Specifies the maximum number of days past, from the current date, for an extracted date to be valid.
* If set to 10, for example, Splunk ignores dates that are older than 10 days ago.
* Defaults to 2000.
* IMPORTANT: If your data is older than 2000 days, change this setting.
MAX_DAYS_HENCE = <integer>
* Specifies the maximum number of days in the future, from the current date, for an extracted date to be valid.
* If set to 3, for example, dates that are more than 3 days in the future are ignored.
* False positives are less likely with a tighter window.
* The default value includes dates from one day in the future.
* If your servers have the wrong date set or are in a timezone that is one day ahead, increase this value to at least 3.
* Defaults to 2.
MAX_DIFF_SECS_AGO = <integer>
* If the event's timestamp is more than <integer> seconds BEFORE the previous timestamp, only accept it if it has the same exact time format as the majority of timestamps from the source.
* IMPORTANT: If your timestamps are wildly out of order, consider increasing this value.
* Defaults to 3600 (one hour).
MAX_DIFF_SECS_HENCE = <integer>
* If the event's timestamp is more than <integer> seconds AFTER the previous timestamp, only accept it if it has the same exact time format as the majority of timestamps from the source.
* IMPORTANT: If your timestamps are wildly out of order, or you have logs that are written less than once a week, consider increasing this value.
* Defaults to 604800 (one week).
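To illustrate how these attributes combine, here is a hypothetical stanza for events whose timestamps look like ts=2009-03-19 06:37:00 (the sourcetype name, prefix, lookahead, and timezone below are examples only):
[my_custom_log]
TIME_PREFIX = ts=
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 40
TZ = US/Pacific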
#******************************************************************************
# Transform configuration
#******************************************************************************
# Use the TRANSFORMS class to create indexed fields. Use the REPORT class to create extracted fields.
# Please note that extracted fields are recommended as best practice.
# Note: Indexed fields have performance implications and are only recommended in specific circumstances.
# You may want to use indexed fields if you search for expressions like foo!="bar" or NOT foo="bar" and the field foo nearly always takes on the value bar.
# Another common reason to use indexed fields is if the value of the field exists outside of the field more often than not.
# For example, if you commonly search for foo="1", but 1 occurs in many events that do not have foo="1", you may want to index foo.
# For more information, see documentation at: http://www.splunk.com/doc/latest/admin/ExtractFields
# For examples, see props.conf.spec and transforms.conf.spec.
Precedence rules for classes:
* For each class, Splunk takes the configuration from the highest precedence configuration block (see precedence rules at the beginning of this file).
* If a particular class is specified for a source and a sourcetype, the class for source wins out.
* Similarly, if a particular class is specified in ../local/ for a <spec>, it overrides that class in ../default/.
TRANSFORMS-<value> = <unique_stanza_name>
* <unique_stanza_name> is the name of your stanza from transforms.conf.
* <value> is any value you want to give to your stanza to identify its name-space.
* Transforms are applied in the specified order.
* If you need to change the order, control it by rearranging the list.
REPORT-<value> = <unique_stanza_name>
* <unique_stanza_name> is the name of your stanza from transforms.conf.
* <value> is any value you want to give to your stanza to identify its name-space.
* Transforms are applied in the specified order.
* If you need to change the order, control it by rearranging the list.
KV_MODE = none | auto | multi
* Specifies the key/value extraction mode for the data.
* Set KV_MODE to one of the following:
* none: if you want no key/value extraction to take place.
* auto: extracts key/value pairs separated by equal signs.
* multi: invokes multikv to expand a tabular event into multiple events.
* Defaults to auto.
CHECK_FOR_HEADER = true | false
* Set to true to enable header-based field extraction for a file.
* If the file has a list of columns and each event contains a field value (without field name), Splunk picks a suitable header line to use for extracting field names.
* Defaults to false.
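For illustration, a hypothetical stanza that applies one transforms.conf stanza at index time, another at search time, and disables automatic key/value extraction (the stanza names below are placeholders you would define in transforms.conf):
[source::/var/log/custom/app.log]
TRANSFORMS-netfields = my_indexed_field_stanza
REPORT-appfields = my_extracted_field_stanza
KV_MODE = none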
#******************************************************************************
# Segmentation configuration
#******************************************************************************
SEGMENTATION = <string>
* Specifies the segmenter from segmenters.conf to use at index time.
* Set segmentation for any of the <spec> outlined at the top of this file.
SEGMENTATION-<segment selection> = <string>
* Specifies that Splunk Web should use a specific segmenter (from segmenters.conf) for the given <segment selection> choice.
* Default <segment selection> choices are: all, inner, outer, none.
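An illustrative stanza (it assumes segmenters named "inner" and "full" are defined in your segmenters.conf; adjust the names to match your own configuration):
[syslog]
SEGMENTATION = inner
SEGMENTATION-all = full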
#******************************************************************************
# Binary file configuration
#******************************************************************************
NO_BINARY_CHECK = true | false
* Can only be set for a [<source>::...] stanza.
* When set to true, Splunk processes binary files.
* By default, binary files are ignored.
* Defaults to false.
#******************************************************************************
# File checksum configuration
#******************************************************************************
CHECK_METHOD = entire_md5 | modtime
* By default, if the checksums of the first and last 256 bytes of a file match existing stored checksums, Splunk lists the file as already indexed and thus ignores it.
* Set this to "entire_md5" to use the checksum of the entire file.
* Alternatively, set this to "modtime" to check only the modification time of the file.
* Defaults to endpoint_md5.
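A small illustrative stanza (the path is hypothetical) that indexes binary files from one source and checksums the entire file rather than only its first and last 256 bytes:
[source::/opt/exports/nightly/*]
NO_BINARY_CHECK = true
CHECK_METHOD = entire_md5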
#******************************************************************************
# Sourcetype configuration
#******************************************************************************
sourcetype = <string>
* Can only be set for a [<source>::...] stanza.
* Anything from that <source> is assigned the specified sourcetype.
* Defaults to empty.
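For example (the path and sourcetype name below are hypothetical), to force everything read from one directory tree to a single sourcetype:
[source::/var/log/myapp/...]
sourcetype = myapp_log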
# The following attribute/value pairs can only be set for a stanza that begins with [<sourcetype>]:
invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* Splunk does not index any data with invalid_cause set.
* Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
* Set to any other string to log an error to splunkd.log if you are running Splunklogger in debug mode.
* Defaults to empty.
is_valid = true | false
* Automatically set by invalid_cause.
* DO NOT SET THIS.
* Defaults to true.
unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on stdout.
* DOES NOT WORK ON BATCH PROCESSED FILES. Use preprocessing_script.
* Defaults to empty.
LEARN_MODEL = true | false
* For known sourcetypes, the fileclassifier adds a model file to the learned directory.
* To disable this behavior for diverse sourcetypes (such as sourcecode, where there is no good exemplar to make a sourcetype) set LEARN_MODEL = false.
* Defaults to empty.
maxDist = <integer>
* Determines how different a sourcetype model may be from the current file.
* The larger the value, the more forgiving.
* For example, if the value is very small (e.g., 10), then files of the specified sourcetype should not vary much.
* A larger value indicates that files of the given sourcetype vary quite a bit.
* Defaults to 300.
# rule:: and delayedrule:: configuration
MORE_THAN<optional_unique_value>_<number> = <regular expression> (empty)
LESS_THAN<optional_unique_value>_<number> = <regular expression> (empty)
An example:
[rule::bar_some]
sourcetype = source_with_lots_of_bars
# if more than 80% of lines have "----", but fewer than 70% have "####"
# declare this a "source_with_lots_of_bars"
MORE_THAN_80 = ----
LESS_THAN_70 = ####
A rule can have many MORE_THAN and LESS_THAN patterns, and all are required for the rule to match.
#******************************************************************************
# Internal settings
#******************************************************************************
# NOT YOURS. DO NOT SET.
_actions = <string> ("new,edit,delete")
* Internal field used for user-interface control of objects.
* Defaults to "new,edit,delete".
pulldown_type = <bool>
* Internal field used for user-interface control of sourcetypes.
* Defaults to empty.
regmon-filters.conf
regmon-filters.conf
regmon-filters.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<stanza name>]
* Name of the filter to apply to a particular registry monitor.
proc = <string>
* Regex describing a certain process image that you want to filter on.
hive = <string>
* Regex describing a certain registry key path that you want to filter on.
type = <string>
* Regex describing a certain type of registry event that you want to filter on.
baseline = <int 0|1>
* Whether or not to establish a baseline for the keys that this filter monitors.
regmon-filters.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains example registry monitor filters. To create your own filter, use
# the information in regmon-filters.conf.spec.
#
# To use one or more of these configurations, copy the configuration block into
# regmon-filters.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[default]
baseline = 1
baseline_interval = 86400
[reg-filter-1]
proc = \\Device\\HarddiskVolume2\\Windows\\.*
hive = \\REGISTRY\\USER\\.*
type = set|create|delete|rename
restmap.conf
restmap.conf
Set new endpoints via restmap.conf.
restmap.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.2
#
# This file contains possible attribute and value pairs for creating new rest endpoints.
#
# There is a restmap.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a restmap.conf in $SPLUNK_HOME/etc/system/local/. For help, see
# restmap.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# NOTE: You must register every REST endpoint via this file to make it available.
###########################
# Global stanza
[global]
* This stanza sets global configurations for all REST endpoints.
* Follow this stanza name with any number of the following attribute/value pairs.
allowGetAuth=<true | false>
* Allow user/password to be passed as a GET parameter to endpoint services/auth/login.
* Setting this to true, while convenient, may result in user/password getting
logged as cleartext in Splunk's logs *and* any proxy servers in between.
* Defaults to false.
pythonHandlerPath=<path>
* Path to 'main' python script handler.
* Used by the script handler to determine where the actual 'main' script is located.
* Typically, you should not need to change this.
* Defaults to $SPLUNK_HOME/bin/rest_handler.py.
###########################
# Per-endpoint stanza
# Specify a handler and other handler-specific settings.
# The handler is responsible for implementing an arbitrary namespace underneath each REST endpoint.
[script:<uniqueName>]
* NOTE: The uniqueName must be different for each handler.
* Call the specified handler when executing this endpoint.
* The following attribute/value pairs support the script handler.
scripttype=python
* Tell the system what type of script to execute when using this endpoint.
* Defaults to python.
* Python is currently the only option for scripttype.
handler=<SCRIPT>.<CLASSNAME>
* The name and class name of the file to execute.
* The file *must* live in an application's ../rest/ subdirectory.
* For example, $SPLUNK_HOME/etc/apps/<APPNAME>/default/rest/TestHandler.py
has a class called MyHandler (which, in the case of python must be derived from a base class called
'splunk.rest.BaseRestHandler'). The tag/value pair for this is: "handler=TestHandler.MyHandler".
match=<path>
* Specify the URI that calls the handler.
* For example if match=/foo, then https://$SERVER:$PORT/services/foo calls this handler.
* NOTE: You must start your path with a /.
requireAuthentication=<true | false>
* This OPTIONAL tag determines if this endpoint requires authentication or not.
* It defaults to 'true'.
capability=<capabilityName>
capability.<post|delete|get|put>=<capabilityName>
* Depending on the HTTP method, check capabilities on the authenticated session user.
* If you use 'capability.post|delete|get|put,' then the associated method is checked
against the authenticated user's role.
* If you just use 'capability,' then all calls get checked against this capability (regardless
of the HTTP method).
xsl=<path to XSL transform file>
* Optional.
* Perform an optional XSL transform on data returned from the handler.
* Only use this if the data is XML.
script=<path to a script executable>
* Optional.
* Execute a script which is *not* derived from 'splunk.rest.BaseRestHandler'.
* Put the path to that script here.
* This is rarely used.
* Do not use this unless you know what you are doing.
restmap.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains example REST endpoint configurations.
#
# To use one or more of these configurations, copy the configuration block into
# restmap.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# The following are sample rest endpoint configurations. To create your own endpoints, modify
# the values by following the spec outlined in restmap.conf.spec.
############################
# Script-specific Settings
############################
# This example sets up endpoints for the globe script (see Splunk Developer documentation for more details).
[script:globe]
match = /globe/main
handler = handlers.globe
requireAuthentication = false
[script:iploc]
match = /globe/iploc
handler = handlers.iploc
requireAuthentication = false
[script:jquery]
match = /globe/jquery.js
handler = handlers.jquery
requireAuthentication = false
[script:lastips]
match = /globe/lastips
handler = handlers.lastips
requireAuthentication = false
savedsearches.conf
savedsearches.conf
savedsearches.conf stores saved searches and their associated schedules and alerts. Use this
file to:
Configure saved searches
Configure alerts
savedsearches.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for saved search entries in savedsearches.conf.
# You can configure saved searches by creating your own savedsearches.conf.
#
# There is a default savedsearches.conf in $SPLUNK_HOME/etc/system/default. To set custom
# configurations, place a savedsearches.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see savedsearches.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#******************************************************************************
# The possible attribute/value pairs for savedsearches.conf are:
#******************************************************************************
[<stanza name>]
* Name of the saved search stanza.
* Follow this stanza name with any number of the following attribute/value pairs.
disabled = <0 | 1>
* Tag for entire search.
* Search will not be visible if set to 1.
* Defaults to 0.
search = <string>
* Actual search terms of the saved search.
* For example index::sampledata http NOT 500.
* Your search can include macro searches for substitution.
* To create a macro search, read the documentation at:
http://www.splunk.com/doc/latest/admin/MacroSearch
userid = <integer>
* UserId of the user who created this saved search.
Splunk needs this information to log who ran the search, and create editing capabilities in Splunk Web.
* Possible values: Any Splunk user ID.
* User IDs are found in $SPLUNK_HOME/etc/passwd.
Look for the first number on each line, right before the username.
For example 2:penelope....
role = <string>
* Role (from authorize.conf) that this saved search is shared with.
* Anyone that is a member of that role will see the saved search in their dashboard.
* To share with everyone, set to Everybody.
#******************************************************************************
# Scheduling options
#******************************************************************************
enableSched = <0 | 1>
* Set this to 1 to enable the schedule for this search.
* Defaults to 0.
counttype = <string>
* Set the type of count for alerting.
* Possible values: number of events, number of hosts, number of sources, number of sourcetypes.
relation = <string>
* How to compare against counttype.
* Possible values: greater than, less than, equal to, drops by, rises by.
quantity = <integer>
* Number to compare against the given counttype.
schedule = <string>
* Cron style schedule (i.e. */12 * * * *).
sendresults = <integer>
* Whether or not to send the results along with the email/shell script.
* Possible values: 1/0 (1 to send, 0 to disable).
execDelay = <integer>
* Amount of time (in seconds) from most recent event to the execution of the scheduled search.
* Defaults to 0.
maxresults = <integer>
* The maximum number of results the entire search pipeline can generate.
* NOTE: This is different from the deprecated search command "maxresults" and the maxresults setting in prefs.conf.
* General guidelines: use 10000 for 32-bit machines and 50000 for 64-bit machines.
* Defaults to 10000.
action_script = <string>
* Your search can trigger a shell script.
* Specify the name of the shell script to run.
* Place the script in $SPLUNK_HOME/bin/scripts.
* Command line arguments passed to the script are:
* $0 = script name.
* $1 = number of events returned.
* $2 = search terms.
* $3 = fully qualified query string.
* $4 = name of saved splunk.
* $5 = trigger reason (i.e. "The number of events was greater than 1").
* $6 = link to saved search.
* $7 = a list of tags belonging to this saved search.
* $8 = file where the results for this search are stored (contains raw results).
Note: If there are no saved tags, $7 becomes the name of the file containing the search results ($8).
action_rss = <integer>
* Toggle whether or not to create an RSS link.
* Possible values: 1/0 (1 to create, 0 to disable).
action_email = <string>
* Comma delimited list of email addresses to send alerts to.
nextrun = <integer>
* NOTE: This attribute is automatically set. DO NOT SET.
#******************************************************************************
# Summary index settings
#******************************************************************************
action.summary_index = <1 | 0>
* Toggle whether or not the summary index is enabled.
* 1 to enable, 0 to disable.
* Defaults to 0.
action.summary_index._name = <string>
* The summary index where the results of the scheduled search are saved.
* Defaults to summary.
action.summary_index.<$KEY> = <string>
* Optional $KEY = <string> to add to each event when saving it in the summary index.
#******************************************************************************
# Search execution http settings
#******************************************************************************
http_read_timeout = <int>
http_write_timeout = <int>
http_conn_timeout = <int>
* read/write/connect timeout (seconds) for the HTTP connection (to splunkd)
used to execute the scheduled search and any of its actions/alerts
#******************************************************************************
# Viewstate settings
#******************************************************************************
viewstate.resultView = reportView
* The UI state for a saved search.
* Can be either normalView or reportView.
* normalView returns the SplunkWeb search interface.
* reportView returns the report interface.
viewstate.chart.plotMode = column
* Set the plot mode for a chart returned by a saved search.
* Only valid when viewstate.resultView == reportView
* Possible values: area, axis, bubble, column, donut, heatmap, legend, line, pie, scatter,
stackedarea, stackedcolumn.
viewstate.prefs.selectedKeys = source host sourcetype
* Space-delimited list of fields to use.
* Always auto-generated, but can be edited after the fact to include new fields.
#******************************************************************************
# The following are flash chart formatting options that are auto-generated.
# DO NOT EDIT.
viewstate.chart.formatting.dateTimeFormat = %m/%d/%Y %H:%M:%S
viewstate.chart.formatting.height = 300
viewstate.chart.formatting.padding.bottom = 10
viewstate.chart.formatting.padding.left = 0
viewstate.chart.formatting.padding.right = 0
viewstate.chart.formatting.padding.top = 20
viewstate.chart.formatting.textColor = 3355443
viewstate.chart.formatting.width = 852
savedsearches.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains example saved searches and alerts.
#
# To use one or more of these configurations, copy the configuration block into
# savedsearches.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# The following searches are example searches. To create your own search, modify
# the values by following the spec outlined in savedsearches.conf.spec.
[Invalid 3months notshared db test2]
action_rss = 0
search = * Invalid startmonthsago=3
schedule = */60 * * * *
sendresults = 0
userid = 1
viewstate.prefs.selectedKeys = source host sourcetype
[bus error 15min email notshared db test5 ]
action_email = my_email@splunk.com
action_rss = 0
counttype = number of hosts
quantity = 5
search = * error Bus startminutesago=15
relation = greater than
schedule = */12 * * * *
sendresults = 1
userid = 1
viewstate.prefs.selectedKeys = source host sourcetype
[kCGError 3months shared db test1]
action_rss = 0
search = * kCGErrorIllegalArgument startmonthsago=3
role = Everybody
schedule = */60 * * * *
sendresults = 0
userid = 1
viewstate.prefs.selectedKeys = source host sourcetype
[normal shutdown 1month shareda nodb scheduled gt3 midnight test3]
action_rss = 0
counttype = number of events
enableSched = 1
quantity = 3
search = * Scheduler shutting down normally startmonthsago=1
relation = greater than
role = Admin
schedule = 0 0 * * *
sendresults = 0
userid = 1
viewstate.prefs.selectedKeys = source host sourcetype
[syslog not responding 15min shared rss]
action_rss = 1
counttype = always
search = * sourcetype="syslog" not responding startminutesago=15
role = Everybody
schedule = */12 * * * *
sendresults = 0
userid = 1
viewstate.prefs.selectedKeys = source host sourcetype
### Scripted searches
# The following search calls a script and sends an RSS feed. It runs every minute, Monday through
# Friday and alerts (eg sends RSS and triggers the script splunk.sh) every time the count of events
# returned by the search rises by 100.
[splunk_script]
search = eventtype = attack OR eventtype = deny
action_script = splunk.sh
action_rss = 1
counttype = number of events
relation = rises by
quantity = 100
schedule = */60 * * * 1-5
sendresults = 1
isGlobal = 0
viewstate.prefs.selectedKeys = source host sourcetype
viewstate.resultView = normalView
segmenters.conf
segmenters.conf
segmenters.conf defines schemes for how events will be tokenized in Splunk's index. These
schemes are applied to events from particular sources, hosts or sourcetypes via props.conf.
segmenters.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# The following are examples of segmentation configurations.
#
# To use one or more of these configurations, copy the configuration block into
# segmenters.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# Example of a segmenter that doesn't index the date as segments in syslog data:
[syslog]
FILTER = ^.*?\d\d:\d\d:\d\d\s+\S+\s+(.*)$
# Example of a segmenter that only indexes the first 256b of events:
[limited-reach]
LOOKAHEAD = 256
# Example of a segmenter that only indexes the first line of an event:
[first-line]
FILTER = ^(.*?)(\n|$)
# Turn segmentation off completely:
[no-segmentation]
LOOKAHEAD = 0
segmenters.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring segmentation of events in
# segmenters.conf.
#
# There is a default segmenters.conf in $SPLUNK_HOME/etc/system/default. To set custom
# configurations, place a segmenters.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see segmenters.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[SegmenterName]
* Name your stanza.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you don't specify an attribute/value pair, Splunk will use the default.
MAJOR = <space separated list of breaking characters>
* Set major breakers.
* Major breakers are words, phrases or terms in your data that are surrounded by set breaking characters.
* By default, major breakers are set to most characters and blank spaces.
* Typically, major breakers are single characters.
* Default is [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520 %5D %5B %3A %0A %2C %28 %29
* Please note: \s represents a space; \n, a newline; \r, a carriage return; and \t, a tab.
MINOR = <space separated list of strings>
* Set minor breakers.
* In addition to the segments specified by the major breakers, for each minor breaker found,
Splunk indexes the token from the last major breaker to the current minor breaker and
from the last minor breaker to the current minor breaker.
* Default is / : = @ . - $ # % \\ _
FILTER = <regular expression>
* If set, segmentation will only take place if the regular expression matches.
* Furthermore, segmentation will only take place on the first group of the matching regex.
* Default is empty.
LOOKAHEAD = <integer>
* Set how far into a given event (in characters) Splunk segments.
* LOOKAHEAD is applied after any FILTER rules.
* To disable segmentation, set to 0.
* Defaults to -1 (read the whole event).
MINOR_LEN = <integer>
* Specify how long a minor token can be.
* Longer minor tokens are discarded without prejudice.
* Defaults to -1.
MAJOR_LEN = <integer>
* Specify how long a major token can be.
* Longer major tokens are discarded without prejudice.
* Defaults to -1.
MINOR_COUNT = <integer>
* Specify how many minor segments to create per event.
* After the specified number of minor tokens have been created, later ones are
discarded without prejudice.
* Defaults to -1.
MAJOR_COUNT = <integer>
* Specify how many major segments are created per event.
* After the specified number of major segments have been created, later ones are
discarded without prejudice.
* Defaults to -1.
server.conf
server.conf
server.conf controls SSL and HTTP settings for your Splunk server.
server.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure SSL and HTTP server options
# in server.conf.
#
# There is a server.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a server.conf in $SPLUNK_HOME/etc/system/local/. For examples, see server.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# This file contains options for controlling the server configuration.
# The only options currently available control the SSL
# configuration of the server.
##########################################################################################
# SSL Configuration details
##########################################################################################
[sslConfig]
* Set SSL for communications on Splunk's back-end under this stanza name.
* NOTE: To set SSL (eg HTTPS) for Splunk Web and the browser, use web.conf.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk will use the default value.
enableSplunkdSSL = <true | false>
* Enables/disables SSL on the splunkd management port (8089).
* Defaults to true.
useClientSSLCompression = <true | false>
* Turns on HTTP client compression.
* Server-side compression is turned on by default; setting this on the client side enables
compression between server and client.
* Enabling this potentially gives you much faster distributed searches across multiple
Splunk instances.
* Defaults to false.
keyfile = <filename>
* Server certificate and key file.
* Certificates are auto-generated by splunkd upon starting Splunk.
* You may replace the default cert with your own PEM format file.
* Certs are stored in caPath (see below).
* Defaults to server.pem.
keyfilePassword = <password>
* Server certificate password.
* Defaults to password.
caCertFile = <filename>
* Public key of the signing authority.
* Defaults to cacert.pem.
caPath = <path>
* Path where all these certs are stored.
* Defaults to $SPLUNK_HOME/etc/auth.
certCreateScript = <script name>
* Script for generating certs on Splunk startup.
* Defaults to genSignedServerCert.py.
##########################################################################################
# Splunkd HTTP server configuration
##########################################################################################
[httpServer]
* Set stand-alone HTTP settings for Splunk under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
atomFeedStylesheet = <string>
* Defines the stylesheet relative URL to apply to default Atom feeds.
* Set to 'none' to stop writing out xsl-stylesheet directive.
* Defaults to /static/atom.xsl.
max-age = <int>
* Set the maximum time (in seconds) to cache a static asset served off of the '/static' directory.
* This value is passed along in the 'Cache-Control' HTTP header.
* Defaults to 3600.
follow-symlinks = <true|false>
* Toggle whether the static file handler (serving the '/static' directory) follows filesystem
symlinks when serving files.
* Defaults to false.
##########################################################################################
# Static file handler MIME-type map
[mimetype-extension-map]
* Map filename extensions to MIME type for files served from the static file handler under
this stanza name.
<file-extension> = <MIME-type>
* Instructs the HTTP static file server to mark any files ending in 'file-extension'
with a header of 'Content-Type: <MIME-type>'.
* Defaults to:
[mimetype-extension-map]
gif = image/gif
htm = text/html
jpg = image/jpg
png = image/png
txt = text/plain
xml = text/xml
xsl = text/xml
##########################################################################################
# Remote applications configuration (e.g. SplunkBase)
##########################################################################################
[applicationsManagement]
* Set remote applications settings for Splunk under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
url = <URL>
* Applications repository.
* Defaults to http://www.splunkbase.com/api/apps
loginUrl = <URL>
* Applications repository login.
* Defaults to http://www.splunkbase.com/api/account:login/
useragent = <splunk-version>-<splunk-build-num>-<platform>
* User-agent string to use when contacting applications repository.
* <platform> includes information like operating system and CPU architecture.
server.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example server.conf. Use this file to configure SSL and HTTP server options.
#
# To use one or more of these configurations, copy the configuration block into
# server.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[sslConfig]
enableSplunkdSSL = true
useClientSSLCompression = true
keyfile = MYserver.pem
keyfilePassword = password
caCertFile = MYcacert.pem
caPath = $SPLUNK_HOME/etc/apps/MYAPP/auth/
certCreateScript = MYgenSignedServerCert.py
setup.conf
setup.conf
Each Splunk application can have a setup.conf file to specify how it interacts with Splunk and with
other Splunk applications. For each application, setup.conf is located in
$SPLUNK_HOME/etc/apps/<application name>/default, where SPLUNK_HOME is the
directory into which you installed Splunk and <application name> is the name of the specific
application. If you want to edit setup.conf for a given application, make your edits in that
application's local directory. This way, they will not be overwritten if you upgrade/update the
application.
Use setup.conf to provide instructions to Splunk about how a particular Splunk application
interacts with Splunk and with other Splunk applications.
setup.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.4
#
# This file contains possible setup options for an individual Splunk application.
# Each Splunk application can have its own setup.conf file.
# The setup.conf file is used when you enable or disable a given application to
# make changes to a Splunk configuration that are beyond those available using
# the normal configuration system.
# In the setup.conf file, you can specify the following:
# * Files to be manipulated. You can specify that a file or files needed by this application
# (images or css files, for example) replace another file or files.
# * Other Splunk applications that are required or not supported when this application is enabled.
# You can specify that an application is explicitly required or just that it be enabled if it
# is available.
# * Scripts to be run when this application is enabled or disabled.
# * Splunk module(s) to be enabled or disabled. This option is rarely used except for in the
# Splunk Light Forwarder configuration.
# You can specify multiple items in each stanza.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# These configurations take effect when you enable or disable the application in question.
# Splunk will then ask you to restart.
#####################
# Files Stanza
#####################
# Use this stanza to put in place specific files local to this application to specified locations
# within a Splunk installation.
[files]
SOURCE_LOCATION = DEST_LOCATION
#####################
# Apps Stanza
#####################
# Use this stanza to specify other Splunk applications that are required or not supported when this
# application is running. The difference between "enabled" and "required" is that "enabled" will
# not produce an error if the specified application is not present.
[apps]
APP_NAME = disabled | enabled | required
#####################
# Scripts Stanza
#####################
# Use this stanza to specify that a particular script or scripts be run when this application
# is enabled or disabled.
[scripts]
enabled = SCRIPT_LOCATION
disabled = SCRIPT_LOCATION
#####################
# Modules Stanza
#####################
# Use this stanza to specify that a particular Splunk module or modules be enabled or disabled when
# this application is enabled. Modules are defined in $SPLUNK_HOME/etc/modules and are where
# Splunk's internal pipelines and processors are defined. Disabling a module should be done with
# care and is not a typical application operation.
[modules]
MODULE_NAME = disabled | enabled
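Putting the stanzas together, a hypothetical setup.conf might look like the following; all file paths, application names, and script locations here are placeholders, not values shipped with Splunk.
[files]
static/mylogo.png = share/splunk/search_mrsparkle/exposed/img/mylogo.png
[apps]
unix = enabled
search = required
[scripts]
enabled = bin/on_enable.sh
disabled = bin/on_disable.sh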
source-classifier.conf
source-classifier.conf
source-classifier.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains all possible options for configuring settings for the file classifier
# in source-classifier.conf.
#
# There is a source-classifier.conf in $SPLUNK_HOME/etc/system/default/. To set custom
# configurations, place a source-classifier.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see source-classifier.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
ignored_model_keywords = <space-separated list of terms>
* Terms to ignore when generating a sourcetype model.
* To prevent sourcetype model files from containing sensitive terms (e.g. "bobslaptop") that
occur very frequently in your data files, add those terms to ignored_model_keywords.
ignored_filename_keywords = <space-separated list of terms>
* Terms to ignore when comparing a new sourcename against a known sourcename, for the purpose of
classifying a source.
source-classifier.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example source-classifier.conf. Use this file to configure classification
# of sources into sourcetypes.
#
# To use one or more of these configurations, copy the configuration block into
# source-classifier.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# terms to ignore when generating sourcetype model to prevent model from containing servernames,
ignored_model_keywords = sun mon tue tues wed thurs fri sat sunday monday tuesday wednesday thursday friday saturday jan feb mar apr may jun jul aug sep oct nov dec january february march april may june july august september october november december 2003 2004 2005 2006 2007 2008 2009 am pm ut utc gmt cet cest cetdst met mest metdst mez mesz eet eest eetdst wet west wetdst msk msd ist jst kst hkt ast adt est edt cst cdt mst mdt pst pdt cast cadt east eadt wast wadt
# terms to ignore when comparing a sourcename against a known sourcename
ignored_filename_keywords = log logs com common event events little main message messages queue server splunk
sourcetypes.conf
sourcetypes.conf
sourcetypes.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.2
#
# NOTE: sourcetypes.conf is a machine-generated file that stores the document models used by the
# file classifier for creating source types.
# Generally, you should not edit sourcetypes.conf, as most attributes are machine generated.
# However, there are two attributes which you can change.
#
# There is a sourcetypes.conf in $SPLUNK_HOME/etc/system/default/. To set custom
# configurations, place a sourcetypes.conf in $SPLUNK_HOME/etc/system/local/.
# For examples, see sourcetypes.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
_sourcetype = <value>
* Specifies the sourcetype for the model.
* Change this to change the model's sourcetype.
* Future sources that match the model will receive a sourcetype of this new name.
_source = <value>
* Specifies the source (filename) for the model.
sourcetypes.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example sourcetypes.conf. Use this file to configure sourcetype models.
#
# NOTE: sourcetypes.conf is a machine-generated file that stores the document models used by the
# file classifier for creating source types.
#
# Generally, you should not edit sourcetypes.conf, as most attributes are machine generated.
# However, there are two attributes which you can change.
#
# To use one or more of these configurations, copy the configuration block into
# sourcetypes.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# This is an example of a machine-generated sourcetype model for a fictitious sourcetype, cadcamlog.
#
[/Users/bob/logs/bnf.x5_Thu_Dec_13_15:59:06_2007_171714722]
_source = /Users/bob/logs/bnf.x5
_sourcetype = cadcamlog
L----------- = 0.096899
L-t<_EQ> = 0.016473
streams.conf
streams.conf
streams.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file controls filters for live tail (a real-time view of data as it's indexed).
# Apply search filters so just the data you are interested in shows up in the live tail interface.
#
# There is a streams.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a streams.conf in $SPLUNK_HOME/etc/system/local/. For examples, see streams.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[stream:<stream name>]
* You may have as many of these stanzas as you wish.
* CAUTION: DO NOT USE THE NAME "livetail" as it is reserved by the system.
filter = <search string>
* Filter your live tail data on a search string.
* This filter is applied to the stream above.
* Currently, these searches CANNOT include piping.
* You can use the following fields (and ONLY the following fields) in your filter:
source, sourcetype, host.
streams.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example streams.conf. Use this file to configure filters for live tail.
#
# To use one or more of these configurations, copy the configuration block into
# streams.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# This example sets up a Live Splunk named apache errors, that is filtered with the search "error
# sourcetype=apache." Customize the name and search string as you see fit.
[stream:apacheerrors]
filter = error sourcetype=apache
strings.conf
strings.conf
strings.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
# This file contains possible attributes and values you can use to configure text strings
# in strings.conf.
#
# There is a strings.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a strings.conf in $SPLUNK_HOME/etc/system/local/. For examples, see strings.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# CAUTION: You can destroy Splunk's performance by editing this file incorrectly. Only edit the
# lowercase strings on the right hand side. DO NOT edit the upper case resource. DO NOT change
# any printf formatting strings.
#########################################################################
# String Resources File
#
# STRING NAMES
#
# Full name of resource is stanza_name + attribute name, so
# StringMgr::str("MAP_USAGE") refers to the string value of the USAGE
# attribute in [MAP] stanza. This allows for short attribute names
# that are unique.
#
# SUBSTITUTION NAMING CONVENTION
#
# After a descriptive name, append two underscores, and then use the
# letters after the % in printf formatting strings, surrounded by
# underscores. For example "MSG_NAME__D_LU_S" would expect 3 args, a
# %d, %lu, and %s.
#
#########################################################################
strings.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
# This file contains possible attributes and values you can use to configure text strings
# in strings.conf.
#
# To use one or more of these configurations, copy the configuration block into strings.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# CAUTION: You can destroy Splunk's performance by editing this file incorrectly. Only edit the
# lowercase strings on the right hand side. DO NOT edit the upper case resource. DO NOT change
# any printf formatting strings.
#########################################################################
# String Resources File
#
# STRING NAMES
#
# Full name of resource is stanza_name + attribute name, so
# StringMgr::str("MAP_USAGE") refers to the string value of the USAGE
# attribute in [MAP] stanza. This allows for short attribute names
# that are unique.
#
# SUBSTITUTION NAMING CONVENTION
#
# After a descriptive name, append two underscores, and then use the
# letters after the % in printf formatting strings, surrounded by
# underscores. For example "MSG_NAME__D_LU_S" would expect 3 args, a
# %d, %lu, and %s.
#
#########################################################################
[MAP]
VALUE_NOT_FOUND__S = Did not find value for required attribute %s
NO_SAVED_SPLUNK__S = Unable to find saved search with name = %s
USAGE = Usage: (search="subsearch" | saved_search_name)
CANNOT_RUN__S = Unable to run query (%s)
sysmon.conf
sysmon.conf
sysmon.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring registry monitoring
# on a Windows system, including global settings for which event types (adds, deletes, renames,
# and so on) to monitor, which regular expression filters from the regmon-filters.conf file to use,
# and whether or not Windows registry events are monitored at all.
# This file is used in conjunction with regmon-filters.conf.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<stanza name>]
* Defaults to [RegistryMonitor]
* Follow this stanza name with the following attribute/value pairs
event_types = <string>
* Regex string specifying the type of events to monitor. Can be delete, set, create, rename, open, close, query.
active_filters = <string>
* Double quoted strings of filter names (defined in regmon-filters.conf) to use.
disabled = <1 or 0>
* 1 to disable, 0 to enable.
sysmon.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains an example configuration for monitoring changes
# to the Windows registry. Refer to sysmon.conf.spec for details.
# The following is an example of a registry monitor filter. To create your own filters, modify
# the values using the information in regmon-filters.conf.spec.
#
# To use one or more of these configurations, copy the configuration block into
# sysmon.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[RegistryMonitor]
event_types = set.*|create.*|delete.*|rename.*
active_filters = "reg-filter-1"
disabled = 0
tags.conf
tags.conf
tags.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring tags. Set any number of tags
# for indexed or extracted fields.
#
# There is no tags.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a tags.conf in $SPLUNK_HOME/etc/system/local/. For help, see tags.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<fieldname>]
* The field name to which the tags in the stanza apply (e.g., host, source, ip).
* A tags.conf file can contain multiple stanzas.
* Each stanza can refer to only one field name.
tag::<value1>::<tag1> = <enabled|disabled>
tag::<value1>::<tag2> = <enabled|disabled>
tag::<value2>::<tag2> = <enabled|disabled>
tag::<value2>::<tag3> = <enabled|disabled>
* Set whether each <tag> for a specific <value> of the field <fieldname> is enabled or disabled.
* <value> is any possible value of field <fieldname>.
* Only one tag is allowed per stanza line.
tags.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example of a tags.conf file. Use this file to create, disable, and delete tags for field values.
# Use this file in tandem with props.conf.
#
# To use one or more of these configurations, copy the configuration block into tags.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations and configuration changes.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#
# This first example presents a situation where the field is "host" and the three hostnames for which tags are being defined
# are "hostswitch," "emailbox," and "devmachine." Each hostname has two tags applied to it, one per line. Note also that
# the "building1" tag has been applied to two hostname values (emailbox and devmachine).
[host]
tag::hostswitch::pci = enabled
tag::hostswitch::cardholder-dest = enabled
tag::emailbox::email = enabled
tag::emailbox::building1 = enabled
tag::devmachine::development = enabled
tag::devmachine::building1 = enabled
[src_ip]
tag::192.168.1.1::firewall = enabled
[seekPtr]
tag::1cb58000::EOF = enabled
tag::1d158000::NOT_EOF = disabled
transactiontypes.conf
transactiontypes.conf
Use transactiontypes.conf to define transactions to use in searches. Each transaction type is
defined by a stanza, where the transaction name is contained in [brackets]. Specify constraints for
each transaction by using the same options that are defined for the transaction search command.
Transactions you define in transactiontypes.conf can be overridden by setting constraints with
transaction while searching.
To learn more about how transaction types work, read the section on transaction types.
transactiontypes.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains all possible attributes and value pairs for a transactiontypes.conf
# file. Use this file to configure transaction searches and their properties.
#
# There is a transactiontypes.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a transactiontypes.conf in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to
# enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<TRANSACTIONTYPE>]
* Create any number of transaction types, each represented by a stanza name and any number of the following attribute/value pairs.
* Use the stanza name, [<TRANSACTIONTYPE>], to search for the transaction in Splunk Web.
* If you do not specify an entry for each of the following attributes, Splunk uses the default value.
maxspan = [<integer> s|m|h|d]
* Set the maximum time span for the transaction.
* Can be in seconds, minutes, hours or days.
* For example: 5s, 6m, 12h or 30d.
* If there is no "pattern" set (below), defaults to 5m. Otherwise, defaults to -1 (unlimited).
maxpause = [<integer> s|m|h|d]
* Set the maximum pause between the events in a transaction.
* Can be in seconds, minutes, hours or days.
* For example: 5s, 6m, 12h or 30d.
* If there is no "pattern" set (below), defaults to 2s. Otherwise, defaults to -1 (unlimited).
maxrepeats = <integer>
* Set the maximum number of repeated event types to match against pattern (see below).
* For example, if maxrepeats is 10, and there are 100 events in a row, all with the same eventtype, only the first and last 10 are matched against pattern.
* A negative value means no limit on repeats, but can possibly cause memory problems.
* Defaults to 10.
fields = <comma-separated list of fields>
* If set, each event must have the same field(s) to be considered part of the same transaction.
* Defaults to "".
exclusive = <true | false>
* Toggle whether events can be in multiple transactions, or 'exclusive' to a single transaction.
* Applies to 'fields' (above).
* For example, if fields=url,cookie, and exclusive=false, then an event with a 'cookie', but not a 'url' value could be in multiple transactions that share the same 'cookie', but have different URLs.
* Setting to 'false' causes the matcher to look for multiple matches for each event and approximately doubles the processing time.
* Defaults to "true".
aliases = <comma-separated list of alias=event_type>
* Define a short-hand alias for an eventtype to be used in pattern (below).
* For example, A=login, B=purchase, C=logout means "A" is equal to eventtype=login, "B" to "purchase", "C" to "logout".
* Defaults to "".
pattern = <regular expression-like pattern>
* Defines the pattern of event types in events making up the transaction.
* Uses aliases to refer to eventtypes.
* For example, "A, B*, C" means this transaction consists of a "login" event, followed by any number of "purchase" events, and followed by a "logout" event.
* You can also specify a group of events to be repeated: "A, (B, C)*, D".
* Defaults to "".
match = closest
* Specify the match type to use.
* Currently, the only value supported is "closest."
* Defaults to "closest."
transactiontypes.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example transactiontypes.conf. Use this file as a template to configure transactions types.
#
# To use one or more of these configurations, copy the configuration block into transactiontypes.conf
# in $SPLUNK_HOME/etc/system/local/.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/doc/latest/admin/BundlesIntro.
[default]
maxspan = 5m
maxpause = 2s
match = closest
[purchase]
aliases = A=login, B=purchase, C=logout
pattern = A, B, C
maxspan = 10m
maxpause = 5m
fields = userid
transforms.conf
transforms.conf
Transforms.conf specifies transformations to apply to events based on regex-based patterns,
including rules for extracting fields or masking event text.
These transformations are applied to events from particular sources, hosts or sourcetypes via
props.conf.
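For example, a props.conf stanza like the following applies a transform named in this file to a given sourcetype at index time. This is only an illustrative sketch: the sourcetype name is hypothetical, and the transform name refers to the [session-anonymizer] stanza shown in transforms.conf.example below.
# props.conf (sourcetype name is illustrative):
[my_sourcetype]
TRANSFORMS-anonymize = session-anonymizer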
transforms.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure transform
# and event signing in transforms.conf.
#
# There is a transforms.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a transforms.conf in $SPLUNK_HOME/etc/system/local/. For examples, see transforms.conf.example.
# You can enable configuration changes made to transforms.conf by typing the following search string
# in Splunk Web:
#
# | extract reload=T
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[<unique_stanza_name>]
* Name your stanza. Use this name when configuring props.conf.
For example, in a props.conf stanza, enter TRANSFORMS-<value> = <unique_stanza_name>.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk uses the default value.
REGEX = <regular expression>
* Enter a regular expression to operate on the data.
* Defaults to empty.
LOOKAHEAD = <integer>
* Specify how many characters to search into an event.
* Defaults to 256.
DEST_KEY = <KEY>
* Specify where to store the results of the REGEX.
* Use the KEYs listed below.
FORMAT = <string>
* Specify the format of the event, including any fields names or values you want to add.
* Use $n (e.g. $1, $2, etc) to specify the output of each REGEX match.
* If the regex does not have n groups, the matching fails.
* The special identifier $0 represents what was in the DEST_KEY before this regex was performed.
* Defaults to $1.
WRITE_META = <true | false>
* Automatically writes REGEX to metadata.
* Use instead of DEST_KEY = meta.
* Defaults to false.
DEFAULT_VALUE = <string>
* If set, and REGEX (above) fails, write this value to DEST_KEY.
* Defaults to empty.
SOURCE_KEY = <string>
* Set which KEY to perform the regex on.
* Use the KEYs listed below.
* Defaults to _raw (the raw event).
REPEAT_MATCH = <true | false>
* Specify whether to run REGEX several times on the SOURCE_KEY.
* REPEAT_MATCH starts wherever the last match stopped, and continues until no more matches are found.
* Defaults to false.
DELIMS = <quoted string>
* Set delimiter characters to separate data into key-value pairs, and then to separate key from value.
* NOTE: Delimiters must be quoted with " " (to escape, use \).
* Usually, two sets of delimiter characters must be specified:
The first to extract key/value pairs.
The second to separate the key from the value.
* If you enter only one set of delimiter characters, then the extracted tokens:
Are named with names from FIELDS, if FIELDS are entered (below).
OR even tokens are used as field names while odd tokens become field values.
* Consecutive delimiter characters are consumed except when a list of field names is specified.
FIELDS = <quoted string list>
* List the names of the field values extracted using DELIMS.
* NOTE: If field names contain spaces or commas they must be quoted with " " (to escape, use \).
* Defaults to "".
#######
# KEYS:
#######
* NOTE: Keys are case-sensitive. Use the following keys exactly as they appear.
_raw : The raw text of the event.
_done : If set to any string, this is the last event in a stream.
_meta : A space separated list of metadata for an event.
_time : The timestamp of the event, in seconds since 1/1/1970 UTC.
MetaData:FinalType : The event type of the event.
MetaData:Host : The host associated with the event.
The value must be prefixed by "host::"
_MetaData:Index : The index where the event should be stored.
MetaData:Source : The source associated with the event.
The value must be prefixed by "source::"
MetaData:Sourcetype : The sourcetype of the event.
The value must be prefixed by "sourcetype::"
* NOTE: Any KEY prefixed by '_' is not indexed by Splunk, in general.
transforms.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example transforms.conf. Use this file to create regexes and rules for transforms.
# Use this file in tandem with props.conf.
#
# To use one or more of these configurations, copy the configuration block into transforms.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
# Note: These are examples. Replace the values with your own customizations.
# Indexed field:
[netscreen-error]
REGEX = device_id=[^ ]+\s+\[w+\](.*)(?
FORMAT = err_code::$1
WRITE_META = true
# Extracted field:
[netscreen-error]
REGEX = device_id=[^ ]+\s+\[w+\](.*)(?
FORMAT = err_code::$1
# Override host:
[hostoverride]
DEST_KEY = MetaData:Host
REGEX = \s(\w*)$
FORMAT = host::$1
# Extracted fields:
[netscreen-error]
REGEX = device_id=[^ ]+\s+\[w+\](.*)(?
FORMAT = err_code::$1
# Mask sensitive data:
[session-anonymizer]
REGEX = (?m)^(.*)SessionId=\w+(\w{4}[&"].*)$
FORMAT = $1SessionId=########$2
DEST_KEY = _raw
# Route to an alternate index:
[AppRedirect]
REGEX = Application
DEST_KEY = _MetaData:Index
FORMAT = Verbose
# Extract comma-delimited values into fields:
[extract_csv]
DELIMS = ","
FIELDS = "field1", "field2", "field3"
# This example assigns the extracted values from _raw to field1, field2 and field3 (in order of
# extraction). If more than three values are extracted the values without a matching field name
# are ignored.
# Extract key-value pairs
# This example extracts key-value pairs which are separated by '|'
# while the key is delimited from value by '='.
[pipe_eq]
DELIMS = "|", "="
# This example extracts key-value pairs which are separated by '|'
# while the key is delimited from value by '='.
user-seed.conf
user-seed.conf
user-seed.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# Specification for user-seed.conf. Allows configuration of Splunk's initial username and password.
# Currently, only one user can be configured with user-seed.conf.
#
# To override the default username and password, place user-seed.conf in
# $SPLUNK_HOME/etc/system/default. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[user_info]
USERNAME = <string>
* Username you want to associate with a password.
* Default is Admin.
PASSWORD = <string>
* Password you wish to set for that user.
* Default is changeme.
user-seed.conf controls what the default admin user account will be on first run. You can use any
user name in this file -- the name will be given admin privileges.
IMPORTANT: Unlike all other conf files, user-seed must be in
$SPLUNK_HOME/etc/system/default/ (as opposed to ../local/) before first running Splunk.
After Splunk has been run at least once $SPLUNK_HOME/etc/passwd takes over and user-seed will
NOT work.
user-seed.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example user-seed.conf. Use this file to create an initial login.
#
# NOTE: To change the default start up login and password, this file must be in
# $SPLUNK_HOME/etc/system/default/ prior to starting Splunk for the first time.
#
# To use this configuration, copy the configuration block into user-seed.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[user_info]
USERNAME = admin
PASSWORD = myowndefaultPass
web.conf
web.conf
web.conf contains settings for Splunk Web.
web.conf.spec
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attributes and values you can use to configure Splunk's web interface.
#
# There is a web.conf in $SPLUNK_HOME/etc/system/default/. To set custom configurations,
# place a web.conf in $SPLUNK_HOME/etc/system/local/. For examples, see web.conf.example.
# You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[settings]
* Set general Splunk Web configuration options under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk will use the default value.
startwebserver = [0 | 1]
* Set whether or not to start Splunk Web.
* 0 disables Splunk Web, 1 enables it.
* Defaults to 1.
httpport = <port_number>
* Must be present for Splunk Web to start.
* If omitted or 0 the server will NOT start an http listener.
* Defaults to 8000.
mgmtHostPort = <IP:port>
* Location of splunkd.
* Don't include http[s]:// -- just the IP address.
* Defaults to 127.0.0.1:8089.
enableSplunkWebSSL = [True | False]
* Toggle between http or https.
* Set to true to enable https and SSL.
* Defaults to False.
sslport = <port_number>
* Must be present for the front-end to start.
* If omitted, or given a value of 0 the server will not start an SSL listener
privKeyPath = /certs/privkey.pem
caCertPath = /certs/cert.pem
* Specify paths and names for web SSL certs.
serviceFormPostURL = http://headlamp.Splunk.com/event/add
userRegistrationURL = https://www.Splunk.com/index.php/pre_reg?destination=prod_reg
updateCheckerBaseURL = http://quickdraw.Splunk.com/js/
* These are various Splunk.com urls that are configurable.
* Setting updateCheckerBaseURL to 0 will stop Splunk Web from pinging Splunk.com for new versions of itself.
isHosted = [True | False]
* Sets 'hosted' mode, which controls whether users can save/alter options.
* IMPORTANT: Setting this to true is not recommended.
* Defaults to false.
disablePersistedPrefs = <string>
* A comma-separated list of user-types.
* When set, prefs changes for the users will not be persisted across sessions.
* For example, set to <User> to disable the persistence of preferences for all 'User' type accounts.
* Can also be User,Power, User,Power,Admin etc...
* Generally useful when many people are sharing the same account.
* Defaults to User.
uiversion = [pinesol | oxiclean]
* Toggle which version of Splunk Web to load.
* Must be set to either pinesol or oxiclean.
* When set to pinesol, Splunk will try to start with the legacy 2.1 and 2.2 front-end.
* Not recommended in customer installations.
* IMPORTANT: When using the pinesol front-end with Splunk 3.0, not all features will work correctly.
twistedLoginTimeout = <seconds_until_timeout>
* Number of seconds before the twisted session times out.
* After <second_until_timeout>, idle users get redirected to login.
* Defaults to 3600.
restApiPostMode = [none | permissive | strict]
* Set level of checking the REST API performs on form submissions.
* none = does not check; GET and POST are allowed.
* permissive = warns if REST method should be accessed via POST method.
* strict = throws error if POST-only REST method is accessed via GET.
* Defaults to strict.
enableRestControlApi = [True | False]
* Toggle the REST API /v3/controlapi/ endpoint on or off.
* Defaults to false.
distributedStatusMessages
* Configuration for various status messages that can come back from servers in distributed search.
* Defaults to notauthenticated:bad auth,invaliduserorpsw:bad login,down:unreachable,versionMismatch:version mismatch,productMismatch:no license,missingServerName:no server name
numberOfEventsPerCard = <integer>
* Configuration for the number of events that the Endless Scroller asks the
server for with each request.
* Defaults to 10.
numberOfCardsPerDeck = <integer>
* Configuration for the number of requests that the Endless Scroller will
make before it starts to recycle space occupied by prior pages.
* Defaults to 7.
appLoggingLevel = [DEBUG | INFO | WARNING | ERROR | CRITICAL]
* Set the logging level for the python appserver.
* The output is separate from the main splunkd logs.
* Writes to $SPLUNK_HOME/var/log/splunk/web_service.log
* Defaults to INFO.
compressStaticFiles = [True | False]
* Indicates if the static JS, CSS, XSL files are condensed or consolidated.
* Enable to improve client-side performance.
* Defaults to True.
# Configuration options for the administration section of Splunk Web
[adminTabs]
* Configure the order and appearance of Splunk Web's tabs under this stanza name.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not specify an entry for each attribute, Splunk will use the default value.
_order = <string>
* Comma-separated values of tab paths.
* Controls the order of the tabs in Splunk Web.
* Defaults to settings,datainputs,indexmanager,apps,distributed,users,saved,license
<pathToAdminSection>_label = <string>
* Set the label on the tab that appears in Splunk Web.
* Map from _order.
* Defaults are:
apps_label = Applications
settings_label = Server
datainputs_label = Data Inputs
distributed_label = Distributed
indexmanager_label = Indexes
users_label = Users
saved_label = Saved Searches
license_label = License & Usage
<pathToAdminSection>_capabilities = <string>
* Comma-separated list that maps the tabs in Splunk Web to roles.
* Tab will only appear for users in one of those roles.
* NOTE: For non-admin users there is a special tab called 'my account' that
appears, and that tab is not configurable here.
* Defaults to:
apps_capabilities = edit_applications
settings_capabilities = server_settings_tab,server_control_tab,server_auth_config_tab
datainputs_capabilities = edit_input,delete_input,edit_sourcetype,edit_tail,edit_tcp,edit_udp,edit_watch
distributed_capabilities = distributed_all_tab,distributed_receive_tab,distributed_forward_tab,distributed_search_tab
indexmanager_capabilities = edit_index
users_capabilities = edit_user,user_tab
saved_capabilities = save_global_search,save_local_search,delete_local_search,schedule_search,edit_local_search,edit_role_search,edit_saved_search,savedsearch_tab
license_capabilities = license_tab
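As an illustrative sketch only, built from the defaults listed above, a custom [adminTabs] stanza placed in $SPLUNK_HOME/etc/system/local/web.conf could reorder the tabs and relabel one of them:
[adminTabs]
# move Data Inputs to the front and give it a shorter label (values are hypothetical)
_order = datainputs,settings,indexmanager,apps,distributed,users,saved,license
datainputs_label = Inputs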
# Configuration options for the LiveTail section of Splunk Web
[livetail]
buffer_max_size = <integer>
* Max size of stream events, items will be dropped after this limit is reached.
* Defaults to 4000.
update_interval = <integer>
* Delay in milliseconds before updating the UI with a stream event.
* Defaults to 200.
multiple = <True | False>
* WARNING! Enabling multiple live tails with improper browser configuration may make Splunk Web appear to not be working.
* Please ensure HTTP pipelining http://en.wikipedia.org/wiki/HTTP_pipelining is enabled.
* See http://www.freerepublic.com/focus/f-news/1299854/posts for firefox configuration.
* Defaults to False.
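The web.conf.example that follows does not show a [livetail] stanza; a minimal sketch using the documented defaults would look like this (adjust the values to suit your environment):
[livetail]
# values below are the documented defaults
buffer_max_size = 4000
update_interval = 200
multiple = False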
web.conf.example
# Copyright (C) 2005-2008 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example web.conf. Use this file to configure data web settings.
#
# To use one or more of these configurations, copy the configuration block into web.conf
# in $SPLUNK_HOME/etc/system/local/. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[settings]
# This stanza heading must precede any changes.
# Change the default port number:
httpport = 12800
# Turn on SSL:
enableSplunkWebSSL = true
sslport = 8080
privKeyPath = /certs/privkey.pem
caCertPath = /certs/cert.pem
# Endless Scroller configuration.
# If you deal with almost all multiline data, you may be better served by
# lower defaults, which mean fewer events requested at a time.
numberOfEventsPerCard = 5
numberOfCardsPerDeck = 3
wmi.conf
wmi.conf
wmi.conf.spec
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
#
# This file contains possible attribute/value pairs for configuring WMI access from Splunk.
#
# There is a wmi.conf in $SPLUNK_HOME\etc\system\default\. To set custom configurations,
# place a wmi.conf in $SPLUNK_HOME\etc\system\local\. For examples, see
# wmi.conf.example. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
#########################################################################################
#----GLOBAL SETTINGS-----
#########################################################################################
[settings]
* The settings stanza specifies various runtime parameters.
* The entire stanza and every parameter within it is optional.
* If the stanza is missing, Splunk assumes system defaults.
initial_backoff = <integer>
* How long to wait (in seconds) before retrying the connection to the WMI provider after the first connection error.
* If connection errors continue, the wait time doubles until it reaches max_backoff.
* Defaults to 5.
max_backoff = <integer>
* Maximum time (in seconds) to attempt reconnect.
* Defaults to 20.
max_retries_at_max_backoff = <integer>
* Try to reconnect this many times once max_backoff is reached.
* If reconnection fails after max_retries, give up forever (until restart).
* Defaults to 2.
result_queue_size = <integer>
* Puts results from WMI provider(s) into a queue, then sends them to output.
* Defaults to 1000.
checkpoint_sync_interval = <integer>
* Minimum wait time (in seconds) for state data (event log checkpoint) to be written to disk.
* Defaults to 2.
heartbeat_interval = <integer>
* Heartbeat interval (in milliseconds) to test connection to WMI providers.
* Defaults to 500.
proc_name = <string>
#########################################################################################
#----INPUT-SPECIFIC SETTINGS-----
#########################################################################################
[WMI:$NAME]
* There are two types of WMI stanzas:
* Event log: for pulling event logs. You must set the event_log_file attribute.
* WQL: for issuing raw WQL requests. You must set the WQL attribute.
server = <comma-separated list>
* A comma-separated list of servers from which to get data.
* Defaults to local machine.
interval = <integer>
* How often to poll for new data.
* Not optional.
* No default.
disabled = 0 | 1
* 1 to disable, 0 to enable.
* No default.
* Event log-specific attributes:
event_log_file = <Application, System, etc>
* Use this instead of WQL to specify sources.
* Specify a comma-separated list of log files to poll.
* No default.
* WQL-specific attributes:
wql = <string>
* Use this if you're not using event_log_file.
* Specify wql to extract data from WMI provider.
* For example, SELECT PercentDiskTime, AvgDiskQueueLength FROM Win32_PerfFormattedData_PerfDisk_PhysicalDisk
namespace = <string>
* Location of WMI providers.
* The namespace where the WMI provider resides.
* Direct WQL queries.
* Defaults to root\.
wmi.conf.example
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
#
# This is an example wmi.conf. These settings are used to control inputs from WMI providers.
# Refer to wmi.conf.spec and the documentation at splunk.com for more information about this file.
#
# To use one or more of these configurations, copy the configuration block into wmi.conf
# in $SPLUNK_HOME\etc\system\local\. You must restart Splunk to enable configurations.
#
# To learn more about configuration files (including precedence) please see the documentation
# located at http://www.splunk.com/base/Documentation/latest/Admin/HowDoConfigurationFilesWork.
[settings]
initial_backoff = 5
max_backoff = 20
max_retries_at_max_backoff = 2
result_queue_size = 1000
checkpoint_sync_interval = 2
heartbeat_interval = 500
# Pull event logs from the local system
[WMI:LocalApplication]
interval = 10
event_log_file = Application
disabled = 1
[WMI:LocalSystem]
interval = 10
event_log_file = System
disabled = 1
[WMI:LocalSecurity]
interval = 10
event_log_file = Security
disabled = 1
# Gather performance data from the local system
[WMI:CPUTime]
interval = 5
wql = SELECT PercentProcessorTime FROM Win32_PerfFormattedData_PerfOS_Processor
disabled = 1
[WMI:Memory]
interval = 5
wql = SELECT CommittedBytes, AvailableMBytes, PagesPerSec FROM Win32_PerfFormattedData_PerfOS_Memory
disabled = 1
[WMI:LocalDisk]
interval = 5
wql = SELECT PercentDiskTime, AvgDiskQueueLength FROM Win32_PerfFormattedData_PerfDisk_PhysicalDisk
disabled = 1
[WMI:FreeDiskSpace]
interval = 5
wql = SELECT FreeMegabytes FROM Win32_PerfFormattedData_PerfDisk_LogicalDisk
disabled = 1
Troubleshooting
Contact Support
Contact Support
For contact information, see the main Support contact page.
Here is some information on tools and techniques Splunk Support uses to diagnose problems. Many
of these you can try yourself.
Note: For any files or information you send to Splunk Support, we encourage you to verify that you
are comfortable with sending it to us. We try to ensure that no sensitive information is included in any
output from the commands below, but we cannot guarantee compliance with your particular security
policy.
diag
The diag command collects basic info about your Splunk server, including Splunk's configuration
details (such as the contents of $SPLUNK_HOME/etc and general details about your index such as
host and source names). It does not include any event data or private information.
From $SPLUNK_HOME/bin run
UNIX:
./splunk diag
Windows:
splunk diag
If you have difficulty running diag in your environment, you can also run the Python script directly
using cmd:
./splunk cmd python /opt/splunk/lib/python2.5/site-packages/splunk/clilib/info_gather.py
This produces splunk-diag.tar.gz (or .zip) that you can send to Splunk Support for
troubleshooting. For versions before 3.3.x, contact Support for the infoGather script instead.
Upload your diag output to your Support case here -
Enterprise Support: http://www.splunk.com/index.php/track_issues
Community Members: http://www.splunk.com/index.php/send_to_splunk
Log levels and starting in debug mode
Splunk logging levels can be changed to provide more detail for different features in the
$SPLUNK_HOME/var/log/splunk/splunkd.log. The easiest way is to enable all messages with
the --debug option. This does impact performance and should not be used routinely.
1. Stop Splunk, if it is running.
2. Save your existing splunkd.log file by moving it to a new filename, like splunkd.log.old.
3. Restart Splunk in debug mode with splunk start --debug.
4. When you notice the problem, stop Splunk.
5. Move the new splunkd.log file elsewhere and restore your old one.
6. Restart Splunk normally (without the --debug flag) to disable debug logging.
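The following is a minimal shell sketch of the sequence above, run from $SPLUNK_HOME/bin on a default install; the temporary file location is illustrative:
./splunk stop
mv $SPLUNK_HOME/var/log/splunk/splunkd.log $SPLUNK_HOME/var/log/splunk/splunkd.log.old
./splunk start --debug
# reproduce the problem, then:
./splunk stop
mv $SPLUNK_HOME/var/log/splunk/splunkd.log /tmp/splunkd-debug.log
mv $SPLUNK_HOME/var/log/splunk/splunkd.log.old $SPLUNK_HOME/var/log/splunk/splunkd.log
./splunk start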
Specific areas can be enabled to collect debugging details over a longer period with minimal
performance impact. See the category settings in the file $SPLUNK_HOME/etc/log.cfg to set
specific log levels without enabling a large number of categories as with --debug. Note that not all
messages marked WARN or ERROR indicate actual problems with Splunk; some indicate that a
feature is not being used.
For 3.2+, debug messages in splunkd.log can also be enabled dynamically with a search:
To enable debugging, search for:
| oldsearch !++cmd++::logchange !++param1++::root !++param2++::DEBUG
To return to the default log level, search for:
| oldsearch !++cmd++::logchange !++param1++::root !++param2++::WARN
To set a particular category of messages, replace "root" with the desired category. This does not
change any settings in log.cfg. On restart, the log level reverts to what is defined in log.cfg.
Note: This search will return a "Search Execute failed because Setting priority of ... " message. This is
normal.
For investigating problems monitoring files, use the FileInputTracker and selectProcessor categories.
These are not enabled with the normal "--debug" option because they are very verbose.
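For example, to turn on just these two categories you could add lines like the following to $SPLUNK_HOME/etc/log.cfg (the category.<Name>=<LEVEL> format matches the category.LicenseManager example shown later in this chapter; restart Splunk for the change to take effect):
category.FileInputTracker=DEBUG
category.selectProcessor=DEBUG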
Debug Splunk Web
Enable additional Splunk Web debugging in web.conf:
[settings]
appLoggingLevel = DEBUG
Restart splunkweb with the command ./splunk restart splunkweb. The additional messages are output
to the $SPLUNK_HOME/var/log/splunk/web_service.log file.
Core Files
To collect a core file, use ulimit to remove any maximum file size setting before starting Splunk.
# ulimit -c unlimited
# splunk restart
This setting only affects the processes you start in a particular shell, so you may wish to do it in a new
session. For Linux, start Splunk with the --nodaemon option (splunk start --nodaemon). In
another shell, start the web interface manually with splunk start splunkweb.
Depending on your system, the core may be named something like core.1234, where the number
indicates the process id, and it will be in the same location as the splunkd executable.
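Putting the steps above together, a minimal Linux session might look like this (run from $SPLUNK_HOME/bin):
ulimit -c unlimited
./splunk start --nodaemon
# in a second shell:
./splunk start splunkweb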
LDAP configurations
If you are having trouble setting up LDAP, Support will typically need the following information:
The authentication.conf file from $SPLUNK_HOME/etc/system/local/.
An ldif for a group you are trying to map roles for.
An ldif for a user you are trying to authenticate as.
In some instances, a debug splunkd.log or web_service.log are helpful.
Recover metadata for a corrupt Splunk index directory
Important: You must contact Splunk support for direction before using this command.
The recover-metadata command recovers missing or corrupt metadata associated with any
Splunk index directory, sometimes also referred to as a 'bucket'. If your Splunk instance will not start
up, one possible diagnosis is that one or more of your index buckets is corrupt in some way. Contact
support; they will help you determine if this is indeed the case and if so, which bucket(s) are affected.
Then, run this command:
$SPLUNK_HOME/bin/recover-metadata <full path to the exact index
directory/bucket>
Splunk will return a success or failure message.
Splunkd is down
Splunkd is down
Sometimes Splunk Web may show the message "Splunkd appears to be down" when splunkd is
definitely up and running. There are several potential causes.
Splunkd may need a few more seconds to come up
Sometimes the web server is ready to respond before the daemon. This is usually the case if you
have enabled Splunk to work with LDAP, as it can take time for the LDAP authentication to pass. If
you notice a lag after reboots, try putting a sleep delay or otherwise making the splunkweb (twistd.py)
process wait before it launches.
Your license file contains line-breaks or other hidden characters
Some email clients may insert hidden characters (line-breaks, null characters, etc.) in the license
string when it gets emailed to you. These hidden characters can break your license string, causing
Splunk to go down. To resolve this issue, copy the license string from the store and paste
it into your license file in $SPLUNK_HOME/etc/splunk.license or into the License section of
Splunk Web.
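One way to spot stray line breaks or other non-printing characters is to dump the file with cat -v, which displays non-printing characters visibly (this is a general shell suggestion, not a Splunk command):
cat -v $SPLUNK_HOME/etc/splunk.license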
License issues
License issues
If you are having trouble with your license, here are some things to look at.
Cannot login
To log in for the first time after applying an Enterprise license (converting from free), use the default
username "admin" with the password "changeme". If you later clean (reset) your user data, your
username/password is reset to this default.
If admin/changeme doesn't work and you just applied a new license to 3.x (not a beta), then you may
have the wrong type of license. 3.0 GA changed the license format, so previous versions (including
licenses created for beta releases) do not work. A 3.x instance with a 2.x license advises you to
contact support for a new license and does not allow you to log in with your username and password. A 2.x
instance with a 3.x license only says "Incorrect username or password" when you try to log in.
If you are using a free license, switch to the new format license:
1. Stop Splunk:
./splunk stop
2. Copy your new license into $SPLUNK_HOME/etc/splunk.license or paste it into the License area in Splunk Web's Admin section.
3. Start Splunk:
./splunk start
If you get a message that your license is invalid, check that it was correctly installed. Ensure the key
does not have any added spaces or line breaks. Also, a valid Enterprise license key begins with the
email address of the original recipient from the Splunk Store and is a single line with no spaces. (The
free license begins "freelicense".) It may be a bare key or be enclosed in a block of XML like this:
<license>
<user>freelicense</user>
<expiration-date>2017-07-17</expiration-date>
<creation-date>2007-07-17</creation-date>
<bytelimit>500</bytelimit>
<licenseKey>freelicense;m9LkX5oOpeie+8cCsRg9fdmI6H/A1v0LQUkHx/0SsJMxsy9dXrAcNMgjxcU8j7crCR6tdewuozjNLsGRbPXWMoQf6f5etIIYHNo/WwH/3bJ04qJiq6GikRjg8ySqjcwvaSMLGS4a3CxHtKOwivqCZ+uS1wuiKaYSSZu5DYJMCfO6eAg1xvBQ+VOSmDW3NSVpFKiImrYwmeNqNyDHgaNKCZoF6h5WMYslpk7taF1cdLCaq3tvqwnPXqVGfK9Q+CnA3/vynnA+5WWcFH6fiTgI8nYFUYayb1s5FqjKCOjDpgDsLdLtpOQsEqxJbLFoPPP1tfv8KVKS0NfzZHYZkyNLXA==</licenseKey>
<productName>free</productName>
</license>
metrics.log
The $SPLUNK_HOME/var/log/splunk/metrics.log file contains indexing statistics like events
per second and throughput for various categories. It is not a complete accounting, however. Metrics
entries report the top 10 items for each group over a 30 second period. For example, if you have 20
hosts, there will be a thruput_per_host record for the top 10 hosts by size of the input raw data
for each period.
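For a quick look at these entries you can grep the file directly; the exact group label may vary between versions, so this matches broadly on "thruput":
grep thruput $SPLUNK_HOME/var/log/splunk/metrics.log | tail -20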
License violation warning
If you exceed your licensed daily volume on any one calendar day, you will get a violation warning.
The message persists for 14 days. If you have more than 7 violations in a rolling 30-day window, you
will be unable to search. Search access returns when you no longer have 7 violations in the previous
30 days or apply a different license with a larger volume limit. Note that Splunk will not stop indexing
your data for a license violation but only block your access to it for the period you exceed your
license.
To verify your license information in Splunk Web, go to the Admin page, License & Usage tab. Here
you can see your License Level (daily usage limit), Peak Usage, and other information.
If you are adding compressed files, the size of the uncompressed data is counted towards your daily
volume. Also, adding custom indexed fields or adding to the raw event text (with either a transform or
a custom processor) increases your indexed volume. Internal Splunk events are not counted against
your licensed volume.
Advanced license troubleshooting
From the CLI, use the command
splunk show license
to see the license summary:
Product: Splunk Server
License Level: 10240 MB/day peak indexing
Peak Usage: 0.00 GB/day
Peak %: 0.0%
Expiration Date: 01/18/2038
Time Remaining: 11112 days
This search shows the periodic license check from splunkd, from the _internal index:
index::_internal audit NOT v3
The events returned actually come from
$SPLUNK_HOME/var/log/splunk/license_audit.log, so you could also look in this file. This
can be helpful for monitoring reported indexed volume over a longer period of time than what is kept
in the _internal index. The cumulative daily data indexed is reported in the totalCumulativeBytesAtRollover and
todaysBytesIndexed fields. These audit records are generated by splunkd a few times a day, so statistics
on new events will not be immediately available.
You can set some additional debugging options to monitor usage and license information.
The file $SPLUNK_HOME/etc/log.cfg controls the level of messages that are created in Splunk's own
log files, from NONE to DEBUG. You can set category.LicenseManager=DEBUG to see additional
information about your license. (Restart for changes to take effect.) The output, found in splunkd.log,
looks like this:
08-17-2007 07:43:27.631 INFO LicenseManager - Checking License
08-17-2007 07:43:27.632 DEBUG LicenseManager - Checking for byte-count quota reached. byteCountSinceMeterReset=0, byteLimit=10240M
08-17-2007 07:43:27.632 DEBUG LicenseManager - Byte-count quota not exceeded.
08-17-2007 07:43:27.632 DEBUG LicenseManager - Checking for expiration...
08-17-2007 07:43:27.632 DEBUG LicenseManager - ... license has not expired
08-17-2007 07:43:27.632 DEBUG LicenseManager - checklicense returned: 0
You can also view your license status with this Python script:
import httplib
import time

def rpc(body):
    # POST the request body to splunkd's management port and return the raw reply
    conn = httplib.HTTPSConnection('localhost:8089')
    conn.connect()
    conn.request("POST", "/rpc/", body)
    response = conn.getresponse()
    reply = response.read()
    return reply

def main():
    xml = '''<call name="getLicenseInfo">
<params>
</params>
</call>'''
    print rpc(xml)

if __name__ == '__main__':
    main()
Save this in a file (like "licensecheck.py") and edit the URL for the management port of your instance
(including http or https, as appropriate). Then run "python licensecheck.py". The output is a block of
XML containing the details of your license and current status:
<licenseInfo>
<expirationDate>2147483647</expirationDate>
<issueDate>1185551536</issueDate>
<remainingTimeOnLicense>960114900</remainingTimeOnLicense>
<timeToMeterReset>51653</timeToMeterReset>
<currentByteCount>0</currentByteCount>
<byteLimit>10240</byteLimit>
<peakBytesPerDay>0</peakBytesPerDay>
<totalBytesProcessedAtRollover>0</totalBytesProcessedAtRollover>
<numberRollovers>3</numberRollovers>
<byteQuotaExceededCount>0</byteQuotaExceededCount>
<lastExceededDate>0</lastExceededDate>
<expirationState>0</expirationState>
<errorString></errorString>
<previouslyKeyed>false</previouslyKeyed>
<product>pro</product>
<type>prod</type>
<addons></addons>
<maxViolationsHit>0</maxViolationsHit>
</licenseInfo>
I installed my license in a Preview release and now it doesn't work!
Splunk's Preview releases require a different license that is not compatible with other Splunk
releases. Alternately, if you are evaluating a Preview release of Splunk, it will not run with a Free or
Enterprise license. The preview licenses typically enable Enterprise features; they are just
restricted to Preview releases.
Anonymize data samples
Anonymize data samples
Splunk contains an anonymize function. The anonymizer combs through sample log files or event
files to replace identifying data - usernames, IP addresses, domain names, etc. - with fictional values
that maintain the same word length and event type. For example, it may turn the string
user=carol@adalberto.com into user=plums@wonderful.com. This lets Splunk users share
log data without revealing confidential or personal information from their networks.
The anonymized file is written to the same directory as the source file, with ANON- prepended to its
filename. For example, /tmp/messages will be anonymized as /tmp/ANON-messages.
You can anonymize files from Splunk's CLI. To use Splunk's CLI, navigate to the
$SPLUNK_HOME/bin/ directory and use the ./splunk command. You can also add Splunk to your
path and use the splunk command.
Simple method
The easiest way to anonymize a file is with the anonymizer tool's defaults, as shown in the session
below. Note that you currently need to have $SPLUNK_HOME/bin as your current working directory;
this will be fixed in an incremental release.
From the CLI, type the following:
# ./splunk anonymize file -source /path/to/[filename]
# cp -p /var/log/messages /tmp
# cd $SPLUNK_HOME/bin
# splunk anonymize file -source /tmp/messages
Getting timestamp from: /opt/paul207/splunk/lib/python2.4/site-packages/splunk/timestamp.config
Processing files: ['/tmp/messages']
Getting named entities
Processing /tmp/messages
Adding named entities to list of public terms: Set(['secErrStr', 'MD_SB_DISKS', 'TTY', 'target', 'precision ', 'lpj', 'ip', 'pci', 'hard', 'last bus', 'override with idebus', 'SecKeychainFindGenericPassword err', 'vector', 'USER', 'irq ', 'com user', 'uid'])
Processing /tmp/messages for terms.
Calculating replacements for 4672 terms.
===================================================
Wrote dictionary scrubbed terms with replacements to "/tmp/INFO-mapping.txt"
Wrote suggestions for dictionary to "/tmp/INFO-suggestions.txt"
===================================================
Writing out /tmp/ANON-messages
Done.
Advanced method
You can customize the anonymizer by telling it what terms to anonymize, what terms to leave alone,
and what terms to use as replacements. The advanced form of the command is shown below.
# ./splunk anonymize file -source <filename> [-public_terms <file>] [-private_terms <file>] [-name_terms <file>] [-dictionary <file>] [-timestamp_config <file>]
filename
    Default: None
    Path and name of the file to anonymize.
public_terms
    Default: $SPLUNK_HOME/etc/anonymizer/public-terms.txt
    A list of locally-used words that will not be anonymized if they are in the file. It serves as an appendix to the dictionary file.
    Here is a sample entry:
    2003 2004 2005 2006 abort aborted am apr april aug august auth
    authorize authorized authorizing bea certificate class com complete
private_terms
    Default: $SPLUNK_HOME/etc/anonymizer/private-terms.txt
    A list of words that will be anonymized if found in the file, because they may denote confidential information.
    Here is a sample entry:
    481-51-6234
    passw0rd
name_terms
    Default: $SPLUNK_HOME/etc/anonymizer/names.txt
    A global list of common English personal names that Splunk uses to replace anonymized words.
    Splunk always replaces a word with a name of the exact same length, to keep each event's data pattern the same.
    Splunk uses each name in name_terms once to replace a character string of equal length throughout the file. After it runs out of names, it begins using randomized character strings, but still mapping each replaced pattern to one anonymized string.
    Here is a sample entry:
    charlie
    claire
    desmond
    jack
dictionary
    Default: $SPLUNK_HOME/etc/anonymizer/dictionary.txt
    A global list of common words that will not be anonymized, unless overridden by entries in the private_terms file.
    Here is a sample entry:
    algol
    ansi
    arco
    arpa
    arpanet
    ascii
timestamp_config
    Default: $SPLUNK_HOME/etc/anonymizer/anonymizer-time.ini
    Splunk's built-in file that determines how timestamps are parsed.
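For instance, to anonymize the same sample file while supplying your own list of confidential terms, you might run something like the following; the private-terms file path is purely illustrative:
./splunk anonymize file -source /tmp/messages -private_terms /tmp/my-private-terms.txt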
Output Files
Splunk's anonymizer function will create three new files in the same directory as the source file.
ANON-filename
    The anonymized version of the source file.
INFO-mapping.txt
    This file contains a list of which terms were anonymized into which strings.
    Here is a sample entry:
    Replacement Mappings
    --------------------
    kb900485 --> LO200231
    1718 --> 1608
    transitions --> tstymnbkxno
    reboot --> SPLUNK
    cdrom --> pqyvi
INFO-suggestions.txt
    A report of terms found in the file that, based on their appearance and frequency, you may want to add to private-terms.txt or to public-terms.txt for more accurate anonymization of your local data.
    Here is a sample entry:
    Terms to consider making private (currently not scrubbed):
    ['uid', 'pci', 'lpj', 'hard']
    Terms to consider making public (currently scrubbed):
    ['jun', 'security', 'user', 'ariel', 'name', 'logon', 'for', 'process', 'domain', 'audit']
Unable to get a properly formatted response from the server
Unable to get a properly formatted response from the server
Users running Splunk on a SuSE server may receive the error message "Unable to get a properly
formatted response from the server; canceling the current search" when executing any kind of search.
Or, the dashboard just won't display properly (or at all).
In order to resolve this issue, edit /etc/mime.types. Delete (or comment out) these two lines:
text/x-xsl xsl
text/x-xslt xslt xsl
Also change this line:
text/xml xml
to:
text/xml xml xsl
With these changes in place, restart Splunk and clear your browser cache.
Note: If you are using a proxy, you will need to flush that as well.
Command line tools
Command line tools
Caution: DO NOT use these commands without consulting Splunk support first.
cmd
btool
Command-line modification and listing of bundles.
Syntax
Add
./splunk cmd btool application name add
Delete
./splunk cmd btool application name delete [prefix] [entry]
List
./splunk cmd btool application name list [prefix]
classify
gzdumper
listtails
locktest
locktool
./splunk cmd locktool
Usage :
lock : [-l | --lock ] [dirToLock] <timeOutSecs>
unlock [-u | --unlock ] [dirToUnlock] <timeOutSecs>
Acquires and releases locks in the same manner as splunkd. If you were to write an external script to
copy db buckets in and out of indexes, you should acquire locks on the db, colddb, and thaweddb
directories as you are modifying them and release the locks when you are done.
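Following the usage above, a hypothetical session that locks and then unlocks an index's colddb directory might look like this; the directory path is illustrative:
./splunk cmd locktool --lock /opt/splunk/var/lib/splunk/defaultdb/colddb 60
# ... copy buckets in or out ...
./splunk cmd locktool --unlock /opt/splunk/var/lib/splunk/defaultdb/colddb 60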
parsetest
pcregextest
regextest
searchtest
signtool
Sign
./splunk cmd signtool [-s | --sign] [<dir to sign>]
Verify
./splunk cmd signtool [-v | --verify] [<dir to verify>]
Using logging configuration at /Applications/splunk/etc/log-cmdline.cfg.
Allows verification and signing of Splunk index buckets. If you have signing set up in a cold-to-frozen
script, signtool allows you to verify the signatures of your archives.
tsidxprobe
This will take a look at your index files (.tsidx) and verify that they meet the necessary format
requirements. It should also identify any files that are potentially causing a problem.
Go to the $SPLUNK_HOME/bin directory and run "source setSplunkEnv".
Then use tsidxprobe to look at each of your index files with this little script you can run from your shell
(this works with bash):
for i in `find $SPLUNK_DB | grep tsidx`; do tsidxprobe $i >> tsidxprobeout.txt; done
(If you've changed the default datastore path, then this should be in the new location.)
The file tsidxprobeout.txt will contain the results from your index files. You should be able to gzip this
and attach it to an email and send it to Splunk Support.