Symantec Data Insight
Programmer's Reference Guide
4.0
June 2013
Legal Notice
Copyright 2013 Symantec Corporation. All rights reserved.
Symantec, the Symantec Logo, and the Checkmark Logo are trademarks or registered
trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other
names may be trademarks of their respective owners.
This Symantec product may contain third party software for which Symantec is required
to provide attribution to the third party (Third Party Programs). Some of the Third Party
Programs are available under open source or free software licenses. The License Agreement
accompanying the Software does not alter any rights or obligations you may have under
those open source or free software licenses. Please see the Third Party Legal Notice Appendix
to this Documentation or TPIP ReadMe File accompanying this Symantec product for more
information on the Third Party Programs.
The product described in this document is distributed under licenses restricting its use,
copying, distribution, and decompilation/reverse engineering. No part of this document
may be reproduced in any form by any means without prior written authorization of
Symantec Corporation and its licensors, if any.
THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS,
REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT,
ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO
BE LEGALLY INVALID. SYMANTEC CORPORATION SHALL NOT BE LIABLE FOR INCIDENTAL
OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING,
PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED
IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE.
The Licensed Software and Documentation are deemed to be commercial computer software
as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19
"Commercial Computer Software - Restricted Rights" and DFARS 227.7202, "Rights in
Commercial Computer Software or Commercial Computer Software Documentation", as
applicable, and any successor regulations. Any use, modification, reproduction release,
performance, display or disclosure of the Licensed Software and Documentation by the U.S.
Government shall be solely in accordance with the terms of this Agreement.
Symantec Corporation
350 Ellis Street
Mountain View, CA 94043
http://www.symantec.com
Technical Support
Symantec Technical Support maintains support centers globally. Technical
Support's primary role is to respond to specific queries about product features
and functionality. The Technical Support group also creates content for our online
Knowledge Base. The Technical Support group works collaboratively with the
other functional areas within Symantec to answer your questions in a timely
fashion. For example, the Technical Support group works with Product Engineering
and Symantec Security Response to provide alerting services and virus definition
updates.
Symantec's support offerings include the following:
A range of support options that give you the flexibility to select the right
amount of service for any size organization
For information about Symantec's support offerings, you can visit our website at
the following URL:
www.symantec.com/business/support/
All support services will be delivered in accordance with your support agreement
and the then-current enterprise technical support policy.
When you contact Technical Support, please have the following information
available:
Hardware information
Operating system
Network topology
Problem description
Customer service
Customer service information is available at the following URL:
www.symantec.com/business/support/
Customer Service is available to assist with non-technical questions, such as the
following types of issues:
customercare_apac@symantec.com
semea@symantec.com
supportsolutions@symantec.com
Chapter
Data Insight Query Language (DQL) - Use DQL to create queries for the purpose
of creating customized reports.
See About Data Insight Query Language (DQL) on page 11.
The generic device web API - Use the API to extend platform support for the
storage devices that Data Insight monitors.
See Web API specification for generic Collector service on page 41.
For information about configuring a generic device in Data Insight and
credentials required to monitor the device, see the Symantec Data Insight
Administrator's Guide.
Chapter
Data Insight Query Language (DQL)
This chapter includes the following topics:
DQL Objects/Tables
DQL functions
DQL Objects/Tables
With DQL, you can run a query on objects and retrieve other objects as results. If
you are familiar with the SQL language, an object in DQL is similar to a table in
SQL. The attributes of an object are similar to the columns of the table. The output
of a DQL query is a relational database table with attribute values as column
values.
The complete list of DQL tables and their brief descriptions is as shown:

device
msu
path
dfspath
owner
user         Describes the details of the users that are listed in directory
             services such as Active Directory, LDAP, or NIS+ directory server.
groups       Describes the details of the groups that are listed in directory
             services such as Active Directory, LDAP, or NIS+.
activity
permission   Describes the details of the NTFS or UNIX permissions that are set
             on directory or file paths.
custodian

In the above list of objects, the owner object differs from the rest in that it
is a computed object. Owner objects are not first-class objects that are stored
in the Data Insight indices. They are computed at run-time, depending on the
method that is used to calculate file ownership.
device Columns

Column                   Type
id                       Integer
name                     String
type                     String
collector                String
indexer                  String
[custodians]             [Custodian Object]
capacity                 Integer
used_space               Integer
share_count              Integer
open_share_count         Integer
open_share_data_size     Integer
open_share_file_count    Integer
file_count               Integer
sensitive_file_count     Integer
folder_count             Integer
activity_count           Integer
msu Columns

Column                   Type
id                       Integer
name                     String
type                     String
device                   Device Object
indexer                  String
indexdir                 String
[custodians]             [Custodian Object]
[permissions]            [Permission Object]
isopen                   Integer
activity_count           Integer
active_user_count        Integer
last_activity_time       Integer
size                     Integer
active_data_size         Integer
file_count               Integer
sensitive_file_count     Integer
folder_count             Integer
most_active_user         User Object
user Columns

Column                   Type
sid                      String
name                     String
login                    String
domain                   String
firstname                String
lastname                 String
isdisabled               Integer
isdeleted                Integer
buname                   String
buowner                  String
[memberof]               [Group Object]
<custom-attr>            [String]
groups Columns

Column                   Type
sid                      String
name                     String
domain                   String
isdisabled               Integer
isdeleted                Integer
[memberof]               [Group Object]
[memberusers]            [User Object]
[membergroups]           [Group Object]
<custom-attr>            [String]
path Columns

Column                   Type
name                     String
absname                  String
id                       Integer
parent                   Path Object
type                     String
device                   Device Object
msu                      msu Object
size                     Integer
last_accessed            Integer
last_modified            Integer
created_on               Integer
last_accessor            User Object
last_modifier            User Object
creator                  User Object
creator_group            Group Object
owner                    Owner Object
isdeleted                Integer
depth                    Integer
activity_count           Integer
isopen                   Integer
[open_reasons]           [String]
[permissions]            [Permission Object]
[custodians]             [Custodian Object]
issensitive              Integer
[filegroups]             [String]
extension                String
[dfsnames]               [String]
[permitted_users]        [User Object]
permitted_users_count    Integer
[active_users]           [User Object]
active_users_count       Integer
[inactive_users]         [User Object]
inactive_users_count     Integer
[dlp_policies]           [String]
iscontrol_point          Integer
[control_point_reasons]  [String]
filesystem_owner         User Object
dfspath Columns

Column                   Type
name                     String
absname                  String
id                       Integer
parent
physicalname             String
type                     String
device                   Device Object
msu                      msu Object
size                     Integer
last_accessed            Integer
last_modified            Integer
created_on               Integer
last_accessor            User Object
last_modifier            User Object
creator                  User Object
creator_group            Group Object
owner                    Owner Object
isdeleted                Integer
depth                    Integer
activity_count           Integer
isopen                   Integer
[open_reasons]           [String]
[permissions]            [Permission Object]
[custodians]             [Custodian Object]
issensitive              Integer
[filegroups]             [String]
extension                String
[permitted_users]        [User Object]
permitted_users_count    Integer
[active_users]           [User Object]
active_users_count       Integer
[inactive_users]         [User Object]
inactive_users_count     Integer
[dlp_policies]           [String]
iscontrol_point          Integer
[control_point_reasons]  [String]
filesystem_owner         User Object
owner Columns

Column                   Type
path                     Path Object
dfspath
user                     User Object
read_count               Integer
write_count              Integer
method                   String
activity Columns

Column                   Type
timestamp                Integer
timerange                Integer
user                     User Object
path                     Path Object
dfspath
opcode                   Integer
operation                String
count                    Integer
ipaddr                   String
rename_target            Path Object
dfs_rename_target
permission Columns

Column                   Type                  Description
object_type              String
path                     Path Object
dfspath
msu                      msu Object
trustee_type             String
user_trustee             User Object
group_trustee            Group Object
permission_mask          Integer               Permission bitmask.
readable_permission      String
type                     String
isinherited              Integer               1 if the permission is inherited
                                               from parent.
inheriting_type          String
inheriting_path          Path Object
inheriting_dfspath
inheriting_msu           msu Object
appliesto                String
custodian Columns

Column                   Type
path                     Path Object
dfspath
msu                      msu Object
device                   Device Object
dfslink                  String
user                     User Object
isinherited              Integer
inheriting_type          String
inheriting_path          Path Object
inheriting_dfspath
inheriting_msu           msu Object
inheriting_device        Device Object
inheriting_dfslink       String
The general form of a DQL query is:

FROM     <table>
GET      <column expression> [AS alias], <column expression> [AS alias], ...
[IF      <condition>]
[USING   <definition>]
[FORMAT  <column> AS (CSV|TABLE <tablename>) [<count>]]
[GROUPBY <column expression>, <column expression>, ...]
[HAVING  <aggregate-condition>]
[SORTBY  <column expression> [ASC|DESC]]
[LIMIT   [<offset>,]<count>];
FROM clause
The FROM clause specifies the table from which DQL retrieves the data. DQL does
not support joins as in SQL. You can only specify one table in the FROM clause.
GET clause
The GET clause specifies the columns (or expressions on columns) that you want
to retrieve from the table that you specify in the FROM clause.
DQL tables can have columns that refer to other tables. For example, the table
groups has a column memberusers which refers to rows from table user. When
you retrieve such reference columns, you need to specify what columns you want
to retrieve from the referred table. For example, you cannot retrieve memberusers
from groups without specifying which columns of the user table you are interested
in. So, you can select memberusers.name or memberusers.sid but not just
memberusers.
The column names in the output table are decided by the expressions used in the
GET clause. While displaying the output, DQL may optionally replace the period
( . ) with the underscore ( _ ). For example, for GET path.name, the output column
name in the SQLite database becomes path_name.
FORMAT clause
Data Insight tables can contain multi-valued columns. For example, path contains
a multivalued column permissions. When you specify the columns in the GET
clause, you also need to specify the manner in which you want their values to
appear in the output database table. Use the FORMAT clause to control the format
of the output in case of multi-valued columns. You can use two formatting options
as shown below:
FORMAT <column> AS CSV
The above syntax displays the output values for a multi-valued column as a
comma-separated list in a single column.
FORMAT <column> AS TABLE <tablename>
The above syntax displays the output values for a multi-valued column in a
separate table. Each row of this table contains a reference to its corresponding
row in the parent table.
The default for the FORMAT clause is TABLE. If you do not provide a FORMAT
clause in your query, DQL displays the contents of the multi-valued columns in
separate tables, and the name of the multi-valued column is used as the default
name of the table. For example, if you retrieve path permissions and you do not
specify the FORMAT clause, DQL displays the permissions of a path in a separate
table called permissions.
Consider this example:
FROM     groups
GET      name, memberusers.sid, memberusers.name
FORMAT   memberusers AS CSV
name           memberusers_sid      memberusers_name
Domain Users   S-1,S-2,S-10,S-11    John,Jim,Paul,Steve
HR_Global      S-10,S-12            Paul,Jane
HR_US          S-10                 Paul
FROM     groups
GET      name, memberusers.sid, memberusers.name
FORMAT   memberusers AS TABLE memberusers
In this case, the output database contains two tables groups and memberusers.
The groups table has two columns groups_rowid and name. The memberusers
table has three columns groups_rowid, memberusers_sid, memberusers_name.
The groups_rowid column in the memberusers table is a reference to the
groups_rowid column from the groups table.
Example output tables are as shown below:
groups

groups_rowid   name
1              Domain Users
2              HR_Global
3              HR_US

memberusers

groups_rowid   memberusers_sid   memberusers_name
1              S-1               John
1              S-2               Jim
1              S-10              Paul
1              S-11              Steve
2              S-10              Paul
2              S-12              Jane
3              S-10              Paul
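Because DQL writes its output to a SQLite database (see the GET clause above), the groups_rowid reference can be followed with an ordinary SQL join. The following Python sketch rebuilds the example output tables in memory; the rowid values 1 through 3 are illustrative:

```python
import sqlite3

# Build an in-memory copy of the example output database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE groups (groups_rowid INTEGER, name TEXT)")
conn.execute("CREATE TABLE memberusers (groups_rowid INTEGER, "
             "memberusers_sid TEXT, memberusers_name TEXT)")
conn.executemany("INSERT INTO groups VALUES (?, ?)",
                 [(1, "Domain Users"), (2, "HR_Global"), (3, "HR_US")])
conn.executemany("INSERT INTO memberusers VALUES (?, ?, ?)",
                 [(1, "S-1", "John"), (1, "S-2", "Jim"),
                  (1, "S-10", "Paul"), (1, "S-11", "Steve"),
                  (2, "S-10", "Paul"), (2, "S-12", "Jane"),
                  (3, "S-10", "Paul")])

# Join the child table back to its parent through groups_rowid.
rows = conn.execute(
    "SELECT g.name, m.memberusers_name "
    "FROM groups g JOIN memberusers m ON g.groups_rowid = m.groups_rowid "
    "WHERE g.name = 'HR_Global'").fetchall()
print(rows)
```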
By default, DQL lists all memberusers of a group. Optionally, you can limit the
number of memberusers listed using the FORMAT clause. This is as shown in the
following query:
FROM     groups
GET      name, memberusers.sid, memberusers.name
FORMAT   memberusers AS CSV 4
This limits the output table to a maximum of four member user values for each
group. These four values are the first four members of the list.
FROM     path
GET      name, active_users.name, active_users.memberof.name
FORMAT   active_users AS CSV AND
         active_users.memberof AS CSV;
Notice that you have a flat list of all group names in this column. You have lost
information about what groups each of the active users belongs to. You only know
that there is one active user who belongs to HR, two who belong to ALL-Employees
and one who belongs to Finance.
IF clause
The IF clause is an optional clause that you can use to specify a set of conditions
on the rows that you want to retrieve. It is similar to the WHERE clause of SQL.
DQL retrieves only those rows whose columns satisfy the condition(s) provided
under the IF clause.
Operators
DQL supports the following binary operators that you can use to specify a
condition:
Arithmetic operators: +, -, *, /, %
Constants
DQL's IF clause supports specification of constants in operations. Constants can
be either numeric or string. Some examples of supported column-related operations
are as shown below:
IF size/1024 > 10
IF size = 10
Note that string comparisons are case insensitive by default. To specify case
sensitive or case insensitive comparisons, you can use the CASE SENSITIVE and
CASE INSENSITIVE keywords.
DQL provides the EACH and ANY constructs for conditions on multi-valued columns.
EACH specifies that each value of the multi-valued column should satisfy the
condition, while ANY specifies that any value of the multi-valued column should
satisfy the condition.
Suppose that you want to retrieve only those paths on which the user John is
active. You can write a query as shown below.
FROM     path
GET      name, active_users.name
IF       ANY active_users.name = "John"
FORMAT   active_users AS CSV;
Suppose that you want to retrieve paths on which either John or Joe are active.
You can write a query (query a) as shown below.
FROM     path
GET      name, active_users.name
IF       ANY active_users.name IN ("John","Joe")
FORMAT   active_users AS CSV;
The above query retrieves the paths on which either John is one of the active users
and/or Joe is one of the active users.
Suppose that you want to retrieve the paths that only have John and Joe as active
users. You can write a query (query b) as shown below.
FROM     path
GET      name, active_users.name
IF       EACH active_users.name IN ("John","Joe")
FORMAT   active_users AS CSV;
The above query retrieves paths where the only active users are John and/or Joe.
Note that in query (a), you get the paths on which John or Joe is one of the active
users whereas in query (b), you get the paths on which John and/or Joe are the
only active users.
Suppose that you want to retrieve the paths that have active users belonging to
the group HR. You can write a query (query a) as shown below.

FROM     path
GET      name, active_users.memberof.name
IF       ANY active_users.memberof.name = "HR"
FORMAT   active_users.memberof AS CSV;
Suppose that you want to retrieve those paths containing active users who belong
only to groups HR and/or FINANCE. You can write a query (query b) as shown
below.
FROM     path
GET      name, active_users.memberof.name
IF       EACH active_users.memberof.name IN ("HR", "FINANCE")
FORMAT   active_users.memberof AS CSV;
Note that DQL by default uses the ANY construct if you do not specify an
ANY/EACH construct.
USING clause
Values of certain columns like owner are computed at run-time based on some
criteria. For example, to compute an owner of a file, you need to specify what
methods (like read_count, rw_count, parent_owner etc.) you want to use to
determine the owner. When you determine active users of a path, you need to
specify the time range you want to consider for the activity.
You can use the USING clause to specify such functions that can be applied to
obtain a column value.
The details of the DQL USING functions are as shown below.
FROM     path
GET      name, owner.user.name, owner.method, owner.read_count
USING    owner AS calc_owner("2012-01-01", "2012-06-01", "YYYY-MM-DD",
         "rw_count, read_count, last_accessor");

If you don't specify a USING function for owner, DQL uses a default time range
of the last 6 months and uses a data owner ordering of rw_count, write_count,
read_count, last_modifier, last_accessor, creator, parent_owner.
FROM     path
GET      name, active_users.name
USING    active_users AS
         calc_active_users("2012-01-01", "2012-06-01", "YYYY-MM-DD")
FORMAT   active_users AS CSV;

If you don't specify a USING function for active_users, DQL uses a default time
range of the last 6 months.
FROM     path
GET      name, active_users_count
USING    active_users_count AS
         get_active_users_count("2012-01-01", "2012-06-01", "YYYY-MM-DD");

If you don't specify a USING function for active_users_count, DQL uses a default
time range of the last 6 months.
FROM     path
GET      name, inactive_users.name
USING    inactive_users AS
         calc_inactive_users("2012-01-01", "2012-06-01", "YYYY-MM-DD")
FORMAT   inactive_users AS CSV;

If you don't specify a USING function for inactive_users, DQL uses a default time
range of the last 6 months for calculating inactivity.
FROM     path
GET      name, inactive_users_count
USING    inactive_users_count AS
         get_inactive_users_count("2012-01-01", "2012-06-01", "YYYY-MM-DD");

If you don't specify a USING function for inactive_users_count, DQL uses a default
time range of the last 6 months for calculating inactivity.
FROM     path
GET      name, activity_count
USING    activity_count AS
         get_activity_count("2012-01-01 10:00", "2012-01-01 15:00",
         "YYYY-MM-DD HH:mm");

If you don't specify a USING function for activity_count, DQL uses a default time
range of the last 6 months for calculating activity.
HAVING clause
The HAVING clause is similar to the SQL HAVING clause and allows specification
of conditions on aggregate functions. The syntax of conditions that can be specified
in the HAVING clause is the same as that of the DQL IF clause.
Suppose that you want to retrieve the sum of the sizes of all shares for each filer.
You can write a query for this as shown below:

FROM     msu
GET      filer.name, sum(size)
GROUPBY  filer.name;
Now suppose that you want to select only those filers whose sum of share sizes
is greater than 1 GB (1,073,741,824 bytes). Then you need to modify the previous
query as:
FROM     msu
GET      filer.name, sum(size)
GROUPBY  filer.name
HAVING   sum(size) > 1073741824;
GROUPBY clause
The GROUPBY clause is similar to the SQL GROUP BY clause. It enables you to
aggregate the output rows into groups. Suppose that you want to retrieve the sum
of the sizes of all shares for each filer. You can write a query for this as shown
below.
FROM     msu
GET      filer.name, sum(size)
GROUPBY  filer.name;

The supported aggregate functions are: sum, count, max, min.
SORTBY clause
The SORTBY clause is similar to the SQL ORDER BY clause. It enables you to sort
the rows of the output table based on their column values.

FROM     msu
GET      name, size
SORTBY   size DESC;
LIMIT clause
The LIMIT clause is similar to the SQL LIMIT clause and is used to limit the number
of output rows.
LIMIT    <count>
LIMIT    <offset>, <count>
DQL functions
DQL supports the following built-in functions:
upper(X)
lower(X)
strlen(X)
length(X)
substr(X, Y)
substri(X, Y)
match(X, P)
matchi(X, P)
datetime(D, F)
formatdate(T, F)
Get the name, size, active data size, percentage of data size that is active,
openness, and number of active users for each share
FROM     msu
GET      name, size, active_data_size,
         (active_data_size*100/size) AS active_data_percent,
         isopen, active_user_count;
Get the activity for all paths of share, share1, on March 4, 2012 between 9:00
A.M. and 5:00 P.M.

FROM     activity
GET      path.name, user.name, operation,
         formatdate(timestamp, "YYYY/MM/DD HH:mm")
IF       path.msu.name = "share1" AND
         timestamp >= datetime("2012/03/04 09:00", "YYYY/MM/DD HH:mm")
         AND timestamp <= datetime("2012/03/04 17:00",
         "YYYY/MM/DD HH:mm");
Get a list of all sensitive files from all shares of filer, filer1, sorted by size.

FROM     path
GET      name, issensitive, size
IF       issensitive = 1 AND type = "FILE" AND device.name = "filer1"
SORTBY   size DESC;

Get a list of all open paths and the reason why they are marked as open.

FROM     path
GET      name, msu.name, isopen, open_reasons
IF       isopen = 1
FORMAT   open_reasons AS CSV;
Get a list of all open paths and the reason why they are marked as open. Also,
list the permissions on each open path.

FROM     path
GET      name, msu.name, isopen, open_reasons,
         permissions.user_trustee.name, permissions.group_trustee.name,
         permissions.readable_permission, permissions.isinherited,
         permissions.inheriting_path.name
IF       isopen = 1
FORMAT   permissions AS TABLE permissions
         AND open_reasons AS CSV;

Get the owner of each folder.

FROM     path
GET      name, msu.name, owner.user.name, owner.method,
         owner.read_count, owner.write_count
IF       type = "DIR"
USING    owner AS calc_owner("2012-01-01", "2012-06-01",
         "YYYY-MM-DD","rw_count, last_modifier");

Get a list of all users, their e-mail and department (custom attributes) and the
groups that they belong to.

FROM     user
GET      name, sid, login, domain, "E-mail", department,
         memberof.sid, memberof.name
FORMAT   memberof AS TABLE memberof_groups;

Get the inactive users for each open path.

FROM     path
GET      name, msu.name, isopen, inactive_users.name
IF       isopen = 1
USING    inactive_users AS calc_inactive_users("2012-01-01",
         "2012-06-01","YYYY-MM-DD");

For each share, get the count of paths that have permissions set on Everyone.

FROM     permissions
GET      msu.name, count(path.id) AS risk_path_count
IF       object_type = "DIR" AND group_trustee.name = "Everyone"
         AND isinherited = 0
GROUPBY  msu.name
SORTBY   risk_path_count DESC;
The condition isinherited = 0 ensures that we only get the paths that have
permissions explicitly defined on Everyone and not populate all paths that
simply inherit those permissions.
Chapter
Web API specification for generic Collector service

The client communicates with the Data Insight server as follows:

1. The Data Insight server identifies the client using a login API request.

2. On successful log in, the Data Insight server returns an authentication token
as the response. The same token is inserted into an HTTP cookie called
MATRIX_AUTH, which is valid for 30 minutes. If the log in attempt is
unsuccessful, an HTTP response code 401 is returned.

3. You must include the authentication token in each subsequent request to the
Data Insight server, either in an HTTP request header called MATRIX_AUTH,
or in a cookie with the same name, or as an HTTP request input parameter
with the same name.

4. Each token has an inactivity timeout interval of 30 minutes. The token expires
if the client does not send a request for 30 minutes. In case the Data Insight
server restarts, the client must obtain the authentication token by using the
login API. Data Insight uses the standard HTTP status code 401 to convey
that login is required. Data Insight returns the HTTP status code 401
(Unauthorized) if the client does not have the correct privileges.

5. The user principal against which log in is performed can be any valid Data
Insight user with the Server Administrator role.
Login
POST /api?function=LOGIN
Request parameters

Name        Comment
username
domain
password
format      Optional. format=json

Request body

Do not supply a request body for this method.
Response
Login Success
If format=json is specified, then the authentication token is written on HTTP
response output in JSON format.
HTTP/1.1 200 OK
Content-Type: application/json
Status: 200 OK

{"auth_token":"A2360DD2D9BB7284EF8BEB40E8DBA63F"}
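As a sketch, a client can obtain the token and attach it to later requests as follows. Only the Python standard library is used; the server address and credentials are placeholders, and the parameter names follow the table above:

```python
import json
import urllib.parse
import urllib.request

# Placeholder server address; substitute your Data Insight server.
BASE_URL = "https://datainsight.example.com/api"

def build_login_url(username, domain, password):
    """Build the LOGIN request URL from the documented parameters."""
    query = urllib.parse.urlencode({
        "function": "LOGIN",
        "username": username,
        "domain": domain,
        "password": password,
        "format": "json",   # optional; requests the JSON response body
    })
    return BASE_URL + "?" + query

def login(username, domain, password):
    """POST the LOGIN function (no request body) and return the token."""
    req = urllib.request.Request(build_login_url(username, domain, password),
                                 data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["auth_token"]

# Subsequent requests must carry the token, for example as a header:
#   req.add_header("MATRIX_AUTH", token)
```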
Upload Events
POST /api?function=COLLECTOR&cmd=upload_events_sqlite&event_type=<type>
Request parameters

Name          Description
MATRIX_AUTH   Authentication token.
event_type    The type of events in the file that is uploaded on the Collector.
              Optional (cifs|nfs).
Request body
The request can be an HTTP multi-part request or the request body can have
the contents of the file.
Response
If the file upload is successful, the server returns a response with the following structure:
HTTP/1.1 200 OK
Content-Type: application/json
Status: 200 OK
{"status_code":<code>,"status_msg":"<msg>"}
Column name   Type      Constraints   Description
filer         TEXT      NOT NULL      Filer's address as added to the Data
                                      Insight configuration.
opcode        INTEGER   NOT NULL      An integer describing the event operation
                                      (for example, READ=3, WRITE=4). Refer to
                                      the Protobuf format for a complete set of
                                      values.
username      TEXT                    Username of the user for CIFS (optional).
                                      UID of the user in case of an NFS event.
domainname    TEXT
sid           TEXT
pathname      TEXT      NOT NULL
renamepath    TEXT                    Applicable in case of a rename event.
type          TEXT                    Type of path (FOLDER=1, FILE=2).
ipaddr        TEXT                    IP address from where the path was
                                      accessed (optional).
timestamp     INTEGER   NOT NULL      Timestamp of event in seconds as UNIX
                                      epoch.
CREATE TABLE events (filer TEXT NOT NULL, opcode INTEGER NOT NULL,
    username TEXT, domainname TEXT, sid TEXT,
    pathname TEXT NOT NULL, renamepath TEXT, type TEXT, ipaddr TEXT,
    timestamp INTEGER NOT NULL);
Note: For CIFS events, the SID value is mandatory; user name and domain
name are optional.
For NFS events, SID should be blank, user name should be the UID, and domain
name should be blank.
For CIFS events, pathname should be the UNC path.
For NFS events, the pathname should be the absolute path of the file or the
folder.
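A sketch of producing an events database that matches this schema with Python's sqlite3 module. The sample filer, paths, SIDs, and timestamps are illustrative; the opcode and type values (READ=3, WRITE=4, FILE=2) follow the tables above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # write to a file in practice
conn.execute(
    "CREATE TABLE events (filer TEXT NOT NULL, opcode INTEGER NOT NULL, "
    "username TEXT, domainname TEXT, sid TEXT, "
    "pathname TEXT NOT NULL, renamepath TEXT, type TEXT, ipaddr TEXT, "
    "timestamp INTEGER NOT NULL)")

# CIFS event: SID is mandatory, username/domain optional, UNC path, READ=3.
conn.execute("INSERT INTO events VALUES (?,?,?,?,?,?,?,?,?,?)",
             ("filer1.example.com", 3, None, None,
              "S-1-5-21-1-2-3-500",
              r"\\filer1\share1\docs\a.txt", None, "2",
              "10.0.0.5", 1340003837))

# NFS event: SID and domain blank, username holds the UID, absolute path.
conn.execute("INSERT INTO events VALUES (?,?,?,?,?,?,?,?,?,?)",
             ("filer1.example.com", 4, "0", None, None,
              "/export/home/file.txt", None, "2",
              "10.0.0.6", 1340003847))
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 2
```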
Request parameters

Name           Description            Comment
MATRIX_AUTH    Authentication token
input_format                          json|proto
Request body
The request body must contain the events list in the specified format, Google
Protocol Buffers or JSON.
Response
Returns a response with the following structure:
HTTP/1.1 200 OK
Content-Type: application/json
Status: 200 OK

{"status_code":<code>,"status_msg":"<msg>"}
message CifsEventMessage {
    required AccessType opcode = 1;
    required string unc_path = 2;
    optional string rename_path = 3;
    required PathType path_type = 4;
    required string sid = 5;
    optional string username = 6;
    optional string domain = 7;
    required uint64 timestamp_msec = 8;
    optional string ip_address = 9;
}

message NfsEventMessage {
    required AccessType opcode = 1;
    required string path = 2;
    optional string rename_path = 3;
    required PathType path_type = 4;
    required int64 uid = 5;
    optional int64 gid = 6;
    optional string domain = 7;
    required uint64 timestamp_msec = 8;
    optional string ip_address = 9;
}
enum PathType {
UNKNOWN_PATHTYPE = -1;
FOLDER = 1;
FILE = 2;
}
enum AccessType {
    CREATE    = 1;
    DELETE    = 2;
    READ      = 3;
    WRITE     = 4;
    RENAME    = 5;
    MKDIR     = 8;
    RMDIR     = 9;
    RENAMEDIR = 10;
    SECURITY  = 18;
    SYMLINK   = 19;
    LINK      = 20;
    READLINK  = 21;
    OPEN      = 200000;
}
{
    "deviceId": <Number>,
    "deviceName": <String>,
    "cifsEvents": [
        <CIFS Event>
    ],
    "nfsEvents": [
        <NFS Event>
    ]
}
<CIFS Event>
{
"opcode": <String>,
"uncPath": <String>,
"renamePath": <String>,
"pathType": <String>,
"sid": <String>,
"username": <String>,
"domain": <String>,
"timestampMsec": <Number>,
"ipAddress": <String>
}
<NFS Event> {
"opcode": <String>,
"path": <String>,
"renamePath": <String>,
"pathType": <String>,
"uid": <Number>,
"gid": <Number>,
"domain": <String>,
"timestampMsec": <Number>,
"ipAddress": <String>
}
Note: opcode and pathType fields can take only a specific set of values. Refer
to the protobuf enums for a description of values for each field; enum
AccessType for the field opcode and enum PathType for the field pathType.
Example
{
"deviceId": 0,
"deviceName": "10.209.89.3",
"cifsEvents": [
{
"opcode": "RENAME",
"uncPath": "\\\\NAMEMCCIFS1\\TestShare\\DI3.0RU1\\Data1",
"renamePath": "\\\\NAMEMCCIFS1\\TestShare\\DI3.0RU1\\Data2 ",
"pathType": "FOLDER",
"sid": "S-1-5-21-617441397-4198358099-2716562547-104771",
"timestampMsec": 1340003837,
"ipAddress": "172.31.163.29"
},
{
"opcode": "CREATE",
"uncPath": "\\\\NAMEMCCIFS1\\TestShare\\DI3.0RU1\\New Folder",
"pathType": "FOLDER",
"sid": "S-1-5-21-617441397-4198358099-2716562547-104771",
"timestampMsec": 1340003847,
"ipAddress": "172.31.163.29"
}
],
"nfsEvents": [
{
"opcode": "MKDIR",
"path": "\/openldaphome\/DIRU1",
"pathType": "FOLDER",
"uid": 0,
"gid": 0,
"domain": "0",
"timestampMsec": 1339680545
}
]
}
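The example payload can also be built programmatically; serializing with json.dumps guarantees the keys are quoted correctly. The following sketch assembles a minimal payload with one CIFS event (values are illustrative, and the optional username/domain fields are simply omitted):

```python
import json

payload = {
    "deviceId": 0,
    "deviceName": "10.209.89.3",
    "cifsEvents": [
        {
            "opcode": "CREATE",        # must be an AccessType name
            "uncPath": "\\\\NAMEMCCIFS1\\TestShare\\DI3.0RU1\\New Folder",
            "pathType": "FOLDER",      # must be a PathType name
            "sid": "S-1-5-21-617441397-4198358099-2716562547-104771",
            "timestampMsec": 1340003847,
            "ipAddress": "172.31.163.29",
        }
    ],
    "nfsEvents": [],
}

# The serialized body is sent as the request body of the upload call.
body = json.dumps(payload)
print(body[:40])
```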
Add shares

POST /api?function=COLLECTOR&cmd=add_shares&format=<format>

Request parameters

Name          Description            Comment
MATRIX_AUTH   Authentication token
format                               proto|json

Request body
"shareType": {string}
}
Note: The shareType parameter accepts only a specific set of values. For the
possible set of values, refer to enum ShareType in the Protobuf definition.
Example
{
"deviceId": 0,
"deviceName": "10.209.111.193",
"shares": [
{
"shareName": "/openldaphome",
"sharePath": "/openldaphome",
"shareType": "NFS"
},
{
"shareName": "/nfstest",
"sharePath": "/nfstest",
"shareType": "NFS"
}
]
}
Note: Data Insight scans the shares that are added only when the user enables
scanning and provides the Scanner credentials for the filer.
Chapter
Data is supplied to the scripts via command line arguments. Arguments vary
based on what the script is used for. The scripts can be created in the .exe, .bat,
.pl, or .vbs formats.
Data Insight handles custom scripts differently depending on the type of operation.
Following list shows how Data Insight handles various types of scripts:
For example,

ticketing.pl file_name

where file_name is a path such as
C:\DataInsight\data\workflow\tmp\PR_ticketing_1.txt
The second argument is full path to a text file containing the permission
recommendations. Each line in the text file contains one action and the required
variables to perform that action. Lines are separated by a new line character.
The script should read each line of the input file and open one or more
remediation tickets as needed. If the script exits with a non-zero exit code, the action
is considered to have failed. Each line in the file is of the following format:
OP:<OPCODE> PARAM:VALUE; PARAM:VALUE; ...
For example,
OP:REMOVE_ACE USER:foouser@domain.com;
PATH:\\fileserver1\share1\path;
Refer to the next section for possible values for opcodes and their parameters.
Data Insight will supply the following opcode and arguments for Active
Directory remediation:
OP:DEL_GROUP_MEMBER AD_GROUP:<group@domain>|AD_USER:<user@domain>
TARGET_GROUP:<target_group@domain>
Data Insight will supply the following opcode and arguments for CIFS
remediation:
OP:REMOVE_ACE GROUP:<group@domain>|USER:<user@domain>
PATH:<unc_path>
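A custom script needs to parse these input lines. The following Python sketch shows one way to split a line into its opcode and parameters; the helper name and error handling are illustrative, not part of Data Insight:

```python
def parse_remediation_line(line):
    """Parse 'OP:<OPCODE> PARAM:VALUE; PARAM:VALUE; ...' into a dict."""
    op_part, _, rest = line.strip().partition(" ")
    if not op_part.startswith("OP:"):
        raise ValueError("line must start with OP:<OPCODE>")
    action = {"OP": op_part[len("OP:"):]}
    # Parameters are ';'-separated PARAM:VALUE pairs.
    for pair in rest.split(";"):
        pair = pair.strip()
        if pair:
            key, _, value = pair.partition(":")
            action[key] = value
    return action

line = ("OP:REMOVE_ACE USER:foouser@domain.com; "
        "PATH:\\\\fileserver1\\share1\\path;")
print(parse_remediation_line(line))
```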
Data Insight supports the following properties that can be passed to the custom
scripts:
Properties

size
size_on_disk
created_by
created_on
last_modified_by
last_modified_on
last_accessed_by
last_accessed_on
data_owner
custodian
For detailed information about how to use custom scripts for data and permission
remediation, see the Symantec Data Insight Administrator's Guide.
file_inventory table
In this table, there is one row for each matching file that is found in the specified
index dbs.
CREATE TABLE file_inventory (
xid INTEGER,
sid TEXT,
user_id INTEGER,
owner_account TEXT,
displayname TEXT,
owner_method TEXT,
bu_name TEXT,
bu_owner TEXT,
filer TEXT,
share TEXT,
dfs_server TEXT,
dfs_share TEXT,
dfs_path TEXT,
fid INTEGER,
path TEXT,
msu_type INTEGER,
interval INTEGER,
sensitive INTEGER,
msu_id INTEGER,
read_count INTEGER,
write_count INTEGER,
file_size INTEGER,
atime INTEGER,
ctime INTEGER,
mtime INTEGER,
fs_sid TEXT);
The sid is typically the Windows SID of the calculated owner of the file.
The owner_method column indicates the owner method that Data Insight used
to calculate the owner.
The user_id is the foreign key into the fileuser table of the current version of
the users.db stored in the DataInsight\Data\users folder. This is used for debug
purposes only.
The filer, share, path, dfs_server, dfs_share and dfs_path columns combine to
give the path to the file. The fid column is the foreign key into the fentry table
of the latest version of the index.db for this share. fid is used for debug purposes
only.
The msu_type is an integer value describing the type of share. There are four
possible values:
1 CIFS
2 SharePoint
3 NFS
8 DFS
The interval column is the foreign key into the intervals table below, based on
the last access time of the file.
The msu_id is the foreign key into the msu table of the latest version of the
config.db stored in the DataInsight\Data\conf folder.
Read count and write count are the aggregate numbers of audit events of each
type of event over the total time period specified for this run of the report.
File_size is the logical file size from the file system. Atime, ctime, and mtime
are the metadata for the file, also pulled from the file system.
The fs_sid is the SID of the file system owner value from the file system
metadata.
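Because the report output is a SQLite database, the file_inventory table can be queried with any SQLite client. The following sketch uses Python's sqlite3 module to list sensitive files per share and decode the msu_type codes described above (the database path and function name are hypothetical; the actual file name depends on the report configuration):

```python
import sqlite3

# msu_type codes as documented for the file_inventory table.
MSU_TYPES = {1: "CIFS", 2: "SharePoint", 3: "NFS", 8: "DFS"}

def sensitive_files_by_share(db_path):
    """Return (filer, share, path, share_type, file_size) tuples for
    sensitive files, largest first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            """SELECT filer, share, path, msu_type, file_size
               FROM file_inventory
               WHERE sensitive = 1
               ORDER BY file_size DESC"""
        ).fetchall()
    finally:
        conn.close()
    return [
        (filer, share, path, MSU_TYPES.get(msu_type, "unknown"), size)
        for filer, share, path, msu_type, size in rows
    ]
```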
lob table
This table consists of a list of distinct Lines of Business (LOBs). Other tables
reference this table as a foreign key.
user_lob table
This table gives the mapping from users to the associated LOBs.
user_totals table
This table gives the total numbers of files, sensitive files, and so on, for each user. In the
final output, the msu_id column is displayed as empty. The user_id is the foreign
key into the fileuser table of the current version of the users.db stored in the
DataInsight\Data\users folder.
total_bytes INTEGER,
sensitive_files INTEGER,
sensitive_bytes INTEGER);
user_interval_totals table
This table breaks out the information from the user_totals table over each interval
specified from the input database. The interval_id is a foreign key to the intervals
table.
lob_totals table
Based on the mapping specified in the User_lob table, this table gives the total
numbers for each LOBs. In the final output, the msu_id column will be empty.
lob_interval_totals table
This table breaks out the information from the lob_totals table over each interval
specified from the input database. The interval_id is a foreign key into the intervals
table.
interval_id INTEGER,
total_files INTEGER,
total_bytes INTEGER,
sensitive_files INTEGER,
sensitive_bytes INTEGER);
intervals table
This table gives the beginning and end of each interval as specified in the input
database. The beginning and end times are specified as epoch numbers. For
example, the time 0 would be midnight UTC on January 1, 1970, and each higher
number is one second after that.
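Converting these epoch numbers to readable timestamps is straightforward in any scripting language. A minimal sketch in Python (the function name is an illustration, not part of the product):

```python
from datetime import datetime, timezone

def interval_bounds_utc(begin_epoch, end_epoch):
    """Convert an intervals-table row (epoch seconds) to UTC datetimes."""
    return (
        datetime.fromtimestamp(begin_epoch, tz=timezone.utc),
        datetime.fromtimestamp(end_epoch, tz=timezone.utc),
    )
```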
msu_info table
This table copies the data from the Dashboard database to specify if the msu is
open. The msu_id column is a foreign key into the msu table of the latest version
of the config.db stored in the DataInsight\Data\conf folder.
dashboard_info table
This table is similar to the msu_info table in that it copies information from the
latest version of the Dashboard database into the report output database. There
may be a slight mismatch between the numbers here and the totals from the
user_totals table. This difference occurs because each set of numbers is
calculated at a different time.
dash_files INTEGER,
dash_sens_files INTEGER);