
Sample Questions

Q1. What are the different ways in which you may perform an SAP HANA database data backup? What happens if a scheduled data backup starts before the previous backup is completed?

Answer:
There are different ways and tools to perform a data backup.

a) Performing a data backup using SAP HANA studio


1. In the Navigator view, select the database that you want to back up.

2. From the context menu, choose Back Up...

3. Select a destination type, File or Backint:


o File: Backs up the database to files in the file system. Each SAP HANA service writes backup data to the specified destination in the file system.
o Backint: Backs up the database using a third-party backup tool. Each SAP HANA service starts the Backint agent and sends the backup data to the third-party backup tool.

4. Specify the backup destination and the backup prefix.


The default destination is the path specified in the Backup section. The backup
destination can be the same, regardless
of whether you select File or Backint.

The third-party backup tool can distinguish between multiple backups with the same
name. You do not need to use a
different name for each backup.

For file-based backups, you need to ensure that there is sufficient space at the specified
backup destination.

For backups with third-party backup tools, you can only change the backup prefix. The
destination is always
/usr/sap/<SID>/SYS/global/hdb/backint

5. Choose Next.
A summary of the backup settings is displayed.

6. When all the settings are correct, choose Finish.


The backup then starts.

The progress of the backup is shown for all the services (for example, the statistics
server, name server, and index
servers).

When all the volumes have been backed up, a confirmation message is displayed.

b) Performing a data backup using SQL commands


You can enter SQL commands either by using the SQL editor in SAP HANA studio, or
by using the command line
program hdbsql. You can start the backup using the following SQL command:

For file-based backup:


BACKUP DATA USING FILE ('<path><prefix>')
If no absolute path is specified, the default backup destination is prepended to the
location. If the directory does
not exist, you need to create it before you start the backup.

For third-party backup tools:


BACKUP DATA USING BACKINT ('/usr/sap/<SID>/SYS/global/hdb/backint/<prefix>')

For third-party backup tools, you can optionally omit the path specification.
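
For illustration, a file-based backup could be started as follows (the prefix MONDAY_FULL and the directory /backup/data are hypothetical; the directory must already exist):

BACKUP DATA USING FILE ('/backup/data/MONDAY_FULL')

If only the prefix is given, for example BACKUP DATA USING FILE ('MONDAY_FULL'), the default backup destination is prepended automatically.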

c) Performing a data backup in batch mode


To perform data backups in batch mode, we recommend that you use the command line tool hdbsql at operating system level. hdbsql enables you to trigger backups through crontab.

d) Performing a data backup using DBACockpit (from the ABAP system only)
You can find the DBA Cockpit documentation in the SAP Help Portal
http://help.sap.com/

If a scheduled data backup starts before the previous backup is completed, the first data
backup will continue normally;
the second data backup will not start and you will receive an appropriate error message.

Q2. In case of a file-based backup, can I configure the directories where the backups should be located?
What are the naming conventions for data backup files?

Answer:
Yes. The location of file-based data and log backups can be changed.

For improved data safety, it is recommended that you specify a path to an external
backup destination. The backup
destination should never be on the same file system as the data or log areas.

It is recommended that you create the destination directory structures before the backup
is started.

a) Data backup
By default, data backups are written to the following destination:
$DIR_INSTANCE/backup/data.

You can specify a different destination when you perform the backup. Alternatively, you
can change the default backup
destination in the Backup section in SAP HANA studio.

To change the default destination for data backups:


1. In the SAP HANA studio navigation pane, go to the Backup section of your system.

2. Go to the Configuration tab and choose Data Backup Settings.

3. Specify the new default destination in the Destination field and save.

Different data backups can be written to different destinations, but all the parts of one
particular data backup are written
to the same destination.

b) Log backup
By default, log backups are written to the following destination:
$DIR_INSTANCE/backup/log.

You can change the backup destination in the Backup section in SAP HANA studio.

To change the default destination for log backups:

1. In the SAP HANA studio navigation pane, go to the Backup section of your system.

2. Go to the Configuration tab and choose Log Backup Settings.

3. Specify the new default destination in the Destination field and save.

In general, changes to the data or log backup destinations take effect immediately.
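
The default destinations can also be changed with SQL instead of SAP HANA studio. The following sketch assumes that the destinations are held in the persistence section of global.ini as the parameters basepath_databackup and basepath_logbackup; the paths are examples only:

-- default data backup destination (example path)
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('persistence', 'basepath_databackup') = '/backup/data' WITH RECONFIGURE;

-- default log backup destination (example path)
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('persistence', 'basepath_logbackup') = '/backup/log' WITH RECONFIGURE;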
Each backup naming convention comprises the following elements: <path><prefix>

If no complete path is specified, the default backup destination is prepended to the backup name. For file-based backups, the backups are written to a destination in the file system. For backups made using third-party backup tools, a pipe is created in the file system.

The pipe is always created in the directory /usr/sap/<SID>/SYS/global/hdb/backint. The backup tool decides where the backups are actually written.

You can use the prefix proposed by the system or you can specify a different prefix for
the backup name. During
backup, each service backs up its data to the specified backup destination.

Below is an example of a set of backup files from one data backup created with SAP
HANA studio. The backups can
have different names.

• /backup/data/COMPLETE_DATA_BACKUP_databackup_0_1 (name server topology)

• /backup/data/COMPLETE_DATA_BACKUP_databackup_1_1 (name server)

• /backup/data/COMPLETE_DATA_BACKUP_databackup_2_1 (for example, index server)...

In the above example, the <path> is /backup/data/ and the <prefix> is COMPLETE_DATA_BACKUP. databackup_0_1 is the suffix, which is automatically added by the system.
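
For reference, a set of files like the one above could have been produced by a statement such as the following (a sketch; the path and prefix correspond to the example, and each service appends its own databackup_<volume>_<part> suffix):

BACKUP DATA USING FILE ('/backup/data/COMPLETE_DATA_BACKUP')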

Q3. What general configuration options do I have for saving the log area?
Where can I find the backup catalog?

Answer:
You can configure the log backup behaviour using the Backup section in SAP HANA studio, except for the log mode. To change the log mode, you have to change the corresponding configuration parameter in the global.ini configuration file.

Automatic Log Backup


Automatic log backup can be enabled or disabled using the parameter enable_auto_log_backup. By default, automatic log backup is enabled.

During normal system operation (log mode normal), it is recommended that automatic
log backup be kept activated. If
automatic log backup is disabled, the log area grows until the file system is full. If the file
system is full, the database will
freeze until the situation is solved.

Log Modes
After changing the log mode parameter log_mode, you must restart the SAP HANA
database to activate the changes.
We also recommend that you create a full data backup of the database.

This parameter can have the following values:

o log_mode = normal (default)


Log segments are automatically backed up if automatic log backup is enabled.
log_mode = normal is recommended to provide support for point-in-time recovery.

o log_mode = overwrite
Log segments are freed by savepoints and no log backup is performed. This can be
useful for test installations that
do not need to be backed up or recovered.
log_mode = overwrite is not recommended for production systems. No point-in-time
recovery is possible. For
recovery, only data backups are used; the logs must be disregarded.

o log_mode = legacy
log_mode = legacy is still supported, but not recommended by SAP
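
As a sketch, the log mode described above can also be changed with SQL, assuming the log_mode parameter resides in the persistence section of global.ini (a database restart and a new full data backup are still required afterwards):

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('persistence', 'log_mode') = 'normal';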

Log Backup Timeout


You can specify a timeout for log backups. The timeout determines the intervals at
which the logs are backed up. If
the log segments become full before the timeout, the logs will be backed up anyway.
The log backup timeout is controlled
by the parameter log_backup_timeout_s.

Specifying an appropriate time interval for log backups enables you to recover an SAP HANA database with minimum data loss, for example, if you need to recover the database in a situation where the log area is unusable and only the data and log backups are available.

Recommended: Specify a time interval, for example 900 seconds.

If you specify a timeout of 0, the log backups are only made when a log segment is full and when services are restarted.

Note: The backup timeout only takes effect if automatic log backups are enabled.
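
To check which log backup settings are currently effective, the configuration can be read with SQL, for example (a sketch using the monitoring view M_INIFILE_CONTENTS and the parameter names introduced above):

SELECT FILE_NAME, SECTION, KEY, VALUE
FROM M_INIFILE_CONTENTS
WHERE KEY IN ('enable_auto_log_backup', 'log_mode', 'log_backup_timeout_s');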

The backup catalog is stored in the SAP HANA database and is part of the log
backups. This allows access to the
backup catalog during recovery.

Even in situations where log backups are not written, such as when log mode overwrite is set, the backup catalog is still backed up. The backup catalog is backed up and versioned after every completed backup operation. If the backup catalog is saved using a third-party backup tool, the versioning of the backup catalog is handled by the backup tool.
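
The backup catalog can also be inspected with SQL. A minimal sketch, assuming the monitoring view M_BACKUP_CATALOG with the columns shown:

SELECT TOP 20 BACKUP_ID, ENTRY_TYPE_NAME, UTC_START_TIME, UTC_END_TIME, STATE_NAME
FROM M_BACKUP_CATALOG
ORDER BY UTC_START_TIME DESC;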

Q4. How do I recover my SAP HANA database? How do I recover my SAP HANA database if the data/log area is unusable, or if a logical error occurred (e.g. some data was deleted by mistake)?

Answer:
The SAP HANA studio provides point-in-time recovery with GUI support.

In a recovery situation, you can use SAP HANA studio to restore customer-specific changes (configuration settings) either before or at the end of the recovery. To recover your database from SAP HANA studio, open the context menu of the database in the Navigator view and choose "Recover...". Confirm that the database can be shut down; recovery is only performed when the database is offline. In the recovery wizard, choose one of the recovery types:

• Recover the database to its most recent state.

• Recover the database to a certain point in time.

• Recover the database to a specific data backup. (Only the specified data backup is used for recovery; log entries are not replayed, either from the log backups or from the log area.)

Note: The SAP HANA database will be initialized with the specified data backup. This
backup becomes the first backup of a new database life cycle. Older backups are not
compatible with the log written after the recovery.

• Recover the database to a certain log position. (Advanced option to be used in exceptional cases if a previous recovery failed.)

The SAP HANA database uses the backup catalog to identify the destination of the
backups. You do not need to
specify whether File or Backint was used for the backup.
If the data area is unusable, and all the data changes after the last complete data
backup are still available in the log
backups and log area, the data from committed transactions that was in memory at the
time of failure can be recovered.
No committed data is lost.

For recovery, all relevant log backups made after the data backup are needed.

Perform a database recovery to the most recent state.

If the log area is unusable, it is only possible to replay the log backups. As a
consequence, any changes that were made
after the most recent log backup will be lost. In addition, all the transactions that were
open during the log backup will be
rolled back.

For recovery, the data backups and the log backups are needed.

Perform a database recovery to the most recent state and in the Recovery Wizard
choose the option Initialize log area
to prevent the recovery of entries from the unusable log area.

To reset the database to a particular point in time, you need a data backup from before
the point in time to recover to, the
subsequent log backups, and the log area. All changes made after the recovery time will
be lost.

Perform a database recovery to a point-in-time before the logical error.
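
For reference, the recovery types above correspond roughly to SQL statements that the recovery wizard issues while the database is offline. This is a sketch only; in practice the wizard should be used, and the exact syntax and options depend on your release and backup destination:

-- point-in-time recovery (timestamp is an example)
RECOVER DATABASE UNTIL TIMESTAMP '2014-05-28 02:00:00'

-- recovery to a specific data backup only, without log replay
RECOVER DATA USING FILE ('COMPLETE_DATA_BACKUP') CLEAR LOG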

Q5. When modeling with Analytic Views in SAP HANA, you need to deal with
billions of rows of data.
What features/capabilities of HANA could you use?

Answer:
Partitioning SAP HANA database tables
When modeling with Analytic Views in SAP HANA, you can use the table partitioning feature of the database. This allows tables to be partitioned horizontally into disjunctive sub-tables, also known as partitions.

Partitioning supports the creation of very large tables by decomposing them into smaller
and more manageable pieces.
It is transparent for most SQL queries and Data Manipulation Language (DML)
statements.

SAP HANA database tables with more than two billion rows
A non-partitioned SAP HANA database table can store 2 billion rows. By utilizing SAP HANA partitioning as described in the guide SAP HANA Database - Partitioning and Distributing Large Tables, this limitation of 2 billion rows for each database table can be overcome by distributing the table rows to several partitions.

Please note that each partition can contain up to 2 billion rows. For example, a 3 billion row table can be partitioned into two partitions, each containing 1.5 billion rows. Having more than 2 billion rows in an internal table (for example, for intermediate results) is not supported.
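
As an illustration, a table can be created with hash partitioning so that no single partition has to hold more than 2 billion rows. The table and columns below are hypothetical:

CREATE COLUMN TABLE SALES_FACT (
  ORDER_ID BIGINT,
  CUSTOMER_ID INTEGER,
  AMOUNT DECIMAL(15,2)
) PARTITION BY HASH (ORDER_ID) PARTITIONS 4;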

Characteristics of modeling view types with big data tables


By utilizing SAP HANA database partitioning, large datasets of over 2 billion rows can be operated upon. When modeling with, or querying on, these large data sets, you should use Analytic Views.

Queries on this large amount of data might lead to an "out of memory" situation if a large (intermediate) result set is requested.

It is recommended to avoid the following query types/modeling constructs in analytic views, as they might lead to an "out of memory" situation:

• Do not use group by on key or key-like constructs

• Do not reference key or key-like constructs

• Do not use formulas that are based on calculated attributes/keys/key-like constructs

Q6. A user has written a complex SQL statement (or it was autogenerated by a BI
Client Tool) and would like
to find out where most of the execution time is spent.

Answer:
You can use the functionality of EXPLAIN PLAN to find out where execution time is spent.
The code structure is as below:

Populating QUERY_PLANS view


EXPLAIN PLAN SET PLAN_ID = <string> FOR <SQL statement>

Querying QUERY_PLANS view

SELECT LPAD(' ', LEVEL * 2) || OPERATOR_NAME, <additional columns>
FROM QUERY_PLANS
WHERE PLAN_ID = <string>
ORDER BY OPERATOR_ID

Example:
alter session set current_schema=tpchsmoke;
delete from query_plans;
explain plan for
select * from r_lineitem left outer join r_part on l_partkey = p_partkey where l_quantity < 300;

select lpad(' ', level) || operator_name, operator_details from query_plans order by operator_id;

Q7. Alerts are triggered in HANA regarding memory usage and upon further
investigation, HANA Studio and
DBACockpit seem to provide 2 different views on the resident memory.
Can you explain what may be going on?

Answer:
Before proceeding with analysing the different HANA-specific memory areas, it has to be clarified that it is indeed the database, and not processes running outside the SAP software, that is responsible.

This involves calculating how much resident memory is used by the HANA database. On the HANA appliance, resident memory used outside HANA (OS, 3rd-party processes) is typically quite small, not exceeding 2 GB.

HANA Studio and DBACockpit provide 2 different views on the resident memory:

HANA Studio:

The following SQL statements are behind this output:


'Database Resident':
SELECT SUM(PHYSICAL_MEMORY_SIZE) FROM M_SERVICE_MEMORY

'Total Resident':
SELECT T1.HOST, T1.USED_PHYSICAL_MEMORY + T2.SHARED_MEMORY_ALLOCATED_SIZE
FROM M_HOST_RESOURCE_UTILIZATION AS T1
JOIN (SELECT M_SERVICE_MEMORY.HOST,
             SUM(M_SERVICE_MEMORY.SHARED_MEMORY_ALLOCATED_SIZE) AS SHARED_MEMORY_ALLOCATED_SIZE
      FROM SYS.M_SERVICE_MEMORY
      GROUP BY M_SERVICE_MEMORY.HOST) AS T2
ON T2.HOST = T1.HOST;

Since the difference between 'Total Resident' and 'Database Resident' is well below 2
GB, there is no indication that
processes outside the database contribute significantly to memory consumption.

DBACockpit:

This is effectively the output of the following statement:
SELECT HOST,
       ROUND(USED_PHYSICAL_MEMORY/1024/1024/1024, 2) AS "Resident GB",
       ROUND((USED_PHYSICAL_MEMORY + FREE_PHYSICAL_MEMORY)/1024/1024/1024, 2) AS "Physical Memory GB"
FROM PUBLIC.M_HOST_RESOURCE_UTILIZATION

DBACockpit does not currently consider shared memory as part of the resident
memory, hence the difference to
what HANA Studio reports.

Q8. An 'out of memory' error is not yet triggered, but further investigation on the
current memory usage of the
database might be required.
How would you proceed with the detailed analysis?

Answer:
There are 2 cases that have to be distinguished in general:
1. An 'out of memory' error has already occurred.

2. An 'out of memory' error was not yet triggered, but further investigation on the
current memory usage of the
database might be required.

For case (1), the foundation for further analysis is the oom trace file which follows the naming convention <processname>_<hostname>.<number>.rtedump.<number>.oom.trc (for example indexserver_hdbnode1.39503.rtedump.72290.oom.trc).

In case (2), such a trace file is not yet available and an rte dump has to be generated
which has a structure similar to the
oom trace.

For case (2), further analysis can also be done 'top-down', making use of the
information provided by the different
administrative frontends (HANA Studio / DBACockpit) and monitoring views.

2. Top down approach

2.1 Which HANA process on which host is using most of the memory?

SQL Statement:
SELECT TOP 3 HOST, PORT, SERVICE_NAME, TOTAL_MEMORY_USED_SIZE
FROM M_SERVICE_MEMORY
ORDER BY TOTAL_MEMORY_USED_SIZE DESC

2.2 Is shared or heap memory using the largest share?


So far, the sum of used heap and shared memory has been determined. Now, this total
has to be split:

SQL Statement:
SELECT TOP 3 HOST, PORT, SERVICE_NAME, HEAP_MEMORY_USED_SIZE,
SHARED_MEMORY_USED_SIZE,
TOTAL_MEMORY_USED_SIZE
FROM M_SERVICE_MEMORY
ORDER BY SHARED_MEMORY_USED_SIZE DESC

If it is shared memory, further analysis is required on 'Shared Memory Usage'; otherwise, further analyse 'Heap memory usage' (see further details below).

3. Trace file approach


The following sections of those trace files are usually relevant:
[IPMM_MEMORY]

The first part of this section lists the local (heap) memory that the processes that make up the HANA database are currently using.

For each process, the allocation limit (PAL), the amount of memory currently allocated (AB) and the amount used (U) are displayed. An AB value that is significantly larger than the U value does not necessarily indicate a problem: if necessary, allocated but unused memory will be released.

Of particular interest are lines where AB, U and PAL have approximately the same value. Such a process has reached its allocation limit. Further analysis then has to be based on the output of M_HEAP_MEMORY in the section [PROCESS_INFO].

The second part of section [IPMM_MEMORY] contains information regarding the shared memory.

Shared Memory is used for various purposes. For practical purposes, it is sufficient to assume that mainly row store tables occupy shared memory. In case of the 'early exits since' output in the oom trace, the error was triggered by the inability to further increase the TPAL (temporary allocation limit) of the failing process.

The allocation limit of a process is not set to the maximum right at process start, but
increased (and decreased) over
time. In this specific case, 'sum of temp process allocation limits' + SHARED_MEMORY
exceeds the
GLOBAL_MAX_ALLOCATION_LIMIT which consequently causes the error.

[MEMORY_OOM]
Before any detailed analysis is done, the information from section [MEMORY_OOM] has to be reviewed as well. It might be the case that the request was so large that it was impossible to fulfil it anyway.

The following checks can then be done based on the information available so far:

4. Shared Memory Usage


In contrast to heap memory, which is allocated using 'malloc', shared memory is provided using the 'shmget' system call. The results of shared memory creation can then be viewed at OS level using the ipcs command.

4.1 Memory usage of the row store


1. The row store is organized in 64 MB segments. Deleting a large number of rows can lead to sparse segments and unnecessary use of memory.

2. Indirectly, high memory usage of the row store can cause problems when the parameter client_distribution_mode is set to 'off' in distributed environments. When the default value 'statement' is used, the statement is sent to the node where the table is located. With the setting 'off', the statement might then also be directed to the master node.

Since all row store tables are usually located on the master node and the row store consequently uses a lot of memory there, materializing a large amount of data on the master node (from a table that is actually located on another node) can then simply be too much.

3. Too many large tables were created as row store tables. The largest tables
currently contained in the row store can be retrieved by the following SQL Statement:

SELECT TOP 50 * FROM M_RS_TABLES
ORDER BY (ALLOCATED_FIXED_PART_SIZE + ALLOCATED_VARIABLE_PART_SIZE) DESC

It is however important to keep in mind that in many scenarios, switching a table from rowstore to columnstore and vice versa must not be done. This applies to Suite on HANA / BW systems which are delivered with particular settings for different table types.

5. Heap memory usage


If it can be ruled out that the rowstore is responsible for high memory consumption, the
heap memory allocation of the
individual processes has to be analyzed next. In most cases, the process of interest is
the indexserver, since it usually
allocates the greatest share of the available memory. The following possibilities exist to
do this:

1. If an oom error has already occurred or an rte dump was explicitly triggered, the content of the system view M_HEAP_MEMORY (section [PROCESS_INFO]) that was written into the trace file can be used.

2. If an ad hoc analysis of the heap memory consumption should be done, the view M_HEAP_MEMORY can be queried directly, for example using HANA Studio.

In case of (1), the output can be copied directly into a CSV file and analyzed further, for example using Excel. Each line represents an allocator, which is a facility for allocating heap memory in HANA.
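
For the ad hoc case (2), the largest allocators can be listed directly with a query such as the following (a sketch; column names as in the M_HEAP_MEMORY monitoring view):

SELECT TOP 10 HOST, PORT, CATEGORY, EXCLUSIVE_SIZE_IN_USE, INCLUSIVE_SIZE_IN_USE
FROM M_HEAP_MEMORY
ORDER BY EXCLUSIVE_SIZE_IN_USE DESC;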

Of interest are those pools that are at the top of the list and have an inclusive size close to the exclusive size. Depending on the pool names, the following known issues exist:

Pool/parallel/ihm
There are cases where this allocator used 90% of the available memory. If such a
problem is observed, the system
should be upgraded to revision 54 when available.

Pool/itab
This allocator is needed for the intermediate results of join operations and the translation tables used for join operations. This could indicate the use of a suboptimal model.

Pool/FemsCompression/CompositeFemsCompression
A known issue associated with high memory usage of this allocator is solved with revision 53. FEMS compression is used in BW systems to keep the size of the dataset that is transferred to the application server as small as possible.

Pool/ValueArray
This is the allocator which keeps the complete result set in uncompressed form. A known issue, solved with revision 55, was that memory used by this allocator was not released after oom situations and thus steadily increased over time.

Pool/JoinEvaluator/TranslationTable
Translation tables are created for caching join column entries of join engine queries.
The number of translation tables that
are kept in memory is determined by the following parameter in indexserver.ini:

[joins]

translator_cache_size = n

The default value is 2000. In case of excessive memory usage for this allocator, the
parameter can be set to 500. From
SP6 onwards it will be possible to adjust a memory limit for these tables instead of
specifying a number.
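
As a sketch, the parameter can be changed with SQL using the section and parameter name given above (the value 500 is the reduction suggested in the text):

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
SET ('joins', 'translator_cache_size') = '500' WITH RECONFIGURE;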

6. Memory Usage of the Statistics server


Even though the statisticsserver can also run into oom errors, it should not consume a large share of the available memory. The statisticsserver is supposed to have a much smaller allocation limit compared to the indexserver. The following two configuration parameters are relevant:
allocationlimit

minallocationlimit

The allocation limit of the statisticsserver is either a percentage of the available RAM or just the value of minallocationlimit, depending on which value is bigger.

In case of oom errors that indicate that the statisticsserver has reached its process
allocation limit, it can make sense to
increase the allocationlimit to 10% or 15% to have at least a workaround for the time
being.
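
A sketch of such a change, assuming the parameter is located in the memorymanager section of statisticsserver.ini:

ALTER SYSTEM ALTER CONFIGURATION ('statisticsserver.ini', 'SYSTEM')
SET ('memorymanager', 'allocationlimit') = '15' WITH RECONFIGURE;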

