Вы находитесь на странице: 1из 2

The life of a log segment (A story about the life of a data block that

gets into a HANA database)


FollowRSS feed

There has been some confusion in database-land about the terms

 Log Volume
 Data Volume
 Log backup files
 archived
 written
 committed
 rollback
 savepoint
 point-in-time recovery

People tend to address this confusion by defining the terms. But I would like to tell you a story
about the life of a data block that gets into a HANA database. Note that my explanations are
specific to the HANA database management system running in a Dell single-node appliance but
shed a light to “all” SQL databases.

To keep it simple let’s start with a hello world example. First we create a table world:

dellhana:~ # hdbsql -u system -p mypassword -i 42

Welcome to the SAP HANA Database interactive terminal.

Type: \h for help with commands

\q to quit

hdbsql=> create table world ( c varchar(20) )

Then we add a line:

insert into world values(‘hello’)

This is now an uncommitted (database) transaction. To get it onto disk use the command

commit

Whenever you commit an insert, update or delete statement the data is written to disk, into the
log volume (/hana/log), in a file called a log segment. Log segments are by default 1GB:

dellhana:~ # ls -ltr /hana/log/mnt00001/hdb00002

-rw——- 1 anaadm sapsys 1073741824 Aug 6 12:12


logsegment_000_00000024.dat

-rw——- 1 anaadm sapsys 1359872 Aug 6 12:14


logsegment_000_directory.dat

-rw——- 1 anaadm sapsys 1073741824 Aug 8 12:07


logsegment_000_00000025.dat
In this example mnt00001 is the first node in the HANA cluster (there is only one as we are
talking about single-node HANA). hdb00002 belongs to the indexserver service (find this out
with hdbstudio -> Administration Console -> Volumes). It is August 8 and the current log
segment is number 25. There you can find our “hello” string:

dellhana:~ # strings
/hana/log/mnt00001/hdb00002/logsegment_000_00000025.dat | grep hello

hello

Once the log segment containing the “hello” string is filled, a new log segment will be started.
The old segment will get a flag as “to be archived”. When the next log backup is performed, the
“hello” segment will be written and the “to be archived” flag will be removed. The backup is
called “archive log” in Oracle lingo and “log backup files” in HANA lingo. When your redo log
volume runs full, the log segments that have been archived will be overwritten. If there are no
archived log segments left, logmode normalwill stop operations in order to allow corrective
actions. Logmode overwrite will overwrite log segments even if they have never been archived.
Neither logmode will overwrite data that is needed for HANA to start up.

Find out your HANA’s log mode by running the SQL command:

SELECT value FROM “SYS”.”M_INIFILE_CONTENTS” WHERE KEY=’log_mode’

Once a commit is done after the insert statement, other transactions can read this new table
row. Also, during the next savepoint, “hello” will be written into the table “world” in the data
volume. By default, HANA performs a savepoint every 5 minutes. Having a short interval for the
savepoints will not only reduce data loss when the log volume fails, but also lead to faster start-
up times of the HANA database.

Why does a shorter savepoint interval reduce database startup times? Let’s imagine you issue

insert into world values(‘hello’)

commit

Now the redo logs contain an information like “table world, first row is ‘hello’ starting 3:01pm”.
When the system reboots at 3:02pm, the database will have to recover. The database
willredo all committed transactions from before reboot and after the last savepoint. Then it
willrollback all the uncommitted transactions from the same timeframe. The shorter that
timeframe is the less needs to be done.

The above example also shows how point-in-time recovery works. When you keep one data
backup and all log backups you will be able to tell the value of any cell in any table of your
database at a specific time. This first row in table “world” is empty at 3:00pm. And at 3:02pm it
contains “hello” (in the above example).

And if you now imagine that you restore one backup at a secondary site and ship the stream of
log backups from the primary to the secondary site you understand asynchroneous
datacenter mirroring by transaction log shipping.

Now our data block “hello” has been inserted, committed, written to the redo log volume, has
survived a reboot, was saved to the data volume, backed up to the archive logs and transmitted
via transaction log shipping. This story has helped me in recent times to memorize the terms
correctly and should help you to lead an educated discussion.

Вам также может понравиться