Вы находитесь на странице: 1из 22

TRACKING DATA CHANGES

With Temporal Tables and More


INTRODUCTION

What is this presentation about?


• The how's and whys of tracking changes to the data in your database using
SQL Server change tracking technologies available in SQL 2016
• Emphasis on temporal tables and as time permits
• Change Data Capture
• Change Tracking
• Triggers
TRACKING DATA CHANGES
OVERVIEW

Possible Whys
• Audit trail - who changed what when
• Analysis - maintaining a data warehouse
• Time travel - reconstructing state of the data at a point in time in the past
• Data sync – one and two way sync’ing
• Repairing row-level data - recovering from accidental data changes and
application errors
TRACKING DATA CHANGES
OVERVIEW

Things to consider when preparing to track changes


• Identifying what to track and what not to track
• Costs of tracking / not tracking
• Legal requirements
• New vs legacy systems
• What will your hardware / software support
TRACKING DATA CHANGES
OVERVIEW

Dealing with date and time


• Time zones and UTC
• Format of
• Assertion(History) vs version (Validity) vs bi-temporal
• Does the date/time represent when the change occurred or when the information in the
row was valid.
• Who or what is the date/time relative to?
TEMPORAL TABLES

• Part of the ANSI SQL 2011 standard. First introduced with SQL Server 2016

• Designed to keep a full history of data changes at the row level using a
‘current’ table and a history table

• System-versioned because the start and end dates for each row is managed by
the system

• Assertion / historical in nature meaning that the lifetime of a is record based on


the physical dates the record was changed
TEMPORAL TABLES

Good for
• Audit trail
• Point in time analysis
• Maintaining a data warehouse
• Repairing row-level data
Not so good for
• Data synchronization
• Short term data history requirements
• Column level change tracking
• Tables that have frequent schema changes
TEMPORAL TABLES - DEMO

Requirements for the base table


• Must have two explicitly defined datetime2 non-nullable columns (period columns)
• Used exclusively by the system to record ‘life-span’ of a row
• Names of columns can be customized 
• Date/time stored as coordinated universal time (UTC)
• Can be hidden
• Must have a PERIOD FOR SYSTEM_TIME statement.  Identifies the period columns
• Must have a primary key
• INSTEAD OF triggers are not allowed on the base table. AFTER triggers allowed.
• Foreign keys with cascade update or delete are not allowed
• Cannot have any computed columns based on a user defined function
TEMPORAL TABLES - DEMO

Requirements for the history table


• Must have the same schema as base table
• No primary key, foreign keys, unique indexes, table constraints or triggers.
• Cannot be configured for change data capture, change tracking, transactional or
merge replication
• By default, a clustered index created using period columns
• Can be in a different schema than the base table but the same database
• Data cannot be modified.  This includes INSERT, UPDATE, DELETE and
TRUNCATE
CHANGE DATA CAPTURE

• First introduced in SQL 2008, Change Data Capture (CDC) captures DML
changes on a tracked table. Enterprise only unless you have SQL 2016 SP1
• Uses SQL Server Agent jobs to asynchronously read the transaction log to track
and record DML transactions
• When enabling CDC on a table, an additional table is created containing the
same columns as tracked table along with metadata needed to understand the
changes
• Table-valued functions are also created to allow access the change history
• CDC was written by the SQL Server Replication team so it works with
transaction replication
CHANGE DATA CAPTURE

Good for
• Maintaining a data warehouse
• Short term audit trail
• Data synchronization (using net changes)

Not so good for


• Long term storage of change history
• Point in time analysis
• Solutions that require real time data capture
CHANGE DATA CAPTURE

How it works
SQL Server Agent jobs
• One used to populate the change tables
• One responsible for change table cleanup  
• SQL Server Agent must be running for data capture and clean up
to work 

Transaction logs
• SQL Server Agent jobs need to be running for the transaction log
to get truncated
• Change data is not available until after change is committed to
source table and capture job has processed the related log entries
CHANGE DATA CAPTURE - DEMO

Validity interval
• Changes captured by CDC has a finite lifespan called the validity interval.
• Important because the extraction interval for a request must be fully covered by this time period.
• Default is 3 days measured in mins

Metadata
• __$start_lsn identifies the commit log sequence number (LSN) that was assigned to the change.
• __$end_lsn not supported. Future compatibility not guaranteed. Column always NULL (SQL 2012)
• __$seqval used to order changes that occur in the same transaction
• __$operation records the operation associated with the change: 1 = delete, 2 = insert, 3 = update
(before), and 4 = update (after)
• __$update_mask variable bit mask with one defined bit for each captured column.
• Insert and delete - all bits set. Update - bits set correspond to changed columns
CHANGE DATA CAPTURE - DEMO

Requirements
• Does not play nice with memory-optimized tables
• SQL Server Agent must be running
• Net changes: source table has a primary key or an unique index referenced by
index_name when setting up CDC on the table.
CHANGE TRACKING

• A lightweight solution first introduced in SQL 2008 that provides an efficient


mechanism to identify records in a user table that have changed on account of
DML transactions
• Tracks the fact that a row has changed but not how many times the row has
changed or the values of any intermediate changes.
• Changes are tracked as they occur
• Clean up of internal tables used to keep track of what records have changed are
managed by the system
• Built-in functions are available to easily query to data changes
CHANGE TRACKING

How it works
• Once enabled, any DML statement will cause change tracking information for
each modified row to be recorded.
• Each table enabled for Change Tracking has an internal on-disk table created
on the same filegroup.  This table is used to record table change versions and
the rows that have changed since a particular version.
• When a change occurs the row’s primary key, list of impacted columns
(optional) and DML command are recorded in the internal tracking table.
• An auto cleanup thread purges expired content of the internal tracking tables
based on the retention period setting of the database.
CHANGE TRACKING

Good for
• Data synchronization

Not so good for


• Any solution that requires that each data change is tracked and
recorded.
CHANGE TRACKING - DEMO

Requirements
• The database compatibility level must be set to 90 (SQL 2005) or greater.
• Snapshot isolation is strongly recommended to help ensure that all change
tracking information is consistent.
• Tables being tracking must have a primary key.  
• DDL operations are not tracked
• Deletes caused by a TRUNCATE statement are not tracked. However, this will
cause the minimum valid version to be updated. Client applications would
then need re-initialize their data.
TRIGGERS

• A special kind of stored procedure attached to an event.  When the event


happens, SQL Server executes the trigger’s code
• 4 types: DML, DDL, CLR, Logon
• INSTEAD OF and AFTER
• DML triggers in SQL Server work at the statement-level. This means that one
trigger instance fires per statement regardless of how many rows were affected,
even zero
• Triggers extend the life of a transaction. While a trigger is at work, the
associated transaction will not commit until the trigger completes
• Due to their event driven nature, issues caused by triggers can be difficult to
troubleshoot
TRIGGERS

Good for
• Auditing
• One audit table per user table or one audit table for all user tables
• Anything if you want to put the work into building the solution
• Any environment that is SQL Server 2005 or earlier

Not so good for


• Scenarios where simplicity is required or other change tracking technologies
would do the job just as well or better
TRIGGERS - DEMO
CONCLUSION

The tools available natively in Microsoft SQL Server to track data changes have
gotten better and better over time. With SQL 2016 SP1 most if not all these tools
are available to everyone with a SQL standard license or better. There is really
no reason why you should be suffering with a system that does a poor job of
tracking changes in your DB.

In addition to the tools that are offered natively in SQL Server there are also 3 rd
party venders that provide off the shelf solutions for tracking data changes. For
example, one of the sponsors of today’s event ApexSQL has a product called
ApexSQL Audit that tracks and reports events on SQL Server by auditing access
and changes to the SQL Server instance and its objects.

Вам также может понравиться