QuickStart Intelligence
Planning for SQL Server 2008 R2 Indexing
Module 5
Contents:
Lesson 1: Core Indexing Concepts
Lesson 2: Data Types and Indexes
Lesson 3: Single Column and Composite Indexes
Lab 5: Planning for SQL Server Indexing
Implementing a Microsoft SQL Server 2008 R2 Database
Module Overview
An index is a collection of pages associated with a table. Indexes are used to improve the performance of queries or enforce uniqueness. Before learning to implement indexes, it is important to understand how they work, how effective different data types are when used within indexes, and how indexes can be constructed from multiple columns.
Objectives
After completing this module, you will be able to:
• Explain core indexing concepts
• Describe the effectiveness of each data type commonly used in indexes
• Plan for single column and composite indexes
Lesson 1
Core Indexing Concepts
While it is possible for SQL Server to read all the pages in a table when calculating the results of a query, doing so is often highly inefficient. Indexes can be used to point to the location of required data and to minimize the need for scanning entire tables. In this lesson, you will learn how indexes are structured and learn the key measures associated with the design of indexes. Finally, you will see how indexes can become fragmented over time.
Objectives
After completing this lesson, you will be able to:
• Describe how SQL Server accesses data
• Describe the need for indexes
• Explain the concept of B-Tree index structures
• Explain the concepts of index selectivity, density and depth
• Explain why index fragmentation occurs
How SQL Server Accesses Data
Key Points
SQL Server can access data in a table by reading all the pages of the table (known as a table scan) or by using index pages to locate the required rows.
Indexes
Whenever SQL Server needs to access data in a table, it decides whether to read all the pages of the table or whether one or more indexes on the table would reduce the effort required to locate the required rows. Queries can always be resolved by reading the underlying table data. Indexes are not required, but accessing data by reading large numbers of pages is usually considerably slower than methods that use appropriate indexes.
On occasion, SQL Server will create its own temporary indexes to improve query performance. These temporary indexes are only used to improve a query plan when no suitable index already exists. Creating them is up to the optimizer and beyond the control of the database administrator or programmer, so they will not be discussed in this module.
In this module, you will consider standard indexes created on tables. SQL Server includes other types of index:
• Integrated full-text search is a special type of index that provides flexible searching of text.
• Spatial indexes are used with the GEOMETRY and GEOGRAPHY data types.
• Primary and secondary XML indexes assist when querying XML data.
Each of these other index types is discussed in later modules in this course.
Question: When might a table scan be more efficient than using an index?
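The effort involved in either access method can be observed by asking SQL Server to report the pages it reads. The following is a minimal sketch, assuming the MarketDev lab database and the Marketing.Prospect table that appear later in this module; whether the query is resolved by a scan or by an index depends on the indexes that exist on your copy of the table.

```sql
-- Assumes the MarketDev lab database used in Lab 5.
USE MarketDev;
GO
-- Report the number of pages read per table in the Messages tab.
SET STATISTICS IO ON;
GO
SELECT ProspectID, FirstName, LastName
FROM Marketing.Prospect
WHERE ProspectID = 12553;
GO
SET STATISTICS IO OFF;
GO
```

Comparing the logical reads reported for this query with the total number of pages in the table gives a rough indication of how much work an index saved.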
The Need for Indexes
Key Points
Indexes are not described in ANSI SQL definitions. Indexes are considered to be an implementation detail. SQL Server uses indexes for improving the performance of queries and for implementing certain constraints.
The Need for Indexes

As mentioned in the last topic, SQL Server can always read the entire table to work out required results but doing so can be inefficient. Indexes can reduce the effort required to locate results but only if the indexes are well-designed. SQL Server also uses indexes as part of its implementation of primary key and unique constraints. When you assign a primary key or unique constraint to a column or set of columns, SQL Server automatically indexes that column or set of columns. It does so to make it fast to check whether or not a given value is already present.
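The automatic indexing of constraints can be seen in the catalog. The sketch below uses a hypothetical table (dbo.IndexDemo and its constraint names are illustrative only, not part of the course databases) and lists the indexes that SQL Server created to support its constraints.

```sql
-- Hypothetical demo table; all names are illustrative only.
CREATE TABLE dbo.IndexDemo
(
    DemoID int NOT NULL CONSTRAINT PK_IndexDemo PRIMARY KEY,
    Code   varchar(10) NOT NULL CONSTRAINT UQ_IndexDemo_Code UNIQUE
);
GO
-- One index should be listed for each constraint declared above.
SELECT name, type_desc, is_primary_key, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.IndexDemo');
GO
-- Remove the demo table again.
DROP TABLE dbo.IndexDemo;
GO
```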
A Useful Analogy
At this point, it is useful to consider an analogy that might be easier to relate to. Consider a physical library. Most libraries store books in a given order, which is basically an alphabetical order within a set of defined categories.
Note that even when you store the books in alphabetical order, there are various ways that this could be done. The order of the books could be based on the name of the book or the name of the author. Whichever option is chosen makes one form of access easy and does not help other methods of access. For example, if books were stored in book name order, how would you locate books written by a particular author? Indexes assist with this type of problem.
Question: Which different ways might you want to locate books in a physical library?
Index Structures
Key Points
Tree structures are well known for providing rapid search capabilities for large numbers of entries in a list.
Index Structures
Indexes in database systems are often based on balanced tree (B-Tree) structures. The "B" does not stand for binary. Binary trees are simple structures where, at each level, a decision is made to navigate left or right. This style of tree can quickly become unbalanced and less useful. SQL Server indexes are based on a form of self-balancing tree. Whereas binary trees have at most two children per node, SQL Server indexes can have a large number of children per node. This helps improve the efficiency of the indexes and avoids the need for excessive depth within an index. Depth is defined as the number of levels from the top node (called the root node) to the bottom nodes (called leaf nodes).
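The depth of an existing index can be read with the INDEXPROPERTY function. The following is a small sketch, assuming the MarketDev lab database and its Marketing.Prospect table; any other table name could be substituted.

```sql
-- Assumes the MarketDev lab database used in Lab 5.
USE MarketDev;
GO
SELECT i.name AS index_name,
       i.type_desc,
       -- Number of levels from the root node to the leaf nodes.
       INDEXPROPERTY(i.object_id, i.name, 'IndexDepth') AS index_depth
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'Marketing.Prospect')
  AND i.index_id > 0;   -- index_id 0 is a heap, which has no tree
GO
```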
Selectivity, Density and Index Depth
Key Points
When designing indexes, three core concepts are important: selectivity, density and index depth.
Selectivity, Density and Index Depth

Additional indexes on a table are most useful when they are highly selective, that is, when a query that uses the index returns only a small proportion of the rows in the table. For example, imagine how you would locate books by a specific author in a physical library using a card file index. This would be a process such as:
• Find the first entry for the author in the index.
• Locate the book in the bookcases based on the information in the index entry.
• Return to the index and find the next entry for the author.
• Locate the book in the bookcases based on the information in that next index entry.
• And so on.
Now imagine doing the same for a range of authors, such as one third of all the authors. You quickly reach a point where it would be quicker to just scan the whole library and ignore the author index rather than running backwards and forwards between the index and the bookcases.
Density is a measure of the lack of uniqueness of the data in a table. A dense table is one that has a high number of duplicates.
Index depth is a measure of the number of levels from the root node to the leaf nodes. Users often imagine that SQL Server indexes are quite deep, but the reality is quite different. The large number of children that each node in the index can have produces a very flat index structure. Indexes with only 3 or 4 levels are very common.
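Selectivity and density can be estimated directly from the data. The sketch below assumes the MarketDev lab database and the Marketing.Prospect table; the LastName predicate is an arbitrary illustration, and density is taken here as 1 divided by the number of distinct values, which is how SQL Server statistics report it.

```sql
-- Assumes the MarketDev lab database used in Lab 5.
USE MarketDev;
GO
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT LastName) AS distinct_values,
       -- Density: lower values mean fewer duplicates.
       1.0 / NULLIF(COUNT(DISTINCT LastName), 0) AS density,
       -- Fraction of rows an illustrative predicate would return;
       -- the smaller the fraction, the more selective the predicate.
       SUM(CASE WHEN LastName LIKE 'S%' THEN 1 ELSE 0 END) * 1.0
           / NULLIF(COUNT(*), 0) AS fraction_selected
FROM Marketing.Prospect;
GO
```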
Index Fragmentation
Key Points
Index fragmentation is the inefficient use of pages within an index. Fragmentation occurs over time as data is modified.
Index Fragmentation
For operations that read data, indexes perform best when each page of the index is as full as possible. While indexes may initially start full (or relatively full), modifications to the data in the indexes can cause the need to split index pages. From our physical library analogy, imagine a fully populated library with full bookcases. What occurs when a new book needs to be added? If the book is added to the end of the library, the process is easy but if the book needs to be added in the middle of a full bookcase, there is a need to readjust the bookcase.
Internal vs. External Fragmentation

Internal fragmentation is similar to what would occur if an existing bookcase was split into two bookcases. Each bookcase would then be only half full. External fragmentation relates to where the new bookcase would be physically located. It would probably need to be placed at the end of the library, even though it would "logically" need to be in a different order. That means that to read the bookcases in order, you could no longer just walk directly from bookcase to bookcase but would need to follow pointers around the library to follow a chain from one bookcase to another.
Detecting Fragmentation
SQL Server provides a measure of fragmentation through the sys.dm_db_index_physical_stats dynamic management function. The avg_fragmentation_in_percent column shows the percentage of fragmentation.
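A query along the following lines lists fragmentation for every index in the current database. It is a sketch rather than a tuned script: the LIMITED scan mode is chosen here only because it is the least expensive, and the join to sys.indexes is added to show index names.

```sql
-- Run in the database you want to inspect.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;
GO
```

Very small indexes often report high percentages that have little practical effect, so page_count is worth reading alongside the fragmentation figure.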
SQL Server Management Studio also provides details of index fragmentation in the properties page for each index as shown in the following screenshot from the AdventureWorks2008R2 database:
Question: Why does fragmentation affect performance?
Demonstration 1A: Viewing Index Fragmentation
Key Points
In this demonstration, you will see how to identify fragmented indexes.
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln and click Open.
3. Open and execute the 00 Setup.sql script file from within Solution Explorer.
4. Open the 11 Demonstration 1A.sql script file.
5. Follow the instructions contained within the comments of the script file.
Question: How might solid state disk drives change concerns around fragmentation?
Lesson 2
Data Types and Indexes
Not all data types work equally well as components of indexes. In this lesson, you will learn how effective a number of common data types are, when used within indexes. This will assist you in choosing data types when designing indexes.
Objectives
After completing this lesson, you will be able to:
• Describe the effectiveness of numeric data when used in indexes
• Describe the effectiveness of character data when used in indexes
• Describe the effectiveness of date-related data when used in indexes
• Describe the effectiveness of GUID data when used in indexes
• Describe the effectiveness of BIT data when used in indexes
• Explain how computed columns can be indexed
Numeric Index Data
Key Points
Numeric data types tend to produce highly-efficient indexes. Exact numeric types are the most efficient.
Numeric Index Data

When numeric values are used as components in indexes, a large number of entries can fit in a small number of index pages. This makes reading indexes based on numeric values very fast.
Sort operations are very common in index operations. Numeric values are fast to compare and sort. This improves the general performance of index operations and also reduces the time taken to rebuild an index, should this be required.
While numeric values are efficient in indexes, this typically applies only to the exact numeric data types. FLOAT and REAL data types are much less useful in indexes as they are larger and require more complex comparison techniques than exact numeric data types. FLOAT and REAL data types are also not precise, which can lead to unpredictable results when used with the equality (=) operator. The same situation occurs outside their use in indexes. Care needs to be taken with FLOAT and REAL predicates in certain operations.
INT and BIGINT are the most efficient data types for indexing as they are relatively small and operations on them are very fast.
Question: Would you imagine that processor bit size affects the speed when comparing INT or BIGINT values?
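The imprecision of FLOAT values with the equality operator can be illustrated with a short batch. This is a sketch using arbitrary values (0.1, 0.2 and 0.3 are chosen only because they cannot be represented exactly in binary floating point); the first comparison is expected to report 'not equal' while the exact numeric version reports 'equal'.

```sql
-- Arbitrary illustrative values; variable names are hypothetical.
DECLARE @a float = 0.1, @b float = 0.2;

SELECT
    -- Approximate numeric: the sum is not exactly 0.3.
    CASE WHEN @a + @b = 0.3
         THEN 'equal' ELSE 'not equal' END AS float_compare,
    -- Exact numeric: decimal arithmetic gives exactly 0.30.
    CASE WHEN CAST(0.1 AS decimal(9,2)) + CAST(0.2 AS decimal(9,2)) = 0.3
         THEN 'equal' ELSE 'not equal' END AS decimal_compare;
GO
```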
5-13
Character Index Data
Key Points
While it might seem natural to base indexes on character data, indexes constructed on character data tend to be less efficient than those constructed on numeric data. However, character based indexes are both common and useful.
Character Index Data

Character data values tend to be larger than numeric values. For example, a character column might hold a customer's name or address details. This means that far fewer entries can exist in a given number of index pages. This makes character-based indexes slower to seek.
Consider the operations required when comparing two string values. The complexity of these operations depends upon whether or not a strict binary comparison of the values can be undertaken. Most SQL Server systems use collations that are not based on binary comparisons. This means that every time a string value needs to be compared to another string value, a complex set of rules needs to be applied to determine the outcome of the comparison. This complexity makes character-based index operations slow in comparison with the same operations on numeric values.
Character-based indexes also tend to cause fragmentation problems as new values are almost never ascending or descending.
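The effect of collation rules on a comparison can be seen by applying two different collations to the same pair of strings. The sketch below uses two standard Windows collations as examples; the string values are arbitrary.

```sql
SELECT
    -- Case-insensitive, accent-sensitive rules: the strings match.
    CASE WHEN 'index' = 'INDEX' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS rule_based_compare,
    -- Binary rules: the byte values differ, so the strings do not match.
    CASE WHEN 'index' = 'INDEX' COLLATE Latin1_General_BIN
         THEN 'equal' ELSE 'not equal' END AS binary_compare;
GO
```

The rule-based comparison has to do more work than the binary one to reach its answer, which is the cost the text above describes.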
5-14
Date-Related Index Data
Key Points
Date related data types make good keys within indexes.
Date-Related Index Data

Date related data types are only slightly less efficient than the integer data types. Date related data types are relatively small and can be compared and sorted quickly. Dates are very important (and very commonly used) in business applications as almost all business transactions involve a date (and possibly a time) when the transaction occurred. It is a very common requirement to need to locate all the transactions that occurred on a particular date or during a particular period (or range of dates). SQL Server 2008 introduced the date data type. It is very effective as a component of an index and more efficient than data types that also include time components.
5-15
GUID Index Data
Key Points
While GUID values can be relatively efficient in indexes, operations on those indexes can lead to fragmentation problems and inefficiency.
GUID Index Data

GUID values are reasonably efficient within indexes. There is a common misconception that they are large. They are 16 bytes long and can be compared in a binary fashion. This means that they pack quite tightly into indexes and can be compared and sorted quite quickly. Because GUID values are random in nature, significant problems arise when they are used in indexes, if those indexes need to process a large number of insert operations. Fragmentation problems are commonplace with indexes created on GUID data types and these problems are a very common cause of performance problems in SQL Server databases. In the next module, you will see how the use of GUID data types within indexes affects the performance of operations on the indexes.
5-16
BIT Index Data
Key Points
BIT columns are highly efficient in indexes. There is a common misconception that they are not useful but many valid scenarios exist for the use of BIT data type within indexes.
BIT Index Data

There is a very common misconception that bit columns are not useful in indexes. This stems from the fact that there are only two values. However, the number of values is not the issue. As discussed earlier in the module, the selectivity of queries is the most important issue.
For example, consider a transaction table that contains 100 million rows, where one of the columns (IsFinalized) indicates whether or not a transaction has been completed. There might only be 500 transactions that are not completed. An index that uses the IsFinalized column would be very useful for finding the unfinalized transactions. It would be highly selective.
Note that the same index would be entirely useless for locating the finalized transactions. This difference is a good indication that the column is an ideal candidate for the creation of a filtered index. (Filtered indexes are discussed later.)
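As a preview, a filtered index for the unfinalized transactions might look like the following sketch. The table, column and index names other than IsFinalized are hypothetical, and the choice of TransactionDate as the index key is an assumption made only for illustration.

```sql
-- Hypothetical table; all names except IsFinalized are illustrative.
CREATE TABLE dbo.TransactionDemo
(
    TransactionID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    TransactionDate datetime NOT NULL,
    IsFinalized     bit NOT NULL
);
GO
-- Only rows with IsFinalized = 0 are stored in the index, so it stays
-- small no matter how many finalized rows the table holds.
CREATE NONCLUSTERED INDEX IX_TransactionDemo_Unfinalized
ON dbo.TransactionDemo (TransactionDate)
WHERE IsFinalized = 0;
GO
```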
5-17
Indexing Computed Columns
Key Points
Indexing a computed column can be highly efficient. It can also assist with improving the performance of poorly designed databases.
Indexing Computed Columns

You can only create indexes on computed columns when certain conditions are met:
• The expressions must be deterministic and precise
• The ANSI_NULLS connection SET option must be ON
• Expressions that return text, ntext or image aren't permitted in the definition of the computed column
• The NUMERIC_ROUNDABORT connection SET option needs to be OFF
Note that SQL Server's query optimizer may ignore the index on a computed column, even if the requirements shown are met.
The requirement for determinism and precision means that for a given set of input values, the same output values would always be returned. For example, the function SYSDATETIME() returns the current date and time whenever it is called. Its output would not be considered deterministic.
You may want to create an index on a computed column when the results are queried or reported often. For example, a retail store may want to report on sales by day of the week (Sunday, Monday, Tuesday, etc.). You can create a computed column that determines the day of the week based on the date of the sale and then index that computed column.
SQL Server 2005 introduced the ability to persist computed columns. Rather than calculating the value every time a SELECT operation is performed, the value can be calculated and stored whenever an INSERT or UPDATE occurs. This is useful for data that is not updated frequently but is selected frequently. Indexes can be created on the persisted computed column.
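The day-of-the-week example can be sketched as follows. The table and column names are hypothetical. The expression counts days since 1900-01-01 (a Monday) rather than calling DATENAME or DATEPART(weekday, ...), because those depend on session language and DATEFIRST settings and so are not deterministic; the final query lets you confirm how SQL Server classifies the column instead of taking that on trust.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.SaleDemo
(
    SaleID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SaleDate datetime NOT NULL,
    -- Days since 1900-01-01 (a Monday) modulo 7:
    -- 0 = Monday, 1 = Tuesday, ... 6 = Sunday.
    SaleWeekday AS (DATEDIFF(DAY, 0, SaleDate) % 7) PERSISTED
);
GO
CREATE NONCLUSTERED INDEX IX_SaleDemo_SaleWeekday
ON dbo.SaleDemo (SaleWeekday);
GO
-- Check how SQL Server classifies the computed column.
SELECT COLUMNPROPERTY(OBJECT_ID(N'dbo.SaleDemo'), 'SaleWeekday', 'IsDeterministic') AS is_deterministic,
       COLUMNPROPERTY(OBJECT_ID(N'dbo.SaleDemo'), 'SaleWeekday', 'IsPrecise') AS is_precise,
       COLUMNPROPERTY(OBJECT_ID(N'dbo.SaleDemo'), 'SaleWeekday', 'IsIndexable') AS is_indexable;
GO
```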
Physical Analogy
From our physical library analogy, a persisted computed column for a book could be imagined as a label that is placed on the book that records the number of pages in the book. Nothing about the book itself changes when the label is placed on it, but you no longer have to pick the book up and count the pages if you need to make a decision based on the number of pages in the book. An index could then be created based on the value on the label, similarly to how an index could be created on the name of the author.
Question: If a column in a database mostly holds character values but occasionally (30 rows out of 50,000 rows in the table) holds a number, how could you quickly locate a row with a specific numeric value?
Lesson 3
Single Column and Composite Indexes
The indexes discussed so far have been based on data from single columns. Indexes can also be based on the data from multiple columns. Indexes can also be constructed in ascending or descending order. This lesson investigates these concepts and the effects they have on index design along with details of how SQL Server maintains statistics on the data contained within indexes.
Objectives
After completing this lesson, you will be able to:
• Describe the differences between single column and composite indexes
• Describe the differences between ascending and descending indexes
• Explain how SQL Server keeps statistics on indexes
Single Column vs. Composite Indexes
Key Points
Indexes can be constructed on multiple columns rather than on single columns. Multi-column indexes are known as composite indexes.
Single Columns vs. Composite Indexes

Composite indexes are often more useful than single column indexes in business applications. The advantages of composite indexes are:
• Higher selectivity
• The possibility of avoiding the need to sort the output rows
In our physical library analogy, consider a query that required the location of books by a publisher within a specific release year. While a publisher index would be useful for finding all the books released by the publisher, it would not help to narrow down the search to those books within the release year. Separate indexes on publisher and on release year would be far less useful, as each narrows only one part of the search, but an index that contained both publisher and release year could be very selective.
Similarly, consider locating the books on a particular topic by a specified author. An index by topic alone would be of limited value: once the correct topic was located, all the books on that topic would have to be searched to determine whether they were by the specified author. The best option would be an author index that also included details of each book's topic. In that case, a scan of the index pages for the author would be all that is required to work out which books need to be accessed.
In the absence of any other design criteria, you should typically index the most selective column first when constructing composite indexes.
Question: Why might an index on customer then order date be more or less effective than an index on order date then customer?
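The publisher and release year example could be expressed as a composite index along these lines. Everything here is a hypothetical sketch: the table, its columns, the index name and the sample publisher value are invented for illustration.

```sql
-- Hypothetical table modelled on the library analogy.
CREATE TABLE dbo.BookDemo
(
    BookID      int NOT NULL PRIMARY KEY,
    Publisher   nvarchar(100) NOT NULL,
    ReleaseYear smallint NOT NULL,
    Title       nvarchar(200) NOT NULL
);
GO
-- Composite index: entries are ordered by Publisher, then by
-- ReleaseYear within each publisher.
CREATE NONCLUSTERED INDEX IX_BookDemo_Publisher_ReleaseYear
ON dbo.BookDemo (Publisher, ReleaseYear);
GO
-- Both predicates can be resolved from the one index.
SELECT Title
FROM dbo.BookDemo
WHERE Publisher = N'Contoso Press'
  AND ReleaseYear = 2009;
GO
```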
Ascending vs. Descending Indexes
Key Points
Each component of an index can be created in an ascending or descending order. For single column indexes, ascending and descending indexes are equally useful. For composite indexes, specifying the order of individual columns within the index might be useful.
Ascending vs. Descending Indexes

In general, it makes no difference whether a single column index is ascending or descending. From our physical library analogy, you could scan either the bookshelves or the indexes from either end. The same amount of effort would be required no matter which end you started from.
Composite indexes can benefit from a different order in each component. Often this is used to avoid sorts. For example, you might need to output orders by date descending within customer ascending. From our physical library analogy, imagine that you needed a list of each author's books ordered by release date, newest first, with the authors in alphabetical order. This would be easier if the author index was already structured this way.
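The orders example, date descending within customer ascending, might be supported by an index such as the one in this sketch. The table and index names are hypothetical.

```sql
-- Hypothetical orders table.
CREATE TABLE dbo.OrderDemo
(
    OrderID    int NOT NULL PRIMARY KEY,
    CustomerID int NOT NULL,
    OrderDate  datetime NOT NULL
);
GO
-- Each key column carries its own sort direction.
CREATE NONCLUSTERED INDEX IX_OrderDemo_Customer_DateDesc
ON dbo.OrderDemo (CustomerID ASC, OrderDate DESC);
GO
-- The ORDER BY matches the index order, so the rows can be read
-- from the index without a separate sort step.
SELECT CustomerID, OrderDate, OrderID
FROM dbo.OrderDemo
ORDER BY CustomerID ASC, OrderDate DESC;
GO
```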
Index Statistics
Key Points
SQL Server keeps statistics on indexes to assist when making decisions about how to access the data in a table.
Index Statistics
Earlier in the module, you saw that SQL Server needs to make decisions about how to access the data in a table. For each table that is referenced in a query, SQL Server might decide to read the data pages or it might decide to use an index. It is important to realize though, that SQL Server must make this decision before it begins to execute a query. This means that it needs to have information that will assist it in making this determination. For each index, SQL Server keeps statistics that tell it how the data is distributed.
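The statistics that exist for a table, and when each was last updated, can be listed from the catalog. The following sketch assumes the MarketDev lab database and the Marketing.Prospect table; any table name could be substituted.

```sql
-- Assumes the MarketDev lab database used in Lab 5.
USE MarketDev;
GO
SELECT s.name AS statistics_name,
       s.auto_created,   -- 1 if SQL Server created the statistics itself
       s.user_created,   -- 1 if created with CREATE STATISTICS
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'Marketing.Prospect');
GO
```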
Physical Analogy
When discussing the physical library analogy earlier, it was mentioned that if you were looking up the books for an author, using an index that is ordered by author could be useful. However, if you were locating books for a range of authors, there would be a point at which scanning the entire library would be quicker than running backwards and forwards from the index to the shelves of books. The key issue here is that you need to know, before executing the query, how selective (and therefore useful) the indexes would be. The statistics held on indexes provide this knowledge.
Question: Before starting to perform your lookup in a physical library, how would you know which way was quicker?
Demonstration 3A: Viewing Index Statistics
Key Points
In this demonstration you will see how to work with index statistics.
Demonstration Steps
1. If Demonstration 1A was not performed:
• Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
• In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln and click Open.
• Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Why would you not always choose to use FULLSCAN for statistics?
Lab 5: Planning for SQL Server Indexing
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
• Right-click 623XB-MIA-DC and click Start.
• Right-click 623XB-MIA-DC and click Connect.
• In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
• Right-click 623XB-MIA-SQL and click Start.
• Right-click 623XB-MIA-SQL and click Connect.
• In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
• On the Action menu, click the Ctrl-Alt-Delete menu item.
• Click Switch User, and then click Other User.
• Log on using the following credentials:
i. User name: AdventureWorks\Administrator
ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
You have been asked to explain the concept of index statistics and selectivity to a new developer. You will explore the statistics available on an existing index and determine how selective some sample queries would be. One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. Depending upon how much time you have available, you need to determine the best column orders for indexes to support each query. Complete as many as possible within the allocated time. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.
Supporting Documentation
Query 1:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE ProspectID = 12553;
Query 2:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
Query 3:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%' ORDER BY LastName, FirstName;
Query 4:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName >= 'S' ORDER BY LastName, FirstName;
Query 5:
SELECT LanguageID, COUNT(1) FROM Marketing.ProductDescription GROUP BY LanguageID;
Exercise 1: Explore existing index statistics

Scenario
You have been asked to explain the concept of index statistics and selectivity to a new developer. You will explore the statistics available on an existing index and determine how selective some sample queries would be.
The main tasks for this exercise are as follows:
1. Execute the following command in the MarketDev database:
EXEC sp_helpstats Marketing.Product
2. Review the results. Have any autostats been generated?
3. Create manual statistics on the Color column. Call the statistics Product_Color_Stats. Use a full scan of the data when creating the statistics.
4. Re-execute the command from task 1 to see the change.
5. Using the DBCC SHOW_STATISTICS command, review the created Product_Color_Stats statistics.
6. Answer the following questions related to the Product_Color_Stats statistics:
a. How many rows were sampled?
b. How many steps were created?
c. What was the average key length?
d. How many Black products are there?
7. Execute the following command to check how accurate the statistics that have been generated are:
SELECT COUNT(1) FROM Marketing.Product WHERE Color = 'Black';
8. Calculate the selectivity of each of the three queries shown:
a) SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'A%';
b) SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
c) SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
Task 1: Execute SQL Command

Execute the following command in the MarketDev database:
EXEC sp_helpstats Marketing.Product
Task 2: Review the results

Review the results. Check whether any autostats have been generated.
Task 3: Create statistics

Create manual statistics on the Color column. Call the statistics Product_Color_Stats. Use a full scan of the data when creating the statistics.
Task 4: Re-execute the SQL command from task 1

Re-execute the following command in the MarketDev database:
EXEC sp_helpstats Marketing.Product
Task 5: Use DBCC SHOW_STATISTICS

Using the DBCC SHOW_STATISTICS command, review the created Product_Color_Stats statistics.
Task 6: Answer questions

Answer the following questions related to the Product_Color_Stats statistics:
a. How many rows were sampled?
b. How many steps were created?
c. What was the average key length?
d. How many Black products are there?
Task 7: Execute SQL Command and check accuracy of statistics

Execute the following command to check how accurate the statistics that have been generated are:
SELECT COUNT(1) FROM Marketing.Product WHERE Color = 'Black';
Task 8: Calculate Selectivity of each query

Calculate the selectivity of each of the three queries shown:
Query 1:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'A%';
Query 2:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
Query 3:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
Results: After this exercise, you will have assessed the selectivity of each of the queries.
Challenge Exercise 2: Design column orders for indexes (Only if time permits)
Scenario
One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. You need to determine the best column orders for indexes to support each query. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.
The main tasks for this exercise are as follows:
1. Determine which columns should be part of an index for Query 1 and the best order for the columns to support the query.
2. Determine which columns should be part of an index for Query 2 and the best order for the columns to support the query.
3. Determine which columns should be part of an index for Query 3 and the best order for the columns to support the query.
4. Determine which columns should be part of an index for Query 4 and the best order for the columns to support the query.
5. Determine which columns should be part of an index for Query 5 and the best order for the columns to support the query.
Task 1: Design an index

Review the supporting documentation, determine which columns should be part of an index for Query 1 and the best order for the columns to support the query.




Review the supporting documentation, determine which columns should be part of an index for Query 5 and the best order for the columns to support the query.
Results: After this exercise, you should have designed new indexes that take selectivity into consideration.
Module Review and Takeaways
Review Questions
1. Do tables need indexes?
2. Why do constraints use indexes?
Best Practices
1. Design indexes to maximize selectivity, which leads to lower I/O.
2. In the absence of other requirements, aim to have the most selective columns first in composite indexes.
Implementing Table Structures in SQL Server 2008 R2
Module 6
Contents:
Lesson 1: SQL Server Table Structures
Lesson 2: Working with Clustered Indexes
Lesson 3: Designing Effective Clustered Indexes
Lab 6: Implementing Table Structures in SQL Server
Module Overview
One of the most important decisions to be made when designing tables in SQL Server databases relates to the structure of the table. Regardless of whether or not other indexes are used to locate rows, the table itself can be structured like an index or left without such a structure. In this module, you will learn how to choose an appropriate table structure. For situations where you decide to have a specific structure in place, you will learn how to create an effective structure.
Objectives
After completing this module, you will be able to:
Explain how tables can be structured in SQL Server databases.
Work with clustered indexes.
Design effective clustered indexes.
Lesson 1
SQL Server Table Structures
There are two ways that SQL Server tables can be structured. Rows can be added in any order or rows can be ordered. In this lesson, you will investigate both options and gain an understanding of how common data modification operations are impacted by each option. Finally, you will see how unique clustered indexes are structured differently to non-unique clustered indexes.
Objectives
After completing this lesson, you will be able to:
Describe how tables can be organized as heaps.
Explain how common operations are performed on heaps.
Detail the issues that can arise with forwarding pointers.
Describe how tables can be organized with clustered indexes.
Explain how common operations are performed on tables with clustered indexes.
Describe how unique clustered indexes are structured differently to non-unique clustered indexes.
What is a Heap?
Key Points
A heap is a table that has no enforced order for either the pages within the table or for the data rows within each page.
Heaps
The simplest table structure available in SQL Server is a heap. Data rows are added to the first available location within the table's pages that have sufficient space. If no space is available, additional pages are added to the table and the rows placed in those pages. Even though no index structure exists for a heap, SQL Server tracks the available pages using an entry in an internal structure called an Index Allocation Map (IAM). Heaps are allocated index id zero in this map.
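As a minimal sketch of how the index id of zero can be observed, the query below reads the sys.indexes catalog view. The table name dbo.SomeHeap is hypothetical and used only for illustration; a table with no clustered index should return a row with index_id = 0 and a type_desc of HEAP.

```sql
-- dbo.SomeHeap is a hypothetical table name used only for illustration.
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.index_id,      -- 0 for a heap
       i.type_desc      -- 'HEAP' when the table has no clustered index
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.SomeHeap');
```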
Physical Analogy
In the physical library analogy, a heap would be represented by structuring your library so that every book is just placed in any available space found that is large enough. Without any other assistance, finding a book would involve scanning one bookcase after another.
Question: Why might modifying a row cause it to need to move between pages?
Operations on Heaps
Key Points
The most common operations performed on tables are INSERT, UPDATE, DELETE and SELECT operations. It is important to understand how each of these operations is affected by structuring a table as a heap.
Physical Analogy
In the library analogy, an INSERT would be executed by locating any gap large enough to hold the book and placing it there. If no space that is large enough is available, a new bookcase would be allocated and the book placed into it. This would continue unless a limit existed on the number of bookcases that the library could contain.
A DELETE operation could be imagined as scanning the bookcases until the book is found, removing the book and throwing it away. More precisely, it would be like placing a tag on the book to say that it is to be thrown out the next time the library is cleaned up or space on the bookcase is needed.
An UPDATE operation would be represented by replacing a book with a (potentially) different copy of the same book. If the replacement book was the same (or smaller) size as the original book, it could be placed directly back in the same location as the original book. However, if the replacement book was larger, the original book would be removed and the replacement placed into another location. The new location for the book could be in the same bookcase or in another bookcase.
Question: What would be involved in finding a book in a library structured as a heap? (This would simulate a SELECT operation.)
Forwarding Pointers
Key Points
When other indexes point to rows in a heap, data modification operations cause forwarding pointers to be inserted into the heap. This can cause performance issues over time.
Physical Analogy
Now imagine that the physical library was organized as a heap where books were stored in no particular order. Further imagine that three additional indexes were created in the library, to make it easier to find books by author, ISBN, and release date. As there was no order to the books on the bookcases, when an entry was found in the ISBN index, the entry would refer to the physical location of the book. The entry would include an address like "Bookcase 12 - Shelf 5 - Book 3". That is, there would need to be a specific address for a book.
An update to the book that caused it to need to be moved to a different location would be problematic. One option for resolving this would be to locate all index entries for the book and update them with the new physical location. An alternate option would be to leave a note in the location where the book used to be that points to where the book has been moved to. This is what a forwarding pointer is in SQL Server. This allows rows to be updated and moved without the need to update other indexes that point to them.
A further challenge arises if the book needs to be moved again. There are two ways that this could be handled. Either yet another note could be left pointing to the new location or the original note could be modified to point to the new location. Either way, the original indexes would not need to be updated. SQL Server deals with this by updating the original forwarding pointer. This way, performance does not continue to degrade by having to follow a chain of forwarding pointers.
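The number of forwarded records in a heap can be inspected with the sys.dm_db_index_physical_stats dynamic management function, as in the sketch below. The table name dbo.SomeHeap is hypothetical. The forwarded_record_count column is only populated for heaps and only in the SAMPLED or DETAILED scan modes.

```sql
-- dbo.SomeHeap is a hypothetical heap; index id 0 selects the heap itself.
SELECT ps.index_type_desc,
       ps.page_count,
       ps.record_count,
       ps.forwarded_record_count   -- NULL in LIMITED mode and for non-heap structures
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.SomeHeap'), 0, NULL, 'DETAILED') AS ps;
```

A value that keeps growing relative to record_count suggests that scans of the heap are paying for additional page reads to follow the pointers.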
ALTER TABLE REBUILD

Forwarding pointers were a common performance problem with SQL Server tables that were structured as heaps. There were no straightforward options for "cleaning up" a heap to remove the forwarding pointers. While options existed for removing forwarding pointers, each had significant downsides. SQL Server 2008 introduced a method for dealing with this problem via the command:
ALTER TABLE SomeTable REBUILD;
Note that while options to rebuild indexes have been available in prior versions, the option to rebuild a table was not available. This command can also be used to change the compression settings for a table. (Page and row compression are an advanced topic beyond the scope of this course.)
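As a brief illustration of the compression use mentioned above, the rebuild accepts options in a WITH clause that follows the REBUILD keyword. The table name dbo.SomeTable is hypothetical, and in SQL Server 2008 R2 data compression is only available in the premium editions, so treat this as a sketch rather than something that will run on every installation.

```sql
-- dbo.SomeTable is a hypothetical table name.
-- Rebuilds the table (removing forwarding pointers if it is a heap)
-- and applies page compression as part of the same operation.
ALTER TABLE dbo.SomeTable
REBUILD WITH (DATA_COMPRESSION = PAGE);
```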
What is a Clustered Index?
Key Points
Rather than storing the data rows of a table as a heap, tables can be designed with an internal logical ordering. This is known as a clustered index.
Clustered Index
A table with a clustered index has a predefined order for rows within a page and for pages within the table. The order is based on a key made up of one or more columns. The key is commonly called a clustering key. Because the rows of a table can only be in a single order, there can be only a single clustered index on a table. An Index Allocation Map entry is used to point to a clustered index. Clustered indexes are always index id = 1.
There is a common misconception that pages in a clustered index are "physically stored in order". While this is possible in rare situations, it is not commonly the case. If it were true, fragmentation of clustered indexes would not exist. SQL Server tries to align physical and logical order while creating an index but disorder can arise as data is modified.
Index and data pages are linked within a logical hierarchy and also double-linked across all pages at the same level of the hierarchy to assist when scanning across an index.
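The gap between logical and physical order can be measured rather than assumed. The sketch below, which uses a hypothetical table name dbo.SomeClusteredTable, asks sys.dm_db_index_physical_stats for the logical fragmentation of the clustered index (index id 1), that is, the percentage of leaf pages that are out of order.

```sql
-- dbo.SomeClusteredTable is a hypothetical table with a clustered index.
SELECT ps.index_id,
       ps.index_level,
       ps.page_count,
       ps.avg_fragmentation_in_percent   -- percentage of out-of-order leaf pages
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.SomeClusteredTable'), 1, NULL, 'LIMITED') AS ps;
```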
Physical Analogy
In the library analogy, a clustered index is similar to storing all books in a specific order. An example of this would be to store books in ISBN (International Standard Book Number) order. Clearly, the library can only be in a single order.
Operations on Clustered Indexes
Key Points
Earlier you saw how common operations were carried out on tables structured as heaps. It is important to understand how each of those operations is affected by structuring a table with a clustered index.
Physical Analogy
In a library that is ordered in ISBN order, an INSERT operation requires a new book to be placed in exactly the correct logical ISBN order. If there is space somewhere on the bookcase that is in the required position, the book can be placed into the correct location and all other books in the bookcase moved to accommodate the new book. If there is not sufficient space, the bookcase needs to be split. Note that a new bookcase would be physically placed at the end of the library but would be logically inserted into the list of bookcases. INSERT operations would be straightforward if the books were being added in ISBN order. New books could always be added to the end of the library and new bookcases added as required. In this case, no splitting is required.
When an UPDATE operation is performed, if the replacement book is the same size or smaller and the ISBN has not changed, the book can just be replaced in the same place. If the replacement book is larger and the ISBN has not changed, and there is spare space within the bookcase, all other books in the bookcase can be slid along to allow the larger book to be replaced in the same spot. If there was insufficient space in the bookcase to accommodate the larger book, the bookcase would need to be split. If the ISBN of the replacement book was different to the original book, the original book would need to be removed and the replacement book treated like the insertion of a new book.
A DELETE operation would involve the book being removed from the bookcase. (Again, more formally, it would be flagged as free space but simply left in place for later removal.)
When a SELECT is performed, if the ISBN is known, the required book can be quickly located by efficiently searching the library. If a range of ISBNs was requested, the books would be located by finding the first book and continuing to collect books in order until a book is encountered that is out of range or until the end of the library is reached.
Question: What sort of queries would now perform better in this library?
Unique vs. Non-Unique Clustered Indexes
Key Points
SQL Server must be able to uniquely identify any row in a table. Clustered indexes can be created as unique or non-unique.
Unique vs. Non-Unique Clustered Indexes

If you do not declare a clustered index as unique, SQL Server will add another value to the clustering key to ensure that the values are unique for each row. This value is commonly called a "uniqueifier".
Physical Analogy
In the library analogy, a unique index is like a rule that says that no more than a single copy of any book can ever be stored. If an insert of a new book is attempted and another book is found to have the same ISBN (assuming that the ISBN was the clustering key), the insertion of the new book would be refused. It is important to understand that the comparison is made only on the clustering key. The book would be rejected for having the same ISBN, even if other properties of the book are different. A non-unique clustered index is similar to having a rule that allows more than a single book with the same ISBN. The issue is that it is likely to be desirable to track each copy of the book separately. The uniqueifier that is added by SQL Server would be like a "Copy Number" being added to books that can be duplicated. The uniqueifier is not visible to users.
Demonstration 1A: Rebuilding Heaps
Key Points
In this demonstration you will see how to:
Create a table as a heap
Check the fragmentation and forwarding pointers for a heap
Rebuild a heap
Demonstration Steps
3. 4. 5.
Lesson 2
Working with Clustered Indexes
If a decision has been made to structure a table with a clustered index, it is important to be familiar with how the indexes are created, altered or dropped. In this lesson, you will see how to perform these actions, understand how SQL Server performs them automatically in some situations and see how to incorporate free space within indexes to improve insert performance.
Objectives
After completing this lesson, you will be able to:
Create clustered indexes.
Drop a clustered index.
Alter a clustered index.
Incorporate free space in indexes.
Creating Clustered Indexes
Key Points
Clustered indexes can be created either directly using the CREATE INDEX command or automatically in some situations where a PRIMARY KEY constraint is specified on the table.
Creating Clustered Indexes

It is very important to understand the distinction between a PRIMARY KEY and a clustering key. Many users confuse the two terms or attempt to use them interchangeably. A PRIMARY KEY is a constraint. It is a logical concept that is supported by an index but the index may or may not be a clustered index. The default action in SQL Server when a PRIMARY KEY constraint is added to a table is to make it a clustered PRIMARY KEY if no other clustered index already exists on the table. This action can be overridden by specifying the word NONCLUSTERED when declaring the PRIMARY KEY constraint.
In the first example on the slide, the dbo.Article table is being declared. The ArticleID column has a PRIMARY KEY constraint associated with it. As there is no other clustered index on the table, the index that is created to support the PRIMARY KEY constraint will be created as a clustered PRIMARY KEY. ArticleID will be the clustering key as well as the PRIMARY KEY for the table.
In the second example on the slide, the table dbo.LogData is initially created as a heap. When the PRIMARY KEY constraint is added to the table, no other clustered index is present on the table, so SQL Server will create the index to support the PRIMARY KEY constraint as a clustered index.
If a table has been created as a heap, it can be converted to a clustered index structure by adding a clustered index to the table. In the fourth command shown in the examples on the slide, a clustered index named CL_LogTime is added to the dbo.LogTime table with the LogTimeID column as the clustering key. This command will not only create an index over the data; it causes the entire structure of the table to be reorganized.
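The slide examples are not reproduced in this text, so the following is only a sketch of what statements of this kind might look like. The table and index names dbo.Article, dbo.LogData, dbo.LogTime, and CL_LogTime and the key columns ArticleID and LogTimeID come from the discussion above; every other column name, the LogID key, the constraint name PK_LogData, and the dbo.ArticleArchive table are assumptions made for illustration.

```sql
-- 1. PRIMARY KEY declared inline: becomes a clustered index by default.
CREATE TABLE dbo.Article
( ArticleID   int IDENTITY(1,1) PRIMARY KEY,
  ArticleName nvarchar(50) NOT NULL      -- hypothetical column
);

-- 2. Table created as a heap, then given a PRIMARY KEY.
--    No clustered index exists yet, so the supporting index is clustered.
CREATE TABLE dbo.LogData
( LogID    int IDENTITY(1,1) NOT NULL,   -- hypothetical column
  LogEntry nvarchar(200) NOT NULL        -- hypothetical column
);
ALTER TABLE dbo.LogData
  ADD CONSTRAINT PK_LogData PRIMARY KEY (LogID);

-- 3. Heap converted to a clustered structure with CREATE INDEX.
CREATE TABLE dbo.LogTime
( LogTimeID int NOT NULL,
  LoggedAt  datetime2 NOT NULL           -- hypothetical column
);
CREATE CLUSTERED INDEX CL_LogTime ON dbo.LogTime (LogTimeID);

-- 4. Overriding the default: a PRIMARY KEY that leaves the table as a heap.
CREATE TABLE dbo.ArticleArchive          -- hypothetical table
( ArticleID int NOT NULL PRIMARY KEY NONCLUSTERED );
```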
Question: What else would be added to your table if you added a non-unique clustered index to it?
Dropping a Clustered Index
Key Points
The method used to drop a clustered index depends upon the way the clustered index was created.
Dropping a Clustered Index

The DROP INDEX command can be used to drop clustered indexes that were created with the CREATE INDEX command. Indexes that are created internally to support constraints need to be removed by removing the constraint. Note in the second example on the slide that the PRIMARY KEY constraint is being dropped. This would cause a clustered index that had been created to support that key to also be dropped.
When the clustered index is dropped, the data in the table is not lost. The table is reorganized as a heap.
Question: How could you remove a primary key constraint that was being referenced by a foreign key constraint?
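The two removal paths described above can be sketched as follows. CL_LogTime and dbo.LogTime are the names used earlier in this lesson; the constraint name PK_LogData is an assumption, as the slide is not reproduced here.

```sql
-- Index created with CREATE INDEX: remove it with DROP INDEX.
DROP INDEX CL_LogTime ON dbo.LogTime;

-- Index created to support a constraint: remove the constraint instead.
-- PK_LogData is a hypothetical constraint name.
ALTER TABLE dbo.LogData DROP CONSTRAINT PK_LogData;
```

In both cases the rows remain in the table, which is left structured as a heap.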
Altering a Clustered Index
Key Points
Minor modifications to indexes are permitted through the ALTER INDEX statement but it cannot be used to modify the structure of the index, including the columns that make up the key.
Altering a Clustered Index

A few maintenance operations are possible with the ALTER INDEX statement. For example, an index can be rebuilt or reorganized. (Reorganizing an index only affects the leaf level of the index). Restructuring an index is not permitted within an ALTER INDEX statement. Columns that make up the clustering key cannot be added or removed using this command and the index cannot be moved to a different filegroup. (Filegroups are a concept that is covered in the course 6231B Maintaining a SQL Server 2008 R2 Database).
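A minimal sketch of the two maintenance operations mentioned above, using the CL_LogTime index on dbo.LogTime from earlier in this lesson:

```sql
-- Rebuild: drops and recreates the index structure.
ALTER INDEX CL_LogTime ON dbo.LogTime REBUILD;

-- Reorganize: reorders the leaf level pages in place.
ALTER INDEX CL_LogTime ON dbo.LogTime REORGANIZE;
```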
WITH DROP_EXISTING

An option to change the structure of an index is provided while creating a replacement index. The CREATE INDEX command includes a DROP_EXISTING option, specified as WITH (DROP_EXISTING = ON), that allows the statement to replace an existing index of the same name. Note that an index cannot be changed from a clustered to a non-clustered index or back using this command. (Non-clustered indexes are covered in module 08.)
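As a sketch of replacing an index definition in a single statement, the example below redefines the CL_LogTime index with a second key column. The LoggedAt column is hypothetical; the index name must match the existing index being replaced.

```sql
-- Replaces the existing CL_LogTime index with a new two-column definition.
-- LoggedAt is a hypothetical column used only for illustration.
CREATE CLUSTERED INDEX CL_LogTime
ON dbo.LogTime (LogTimeID, LoggedAt)
WITH (DROP_EXISTING = ON);
```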
Disabling Indexes
While the ALTER INDEX statement includes a DISABLE option that can be applied to any index, this option is of limited use with clustered indexes. Once a clustered index is disabled, no access to the data in the table is then permitted until it is rebuilt.
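The effect can be sketched with the CL_LogTime index used earlier in this lesson. While the clustered index is disabled, queries against the table fail, and a rebuild is needed to bring it back.

```sql
-- Disable the clustered index: the table's data becomes inaccessible.
ALTER INDEX CL_LogTime ON dbo.LogTime DISABLE;

-- A query such as SELECT * FROM dbo.LogTime; would now return an error.

-- Rebuild the index to make the table usable again.
ALTER INDEX CL_LogTime ON dbo.LogTime REBUILD;
```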
Incorporating Free Space in Indexes
Key Points
The FILLFACTOR and PAD_INDEX options are used to provide free space within index pages. This can improve INSERT and UPDATE performance in some situations but often to the detriment of SELECT operations.
FILLFACTOR and PAD_INDEX

The availability of free space in an index page can have a significant effect on the performance of index update operations. If an index record must be inserted and there is no free space, a new index page must be created and the contents of the old page split across the two pages. This can affect performance if it happens too frequently.
The performance impacts of page splits can be alleviated by leaving empty space on each page when creating an index, including a clustered index. This is achieved by specifying a FILLFACTOR value. FILLFACTOR defaults to 0, which means "fill 100%". Any other value (including 100) is taken as the percentage of how full each page should be. For the example in the slide, this means 70% full and 30% free space on each page.
FILLFACTOR only applies to leaf level pages in an index. PAD_INDEX is an option that, when enabled, causes the same free space to be allocated in the non-leaf levels of the index.
Question: While you could avoid many page splits by setting a FILLFACTOR of 50, what would be the downside of doing this?
Question: When would a FILLFACTOR of 100 be useful?
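The slide example is not reproduced in this text; the sketch below shows one way a fill factor of 70 could be applied, here by rebuilding the CL_LogTime index used earlier in this lesson, and then confirmed from the sys.indexes catalog view.

```sql
-- Leave 30% free space in leaf pages, and the same in non-leaf pages.
ALTER INDEX CL_LogTime ON dbo.LogTime
REBUILD WITH (FILLFACTOR = 70, PAD_INDEX = ON);

-- Confirm the stored settings for the index.
SELECT i.name, i.fill_factor, i.is_padded
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.LogTime')
  AND i.name = N'CL_LogTime';
```

Note that the free space is only established when the index is created or rebuilt; it is not maintained as rows are subsequently inserted.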
Demonstration 2A: Clustered Indexes
Key Points
In this demonstration you will see how to:
Create a table with a clustered index
Detect fragmentation in a clustered index
Correct fragmentation in a clustered index
Demonstration Steps
2. 3.
Question: Why was the performance of the UPDATE statement against this table much faster than the one against the heap?
Lesson 3
Designing Effective Clustered Indexes
When creating clustered indexes on tables, it is important to understand the characteristics of good clustering keys. Some data types work better for clustering keys than others. In this lesson, you will see how to design good clustering keys and also see how clustered indexes can be created on views.
Objectives
After completing this lesson, you will be able to:
Describe characteristics of good clustering keys.
Explain which data types are most appropriate for use in clustering keys.
Create indexed views.
Explain considerations that must be made when working with indexed views.
Characteristics of Good Clustering Keys
Key Points
Many different types of data can be used for clustering a table. While not every situation is identical, there is a set of characteristics that generally create the best clustering keys. Keys should be short, static, increasing and unique.
Characteristics of Good Clustering Keys

Although some designs might call for different styles of clustering key, most designs call for clustering keys with the following characteristics:
Short: clustering keys should be short. They need to be sorted and they are stored at the leaf level of every other index. While there is a limit of 16 columns and 900 bytes, good clustering keys are typically much, much smaller than this.
Static: clustering keys should be based on data values that do not change. This is one reason why primary keys are often used for this purpose. A change to the clustering key will mean the need to move the row. You have seen already that moving rows is generally not desirable.
Increasing: increasing keys assist with INSERT behavior. If the keys within the data are increasing as they are inserted, then the inserts happen directly at the logical end of the table. This minimizes fragmentation and the need to split pages, and reduces the amount of memory needed for page buffers.
Unique: unique clustering keys do not need to have a uniqueifier column added by SQL Server. It is important to declare unique values as being unique. Otherwise, SQL Server will still add a uniqueifier column to the key.
Appropriate Data Types for Clustering Keys
Key Points
Similar to the way that some data types are generally better as components of indexes than other data types, some data types are more appropriate for use as clustering keys than others.
Appropriate Data Types for Clustering Keys

int and bigint typically make the best clustering keys in general use, particularly if they are used in conjunction with an IDENTITY constraint that causes their values to continue to increase. (Constraints are discussed in a later module.)
The biggest challenge in current designs is the use (and overuse) of GUIDs that are stored in uniqueidentifier columns. As well as being larger than the integer types, GUIDs are random in nature and routinely cause index fragmentation through page splits when used as clustering keys.
Character data types can be used for clustering keys but the sorting performance of character data types is limited. Character values also tend to change in typical business applications.
Date data is typically not unique but provides excellent advantages in size and sorting performance. It works well for date range queries that are common in typical business applications.
Logical vs. Physical Schema

Users typically struggle with the concept that their physical data schema does not have to match their logical data schema. For example, while GUIDs might be used throughout an application layer, they do not have to be used throughout the physical implementation of the schema. One option would be to use one table to look up an int based on a GUID and have that int used everywhere else in the design.
Question: New uniqueidentifier values in SQL Server can be generated with the NEWID() function. SQL Server 2005 introduced the NEWSEQUENTIALID() function to try to address the issue of increasing values. Why doesn't this typically solve the problem of random values?
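The lookup-table option described above could be sketched as follows. All object and column names here are hypothetical. The table clusters on a short, increasing int key and keeps the GUID only in a unique non-clustered index, so the rest of the schema can reference the int.

```sql
-- Hypothetical mapping table: the GUID is exposed to the application,
-- while the int key is used everywhere else in the physical design.
CREATE TABLE dbo.CustomerKeyMap
( CustomerID   int IDENTITY(1,1) NOT NULL
      CONSTRAINT PK_CustomerKeyMap PRIMARY KEY CLUSTERED,
  CustomerGuid uniqueidentifier NOT NULL
      CONSTRAINT UQ_CustomerKeyMap_CustomerGuid UNIQUE NONCLUSTERED
);

-- Register a GUID supplied by the application layer.
DECLARE @CustomerGuid uniqueidentifier = NEWID();
INSERT INTO dbo.CustomerKeyMap (CustomerGuid) VALUES (@CustomerGuid);

-- Translate the GUID to the int key used by the other tables.
SELECT CustomerID
FROM dbo.CustomerKeyMap
WHERE CustomerGuid = @CustomerGuid;
```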
Creating Indexed Views
Key Points
Clustered indexes can be created over views. A view with a clustered index is called an "indexed view". Indexed views are the closest SQL Server equivalent to "materialized views" in other databases. Indexed views can have a profound (positive) impact on the performance of queries in particular circumstances.
Creating Indexed Views

The concept of an indexed view might at first seem odd as an index is being created over an object that is not persisted. Indexed views are very useful for maintaining precalculated aggregates or joins. When updates to the underlying data are made, SQL Server makes updates to the data stored in the indexed view automatically.
You can imagine an indexed view as a special type of table with a clustered index. The differences are that the schema of the table isn't defined directly; it is defined by the SELECT statement in the view. Also, you don't modify the table directly; you modify the data in the "real" tables that underpin the view. When the data in the underlying tables is modified, SQL Server realizes that it needs to update the data in the indexed view.
Indexed views have a negative impact on the performance of INSERT, DELETE, and UPDATE operations on the underlying tables but they can also have a very positive impact on the performance of SELECT queries on the view. They are most useful for data that is regularly selected but much less frequently updated.
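A minimal sketch of a precalculated aggregate maintained as an indexed view is shown below. The table, view, and index names are all hypothetical. The view is created WITH SCHEMABINDING, uses two-part object names, and includes COUNT_BIG(*) because it has a GROUP BY clause; the first index created on a view must be a unique clustered index.

```sql
-- Hypothetical base table; Quantity is NOT NULL because SUM over a
-- nullable expression is not permitted in an indexed view.
CREATE TABLE dbo.OrderLine
( OrderLineID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
  ProductID   int NOT NULL,
  Quantity    int NOT NULL
);
GO

-- CREATE VIEW must be the first statement in its batch.
CREATE VIEW dbo.ProductQuantitySummary
WITH SCHEMABINDING
AS
SELECT ProductID,
       SUM(Quantity) AS TotalQuantity,
       COUNT_BIG(*)  AS LineCount
FROM dbo.OrderLine
GROUP BY ProductID;
GO

-- Creating this index is what persists the view's result set.
CREATE UNIQUE CLUSTERED INDEX CL_ProductQuantitySummary
ON dbo.ProductQuantitySummary (ProductID);
GO
```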
Indexed View Considerations
Key Points
The use of indexed views is governed by a set of considerations that must be met for the views to be utilized. Premium editions of SQL Server take more complete advantage of indexed views.
Indexed View Considerations

Indexed views can be a challenge to set up and use. Books Online details a list of SET options that need to be in place both at creation time for the indexed view and in sessions that take advantage of the indexed views. Particular attention should be given to the CONCAT_NULL_YIELDS_NULL and QUOTED_IDENTIFIER settings. Indexes can only be built on views that are deterministic. That is, the views must always return the same data unless the underlying table data is altered. For example, an indexed view could not contain a column that returned the outcome of the SYSDATETIME() function. SCHEMABINDING is an option that the view must have been created with before an index can be created on the view. The SCHEMABINDING option prevents changes to the schema of the underlying tables while the view exists.
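As a sketch, the session settings documented for indexed views can be set explicitly before creating the view and its index, and checked with the SESSIONPROPERTY function. Books Online remains the authoritative source for the required values.

```sql
-- Session settings documented as required for indexed views.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;

-- Check the current session values (1 = ON, 0 = OFF).
SELECT SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS ConcatNullYieldsNull,
       SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS QuotedIdentifier,
       SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS NumericRoundabort;
```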
Demonstration 3A: Indexed Views
Key Points
In this demonstration you will see how to:
Obtain details of indexes created on views
See if an indexed view has been used in an estimated execution plan
Demonstration Steps
2. 3.
Question: How could you ensure that an indexed view is selected when working with Standard Edition of SQL Server?
Lab 6: Implementing Table Structures in SQL Server
Lab Setup
5. 6. 7.
8. 9.
Lab Scenario
One of the most important decisions when designing a table is to choose an appropriate table structure. In this lab, you will choose an appropriate structure for some new tables required for the relationship management system.
Table 1: Relationship.ActivityLog
Name                  Data Type        Constraint
ActivityTime          datetimeoffset
SessionID             int
Duration              int
ActivityType          int

Table 2: Relationship.PhoneLog
Name                  Data Type        Constraint
PhoneLogID            int              Primary Key
SalespersonID         int
CalledPhoneNumber     nvarchar(16)
CallDurationSeconds   int
Table 3: Relationship.MediaOutlet
Name                  Data Type        Constraint
MediaOutletID         int
MediaOutletName       nvarchar(40)
PrimaryContact        nvarchar(50)
City                  nvarchar(50)
Table 4: Relationship.PrintMediaPlacement
Name                    Data Type        Constraint
PrintMediaPlacementID   int              Primary Key
MediaOutletID           int
PlacementDate           datetime
PublicationDate         datetime
RelatedProductID        int
PlacementCost           decimal(18,2)

Table 5:
Name                    Data Type          Constraint
ApplicationID           int                IDENTITY(1,1)
ApplicantName           nvarchar(150)
EmailAddress            nvarchar(100)
ReferenceID             uniqueidentifier
Comments                nvarchar(500)
Exercise 1: Creating Tables as Heaps

Scenario
You need to create some new tables to support the relationship management system. You will create two tables that are structured as heaps. The main tasks for this exercise are as follows:
1. Review the Requirements.
2. Create the Tables in the MarketDev database.
Task 1: Review the Requirements

Review the requirements in the supporting documentation for Tables 1 and 2.
Task 2: Create the Tables in the MarketDev database

Create a table based on the supporting documentation for Table 1.
Create a table based on the supporting documentation for Table 2.
Results: After this exercise, you have created two tables that are structured as heaps.
Exercise 2: Creating Tables with Clustered Indexes

Scenario
The design documentation also calls for some tables with clustered indexes. You will create two tables that have clustered indexes. The main tasks for this exercise are as follows:
1. Review the Requirements.
2. Create the Tables in the MarketDev database.
Task 1: Review the Requirements

Review the requirements in the supporting documentation for Tables 3 and 4.
Task 2: Create the Tables in the MarketDev database

Create a table based on the supporting documentation for Table 3.
Create a table based on the supporting documentation for Table 4.
Results: After this exercise, you have created two tables that have clustered indexes.
Challenge Exercise 3: Comparing the Performance of Clustered Indexes vs. Heaps (Only if time permits)
Scenario
A company developer has approached you to decide whether a new table should have a clustered index or not. Insert performance of the table is critical. You will consider the design, create a number of alternatives and compare the performance of each against a set of test workloads. The main tasks for this exercise are as follows:
1. Review the Design for Table 5.
2. Create a table based on the design with no clustered index. Call the table Relationship.Table_Heap.
3. Create a table based on the design with a clustered index on the ApplicationID column. Call the table Relationship.Table_ApplicationID.
4. Create a table based on the design with a clustered index on the EmailAddress column. Call the table Relationship.Table_EmailAddress.
5. Create a table based on the design with a clustered index on the ReferenceID column. Call the table Relationship.Table_ReferenceID.
6. Load and execute the workload script. (Note: this may take some minutes to complete. You can check where it is up to by viewing the Messages tab. A message is printed as each of the four sections is completed. While the script is running, review the contents of the script and estimate the proportion of time difference you expect to see in the results.)
7. Compare the performance of each table structure.
Task 1: Review the Table Design

Review the table design in the supporting documentation for Table 5.
Task 2: Create the Relationship.Table_Heap Table

Using the supporting documentation for Table 5, create a table based on the design with no clustered index. Call the table Relationship.Table_Heap.
Task 3: Create the Relationship.Table_ApplicationID Table

Using the supporting documentation for Table 5, create a table based on the design with a clustered index on the ApplicationID column. Call the table Relationship.Table_ApplicationID.
Task 4: Create the Relationship.Table_EmailAddress Table

Using the supporting documentation for Table 5, create a table based on the design with a clustered index on the EmailAddress column. Call the table Relationship.Table_EmailAddress.
Task 5: Create the Relationship.Table_ReferenceID Table

Using the supporting documentation for Table 5, create a table based on the design with a clustered index on the ReferenceID column. Call the table Relationship.Table_ReferenceID.
Task 6: Load and Execute the Workload Script

Load and execute the workload script. (Note: this may take some minutes to complete. You can check where it is up to by viewing the Messages tab. A message is printed as each of the four sections is completed. While the script is running, review the contents of the script and estimate the proportion of time difference you expect to see in the results).
Task 7: Compare Table Performance

Compare the performance of each table structure.
Results: After this exercise, you have created four tables and compared the performance of tables with clustered indexes against a heap.
Review Questions
1. What is the main problem with uniqueidentifiers used as primary keys?
2. Where are newly inserted rows placed when a table is structured as a heap?
Best Practices
1. Unless specific circumstances arise, most tables should have a clustered index.
2. The clustered index may or may not be placed on the table's primary key.
3. When using GUID primary keys in the logical data model, consider avoiding their use throughout the physical implementation of the data model.
Reading SQL Server 2008 R2 Execution Plans
Module 7
Contents:
Lesson 1: Execution Plan Core Concepts
Lesson 2: Common Execution Plan Elements
Lesson 3: Working with Execution Plans
Lab 7: Reading SQL Server Execution Plans
Module Overview
In earlier modules, you have seen that one of the most important decisions SQL Server makes when executing a query is how to access the data in each of the tables involved in the query. SQL Server can read the underlying table (which might be structured as a heap or with a clustered index), but it might also choose to use another index. In the next module, you will see how to design additional indexes, but before learning this, it is important to know how to determine the outcomes of the decisions that SQL Server makes. Execution plans show how each step of a query is to be executed. In this module, you will learn how to read and interpret execution plans.
Objectives
After completing this module, you will be able to:
1. Explain the core concepts related to the use of execution plans
2. Describe the role of the most common execution plan elements
3. Work with execution plans
Lesson 1
Execution Plan Core Concepts
The first steps in working with SQL Server execution plans are to understand why they are so important and to understand the phases that SQL Server passes through when executing a query. Armed with that information, you can learn what an execution plan is, what the different types of execution plans are, and how execution plans relate to execution contexts. Execution plans can be retrieved in a variety of formats. It is also important to understand the differences between each of these formats and to know when to use each format.
Objectives
After completing this lesson, you will be able to:
1. Explain why execution plans matter
2. Describe the phases that SQL Server passes through while executing a query
3. Explain what execution plans are
4. Describe the difference between actual and estimated execution plans
5. Describe execution contexts
6. Make effective use of the different execution plan formats
Why Execution Plans Matter
Key Points
Rather than trying to guess how a query is to be performed or how it was performed, execution plans allow precise answers to be obtained. Execution plans are also commonly referred to as query plans.
Why Execution Plans Matter

If you spend any time reading posts in the SQL Server forums or newsgroups, or participating in any of the SQL Server related email distribution lists, you will notice questions that occur very regularly:
Why is it that my query takes such a long time to complete?
This query is so similar to another query that executes quickly, yet this query takes much longer to complete. Why is this happening?
I created an index to make access to the table fast but SQL Server is ignoring the index. Why won't it use my index?
I've created an index on every column in the table yet SQL Server still takes the same time to execute my query. Why is it ignoring the indexes?
These questions are very common, and SQL Server provides tools to help answer them. Execution plans show how SQL Server intends to execute a query or how it did execute a query. The ability to interpret these execution plans allows you to answer the questions above.
Many users capture execution plans and then try to resolve the worst performing aspects of a query. The best use of execution plans, however, is in verifying that the plan you expected to be used was the plan that was used. This means that you need to already have an idea of how you expect SQL Server to execute your queries. You will see more information on doing this in the next module.
Query Execution Phases
Key Points
SQL Server executes queries in a series of phases. A key outcome of one of the phases is an execution plan. Once compiled, a plan may be cached for later use.
T-SQL Parsing
The first phase when executing queries is to check that the statements supplied in the batch follow the rules of the language. Each statement is checked to find any syntax errors. Object names within the statements are located.
Question: What is the difference between a statement and a batch?
Object Name Resolution

In the second phase, SQL Server resolves the names of objects to their underlying object IDs. SQL Server needs to know exactly which object is being referred to. For example, consider the following statement:
SELECT * FROM Product;
While at first glance, it might seem that mapping the Product table to its underlying object ID would be easy, consider that SQL Server supports more than a single object with the same name in a database, through the use of schemas. For example, note that each of the objects in the following code could be completely different in structure and that the names relate to entirely different objects:
SELECT * FROM Production.Product;
SELECT * FROM Sales.Product;
SELECT * FROM Marketing.Product;
SQL Server needs to apply a set of rules to relate the table name "Product" to the intended object.
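As a rough illustration of name resolution, the following sketch uses the built-in OBJECT_ID and SCHEMA_NAME functions to show which object ID a name maps to. It assumes the AdventureWorks2008R2 sample database that is used in the labs. The unqualified name is resolved relative to the caller's default schema, so the last column may well return NULL in that database.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

SELECT
    SCHEMA_NAME()                    AS CallerDefaultSchema,  -- default schema of the caller
    OBJECT_ID(N'Production.Product') AS QualifiedObjectID,    -- two-part name: unambiguous
    OBJECT_ID(N'Product')            AS UnqualifiedObjectID;  -- NULL when no matching Product object is found
```

Schema-qualifying object names in queries removes this ambiguity.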
Query Optimization
Once the object IDs have been resolved, SQL Server needs to decide how to execute the overall batch. Based on the available statistics, SQL Server will make decisions about how to access the data contained in each of the tables that are part of each query.
SQL Server does not always find the best possible plan. It weighs up the cost of a plan, based on its estimate of the cost of resources required to execute the plan. The aim is to find a satisfactory plan in a reasonable period of time. The more complex a SQL batch is, the longer it could take SQL Server to evaluate all the possible plans that could be used to execute the batch. Finding the best plan might take longer than executing a less optimal plan.
There is no need to consider alternate plans for DDL (Data Definition Language) statements, such as CREATE, ALTER or DROP. Many simple queries also have trivial plans that are quickly identified.
Question: Can you think of a type of query that might lead to a trivial plan?
Query Plan Execution

Once a plan is found, the execution engine and storage engine work to execute the plan. It may or may not succeed as runtime errors could occur.
Plan Caching
If the plan is considered sufficiently useful, it may be stored in the Plan Cache. On later executions of the batch, SQL Server will attempt to reuse execution plans from the Plan Cache. This is not always possible and, for certain types of query, not always desirable.
What is an Execution Plan?
Key Points
An execution plan is a map that details either how SQL Server would execute a query or how SQL Server did execute a query. SQL Server uses a cost-based optimizer.
Execution Plans
Execution plans show the overall method that SQL Server is using to satisfy the requirements of the query. As part of the plan, SQL Server decides the types of operations to be performed and the order in which they will be performed. Many operations are related to the choice SQL Server makes about how to access data in a table and whether or not available indexes will be used. These decisions are based on the statistics that are available to SQL Server at the time.
SQL Server uses a cost-based optimizer and each element of the query plan is assigned a cost in relation to the total cost of the batch. SSMS also calculates a relationship between the costs of each statement, which is useful where a batch contains more than one statement. The costs that are either estimated or calculated as part of the plan can be interpreted within the plan.
The cost of individual elements can be compared across statements in a single batch but comparisons should not be made between the costs of elements in different batches. Costs can only be used to determine if an operation is cheaper or more expensive than another operation. Costs cannot be used to estimate execution time.
Question: What resources do you imagine the cost would be based upon?
Actual vs. Estimated Execution Plans
Key Points
SQL Server can record the plan it used for executing a query. Before it executes a query though, it needs to create an initial plan.
Actual vs. Estimated Execution Plans

It is possible to ask SQL Server to return details of the execution plan used, along with results returned from a query. These plans are known as "actual" execution plans. In SQL Server Management Studio, on the Query menu, there is an option to "Include Actual Execution Plan". Once the results from a query are returned, another output tab is created that shows the execution plan that was used.
Another option on the Query menu is to "Display Estimated Execution Plan". This asks SQL Server to calculate an execution plan for a query (or batch) based on how it would attempt to execute the query. This is calculated without actually executing the query. This type of plan is known as an "Estimated Execution Plan". Estimated execution plans are very useful when designing queries or when debugging queries that are suffering from performance problems.
Note that it is not always possible to retrieve an estimated execution plan. One common reason for this is that the batch might include statements that create objects and then access them. As the objects do not yet exist, SQL Server has no knowledge of them and cannot create a plan for processing them. You will see an example of this in the next demonstration.
When SQL Server executes a plan, it may also make choices that differ from an estimated plan. This is commonly related to the available resources (or more likely the lack of available resources) at the time the batch is executed.
Execution plans include row counts in each data path. For estimated execution plans, these are based on estimates from the available statistics. For actual execution plans, both the estimated and actual row counts are shown.
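The same two kinds of plan can also be requested from T-SQL rather than from the SSMS Query menu. The following is a minimal sketch, assuming the AdventureWorks2008R2 sample database used in the labs. SET SHOWPLAN_XML returns the estimated plan without running the query, while SET STATISTICS XML runs the query and returns the actual plan.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

-- Estimated plan: the statement is compiled but NOT executed.
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO
SELECT ProductID, Name FROM Production.Product WHERE ProductID = 1;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the statement IS executed, and the plan
-- (including actual row counts) is returned after the results.
SET STATISTICS XML ON;
GO
SELECT ProductID, Name FROM Production.Product WHERE ProductID = 1;
GO
SET STATISTICS XML OFF;
GO
```

In SSMS, clicking the XML that is returned opens the graphical rendering of the plan.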
What is an Execution Context?
Key Points
Execution plans are reentrant. This means that more than one user can be executing exactly the same execution plan at one time. Each user needs separate data related to their individual execution of the plan. This data is held in an object known as an Execution Context.
Execution Context
Execution plans detail the steps that SQL Server would take (or did take) when executing a batch of statements. When multiple users are executing the plan concurrently, there needs to be a structure that holds data related to their individual executions of the plan. Execution contexts are cached for reuse in a very similar way to the caching that occurs with execution plans. When a user executes a plan, SQL Server retrieves an execution context from cache if there is one available. To maximize performance and minimize memory requirements, execution contexts are not fully completed when they are created. Branches of the code are "fleshed out" when the code needs to move to the branch. This means that if a procedure includes a set of procedural logic statements (like the IF statement), the execution context retrieved from cache may have gone in a different logical direction and not yet have all the details required. From a caching reuse point of view, it is useful to avoid too much procedural logic in stored procedures. You should favor set-based logic instead.
Execution Plan Formats
Key Points
There are three formats for execution plans. Text based plans are now deprecated. XML based plans should be used instead. Graphical plans render XML based plans for ease of use.
Execution Plan Formats

Prior to SQL Server 2005, only text-based plans were available and many tools still use this type of plan. Text based plans can be retrieved from SQL Server by executing the statement:
SET SHOWPLAN_TEXT ON;
Text based execution plans were superseded by XML based plans in SQL Server 2005 and are now deprecated. They should not be used in new development work.
Plan Portability
SQL Server provided a graphical rendering of execution plans to make reading text based plans easier. One challenge with this, however, was that it was very difficult to send a copy of a plan to another user for review. XML plans can be saved as a .sqlplan filetype and are entirely portable. Graphical plans can be rendered from XML plans, including plans that have been received from other users.
Note that graphical plans include only a subset of the information that is available from an XML plan. While it is not easy to read XML plans directly, further information can be obtained by reading the contents of the XML plan. XML plans are also ideal for programmatic access for users creating tools and utilities, as XML is relatively easy to consume programmatically in an application.
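To give a feel for programmatic access, the sketch below reads one attribute out of cached XML plans using the xml data type's value() method. It is an illustration only. It assumes the caller has VIEW SERVER STATE permission, and the choice of attribute (the estimated subtree cost of the first statement in each plan) is arbitrary.

```sql
-- The showplan XML schema uses this namespace for all plan elements.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (10)
    cp.usecounts,
    -- Estimated cost of the first simple statement found in the plan
    qp.query_plan.value('(//StmtSimple/@StatementSubTreeCost)[1]', 'float') AS FirstStatementCost,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE qp.query_plan IS NOT NULL;
```

Saving the query_plan column value to a file with a .sqlplan extension allows it to be opened as a graphical plan in SSMS.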
Question: What impact does having SSMS associated with the .sqlplan filetype have?
Demonstration 1A: Viewing Execution Plans in SSMS
Key Points
In this demonstration you will see how to:
Show an estimated execution plan
Compare execution costs between two queries in a batch
Show an actual execution plan
Save an execution plan
Demonstration Steps
Question: How do you explain that such different queries return the same plan?
Lesson 2
Common Execution Plan Elements
Now that the role of execution plans is understood, along with the format of the plans, it is important to learn to interpret the plans. Execution plans can contain a large number of different types of elements. Certain elements however, appear regularly in execution plans. In this lesson, you will learn to interpret execution plans and learn about the most common execution plan elements.
Objectives
After completing this lesson, you will be able to:
Describe the execution plan elements for table and clustered index scans and seeks
Describe the execution plan elements for nested loops and lookups
Describe the execution plan elements for merge and hash joins
Describe the execution plan elements for aggregations
Describe the execution plan elements for filter and sort
Describe the execution plan elements for data modification
Table and Clustered Index Scans and Seeks
Key Points
Three execution plan elements relate to reading data from a table. The particular element used depends upon the structure of the table: heap or clustered index and whether the clustered index (if present) is useful in resolving the query.
Table and Clustered Index Scans and Seeks

Question: What is the difference between a table scan and a clustered index scan?
Table scans are a problem in many queries. There is a common misconception that table scans are a problem but that clustered index scans are not. No doubt this relates to the word "index" in the name of the element. Table scans and clustered index scans are essentially identical except that table scans apply to heaps and clustered index scans apply to tables that are structured with clustered indexes.
If a query's logic is related to the clustering key for the table, SQL Server may be able to use the index that supports it to quickly locate the row or rows required. For example, if a Customer table is clustered on a CustomerID column, consider how the following query would be executed:
SELECT * FROM dbo.Customer WHERE CustomerID = 12;
SQL Server does not need to read the entire table and can use the index to quickly locate the correct customer. This is referred to as a clustered index seek.
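The three elements can be compared side by side with a small experiment. The sketch below uses two hypothetical tables (the names and columns are illustrative only and are not part of the course databases). With the actual execution plan option enabled, the three queries would typically show a Table Scan, a Clustered Index Scan and a Clustered Index Seek respectively.

```sql
-- Hypothetical demonstration tables: one heap, one clustered on CustomerID.
CREATE TABLE dbo.CustomerHeap
(
    CustomerID   int          NOT NULL,
    CustomerName nvarchar(50) NOT NULL
);
CREATE TABLE dbo.CustomerClustered
(
    CustomerID   int          NOT NULL PRIMARY KEY CLUSTERED,
    CustomerName nvarchar(50) NOT NULL
);
GO

-- Heap with no indexes: every page must be read (Table Scan).
SELECT * FROM dbo.CustomerHeap WHERE CustomerID = 12;

-- Predicate is not on the clustering key: every page is still read
-- (Clustered Index Scan).
SELECT * FROM dbo.CustomerClustered WHERE CustomerName = N'Test';

-- Predicate is on the clustering key: the index locates the row directly
-- (Clustered Index Seek).
SELECT * FROM dbo.CustomerClustered WHERE CustomerID = 12;
GO

DROP TABLE dbo.CustomerHeap;
DROP TABLE dbo.CustomerClustered;
```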
Nested Loops and Lookups
Key Points
Nested Loops are one of the most commonly encountered operations. They are used to implement join operations and are commonly associated with RID or Key Lookup elements.
Nested Loops and Lookups

Nested loop operations are used to implement joins. Two data paths will enter the nested loop element from the right-hand side as shown in the following screenshot:
For each row in the upper input, a lookup is performed against the bottom input. The difference between a RID Lookup and a Key Lookup is whether the table has a clustered index. RID Lookups apply to heaps. Key Lookups apply to tables with clustered indexes.
In some earlier documentation, a Key Lookup was also referred to as a Bookmark Lookup. The Key Lookup operator was introduced in SQL Server 2005 Service Pack 2. Note also that in earlier versions of SQL Server 2005, the Bookmark Lookup was shown as a Clustered Index Seek operator with a LOOKUP keyword associated with it.
In the physical library analogy, a lookup is similar to reading through an author index and for each book found in the index, going to collect it from the bookcases. Lookups are often expensive operations as they need to be executed once for every row of the top input source. Note that in the execution plan shown, more than half the cost of the query is accounted for by the Key Lookup operator. In the next module, you will see options for minimizing this cost in some situations. Nested Loop is the preferred choice whenever the number of rows in the top input source is small when compared with the number of rows in the bottom input source.
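A query of the following shape is one way to see a Nested Loops element paired with a Key Lookup. This is a sketch that assumes the AdventureWorks2008R2 sample database, and that the Person.Person table still has its nonclustered index on LastName, FirstName and MiddleName. The search value is arbitrary, and whether the optimizer actually chooses a lookup depends on the statistics at the time.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

-- ModifiedDate is not part of the nonclustered index on LastName, so each
-- matching index row may need a Key Lookup into the clustered index.
SELECT p.BusinessEntityID, p.LastName, p.FirstName, p.ModifiedDate
FROM Person.Person AS p
WHERE p.LastName = N'Smith';

-- All of these columns are available from the nonclustered index itself,
-- so no lookup should be needed.
SELECT p.BusinessEntityID, p.LastName, p.FirstName
FROM Person.Person AS p
WHERE p.LastName = N'Smith';
```

Executing both statements in one batch with the actual execution plan included allows the relative costs of the two plans to be compared.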
Merge and Hash Joins
Key Points
Merge Joins and Hash Matches are other forms of join operations. Merge Joins are more efficient than Hash Matches but require sorted inputs.
Merge Joins
Apart from Nested Loop operations in which each row of one table is used to look up rows from another table, it is common to need to join tables where simple lookups are not possible.
Imagine two piles of paper sitting on the floor of your office. One pile of paper holds details of all your customers, one customer to a sheet. The other pile of paper holds details of customer orders, one order per sheet. If you needed to merge the two piles of paper together so that each customer's sheet was adjacent to his/her orders, how would you perform the merge?
The answer depends upon the order of the sheets. If the customer sheets were in customer ID order and the customer order sheets were also in customer ID order, merging the two piles would be easy. The process involved is similar to what occurs with a Merge Join operator. It can only be used when the inputs are already in the same order.
Merge Joins can be used to implement a variety of join types such as left outer joins, left semi joins, left anti semi joins, right outer joins, right semi joins, right anti semi joins and unions.
Hash Matches
Now imagine how you would merge the piles of customers and customer orders if the customers were in customer ID order but the customer orders were ordered by customer order number. The same problem would occur if the customer sheets were in postal code order. These situations are similar to the problem encountered by Hash Match operations. There is no easy way to merge the piles.
Hash Matches use a relatively "brute force" method of joining. One input is broken into a set of "hash buckets" based on an algorithm. The other input is processed based on the same algorithm. In the analogy with the piles of paper, the algorithm could be to obtain the first digit of the customer ID. With this algorithm, ten buckets would be created.
Although it may not be possible to always avoid Hash Matches in query plans, their presence is often an indication of a lack of appropriate indexing on the underlying tables.
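For exploration purposes, T-SQL join hints can be used to see how the same join looks under each physical join type. The sketch below assumes the AdventureWorks2008R2 sample database. The hints are shown only as a learning aid. They override the optimizer's choice and also fix the join order, so they are rarely appropriate in production code.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

-- Nested Loops: one input is probed once for each row of the other input.
SELECT c.CustomerID, soh.SalesOrderID
FROM Sales.Customer AS c
INNER LOOP JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = c.CustomerID;

-- Merge Join: both inputs must arrive in CustomerID order; a Sort element
-- may be added to the plan if no index supplies that order.
SELECT c.CustomerID, soh.SalesOrderID
FROM Sales.Customer AS c
INNER MERGE JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = c.CustomerID;

-- Hash Match: one input is hashed into buckets, then the other is probed.
SELECT c.CustomerID, soh.SalesOrderID
FROM Sales.Customer AS c
INNER HASH JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = c.CustomerID;
```

Running the three statements as a single batch with the actual execution plan included shows the relative cost that SSMS assigns to each statement.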
Aggregations
Key Points
There are two types of Aggregate operator: Stream Aggregate and Hash Match Aggregate. Stream Aggregate operations are very efficient.
Aggregations
Imagine being asked to count how many orders are present for each customer based on a list of customer orders. How would you perform this operation? Similar to the discussion on Merge Joins and Hash Matches, the answer depends on the order that the customer orders are being held in. If the customer orders are already in customer ID order, then performing the count (or other aggregation) is very easy. This is the equivalent of a Stream Aggregate operation. However, if the aggregate being calculated is based on a different attribute of the customer orders than the attribute they are sorted by, performing the calculations is much more complex. One option would be to first sort all the customer orders by customer ID, then to count all the customer orders for each customer ID. Another alternative is to process the input via a hashing algorithm like the one used for Hash Match operations. This is what SQL Server does when using a Hash Match Aggregate operation. The presence of these operations in a query plan is often (but not always) an indication of a lack of appropriate indexing on the underlying table.
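The two aggregate operators can be compared by using the ORDER GROUP and HASH GROUP query hints. This is a sketch for exploration only, assuming the AdventureWorks2008R2 sample database; in normal use the choice is best left to the optimizer.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

-- Ask for an order-based aggregate (Stream Aggregate). If no index supplies
-- rows in CustomerID order, a Sort element is added ahead of the aggregate.
SELECT soh.CustomerID, COUNT(*) AS OrderCount
FROM Sales.SalesOrderHeader AS soh
GROUP BY soh.CustomerID
OPTION (ORDER GROUP);

-- Ask for a hash-based aggregate (Hash Match Aggregate).
SELECT soh.CustomerID, COUNT(*) AS OrderCount
FROM Sales.SalesOrderHeader AS soh
GROUP BY soh.CustomerID
OPTION (HASH GROUP);
```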
Filter and Sort
Key Points
Filter operations implement WHERE clause predicates or HAVING clause predicates. Sort operations sort input data.
Filter and Sort

WHERE clauses and HAVING clauses limit the rows returned by a query. A Filter operation can be used to implement this limit. Data rows from the input are only passed to the output if they meet specified filter criteria based on the predicates in those clauses. Filter operations are typically low cost and are processed as the data passes through the element.
Sort operations are often used to implement ORDER BY clauses in queries but they have other uses. For example, a Sort operation could be used to sort rows before they are passed to other operations such as Merge Joins or for performing DISTINCT or UNION operations.
Sorting data rows can be an expensive operation. Unnecessary ORDER BY operations should be avoided. Not all data needs to be output in a specific order.
Question: What would affect the cost of a sort operation?
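A query such as the following is likely to produce both elements. This is a sketch against the AdventureWorks2008R2 sample database, and the date and threshold values are arbitrary. The exact plan depends on the indexes and statistics available.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

SELECT soh.CustomerID, COUNT(*) AS OrderCount
FROM Sales.SalesOrderHeader AS soh
WHERE soh.OrderDate >= '20080101'   -- often applied inside the scan or seek rather than as a separate element
GROUP BY soh.CustomerID
HAVING COUNT(*) > 5                 -- a predicate on an aggregate typically appears as a Filter element
ORDER BY OrderCount DESC;           -- no index supplies this order, so a Sort element is expected
```

Removing the ORDER BY clause and comparing the two plans gives a sense of how much of the cost is attributed to the sort.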
Data Modification
Key Points
INSERT, UPDATE and DELETE operations are used to present the outcome of underlying T-SQL data modification statements. T-SQL MERGE statements can be implemented by combinations of INSERT, UPDATE and DELETE operations.
Data Modification
The purpose of these operations will usually be self-evident, but what might not be obvious is the potential cost of these operations or the complexity that can be involved with them. A T-SQL INSERT, UPDATE or DELETE statement might involve much more than the related execution plan operation.
Question: Can you think of an example where an INSERT statement in T-SQL needs to perform more than an INSERT operation in an execution plan?
Demonstration 2A: Common Execution Plan Elements
Key Points
In this demonstration you will see queries that demonstrate the most common execution plan elements.
Demonstration Steps
Question: Why is the plan for a simple delete so complex?
Lesson 3
Working with Execution Plans
Now that you understand the importance of execution plans and are familiar with common elements contained within the plans, consideration needs to be given to the different ways that the plans can be captured. In this lesson, you will see a variety of ways to capture plans and explore the criteria by which SQL Server decides whether or not to reuse plans. When working with execution plans, SQL Server exposes a number of dynamic management views (DMVs) that can be used to explore query plan reuse. You will also see how they are used.
Objectives
After completing this lesson, you will be able to:
Implement methods for capturing plans
Explain how SQL Server decides whether or not to reuse existing plans when re-executing queries
Use execution plan related DMVs
Methods for Capturing Plans
Key Points
Earlier in this module you saw how to capture execution plans using SQL Server Management Studio. Other options exist for capturing plans.
Methods for Capturing Execution Plans

SQL Server Management Studio (SSMS) can be used to obtain both estimated and actual execution plans. The same options have been added to Visual Studio 2010 (VS). This can help avoid the need to have two tools open when performing development against SQL Server.
It is not always possible, however, to load queries into SSMS or VS for analysis. Often you will need to analyze systems that are in production, or to analyze queries generated by third party applications whose source code you have no direct access to.
SQL Profiler has an event (Performance events > Showplan XML) that can be used to add a column to a trace. The trace will then include the actual execution plans. Caution needs to be taken with using this option as a very large trace output could be generated very quickly if appropriate filtering is not used. The overall performance of the system could be degraded.
Dynamic management views provide information about recent expensive queries and missing indexes that were detected by SQL Server when creating the plan. SQL Server Activity Monitor can display the results of querying these DMVs.
The SQL Server Data Collection system collects information from the DMVs, uploads it to a central database and provides a series of reports based on the data. Unlike Activity Monitor, which shows recent expensive queries, the data collection system can show historical entries. This can be very useful when a user asks about a problem that occurred last Tuesday morning rather than at the time the problem is occurring.
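As one example of the DMV-based approach, the missing index information mentioned above can be queried directly. The following is a minimal sketch, assuming the caller has VIEW SERVER STATE permission. The ordering expression is just one rough way of ranking the suggestions and is not an official formula.

```sql
SELECT TOP (10)
    mid.statement          AS TableName,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.user_seeks,                       -- how often the index could have been used for a seek
    migs.avg_user_impact                   -- estimated percentage improvement
FROM sys.dm_db_missing_index_details AS mid
INNER JOIN sys.dm_db_missing_index_groups AS mig
    ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs
    ON migs.group_handle = mig.index_group_handle
ORDER BY migs.avg_user_impact * migs.user_seeks DESC;
```

These values are held in memory only, so the results are empty after a server restart until enough queries have been compiled.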
Demonstration 3A: Capturing Plans in Activity Monitor
Key Points
In this demonstration you will see how to use Activity Monitor to view recent expensive queries.
Demonstration Steps
Question: What could cause an expensive query to be removed from the Activity Monitor window?
Re-Executing Queries
Key Points
SQL Server attempts to reuse execution plans where possible. While this is often desirable, the reuse of existing plans can be counterproductive to performance.
Re-Executing Queries
Reusing query plans avoids the overhead of compiling and optimizing the queries. Some queries, however, perform poorly when executed with a plan that was generated for a different set of parameters. For example, consider a query with FromCustomerID and ToCustomerID parameters. If the value of the FromCustomerID parameter was the same as the value of the ToCustomerID parameter, an index seek based on the CustomerID might be highly selective. However, a later execution of that query where a large number of customers were requested would not be selective. This means that SQL Server would perform better if it reconsidered how to execute the query, and thus generate a new plan. You will see a further discussion on this "parameter sniffing" issue in later modules.
Usefulness of Cached Plans

Even for cached plans, SQL Server may eventually decide to evict them from the cache and recompile the queries. The two main reasons for this are:
Correctness (changes to SET options, schema changes, etc.)
Optimality (data has been modified enough that a new plan should be considered)
SQL Server assigns a cost to each plan that is cached, to estimate its "value". The value is a measure of how expensive the execution plan was to generate. When memory resources become tight, SQL Server will need to decide which plans are the most useful to keep. The decision to evict a plan from memory is based on this cost value and on whether or not the plan has been reused recently.
Options are available to force compilation behavior of code but they should be used sparingly and where necessary. You will see a further discussion on this issue in a later module.
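One such option is the RECOMPILE query hint. The sketch below applies it to a customer range query like the one described earlier, assuming the AdventureWorks2008R2 sample database. The procedure name is hypothetical. The hint trades the cost of compiling on every execution for a plan that suits the parameter values actually supplied.

```sql
USE AdventureWorks2008R2;   -- assumption: the sample database from the labs
GO

-- Hypothetical procedure illustrating the FromCustomerID / ToCustomerID case.
CREATE PROCEDURE dbo.GetOrdersByCustomerRange
    @FromCustomerID int,
    @ToCustomerID   int
AS
BEGIN
    SELECT soh.SalesOrderID, soh.CustomerID, soh.OrderDate
    FROM Sales.SalesOrderHeader AS soh
    WHERE soh.CustomerID BETWEEN @FromCustomerID AND @ToCustomerID
    OPTION (RECOMPILE);   -- compile a fresh plan for each execution of this statement
END;
GO

-- A narrow range and a wide range can now receive different plans.
EXEC dbo.GetOrdersByCustomerRange @FromCustomerID = 11000, @ToCustomerID = 11000;
EXEC dbo.GetOrdersByCustomerRange @FromCustomerID = 11000, @ToCustomerID = 30000;
```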
Execution Plan Related DMVs
Key Points
Dynamic Management Views provide insight into the internal operations of SQL Server. Several of these views are useful when investigating execution plans. Most DMV values are reset whenever the server is restarted. Some are reset more often.
View                                           Description
sys.dm_exec_connections                        One row per user connection to the server
sys.dm_exec_sessions                           One row per session, including system and user sessions
sys.dm_exec_query_stats                        Query statistics
sys.dm_exec_requests                           Associated with a session and providing one row per currently executing request
sys.dm_exec_sql_text()                         Provides the ability to find the T-SQL code being executed for a request
sys.dm_exec_query_plan()                       Provides the ability to find the execution plan associated with a request
sys.dm_exec_cached_plans                       Details of cached query plans
sys.dm_exec_cached_plan_dependent_objects()    Details of dependent objects for those plans
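The views and functions listed above are usually combined with CROSS APPLY. The following sketch shows two common patterns, assuming the caller has VIEW SERVER STATE permission. The column selection and the TOP (10) limit are arbitrary choices for illustration.

```sql
-- Cached plans that have been reused most often, with their T-SQL text.
SELECT TOP (10)
    cp.usecounts,
    cp.objtype,
    cp.size_in_bytes,
    st.text AS SqlText
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;

-- Cached queries that have consumed the most CPU time, with their plans.
SELECT TOP (10)
    qs.execution_count,
    qs.total_worker_time,        -- CPU time, in microseconds
    qs.total_logical_reads,
    st.text AS SqlText,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;
```

In SSMS, clicking a value in the query_plan column opens the graphical plan.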
Demonstration 3B: Viewing Cached Plans
Key Points
In this demonstration you will see how to view cached execution plans.
Demonstration Steps
2. Open the 32 Demonstration 3B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: No matter how quickly you execute the command to check the cache after you clear it, you would not see it empty. Why?
Lab 7: Reading SQL Server Execution Plans
Lab Setup
Lab Scenario
You have been learning about the design of indexes. To take this learning further, you need to have a way to view how these indexes are used. In the first exercise, you will learn to view both estimated and actual execution plans.
Execution plans can contain many types of elements. In the second exercise, you will learn to identify the most common plan elements and see how statements lead to these elements being used.
You regularly find yourself trying to decide between different ways of structuring SQL queries. You are concerned that you aren't always choosing the highest-performing options. If time permits, you will learn to use execution plans to compare the cost of statements in multi-statement batches.
Exercise 1: Actual vs. Estimated Plans

Scenario
In the first exercise, you will learn to view both estimated and actual execution plans. The main tasks for this exercise are as follows:
1. Load the test script.
2. Generate an estimated execution plan for script 7.1.
3. View the estimated execution plan for script 7.2 using SHOWPLAN_XML.
4. Generate the actual execution plan for script 7.3.
5. Try to generate an estimated execution plan for script 7.4.
6. Review the actual execution plan for script 7.4.
7. Review the execution plans currently cached in memory using script 7.5.
Task 1: Load the test script

Load the 51 Lab Exercise 1.sql script from Solution Explorer. Change the database context to AdventureWorks2008R2.
Task 2: Generate an estimated execution plan for script 7.1

Generate an estimated plan for script 7.1.
Task 3: View the estimated execution plan for script 7.2 using SHOWPLAN_XML
Execute script 7.2 in SQL Server Management Studio. Click on the returned XML and view the execution plan. Right-click in the whitespace in the plan. Choose Show Execution Plan XML. Briefly review the XML. Close the XML window and the execution plan window.
Task 4: Generate the actual execution plan for script 7.3

Enable the option to include actual plans, then execute script 7.3. Note the returned execution plan tab and note that the plan is identical to the plan from the previous task.
Task 5: Try to generate an estimated execution plan for script 7.4

Request an estimated plan for script 7.4. Note that an estimated plan cannot be created; the reason is shown in the Messages tab.
Task 6: Review the actual execution plan for script 7.4

Execute script 7.4 and note the returned plan.
Task 7: Review the execution plans currently cached in memory using script 7.5
Execute script 7.5 to view the plans currently cached in memory.
Results: After this exercise, you have reviewed various actual and estimated query plans.
Exercise 2: Identify Common Plan Elements

Scenario
Execution plans can contain many types of elements. You will learn to identify the most common plan elements and see how statements lead to these elements being used. The main tasks for this exercise are as follows:
1. Load the test script
2. Explain the actual execution plan from script 7.6
3. Explain the actual execution plan from script 7.7
4. Explain the actual execution plan from script 7.8
5. Explain the actual execution plan from script 7.9
6. Explain the actual execution plan from script 7.10
7. Explain the actual execution plan from script 7.11
8. Explain the actual execution plan from script 7.12
9. Explain the actual execution plan from script 7.13
10. Explain the actual execution plan from script 7.14

Load the 61 Lab Exercise 2.sql script from Solution Explorer. Change the database context to AdventureWorks2008R2. Select the option to include actual execution plans from the Query menu.
Task 2: Explain the actual execution plan from script 7.6

Execute script 7.6. Explain the plan returned based upon the existing table structure.





Execute script 7.11. Compare the plan to the one returned by script 7.10. Suggest a reason for the difference in plan, where the queries are almost identical. Also note the green Missing Index warning.


Execute script 7.13. Compare the plan to the one returned by script 7.12. Suggest a reason for the difference in plan, where the queries are very similar.
Execute script 7.14. Note the difference in this plan from the plan for script 7.12.
Results: After this exercise, you will have analyzed the most common plan elements returned from queries.
Challenge Exercise 3: Query Cost Comparison (Only if time permits)

Scenario
You regularly find yourself trying to decide between different ways of structuring SQL queries. You are concerned that you aren't always choosing the highest-performing options. You will learn to use execution plans to compare the cost of statements in multi-statement batches. The main tasks for this exercise are as follows:
1. Load the test script
2. Explain the actual execution plan from script 7.15

Load the 71 Lab Exercise 3.sql script from Solution Explorer. Change the database context to AdventureWorks2008R2. Select the option to include actual execution plans from the Query menu.

Execute script 7.15 as a single batch (both queries should be executed together). Explain the execution plan that is returned. In particular, explain the relationship between the two query plans.
Results: After this exercise, you have used execution plans to compare the cost of statements in multi-statement batches.
Review Questions
1. What is the difference between a graphical execution plan and an XML execution plan?
2. Give an example of why a T-SQL DELETE statement could have a complex execution plan.
Best Practices
1. Avoid capturing execution plans for large numbers of statements when using SQL Profiler.
2. If you need to capture plans using Profiler, make sure the trace is filtered to reduce the number of events being captured.
Improving Performance through Nonclustered Indexes
Module 8
Contents:
Lesson 1: Designing Effective Nonclustered Indexes
Lesson 2: Implementing Nonclustered Indexes
Lesson 3: Using the Database Engine Tuning Advisor
Lab 8: Improving Performance through Nonclustered Indexes
Module Overview
The biggest improvements in database query performance on most systems come from appropriate use of indexing. In previous modules, you saw how to structure tables for efficiency, including the option of creating a clustered index on the table. In this module, you will see how nonclustered indexes have the potential to significantly enhance the performance of your applications and learn to use a tool that can help you design these indexes appropriately.
Objectives
After completing this module, you will be able to:
Design effective nonclustered indexes
Implement nonclustered indexes
Use the Database Engine Tuning Advisor to design indexes
Lesson 1
Designing Effective Nonclustered Indexes
Before you start to implement nonclustered indexes, you need to design them appropriately. In this lesson, you will learn how SQL Server structures nonclustered indexes and how they can provide performance improvements for your applications. You will also see how to find information about the indexes that have been created.
Objectives
After completing this lesson, you will be able to:
Describe the concept of nonclustered indexes
Explain how SQL Server structures nonclustered indexes when the underlying table is organized as a heap
Explain how SQL Server structures nonclustered indexes when the underlying table is organized with a clustered index
Obtain information about indexes that have been created
What is a Nonclustered Index?
Key Points
You have seen how tables can be structured as heaps or have clustered indexes. Additional indexes can be created on the tables to provide alternate ways to rapidly locate required data. These additional indexes are called nonclustered indexes.
Nonclustered Indexes
A table can have up to 999 non-clustered indexes. These indexes are assigned index IDs greater than 1. Non-clustered indexes can be defined on a table regardless of whether the table uses a clustered index or a heap, and are used to improve the performance of important queries. Whenever updates to key columns from the nonclustered index or updates to clustering keys on the base table are made, the nonclustered indexes need to be updated as well. This impacts the data modification performance of the system. Each additional index that is added to a table increases the work that SQL Server might need to perform when modifying the data rows in the table. Care must be taken to balance the number of indexes created against the overhead that they introduce.
Ongoing Review
An application's data access patterns may change over time, particularly in enterprises where ongoing development work is being performed on the applications. This means that nonclustered indexes that are created at one point in time may need to be altered or even dropped at a later point in time, to continue to achieve high performance levels.
Physical Analogy
Continuing our library analogy, nonclustered indexes are indexes that point back to the bookcases. They provide alternate ways to look up the information in the library. For example, they might allow access by author, by release date, by publisher, etc. They can also be composite indexes, where you could find books by release date within the entries for each author.
Nonclustered Indexes Over Heaps
Key Points
Nonclustered indexes have the same B-tree structure as clustered indexes, but in the nonclustered index, the data and the index are stored separately. When the underlying table is structured as a heap, the leaf level of a nonclustered index holds Row ID pointers instead of data. By default, no data apart from the keys is stored at the leaf level.
Nonclustered Indexes Over Heaps

After traversing the structure of the nonclustered index, SQL Server obtains Row ID pointers in the leaf level of the index and uses these pointers to directly access all required data pages. Multiple nonclustered indexes can be created on a table regardless of whether the table is structured as a heap or if the table has a clustered index.
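The structure described above can be sketched with a small scratch table. The dbo.Book table, its columns, and the index name are hypothetical and follow the library analogy; the script runs in tempdb so that nothing in a course database is touched:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
-- No clustered index and no PRIMARY KEY, so the table is stored as a heap.
CREATE TABLE dbo.Book
(
    ISBN        char(13)      NOT NULL,
    Title       nvarchar(200) NOT NULL,
    Author      nvarchar(100) NOT NULL,
    ReleaseDate date          NOT NULL
);
GO
-- The leaf level of this index holds Author values plus Row ID pointers.
CREATE NONCLUSTERED INDEX IX_Book_Author ON dbo.Book (Author);
GO
-- The heap itself appears with index_id 0; nonclustered indexes have IDs above 1.
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Book');
```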
Physical Analogy
Based on the library analogy, a nonclustered index over a heap is like an author index pointing to books that have been stored in no particular order within the bookcases. Once an author is found in the index, the entry in the index for each book would have an address like "Bookcase 4, Shelf 3, Book 12". Note that it would be a pointer to the exact location of the book.
Question: What is an upside of having the indexes point directly to RowIDs?
Question: What is the downside of having multiple indexes pointing to data pages via RowID?
Nonclustered Indexes Over Clustered Indexes
Key Points
You have seen that the base table could be structured with a clustered index instead of a heap. While SQL Server could have been designed so that nonclustered indexes still pointed to Row IDs, it was not designed that way. Instead, the leaf levels of a nonclustered index contain the clustering keys for the base table.
Nonclustered Indexes Over Clustered Indexes

After traversing the structure of the nonclustered index, SQL Server obtains clustering keys from the leaf level of the index. It then uses these keys to traverse the structure of the clustered index to locate the required data pages. Note that two sets of index traversal occur. If the clustered index was not a unique clustered index, the leaf level of the nonclustered index also needs to hold the uniqueifier value for the data rows.
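A minimal sketch of this arrangement, again using a hypothetical dbo.Book scratch table in tempdb (table, column, and index names are assumptions made for illustration):

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
-- ISBN is the clustering key of the base table.
CREATE TABLE dbo.Book
(
    ISBN   char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    Title  nvarchar(200) NOT NULL,
    Author nvarchar(100) NOT NULL
);
GO
-- The leaf level of this index holds Author plus the clustering key (ISBN).
CREATE NONCLUSTERED INDEX IX_Book_Author ON dbo.Book (Author);
GO
-- Needs only Author and ISBN, both present in IX_Book_Author,
-- so no second traversal of the clustered index is required.
SELECT Author, ISBN
FROM dbo.Book
WHERE Author = N'Example Author';

-- Title is stored only in the clustered index, so rows found through
-- IX_Book_Author would need a second traversal using the ISBN values.
SELECT Author, ISBN, Title
FROM dbo.Book
WHERE Author = N'Example Author';
```

With an empty or very small table the optimizer may simply scan the clustered index for the second query; the difference becomes visible in execution plans once the table holds a realistic number of rows.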
Physical Analogy
In the library analogy, a nonclustered index over a clustered index is like having an author index built over a library where the books are all stored in ISBN order. When the required author is found in the author index, the entry in the index provides details of the ISBNs for the required books. These ISBNs are then used to locate the books within the bookcases. If the bookcases need to be rearranged (for example due to other rows being modified), no changes need to be made to the author index as it is only providing keys that are used for locating the books, rather than the physical location of the books.
Question: What is the downside of holding clustering keys in the leaf nodes of a nonclustered index instead of RowIDs?
Question: What is the upside of holding clustering keys in the leaf nodes of a nonclustered index instead of RowIDs?
Methods for Obtaining Index Information
Key Points
You might require information about existing indexes before you create, modify, or remove an index. SQL Server 2008 provides many ways to obtain information about indexes.
SQL Server Management Studio

SQL Server Management Studio (SSMS) offers a variety of ways to obtain information about indexes. Object Explorer lists the indexes associated with tables. This includes indexes that have been created by users and indexes that SQL Server has created to support PRIMARY KEY and UNIQUE constraints. Each index has a property page that details the structure of the index and its operational, usage and physical layout characteristics. SSMS also includes a set of prebuilt reports that show the state of a database. Many of these reports relate to index structure and usage.
System Stored Procedures and Catalog Views

The sp_helpindex system stored procedure returns details of the indexes created on a specified table. SQL Server provides a series of catalog views that provide information about indexes. Some of the more useful views are shown in the following table:
System View          Notes
sys.indexes          Index type, filegroup or partition scheme ID, and the current setting of index options that are stored in metadata.
sys.index_columns    Column ID, position within the index, type (key or nonkey), and sort order (ASC or DESC).
sys.stats            Statistics associated with a table, including statistic name and whether it was created automatically or by a user.
sys.stats_columns    Column ID associated with the statistic.
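As one possible way of combining these sources, the sketch below lists every index on a table together with its key and included columns. It assumes the Marketing.WebLog table from this module's lab as the target; any table name can be substituted:

```sql
-- Summary of the indexes on the table.
EXEC sp_helpindex N'Marketing.WebLog';

-- One row per index column, joining the catalog views.
SELECT i.name                AS index_name,
       i.index_id,
       i.type_desc,
       i.is_unique,
       c.name                AS column_name,
       ic.key_ordinal,              -- 0 for included (nonkey) columns
       ic.is_included_column,
       ic.is_descending_key
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
     ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
     ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'Marketing.WebLog')
ORDER BY i.index_id, ic.is_included_column, ic.key_ordinal;
```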
Dynamic Management Views

SQL Server provides a series of dynamic management objects with useful information about the structure and usage of indexes. Some of the most useful views and functions are shown in the following table:
View                                Notes
sys.dm_db_index_physical_stats      Index size and fragmentation statistics.
sys.dm_db_index_operational_stats   Current index and table I/O statistics.
sys.dm_db_index_usage_stats         Index usage statistics by query type.
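The following sketch shows how two of these objects might be queried for a single table. The target table Marketing.WebLog (from this module's lab) and the LIMITED scan mode are choices made for the illustration:

```sql
-- Size, depth and fragmentation of each index on the table.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.index_depth,
       ps.page_count,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'Marketing.WebLog'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
     ON i.object_id = ps.object_id AND i.index_id = ps.index_id;

-- How often each index has been used since the instance last started.
SELECT i.name AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates      -- maintenance cost caused by data modifications
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
     ON i.object_id = us.object_id AND i.index_id = us.index_id
WHERE us.database_id = DB_ID()
  AND us.object_id = OBJECT_ID(N'Marketing.WebLog');
```

An index that shows many user_updates but few seeks, scans, or lookups is a candidate for review, since it adds modification overhead without helping queries.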
System Functions
SQL Server provides a set of functions that provide information about the structure of indexes. Some of the more useful functions are shown in the following table:
Function             Notes
INDEXKEY_PROPERTY    Index column position within the index and column sort order (ASC or DESC).
INDEXPROPERTY        Index type, number of levels, and current setting of index options that are stored in metadata.
INDEX_COL            Name of the key column of the specified index.
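A short sketch of how these functions are called. The index name IX_WebLog_SessionStart and the index ID 2 are assumptions made for illustration; each function returns NULL if the named index or ID does not exist on the table:

```sql
DECLARE @object_id int = OBJECT_ID(N'Marketing.WebLog');

SELECT
    -- 1 if the named index is clustered, 0 if not.
    INDEXPROPERTY(@object_id, N'IX_WebLog_SessionStart', 'IsClustered') AS is_clustered,
    -- Number of levels in the index B-tree.
    INDEXPROPERTY(@object_id, N'IX_WebLog_SessionStart', 'IndexDepth')  AS index_depth,
    -- Name of the first key column of index ID 2.
    INDEX_COL(N'Marketing.WebLog', 2, 1)                                AS first_key_column,
    -- 1 if the first key column of index ID 2 is sorted DESC.
    INDEXKEY_PROPERTY(@object_id, 2, 1, 'IsDescending')                 AS first_key_is_descending;
```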
In the next demonstration, you will see examples of many of these methods for obtaining information on indexes.
Demonstration 1A: Obtaining Index Information
Key Points
In this demonstration you will see several ways to view information about indexes.
Demonstration Steps
Question: What would be another way to find information about the physical structure of indexes?
Lesson 2
Implementing Nonclustered Indexes
Now that you have learned how nonclustered indexes are structured, it is important to learn how nonclustered indexes are implemented. In earlier modules, you have seen how Lookup operations used with Nested Loops execution plan elements can be very expensive. In this lesson, you will see options for alleviating these costs. You will also see how to alter or drop nonclustered indexes and how filtered indexes can reduce the overhead associated with some nonclustered indexes.
Objectives
After completing this lesson, you will be able to:
Create nonclustered indexes
Describe the performance impact of Lookup operations as part of Nested Loops in execution plans
Use the INCLUDE clause to create covering indexes
Drop or alter nonclustered indexes
Use filtered indexes
Creating Nonclustered Indexes
Key Points
Nonclustered indexes are created with the CREATE INDEX statement. By default, the CREATE INDEX statement creates nonclustered indexes rather than clustered indexes when you do not specify which type of index you require. Wherever possible, the clustered index (if the table needs one) should be created prior to the nonclustered indexes. Otherwise SQL Server needs to rebuild all nonclustered indexes while creating the clustered index.
Creating Nonclustered Indexes

Creating a nonclustered index requires supplying a name for the index, the name of the table to be indexed, and the columns that make up the index key. It is important to choose an appropriate naming scheme for indexes. Many standards for naming indexes exist, along with strong opinions on which of the standards is best. The important thing is to choose a standard and follow it consistently. If an index is created only to enhance performance, rather than as part of the initial schema of an application, one suggested standard is to include in the name of the index the date of creation and a reference to documentation that describes why the index was created. Database administrators are often hesitant to remove indexes when they do not know why those indexes were created. Keeping documentation that explains why indexes were created avoids that confusion.
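The statement below sketches the three required parts (index name, table, key columns) together with one possible naming convention that embeds the creation date. The dbo.Book scratch table, the names, and the referenced document number are all hypothetical:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
CREATE TABLE dbo.Book
(
    ISBN        char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    Author      nvarchar(100) NOT NULL,
    ReleaseDate date          NOT NULL
);
GO
-- Name pattern used here: IX_<table>_Perf_<yyyymmdd>_<sequence letter>.
-- Reason for creation: see tuning note PERF-0042 (hypothetical document reference).
CREATE NONCLUSTERED INDEX IX_Book_Perf_20100830_A
ON dbo.Book (Author);

-- NONCLUSTERED is the default, so this also creates a nonclustered index.
CREATE INDEX IX_Book_Perf_20100830_B
ON dbo.Book (ReleaseDate);
```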
Composite Nonclustered Indexes

A composite index specifies more than one column as the key value. Query performance can be enhanced by using composite indexes, especially when users regularly search for information in more than one way. However, wide keys increase the storage requirements of an index. The majority of useful nonclustered indexes in business applications are composite indexes. A common error is to create single column indexes on many columns of a table. These indexes are rarely useful.
In composite indexes, the ordering of key columns is important. In the absence of any other requirements, the most selective column should be specified first. Each column that makes up the key can be specified as ASC (ascending) or DESC (descending). Ascending is the default order.
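A composite key with mixed sort directions might look like the following sketch, which mirrors the library analogy of finding books by release date within each author. The dbo.Book scratch table and the index name are hypothetical:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
CREATE TABLE dbo.Book
(
    ISBN        char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    Author      nvarchar(100) NOT NULL,
    ReleaseDate date          NOT NULL
);
GO
-- Entries are ordered by Author, then newest release first within each author.
CREATE NONCLUSTERED INDEX IX_Book_Author_ReleaseDate
ON dbo.Book (Author ASC, ReleaseDate DESC);
GO
-- The index order matches this query's filter and sort order.
SELECT Author, ReleaseDate, ISBN
FROM dbo.Book
WHERE Author = N'Example Author'
ORDER BY ReleaseDate DESC;
```

A search on ReleaseDate alone cannot seek on this index, because ReleaseDate is not the leading key column.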
Performance Impact of Lookups in Nested Loops
Key Points
Nonclustered indexes can be very useful when you need to find specific data based on the key columns of the index. However, for each entry found, SQL Server then needs to use the values from the leaf level of the index (either clustering keys or Row IDs) to look up the data rows from the base table. This lookup process can be very expensive.
Performance Impact of Lookups in Nested Loops

In the example shown on the slide, note the percentage cost breakdown. The key lookups are estimated at 95% of the cost of executing the query. In the library analogy, this is equivalent to looking up an author in an index and for each entry found, running over to the bookcase to retrieve the books pointed to by the index. There is a point at which the effort of doing this is not worthwhile and it is quicker to scan the entire library.
Question: How selective would you imagine a query needs to be before SQL Server will decide to ignore the index and just scan the data?
Question: Is there any situation where there is no need for the lookups?
INCLUDE Clause
Key Points
In earlier versions of SQL Server (prior to 2005), it was common for DBAs or developers to create indexes with a large number of columns, to attempt to "cover" important queries. Covering a query avoids the need for lookup operations and can greatly increase the performance of queries. The INCLUDE clause was introduced to make the creation of covering indexes easier.
INCLUDE Clause
Adding columns to the key of an index adds a great deal of overhead to the index structure. For example, in the library analogy, if an index was constructed on PublisherID, ReleaseDate and Title, the index would internally be sorted by Title for no benefit. A further issue is the limitation of 16 columns and 900 bytes for an index key, as this limits the ability to add columns to index keys when trying to cover queries. SQL Server 2005 introduced the ability to include one or more nonkey columns (up to 1,023 columns) only at the leaf level of the index. The index structure in other levels is unaffected by these included columns. They are included only to help with covering queries. If more than one column is listed in an INCLUDE clause, the order of the columns within the clause is not relevant.
Question: For an index to cover a single table query, which columns would need to be present in the index?
Performance Impacts
Covering indexes can have a very positive performance impact on the queries that they are designed to support. However, while it would be possible to create an index to cover most queries, doing so could be counterproductive. Each index that is added to a table can negatively impact the performance of data modifications on the table. For this reason, it is important to decide which queries are most important and to aim to cover only those queries.
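The INCLUDE clause can be sketched with the PublisherID, ReleaseDate and Title example from the library analogy. The dbo.Book scratch table, the index name, and the sample query are hypothetical:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
CREATE TABLE dbo.Book
(
    ISBN        char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    PublisherID int           NOT NULL,
    ReleaseDate date          NOT NULL,
    Title       nvarchar(200) NOT NULL
);
GO
-- PublisherID and ReleaseDate form the sorted key.
-- Title is stored only at the leaf level and is not part of the sort order.
CREATE NONCLUSTERED INDEX IX_Book_PublisherID_ReleaseDate
ON dbo.Book (PublisherID, ReleaseDate)
INCLUDE (Title);
GO
-- Every column the query references is available in the index,
-- so the index covers the query and no lookup is needed.
SELECT PublisherID, ReleaseDate, Title
FROM dbo.Book
WHERE PublisherID = 42
ORDER BY ReleaseDate;
```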
Dropping or Altering Nonclustered Indexes
Key Points
Only indexes created via CREATE INDEX can be dropped via DROP INDEX. If an index has been created by SQL Server to support a PRIMARY KEY or UNIQUE constraint, then those indexes need to be dropped by dropping the constraint instead.
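The difference between the two cases can be sketched as follows. The dbo.Book scratch table and the constraint and index names are hypothetical:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
CREATE TABLE dbo.Book
(
    ISBN          char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    CatalogNumber int           NOT NULL CONSTRAINT UQ_Book_CatalogNumber UNIQUE,
    Author        nvarchar(100) NOT NULL
);
GO
CREATE NONCLUSTERED INDEX IX_Book_Author ON dbo.Book (Author);
GO
-- An index created with CREATE INDEX is removed with DROP INDEX.
DROP INDEX IX_Book_Author ON dbo.Book;

-- The index behind the UNIQUE constraint cannot be removed with DROP INDEX;
-- dropping the constraint removes the supporting index as well.
ALTER TABLE dbo.Book DROP CONSTRAINT UQ_Book_CatalogNumber;
```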
Limitations on Altering Indexes

At first glance, it might seem that an index could be restructured via the ALTER INDEX statement. However, changing the columns that make up the key or altering the sort order of the columns is not permitted with ALTER INDEX. Settings such as FILLFACTOR and PAD_INDEX cannot be changed with the SET clause of ALTER INDEX either; they can only be specified when the index is rebuilt or re-created. Changes to the index definition can be implemented by using the CREATE INDEX statement with the DROP_EXISTING option. In the example shown in the slide, an index is being disabled. Once an index is disabled, it is re-enabled by rebuilding the index. The rebuild command shown on the slide uses the ONLINE = ON option. This is only supported on Enterprise or higher editions of SQL Server. The ability to perform online index operations is one of the key reasons for purchasing these editions, as many organizations no longer have available time windows for index maintenance operations.
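The operations discussed above might be written as in the following sketch. It is not the slide's example; the dbo.Book scratch table, the index name, and the option values are assumptions. The online rebuild is left as a comment because it fails on editions that do not support online index operations:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Book', N'U') IS NOT NULL DROP TABLE dbo.Book;
GO
CREATE TABLE dbo.Book
(
    ISBN        char(13)      NOT NULL CONSTRAINT PK_Book PRIMARY KEY CLUSTERED,
    Author      nvarchar(100) NOT NULL,
    ReleaseDate date          NOT NULL
);
GO
CREATE NONCLUSTERED INDEX IX_Book_Author ON dbo.Book (Author);
GO
-- Disable the index: its definition is kept but it is no longer maintained or used.
ALTER INDEX IX_Book_Author ON dbo.Book DISABLE;

-- Rebuilding re-enables the index.
ALTER INDEX IX_Book_Author ON dbo.Book REBUILD;

-- Online rebuild (Enterprise or higher editions only):
-- ALTER INDEX IX_Book_Author ON dbo.Book REBUILD WITH (ONLINE = ON);

-- Changing the key columns requires re-creating the index in place.
CREATE NONCLUSTERED INDEX IX_Book_Author
ON dbo.Book (Author, ReleaseDate DESC)
WITH (DROP_EXISTING = ON, FILLFACTOR = 90, PAD_INDEX = ON);
```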
Filtered Indexes
Key Points
By default, SQL Server includes an entry for every row in a table at the leaf level of each index. This is not always desirable. Filtered indexes only include rows that match a WHERE predicate that is specified when the index is created.
Filtered Indexes
For the example in the slide, consider a large table of transactions with one column that indicates if the transaction is finalized or not. Often only a very small number of rows will be unfinalized. An index on the finalized transactions would be pointless as it would never be sufficiently selective to be helpful. However, an index on the unfinalized transactions could be highly selective and very useful. Standard indexes created in this situation would contain an entry at the leaf level for every transaction row, even though most entries in the index would never be used. Filtered indexes only include entries for rows that match the WHERE predicate. Note that only very simple logic is permitted in the WHERE clause predicate for filtered indexes. For example, you cannot use the clause to compare two columns and you cannot reference a computed column, even if it is persisted. Question: What is the downside of having an entry at the leaf level for every transaction row, whether finalized or not?
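A filtered index for the transaction scenario might be sketched as follows. The dbo.AccountTransaction scratch table, its columns, and the index name are hypothetical and are not taken from the slide:

```sql
USE tempdb;
GO
IF OBJECT_ID(N'dbo.AccountTransaction', N'U') IS NOT NULL
    DROP TABLE dbo.AccountTransaction;
GO
CREATE TABLE dbo.AccountTransaction
(
    TransactionID   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_AccountTransaction PRIMARY KEY CLUSTERED,
    TransactionDate datetime2     NOT NULL,
    Amount          decimal(18,2) NOT NULL,
    IsFinalized     bit           NOT NULL
);
GO
-- Only rows with IsFinalized = 0 receive an entry at the leaf level,
-- so the index stays small even when the table is very large.
CREATE NONCLUSTERED INDEX IX_AccountTransaction_Unfinalized
ON dbo.AccountTransaction (TransactionDate)
INCLUDE (Amount)
WHERE IsFinalized = 0;
GO
-- The query predicate matches the index filter, so the filtered index can be used.
SELECT TransactionID, TransactionDate, Amount
FROM dbo.AccountTransaction
WHERE IsFinalized = 0
  AND TransactionDate >= '20100801';
```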
Demonstration 2A: Nonclustered Indexes
Key Points
In this demonstration you will see how to:
Create covering indexes
View included columns in indexes
Demonstration Steps
Question: If included columns only apply to nonclustered indexes, why do you imagine that the columns in the clustered primary key also showed as included?
Lesson 3
Using the Database Engine Tuning Advisor
Designing useful indexes is considered by many people to be more of an art than a science. While there is some truth to this statement, a number of tools are available to assist with learning to create useful indexes. In this lesson, you will learn how to capture activity against SQL Server using SQL Server Profiler and then how to analyze that activity using the Database Engine Tuning Advisor.
Objectives
After completing this lesson, you will be able to:
Capture traces of activity using SQL Server Profiler
Use Database Engine Tuning Advisor to analyze trace results
SQL Server Profiler
Key Points
SQL Server Profiler is an important tool when tuning the performance of SQL Server queries. It captures the activity from client applications to SQL Server and stores it in a trace. These traces can then be analyzed.
SQL Server Profiler

SQL Server Profiler captures data when events occur. Only events that have been selected are captured. A variety of information (shown as a set of columns) is available when each event occurs. The trace created contains only the selected columns for the selected events. Rather than needing to select events and columns each time you run SQL Server Profiler, a set of existing templates is available. You can also save your own selections as a new template. The captured traces are useful when tuning the performance of an application and when diagnosing specific problems that are occurring. When using traces for diagnosing problems, log data from the Windows Performance Monitor tool can be loaded. This allows system resource impacts to be related to the execution of queries in SQL Server. The traces can also be replayed. The ability to replay traces is useful for load testing systems or for ensuring that upgraded versions of SQL Server can be used with existing applications. SQL Server Profiler also allows you to step through queries when diagnosing problems.
SQL Trace
SQL Server Profiler is a graphical tool and it is important to realize that it can have significant performance impacts on the server being traced, depending upon the options chosen. SQL Trace is a library of system stored procedures that can be used for tracing when minimizing the performance impacts of the tracing is necessary.
The Extended Events system that was introduced in SQL Server 2008 also provides capabilities for tracing SQL Server activity and resources. Both SQL Trace and Extended Events are outside the scope of this course. Question: Where would the ability to replay a trace be useful?
Demonstration 3A: SQL Server Profiler
Key Points
In this demonstration you will see how to use SQL Server Profiler.
Demonstration Steps
1. If Demonstration 1A was not performed:
a. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
b. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
c. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln and click Open.
d. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: When so many statements were executed, why was there only one entry in the trace?
Database Engine Tuning Advisor
Key Points
The Database Engine Tuning Advisor utility analyzes the performance effects of workloads run against one or more databases. Typically these workloads are obtained from traces captured by SQL Server Profiler. After analyzing the effects of a workload on your databases, Database Engine Tuning Advisor provides recommendations for improving the performance of your system.
Database Engine Tuning Advisor

In SQL Server 2000 and earlier, a previous version of this tool was supplied. It was called the "Index Tuning Wizard". In SQL Server 2005, the name was changed as the tool evolved to be able to provide a broader range of recommendations. Database Engine Tuning Advisor was further enhanced in SQL Server 2008 with improved workload parsing, integrated tuning, and the ability to tune multiple databases concurrently.
Workloads
A workload is a set of Transact-SQL statements that executes against databases that you want to tune. The workload source can be a file containing Transact-SQL statements, a trace file generated by SQL Server Profiler, or a table of trace information, again generated by SQL Server Profiler. SQL Server Management Studio also has the ability to launch Database Engine Tuning Advisor to analyze an individual statement.
Recommendations
The recommendations that can be produced include suggested changes to the database such as new indexes, indexes that should be dropped, and depending on the tuning options you set, partitioning recommendations. The recommendations that are produced are provided as a set of Transact-SQL statements that would implement the suggested changes. You can view the Transact-SQL and save it for later review and application, or you can choose to implement the recommended changes immediately.
Be careful of applying changes to a database without detailed consideration, especially in production environments. Also, ensure that any analysis that you perform is based on appropriately sized workloads so that recommendations are not made based on partial information. Question: Why is it important to tune an entire workload rather than individual queries?
Demonstration 3B: Database Engine Tuning Advisor
Key Points
In this demonstration you will see how to use Database Engine Tuning Advisor.
Demonstration Steps
2. Open the 32 Demonstration 3B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Should you immediately apply the recommendations to your server?
Lab 8: Improving Performance through Nonclustered Indexes
Lab Setup
5. In the Virtual Machine Connection window, click the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
a. On the Action menu, click the Ctrl-Alt-Delete menu item.
b. Click Switch User, and then click Other User.
c. Log on using the following credentials:
i. User name: AdventureWorks\Administrator
ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
The marketing system includes a query that is constantly executed and is performing too slowly. It retrieves 5000 web log entries beyond a given starting time. Previously, a non-clustered index was created on the SessionStart column. When 100 web log entries were being retrieved at a time, the index was being used. The developer is puzzled that changing the request to 5000 entries at a time has caused SQL Server to ignore the index he built. You need to investigate the query and suggest the best non-clustered index to support the query. You will then test your suggestion. After you have created the new index, the developer noted the cost of the sort operation and tried to create another index that would eliminate the sort. You need to explain to him why SQL Server has decided not to use this index. Later you will learn to set up a basic query tuning trace in SQL Server Profiler and use the trace captured in Database Engine Tuning Advisor. If time permits, you will design a required nonclustered index.
Query 1: Query to test
DECLARE @StartTime datetime2 = '2010-08-30 16:27';
SELECT TOP(5000) wl.SessionID, wl.ServerID, wl.UserName
FROM Marketing.WebLog AS wl
WHERE wl.SessionStart >= @StartTime
ORDER BY wl.SessionStart, wl.ServerID;
Query 2: Index Design

CREATE INDEX IX_WebLog_Perf_20100830_B
ON Marketing.WebLog (ServerID, SessionStart)
INCLUDE (SessionID, UserName);
Query 3: Query to review

SELECT PostalCode, Country
FROM Marketing.PostalCode
WHERE StateCode = 'KY'
ORDER BY StateCode, PostalCode;
Exercise 1: Nonclustered index usage review

Scenario
The marketing system includes a query that is constantly executed and is performing too slowly. It retrieves 5000 web log entries beyond a given starting time. Previously, a non-clustered index was created on the SessionStart column. When 100 web log entries were being retrieved at a time, the index was being used. The developer is puzzled that changing the request to 5000 entries at a time has caused SQL Server to ignore the index he built. You need to investigate the query and suggest the best non-clustered index to support the query. You will then test your suggestion. The main tasks for this exercise are as follows:
1. Review the query.
2. Review the existing Index and Table structures.
3. Design a more appropriate index.
4. Test your design.
Task 1: Review the query

Review the Query 1 in the supporting documentation.
Task 2: Review the existing Index and Table structures

Review the existing Index and Table structures.
Task 3: Design a more appropriate index

Design a more appropriate index.
Task 4: Test your design

In the supporting documentation, use Query 1 to test your new index. Results: After this exercise, you have created a non-clustered index.
Exercise 2: Improving nonclustered index designs

Scenario
After you have created the new index, the developer noted the cost of the sort operation and tried to create another index that would eliminate the sort. You need to explain why SQL Server has decided not to use this index. The main tasks for this exercise are as follows:
1. Review the index design.
2. Implement the index.
3. Test the design and explain why the index was not used.
Task 1: Review the index design

In Query 2 in the supporting documentation, review the index design.
Task 2: Implement the index

Create the index as per the index design.
Task 3: Test the design and explain why the index was not used
Enable Include Actual Execution Plan. Execute the query. Review the Execution Plan and explain why the index was not used.
Results: After this exercise, you will understand why some indexes are not appropriate in some scenarios.
Exercise 3: SQL Server Profiler and Database Engine Tuning Advisor

Scenario
Query 3 is another important query. You need to investigate the query and suggest the best nonclustered index to support the query. You will then test your suggestion. The main tasks for this exercise are as follows:
1. Review the query.
2. Review the existing Index and Table structures.
3. Design a more appropriate index by following the Missing Index suggestion.
4. Create a better index that removes the sort operation. If you create another index, confirm that SQL Server selects it.
Task 1: Review the query

Review Query 3 in the supporting documentation.
Task 2: Review the existing Index and Table structures

Review the existing Index and Table structures.
Task 3: Design a more appropriate index by following the Missing Index suggestion
Review and implement the Missing Index that SQL Server has suggested. Test to ensure that the new index is being used.
Task 4: Create a better index that removes the sort operation. If you create another index, confirm that SQL Server selects it
Create a new index that will remove the Sort operation. Test to ensure that the new index is being used.
Results: After this exercise, you should have created a better index that will remove the sort operation.
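The Missing Index suggestion shown in the execution plan is also exposed through the missing index dynamic management views, which can be queried directly. The following sketch lists the suggestions recorded for the current database. The column aliases are illustrative, and the output depends on which queries have run since the instance last started.

```sql
-- Sketch: list missing index suggestions for the current database,
-- highest estimated benefit first.
SELECT  mid.statement          AS table_name,
        mid.equality_columns,   -- columns used in equality predicates
        mid.inequality_columns, -- columns used in range predicates
        mid.included_columns,   -- columns suggested for INCLUDE
        migs.user_seeks,
        migs.avg_user_impact    -- estimated percentage improvement
FROM    sys.dm_db_missing_index_details AS mid
JOIN    sys.dm_db_missing_index_groups AS mig
        ON mig.index_handle = mid.index_handle
JOIN    sys.dm_db_missing_index_group_stats AS migs
        ON migs.group_handle = mig.index_group_handle
WHERE   mid.database_id = DB_ID()
ORDER BY migs.avg_user_impact DESC;
```

These suggestions are derived from the query predicates and do not specify a key column order chosen to satisfy ORDER BY, which is one reason a hand-designed index may still be needed to remove a Sort operator.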
Challenge Exercise 4: SQL Server Profiler and Database Engine Tuning Advisor (Only if time permits)

Scenario
You will learn to set up a basic query tuning trace in SQL Server Profiler and to analyze the captured trace in Database Engine Tuning Advisor (DTA).
The main tasks for this exercise are as follows:
1. Open SQL Server Profiler and configure and start a trace.
2. Load and execute the workload file.
3. Stop and analyze the trace using DTA.
Task 1: Open SQL Server Profiler and configure and start a trace
Open SQL Server Profiler. Configure it to use the following settings:
a. Template: Tuning
b. Save To File: should be selected and any file name provided for a file on the desktop
c. Enable file rollover: Not selected
d. Maximum File Size: 500MB
e. Filter: DatabaseName LIKE MarketDev
Start the SQL Server Profiler Trace. Disable AutoScroll from the Window Menu.
Task 2: Load and execute the workload file

Load and execute the workload file 81 Lab Exercise 4.sql.
Task 3: Stop and analyze the trace using DTA

Stop the SQL Server Profiler trace. Analyze the trace results using DTA. Review the recommendations provided by the Database Engine Tuning Advisor.
Results: After this exercise, you should have created a SQL Server Profiler trace and analyzed the recommendations from the Database Engine Tuning Advisor.
Review Questions
1. What is a covering index?
2. Can a clustered index be a covering index?
Best Practices
1. Never apply Database Engine Tuning Advisor recommendations without further reviewing what is being suggested.
2. Record details of why and when you create any indexes. DBAs are hesitant to ever remove indexes without this knowledge.
3. When Database Engine Tuning Advisor suggests new statistics, this should be taken as a hint to investigate the indexing structure of the table.
4. If using an offline version of Books Online, ensure it is kept up to date.
