Friday, February 15, 2013

Internals of TempDB

Tempdb is a critical resource in SQL Server.

It is used internally by the database engine for many operations, and it can consume a lot of disk space. In the past two weeks I encountered three different scenarios in which tempdb had grown very large, so I decided to write about troubleshooting such scenarios.
Before I describe the methods for troubleshooting tempdb space usage, let’s begin with an overview of the types of objects that consume space in tempdb. There are 3 types of objects stored in tempdb:
  • User Objects
  • Internal Objects
  • Version Stores
A user object can be a temporary table, a table variable or a table returned by a table-valued function. It can also be a regular table created in the tempdb database. A common misconception is that table variables (@) do not consume space in tempdb, as opposed to temporary tables (#), because they are only stored in memory. This is not true: table variables are stored in tempdb as well. But there are two important differences between temporary tables and table variables when it comes to space usage:
  1. Indexes and statistics on temporary tables also consume space in tempdb. Table variables have no statistics, and (as of SQL Server 2012) their only indexes are the ones that back PRIMARY KEY or UNIQUE constraints declared together with the variable, so they normally carry far less of this overhead.
  2. The scope of a temporary table is the session in which it has been created (or the stored procedure, if it was created inside one), while the scope of a table variable is the batch, procedure or function in which it has been declared. This means that a temporary table consumes space in tempdb as long as the session is still open (or until the table is explicitly dropped), while a table variable’s space in tempdb is deallocated as soon as the batch ends.
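The scope difference can be observed with a small experiment. The following is a minimal sketch, assuming it is run from SSMS or sqlcmd (GO is the client-side batch separator, not a T-SQL statement); the names #demo and @demo are hypothetical. Counters in sys.dm_db_session_space_usage are updated when a task finishes and deallocation may be deferred, so the numbers can lag slightly.

```sql
-- Batch 1: a temporary table keeps its tempdb pages until the session ends
-- or the table is dropped. (#demo is a hypothetical name.)
CREATE TABLE #demo (id int, filler char(4000));
INSERT INTO #demo (id, filler)
SELECT TOP (100) object_id, 'x' FROM sys.all_objects;
GO

-- Batch 2: a table variable also allocates pages in tempdb,
-- but they are released when this batch ends. (@demo is a hypothetical name.)
DECLARE @demo TABLE (id int, filler char(4000));
INSERT INTO @demo (id, filler)
SELECT TOP (100) object_id, 'x' FROM sys.all_objects;
GO

-- Batch 3: page counters for the current session.
-- Allocations from both batches are counted; the deallocations should
-- correspond to the table variable only, since #demo still exists.
SELECT user_objects_alloc_page_count,
       user_objects_dealloc_page_count
FROM sys.dm_db_session_space_usage
WHERE session_id = @@SPID;
GO

-- Clean up the temporary table explicitly.
DROP TABLE #demo;
GO
```

Running the SELECT again after the DROP TABLE should show the deallocation counter catching up with the allocation counter.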
Internal objects are created and managed by SQL Server internally. Their data or metadata cannot be accessed. Here are some examples of internal objects in tempdb:
  • Query Intermediate Results for Hash Operations
  • Sort Intermediate Results
  • Contents of LOB Data Types
  • Query Result of a Static Cursor
Unlike user objects, operations on internal objects in tempdb are not logged, since they do not need to be rolled back. But internal objects do consume space in tempdb. Each internal object occupies at least 9 pages (one IAM page and 8 data pages). tempdb can grow substantially due to internal objects when queries that process large amounts of data are executed on the instance, depending on the nature of the queries.
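To see which running requests are currently allocating internal objects, the dynamic management view sys.dm_db_task_space_usage can be aggregated per request. This is only a sketch; the column aliases are my own, and the 8 KB factor is the SQL Server page size.

```sql
-- Internal-object pages allocated in tempdb by currently running requests.
-- A parallel query has several tasks per request, hence the SUM.
SELECT t.session_id,
       t.request_id,
       SUM(t.internal_objects_alloc_page_count)   AS internal_alloc_pages,
       SUM(t.internal_objects_dealloc_page_count) AS internal_dealloc_pages,
       -- Net pages still held, converted to KB (8 KB per page)
       SUM(t.internal_objects_alloc_page_count
           - t.internal_objects_dealloc_page_count) * 8 AS internal_net_kb
FROM sys.dm_db_task_space_usage AS t
GROUP BY t.session_id, t.request_id
HAVING SUM(t.internal_objects_alloc_page_count) > 0
ORDER BY internal_alloc_pages DESC;
```

Requests with large sorts or hash operations that spill to disk would be expected to appear at the top of this list.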
Version stores are used for storing row versions generated by transactions in any database on the instance. The row versions are required by features such as snapshot isolation, AFTER triggers and online index builds. Row versions are stored in tempdb only when one of these features requires them.
As long as there are row versions to be stored, a new version store is created in tempdb approximately every minute. These version stores are similar to internal objects in many ways. Their data and metadata cannot be accessed, and operations on them are not logged. The difference is, of course, the data that is stored in them.
When a transaction that needs to store row versions begins, it stores its row versions in the current version store (the one that has been created in the last minute). This transaction will continue to store row versions in the same version store as long as it’s running, even if it runs for 10 minutes. So the size of each version store is determined by the number and duration of transactions that began in the relevant minute, and also by the amount of data modified by those transactions.
Version stores that are not needed anymore are deallocated periodically by a background process. This process deallocates complete version stores, not individual row versions. So a single long-running transaction can keep an entire version store alive, and in some cases it might take a while until that version store is deallocated.
There are two types of version stores. One type is used to store row versions for tables that undergo online index build operations. The second type is used for all other scenarios.
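One way to watch the version store grow and shrink is through the performance counters that SQL Server exposes in sys.dm_os_performance_counters. The sketch below assumes the version-store counters live under the Transactions counter object and have names beginning with "Version" (for example "Version Store Size (KB)"); check the actual names returned on your instance.

```sql
-- Version store counters for the instance.
-- The object name is prefixed differently on default and named instances,
-- hence the LIKE pattern.
SELECT RTRIM(counter_name) AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Transactions%'
  AND counter_name LIKE 'Version%'
ORDER BY counter_name;
```

Note that per-second counters are exposed here as cumulative raw values, so a rate has to be derived from two samples taken some time apart.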
Since the release of SQL Server 2005 there are 3 dynamic management views that make the task of troubleshooting tempdb space usage quite easy. The views are:
  • sys.dm_db_file_space_usage
  • sys.dm_db_session_space_usage
  • sys.dm_db_task_space_usage
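All 3 views return a column named “database_id”, so you might think that they return information for all the databases in the instance, right? Wrong! In SQL Server 2005 and 2008 these views return information for the tempdb database only, so the value returned in this column is always “2” (the database ID of tempdb). Starting with SQL Server 2012, sys.dm_db_file_space_usage reports on the database in whose context it is queried, so make sure to query it from tempdb.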
The first view (sys.dm_db_file_space_usage) returns space usage information for each data file in tempdb. It gives a high level distribution of the space occupied by tempdb.
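As an illustration, the following sketch sums the page counts across all tempdb data files and converts them to megabytes (a page is 8 KB). The column aliases are my own; the three-part name is used so that the query reads tempdb regardless of the current database context.

```sql
-- High-level breakdown of tempdb space by object type, in MB.
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024.0 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024.0 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024.0 AS version_store_mb,
       SUM(mixed_extent_page_count)             * 8 / 1024.0 AS mixed_extents_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024.0 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;
```

Whichever of the first three columns dominates tells you which of the object types described above to investigate first.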
