SQL Server provides numerous counters for monitoring and evaluating performance of your system. Although there are no hard and fast rules for performance monitoring, you should audit your system periodically to ensure that it lives up to user expectations. The first step in performance monitoring efforts should be collecting baseline information. A baseline has a collection of performance counter values while the system isn't heavily utilized and performs well. Next you should take benchmarks at various times of day and during different days of week to figure out acceptable ranges for the performance counters you intend to monitor. For example, your tests could reveal that buffer cache hit ratio values between 93.5% and higher are acceptable, but any value below 93.5% indicates sub-optimal performance.
Then setup continuous monitoring and define alerts which notify you if any performance counter values fall below (or rise above) a predefined threshold. Although we try to provide some guidelines for performance counter values adequate range for each counter will vary from one environment to next; this is why it's important to create a baseline and determine acceptable values for each counter.
You should watch several memory related counters very closely as they present a clear indication of acceptable or sub-optimal performance. Memory bottleneck typically leads to excessive paging. You can reduce or eliminate memory related performance issues by:
- Adding memory to the server computer.
- Altering SQL Server and operating system configuration.
- Eliminating unnecessary processes on the database server.
- Directing some user activity from the current server to another server.
A large majority of these memory performance counters are available through Memory Manager, Plan Cache and Buffer Manager objects.
SQL Server Buffer Manager provides the following counters:
AWE lookup maps / sec
Shows the number of times per second data pages requested by SQL Server were found in the buffer pool and then mapped as AWE memory and made part of server's virtual address space.
AWE stolen maps / sec
Shows the number of times data pages were taken from the free list in buffer pool and mapped as AWE memory.
AWE unmap calls / sec
When a buffer is unmapped it is excluded from virtual address space; buffers are unmapped because they haven't been used and to make room for new buffers. This counter shows the number of times buffers were un-mapped per second.
AWE unmap pages / sec
Shows the number of data pages unmapped per second.
AWE write maps / sec
Shows the number of times per second that it is necessary to map in a dirty buffer so it can be written to disk. This means that no free buffers are available so SQL Server has to write to a "dirty" buffer, meaning a buffer that contains data changes. If the value is high, then the server could use additional memory.
Note: AWE related counters are only available if AWE enabled option is turned on.
Buffer cache hit ratio
Percentage of pages that were found in the buffer pool without having to incur a read from disk. The ratio is calculated as total number of cache hits divided by the total number of cache lookups over the last few thousand page accesses. Since reading from the cache is much less expensive than reading from disk, servers that have a high buffer cache hit ratio perform well. For optimal performance this ratio should remain above 99%. However, in most environments a number above 95% is acceptable. If you use AWE enabled option you should never see this counter drop below 99% unless the total amount of RAM is not sufficient for your application's load.
Checkpoint pages per second
Shows how many pages are flushed per second by the CHECKPOINT process. If this value is too high, then you might want to increase the recovery interval or dedicate more RAM to SQL Server. By increasing the value of recovery interval, you advise SQL Server to checkpoint the databases less frequently thereby reserving more resources for user applications. The flip side of the coin, however, is that databases will take longer to recover in case of server failure (recovery process entails rolling back failed transactions and rolling forward the successful transactions).
Database pages
Number of pages in the buffer pool with database content. The higher the value of this counter, the more queries are sent to your server. If the number keeps increasing considerably above your baseline, consider allowing more memory for SQL Server usage by modifying the value of "max server memory" setting.
Free list stalls / sec
Number of times a request for a "free" page had to wait for one to become available. High value indicates that the server could use additional memory.
Lazy writes / sec
How many times per second lazy writer has to flush dirty pages out of the buffer cache instead of waiting on a checkpoint. Typically dirty pages are flushed out of cache by the checkpoint process. If the value of this counter is higher than 20, then server could use additional RAM.
Look at Lazy Writes/sec which tells you that number of times the buffer pool flushed dirty pages to disk outside of the CHECKPOINT process. This should be near zero.
Page life expectancy
Shows on average how long each data page is staying in buffer cache before being flushed out to make room for other pages. If value is less than 300 seconds than additional memory could be helpful.
Page lookups / sec
Number of requests per second to find the page in the buffer pool.
Page reads / sec
Shows the number of data pages read per second.
Page writes / sec
Shows the number of data pages written per second.
Readahead pages / sec
Shows the number of data pages read per second in anticipation of their use.
Procedure cache pages
Number of pages used to store compiled queries; indicates the total number of stored procedures in cache.
Reserved pages
Shows the number of buffer pool reserved pages.
Stolen pages
Number of pages acquired from buffer cache and used for miscellaneous server purposes (including procedure cache).
Target pages
Shows the ideal number of pages in buffer pool.
Total pages
Total number of pages in buffer pool (including data pages, free and stolen pages).
Another group of important memory related counters are exposed through Cache Manager object in SQL Server 2000 and Plan Cache object with SQL Server 2005. These objects allow you to monitor the memory usage for storing ad-hoc Transact-SQL statements, stored procedures and triggers.
Cache Hit Ratio
Ratio between cache hits and lookups. The value of this counter should never fall below 90%.
Cache Object Counts
Shows the number of objects in cache.
Cache Pages
Number of 8KB pages in cache.
Cache Objects in use
Shows the number of objects in cache that are currently being used.
Each counter has the following instances:
- _total - information on all types of instances.
- SQL Plans - query plans for ad-hoc Transact-SQL statements, or Transact-SQL statements prepared using sp_prepare or sp_cursorprepare system procedures. The execution plans are cached for reuse by identical statements executed at a later time. With SQL Server 2000 this instance is called "ad hoc SQL Plans".
- Object Plans - query plans for stored procedures, user-defined functions and triggers. With SQL Server 2000 this instance is further broken down into procedure plans, replication procedure plans and trigger plans.
- Bound Trees - normalized trees for views, rules, computed columns and check constraints. With SQL Server 2000 this instance is called "misc. Normalized Trees".
- Extended stored procedures - if you use extended procedures this instance will show the number of these objects referenced in cache.
- Temporary tables & table variables - information for temp tables and table variables referenced in cache.
SQL Server Memory Manager object exposes the following counters:
Connection Memory
Total amount of dynamic memory the server is using for maintaining connections
Granted Workspace Memory
Total amount of memory currently granted to executing processes such as hash, sort, bulk copy, and index creation operations.
Lock Blocks
A lock block is a single locked object - a table, row, page, etc. This counter shows the total number of currently used lock blocks.
Lock Blocks Allocated
Current number of allocated lock blocks. By default SQL Server allocates locks dynamically as needed. However, you can override the default behavior by modifying locks configuration parameter.
Memory Grants Outstanding
Number of processes that have successfully acquired workspace memory grant. High value of this counter indicates peak user activity.
Memory Grants Pending
Number of processes waiting on a workspace memory grant. If this counter has a high value SQL Server could benefit from additional memory. You should first examine the database design, queries and indexes however to ensure the system is properly tuned before purchasing additional RAM.
Optimizer Memory
Memory reserved for query optimization. Ideally, the value of this counter should remain relatively static. If this isn't the case you might be using dynamic SQL execution excessively.
SQL Cache Memory
Total memory reserved for dynamic SQL statements.
Target Server Memory
Total amount of memory in KB that the server can consume.
Total Server Memory
The amount of committed memory in KB from the buffer pool. If this counter has a high values, as compared to the total amount of RAM available on server then SQL Server can benefit from additional memory.
The Target Server Memory counter shows SQL Server estimate of how much RAM it would need to operate effectively. The total memory counter shows what is actually used. The goal is to keep total memory less than target memory. If over time total memory number is higher than target your server could benefit from additional RAM.
In addition to examining SQL Server specific counters related to memory, you should also inspect the value of the following operating system level counters:
Total Server Memory (KB) counter < Target Server Memory (KB) counter=That means SQL Server has enough memory to run efficiently
Total Server Memory (KB) counter is >= Target Server Memory (KB) counter = This indicates that SQL Server may be under memory pressure and could use access to more physical memory.
Page faults / sec
Shows the average number of pages faulted per second. This counter includes both hard faults (those that require disk access) and soft faults (where the faulted page is found elsewhere in physical memory.) Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause significant delays. You need to ensure this value stays as low as possible.
Pages / sec
Shows the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays. Although some spikes are normal the value of this counter should remain close to zero.
Look at Free Pages/sec and Free List Stalls/sec. You don't want to see Free Pages bottom out which will result in a Free List Stall while the buffer pool has to free pages for usage.
Available Bytes
The amount of physical memory, in bytes, immediately available for allocation to a process or for system use. Your system should maintain at least 50MB of available memory. If the value falls below this threshold then the server is short on memory or the application (SQL Server) isn't releasing the memory back to the operating system as it should.
Process: Working Set
Shows the current size of the Working Set of the process in bytes. The Working Set is the set of memory pages touched recently by the threads in the SQL Server process. If the value of this counter is consistently below the amount configured by min server memory option then SQL Server is configured to have more memory than it really needs.