While nosing around in the "Tuning options for SQL Server 2005 and SQL Server 2008 when running in high performance workloads" article in Microsoft’s Knowledge Base, I found the following flag, available for Windows Server 2003 machines with at least 8 GB of RAM running SQL Server 2005 or above.
Trace flag 834: Use Microsoft Windows large-page allocations for the buffer pool
Trace flag 834 causes SQL Server to use Microsoft Windows large-page allocations for the memory that is allocated for the buffer pool. The page size varies depending on the hardware platform, but the page size may be from 2 MB to 16 MB. Large pages are allocated at startup and are kept throughout the lifetime of the process. Trace flag 834 improves performance by increasing the efficiency of the translation look-aside buffer (TLB) in the CPU.
First, a better explanation of what the TLB is, how its efficiency can suffer, and how allocating large pages helps:
In the CPU, there’s a small cache of page-to-memory-location translations called the Translation Lookaside Buffer. If you have more pages than will fit in it, only the most recently used addresses are kept in the CPU, and the whole translation table lives elsewhere, in main memory. Just like data served out of SQL Server’s buffer pool is accessed much more quickly than data served off a disk, addresses in the TLB are resolved much more quickly than addresses that have to be retrieved from the full table in main memory. The fewer memory pages you have, the more likely that they will all fit in the buffer in the CPU, avoiding those costly trips out to look at the full translation table. The 834 trace flag tells SQL Server to allocate larger pages for its buffer pool (2 MB to 16 MB, depending on your hardware, rather than the standard 4 KB on x86/x64 or 8 KB on Itanium) so that there will be fewer of them.
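To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes an 8 GB buffer pool, 4 KB standard pages (the x86/x64 default), and a TLB with 512 entries. The entry count is a made-up illustrative figure, not a spec for any particular CPU, and real processors often keep a separate, smaller set of entries for large pages, so treat the "reach" numbers as a simplification.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def pages_needed(memory_bytes, page_bytes):
    """Number of pages required to map memory_bytes (rounded up)."""
    return -(-memory_bytes // page_bytes)

def tlb_reach(entries, page_bytes):
    """Bytes of memory the TLB can translate without a miss."""
    return entries * page_bytes

BUFFER_POOL = 8 * GB   # assumption: the 8 GB minimum mentioned in the article
TLB_ENTRIES = 512      # assumption: illustrative only, varies by CPU

for label, size in [("4 KB", 4 * KB), ("2 MB", 2 * MB), ("16 MB", 16 * MB)]:
    count = pages_needed(BUFFER_POOL, size)
    reach_mb = tlb_reach(TLB_ENTRIES, size) // MB
    print(f"{label:>5} pages: {count:>9,} pages, TLB reach {reach_mb:,} MB")
```

Under these assumptions, the same buffer pool needs about two million 4 KB pages but only a few thousand 2 MB pages, which is why far more of the translations can stay in the CPU.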
Sounds intriguing! Would my high performance system benefit from this? The information out there is pretty slim, but here’s what I dug up.
- This Usenet posting by a SQL MVP suggests that you should only use it if your system is CPU-bound rather than IO-bound and your signal to resource wait times are high. To see if your signal to resource wait times are high, shimmy on over to the sys.dm_os_wait_stats DMV:
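-- Per-wait-type ratio; 1.0 * forces decimal division (both columns are bigint)
SELECT wait_type, 1.0 * signal_wait_time_ms / wait_time_ms AS signal_ratio
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY wait_type;
-- Overall ratio across all wait types
SELECT 1.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms) AS overall_signal_ratio
FROM sys.dm_os_wait_stats;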
Mine are all zero, and my production cluster sees about 1500 batch requests per second during its peak use.
- Monitoring for translation lookaside buffer misses is also mentioned, but I can’t find any way to do this on a Windows system. There are a few Intel Pentium 4 manuals and an O’Reilly system tuning book that mention the existence of the TLB, but there are no perfmon counters to monitor. The O’Reilly book does mention a tool for Solaris.
- Bob Ward has a post on the PSS blog regarding trace flag 834.
My conclusion is that this wouldn’t benefit me. Check your CPU usage, I/O stalling, I/O waits and signal-to-resource ratios to see if it would help you.