데이터베이스 개발자 Tip & 강좌

SQLER의 개발자들이 만들어가는 데이터베이스 사용자 Tip & 강좌 게시판입니다. SQL서버, Oracle, MySQL 등 여러 클라우드/오픈소스 기반 데이터베이스 개발 및 운영 관련 팁과 쿼리 노하우를 이곳에서 가장 먼저 접하실 수 있습니다. 많은 도움 되시길 바랍니다.

SQL Server Performance Counter Guidance

 

  • Version : SQL Server 2005, 2008, 2008R2, 2012

 

SQL Server를 운영하면서 우리는 어떤 항목을 모니터링 해야 할까? 그리고 이들의 경고 임계치 값은 얼마일까? 판단하는 기준은 어떻게 될까?

 

이번 포스트에서 제시하는 카운터의 값이 모든 서버에 적용된다고는 할 수 없다. 사용자는 매우다양한 환경을 가지고 있기 때문에 참고용으로 사용하며 항상 자신이 운영하는 서버의 하드웨어 사양 및 특징을 잘 파악하여 최적의 가이드라인을 스스로 만들어서 사용할 수 있기를 바란다.

 

성능 모니터 실행은 [시작] – [실행] –[pefmon]을 실행 한다.

 

다음 성능 카운터를 참고하여 자신이 운영하는 SQL Server를 모니터링 할 수 있도록 하자.

 

SQL Performance Counters

Object

Counter

Preferred Value

Description

SQLServer:Access Methods

Forwarded Records/sec

< 10 per 100 Batch Requests/Sec

Rows with varchar columns can experience expansion when varchar values are updated with a longer string. In the case where the row cannot fit in the existing page, the row migrates and access to the row will traverse a pointer. This only happens on heaps (tables without clustered indexes). Evaluate clustered index for heap tables. In cases where clustered indexes cannot be used, drop non-clustered indexes, build a clustered index to reorg pages and rows, drop the clustered index, then recreate non-clustered indexes.

SQLServer:Access Methods

Full Scans / sec

(Index Searches/sec)/(Full Scans/sec) > 1000

This counter monitors the number of full scans on base tables or indexes. Values greater than 1 or 2 indicate that we are having table / Index page scans. If we see high CPU then we need to investigate this counter, otherwise if the full scans are on small tables we can ignore this counter. A few of the main causes of high Full Scans/sec are
• Missing indexes
• Too many rows requested
Queries with missing indexes or too many rows requested will have a large number of logical reads and an increased CPU time.

SQLServer:Access Methods

Index Searches/sec

(Index Searches/sec)/(Full Scans/sec) > 1000

Number of index searches. Index searches are used to start range scans, single index record fetches, and to reposition within an index. Index searches are preferable to index and table scans. For OLTP applications, optimize for more index searches and less scans (preferably, 1 full scan for every 1000 index searches). Index and table scans are expensive I/O operations.

SQLServer:Access Methods

Page Splits/sec

< 20 per 100 Batch Requests/Sec

Number of page splits per second that occur as the result of overflowing index pages. Interesting counter that can lead us to our table / index design. This value needs to be low as possible. If you find out that the number of page splits is high, consider increasing the fillfactor of your indexes. An increased fillfactor helps to reduce page splits because there is more room in data pages before it fills up and a page split has to occur.

Note that this counter also includes the new page allocations as well and doesn't necessarily pose a problem. The other place we can confirm the page splits that involve data or index rows moves are the fragmented indexes on page splits.

SQL Server:Buffer Manager

Buffer Cache hit ratio

> 90%

This counter indicates how often SQL Server goes to the buffer, not the hard disk, to get data. The higher this ratio, the less often SQL Server has to go to the hard disk to fetch data, and performance overall is boosted. Unlike many of the other counters available for monitoring SQL Server, this counter averages the Buffer Cache Hit Ratio from the time the last instance of SQL Server was restarted. In other words, this counter is not a real-time measurement, but an average of all the days since SQL Server was last restarted. In OLTP applications, this ratio should exceed 90-95%. If it doesn't, then you need to add more RAM to your server to increase performance. In OLAP applications, the ratio could be much less because of the nature of how OLAP works. In any case, more RAM should increase the performance of SQL Server OLAP activity.

SQL Server:Buffer Manager

Free list stalls/sec

< 2

Free list stalls/sec is the frequency with which requests for available database pages are suspended because no buffers are available. Free list stall rates of 3 or 4 per second indicate too little SQL memory available.

SQL Server:Buffer Manager

Free pages

> 640

Total number of pages on all free lists.

SQL Server:Buffer Manager

Lazy Writes/Sec

< 20

This counter tracks how many times a second that the Lazy Writer process is moving dirty pages from the buffer to disk in order to free up buffer space. Generally speaking, this should not be a high value, say more than 20 per second or so. Ideally, it should be close to zero. If it is zero, this indicates that your SQL Server's buffer cache is plenty big and SQL Server doesn't have to free up dirty pages, instead waiting for this to occur during regular checkpoints. If this value is high, then a need for more memory is indicated.

SQL Server:Buffer Manager

Page Life Expectancy

> 300

This performance monitor counter tells you, on average, how long data pages are staying in the buffer. If this value gets below 300 seconds, this is a potential indication that your SQL Server could use more memory in order to boost performance.

SQLServer:Buffer Manager

Page lookups/sec

(Page lookups/sec) / (Batch Requests/sec) < 100

Number of requests to find a page in the buffer pool. When the ratio of page lookups to batch requests is much greater than 100, this is an indication that while query plans are looking up data in the buffer pool, these plans are inefficient. Identify queries with the highest amount of logical I/O's and tune them.

SQL Server:Buffer Manager

Page reads/sec

< 90

Number of physical database page reads issued. 80 – 90 per second is normal, anything that is above indicates indexing or memory constraint.

SQL Server:Buffer Manager

Page writes/sec

< 90

Number of physical database page writes issued. 80 – 90 per second is normal, anything more we need to check the lazy writer/sec and checkpoint counters, if these counters are also relatively high then, it's memory constraint.

SQLServer:General Statistics

Logins/sec

< 2

> 2 per second indicates that the application is not correctly using connection pooling.

SQLServer:General Statistics

Logouts/sec

< 2

> 2 per second indicates that the application is not correctly using connection pooling.

SQLServer:General Statistics

User Connections

See Description

The number of users currently connected to the SQL Server.

Note: It is recommended to review this counter along with "Batch Requests/Sec". A surge in "user connections" may result in a surge of "Batch Requests/Sec". So if there is a disparity (one going up and the other staying flat or going down), then that may be a cause for concern. With a blocking problem, for example, you might see user connections, lock waits and lock wait time all increase while batch requests/sec decreases.

SQL Server:Latches

Latch Waits/sec

(Total Latch Wait Time) / (Latch Waits/Sec) < 10

This is the number of latch requests that could not be granted immediately. In other words, these are the amount of latches, in a one second period that had to wait.

SQL Server:Latches

Total Latch Wait Time (ms)

(Total Latch Wait Time) / (Latch Waits/Sec) < 10

This is the total latch wait time (in milliseconds) for latch requests in the last second

SQL Server:Locks

Lock Wait Time (ms)

See Description"

Total wait time (milliseconds) for locks in the last second.

Note: For "Lock Wait Time" it is recommended to look beyond the Avg value. Look for any peaks that are close (or exceeds) to a wait of 60 sec. Though this counter counts how many total milliseconds SQL Server is waiting on locks during the last second, but the counter actually records at the end of locking event. So most probably the peaks represent one huge locking event. If those events exceeds more than 60seconds then they may have extended blocking and could be an issue. In such cases, thoroughly analyze the blocking script output. Some applications are written for timing out after 60 seconds and that's not acceptable response for those applications.

SQL Server:Locks

Lock Waits/sec

0

This counter reports how many times users waited to acquire a lock over the past second. Note that while you are actually waiting on the lock that this is not reflected in this counter—it gets incremented only when you "wake up" after waiting on the lock. If this value is nonzero then it is an indication that there is at least some level of blocking occurring. If you combine this with the Lock Wait Time counter, you can get some idea of how long the blocking lasted. A zero value for this counter can definitively prove out blocking as a potential cause; a nonzero value will require looking at other information to determine whether it is significant.

SQL Server:Locks

Number of Deadlocks/sec

< 1

The number of lock requests that resulted in a deadlock.

SQLServer:Memory Manager

Total Server Memory(KB)

See Description

The Total Server Memory is the current amount of memory that SQL Server is using. If this counter is still growing the server has not yet reached its steady-state, and it is still trying to populate the cache and get pages loaded into memory. Performance will likely be somewhat slower during this time since more disk I/O is required at this stage. This behavior is normal. Eventually Total Server Memory should approximate Target Server Memory.

SQLServer:SQL Statistics

Batch Requests/Sec

See Description

This counter measures the number of batch requests that SQL Server receives per second, and generally follows in step to how busy your server's CPUs are. Generally speaking, over 1000 batch requests per second indicates a very busy SQL Server, and could mean that if you are not already experiencing a CPU bottleneck, that you may very well soon. Of course, this is a relative number, and the bigger your hardware, the more batch requests per second SQL Server can handle. From a network bottleneck approach, a typical 100Mbs network card is only able to handle about 3000 batch requests per second. If you have a server that is this busy, you may need to have two or more network cards, or go to a 1Gbs network card.

Note: Sometimes low batch requests/sec can be misleading. If there were a SQL statements/sec counter, this would be a more accurate measure of the amount of SQL Server activity. For example, an application may call only a few stored procedures yet each stored procedure does lot of work. In that case, we will see a low number for batch requests/sec but each stored procedure (one batch) will execute many SQL statements that drive CPU and other resources. As a result, many counter thresholds based on the number of batch requests/sec will seem to identify issues because the batch requests on such a server are unusually low for the level of activity on the server.

We cannot conclude that a SQL Server is not active simply by looking at only batch requests/sec. Rather, you have to do more investigation before deciding there is no load on the server. If the average number of batch requests/sec is below 5 and other counters (such as SQL Server processor utilization) confirm the absence of significant activity, then there is not enough of a load to make any recommendations or identify issues regarding scalability.

SQLServer:SQL Statistics

SQL Compilations/sec

< 10% of the number of Batch Requests/Sec

The number of times per second that SQL Server compilations have occurred. This value needs to be as low as possible. If you see a high value such as over 100, then it's an indication that there are lots of adhoc queries that are running, might cause CPU usage, solution is to re-write these adhoc as stored procedure or use sp_executeSQL.

SQLServer:SQL Statistics

SQL Re-Compilations/sec

< 10% of the number of SQL Compilations/sec

This needs to be nil in our system as much as possible. A recompile can cause deadlocks and compile locks that are not compatible with any locking type.

 

 

[참고자료]

http://www.grumpyolddba.co.uk/monitoring/Performance%20Counter%20Guidance%20-%20SQL%20Server.htm

 

 



강성욱 / jevida@naver.com
Microsoft SQL Server MVP
Blog : http://sqlmvp.kr
Facebook : http://facebook.com/sqlmvp

No. Subject Author Date Views
1850 프로파일러를 이용한 중첩된 프로시저 디버깅 jevida(강성욱) 2016.10.07 1542
1849 SAN 스토리지 성능 모니터 - SAN 스토리지를 사용하는 경우 성능 카운터를 어떻게 모니터링 할까? jevida(강성욱) 2016.10.07 1720
1848 저장된 Plan Cache 확인 및 활용 jevida(강성욱) 2016.10.07 4497
1847 Xp_fixeddrives 세부 정보 확인하기 jevida(강성욱) 2016.10.07 1715
1846 강제 매개변수화로 인한 성능 저하 사례 jevida(강성욱) 2016.10.07 1537
1845 파라메터 스니핑과 데이터 스큐 jevida(강성욱) 2016.10.07 1528
1844 DBCC CHECKDB 버그 및 해결 방법 jevida(강성욱) 2016.10.07 1536
1843 NOLOCK HINT 이해 jevida(강성욱) 2016.10.07 7700
1842 인증으로부터 분리된 사용자 방지 jevida(강성욱) 2016.10.07 2103
1841 비관리자 계정으로 쿼리 계획 보기 jevida(강성욱) 2016.10.07 1241
1840 SSMS 폴링 간격 구성 jevida(강성욱) 2016.10.07 1770
1839 Deadlock 감지하여 알림하기 jevida(강성욱) 2016.10.07 1766
1838 Suspect_pages 테이블 이해 및 관리 jevida(강성욱) 2016.10.07 1370
1837 SSRS SocketException jevida(강성욱) 2016.10.07 1469
1836 파티션 분할 시 I/O 최소화 하기 jevida(강성욱) 2016.10.07 1854
1835 대량 BCP 작업 시 발생하는 오류 (665, 1450, 33) jevida(강성욱) 2016.10.07 1643
1834 기본 추적(default tace) 활성화 및 로그 확인 jevida(강성욱) 2016.10.07 1388
1833 SQL Server ALTER TABLE syntax diagrams jevida(강성욱) 2016.10.07 1157
» SQL Server Performance Counter Guidance jevida(강성욱) 2016.09.30 2467
1831 SQL Server CREATE TABLE syntax diagrams jevida(강성욱) 2016.09.30 1421





XE Login