SQL Server 2016 Large RAM  Checkpoint

 

·         Version : SQL Server 2016, 2014, 2012

 

SQL Server Checkpoint 현재 데이터베이스의 buffer pool 있는 커밋되지 않는 데이터 페이지를 디스크로 플러시하는 작업이다하드웨어 사양이 고급화 되면서  TB이상의  메모리를 사용하는 경우가 늘어나고 있는데,  많은 양의 메모리가 버퍼캐시로 할당되어 실행될  SQL Server에서 발생할  있는 DB 체크포인트와 관련한 내용에 대해서 알아본다.

 포스트는 CSS SQL Server Engineers 내용을 정리한것으로 자세한 내용은 원문을 참고하길 바란다.

·         SQL Server : large RAM and DB Checkpointing : https://blogs.msdn.microsoft.com/psssql/2017/06/29/sql-server-large-ram-and-db-checkpointing/

 

FlushCache라고 하는 Database Checkpoint ‘Legacy(레거시)’ 구현은 특정 데이터베이스의 체크포인트를 실행하기 위해 SQL Server 전체 버퍼풀을 검사해야한다체크포인트의 지속 시간은 일반적으로 요청하는 더티페이지 수에 의해 결정되지만 대형 RAM 사용하는 프로덕션 서버의 경우 RAM크리가  TB 값에 도달하면서 체크포인트 시간 자체가  의미가 있게되었다.  스캔 지속시간은 버퍼풀의 크기와 직접 연관이 있으며 최근 관찰된 값은2TB 메모리의 경우  12 정도였으며 시스템마다 다를  있다.

DPM(Data Protection Manager)이라는 새로운 간접 체크포인트 옵션은  이상 버퍼풀 검사에  존하지 않는다. SQL Server 2012에서 처음 도입된 Indirection checkpoint 완전한 DPM 사용할  있는 것은 아니었다.  SQL Server 2016에서는 데이터베이스 생성시 기본적으로 간접 체크포인트가 활성화되는 DPM 방식을 사용하여 대용량 메모리를 사용하더라도 지연이 발생하지 않는다.

DPM 활성화 하기 위해서는 아래 요구사항이 충족되어야 한다.

required build

·         SQL Server 2012 : SP3 CU3 and later

·         SQL Server 2014 : RTM CU14 or SP1 CU7 and later

Indirect Checkpoint activated for the relevant Databases

·         ALTER DATABASE Dbname SET TARGET_RECOVERY_TIME = 60 SECONDS

·         Any non-zero value activates Indirect Checkpoint

Traceflag 3449 enabled

·         DBCC TRACEON (3449,-1)

·         (the traceflag activation/deactivation has immediate effect without service restart needed)

·         or more likely, startup parameter for the instance (-T).

 

FlushCache 통계를 에러로그에 기록하기 위해서는 특정 TF 적용하여 사용할  있다. SQL Server 2012이상에서는 실제로 매우 느릴때 해당 메시지를 자동으로 생성하기도 한다. TF 추가 로깅이기 때문에 다른 시스템에 영향을 미치지 않고 안전하다.

 

SQL Server 2012 또는 2014에서는 TF3504 적용하였을때 아래와 같은 메시지가 기록된다기록된 로그를 살펴보면  FlushCache  수행되고 있음을 알수 있지만 실제로FlushCache 안정적으로 진행되는지에 대해서는 알수가 없다또한 대형 RAM 사용하는 서버의 경우 1ms   정도 지연될수도 있다. 0 bufs write 나타낸 경우 주로 디스크 입출력 지연과 관계되어 있지 않기 때문에 buffer pool 스캔에 묶여 있다고 판단할  있다.

DBCC TRACEON(3504,-1)

Go

Checkpoint

 

FlushCache: cleaned up 0 bufs with 0 writes in 1 ms (avoided 0 new dirty bufs) for db 8:0

average throughput:   0.00 MB/sec, I/O saturation: 0, context switches 0

last target outstanding: 2, avgWriteLatency 858993

 

SQL Server 2014이상 사용자는 TF3502 추가적으로 사용하여 FlushCache 또는 DPM 구현을 사용하는지 관계없이 Checkpoint Operation 발생하는것을 반영한다. DPM 경로의 경우TF 3504만으로 보고된 것이 없기 때문에 TF3502 함께 사용하면 유용하다.

DBCC TRACEON(3502,-1)

DBCC TRACEON(3504,-1)

GO

checkpoint

 

Ckpt dbid 9 started

About to log Checkpoint begin.

Ckpt dbid 9 phase 1 ended (8)

FlushCache: cleaned up 0 bufs with 0 writes in 1 ms (avoided 0 new dirty bufs) for db 9:0

           average writes per second:   0.00 writes/sec

           average throughput:   0.00 MB/sec, I/O saturation: 0, context switches 0

           bandwidth server limit:         1.00 MB/sec

About to log Checkpoint end.

Ckpt dbid 9 complete

 

SQL Server 2016에서는 TF3504 정보가 업그레이드되어  체크포인트에 대해 FlushCache 또는 DPM메시지를 생성하므로 TF3504하나로 모든 메시지를 얻을  있다.

DBCC TRACEON(3504,-1)

raiserror('=====================================Separator',10,1) with LOG

use master -- DBID 1 by default not enabled for indirect checkpoint on SQL 2016

checkpoint

raiserror('=====================================Separator',10,1) with LOG

use test -- DBID 10, using default indirect checkpoint

checkpoint

raiserror('=====================================Separator',10,1) with LOG

 

DBCC TRACEON 3504, server process ID (SPID) 54. This is an informational message only; no user action is required.

=====================================Separator

FlushCache: cleaned up 0 bufs with 0 writes in 1 ms (avoided 0 new dirty bufs) for db 1:0

           average writes per second:   0.00 writes/sec

                      average throughput:   0.00 MB/sec, I/O saturation: 0, context switches 0

           last target outstanding: 2, avgWriteLatency 0

=====================================Separator

DirtyPageMgr::ForceCatchupOrFlushCache: cleaned up 0 dirty pages in 0 ms for db 10

=====================================Separator

 

대형 RAM 사용하는 서버의 경우체크포인트 로그 정보를 활용하여 DPM 활성여부를 판단하여 체크포인트에 대한 지연을 방지   있도록 한다.

 

[참고자료]

·         https://blogs.msdn.microsoft.com/psssql/2017/06/29/sql-server-large-ram-and-db-checkpointing/

·         SQL Server2012 Checkpoint 제어 : http://sqlmvp.kr/140171578379

·         Checkpoint 추적하기 : http://sqlmvp.kr/140192370231

 

 

2017-11-07 / 강성욱 http://sqlmvp.kr / http://sqlangeles.com

 

 

 

 

SQL Server, MSSQL, SQL 2016, Larger RAM Checkpoint, Checkpoint, DPM, Data Protection Manager, TF3504,  TF3502,





profile

강성욱 / jevida@naver.com
Microsoft SQL Server MVP
Blog : http://sqlmvp.kr
Facebook : http://facebook.com/sqlmvp

Kakao Talk : SQLMVP

Line : jevida


현재 LA에 거주하고 있으며 SQL에 관심있는 분이면 언제든 친추 환영합니다.