摘要

这篇文章介绍SQL Server的一个典型的应用案例,即如何利用Event Notification与Service Broker技术相结合来实现死锁信息自动收集系统。通过这个系统,我们可以全面把控SQL Server数据库环境中所有实例上发生的死锁详细信息,供我们后期分析和解决死锁场景。

死锁自动收集系统需求分析

当 SQL Server 中某组资源的两个或多个线程或进程之间存在循环的依赖关系时,但因互相申请被其他进程所占用,而不会释放的资源处于的一种永久等待状态,将会发生死锁。SQL Server服务自动死锁检查进程默认每5分钟跑一次,当死锁发生时,会选择一个代价较小的进程做为死锁牺牲品,以此来避免死锁导致更大范围的影响。被选择做为死锁牺牲品的进程会报告如下错误:

  1. Msg 1205, Level 13, State 51, Line 8
  2. Transaction (Process ID 54) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

如果进程间发生了死锁,对于用户业务系统,乃至整个SQL Server服务健康状况影响很大,轻者系统反应缓慢,服务假死;重者服务挂起,拒绝请求。那么,我们有没有一种方法可以完全自动、无人工干预的方式异步收集SQL Server系统死锁信息并远程保留死锁相关信息呢?这些信息包括但不仅限于:

  • 死锁发生在哪些进程之间

  • 各个进程执行的语句块是什么?死锁时,各个进程在执行哪条语句?

  • 死锁的资源是什么?死锁发生在哪个数据库?哪张表?哪个数据页?哪个索引上?

  • 死锁发生的具体时间点,包含语句块开始时间、语句执行时间等

  • 用户进程使用的登录用户是什么?客户端驱动是什么? …… 如此的无人值守的自动死锁收集系统,就是我们今天要介绍的应用案例分享:利用SQL Server的Event Notification与Service Broker建立自动死锁信息收集系统。

Service Broker和Event Notification简介

在死锁自动收集系统介绍开始之前,先简要介绍下SQL Server Service Broker和Event Notification技术。

Service Broker简介

Service Broker是微软至SQL Server 2005开始集成到数据库引擎中的消息通讯组件,为 SQL Server提供队列和可靠的消息传递的能力,可以用来构建基于异步消息通讯为基础的应用程序。Service Broker既可用于单个 SQL Server 实例的应用程序,也可用于在多个实例间进行消息分发工作的应用程序。Service Broker使用TCP/IP端口在实例间交换消息,所包含的功能有助于防止未经授权的网络访问,并可以对通过网络发送的消息进行加密,以此来保证数据安全性。多实例之间使用Service Broker进行异步消息通讯的结构图如下所示(图片来自微软的官方文档):

01.png

Event Notification简介

Event Notification的中文名称叫事件通知,执行事件通知可对各种Transact-SQL数据定义语言(DDL)语句和SQL跟踪事件做出响应,采取的响应方式是将这些事件的相关信息发送到 Service Broker 服务。事件通知可以用来执行以下操作:

  • 记录和检索发生在数据库上的更改或活动。

  • 执行操作以异步方式而不是同步方式响应事件。

可以将事件通知用作替代DDL 触发器和SQL跟踪的编程方法。事件通知的信息媒介是以xml数据类型的信息传递给Service Broker服务,它提供了有关事件的发生时间、受影响的数据库对象、涉及的 Transact-SQL 批处理语句等详细信息。对于SQL Server死锁而言,可以使用Event Notification来跟踪死锁事件,来获取DEADLOCK_GRAPH XML信息,然后通过异步消息组件Service Broker发送到远端的Deadlock Center上的Service Broker队列,完成死锁信息收集到死锁中央服务。

死锁收集系统架构图

在介绍完Service Broker和Event Notification以后,我们来看看死锁手机系统的整体架构图。在这个系统中,存在两种类型角色:我们定义为死锁客户端(Deadlock Client)和死锁中央服务(Deadlock Center)。死锁客户端发生死锁后,首先会将Deadlock Graph XML通过Service Broker发送给死锁中央服务,死锁中央服务获取到Service Broker消息以后,解析这个XML就可以拿到客户端的死锁相关信息,最后存放到本地日志表中,供终端客户查询和分析使用。最终的死锁收集系统架构图如下所示: 02.png

详细的死锁信息收集过程介绍如下:死锁客户端通过本地SQL Server的Event Notification捕获发生在该实例上的Deadlock事件,并在死锁发生以后将Deadlock Graph XML数据存放到Event Notification绑定的队列中,然后通过绑定在该队列上的存储过程自动触发将Deadlock Graph XML通过Service Broker异步消息通讯的方式发送到死锁中央服务。中央服务在接收到Service Broker消息以后,首先放入Deadlock Center Service Broker队列中,该队列绑定了消息自动处理存储过程,用来解析Deadlock Graph XML信息,并将死锁相关的详细信息存入到Deadlock Center的Log Table中。最后,终端用户可以直接对Log Table来查询和分析所有Deadlock Client上发生的死锁信息。通过这系列的过程,最终达到了死锁信息的自动远程存储、收集,以提供后期死锁场景还原和复盘,达到死锁信息可追溯,及时监控,及时发现的目的。

Service Broker配置

系统架构设计完毕后,接下来是系统的配置和搭建过程,首先看看Service Broker的配置。这个配置还是相对比较繁琐的,包含了以下步骤:

  • 创建Service Broker数据库(假设数据库名为DDLCenter)并开启Service Broker选项

  • 创建Service Broker队列的激活存储过程和相关表对象

  • 创建Master数据库下的Master Key

  • 创建传输层本地和远程证书

  • 创建基于证书的用户登录

  • 创建Service Broker端口并授权用户连接

  • 创建DDLCenter数据库下的Master Key

  • 创建会话层本地及远程证书

  • 创建Service Broker组件所需要的对象,包括:Message Type、Contact、Queue、Service、Remote Service Binding、Route

Deadlock Client Server

以下的配置请在Deadlock Client SQL Server实例上操作。

  • 创建DDLCenter数据库并开启Service Broker选项
  1. -- Run script on client server to gather deadlock graph xml
  2. USE master
  3. GO
  4. -- Create Database
  5. IF DB_ID('DDLCenter') IS NULL
  6. CREATE DATABASE [DDLCenter];
  7. GO
  8. -- Change datbase to simple recovery model
  9. ALTER DATABASE [DDLCenter] SET RECOVERY SIMPLE WITH NO_WAIT
  10. GO
  11. -- Enable Service Broker
  12. ALTER DATABASE [DDLCenter] SET ENABLE_BROKER,TRUSTWORTHY ON
  13. GO
  14. -- Change database Owner to sa
  15. ALTER AUTHORIZATION ON DATABASE::DDLCenter TO [sa]
  16. GO
  • 三个表和两个存储过程

表[DDLCollector].[Deadlock_Traced_Records]:从Event Notification队里接收的消息会记录到该表中。 表[DDLCollector].[Send_Records]:Deadlock Client成功发送Service Broker消息记录 表[DDLCollector].[Error_Records]:记录发生异常情况时的信息。 存储过程[DDLCollector].[UP_ProcessDeadlockEventMsg]:Deadlock Client绑定到队里的激活存储过程,一旦队列中有消息进入,这个存储过程会被自动调用。 存储过程[DDLCollector].[UP_SendDeadlockMsg]:Deadlock Client发送异步消息给Deadlock Center,这个存储过程会被上面的激活存储过程调用。

  1. -- Run on Client Instance
  2. USE [DDLCenter]
  3. GO
  4. -- Create Schema
  5. IF NOT EXISTS(
  6. SELECT TOP 1 *
  7. FROM sys.schemas
  8. WHERE name = 'DDLCollector'
  9. )
  10. BEGIN
  11. EXEC('CREATE SCHEMA DDLCollector');
  12. END
  13. GO
  14. -- Create table to log Traced Deadlock Records
  15. IF OBJECT_ID('DDLCollector.Deadlock_Traced_Records', 'U') IS NOT NULL
  16. DROP TABLE [DDLCollector].[Deadlock_Traced_Records]
  17. GO
  18. CREATE TABLE [DDLCollector].[Deadlock_Traced_Records](
  19. [RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
  20. [Processed_Msg] [xml] NULL,
  21. [Processed_Msg_CheckSum] INT,
  22. [Record_Time] [datetime] NOT NULL
  23. CONSTRAINT DF_Deadlock_Traced_Records_Record_Time DEFAULT(GETDATE()),
  24. CONSTRAINT PK_Deadlock_Traced_Records_RowId PRIMARY KEY
  25. (RowId ASC)
  26. ) ON [PRIMARY]
  27. GO
  28. -- Create table to record deadlock graph xml sent successfully log
  29. IF OBJECT_ID('DDLCollector.Send_Records', 'U') IS NOT NULL
  30. DROP TABLE [DDLCollector].[Send_Records]
  31. GO
  32. CREATE TABLE [DDLCollector].[Send_Records](
  33. [RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
  34. [Send_Msg] [xml] NULL,
  35. [Send_Msg_CheckSum] INT,
  36. [Record_Time] [datetime] NOT NULL
  37. CONSTRAINT DF_Send_Records_Record_Time DEFAULT(GETDATE()),
  38. CONSTRAINT PK_Send_Records_RowId PRIMARY KEY
  39. (RowId ASC)
  40. ) ON [PRIMARY]
  41. GO
  42. -- Create table to record error info when exception occurs
  43. IF OBJECT_ID('DDLCollector.Error_Records', 'U') IS NOT NULL
  44. DROP TABLE [DDLCollector].[Error_Records]
  45. GO
  46. CREATE TABLE [DDLCollector].[Error_Records](
  47. [RowId] [int] IDENTITY(1,1) NOT NULL,
  48. [Msg_Body] [xml] NULL,
  49. [Conversation_handle] [uniqueidentifier] NULL,
  50. [Message_Type] SYSNAME NULL,
  51. [Service_Name] SYSNAME NULL,
  52. [Contact_Name] SYSNAME NULL,
  53. [Record_Time] [datetime] NOT NULL
  54. CONSTRAINT DF_Error_Records_Record_Time DEFAULT(GETDATE()),
  55. [Error_Details] [nvarchar](4000) NULL,
  56. CONSTRAINT PK_Error_Records_RowId PRIMARY KEY
  57. (RowId ASC)
  58. ) ON [PRIMARY]
  59. GO
  60. USE [DDLCenter]
  61. GO
  62. -- Create Store Procedure to Send Deadlock Graph xml to Center Server
  63. IF OBJECT_ID('DDLCollector.UP_SendDeadlockMsg', 'P') IS NOT NULL
  64. DROP PROC [DDLCollector].[UP_SendDeadlockMsg]
  65. GO
  66. CREATE PROCEDURE [DDLCollector].[UP_SendDeadlockMsg](
  67. @DeadlockMsg XML
  68. )
  69. AS
  70. BEGIN
  71. SET NOCOUNT ON;
  72. DECLARE
  73. @handle UNIQUEIDENTIFIER
  74. ,@Proc_Name SYSNAME
  75. ,@Error_Details VARCHAR(2000)
  76. ;
  77. -- get the store procedure name
  78. SELECT
  79. @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID))
  80. + '.'
  81. + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
  82. FROM sys.procedures
  83. WHERE OBJECT_ID = @@PROCID
  84. ;
  85. BEGIN TRY
  86. -- Begin Dialog
  87. BEGIN DIALOG CONVERSATION @handle
  88. FROM SERVICE [http://soa/deadlock/service/ClientService]
  89. TO Service 'http://soa/deadlock/service/CenterService'
  90. ON CONTRACT [http://soa/deadlock/contract/CheckContract]
  91. ;
  92. -- Send deadlock graph xml as the message to Center Server
  93. SEND ON CONVERSATION @handle
  94. MESSAGE TYPE [http://soa/deadlock/MsgType/Request] (@DeadlockMsg);
  95. -- Log it successfully
  96. INSERT INTO [DDLCollector].[Send_Records]([Send_Msg], [Send_Msg_CheckSum])
  97. VALUES( @DeadlockMsg, CHECKSUM(CAST(@DeadlockMsg as NVARCHAR(MAX))))
  98. END TRY
  99. BEGIN CATCH
  100. -- Record the error info when exception occurs
  101. SET @Error_Details=
  102. ' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
  103. ' Error Message : ' + ERROR_MESSAGE() +
  104. ' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
  105. ' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
  106. ' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) +
  107. ' Exception Proc: ' + @Proc_Name
  108. ;
  109. -- record into table
  110. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  111. VALUES(@DeadlockMsg, @handle, 'http://soa/deadlock/MsgType/Request', 'http://soa/deadlock/service/ClientService', 'http://soa/deadlock/contract/CheckContract', @Error_Details);
  112. END CATCH
  113. END
  114. GO
  115. -- Create Store Procedure for Queue: when extend event notification queue message
  116. -- this store procedure will be called.
  117. IF OBJECT_ID('DDLCollector.UP_ProcessDeadlockEventMsg', 'P') IS NOT NULL
  118. DROP PROC [DDLCollector].[UP_ProcessDeadlockEventMsg]
  119. GO
  120. CREATE PROCEDURE [DDLCollector].[UP_ProcessDeadlockEventMsg]
  121. AS
  122. /*
  123. SELECT * FROM [DDLCollector].[Deadlock_Traced_Records]
  124. SELECT * FROM [DDLCollector].[Send_Records]
  125. SELECT * FROM [DDLCollector].[Error_Records]
  126. */
  127. BEGIN
  128. SET NOCOUNT ON;
  129. DECLARE
  130. @handle UNIQUEIDENTIFIER
  131. , @Message_Type SYSNAME
  132. , @Service_Name SYSNAME
  133. , @Contact_Name SYSNAME
  134. , @Error_Details VARCHAR(2000)
  135. , @Message_Body XML
  136. , @Proc_Name SYSNAME
  137. ;
  138. -- Store Procedure Name
  139. SELECT
  140. @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID))
  141. + '.'
  142. + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
  143. FROM sys.procedures
  144. WHERE OBJECT_ID = @@PROCID
  145. ;
  146. BEGIN TRY
  147. -- Receive message from queue
  148. WAITFOR(RECEIVE TOP(1)
  149. @handle = conversation_handle
  150. , @Message_Type = message_type_name
  151. , @Service_Name = service_name
  152. , @Contact_Name = service_contract_name
  153. , @Message_Body = message_body
  154. FROM dbo.[http://soa/deadlock/queue/ClientQueue]),Timeout 500
  155. ;
  156. -- just return if there is no message needed to process
  157. IF(@@Rowcount=0)
  158. BEGIN
  159. RETURN
  160. END
  161. -- Get data from message queue
  162. ELSE IF @Message_Type = 'http://schemas.microsoft.com/SQL/Notifications/EventNotification'
  163. BEGIN
  164. -- Record message log first
  165. INSERT INTO [DDLCollector].[Deadlock_Traced_Records](Processed_Msg, [Processed_Msg_CheckSum])
  166. VALUES(@Message_Body, CHECKSUM(CAST(@Message_Body as NVARCHAR(MAX))))
  167. -- BE NOTED HERE: PLEASE DO'T END CONVERSATION, OR ELSE EXCEPTION WILL BE THROWN OUTPUT
  168. /*
  169. Error: 17001, Severity: 16, State: 1.
  170. Failure to send an event notification instance of type 'DEADLOCK_GRAPH' on conversation handle '{67419386-7C34-E711-A709-001C42099969}'. Error Code = '8429'.
  171. Error: 17005, Severity: 16, State: 1.
  172. Event notification 'DeadLockNotificationEvent' in database 'master' dropped due to send time service broker errors. Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active.
  173. */
  174. --END CONVERSATION @handle
  175. --Here call another Store Procedure to send deadlock graph info to center server
  176. EXEC [DDLCollector].[UP_SendDeadlockMsg] @Message_Body;
  177. END
  178. --End Diaglog Message Type, that means we should end this conversation
  179. ELSE IF @Message_Type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
  180. BEGIN
  181. END CONVERSATION @handle;
  182. END
  183. -- Konwn Service Broker Errors by System.
  184. ELSE IF @Message_Type = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
  185. BEGIN
  186. END CONVERSATION @handle
  187. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  188. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Exception Store Procedure: ' + @Proc_Name);
  189. END
  190. ELSE
  191. -- unknown Message Types.
  192. BEGIN
  193. END CONVERSATION @handle
  194. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  195. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Received unexpected message type when executing Store Procedure: ' + @Proc_Name);
  196. -- unexpected message type
  197. RAISERROR (N' Received unknown message type: %s', 16, 1, @Message_Type) WITH LOG;
  198. END
  199. END TRY
  200. BEGIN CATCH
  201. BEGIN
  202. SET @Error_Details=
  203. ' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
  204. ' Error Details : ' + ERROR_MESSAGE() +
  205. ' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
  206. ' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
  207. ' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) +
  208. ' Exception Proc: ' + @Proc_Name
  209. ;
  210. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  211. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, @Error_Details);
  212. END
  213. END CATCH
  214. END
  215. GO
  • 创建Master库下Master Key
  1. USE master
  2. GO
  3. -- If the master key is not available, create it.
  4. IF NOT EXISTS (SELECT *
  5. FROM sys.symmetric_keys
  6. WHERE name LIKE '%MS_DatabaseMasterKey%')
  7. BEGIN
  8. CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'ClientMasterKey*';
  9. END
  10. GO
  • 创建传输层本地证书并备份到本地文件系统

这里请注意证书的开始生效时间要略微早于当前时间,并设置合适的证书过期日期,我这里是设置的过期日期为9999年12月30号。

  1. USE master
  2. GO
  3. -- Crete Transport Layer Certification
  4. CREATE CERTIFICATE TrpCert_ClientLocal
  5. AUTHORIZATION dbo
  6. WITH SUBJECT = 'TrpCert_ClientLocal',
  7. START_DATE = '05/07/2017',
  8. EXPIRY_DATE = '12/30/9999'
  9. GO
  10. -- then backup it up to local path
  11. -- and after that copy it to Center server
  12. BACKUP CERTIFICATE TrpCert_ClientLocal
  13. TO FILE = 'C:\Temp\TrpCert_ClientLocal.cer';
  14. GO
  • 创建传输层远程证书

这里的证书是通过证书文件来创建的,这个证书文件来自于远程通讯的另一端Deadlock Center SQL Server的证书文件的一份拷贝。

  1. USE master
  2. GO
  3. -- Create certification came from Center Server.
  4. CREATE CERTIFICATE TrpCert_RemoteCenter
  5. FROM FILE = 'C:\Temp\TrpCert_RemoteCenter.cer'
  6. GO
  • 创建基于证书文件的用户登录

这里也可以创建带密码的常规用户登录,但是为了规避安全风险,这里最好创建基于证书文件的用户登录。

  1. USE master
  2. GO
  3. -- Create user login
  4. IF NOT EXISTS(SELECT *
  5. FROM sys.syslogins
  6. WHERE name='SSBDbo')
  7. BEGIN
  8. CREATE LOGIN SSBDbo FROM CERTIFICATE TrpCert_ClientLocal;
  9. END
  10. GO
  • 创建Service Broker TCP/IP通讯端口并授权用户连接权限

这里需要注意的是,端口授权的证书一定本地实例创建的证书,而不是来自于远程服务器的那个证书。比如代码中的AUTHENTICATION = CERTIFICATE TrpCert_ClientLocal部分。

  1. USE master
  2. GO
  3. --Creaet Tcp endpoint for SSB comunication and grant connect to users.
  4. CREATE ENDPOINT EP_SSB_ClientLocal
  5. STATE = STARTED
  6. AS TCP
  7. (
  8. LISTENER_PORT = 4022
  9. )
  10. FOR SERVICE_BROKER (AUTHENTICATION = CERTIFICATE TrpCert_ClientLocal, ENCRYPTION = REQUIRED
  11. )
  12. GO
  13. -- Grant Connect on Endpoint to User SSBDbo
  14. GRANT CONNECT ON ENDPOINT::EP_SSB_ClientLocal TO SSBDbo
  15. GO
  • 创建DDLCenter数据库Master Key
  1. -- Now, let's go inside to conversation database
  2. USE DDLCenter
  3. GO
  4. -- Create Master Key
  5. IF NOT EXISTS (SELECT *
  6. FROM sys.symmetric_keys
  7. WHERE name LIKE '%MS_DatabaseMasterKey%')
  8. BEGIN
  9. CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'DDLCenterMasterKey*';
  10. END
  11. GO
  • 创建会话层本地证书
  1. USE DDLCenter
  2. GO
  3. -- Create conversation layer certification
  4. CREATE CERTIFICATE DlgCert_ClientLocal
  5. AUTHORIZATION dbo
  6. WITH SUBJECT = 'DlgCert_ClientLocal',
  7. START_DATE = '05/07/2017',
  8. EXPIRY_DATE = '12/30/9999'
  9. GO
  10. -- backup it up to local path
  11. -- and then copy it to remote Center server
  12. BACKUP CERTIFICATE DlgCert_ClientLocal
  13. TO FILE = 'C:\Temp\DlgCert_ClientLocal.cer';
  14. GO
  • 创建DDLCenter用户,不需要和任何用户登录匹配
  1. USE DDLCenter
  2. GO
  3. -- Create User for login under conversation database
  4. IF NOT EXISTS(
  5. SELECT TOP 1 *
  6. FROM sys.database_principals
  7. WHERE name = 'SSBDbo'
  8. )
  9. BEGIN
  10. CREATE USER SSBDbo WITHOUT LOGIN;
  11. END
  12. GO
  • 创建会话层远程证书,这个证书文件来自Deadlock Center SQL Server备份
  1. USE DDLCenter
  2. GO
  3. -- Create converstaion layer certification came from remote Center server.
  4. CREATE CERTIFICATE DlgCert_RemoteCenter
  5. AUTHORIZATION SSBDbo
  6. FROM FILE='C:\Temp\DlgCert_RemoteCenter.cer'
  7. GO
  8. GRANT CONNECT TO SSBDbo;
  • 创建Service Broker组件对象

Deadlock Client与Deadlock Center在创建Service Broker组件对象时存在差异:第一个差异是创建Service的时候,需要包含Event Notification的Contract,名称为 http://schemas.microsoft.com/SQL/Notifications/PostEventNotification;第二个差异是需要多创建一个指向本地服务的路由http://soa/deadlock/route/LocalRoute。

  1. USE DDLCenter
  2. GO
  3. -- Create Message Type
  4. CREATE MESSAGE TYPE [http://soa/deadlock/MsgType/Request]
  5. VALIDATION = WELL_FORMED_XML;
  6. CREATE MESSAGE TYPE [http://soa/deadlock/MsgType/Response]
  7. VALIDATION = WELL_FORMED_XML;
  8. GO
  9. -- Create Contact
  10. CREATE CONTRACT [http://soa/deadlock/contract/CheckContract](
  11. [http://soa/deadlock/MsgType/Request] SENT BY INITIATOR,
  12. [http://soa/deadlock/MsgType/Response] SENT BY TARGET
  13. );
  14. GO
  15. -- Create Queue
  16. CREATE QUEUE dbo.[http://soa/deadlock/queue/ClientQueue]
  17. WITH STATUS = ON, RETENTION = OFF
  18. , ACTIVATION (STATUS = ON ,
  19. PROCEDURE_NAME = [DDLCollector].[UP_ProcessDeadlockEventMsg] ,
  20. MAX_QUEUE_READERS = 2 ,
  21. EXECUTE AS N'dbo')
  22. GO
  23. -- Create Service
  24. -- Here is very import, we have to create service for both contacts
  25. -- to get extend event notification and SSB work.
  26. CREATE SERVICE [http://soa/deadlock/service/ClientService]
  27. ON QUEUE [http://soa/deadlock/queue/ClientQueue]
  28. (
  29. [http://soa/deadlock/contract/CheckContract],
  30. [http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]
  31. );
  32. GO
  33. -- Grant Send on service
  34. GRANT SEND ON SERVICE::[http://soa/deadlock/service/ClientService] to SSBDbo;
  35. GO
  36. -- Create Remote Service Bingding
  37. CREATE REMOTE SERVICE BINDING [http://soa/deadlock/RSB/CenterRSB]
  38. TO SERVICE 'http://soa/deadlock/service/CenterService'
  39. WITH USER = [SSBDbo],
  40. ANONYMOUS=Off
  41. GO
  42. -- Create Route
  43. CREATE ROUTE [http://soa/deadlock/route/CenterRoute]
  44. WITH SERVICE_NAME = 'http://soa/deadlock/service/CenterService',
  45. ADDRESS = 'TCP://10.211.55.3:4024';
  46. GO
  47. -- Create route for the DeadlockNotificationSvc
  48. CREATE ROUTE [http://soa/deadlock/route/LocalRoute]
  49. WITH SERVICE_NAME = 'http://soa/deadlock/service/ClientService',
  50. ADDRESS = 'LOCAL';
  51. GO

Deadlock Center Server

  • 创建DDLCenter数据库并开启Service Broker选项
  1. -- Run script on center server to receive client deadlock xml
  2. USE master
  3. GO
  4. -- Create Database
  5. IF DB_ID('DDLCenter') IS NULL
  6. CREATE DATABASE [DDLCenter];
  7. GO
  8. -- Change datbase to simple recovery model
  9. ALTER DATABASE [DDLCenter] SET RECOVERY SIMPLE WITH NO_WAIT
  10. GO
  11. -- Enable Service Broker
  12. ALTER DATABASE [DDLCenter] SET ENABLE_BROKER,TRUSTWORTHY ON
  13. GO
  14. -- Change database Owner to sa
  15. ALTER AUTHORIZATION ON DATABASE::DDLCenter TO [sa]
  16. GO
  • 三张表和两个存储过程

表[DDLCollector].[Collect_Records]:Deadlock Center成功接收到的Service Broker消息。 表[DDLCollector].[Error_Records]:记录发生异常情况的详细信息。 表[DDLCollector].[Deadlock_Info]:记录所有Deadlock Client端发生的Deadlock详细信息。 存储过程[DDLCollector].[UP_ProcessDeadlockGraphEventMsg]:Deadlock Center上绑定到队列的激活存储过程,一旦队列中有消息进入,这个存储过程会被自动调用。 存储过程[DDLCollector].[UP_ParseDeadlockGraphEventMsg]:Deadlock Center上解析Deadlock Graph XML的存储过程对象,这个存储过程会被上面的激活存储过程调用来解析XML,然后放入表[DDLCollector].[Deadlock_Info]中。

  1. USE [DDLCenter]
  2. GO
  3. -- Create Schema
  4. IF NOT EXISTS(
  5. SELECT TOP 1 *
  6. FROM sys.schemas
  7. WHERE name = 'DDLCollector'
  8. )
  9. BEGIN
  10. EXEC('CREATE SCHEMA DDLCollector');
  11. END
  12. GO
  13. -- Create table to log the received message
  14. IF OBJECT_ID('DDLCollector.Collect_Records', 'U') IS NOT NULL
  15. DROP TABLE [DDLCollector].[Collect_Records]
  16. GO
  17. CREATE TABLE [DDLCollector].[Collect_Records](
  18. [RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
  19. [Deadlock_Graph_Msg] [xml] NULL,
  20. [Deadlock_Graph_Msg_CheckSum] INT,
  21. [Record_Time] [datetime] NOT NULL
  22. CONSTRAINT DF_Collect_Records_Record_Time DEFAULT(GETDATE()),
  23. CONSTRAINT PK_Collect_Records_RowId PRIMARY KEY
  24. (RowId ASC)
  25. ) ON [PRIMARY]
  26. GO
  27. -- create table to record the exception when error occurs
  28. IF OBJECT_ID('DDLCollector.Error_Records', 'U') IS NOT NULL
  29. DROP TABLE [DDLCollector].[Error_Records]
  30. GO
  31. CREATE TABLE [DDLCollector].[Error_Records](
  32. [RowId] [int] IDENTITY(1,1) NOT NULL,
  33. [Msg_Body] [xml] NULL,
  34. [Conversation_handle] [uniqueidentifier] NULL,
  35. [Message_Type] SYSNAME NULL,
  36. [Service_Name] SYSNAME NULL,
  37. [Contact_Name] SYSNAME NULL,
  38. [Record_Time] [datetime] NOT NULL
  39. CONSTRAINT DF_Error_Records_Record_Time DEFAULT(GETDATE()),
  40. [Error_Details] [nvarchar](4000) NULL,
  41. CONSTRAINT PK_Error_Records_RowId PRIMARY KEY
  42. (RowId ASC)
  43. ) ON [PRIMARY]
  44. GO
  45. -- create business table to record deadlock analysised info
  46. IF OBJECT_ID('DDLCollector.Deadlock_Info', 'U') IS NOT NULL
  47. DROP TABLE [DDLCollector].[Deadlock_Info]
  48. GO
  49. CREATE TABLE [DDLCollector].[Deadlock_Info](
  50. RowId INT IDENTITY(1,1) NOT NULL
  51. ,SQLInstance sysname NULL
  52. ,SPid INT NULL
  53. ,is_Vitim BIT NULL
  54. ,DeadlockGraph XML NULL
  55. ,DeadlockGraphCheckSum INT NULL
  56. ,lasttranstarted DATETIME NULL
  57. ,lastbatchstarted DATETIME NULL
  58. ,lastbatchcompleted DATETIME NULL
  59. ,procname SYSNAME NULL
  60. ,Code NVARCHAR(max) NULL
  61. ,LockMode sysname NULL
  62. ,Indexname sysname NULL
  63. ,KeylockObject sysname NULL
  64. ,IndexLockMode sysname NULL
  65. ,Inputbuf NVARCHAR(max) NULL
  66. ,LoginName sysname NULL
  67. ,Clientapp sysname NULL
  68. ,Action varchar(1000) NULL
  69. ,status varchar(10) NULL
  70. ,[Record_Time] [datetime] NOT NULL
  71. CONSTRAINT DF_Deadlock_Info_Record_Time DEFAULT(GETDATE()),
  72. CONSTRAINT PK_Deadlock_Info_RowId PRIMARY KEY
  73. (RowId ASC)
  74. )
  75. GO
  76. USE [DDLCenter]
  77. GO
  78. -- Create store procedure to analysis deadlock graph xml
  79. -- and log into business table
  80. IF OBJECT_ID('DDLCollector.UP_ParseDeadlockGraphEventMsg', 'P') IS NOT NULL
  81. DROP PROC [DDLCollector].[UP_ParseDeadlockGraphEventMsg]
  82. GO
  83. CREATE PROCEDURE [DDLCollector].[UP_ParseDeadlockGraphEventMsg](
  84. @DeadlockGraph_Msg XML
  85. )
  86. AS
  87. BEGIN
  88. SET NOCOUNT ON;
  89. ;WITH deadlock
  90. AS
  91. (
  92. SELECT
  93. OwnerID = T.C.value('@id', 'varchar(50)')
  94. ,SPid = T.C.value('(./@spid)[1]','int')
  95. ,status = T.C.value('(./@status)[1]','varchar(10)')
  96. ,Victim = case
  97. when T.C.value('@id', 'varchar(50)') = T.C.value('./../../@victim','varchar(50)') then 1
  98. else 0 end
  99. ,LockMode = T.C.value('@lockMode', 'sysname')
  100. ,Inputbuf = T.C.value('(./inputbuf/text())[1]','nvarchar(max)')
  101. ,Code = T.C.value('(./executionStack/frame/text())[1]','nvarchar(max)')
  102. ,SPName = T.C.value('(./executionStack/frame/@procname)[1]','sysname')
  103. ,Hostname = T.C.value('(./@hostname)[1]','sysname')
  104. ,Clientapp = T.C.value('(./@clientapp)[1]','varchar(1000)')
  105. ,lasttranstarted = T.C.value('(./@lasttranstarted)[1]','datetime')
  106. ,lastbatchstarted = T.C.value('(./@lastbatchstarted)[1]','datetime')
  107. ,lastbatchcompleted = T.C.value('(./@lastbatchcompleted)[1]','datetime')
  108. ,LoginName = T.C.value('@loginname', 'sysname')
  109. ,Action = T.C.value('(./@transactionname)[1]','varchar(1000)')
  110. FROM @DeadlockGraph_Msg.nodes('EVENT_INSTANCE/TextData/deadlock-list/deadlock/process-list/process') AS T(C)
  111. )
  112. ,
  113. keylock
  114. AS
  115. (
  116. SELECT
  117. OwnerID = T.C.value('./owner[1]/@id', 'varchar(50)')
  118. ,KeylockObject = T.C.value('./../@objectname', 'sysname')
  119. ,Indexname = T.C.value('./../@indexname', 'sysname')
  120. ,IndexLockMode = T.C.value('./../@mode', 'sysname')
  121. FROM @DeadlockGraph_Msg.nodes('EVENT_INSTANCE/TextData/deadlock-list/deadlock/resource-list/keylock/owner-list') AS T(C)
  122. )
  123. SELECT
  124. SQLInstance = A.Hostname
  125. ,A.SPid
  126. ,is_Vitim = A.Victim
  127. ,DeadlockGraph = @DeadlockGraph_Msg.query('EVENT_INSTANCE/TextData/deadlock-list')
  128. ,DeadlockGraphCheckSum = CHECKSUM(CAST(@DeadlockGraph_Msg AS NVARCHAR(MAX)))
  129. ,A.lasttranstarted
  130. ,A.lastbatchstarted
  131. ,A.lastbatchcompleted
  132. ,A.SPName
  133. ,A.Code
  134. ,A.LockMode
  135. ,B.Indexname
  136. ,B.KeylockObject
  137. ,B.IndexLockMode
  138. ,A.Inputbuf
  139. ,A.LoginName
  140. ,A.Clientapp
  141. ,A.Action
  142. ,status
  143. ,[Record_Time] = GETDATE()
  144. FROM deadlock AS A
  145. LEFT JOIN keylock AS B
  146. ON A.OwnerID = B.OwnerID
  147. ORDER BY A.SPid, A.Victim
  148. ;
  149. END
  150. GO
  151. -- Create store Procedure for Center server service queue to process deadlock xml
  152. -- when message sending from client server.
  153. IF OBJECT_ID('DDLCollector.UP_ProcessDeadlockGraphEventMsg', 'P') IS NOT NULL
  154. DROP PROC [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]
  155. GO
  156. CREATE PROCEDURE [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]
  157. AS
  158. /*
  159. EXEC [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]
  160. SELECT * FROM [DDLCollector].[Collect_Records]
  161. SELECT * FROM [DDLCollector].[Error_Records]
  162. SELECT * FROM [DDLCollector].[Deadlock_Info]
  163. */
  164. BEGIN
  165. SET NOCOUNT ON;
  166. DECLARE
  167. @handle UNIQUEIDENTIFIER
  168. , @Message_Type SYSNAME
  169. , @Service_Name SYSNAME
  170. , @Contact_Name SYSNAME
  171. , @Error_Details VARCHAR(2000)
  172. , @Message_Body XML
  173. , @Proc_Name SYSNAME
  174. ;
  175. -- Store Procedure name
  176. SELECT
  177. @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID))
  178. + '.'
  179. + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
  180. FROM sys.procedures
  181. WHERE OBJECT_ID = @@PROCID
  182. ;
  183. BEGIN TRY
  184. -- Receive deadlock message from service queue
  185. WAITFOR(RECEIVE TOP(1)
  186. @handle = conversation_handle
  187. , @Message_Type = message_type_name
  188. , @Service_Name = service_name
  189. , @Contact_Name = service_contract_name
  190. , @Message_Body = message_body
  191. FROM dbo.[http://soa/deadlock/queue/CenterQueue]),Timeout 500
  192. ;
  193. IF(@@Rowcount=0)
  194. BEGIN
  195. RETURN
  196. END
  197. -- Message type is the very correct one
  198. ELSE IF @Message_Type = N'http://soa/deadlock/MsgType/Request'
  199. BEGIN
  200. -- Record message log first
  201. INSERT INTO [DDLCollector].[Collect_Records](Deadlock_Graph_Msg, [Deadlock_Graph_Msg_CheckSum])
  202. VALUES(@Message_Body, CHECKSUM(cast(@Message_Body as NVARCHAR(MAX))))
  203. END CONVERSATION @handle
  204. --Here call another Store Procedure to process our message to record deadlock relation info
  205. INSERT INTO [DDLCollector].[Deadlock_Info]
  206. EXEC [DDLCollector].[UP_ParseDeadlockGraphEventMsg] @Message_Body;
  207. END
  208. --End Diaglog Message Type, that means we should end this conversation
  209. ELSE IF @Message_Type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
  210. BEGIN
  211. END CONVERSATION @handle;
  212. END
  213. -- Konwn Service Broker Errors by System.
  214. ELSE IF @Message_Type = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
  215. BEGIN
  216. END CONVERSATION @handle
  217. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  218. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Exception Store Procedure: ' + @Proc_Name);
  219. END
  220. ELSE
  221. -- unknown Message Types.
  222. BEGIN
  223. END CONVERSATION @handle
  224. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  225. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Received unexpected message type when executing Store Procedure: ' + @Proc_Name);
  226. -- unexpected message type
  227. RAISERROR (N' Received unexpected message type: %s', 16, 1, @Message_Type) WITH LOG;
  228. END
  229. END TRY
  230. BEGIN CATCH
  231. BEGIN
  232. -- record exception record
  233. SET @Error_Details=
  234. ' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
  235. ' Error Message : ' + ERROR_MESSAGE() +
  236. ' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
  237. ' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
  238. ' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) +
  239. ' Exception Proc: ' + @Proc_Name
  240. ;
  241. INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])
  242. VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, @Error_Details);
  243. END
  244. END CATCH
  245. END
  246. GO
  • 创建Master库下Master Key
  1. USE master
  2. GO
  3. -- If the master key is not available, create it.
  4. IF NOT EXISTS (SELECT *
  5. FROM sys.symmetric_keys
  6. WHERE name LIKE '%MS_DatabaseMasterKey%')
  7. BEGIN
  8. CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'CenterMasterKey*';
  9. END
  10. GO
  • 创建传输层本地证书并备份到本地文件系统
  1. USE master
  2. GO
  3. -- Crete Transport Layer Certification
  4. CREATE CERTIFICATE TrpCert_RemoteCenter
  5. AUTHORIZATION dbo
  6. WITH SUBJECT = 'TrpCert_RemoteCenter',
  7. START_DATE = '05/07/2017',
  8. EXPIRY_DATE = '12/30/9999'
  9. GO
  10. -- then backup it up to local path
  11. -- and after that copy it to Client server
  12. BACKUP CERTIFICATE TrpCert_RemoteCenter
  13. TO FILE = 'C:\Temp\TrpCert_RemoteCenter.cer';
  14. GO
  • 创建传输层远程证书,这个证书文件来至于Deadlock Client SQL Server
  1. USE master
  2. GO
  3. -- Create certification came from client Server.
  4. CREATE CERTIFICATE TrpCert_ClientLocal
  5. FROM FILE = 'C:\Temp\TrpCert_ClientLocal.cer'
  6. GO
  • 创建基于证书文件的用户登录
  1. USE master
  2. GO
  3. -- Create user login
  4. IF NOT EXISTS(SELECT *
  5. FROM sys.syslogins
  6. WHERE name='SSBDbo')
  7. BEGIN
  8. CREATE LOGIN SSBDbo FROM CERTIFICATE TrpCert_RemoteCenter;
  9. END
  10. GO
  • 创建Service Broker TCP/IP通讯端口并授权用户连接权限
  1. USE master
  2. GO
  3. -- Creaet Tcp endpoint for SSB comunication and grant connect to users.
  4. CREATE ENDPOINT EP_SSB_RemoteCenter
  5. STATE = STARTED
  6. AS TCP
  7. (
  8. LISTENER_PORT = 4024
  9. )
  10. FOR SERVICE_BROKER (AUTHENTICATION = CERTIFICATE TrpCert_RemoteCenter, ENCRYPTION = REQUIRED
  11. )
  12. GO
  13. -- Grant Connect on Endpoint to User SSBDbo
  14. GRANT CONNECT ON ENDPOINT::EP_SSB_RemoteCenter TO SSBDbo
  15. GO
  • 创建DDLCenter数据库Master Key
  1. -- Now, let's go inside to conversation database
  2. USE DDLCenter
  3. GO
  4. -- Create Master Key
  5. IF NOT EXISTS (SELECT *
  6. FROM sys.symmetric_keys
  7. WHERE name LIKE '%MS_DatabaseMasterKey%')
  8. BEGIN
  9. CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'DDLCenterMasterKey*';
  10. END
  11. GO
  • 创建会话层本地证书
  1. USE DDLCenter
  2. GO
  3. -- Create conversation layer certification
  4. CREATE CERTIFICATE DlgCert_RemoteCenter
  5. AUTHORIZATION dbo
  6. WITH SUBJECT = 'DlgCert_RemoteCenter',
  7. START_DATE = '05/07/2017',
  8. EXPIRY_DATE = '12/30/9999'
  9. GO
  10. -- backup it up to local path
  11. -- and then copy it to remote client server
  12. BACKUP CERTIFICATE DlgCert_RemoteCenter
  13. TO FILE = 'C:\Temp\DlgCert_RemoteCenter.cer';
  14. GO
  • 创建DDLCenter用户,不需要和任何用户登录匹配
  1. USE DDLCenter
  2. GO
  3. -- Create User for login under conversation database
  4. IF NOT EXISTS(
  5. SELECT TOP 1 *
  6. FROM sys.database_principals
  7. WHERE name = 'SSBDbo'
  8. )
  9. BEGIN
  10. --CREATE USER SSBDbo FOR LOGIN SSBDbo;
  11. CREATE USER SSBDbo WITHOUT LOGIN;
  12. END
  13. GO
  • 创建会话层远程证书,这个证书文件来自Deadlock Center SQL Server备份
  1. USE DDLCenter
  2. GO
  3. -- Create converstaion layer certification came from remote client server.
  4. CREATE CERTIFICATE DlgCert_ClientLocal
  5. AUTHORIZATION SSBDbo
  6. FROM FILE='C:\Temp\DlgCert_ClientLocal.cer'
  7. GO
  8. GRANT CONNECT TO SSBDbo;
  • 创建Service Broker组件对象
  1. USE DDLCenter
  2. GO
  3. -- Create Message Type
  4. CREATE MESSAGE TYPE [http://soa/deadlock/MsgType/Request]
  5. VALIDATION = WELL_FORMED_XML;
  6. CREATE MESSAGE TYPE [http://soa/deadlock/MsgType/Response]
  7. VALIDATION = WELL_FORMED_XML;
  8. GO
  9. -- Create Contact
  10. CREATE CONTRACT [http://soa/deadlock/contract/CheckContract](
  11. [http://soa/deadlock/MsgType/Request] SENT BY INITIATOR,
  12. [http://soa/deadlock/MsgType/Response] SENT BY TARGET
  13. );
  14. GO
  15. -- Create Queue
  16. CREATE QUEUE [dbo].[http://soa/deadlock/queue/CenterQueue]
  17. WITH STATUS = ON , RETENTION = OFF
  18. , ACTIVATION (STATUS = ON ,
  19. PROCEDURE_NAME = [DDLCollector].[UP_ProcessDeadlockGraphEventMsg] ,
  20. MAX_QUEUE_READERS = 3 ,
  21. EXECUTE AS N'dbo')
  22. GO
  23. -- Create Service
  24. CREATE SERVICE [http://soa/deadlock/service/CenterService]
  25. ON QUEUE [http://soa/deadlock/queue/CenterQueue]
  26. (
  27. [http://soa/deadlock/contract/CheckContract]
  28. );
  29. GO
  30. -- Grant Send on service to User SSBDbo
  31. GRANT SEND ON SERVICE::[http://soa/deadlock/service/CenterService] to SSBDbo;
  32. GO
  33. -- Create Remote Service Bingding
  34. CREATE REMOTE SERVICE BINDING [http://soa/deadlock/RSB/ClientRSB]
  35. TO SERVICE 'http://soa/deadlock/service/ClientService'
  36. WITH USER = SSBDbo,
  37. ANONYMOUS=Off
  38. GO
  39. -- Create Route
  40. CREATE ROUTE [http://soa/deadlock/route/ClientRoute]
  41. WITH SERVICE_NAME = 'http://soa/deadlock/service/ClientService',
  42. ADDRESS = 'TCP://10.211.55.3:4022';
  43. GO

Event Notification配置

Event Notification只需要在Deadlock Client Server创建即可,因为只需要在Deadlock Client上跟踪死锁事件。在为Deadlock Client 配置Service Broker章节,我们已经为Event Notification创建了队列、服务和路由。因此,在这里我们只需要创建Event Notification对象即可。方法参见如下的代码:

  1. USE DDLCenter
  2. GO
  3. -- Create Event Notification for the deadlock_graph event.
  4. IF EXISTS(
  5. SELECT * FROM sys.server_event_notifications
  6. WHERE name = 'DeadLockNotificationEvent'
  7. )
  8. BEGIN
  9. DROP EVENT NOTIFICATION DeadLockNotificationEvent
  10. ON SERVER;
  11. END
  12. GO
  13. CREATE EVENT NOTIFICATION DeadLockNotificationEvent
  14. ON SERVER
  15. WITH FAN_IN
  16. FOR DEADLOCK_GRAPH
  17. TO SERVICE
  18. 'http://soa/deadlock/service/ClientService',
  19. 'current database'
  20. GO

模拟死锁

至此为止,所有对象和准备工作已经准备完成,万事俱备只欠东风,让我们在Deadlock Client实例上模拟死锁场景。首先,我们在Test数据库下创建两个测试表,表名分别为:dbo.test_deadlock1和dbo.test_deadlock2,代码如下:

  1. IF DB_ID('Test') IS NULL
  2. CREATE DATABASE Test;
  3. GO
  4. USE Test
  5. GO
  6. -- create two test tables
  7. IF OBJECT_ID('dbo.test_deadlock1','u') IS NOT NULL
  8. DROP TABLE dbo.test_deadlock1
  9. GO
  10. CREATE TABLE dbo.test_deadlock1(
  11. id INT IDENTITY(1,1) not null PRIMARY KEY
  12. ,name VARCHAR(20) null
  13. );
  14. IF OBJECT_ID('dbo.test_deadlock2','u') IS NOT NULL
  15. DROP TABLE dbo.test_deadlock2
  16. GO
  17. CREATE TABLE dbo.test_deadlock2(
  18. id INT IDENTITY(1,1) not null PRIMARY KEY
  19. ,name VARCHAR(20) null
  20. );
  21. INSERT INTO dbo.test_deadlock1
  22. SELECT 'AA'
  23. UNION ALL
  24. SELECT 'BB';
  25. INSERT INTO dbo.test_deadlock2
  26. SELECT 'AA'
  27. UNION ALL
  28. SELECT 'BB';
  29. GO

接下来,我们使用SSMS打开一个新的连接,我们假设叫session 1,执行如下语句:

  1. --session 1
  2. USE Test
  3. GO
  4. BEGIN TRAN
  5. UPDATE dbo.test_deadlock1
  6. SET name = 'CC'
  7. WHERE id = 1
  8. ;
  9. WAITFOR DELAY '00:00:05'
  10. UPDATE dbo.test_deadlock2
  11. SET name = 'CC'
  12. WHERE id = 1
  13. ;
  14. ROLLBACK

紧接着,我们使用SSMS打开第二个连接,假设叫Session 2,执行下面的语句:

  1. --session 2
  2. USE Test
  3. GO
  4. BEGIN TRAN
  5. UPDATE dbo.test_deadlock2
  6. SET name = 'CC'
  7. WHERE id = 1
  8. ;
  9. UPDATE dbo.test_deadlock1
  10. SET name = 'CC'
  11. WHERE id = 1
  12. ;
  13. COMMIT

等待一会儿功夫以后,死锁发生,并且Session 2做为了死锁的牺牲品,我们会在Session 2的SSMS信息窗口中看到如下的死锁信息:

  1. Msg 1205, Level 13, State 51, Line 8
  2. Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

用户查询死锁信息

根据上面的模拟死锁小节,说明死锁已经真真切切的发生了,那么,死锁信息到底有没有被捕获到呢?如果终端用户想要查看和分析所有客户端的死锁信息,只需要连接Deadlock Center SQL Server,执行下面的语句:

  1. -- Run on Deadlock Center Server
  2. USE DDLCenter
  3. GO
  4. SELECT * FROM [DDLCollector].[Deadlock_Info]

由于结果集宽度太宽,人为将查询结果分两段截图,第一段结果集展示如下: 03.png

第二段结果集截图如下: 04.png

从这个结果集,我们可以清楚的看到Deadlock Client发生死锁的详细信息,包含:

  • 死锁发生的Deadlock Client实例名称:CHERISH-PC

  • 被死锁进程号60,死锁进程57号

  • 死锁相关进程的事务开始时间,最后一个Batch开始执行时间和完成时间

  • 死锁进程执行的代码和Batch语句

  • 死锁发生时锁的类型

  • 表和索引名称

  • 死锁相关进程的登录用户

…… 等等。

踩过的坑

当Deadlock Client 上SQL Server发生两次或者两次以上的Deadlock事件以后,自建的Event Notification对象(名为:DeadLockNotificationEvent)会被SQL Server系统自动删除,从而导致整个死锁收集系统无法工作。

表象

SQL Server在错误日志中会抛出如下4个错误信息:两个错误编号为17004,一个编号为17001的错误,最后是一个编号为17005错误,其中17005明确说明了,Event Notification对象被删除了。如下:

  1. Error: 17004, Severity: 16, State: 1.
  2. Event notification conversation on dialog handle '{4A6A0FBD-7A34-E711-A709-001C42099969}' closed without an error.
  3. Error: 17004, Severity: 16, State: 1.
  4. Event notification conversation on dialog handle '{476A0FBD-7A34-E711-A709-001C42099969}' closed without an error.
  5. Error: 17001, Severity: 16, State: 1.
  6. Failure to send an event notification instance of type 'DEADLOCK_GRAPH' on conversation handle '{F711A404-7934-E711-A709-001C42099969}'. Error Code = '8429'.
  7. Error: 17005, Severity: 16, State: 1.
  8. Event notification 'DeadLockNotificationEvent' in database 'master' dropped due to send time service broker errors. Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active.

错误日志截图如下: 05.png

问题分析

从错误提示信息due to send time service broker errors来看,最开始花了很长时间来排查Service Broker方面的问题,在长达数小时的问题排查无果后,静下心来仔细想想:如果是Service Broker有问题的话,我们不可能完成第一、第二条死锁信息的收集,所以问题应该与Service Broker没有直接关系。于是,注意到了错误提示信息的后半部分Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active,再次以可以成功收集两条deadlock错误信息为由,排除Contact和Service的问题可能性,所以最有可能出问题的地方猜测应该是conversation handle,继续排查与conversation handle相关操作的地方,发现存储过程[DDLCollector].[UP_ProcessDeadlockEventMsg]的中的代码:

  1. ...
  2. ELSE IF @Message_Type = 'http://schemas.microsoft.com/SQL/Notifications/EventNotification'
  3. BEGIN
  4. -- Record message log first
  5. INSERT INTO [DDLCollector].[Deadlock_Traced_Records](Processed_Msg, [Processed_Msg_CheckSum])
  6. VALUES(@Message_Body, CHECKSUM(CAST(@Message_Body as NVARCHAR(MAX))))
  7. END CONVERSATION @handle
  8. --Here call another Store Procedure to send deadlock graph info to center server
  9. EXEC [DDLCollector].[UP_SendDeadlockMsg] @Message_Body;
  10. END
  11. ...

这个逻辑分支不应该有End Conversation的操作,因为这里是与Event Notification相关的Message Type操作,而不是Service Broker相关的Message Type操作。

解决问题

问题分析清楚了,解决方法就非常简单了,注释掉这条语句END CONVERSATION @handle后,重新创建存储过程。再多次模拟死锁操作,再也没有出现Event Notification被系统自动删除的情况了,说明这个问题已经被彻底解决,坑已经被填上了。 解决问题的代码修改和注释如下截图,以此纪念下踩过的这个坑: 06.png

福利发放

以下是关于SQL Server死锁相关的系列文章,可以帮助我们全面了解、分析和解决死锁问题,其中第一个是这篇文章的视频演示。

最后总结

这篇文章是一个完整的SQL Server死锁收集系统典型案例介绍,你甚至可以很轻松简单的将这个方案应用到你的产品环境,来收集产品环境所有SQL Server实例发生死锁的详细信息,并根据该系统收集到的场景来改进和改善死锁发生的概率,从而降低死应用发生异常错误的可能性。因此这篇文章有着非常重要的现实价值和意义。