Earlier I have been writting about Waits and queues and how to use that approach to find out about performance issues. Waits and queues does not give you the answers, but it guides you to the area(s) where there might be a problem. You can read my post about waits and queues here.
You must have read and understood my blog about waits and queues before you continue with this blog.
With this post I want to show you how to setup an environment where you can see changes in waits and queues and you can do some analysis. If time permits, then I will post a blog later about how to dive into the waits and see what caused the wait.
What I want to achieve is a database where I capture the accumulated waits and queues every hour (half hour or every day, depending on what you prefer).
So, what I need is:
- Setup a database.
- Create a table to hold the data.
- Create a stored procedure to collect the data.
- Setup a SQL Server Agent job to automate the collection.
- Create a view to see the capture times.
- Create a stored procedure to report the data
Before we start, then note that the script to show the waits is done by Paul Randal, www.sqlskills.com.
1. Setup a database
First we must setup the database, and I suggest that you do the following (remember to change physical path):
USE [master] GO CREATE DATABASE [Logging] ON PRIMARY ( NAME = N'Logging', FILENAME = N'd:\SQL\Data\Logging.mdf' , SIZE = 524288KB , MAXSIZE = UNLIMITED, FILEGROWTH = 131072KB ) LOG ON ( NAME = N'Logging_log', FILENAME = N'e:\SQL\LOG\Logging_log.ldf' , SIZE = 262144KB , MAXSIZE = 2048GB , FILEGROWTH = 10%) GO
2. Setup a table to hold the data
Now we must have a table to store the data that we collect every hour, and I would recommend the following table. It can be optimized, so this is a basic table:
USE [Logging] GO CREATE TABLE [dbo].[wait_statistics]( [CaptureTime] [datetime] NOT NULL, [WaitType] [nvarchar](60) NOT NULL, [Wait_S] [decimal](14, 2) NULL, [Resource_S] [decimal](14, 2) NULL, [Signal_S] [decimal](14, 2) NULL, [WaitCount] [bigint] NOT NULL, [Percentage] [decimal](4, 2) NULL, [AvgWait_S] [decimal](14, 4) NULL, [AvgRes_S] [decimal](14, 4) NULL, [AvgSig_S] [decimal](14, 4) NULL ) ON [PRIMARY]
3. Create a stored procedure to collect the data
So, now we are ready to fill in some data, and for that we use a stored procedure. The reason for that, is that the wait types we are filtering (that we don’t want to see) changes from version to version, and therefore it would be nice to have a stored procedure where we can easily change what to filter and not.
USE [Logging] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE procedure [dbo].[collect] as WITH Waits AS (SELECT wait_type, wait_time_ms / 1000.0 AS WaitS, (wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS, signal_wait_time_ms / 1000.0 AS SignalS, waiting_tasks_count AS WaitCount, 100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage, ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum FROM sys.dm_os_wait_stats WHERE wait_type NOT IN ( 'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK', 'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE', 'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT' , 'BROKER_TO_FLUSH', 'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT' , 'DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN' , 'BROKER_EVENTHANDLER', 'TRACEWRITE', 'FT_IFTSHC_MUTEX', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'BROKER_RECEIVE_WAITFOR', 'ONDEMAND_TASK_QUEUE', 'DBMIRROR_EVENTS_QUEUE', 'DBMIRRORING_CMD', 'BROKER_TRANSMITTER', 'SQLTRACE_WAIT_ENTRIES', 'SLEEP_BPOOL_FLUSH', 'SQLTRACE_LOCK') ) insert into wait_statistics SELECT getdate() as CaptureTime ,W1.wait_type AS WaitType, CAST (W1.WaitS AS DECIMAL(14, 2)) AS Wait_S, CAST (W1.ResourceS AS DECIMAL(14, 2)) AS Resource_S, CAST (W1.SignalS AS DECIMAL(14, 2)) AS Signal_S, W1.WaitCount AS WaitCount, CAST (W1.Percentage AS DECIMAL(4, 2)) AS Percentage, case when w1.WaitCount<>0 then CAST ((W1.WaitS / W1.WaitCount) AS DECIMAL (14, 4)) else 0 end AS AvgWait_S, case when w1.WaitCount<>0 then CAST ((W1.ResourceS / W1.WaitCount) AS DECIMAL (14, 4)) else 0 end AS AvgRes_S, case when w1.WaitCount<>0 then CAST ((W1.SignalS / W1.WaitCount) AS DECIMAL (14, 4)) else 0 end AS AvgSig_S FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum GROUP BY W1.RowNum, W1.wait_type, W1.WaitS, W1.ResourceS, W1.SignalS , W1.WaitCount, W1.Percentage GO
It is very important to know what version of SQL Server you are using, because then you can decide which wait types you are going to exclude in the line WHERE WAIT_TYPE NOT IN (xxx).
4. Setup a SQL Server Agent Job to automate collection
Now when we have a stored procedure to collect the data it is very easy to setup a sql server agent job to automate collection.
USE [msdb] GO DECLARE @jobId BINARY(16) EXEC msdb.dbo.sp_add_job @job_name=N'Collection', @enabled=1, @notify_level_eventlog=0, @notify_level_email=2, @notify_level_netsend=2, @notify_level_page=2, @delete_level=0, @description=N'This job collects wait statistics.', @category_name=N'Database Maintenance', @owner_login_name=N'YOURUSERID', @job_id = @jobId OUTPUT select @jobId GO EXEC msdb.dbo.sp_add_jobserver @job_name=N'Collection', @server_name = N'YOURSERVERNAVN' GO USE [msdb] GO EXEC msdb.dbo.sp_add_jobstep @job_name=N'Collection', @step_name=N'Wait_statistics', @step_id=1, @cmdexec_success_code=0, @on_success_action=1, @on_fail_action=2, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'exec collect', @database_name=N'Logging', @flags=0 GO USE [msdb] GO EXEC msdb.dbo.sp_update_job @job_name=N'Collection', @enabled=1, @start_step_id=1, @notify_level_eventlog=0, @notify_level_email=2, @notify_level_netsend=2, @notify_level_page=2, @delete_level=0, @description=N'This job collects wait statistics.', @category_name=N'Database Maintenance', @owner_login_name=N'YOURUSER', @notify_email_operator_name=N'', @notify_netsend_operator_name=N'', @notify_page_operator_name=N'' GO USE [msdb] GO DECLARE @schedule_id int EXEC msdb.dbo.sp_add_jobschedule @job_name=N'Collection', @name=N'EveryHour', @enabled=1, @freq_type=8, @freq_interval=1, @freq_subday_type=8, @freq_subday_interval=1, @freq_relative_interval=0, @freq_recurrence_factor=1, @active_start_date=20130312, @active_end_date=99991231, @active_start_time=0, @active_end_time=235959, @schedule_id = @schedule_id OUTPUT select @schedule_id GO
5. Create a view to see the capture times
So.. Let’s see if we can create a view that can show us all capture times, because we need them as inputparameters for the stored procedure that will return the figures.
use Logging go create view CaptureTimes as select distinct capturetime from wait_statistics
6. Create a stored procedure to report the data
So, now we only need to create a stored procedure to capture the data. That is very easy, and when that is done, you can start monitoring the waits and queues.
USE [logging] GO /****** Object: StoredProcedure [dbo].[InternalDezision_reports_waits_output] Script Date: 03/12/2013 09:50:34 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE procedure [dbo].[ReportWaits] @endtime datetime, @starttime datetime as begin try drop table #tmptable end try begin catch end catch select WaitType , convert(numeric(38,10), Wait_S - isnull((select Wait_s from wait_statistics i1 where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0)) as Wait_s , Resource_S - isnull((select Resource_s from wait_statistics i1 where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as Resource_s , Signal_S - isnull((select Signal_s from wait_statistics i1 where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as signal_s , WaitCount - isnull((select WaitCount from wait_statistics i1 where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as waitcount , convert(numeric(38,10), 0.00) as percentage , convert(numeric(38,10), 0) as avg_Wait_S , convert(numeric(38,10), 0) as avg_Resource_S , convert(numeric(38,10), 0) as avg_signal_s into #tmptable from wait_statistics t1 where CaptureTime=@endtime order by CaptureTime desc -- Lets delete delete #tmptable where waitcount=0 -- Let's find the wait declare @wait_s numeric(38,10) select @wait_s=SUM(wait_s) from #tmptable update #tmptable set percentage=wait_s/@wait_s *100 , avg_Wait_S=wait_s/waitcount , avg_resource_S=resource_s/waitcount , avg_signal_S=signal_s/waitcount select * , @starttime as starttime , @endtime as endtime , datediff(s, @starttime, @endtime) as PeriodInSeconds , 100*wait_s/datediff(s, @starttime, @endtime) as ShareOfPeriod from #tmptable order by percentage desc
7. Monitoring – An example
Let’s see how it works:
First you try to find two dates you can compare, and for that you simply write:
select * from capturetimes
The result in my case will be:
So I have two capturetimes to compare and therefore I can write:
USE [Logging] GO EXEC @return_value = [dbo].[ReportWaits] @endtime = N'2013-03-12 09:49:19.373', @starttime = N'2013-03-12 09:41:59.793'
In my case (it is a very small timeperiod), the result is something like this:
As you can see, then on my server it is BackupIO that has caused 97.4877% of the waits.
What is missing in a lot of other blogs you are reading, is the column ShareOfPeriod. Here you can see that BackupIO has caused 98.865909, but the main thing is: With this column, you can see how long time of the capturetime we have been waiting. There could be a lot of theoretical discussions about this figure, and if it is good or bad, if we can use this column when we have multiple processor and so on.
BUT!!! It is better than just knowing: It is from the last restart or DBCC. When we collect each hour, we get a lot of flexibility to narrowing down where the problem is.
You can do a lot of optimization and make it easier to run the stored procedure without having to find the capturetimes, but… there should also be something for you to do.