Waits and queues with a practical approach

Earlier I have been writting about Waits and queues and how to use that approach to find out about performance issues. Waits and queues does not give you the answers, but it guides you to the area(s) where there might be a problem.  You can read my post about waits and queues here.

You must have read and understood my blog about waits and queues before you continue with this blog.

With this post I want to show you how to setup an environment where you can see changes in waits and queues and you can do some analysis. If time permits, then I will post a blog later about how to dive into the waits and see what caused the wait.

What I want to achieve is a database where I capture the accumulated waits and queues every hour (half hour or every day, depending on what you prefer).

So, what I need is:

  1. Setup a database.
  2. Create a table to hold the data.
  3. Create a stored procedure to collect the data.
  4. Setup a SQL Server Agent job to automate the collection.
  5. Create a view to see the capture times.
  6. Create a stored procedure to report the data

Before we start, then note that the script to show the waits is done by Paul Randal, www.sqlskills.com.

1. Setup a database

First we must setup the database, and I suggest that you do the following (remember to change physical path):

USE [master]
GO

CREATE DATABASE [Logging] ON  PRIMARY 
( NAME = N'Logging', FILENAME = N'd:\SQL\Data\Logging.mdf' 
         , SIZE = 524288KB , MAXSIZE = UNLIMITED, FILEGROWTH = 131072KB )
 LOG ON 
( NAME = N'Logging_log', FILENAME = N'e:\SQL\LOG\Logging_log.ldf' 
      , SIZE = 262144KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

2. Setup a table to hold the data

Now we must have a table to store the data that we collect every hour, and I would recommend the following table. It can be optimized, so this is a basic table:

USE [Logging]
GO

CREATE TABLE [dbo].[wait_statistics](
    [CaptureTime] [datetime] NOT NULL,
    [WaitType] [nvarchar](60) NOT NULL,
    [Wait_S] [decimal](14, 2) NULL,
    [Resource_S] [decimal](14, 2) NULL,
    [Signal_S] [decimal](14, 2) NULL,
    [WaitCount] [bigint] NOT NULL,
    [Percentage] [decimal](4, 2) NULL,
    [AvgWait_S] [decimal](14, 4) NULL,
    [AvgRes_S] [decimal](14, 4) NULL,
    [AvgSig_S] [decimal](14, 4) NULL
) ON [PRIMARY]

3. Create a stored procedure to collect the data

So, now we are ready to fill in some data, and for that we use a stored procedure. The reason for that, is that the wait types we are filtering (that we don’t want to see) changes from version to version, and therefore it would be nice to have a stored procedure where we can easily change what to filter and not.

USE [Logging]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE procedure [dbo].[collect]
as
WITH Waits AS
    (SELECT
        wait_type,
        wait_time_ms / 1000.0 AS WaitS,
        (wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
        signal_wait_time_ms / 1000.0 AS SignalS,
        waiting_tasks_count AS WaitCount,
        100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,
        ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (
        'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK',
        'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE',
        'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT'
        , 'BROKER_TO_FLUSH',
        'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT'
        , 'DISPATCHER_QUEUE_SEMAPHORE',
        'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN'
        , 'BROKER_EVENTHANDLER',
        'TRACEWRITE', 'FT_IFTSHC_MUTEX', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        'BROKER_RECEIVE_WAITFOR', 'ONDEMAND_TASK_QUEUE', 'DBMIRROR_EVENTS_QUEUE',
        'DBMIRRORING_CMD', 'BROKER_TRANSMITTER', 'SQLTRACE_WAIT_ENTRIES',
        'SLEEP_BPOOL_FLUSH', 'SQLTRACE_LOCK')
    )
insert into wait_statistics
SELECT
    getdate() as CaptureTime
    ,W1.wait_type AS WaitType,
    CAST (W1.WaitS AS DECIMAL(14, 2)) AS Wait_S,
    CAST (W1.ResourceS AS DECIMAL(14, 2)) AS Resource_S,
    CAST (W1.SignalS AS DECIMAL(14, 2)) AS Signal_S,
    W1.WaitCount AS WaitCount,
    CAST (W1.Percentage AS DECIMAL(4, 2)) AS Percentage,
    case when w1.WaitCount<>0 then 
        CAST ((W1.WaitS / W1.WaitCount) AS DECIMAL (14, 4)) 
    else
        0
    end AS AvgWait_S,
    case when w1.WaitCount<>0 then 
        CAST ((W1.ResourceS / W1.WaitCount) AS DECIMAL (14, 4)) 
    else
        0
    end AS AvgRes_S,
    case when w1.WaitCount<>0 then 
        CAST ((W1.SignalS / W1.WaitCount) AS DECIMAL (14, 4)) 
    else
        0
    end AS AvgSig_S
FROM Waits AS W1
    INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum
GROUP BY W1.RowNum, W1.wait_type, W1.WaitS, W1.ResourceS, W1.SignalS
        , W1.WaitCount, W1.Percentage

GO

It is very important to know what version of SQL Server you are using, because then you can decide which wait types you are going to exclude in the line WHERE WAIT_TYPE NOT IN (xxx).

4. Setup a SQL Server Agent Job to automate collection

Now when we have a stored procedure to collect the data it is very easy to setup a sql server agent job to automate collection.

USE [msdb]
GO
DECLARE @jobId BINARY(16)
EXEC  msdb.dbo.sp_add_job @job_name=N'Collection', 
        @enabled=1, 
        @notify_level_eventlog=0, 
        @notify_level_email=2, 
        @notify_level_netsend=2, 
        @notify_level_page=2, 
        @delete_level=0, 
        @description=N'This job collects wait statistics.', 
        @category_name=N'Database Maintenance', 
        @owner_login_name=N'YOURUSERID', @job_id = @jobId OUTPUT
select @jobId
GO
EXEC msdb.dbo.sp_add_jobserver @job_name=N'Collection', @server_name = N'YOURSERVERNAVN'
GO
USE [msdb]
GO
EXEC msdb.dbo.sp_add_jobstep @job_name=N'Collection', @step_name=N'Wait_statistics', 
        @step_id=1, 
        @cmdexec_success_code=0, 
        @on_success_action=1, 
        @on_fail_action=2, 
        @retry_attempts=0, 
        @retry_interval=0, 
        @os_run_priority=0, @subsystem=N'TSQL', 
        @command=N'exec collect', 
        @database_name=N'Logging', 
        @flags=0
GO
USE [msdb]
GO
EXEC msdb.dbo.sp_update_job @job_name=N'Collection', 
        @enabled=1, 
        @start_step_id=1, 
        @notify_level_eventlog=0, 
        @notify_level_email=2, 
        @notify_level_netsend=2, 
        @notify_level_page=2, 
        @delete_level=0, 
        @description=N'This job collects wait statistics.', 
        @category_name=N'Database Maintenance', 
        @owner_login_name=N'YOURUSER', 
        @notify_email_operator_name=N'', 
        @notify_netsend_operator_name=N'', 
        @notify_page_operator_name=N''
GO
USE [msdb]
GO
DECLARE @schedule_id int
EXEC msdb.dbo.sp_add_jobschedule @job_name=N'Collection', @name=N'EveryHour', 
        @enabled=1, 
        @freq_type=8, 
        @freq_interval=1, 
        @freq_subday_type=8, 
        @freq_subday_interval=1, 
        @freq_relative_interval=0, 
        @freq_recurrence_factor=1, 
        @active_start_date=20130312, 
        @active_end_date=99991231, 
        @active_start_time=0, 
        @active_end_time=235959, @schedule_id = @schedule_id OUTPUT
select @schedule_id
GO

5. Create a view to see the capture times

So.. Let’s see if we can create a view that can show us all capture times, because we need them as inputparameters for the stored procedure that will return the figures.

use Logging
go

create view CaptureTimes
as
select distinct capturetime
from wait_statistics

6. Create a stored procedure to report the data

So, now we only need to create a stored procedure to capture the data. That is very easy, and when that is done, you can start monitoring the waits and queues.

USE [logging]
GO

/****** Object:  StoredProcedure [dbo].[InternalDezision_reports_waits_output]    Script Date: 03/12/2013 09:50:34 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE procedure [dbo].[ReportWaits]

@endtime        datetime, 
@starttime        datetime
as

begin try
    drop table #tmptable
end try
begin catch
end catch

select WaitType
        , convert(numeric(38,10), Wait_S - isnull((select Wait_s 
from wait_statistics i1 where t1.WaitType=i1.WaitType 
and i1.CaptureTime=@starttime),0)) as Wait_s
        , Resource_S - isnull((select Resource_s from wait_statistics i1 
where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as Resource_s
        , Signal_S - isnull((select Signal_s from wait_statistics i1 
where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as signal_s
        , WaitCount - isnull((select WaitCount from wait_statistics i1 
where t1.WaitType=i1.WaitType and i1.CaptureTime=@starttime),0) as waitcount
        , convert(numeric(38,10), 0.00) as percentage
        , convert(numeric(38,10), 0) as avg_Wait_S
        , convert(numeric(38,10), 0) as avg_Resource_S
        , convert(numeric(38,10), 0) as avg_signal_s
into #tmptable
from wait_statistics t1
where CaptureTime=@endtime
order by CaptureTime desc

-- Lets delete
delete #tmptable 
where waitcount=0

-- Let's find the wait
declare @wait_s numeric(38,10)
select @wait_s=SUM(wait_s) from #tmptable

update #tmptable
set percentage=wait_s/@wait_s *100
    , avg_Wait_S=wait_s/waitcount
    , avg_resource_S=resource_s/waitcount
    , avg_signal_S=signal_s/waitcount

select *
        , @starttime as starttime
        , @endtime as endtime
        , datediff(s, @starttime, @endtime) as PeriodInSeconds
        , 100*wait_s/datediff(s, @starttime, @endtime) as ShareOfPeriod
from #tmptable
order by percentage desc

7. Monitoring – An example

Let’s see how it works:

First you try to find two dates you can compare, and for that you simply write:

select *
from capturetimes

The result in my case will be:

image

So I have two capturetimes to compare and therefore I can write:

USE [Logging]
GO

EXEC    @return_value = [dbo].[ReportWaits]
        @endtime = N'2013-03-12 09:49:19.373',
        @starttime = N'2013-03-12 09:41:59.793'

In my case (it is a very small timeperiod), the result is something like this:

image

As you can see, then on my server it is BackupIO that has caused 97.4877% of the waits.

What is missing in a lot of other blogs you are reading, is the column ShareOfPeriod. Here you can see that BackupIO has caused 98.865909, but the main thing is: With this column, you can see how long time of the capturetime we have been waiting. There could be a lot of theoretical discussions about this figure, and if it is good or bad, if we can use this column when we have multiple processor and so on.

BUT!!! It is better than just knowing: It is from the last restart or DBCC. When we collect each hour, we get a lot of flexibility to narrowing down where the problem is.

You can do a lot of optimization and make it easier to run the stored procedure without having to find the capturetimes, but… there should also be something for you to do.

One comment

  1. USE [Logging]
    GO

    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE procedure [dbo].[collect]
    as
    WITH Waits AS
    (SELECT
    wait_type,
    wait_time_ms / 1000.0 AS WaitS,
    (wait_time_ms – signal_wait_time_ms) / 1000.0 AS ResourceS,
    signal_wait_time_ms / 1000.0 AS SignalS,
    waiting_tasks_count AS WaitCount,
    100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,
    ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (
    ‘CLR_SEMAPHORE’, ‘LAZYWRITER_SLEEP’, ‘RESOURCE_QUEUE’, ‘SLEEP_TASK’,
    ‘SLEEP_SYSTEMTASK’, ‘SQLTRACE_BUFFER_FLUSH’, ‘WAITFOR’, ‘LOGMGR_QUEUE’,
    ‘CHECKPOINT_QUEUE’, ‘REQUEST_FOR_DEADLOCK_SEARCH’, ‘XE_TIMER_EVENT’
    , ‘BROKER_TO_FLUSH’,
    ‘BROKER_TASK_STOP’, ‘CLR_MANUAL_EVENT’, ‘CLR_AUTO_EVENT’
    , ‘DISPATCHER_QUEUE_SEMAPHORE’,
    ‘FT_IFTS_SCHEDULER_IDLE_WAIT’, ‘XE_DISPATCHER_WAIT’, ‘XE_DISPATCHER_JOIN’
    , ‘BROKER_EVENTHANDLER’,
    ‘TRACEWRITE’, ‘FT_IFTSHC_MUTEX’, ‘SQLTRACE_INCREMENTAL_FLUSH_SLEEP’,
    ‘BROKER_RECEIVE_WAITFOR’, ‘ONDEMAND_TASK_QUEUE’, ‘DBMIRROR_EVENTS_QUEUE’,
    ‘DBMIRRORING_CMD’, ‘BROKER_TRANSMITTER’, ‘SQLTRACE_WAIT_ENTRIES’,
    ‘SLEEP_BPOOL_FLUSH’, ‘SQLTRACE_LOCK’)
    )
    insert into wait_statistics
    SELECT
    getdate() as CaptureTime
    ,W1.wait_type AS WaitType,
    max(CAST (W1.WaitS AS DECIMAL(14, 2))) AS Wait_S,
    max(CAST (W1.ResourceS AS DECIMAL(14, 2))) AS Resource_S,
    max(CAST (W1.SignalS AS DECIMAL(14, 2))) AS Signal_S,
    max(W1.WaitCount) AS WaitCount,
    max(CAST (W1.Percentage AS DECIMAL(4, 2))) AS Percentage,
    max(case when w1.WaitCount0 then
    CAST ((W1.WaitS / W1.WaitCount) AS DECIMAL (14, 4))
    else
    0
    end) AS AvgWait_S,
    max(case when w1.WaitCount0 then
    CAST ((W1.ResourceS / W1.WaitCount) AS DECIMAL (14, 4))
    else
    0
    end) AS AvgRes_S,
    max(case when w1.WaitCount0 then
    CAST ((W1.SignalS / W1.WaitCount) AS DECIMAL (14, 4))
    else
    0
    end) AS AvgSig_S
    FROM Waits AS W1
    INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum
    GROUP BY W1.wait_type

    GO

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s