sqlskills-logo-2015-white.png

Answering Questions with the AlwaysOn Dashboard

The word “dashboard” immediately puts me into a state of suspicion. This is probably because I’ve been a part of the corporate world for 18 years now and, fair or not, the word “dashboard” gives me flashbacks of executive conversations where I’m asked to summarize complicated information into a single square – while somehow defying the laws of physics and providing all the necessary detail being asked for.

So when I heard of the “AlwaysOn Dashboard” – I was pretty sure I wouldn’t like it and that I would stick primarily with other methods.

SNAGHTML1d45817

Well – I was wrong.

While any tool could use further improvement, the AlwaysOn Dashboard is definitely something I’ll be using in the future. Hence the purpose of this post was just to show the various questions the dashboard can answer.

Where am I?

I start with this question because depending on which replica you’re launching the dashboard from, you’ll see a different view of the world. For example, the following is the first section of the dashboard when launched from the primary replica of an availability group called EMU-AG1:

SNAGHTML1d4043d

As you can see, from the primary I’m getting three sections and a view of my replicas, availability databases and more (which I’ll get in to later in this post).

Contrast this to launching the dashboard from a secondary replica:

SNAGHTML1d3bc65

You can see that I’m getting information from the perspective of just that replica. You may have no other choice in a disaster recovery scenario – and in that case, you will see even less. But the key is not to be surprised to see this subset of information if you’re launching the dashboard from a secondary instead of the primary. If you want to ensure you’re connecting to the primary, you could connect with SSMS to the Availability Group listener name instead, again assuming you’re not in an outage scenario.

What can I find out by default?

 

You saw by the earlier screen shots that we can find out quite a bit by default:

· Where is the primary replica? (called “primary instance” on the dashboard)

· What are the roles of each replica? (primary or secondary)

· What are the failover modes? (automatic or manual)

· What are the synchronization states? (for example – “not synchronizing”, “synchronizing”, “synchronized”)

· What are the availability databases?

· What are their individual availability database states?

· What are the database failover readiness states? (data loss possible, no data loss)

· Any issues at the replica scope?

· Any issues at the availability database scope?

Regarding the “issues” – let’s say one of my Windows Server Failover Cluster (WSFC) nodes is paused, and thus, one of my replicas is now offline. The issues column in the availability replicas shows the following:

SNAGHTML1d29a73

Clicking on the link-text, I see additional information in the Policy Evaluation Result on Availability Replica dialog box:

SNAGHTML1d235b9

So the theme is becoming clear that the dashboard is pointing out the summarized things I should be concerned with and then allowing me to click through to add additional information – instead of overwhelming me with information by default.

What else?

Right-click on the availability replica section and you’ll see other very useful data points:

SNAGHTML1d2ff4e

We start seeing information that merges the availability group and Windows Server Failover Cluster (WSFC) worlds – such as Quorum Votes and Member State (for example – if I pause a WSFC node, I’ll see Member State = Offline).

 

We have the same ability to add columns for the availability databases section. Right clicking the column headers reveals quite a few useful statistics (and again – I’m impressed that the UI designers exerted editorial control and didn’t try to stuff all columns into the report by default):

SNAGHTML1d16df7

Some excellent data points here – including Estimated Recovery Time, Estimated Data Loss, Synchronization Performance, Log Send Queue Size and Redo Queue Size. And while this isn’t replacing Perfmon for trending analysis – if you’re in the middle of a “things are slow now” scenario, this additional information is incredibly useful.

Where else can I go?

In the upper-right hand corner of the dashboard (I’m using the 11.0.2316 version of the SQL Server 2012, by the way), I can navigate into three different useful areas…

SNAGHTML1d134de

The “Start Failover Wizard” is as you might suspect. This wizard allows me to fail over to a secondary replica, making it the new primary replica.

The View AlwaysOn Health Events option allows you to explore Extended Events captured via the “AlwaysOn_health” session (which Jonathan has already talked about in more detail). What I love about this viewer (pictured below) is that unlike the dashboard which gives us a moment in time, with the viewer I can see what has happened recently that is noteworthy. And just like the dashboard, I can add the columns I’m interested in by right-clicking the column headers and then choosing what I need and also clicking specific rows to reveal details:

SNAGHTML1d0f5ad

Back to the main dashboard, if I click the “View Cluster Quorum Information” link, I’ll see data from a WSFC quorum perspective (nodes, member type, member state, and vote assignment):

SNAGHTML1d0c3a5

What about the T-SQL?

The information captured on the dashboard also provides us (if you snoop via Extended Events or SQL Trace) with guidance on the new catalog views and DMVs. Below I’ve capturing the commands behind the scenes – removing most of the extraneous T-SQL to show the core areas where some of this data is pulled for the dashboard. I didn’t clean up the original formatting from what I captured – so please forgive everything being in lower-case. Also – in many cases more information is pulled than displayed – and many fields are converted to a more readable format for use on the dashboard. I still find it useful to understand the origins of the data:

SNAGHTML1d08ada

select *

from master.sys.availability_groups;

  

select agstates.group_id, agstates.primary_replica

from master.sys.dm_hadr_availability_group_states as agstates;

   

select group_id, replica_id, replica_metadata_id

from master.sys.availability_replicas;

     

select replica_id, is_local, role

from master.sys.dm_hadr_availability_replica_states;

SNAGHTML1d017fc

select *

from master.sys.availability_groups;

   

select group_id, replica_id, replica_server_name,create_date,

 modify_date, endpoint_url, read_only_routing_url,

 primary_role_allow_connections, secondary_role_allow_connections,

 availability_mode,failover_mode, session_timeout, backup_priority,

 owner_sid 

 from master.sys.availability_replicas;

 select group_id, replica_id, role,operational_state,recovery_health,

 synchronization_health,connected_state, last_connect_error_number,

 last_connect_error_description, last_connect_error_timestamp

from master.sys.dm_hadr_availability_replica_states;

   

select replica_id,join_state

from master.sys.dm_hadr_availability_replica_cluster_states;

   

SELECT DISTINCT replica_server_name, node_name

FROM master.sys.dm_hadr_availability_replica_cluster_nodes AS arrc

LEFT OUTER JOIN master.sys.dm_hadr_cluster_members AS cm ON

      UPPER(arrc.node_name) = UPPER(cm.member_name) ;

select * from master.sys.dm_hadr_availability_group_states;

SNAGHTML1cfe114

select *

from master.sys.availability_groups;

   

select group_id, replica_id,replica_server_name,availability_mode 

from master.sys.availability_replicas;

   

select replica_id,group_database_id,database_name,is_database_joined,is_failover_ready 

from master.sys.dm_hadr_database_replica_cluster_states;

   

select*

from master.sys.dm_hadr_database_replica_states;

select replica_id,role,is_local

from master.sys.dm_hadr_availability_replica_states;

SNAGHTML1cf9e2b

declare @cluster_name nvarchar(128)

declare @quorum_type tinyint

declare @quorum_state tinyint

BEGINTRY

    SELECT @cluster_name = cluster_name,

        @quorum_type = quorum_type,

        @quorum_state = quorum_state

    FROM sys.dm_hadr_cluster

ENDTRY

BEGINCATCH

    IF(ERROR_NUMBER() != 297)

    BEGIN

        THROW

    END

ENDCATCH

     

SELECT

ISNULL(@cluster_name, ) AS [ClusterName],

ISNULL(@quorum_type, 4) AS [ClusterQuorumType],

ISNULL(@quorum_state, 3) AS [ClusterQuorumState]

SELECT

‘Server[@Name=’ + quotename(CAST(

        serverproperty(N’Servername’)

       AS sysname),””) + ‘]’ + ‘/ClusterMemberState[@Name=’ + quotename(cm.member_name,””) + ‘]’ AS [Urn],

cm.member_name AS [Name],

cm.member_type AS [MemberType],

ISNULL(cm.member_state, 2) AS [member_state],

ISNULL(cm.number_of_quorum_votes, 1) AS [NumberOfQuorumVotes]

FROM

sys.dm_hadr_cluster_memberscm

How often does the dashboard refresh?

First of all, you have an “on” and “off” option in the upper right-hand corner:

SNAGHTML1cf50a8

And in SSMS you can configure the refresh interval in seconds using Tools/Options/SQL Server AlwaysOn/Dashboard:

SNAGHTML1cf06ec

(Initial) Wish List

So regarding what I would like to see added or improved upon, I think this will evolve over time, but initially I’d like to see the following:

· Ability to “snapshot” the report to an external file (PDF or XML file for example). Granted, I can just take a screen shot, but it would be nice to take a quick snapshot if I’ve stumbled across some unusual or short lived conditions.

· Ability to access the full availability group metadata from the dashboard of secondary replicas. This extends beyond the dashboard and into the data available through the data sources themselves.

Not a very long list yet – and again, I’ll certainly be adding the AlwaysOn Dashboard to my “useful” list of tools. Unlike the executive dashboards I mentioned at the beginning of the post, this one is actually quite useful and strikes that rare balance of providing just enough detail while also exerting editorial restraint.

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.