The word “dashboard” immediately puts me into a state of suspicion. This is probably because I’ve been a part of the corporate world for 18 years now and, fair or not, the word “dashboard” gives me flashbacks of executive conversations where I’m asked to summarize complicated information into a single square – while somehow defying the laws of physics and providing all the necessary detail being asked for.
So when I heard of the “AlwaysOn Dashboard” – I was pretty sure I wouldn’t like it and that I would stick primarily with other methods.
Well – I was wrong.
While any tool could use further improvement, the AlwaysOn Dashboard is definitely something I’ll be using in the future. Hence the purpose of this post was just to show the various questions the dashboard can answer.
Where am I?
I start with this question because depending on which replica you’re launching the dashboard from, you’ll see a different view of the world. For example, the following is the first section of the dashboard when launched from the primary replica of an availability group called EMU-AG1:
As you can see, from the primary I’m getting three sections and a view of my replicas, availability databases and more (which I’ll get in to later in this post).
Contrast this to launching the dashboard from a secondary replica:
You can see that I’m getting information from the perspective of just that replica. You may have no other choice in a disaster recovery scenario – and in that case, you will see even less. But the key is not to be surprised to see this subset of information if you’re launching the dashboard from a secondary instead of the primary. If you want to ensure you’re connecting to the primary, you could connect with SSMS to the Availability Group listener name instead, again assuming you’re not in an outage scenario.
What can I find out by default?
You saw by the earlier screen shots that we can find out quite a bit by default:
· Where is the primary replica? (called “primary instance” on the dashboard)
· What are the roles of each replica? (primary or secondary)
· What are the failover modes? (automatic or manual)
· What are the synchronization states? (for example – “not synchronizing”, “synchronizing”, “synchronized”)
· What are the availability databases?
· What are their individual availability database states?
· What are the database failover readiness states? (data loss possible, no data loss)
· Any issues at the replica scope?
· Any issues at the availability database scope?
Regarding the “issues” – let’s say one of my Windows Server Failover Cluster (WSFC) nodes is paused, and thus, one of my replicas is now offline. The issues column in the availability replicas shows the following:
Clicking on the link-text, I see additional information in the Policy Evaluation Result on Availability Replica dialog box:
So the theme is becoming clear that the dashboard is pointing out the summarized things I should be concerned with and then allowing me to click through to add additional information – instead of overwhelming me with information by default.
What else?
Right-click on the availability replica section and you’ll see other very useful data points:
We start seeing information that merges the availability group and Windows Server Failover Cluster (WSFC) worlds – such as Quorum Votes and Member State (for example – if I pause a WSFC node, I’ll see Member State = Offline).
We have the same ability to add columns for the availability databases section. Right clicking the column headers reveals quite a few useful statistics (and again – I’m impressed that the UI designers exerted editorial control and didn’t try to stuff all columns into the report by default):
Some excellent data points here – including Estimated Recovery Time, Estimated Data Loss, Synchronization Performance, Log Send Queue Size and Redo Queue Size. And while this isn’t replacing Perfmon for trending analysis – if you’re in the middle of a “things are slow now” scenario, this additional information is incredibly useful.
Where else can I go?
In the upper-right hand corner of the dashboard (I’m using the 11.0.2316 version of the SQL Server 2012, by the way), I can navigate into three different useful areas…
The “Start Failover Wizard” is as you might suspect. This wizard allows me to fail over to a secondary replica, making it the new primary replica.
The View AlwaysOn Health Events option allows you to explore Extended Events captured via the “AlwaysOn_health” session (which Jonathan has already talked about in more detail). What I love about this viewer (pictured below) is that unlike the dashboard which gives us a moment in time, with the viewer I can see what has happened recently that is noteworthy. And just like the dashboard, I can add the columns I’m interested in by right-clicking the column headers and then choosing what I need and also clicking specific rows to reveal details:
Back to the main dashboard, if I click the “View Cluster Quorum Information” link, I’ll see data from a WSFC quorum perspective (nodes, member type, member state, and vote assignment):
What about the T-SQL?
The information captured on the dashboard also provides us (if you snoop via Extended Events or SQL Trace) with guidance on the new catalog views and DMVs. Below I’ve capturing the commands behind the scenes – removing most of the extraneous T-SQL to show the core areas where some of this data is pulled for the dashboard. I didn’t clean up the original formatting from what I captured – so please forgive everything being in lower-case. Also – in many cases more information is pulled than displayed – and many fields are converted to a more readable format for use on the dashboard. I still find it useful to understand the origins of the data:
select *
from master.sys.availability_groups;
select agstates.group_id, agstates.primary_replica
from master.sys.dm_hadr_availability_group_states as agstates;
select group_id, replica_id, replica_metadata_id
from master.sys.availability_replicas;
select replica_id, is_local, role
from master.sys.dm_hadr_availability_replica_states;
select *
from master.sys.availability_groups;
select group_id, replica_id, replica_server_name,create_date,
modify_date, endpoint_url, read_only_routing_url,
primary_role_allow_connections, secondary_role_allow_connections,
availability_mode,failover_mode, session_timeout, backup_priority,
owner_sid
from master.sys.availability_replicas;
select group_id, replica_id, role,operational_state,recovery_health,
synchronization_health,connected_state, last_connect_error_number,
last_connect_error_description, last_connect_error_timestamp
from master.sys.dm_hadr_availability_replica_states;
select replica_id,join_state
from master.sys.dm_hadr_availability_replica_cluster_states;
SELECT DISTINCT replica_server_name, node_name
FROM master.sys.dm_hadr_availability_replica_cluster_nodes AS arrc
LEFT OUTER JOIN master.sys.dm_hadr_cluster_members AS cm ON
UPPER(arrc.node_name) = UPPER(cm.member_name) ;
select * from master.sys.dm_hadr_availability_group_states;
select *
from master.sys.availability_groups;
select group_id, replica_id,replica_server_name,availability_mode
from master.sys.availability_replicas;
select replica_id,group_database_id,database_name,is_database_joined,is_failover_ready
from master.sys.dm_hadr_database_replica_cluster_states;
select*
from master.sys.dm_hadr_database_replica_states;
select replica_id,role,is_local
from master.sys.dm_hadr_availability_replica_states;
declare @cluster_name nvarchar(128)
declare @quorum_type tinyint
declare @quorum_state tinyint
BEGINTRY
SELECT @cluster_name = cluster_name,
@quorum_type = quorum_type,
@quorum_state = quorum_state
FROM sys.dm_hadr_cluster
ENDTRY
BEGINCATCH
IF(ERROR_NUMBER() != 297)
BEGIN
THROW
END
ENDCATCH
SELECT
ISNULL(@cluster_name, ”) AS [ClusterName],
ISNULL(@quorum_type, 4) AS [ClusterQuorumType],
ISNULL(@quorum_state, 3) AS [ClusterQuorumState]
SELECT
‘Server[@Name=’ + quotename(CAST(
serverproperty(N’Servername’)
AS sysname),””) + ‘]’ + ‘/ClusterMemberState[@Name=’ + quotename(cm.member_name,””) + ‘]’ AS [Urn],
cm.member_name AS [Name],
cm.member_type AS [MemberType],
ISNULL(cm.member_state, 2) AS [member_state],
ISNULL(cm.number_of_quorum_votes, –1) AS [NumberOfQuorumVotes]
FROM
sys.dm_hadr_cluster_memberscm
How often does the dashboard refresh?
First of all, you have an “on” and “off” option in the upper right-hand corner:
And in SSMS you can configure the refresh interval in seconds using Tools/Options/SQL Server AlwaysOn/Dashboard:
(Initial) Wish List
So regarding what I would like to see added or improved upon, I think this will evolve over time, but initially I’d like to see the following:
· Ability to “snapshot” the report to an external file (PDF or XML file for example). Granted, I can just take a screen shot, but it would be nice to take a quick snapshot if I’ve stumbled across some unusual or short lived conditions.
· Ability to access the full availability group metadata from the dashboard of secondary replicas. This extends beyond the dashboard and into the data available through the data sources themselves.
Not a very long list yet – and again, I’ll certainly be adding the AlwaysOn Dashboard to my “useful” list of tools. Unlike the executive dashboards I mentioned at the beginning of the post, this one is actually quite useful and strikes that rare balance of providing just enough detail while also exerting editorial restraint.