Alwayson Availability Group Replica is Not Synchronizing (Error: 976, Severity: 14, State: 1.)

sqlserver2016

You might lose communication between the primary and secondary replicas.

As known, there is a policy on SQL Server to check the data synchronization state of the database replica. The policy is in an unhealthy state when the data synchronization state is NOT SYNCHRONIZING or the state is not SYNCHRONIZED for the synchronous-commit database replica.

After investigating the sql server errorlog, I got the following msg:

The target database, ‘YourDatabase’, is participating in an availability group and is currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled for read access. To allow read-only access to this and other databases in the availability group, enable read access to one or more secondary availability replicas in the group.  For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online. (Microsoft SQL Server, Error: 976)

According to Technet, this issue can be caused by the following:

  • The availability replica might be disconnected.
  • The data movement might be suspended.
  • The database might not be accessible.
  • There might be a temporary delay issue due to network latency or the load on the primary or secondary replica.

Further information: https://technet.microsoft.com/en-us/library/hh245199.aspx

Resolution:
You can fix it, using the following T-SQL command to force the resume synchronization:

ALTER DATABASE YourDatabase SET HADR RESUME;

You can also follow the link: How to resume an AG database.

Check out these related tips:

How to monitor Timeout and Changes Roles from Availability Groups with Alerts
How to Monitor AlwaysON – Primary / Secondary Replicas

 

How to monitor Timeout and Changes Roles from Availability Groups with Alerts

tech-coffee-sql-server2

Agent alert can send alerts by email when occur some action/timeout between databases in your Availability Group Environment. It allows you to become proactive.

The following script shows how to create these alerts.

EXEC msdb .dbo. sp_add_operator @name =N'AlertTeam',
               @enabled=1 ,
               @weekday_pager_start_time=90000 ,
               @weekday_pager_end_time=180000 ,
               @saturday_pager_start_time=90000 ,
               @saturday_pager_end_time=180000 ,
               @sunday_pager_start_time=90000 ,
               @sunday_pager_end_time=180000 ,
               @pager_days=0 ,
               @email_address=N'email@email.com.br' ,
               @category_name=N'[Uncategorized]'
GO

-- 1480 - AG Role Change (failover)
EXEC msdb .dbo. sp_add_alert
        @name = N'AG Role Change',
        @message_id = 1480,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 0,
    @include_event_description_in = 1;
GO
EXEC msdb .dbo. sp_add_notification
        @alert_name = N'AG Role Change',
        @operator_name = N'AlertTeam',
        @notification_method = 1;
GO

-- 35264 - AG Data Movement - Resumed
EXEC msdb .dbo. sp_add_alert
        @name = N'AG Data Movement - Suspended',
        @message_id = 35264,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 0,
    @include_event_description_in = 1;
GO
EXEC msdb .dbo. sp_add_notification
        @alert_name = N'AG Data Movement - Suspended',
        @operator_name = N'AlertTeam',
        @notification_method = 1;
GO

-- 35265 - AG Data Movement - Resumed
EXEC msdb .dbo. sp_add_alert
        @name = N'AG Data Movement - Resumed',
        @message_id = 35265,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 0,
    @include_event_description_in = 1;
GO
EXEC msdb .dbo. sp_add_notification
        @alert_name = N'AG Data Movement - Resumed',
        @operator_name = N'AlertTeam',
        @notification_method = 1;
GO

-- 35206 - AG Timeout to Secondary Replica
EXEC msdb .dbo . sp_add_alert
        @name = N'AG Timeout to Secondary Replica',
        @message_id = 35206,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 0,
    @include_event_description_in = 1;
GO
EXEC msdb .dbo . sp_add_notification
        @alert_name = N'AG Timeout to Secondary Replica',
        @operator_name = N'AlertTeam',
        @notification_method = 1;
GO
-- 35202 - AG Timeout to Secondary Replica
EXEC msdb .dbo . sp_add_alert
        @name = N'AG Connection has been successfully established',
        @message_id = 35202,
    @severity = 0,
    @enabled = 1,
    @delay_between_responses = 0,
    @include_event_description_in = 1;
GO
EXEC msdb .dbo . sp_add_notification
        @alert_name = N'AG Connection has been successfully established',
        @operator_name = N'AlertTeam',
        @notification_method = 1;

You can also configure to execute another SQL Agent job to perform your required action.
I recommend you to execute a Job with the Stored Procedure inside the link below to show how is the synchronism status.
https://danilocbraga.wordpress.com/2014/10/20/how-to-monitor-alwayson-primary-secondary-replicas/

I hope this post can help

How to validate Quorum Vote from Windows server Failover Cluster (AlwaysOn)

If you have configured a Windows server Failover Cluster with more than two Nodes, you must care about the “votes” in each Node.

Let’s imagine the following scenario:

You have two Nodes configured in Automatic Failover/Synchronous mode with AlwaysOn Availability Group.
Another Node in Manual Failover/Asynchronous as a Reporting Service.

So, you have 3 Nodes where each one is counting one vote, and another one as a FileShare. (Quorum configured during the Alwayson installation).

You should always have an odd number of quorum votes in your WSFC cluster. For the purposes of quorum voting, SQL Server does not have to be installed on all nodes in the cluster. An additional server can act as a quorum member, or the WSFC quorum model can be configured to use a remote file share as a tie-breaker.

For more information, see: WSFC Quorum Modes and Voting Configuration (SQL Server)

In other words, if your FileShare and Reporting Service are not available, your production environment will go down. It’s not a good idea, right?

To fix it, you must Install the hotfix KB2494036 in each Node.
This hotfix is available to allow you to configure a cluster node that does not have quorum vote in Windows Server 2008 and 2008 R2.

Now, you are able to remove the quorum vote from Reporting Services environment with the script below on cmd:

Cluster.exe . node SERVERNAME /prop NodeWeight=0

To validate Quorum vote, take a look at the following scripts

Management Studio

SELECT  member_name, member_state_desc, number_of_quorum_votes
 FROM   sys.dm_hadr_cluster_members;

PowerShell

Import-Module FailoverClusters
$cluster = "CLUSTERNAME"
$nodes = Get-ClusterNode -Cluster $cluster
$nodes | Format-Table -property NodeName, State, NodeWeight

I hope this could help you. See ya!

How to Monitor AlwaysON – Primary / Secondary Replicas on SQL Server

I’ve developed a stored procedure to run on SQL Server Agent Job, to monitor the synchronism status from AlwaysOn Availability Groups.
In case of “NO HEALTHY” status, the SP will send you an email providing details of the database.

Pre-Requirements

– DatabaseMail Instaled

So, to set up the alert properly, I execute the stored procedure above in a job task (every 5 or 10 minutes) on the primary instance, passing ‘Primary’ as parameter.

EXEC [pr_monitor_alwayson] 'PRIMARY'

On the “expected” secondary replica, I pass ‘SECONDARY’ as parameter using the same stored procedure

EXEC [pr_monitor_alwayson] 'SECONDARY'

Once you need to force a failover, you should just change the parameters in both instances.

Related Links:

Monitor Availability Groups (Transact-SQL)

In the next post, I’m going to provide scripts of alerts that I usually configure to complement with more details about each event that occur in AlwaysOn Availability Groups.

Please, let me know if you have any questions.

Availability Groups Setup and Configuration from A to Z

AlwaysOn Availability Groups was introduced in SQL Server 2012, and it’s one of the most important features in the history of SQL Server.
The link provided below, as far as I’m concerned, it’s a complete step by step on how to configure a high availability solution with the Availability Groups.
There are points that you must consider before installation, as well as installing the hotfix KB2494036.
  • A hotfix is available to let you configure a cluster node that does not have quorum votes in Windows Server 2008 and in Windows Server 2008 R2

http://blogs.lessthandot.com/index.php/architect/availability-groups-setup-and-configuration-from-a-to-z/

Enjoy it!