Azure Cache for Redis supports zone redundancy in its Premium and Enterprise tiers. A zone-redundant cache runs on VMs spread across multiple Availability Zones. It provides higher resilience and availability.
Today I’ll show how to test the failover of a zone-redundant cache.
Deploy Azure Cache for Redis with availability zones
Create a main.tf file with the following content:
terraform {
  required_version = "> 0.14"
  required_providers {
    azurerm = {
      version = "= 2.57.0"
    }
    random = {
      version = "= 3.1.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Location of the services
variable "location" {
  default = "west europe"
}

# Resource Group Name
variable "resource_group" {
  default = "redis-failover"
}

# Name of the Redis cluster
variable "redis_name" {
  default = "redis-failover"
}

resource "random_id" "random" {
  byte_length = 8
}

resource "azurerm_resource_group" "rg" {
  name     = var.resource_group
  location = var.location
}

resource "azurerm_redis_cache" "redis" {
  name                = "${var.redis_name}-${lower(random_id.random.hex)}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  capacity            = 2
  family              = "P"
  sku_name            = "Premium"
  enable_non_ssl_port = true
  minimum_tls_version = "1.2"

  redis_configuration {
  }

  zones = ["1", "2"]
}

resource "azurerm_log_analytics_workspace" "logs" {
  name                = "redis-logs"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

resource "azurerm_monitor_diagnostic_setting" "monitor" {
  name                       = lower("extaudit-${var.redis_name}-diag")
  target_resource_id         = azurerm_redis_cache.redis.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.logs.id

  metric {
    category = "AllMetrics"

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "ConnectedClientList"
    enabled  = false

    retention_policy {
      days    = 0
      enabled = false
    }
  }

  lifecycle {
    ignore_changes = [metric]
  }
}

output "redis_name" {
  value = azurerm_redis_cache.redis.name
}

output "redis_host_name" {
  value = azurerm_redis_cache.redis.hostname
}

output "redis_primary_access_key" {
  value     = azurerm_redis_cache.redis.primary_access_key
  sensitive = true
}
Note: the zones are specified with zones = ["1", "2"], which makes the cache zone-redundant.
Deploy the Azure Cache for Redis with availability zones:
Run the following commands to deploy the cache:
terraform init
terraform apply -auto-approve
Test the Azure Cache for Redis failover
Download the redis-cli tool:
Invoke-WebRequest -Uri "https://github.com/microsoftarchive/redis/releases/download/win-3.2.100/Redis-x64-3.2.100.zip" -OutFile redis.zip -UseBasicParsing
Expand-Archive -Path .\redis.zip -DestinationPath .\redis-cli
Use the redis-cli to prepare the cache instance with data:
# -raw prints the bare value without surrounding quotes
$redis_name=$(terraform output -raw redis_name)
$redis_host_name=$(terraform output -raw redis_host_name)
$redis_primary_access_key=$(terraform output -raw redis_primary_access_key)

.\redis-cli\redis-benchmark -h $redis_host_name -a $redis_primary_access_key -t SET -n 10 -d 1024
Check the availability zone hosting the master node:
# -raw prints the bare value without surrounding quotes
$redis_name=$(terraform output -raw redis_name)
az redis show -n $redis_name -g redis-failover --query "instances[?isMaster]"
You should get an output similar to:
[
  {
    "isMaster": true,
    "isPrimary": true,
    "nonSslPort": 13000,
    "shardId": 0,
    "sslPort": 15000,
    "zone": "1"
  }
]
Use the redis-cli to execute a long-running process:
.\redis-cli\redis-benchmark -h $redis_host_name -a $redis_primary_access_key -t GET -n 1000000 -d 1024 -c 50
Test the Azure Cache for Redis failover (CLI):
From another terminal, run the following command to test the Azure Cache for Redis failover:
# -raw prints the bare value without surrounding quotes
$redis_name=$(terraform output -raw redis_name)
az redis force-reboot --reboot-type PrimaryNode -n $redis_name -g redis-failover
Note: at the time of writing, the previous command fails with an exception:
(InternalServerError) Something went wrong.
RequestID=dececf94-7f11-4ffa-9a4b-35694dd3f091
Code: InternalServerError
Message: Something went wrong.
RequestID=dececf94-7f11-4ffa-9a4b-35694dd3f091
Please track the following issue for more information: az redis force-reboot fails with InternalServerError
Test the Azure Cache for Redis failover (Azure Portal):
To reboot the Primary Node, head to the Azure portal and use the Administration/Reboot section of the Redis cluster.
Once the failover starts, you should see the long-running process disconnect. This means your applications must be able to recover from transient errors when working with the cache.
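To make "recover from transient errors" a bit more concrete, here is a minimal sketch of a retry wrapper with exponential backoff and jitter. It is not tied to any Redis client: TransientCacheError, with_retries and make_flaky_get are hypothetical names made up for this illustration, and the flaky function only simulates a connection that drops a few times during a failover. In a real application you would catch your client library's own connection and timeout exceptions instead.

```python
import random
import time


class TransientCacheError(Exception):
    """Hypothetical stand-in for a client library's connection/timeout error."""


def with_retries(operation, max_attempts=5, base_delay=0.1, max_delay=2.0,
                 sleep=time.sleep):
    """Call operation(); on TransientCacheError, retry with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientCacheError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The upper bound doubles on each attempt and is capped at max_delay;
            # the actual wait is a random value below that bound (full jitter).
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


def make_flaky_get(failures):
    """Simulate a cache read that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def get():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientCacheError("connection lost")
        return "value"

    return get


# Three simulated failures, then the read goes through on the fourth attempt.
print(with_retries(make_flaky_get(failures=3)))  # prints "value"
```

The jitter keeps many clients from reconnecting at the same instant once the new primary is up. Many Redis client libraries ship their own retry settings, so check those before rolling your own.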
Once the failover is complete, check again which availability zone hosts the primary node:
# -raw prints the bare value without surrounding quotes
$redis_name=$(terraform output -raw redis_name)
az redis show -n $redis_name -g redis-failover --query "instances[?isMaster]"
The output should be similar to:
[
  {
    "isMaster": true,
    "isPrimary": true,
    "nonSslPort": 13001,
    "shardId": 0,
    "sslPort": 15001,
    "zone": "2"
  }
]
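If you want to script the before/after comparison instead of eyeballing it, a small sketch like the following could do it. It assumes you saved the two az redis show outputs as JSON strings; here they are simply the sample outputs from this post, and primary_zones and moved_shards are hypothetical helper names chosen for the example.

```python
import json

# Sample outputs from this post, captured before and after the reboot.
BEFORE = """[{"isMaster": true, "isPrimary": true, "nonSslPort": 13000,
              "shardId": 0, "sslPort": 15000, "zone": "1"}]"""
AFTER = """[{"isMaster": true, "isPrimary": true, "nonSslPort": 13001,
             "shardId": 0, "sslPort": 15001, "zone": "2"}]"""


def primary_zones(az_output):
    """Map shardId -> zone for every primary instance in the JSON output."""
    return {
        instance["shardId"]: instance["zone"]
        for instance in json.loads(az_output)
        if instance.get("isPrimary")
    }


def moved_shards(before, after):
    """Return {shardId: (old_zone, new_zone)} for primaries that changed zone."""
    old, new = primary_zones(before), primary_zones(after)
    return {
        shard: (old[shard], new[shard])
        for shard in old
        if shard in new and old[shard] != new[shard]
    }


print(moved_shards(BEFORE, AFTER))  # prints {0: ('1', '2')}
```

An empty result would mean the primary is still in the same zone, so the failover either has not finished or did not happen.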
Hope it helps!!!