Tuesday 11 March 2014

Microsoft Case Study: Microsoft SQL Server 2014 - Amway

How do you boost availability and speed disaster recovery across an expanding, global data infrastructure? For Amway, the answer is with Microsoft SQL Server 2014 and Windows Azure. In pilot tests, Amway found the two Microsoft technologies worked together to support 100 percent availability and reduce data center recovery time from six person-hours to 30 seconds.

Business Needs

As the number of Amway Business Owners (ABOs) has grown—it now tops 3 million—so has the company's data infrastructure. Today, Amway hosts not only a pair of data centers near its suburban Grand Rapids, Michigan, headquarters, but also data centers in Malaysia and China. These data centers support about 34 terabytes of information spread across 100 instances of Microsoft SQL Server software, with the data load growing at an annual rate of about 15 percent.

The company faces the same challenges as almost any sizable organization in maximizing data availability and ensuring disaster recovery. It also faces a data challenge that is particular to global organizations, namely getting information to destinations almost anywhere in the world without productivity-crippling increases in latency.

Amway uses its international data centers to meet these challenges, for example by hosting information closer to the company's many ABOs to lower latency. The company uses those data centers especially for customer relationship management (CRM) data, so it can provide its ABOs with a great web experience without their having to wait for information to make the roundtrip between Michigan and Asia.

To support these additional production data centers, Amway has created additional secondary data centers as disaster-recovery sites. It has installed more hardware. It has hired more database administrators (DBAs). All this additional infrastructure has, inevitably, introduced more complexity and cost into the Amway data environment.

Previously, Amway used database mirroring between its two Michigan data centers, but each failover still required technicians to restart all web app services, Java apps, authentication/authorization servers, and DNS servers, a process that took hours.

Amway wanted higher data availability, greater reliability, and more cost-effective disaster recovery.

"SQL Server Availability Groups are our preferred high-availability and disaster-recovery implementation. It's what has gotten us the 100 percent uptime."

Kurt Mast, Principal Database Administrator, Amway

Solution

In 2013, Amway conducted a pilot test of a prerelease version of Microsoft SQL Server 2014 software, focusing on the software's AlwaysOn Availability Groups for high availability and disaster recovery. That feature is based on multisite data clustering with failover to databases hosted both on-premises and in Windows Azure, the Microsoft cloud-computing platform.

The pilot represented the company's first use of Windows Azure, but Amway is no stranger to Microsoft technology; it uses Microsoft products and services for its custom data applications, e-commerce website, content management, and CRM.

Amway had begun to use on-premises Availability Groups with SQL Server 2012, but wanted to participate in prerelease testing of SQL Server 2014 because of the version's extension of Availability Groups to Windows Azure. Making the cloud part of an Availability Group held the promise of increased reliability and reduced cost.

Previously, Amway had been concerned about cloud configurations that were not under its control. But with Windows Azure Infrastructure as a Service, the company could create its own virtual machine configuration image and install it on Windows Azure, addressing that concern. Amway uses the same virtual-machine media for instances on Windows Azure and in its own data centers, ensuring that installations are consistent across both environments.

The pilot test focused on a CRM application. It consolidated data from multiple Oracle and SQL Server databases into a single SQL Server data warehouse that could be queried by customer service representatives for comprehensive customer information.

The test architecture consisted of three nodes in a hybrid on-premises/cloud configuration:

A primary replica and secondary replica, operating synchronously to support high availability through automatic failover, both located on-premises

A secondary replica located in Windows Azure, operating in asynchronous mode to provide disaster recovery through manual failover

To test the configuration, Amway simulated the disruption of the primary replica with failover to the secondary on-premises instance, as well as the disruption of both on-premises replicas with failover to the Windows Azure instance.
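The two failure scenarios above can be modeled in a few lines of Python. This is only an illustrative sketch, not Amway's tooling or SQL Server's actual failover logic: the replica names and the `failover_target` helper are hypothetical, and the rule "synchronous replicas fail over automatically, asynchronous replicas need a manual failover" is taken directly from the architecture described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Replica:
    name: str          # hypothetical label, not a real Amway server name
    location: str      # "on-premises" or "azure"
    synchronous: bool  # True = synchronous commit, False = asynchronous
    healthy: bool = True


def failover_target(replicas: List[Replica], primary: str) -> Optional[Tuple[str, str]]:
    """Pick the replica that takes over from `primary`, plus the failover mode."""
    candidates = [r for r in replicas if r.name != primary and r.healthy]
    # Prefer a healthy synchronous secondary: it can take over automatically.
    for r in candidates:
        if r.synchronous:
            return r.name, "automatic"
    # Otherwise fall back to an asynchronous secondary, which needs a DBA.
    for r in candidates:
        return r.name, "manual"
    return None


replicas = [
    Replica("onprem-primary", "on-premises", synchronous=True),
    Replica("onprem-secondary", "on-premises", synchronous=True),
    Replica("azure-secondary", "azure", synchronous=False),
]

# Scenario 1: only the primary replica is disrupted.
replicas[0].healthy = False
scenario_1 = failover_target(replicas, primary="onprem-primary")

# Scenario 2: both on-premises replicas are disrupted.
replicas[1].healthy = False
scenario_2 = failover_target(replicas, primary="onprem-primary")
```

In this model, the first scenario resolves to the on-premises secondary with an automatic failover, and the second resolves to the Windows Azure replica with a manual failover, matching the roles described for the pilot.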

Benefits

Amway found that the test of SQL Server AlwaysOn Availability Groups with Windows Azure replicas delivered the 100 percent uptime, recovery speed, and cost-effectiveness it sought. The company is now considering how best to deploy the solution.

Delivers 100 Percent Uptime

The tests of AlwaysOn Availability Groups with Windows Azure replicas achieved 100 percent uptime, according to Kurt Mast, Principal Database Administrator, Amway.

"SQL Server Availability Groups are our preferred high-availability and disaster-recovery implementation," says Mast. "It's what has gotten us the 100 percent uptime for the entire pilot test."

Mast credits factors including the Availability Group Listener, which connects clients to database replicas and provides fast application failover in the event of an availability group failover. In tests, failover took place in 10 seconds or less, compared to the 45 seconds Amway experienced with traditional SQL Server Failover Clusters.
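From an application's point of view, the listener means clients connect to one stable name rather than to an individual replica. The case study does not say how Amway's applications connect, so the following is a hedged sketch only: the listener host name, database name, and helper function are hypothetical, and the driver name is an assumption. `MultiSubnetFailover` is a connection keyword of Microsoft's ODBC driver that is intended for listeners spanning more than one subnet, as a hybrid on-premises/cloud group would.

```python
def listener_connection_string(listener_host: str, database: str, port: int = 1433) -> str:
    """Build an ODBC connection string that targets an availability group listener."""
    parts = {
        # Assumed driver; use whichever SQL Server ODBC driver is installed.
        "Driver": "{ODBC Driver 17 for SQL Server}",
        # Clients address the listener, never a specific replica.
        "Server": f"tcp:{listener_host},{port}",
        "Database": database,
        "Trusted_Connection": "yes",
        # Speeds up reconnection when the listener has IPs in several subnets.
        "MultiSubnetFailover": "Yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical names for illustration only.
conn_str = listener_connection_string("crm-listener.example.com", "CrmWarehouse")
```

Because the string names only the listener, the same configuration keeps working after a failover moves the primary role to a different replica.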

Reduces Recovery Time from Six Person-Hours to 30 Seconds

Amway is looking forward to an even bigger reduction in the time required to recover from a complete data center failure. Instead of the two-hour, three-person process required with database mirroring, Amway will be able to restore a data center with just 30 seconds of one DBA's time.

"We haven't seen SQL Server fail, but even if we lose an entire data center, we expect to bring it up again before most users have noticed anything wrong," says Mast.
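The "six person-hours to 30 seconds" comparison follows from the figures quoted above (three people for two hours versus 30 seconds of one DBA's time). A small sketch of the arithmetic, using a hypothetical helper name:

```python
def person_seconds(people: int, hours: float) -> float:
    """Total staff effort in person-seconds."""
    return people * hours * 3600


# Database mirroring: a two-hour process involving three people.
effort_before = person_seconds(people=3, hours=2)  # 21600 person-seconds (6 person-hours)

# Availability Groups: about 30 seconds of one DBA's time.
effort_after = 30

reduction_factor = effort_before / effort_after  # 720.0
```

On these figures, the staff effort for a data center recovery drops by a factor of 720.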

Facilitates More-Frequent Recovery Testing

Another key benefit from faster and more cost-effective data center recovery is the ability to test the company's disaster-recovery practices more frequently without disrupting users or adding to the workload on IT staff. Currently, Amway tests its disaster-recovery capabilities annually; with AlwaysOn Availability Groups, that frequency could go to quarterly, according to Mast.

"We continually make changes to our IT environment," says Mast. "More frequent testing would give us peace of mind that our disaster-recovery system works with our current configurations. With faster, more-automated failover, we'd also be able to do more data center maintenance, network maintenance, SAN upgrades, and other operations whenever we wished, without inconveniencing our users."

This case study is for informational purposes only.
