Thursday, 8 May 2014

Hitachi : "Fast Data Recovery from Hdd Failure" in Patent Application Approval Process

05/07/2014 | 11:55pm US/Eastern

By a News Reporter-Staff News Editor at Information Technology Newsweekly -- A patent application by the inventor KAWAGUCHI, Tomohiro (Cupertino, CA), filed on December 24, 2013, was made available online on May 1, 2014, according to news reporting originating from Washington, D.C., by VerticalNews correspondents.

This patent application is assigned to Hitachi, Ltd.

The following quote was obtained by the news editors from the background information supplied by the inventors: "The present invention relates generally to data recovery in storage systems and, more particularly, to methods and apparatus for fast data recovery from storage device failure such as HDD (hard disk drive) failure. The invention demonstrates the agility of storage data recovery and ease of use of disk maintenance against disk failure.

"Currently, RAID (Redundant Array of Independent Disks) architecture is generally used to protect data from disk failure. For example, RAID5 and RAID6 each make it possible to recover from one disk failure of the RAID Group. RAID5 and RAID6 are each more efficient for capacity than RAID1 or RAID10. When a disk failure occurs, the storage system recovers data to a reserved 'spare disk.' It needs to access the entire area of healthy disks to recover data. The time to data recovery depends on disk capacity and disk throughput performance. Generally, the technology growth ratio of capacity is larger than that of throughput. As a result, the RAID approach is slow to rebuild from disk failure and will be slower each year. Long time data rebuild has the possibility of causing long time performance decrement by corrosion between rebuilt disk I/O and normal disk I/O. Long time data rebuild also has the possibility of encountering the next disk failure during data recovery.

"Under another approach based on RAIN (Redundant Array of Independent Nodes), the storage system includes a plurality of nodes (disks, storage subsystems, and so on). The storage system stores data to suitably-chosen two or more nodes. When node failure occurs, the storage system copies the data to another node(s) from redundant data. It can be conducive to better rebuild performance by a pillared process. Because the RAID approach needs to reserve one or more spare disk, the rebuild time under the RAIN approach will be faster than that under the RAID approach. The RAIN approach does not need reserved spare disk because it automatically stores redundant data to free space (self-recovery). On the other hand, the capacity efficiency under the RAIN approach is lower than that under the RAID approach."
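The background's point that rebuild time depends on disk capacity and disk throughput can be illustrated with a rough back-of-the-envelope estimate. The sketch below is not taken from the patent application: the function name `rebuild_hours`, the `parallel_streams` parameter, and all capacity and throughput figures are hypothetical, and it assumes the rebuild is limited purely by sustained write throughput to the target disk or disks.

```python
def rebuild_hours(capacity_tb: float, throughput_mb_s: float,
                  parallel_streams: int = 1) -> float:
    """Rough lower bound on rebuild time in hours.

    Assumes (hypothetically) that the whole disk capacity must be rewritten
    and that the only limit is sustained throughput, multiplied by the
    number of disks that can absorb rebuild writes in parallel.
    """
    capacity_bytes = capacity_tb * 1e12
    rate_bytes_per_s = throughput_mb_s * 1e6 * parallel_streams
    return capacity_bytes / rate_bytes_per_s / 3600


# Illustrative figures only: capacity grows 4x while throughput grows 1.5x.
older_disk = rebuild_hours(capacity_tb=1, throughput_mb_s=100)
newer_disk = rebuild_hours(capacity_tb=4, throughput_mb_s=150)
# Spreading the rebuild writes over 8 disks instead of one spare disk.
distributed = rebuild_hours(capacity_tb=4, throughput_mb_s=150, parallel_streams=8)

print(f"older disk, single spare: {older_disk:.1f} h")
print(f"newer disk, single spare: {newer_disk:.1f} h")
print(f"newer disk, 8 parallel targets: {distributed:.1f} h")
```

Under these assumed numbers, the single-spare rebuild gets slower as capacity outpaces throughput, while spreading the writes across several disks shortens it in proportion to the number of targets.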

In addition to the background information obtained for this patent application, VerticalNews journalists also obtained the inventor's summary information for this patent application: "Embodiments of the invention provide methods and apparatus for fast data recovery from storage device failure such as HDD failure. Employing data distribution in plural disks, RAID in distributed data, page mapping management between virtual volume and physical disks, and parallel access data recovery by copying from pages to pages, the invention achieves fast rebuild, capacity efficiency, and self-recovery.

"In accordance with an aspect of the present invention, a storage system comprises a first storage device having a first plurality of hard disk drives and a first controller controlling the first plurality of hard disk drives. The first controller stores data in the first plurality of hard disk drives by stripes, each stripe includes M data and N parity data, where M and N are integers, and the first controller calculates for each stripe the N parity data using the M data. The M data and N parity data of each stripe are allocated to M+N hard disk drives of the first plurality of hard disk drives. A first hard disk drive of the first plurality of hard disk drives includes data or parity data of both a first stripe of the stripes and a second stripe of the stripes, while a second hard disk drive of the first plurality of hard disk drives includes data or parity data of only one of the first stripe of the stripes or the second stripe of the stripes. During data recovery involving failure of one of the first plurality of hard disk drives as a failed hard disk drive, the data in the failed hard disk drive is recovered for each stripe by calculation using data and parity data in other hard disk drives of the first plurality of hard disk drives for each stripe.
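As an editorial illustration of recovering a lost block "by calculation using data and parity data," the following minimal sketch uses byte-wise XOR parity for a stripe with M = 3 and N = 1. The application excerpt does not state which parity code is used; XOR is assumed here because it is the conventional single-parity scheme, and the function names (`xor_blocks`, `make_stripe`, `recover_block`) are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the M data blocks followed by one XOR parity block (N = 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover_block(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks of the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# M = 3 data blocks plus N = 1 parity block, one block per drive.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# Any single lost block, data or parity, equals the XOR of the other three.
assert recover_block(stripe, 1) == b"BBBB"
assert recover_block(stripe, 3) == stripe[3]
```

With N greater than 1, as in the M = 6, N = 2 embodiment, a second independent parity calculation would be needed; that case is outside this sketch.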

"In some embodiments, the second hard disk drive of the first plurality of hard disk drives includes data or parity data of the first stripe of the stripes. A third hard disk drive of the first plurality of hard disk drives includes data or parity data of the second stripe of the stripes and does not include data or parity data of the first stripe of the stripes. In addition, M is 3 and N is 1. The number of the first plurality of hard disk drives is a multiple of four. Data and parity data of the first stripe are included in the first and second hard disk drives of the first plurality of hard disk drives and in fourth and fifth hard disk drives of the first plurality of hard disk drives. Data and parity data of the second stripe are included in the first, third, fourth, fifth hard disk drives of the first plurality of hard disk drives.
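The 3+1 placement described in this embodiment can be written down as a small lookup table to see which drives take part in a rebuild. This is an illustrative sketch only: numbering the "first" through "fifth" hard disk drives as 1 to 5, the table name `STRIPE_TO_DRIVES`, and the helper functions are assumptions, not structures named in the application.

```python
# Hypothetical placement table: stripe name -> drives holding its data/parity blocks.
STRIPE_TO_DRIVES = {
    "stripe1": [1, 2, 4, 5],  # first, second, fourth and fifth drives
    "stripe2": [1, 3, 4, 5],  # first, third, fourth and fifth drives
}


def stripes_on(drive):
    """Names of the stripes that keep a block on the given drive."""
    return sorted(s for s, drives in STRIPE_TO_DRIVES.items() if drive in drives)


def rebuild_reads(failed_drive):
    """For each stripe touched by the failed drive, the surviving drives to read."""
    return {
        stripe: [d for d in drives if d != failed_drive]
        for stripe, drives in STRIPE_TO_DRIVES.items()
        if failed_drive in drives
    }


print(stripes_on(1))     # drive 1 holds blocks of both stripes
print(stripes_on(2))     # drive 2 holds blocks of the first stripe only
print(rebuild_reads(1))  # each affected stripe is rebuilt from its own survivors
```

In this layout a failure of drive 1 involves both stripes, whereas a failure of drive 2 or drive 3 involves only one of them, which matches the distinction the text draws between the first, second, and third hard disk drives.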

"In specific embodiments, the storage system further comprises a second storage device having a second plurality of hard disk drives and a second controller controlling the second plurality of hard disk drives. The data stored by the first controller is received from the second storage device. The first controller includes a plurality of processors. The second hard disk drive and the third hard disk drive are accessed by different processors of the plurality of processors concurrently when data is migrated from the second storage device to the first storage device. The storage system further comprises a capacity pool volume having unallocated hard disk drives of the first plurality of the hard disk drives. The stripes are allocated from the capacity pool volume. The allocation of each stripe is conducted in response to receiving the data from the second storage device. The N parity data of each stripe are coupled to the first controller via different buses.

"In some embodiments, the storage system further comprises a second storage device having a second plurality of hard disk drives and a second controller controlling the second plurality of hard disk drives. The data stored by the first controller is received from the second storage device. Data and parity data of the first and second stripes are processed in parallel by the first controller. The first controller includes a table including information of allocation of each stripe to the first plurality of hard disk drives. M is 6 and N is 2. The number of the first plurality of hard disk drives is a multiple of eight. In case of reading data from one of the stripes including a failure of one of the first plurality of hard disk drives, the first controller is controlled to access only seven hard disk drives of the first plurality of hard disk drives without access to the failed hard disk drive. The storage system further comprises a capacity pool volume having unallocated hard disk drives of the first plurality of the hard disk drives. The stripes are allocated from the capacity pool volume. The storage system further comprises a second storage device having a second plurality of hard disk drives and a second controller controlling the second plurality of hard disk drives. The allocation of each stripe is conducted in response to receiving the data from the second storage device.

"Another aspect of the invention is directed to a method for data recovery in a storage system which includes a first storage device having a first plurality of hard disk drives and a first controller controlling the first plurality of hard disk drives. The method comprises storing data in the first plurality of hard disk drives of the first controller by stripes, each stripe includes M data and N parity data, where M and N are integers, and the first controller calculates for each stripe the N parity data using the M data; allocating the M data and N parity data of the each stripe to M+N hard disk drives of the first plurality of hard disk drives, wherein a first hard disk drive of the first plurality of hard disk drives includes data or parity data of both a first stripe of the stripes and a second stripe of the stripes, while a second hard disk drive of the first plurality of hard disk drives includes data or parity data of only one of the first stripe of the stripes or the second stripe of the stripes; and during data recovery involving failure of one of the first plurality of hard disk drives as a failed hard disk drive, recovering the data in the failed hard disk drive for each stripe by calculation using data and parity data in other hard disk drives of the first plurality of hard disk drives for each stripe.

"Another aspect of the invention is directed to a computer-readable medium storing a plurality of instructions for controlling a data processor to perform data recovery in a storage system which includes a first storage device having a first plurality of hard disk drives and a first controller controlling the first plurality of hard disk drives. The computer-readable medium comprises instructions that cause the data processor to store data in the first plurality of hard disk drives of the first controller by stripes, each stripe includes M data and N parity data, where M and N are integers, and the first controller calculates for each stripe the N parity data using the M data; instructions that allocate the M data and N parity data of the each stripe to M+N hard disk drives of the first plurality of hard disk drives, wherein a first hard disk drive of the first plurality of hard disk drives includes data or parity data of both a first stripe of the stripes and a second stripe of the stripes, while a second hard disk drive of the first plurality of hard disk drives includes data or parity data of only one of the first stripe of the stripes or the second stripe of the stripes; and instructions that, during data recovery involving failure of one of the first plurality of hard disk drives as a failed hard disk drive, recover the data in the failed hard disk drive for each stripe by calculation using data and parity data in other hard disk drives of the first plurality of hard disk drives for the each stripe. The data processor may reside in the first controller.

"These and other features and advantages of the present invention will become apparent to those of ordinary skill in the art in view of the following detailed description of the specific embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1 illustrates the hardware configuration of a system in which the method and apparatus of the invention may be applied.

"FIG. 2 illustrates an example of a memory in the storage subsystem of FIG. 1 according to a first embodiment of the invention.

"FIG. 3 illustrates an example of a RAID Group Management Table in the memory of FIG. 2.

"FIG. 4 illustrates an example of a Virtual Volume Management Table in the memory of FIG. 2.

"FIG. 5 illustrates an example of a Virtual Volume Page Management Table in the memory of FIG. 2.

"FIG. 6 illustrates an example of a Capacity Pool Chunk Management Table in the memory of FIG. 2.

"FIG. 7 illustrates an example of a Capacity Pool Page Management Table in the memory of FIG. 2.

"FIG. 8 illustrates an example of a Cache Management Table in the memory of FIG. 2.

"FIG. 9 illustrates an example of a 5×2 RAID group having eight HDDs each including a plurality of parcels.

"FIG. 10 illustrates an example of a chunk having a plurality of parcels each including a plurality of stripes.

"FIG. 11 illustrates an example of a chunk having a plurality of pages.

"FIG. 12 illustrates an example of a virtual volume having a plurality of pages.

"FIG. 13 illustrates an example of a page having a plurality of slots.

"FIG. 14 illustrates an example of a virtual volume and its Virtual Volume Management Table and Virtual Volume Page Management Table.

"FIG. 15 illustrates an example of the table reference structure toward capacity pool in the virtual volume of FIG. 14.

"FIG. 16 illustrates an example of the table reference structure toward virtual volumes.

"FIG. 17 illustrates an example of a process flow of the Write I/O Control in the memory of FIG. 2.

"FIG. 18 illustrates an example of a process flow of the Read I/O Control in the memory of FIG. 2.

"FIG. 19 illustrates an example of a process flow of the Staging Control in the memory of FIG. 2.

"FIG. 20 illustrates an example of a process flow of the Destaging Control in the memory of FIG. 2.

"FIG. 21 illustrates an example of a process flow of the Copy Control in the memory of FIG. 2.

"FIG. 22 illustrates an example of a process flow of the Parity Calculation Control in the memory of FIG. 2.

"FIG. 23 illustrates an example of a process flow of the Physical Disk Address Control in the memory of FIG. 2.

"FIG. 24 illustrates an example of a process flow of the Flush Control in the memory of FIG. 2.

"FIG. 25 illustrates an example of a process flow of the Cache Control in the memory of FIG. 2.

"FIG. 26 illustrates an example of a process flow of the Page Detection Control (A) in the memory of FIG. 2.

"FIG. 27 illustrates an example of a process flow of the Page Detection Control (B) in the memory of FIG. 2.

"FIG. 28 illustrates an example of a process flow of the Page Migration Control in the memory of FIG. 2.

"FIG. 29 illustrates an example of the data recovery by chunks and pages copy.

"FIG. 30 illustrates the summary and sequence of the data recovery of FIG. 29.

"FIG. 31 illustrates an overall sequence of data recovery by chunks and pages copy.

"FIG. 32 illustrates an example of a memory in the storage subsystem of FIG. 1 according to a second embodiment of the invention.

"FIG. 33 illustrates an example of an HDD Management Table in the memory of FIG. 32.

"FIG. 34 illustrates an example of a Virtual Volume Management Table in the memory of FIG. 32.

"FIG. 35 illustrates an example of a Virtual Volume Page Management Table in the memory of FIG. 32.

"FIG. 36 illustrates an example of a Capacity Pool Chunk Management Table in the memory of FIG. 32.

"FIG. 37 illustrates an example of a virtual volume and its Virtual Volume Management Table and Virtual Volume Page Management Table.

"FIG. 38 illustrates an example of the table reference structure toward capacity pool in the virtual volume of FIG. 37.

"FIG. 39 illustrates an example of a process flow of the Page Detection Control (A) in the memory of FIG. 32.

"FIG. 40 illustrates an example of a process flow of the Page Migration Control in the memory of FIG. 32."

For the URL and more information on this patent application, see: KAWAGUCHI, Tomohiro. Fast Data Recovery from Hdd Failure. Filed December 24, 2013 and posted May 1, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=360&p=8&f=G&l=50&d=PG01&S1=20140424.PD.&OS=PD/20140424&RS=PD/20140424

Keywords for this news article include: Hitachi Ltd., Information Technology, Information and Data Loss and Recovery.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC

(c) 2014 Information Technology Newsweekly via VerticalNews.com
