
By Thorsten von Eicken | August 25, 2008 06:00 AM EDT | Reads: 35,812
Incremental snapshotting of volumes and freezing
I mentioned that snapshots are a very useful but also complex and difficult-to-understand feature. Here I want to explain how snapshots of an EBS volume can be taken at any time, and why using RightScale’s scripts to freeze data is important and a great add-on to this feature.
Taking a snapshot causes the data on the volume to be written to S3, where it is stored redundantly in multiple availability zones, as is all data in S3. It’s worth noting that snapshots do not appear in your S3 buckets, so you can't access them using the standard S3 API. You can only list snapshots using the EC2 API, and you restore a snapshot by creating a new volume from it.
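To make the two EC2 API operations concrete, here is a minimal sketch. It assumes the present-day boto3 library, which postdates this article, and the helper names `list_snapshot_ids` and `restore_snapshot` are made up for illustration. Both helpers take the EC2 client as a parameter, so nothing here contacts AWS until you pass in a real client.

```python
def list_snapshot_ids(ec2, volume_id):
    """Return the IDs of this account's snapshots of one volume.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    A long snapshot list may be paginated; this sketch reads one page only.
    """
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )
    return [snap["SnapshotId"] for snap in response["Snapshots"]]


def restore_snapshot(ec2, snapshot_id, availability_zone):
    """Create a new volume from a snapshot and return the new volume's ID."""
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    return response["VolumeId"]


# Hypothetical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ids = list_snapshot_ids(ec2, "vol-0123456789abcdef0")
#   new_volume = restore_snapshot(ec2, ids[0], "us-east-1a")
```

The new volume must be created in the availability zone of the instance that will attach it, which is why the zone is an explicit argument.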
The second thing to note is that snapshots are incremental: to create a snapshot, EBS saves to S3 only the disk blocks that have changed since the previous snapshot.
Each volume is divided up into blocks. When the first snapshot of a volume is taken, all blocks of the volume that have ever been written are copied to S3, and then a snapshot table of contents is written to S3 that lists all these blocks. Now, when the second snapshot is taken of the same volume, only the blocks that have changed since the first snapshot are copied to S3. The table of contents for the second snapshot is then written to S3 and lists all the blocks on S3 that belong to the snapshot. Some are shared with the first snapshot, some are new. The third snapshot is created similarly and can contain blocks copied to S3 for the first, second and third snapshots.
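The bookkeeping described above can be mimicked with a toy model. This is only a sketch of the idea, not how EBS is implemented: `ToyVolume`, `ToySnapshotStore` and the object-key scheme are invented for illustration, and the store assumes it serves a single volume.

```python
class ToyVolume:
    """A volume as a mapping from block index to block contents."""

    def __init__(self):
        self.blocks = {}    # block index -> bytes, only blocks ever written
        self.dirty = set()  # block indexes written since the last snapshot

    def write(self, index, data):
        self.blocks[index] = data
        self.dirty.add(index)


class ToySnapshotStore:
    """Stands in for S3: copied blocks plus one table of contents per snapshot."""

    def __init__(self):
        self.objects = {}  # object key -> block contents
        self.tables = []   # tables[n]: block index -> object key for snapshot n

    def snapshot(self, volume):
        """Copy only the changed blocks; return how many were copied."""
        # Start from the previous table of contents so unchanged blocks are shared.
        toc = dict(self.tables[-1]) if self.tables else {}
        number = len(self.tables)
        copied = 0
        for index in sorted(volume.dirty):
            key = (number, index)  # a new stored object for each changed block
            self.objects[key] = volume.blocks[index]
            toc[index] = key
            copied += 1
        volume.dirty.clear()
        self.tables.append(toc)
        return copied


volume = ToyVolume()
store = ToySnapshotStore()
for i in range(4):
    volume.write(i, b"v1")
first = store.snapshot(volume)   # every block ever written is copied
volume.write(2, b"v2")
second = store.snapshot(volume)  # only block 2 is copied
```

After the second snapshot, its table of contents points at three objects stored for the first snapshot and one new object, which is the sharing the paragraph describes.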
There are two nice things about the incremental nature of the snapshots: it saves time and space. Taking subsequent snapshots can be very fast because only changed blocks need to be sent to S3, and it saves space because you're only paying for the storage in S3 of the incremental blocks. What is difficult to answer is how much space a snapshot uses. Or, to put it differently, how much space would be saved if a snapshot were deleted. If you delete a snapshot, the only blocks deleted are those used exclusively by that snapshot (i.e. referenced by no table of contents other than that snapshot's).
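The deletion rule is easy to state as set arithmetic. The sketch below assumes each snapshot's table of contents is simply a set of stored block names; the function name and the sample data are hypothetical.

```python
def blocks_freed_by_delete(tables, name):
    """Return the stored blocks referenced only by snapshot `name`."""
    still_referenced = set()
    for other, toc in tables.items():
        if other != name:
            still_referenced |= toc
    return tables[name] - still_referenced


# Three snapshots of one volume; the letters name blocks stored in S3.
tables = {
    "snap-1": {"a", "b", "c"},
    "snap-2": {"a", "b", "d"},
    "snap-3": {"a", "d", "e"},
}

freed_1 = blocks_freed_by_delete(tables, "snap-1")  # only "c" is unshared
freed_2 = blocks_freed_by_delete(tables, "snap-2")  # every block is shared
freed_3 = blocks_freed_by_delete(tables, "snap-3")  # only "e" is unshared
```

In this example, deleting the middle snapshot frees nothing, even though block "d" was first copied to S3 on its behalf, because the third snapshot still references it. That is why the space used by a single snapshot is hard to pin down.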
Something to be very careful about with snapshots is consistency. A snapshot is taken at a precise moment in time even though the blocks may trickle out to S3 over many minutes. But in most situations you will really want to control what's on disk vs. what's in-flight at the moment of the snapshot. This is particularly important when using a database. We recommend you freeze the database (or any application writing critical data to disk), freeze the file system, take the snapshot, then unfreeze everything. At the file system level we've been using xfs for all the large local drives and EBS volumes because it's fast to format and supports freezing. Thus when taking a snapshot we perform an xfs freeze, take the snapshot, and unfreeze. All this ensures that the snapshot doesn't contain partial updates that need to be recovered when the snapshot is mounted.
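The file system part of that sequence can be sketched as follows. This is an illustration under stated assumptions rather than RightScale's actual scripts: it assumes the `xfs_freeze` utility is installed (`-f` freezes, `-u` unfreezes), the function names are made up, and `take_snapshot` is a hypothetical callable supplied by you that starts the EBS snapshot. Freezing the database is left to the caller because it depends on the database in use.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen_xfs(mount_point, run=subprocess.check_call):
    """Keep an XFS file system frozen for the duration of the with-block."""
    run(["xfs_freeze", "-f", mount_point])  # flush to disk and block new writes
    try:
        yield
    finally:
        run(["xfs_freeze", "-u", mount_point])  # always unfreeze, even on error


def consistent_snapshot(mount_point, take_snapshot, run=subprocess.check_call):
    """Freeze, start the snapshot, unfreeze; return what take_snapshot returns."""
    with frozen_xfs(mount_point, run=run):
        return take_snapshot()
```

The `try`/`finally` matters: if starting the snapshot fails, the file system must still be unfrozen, or every writer on the volume stays blocked. Because the snapshot captures a precise moment in time, the freeze only needs to last until the snapshot has been started, not until all blocks have reached S3.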
With support for large datasets, attachments, better throughput, snapshotting, and more robust incremental backups and redundancy, Amazon’s EBS should attract a lot more enterprise and on-demand customers, as well as Web 2.0 users with large database-driven applications.
Thorsten von Eicken is RightScale, Inc.’s Chief Technical Officer. To try out a free developer version of RightScale, visit http://www.rightscale.com/m/products.html#developer.
Copyright © 2008 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
Thorsten von Eicken is the CTO and a founder of RightScale and is responsible for the overall technology direction of the RightScale Cloud Management Platform. Previously, he was founder and chief architect at Expertcity (acquired by Citrix Online), where he directed the architecture of the company's online services, including the popular GoToMeeting service. von Eicken also managed the Expertcity/Citrix data center operations, where he acquired deep knowledge in deploying and running secure, scalable online services. He was a professor of Computer Science at Cornell University and received his Ph.D. from the University of California, Berkeley.