As I move into my new role as a database architect in the DBaaS team, one of the areas I find people seem to know the least about is the Snap Clone functionality. So I thought it might be worthwhile to blog an introduction to some of the issues that Snap Clone addresses (no doubt there will be more follow-up posts to come!).
In simple terms, Snap Clone is a storage-agnostic, self-service approach to rapidly creating space-efficient clones of large databases (and by large, we’re talking terabytes or more). Now that’s probably more buzzwords in one sentence than anyone’s brain can deal with without exploding, so let’s explain some of those terms in more detail:
- Storage agnostic – by that I mean Snap Clone supports all storage vendors, both NAS and SAN.
- Self service – in the XaaS world – where X can be any of I, MW, P and DB 🙂 – one of the key features is empowering the end user to do the work, rather than waiting on some techie to find time in their otherwise busy schedule. So it’s the end user who makes the ad hoc clones here, not the storage admin.
- Rapid – People simply don’t have the time to wait weeks for provisioning to happen any more (for that matter, they probably never did, but that’s another discussion!), so you need to be able to clone databases in minutes rather than the days or weeks it used to take.
- Space efficient – When you’re working with terabyte or larger databases, you simply may not have the storage to create full-sized clones, so you have to significantly reduce the storage footprint to start with.
The Challenges Snap Clone Addresses
There are a number of major challenges that Snap Clone can be used to address:
- Lack of automation – Manual tasks such as provisioning and cloning new databases (for example, for test or development systems) are one area that many DBAs complain is too time consuming. These tasks can take days to weeks, often because of the need to coordinate the involvement of different groups, as shown in the image below:
When an end user, be it a developer or a QA engineer, needs a database, he or she typically has to go through an approval process like this, which then translates into a series of tasks for the DBA, the sysadmin and the storage admin. The sysadmin has to provide the compute capacity, while the storage admin has to provide the space on a filer. Finally, the DBA installs the bits, creates the database (optionally on Real Application Clusters), and delivers it to the user. Clearly, this is a cumbersome and time-consuming process that needs to be improved.
- Database unfriendly solutions – Obviously, when there is a need, different people take different approaches to meeting it. There are a variety of point solutions and storage solutions out there, but the vast bulk of them are not database aware. They tend to clone storage volumes rather than databases and have no visibility into the database stack, which of course makes it hard for a DBA to triage performance issues. They also lack the ability to track configuration, compliance and data security issues, and have limited or no lifecycle capabilities.
- Storage issues and archaic processes – Of course, one of the main issues is storage. Data volumes are ever increasing, particularly in these Big Data days, and the growth can often outpace your storage capacity. You can throw more disks at the problem, but it never seems to be enough, and you can end up with degraded performance if you take the route of sharing clones between users. The storage team and the DBA team can also have different processes and different priorities, and you may still have fixed refresh cycles, making it difficult to clone on an ad hoc basis.
So the end result of all of this is that, far too often, there are competing priorities. Users want flexibility – simplified self-service access, rapid cloning, and the ability to revert data changes. IT, on the other hand, wants standardization and control, which allow a reduction in storage use, a reduction in administrative overhead, visibility into the complete database stack, and lineage tracking. How do you reconcile these competing priorities with Snap Clone? Well, that will be the subject of another blog, so stay tuned for more!