
It’s a well-known fact in the IT world: Change one part of the software stack, and there’s a good chance you’ll have to change another. For a shining example, look no further than big data.
First, big data shook up the database arena, ushering in a new class of “scale out” technologies. That’s the model exemplified by products like Hadoop, MongoDB, and Cassandra, where data is distributed across multiple commodity servers rather than packed into one massive one. The beauty there, of course, is the flexibility: To accommodate more petabytes, you just add another inexpensive machine or two rather than “scaling up” and paying big bucks for a bigger mammoth.
That’s all been great, but now there’s a new sticking point: backup and recovery.
“Traditional backup products have challenges with very large amounts of data,” said Dave Russell, a vice president with Gartner. “The scale-out nature of the architecture can also be difficult for traditional backup applications to handle.”
Today’s horizontally scalable databases do include some capabilities for availability and recovery, but typically they’re not as robust as those IT users have become accustomed to, Russell added.
It’s a problem that can leave large enterprises vulnerable when outages strike. But it’s also where a new class of data-protection products is beginning to enter the picture.
Datos IO’s RecoverX is one of those.
“If you have a traditional database like Oracle or MySQL, it’s scale-up, and there’s always the notion of a durable log,” said Tarun Thakur, Datos IO’s co-founder and CEO.
In such scenarios, a copy of that log is what constitutes a backup when problems arise.
In the world of today’s next-generation databases — where data is distributed across small machines — it’s not quite so simple.
“There is no concept of a durable log because there is no master; each node is working on its own stuff,” Thakur explained. “Different nodes could get different writes, and every node has a different view of an operation.”
That’s in part because of a trade-off that’s been required to accommodate what’s commonly referred to as the “three V’s” of big data — volume, velocity, and variety. Specifically, to offer scalability while accommodating the crazy amounts of diverse data flying at us at ever-more-alarming speeds, today’s distributed databases have departed from the “ACID” criteria generally promised by traditional relational databases. Instead, they’ve adopted what are known as “BASE” principles.
It’s a critical distinction. Most pertinent is that where traditional databases promise strong consistency throughout — that’s the “C” in ACID — distributed ones strive instead for what’s called “eventual consistency.” Updates will be reflected in all nodes of the database sooner or later, but there’s a time lag.
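To make that time lag concrete, here is a minimal Python sketch of why a snapshot taken at the wrong moment can be misleading. It is a toy model, not how Cassandra or any other product actually propagates writes: the `Replica` class, the `write` and `snapshot` helpers, and the three node names are all assumptions made up for illustration.

```python
from collections import deque


class Replica:
    """One node holding its own copy of the data (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.inbox = deque()  # updates received but not yet applied

    def apply_pending(self):
        """Apply every queued update, in arrival order."""
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value


def write(replicas, coordinator, key, value):
    """Apply the write on one node immediately; queue it for the others."""
    for replica in replicas:
        if replica is coordinator:
            replica.data[key] = value
        else:
            replica.inbox.append((key, value))


def snapshot(replicas):
    """Copy each node's current view, as a naive backup tool might."""
    return {r.name: dict(r.data) for r in replicas}


nodes = [Replica("a"), Replica("b"), Replica("c")]
write(nodes, nodes[0], "balance", 100)

before = snapshot(nodes)  # taken during the lag: the nodes disagree
for node in nodes:
    node.apply_pending()
after = snapshot(nodes)   # taken after propagation: the nodes agree

print(before)
print(after)
```

In this sketch the first snapshot captures one node that has seen the write and two that have not, which is the kind of inconsistent picture a point-in-time backup has to reconcile.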
“If you need scalability, you need to give up consistency — you have to give up one or the other,” Thakur said.
That makes it tough to get a reliable snapshot of the big picture for point-in-time recovery. Not only is it more difficult to track which data might have moved where in a distributed database at any given moment, but the resiliency features that often come “baked” into newer distributed databases — replication, for example — won’t protect you if data gets corrupted, said Simon Robinson, a research vice president with 451 Research.
“You just replicate that corrupted data,” he said.
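Robinson's point can be illustrated with a short Python sketch. It assumes a deliberately simplified model in which every replica is a plain dictionary and a backup is a versioned deep copy. The helper names (`replicate_write`, `take_backup`, `restore`) are hypothetical and do not correspond to any vendor's API.

```python
import copy

# Three replicas that all start with the same good record.
replicas = [{"order-1": "shipped"} for _ in range(3)]
backups = []  # list of (version, snapshot) pairs kept outside the cluster


def replicate_write(key, value):
    """Replication faithfully copies every write, good or bad, to all nodes."""
    for replica in replicas:
        replica[key] = value


def take_backup(version):
    """Store an independent, versioned copy of the data."""
    backups.append((version, copy.deepcopy(replicas[0])))


def restore(version):
    """Overwrite every replica with the contents of the chosen backup version."""
    for saved_version, saved_data in backups:
        if saved_version == version:
            for replica in replicas:
                replica.clear()
                replica.update(copy.deepcopy(saved_data))
            return
    raise KeyError(version)


take_backup(1)
replicate_write("order-1", "\x00\x00garbage")  # a corrupted write
all_corrupt = all(r["order-1"] != "shipped" for r in replicas)

restore(1)
all_restored = all(r["order-1"] == "shipped" for r in replicas)

print(all_corrupt, all_restored)
```

Replication alone leaves all three copies corrupted. Only the separately stored version makes it possible to go back to a known good state.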
Earlier this month, Datos IO launched RecoverX to address those concerns through features including what it calls scalable versioning and semantic deduplication. The result is cluster-consistent backups that are both space-efficient and available in native formats, the company says.
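The article does not describe how RecoverX implements semantic deduplication, so the following Python sketch shows only one plausible reading of the general idea: because a distributed database stores the same logical record on several nodes, a backup can hash each record's canonical content and keep a single copy. The function names and sample records are assumptions for illustration, not Datos IO's method.

```python
import hashlib
import json


def record_digest(record):
    """Hash a record's canonical JSON form so that key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedupe(replica_dumps):
    """Keep one copy of each logically identical record across all node dumps."""
    unique = {}
    for dump in replica_dumps:
        for record in dump:
            unique.setdefault(record_digest(record), record)
    return list(unique.values())


# Each record lives on two of the three nodes (a replication factor of 2).
node_a = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
node_b = [{"name": "ann", "id": 1}, {"id": 3, "name": "cy"}]
node_c = [{"id": 2, "name": "bob"}, {"id": 3, "name": "cy"}]

stored = dedupe([node_a, node_b, node_c])
print(len(stored))  # six records read from the nodes, three kept
```

Comparing records by content rather than by raw storage blocks is what lets the two differently ordered copies of the first record collapse into one.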
Souvik Das, who until recently was CTO and managing vice president of engineering with CapitalOne Auto Finance, has felt the backup crunch first-hand.
After years of using traditional databases, CapitalOne underwent a “massive transformation” a few years back that included rolling out new distributed technologies such as Cassandra, said Das, who is now senior vice president of engineering at healthcare-focused startup Grand Rounds.
That meant looking for a new strategy for backup and recovery.
“Most of the backup vendors and software are typically tuned to the type of systems that they’re backing up,” he explained.
Using an older-style backup product with a newer distributed database could spell trouble, he said.
“Either that software would completely fail because it has no idea how to back up the new data stores, or it would work in a very suboptimal way,” Das said. “We knew going in that we would have to have different backup solutions.”
CapitalOne has been evaluating Datos IO as well as Talena, another major player in the space, Das said.
Vendors of more traditional backup products are gradually adjusting their own technologies for big data as well.
“It usually takes the incumbent backup vendors some time to support the newer technologies,” 451 Research’s Robinson said.
“Rewind 10 years and it was very difficult initially to easily do backups for VMware virtual machines,” he added. “This opened the door for players like Veeam to enter and steal the VM backup market from under the noses of the incumbents.”
This article was originally published on www.pcworld.com.

