As you do, I was reading up on RAID levels while in the bath. The topic of atomicity came up, and it’s something I wanted to share.
Wikipedia isn’t usually the most reliable source of technical data, but I’ll quote it here to explain atomicity and set the stage. This is taken from the “Problems with RAID” section of http://en.wikipedia.org/wiki/RAID…
This is a little understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote “Update in Place is a Poison Apple” during the early days of relational database commercialization. However, this warning largely went unheeded and fell by the wayside upon the advent of RAID, which many software engineers mistook as solving all data storage integrity and reliability problems. Many software programs update a storage object “in-place”; that is, they write a new version of the object on to the same disk addresses as the old version of the object. While the software may also log some delta information elsewhere, it expects the storage to present “atomic write semantics,” meaning that the write of the data either occurred in its entirety or did not occur at all.
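To make the “either in its entirety or not at all” idea concrete, here is a minimal sketch of one common way software avoids update-in-place at the file level: write the new version somewhere else, then swap it in with a single rename. This is my own illustration, not something from the Wikipedia article. The function name `atomic_write` is made up for the example, and it assumes a POSIX-style filesystem where a rename within one filesystem is atomic.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers see the old or the new version, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version to a temporary file in the same directory,
    # so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new bytes to stable storage first
        # Swap the name over in one step (assumption: rename is atomic here).
        os.replace(tmp_path, path)
    except BaseException:
        # The original file was never touched; just remove the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the machine dies before the `os.replace` call, the old file is still intact. If it dies afterwards, the new file is complete. What this sketch never does is scribble over the old bytes in place, which is exactly the habit Jim Gray was warning about.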
This has come back to light recently in a different guise: SSD write failures. Many SSD manufacturers and enterprise storage vendors are addressing it with new firmware that writes all data sequentially, never overwriting a data block until the whole disk has been written, and only then wrapping around to reuse blocks from the start (once those have been freed up, obviously).
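As a rough picture of that sequential, never-overwrite-in-place behaviour, here is a toy model. It is only a sketch of the idea as I understand it, not any vendor's actual firmware. The class `SequentialLog` and its methods are hypothetical names, and real devices deal with erase blocks, wear levelling and garbage collection that this ignores.

```python
class SequentialLog:
    """Toy model: every write goes to the next free physical block, never in place."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [None] * num_blocks   # physical block contents
        self.live = [False] * num_blocks    # True if the block holds current data
        self.mapping = {}                   # logical id -> physical block index
        self.head = 0                       # next position to try writing at

    def write(self, logical_id, data) -> int:
        n = len(self.blocks)
        # Walk forward from the head (wrapping around) to the first block
        # that does not hold live data.
        for step in range(n):
            idx = (self.head + step) % n
            if not self.live[idx]:
                break
        else:
            raise RuntimeError("no free blocks left")
        self.blocks[idx] = data
        self.live[idx] = True
        # Only after the new copy is written is the old copy marked as free.
        old = self.mapping.get(logical_id)
        if old is not None:
            self.live[old] = False
        self.mapping[logical_id] = idx
        self.head = (idx + 1) % n
        return idx

    def read(self, logical_id):
        return self.blocks[self.mapping[logical_id]]
```

With four blocks, writing “a”, then “b”, then “a” twice more lands on physical blocks 0, 1, 2 and 3 in turn. The next write wraps around to block 0, which was freed when “a” was rewritten. At no point is a block holding current data overwritten, so an interrupted write can only damage a copy nobody was relying on yet.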