"SRP" and Acronym Overload

This article originally appeared on Logicworks' old blog.
Logicworks' new blog can be found at www.gatheringclouds.com.

We're working to push the performance limits here at Logicworks so that we can offer a no-compromise cloud/virtualization product, and one of the technologies we're working with is SRP. This is cool stuff, but sometimes the acronym soup of modern computing technology makes it hard to stay sane when researching new things. This particular acronym, SRP, finally made my head hurt, and I had to step back just a little to consider how pregnant with meaning it is. SRP stands for "SCSI RDMA Protocol", a term which is itself two-thirds acronym. Expanding further, we arrive at "Small Computer System Interface Remote Direct Memory Access Protocol". Let's analyze, while pedantically expanding every acronym we encounter.

SCSI (Small Computer System Interface) has a history going back to the late 1970s, but it was in 1982 that it was given its current name by ANSI (another acronym, the American National Standards Institute). Standardized in 1986, it was intended as a peripheral interconnect for "small" machines, in contrast to minicomputers such as DEC's (Digital Equipment Corporation's) VAX (Virtual Address eXtension) systems and IBM's (...you know this one) mainframe systems. Since then, small machines have grown up, absorbed almost the entire minicomputer market, and even put a serious dent in the mainframe market. Also, the "system interface" part of SCSI is a bit too general today: SCSI is used almost exclusively for attaching high-end storage, and the preferred general-purpose peripheral interconnect these days is USB (Universal Serial Bus). Considering that SCSI is used today in server storage, it's probably more accurate to think of it as "Server Computer Storage Interface". At least it still kinda fits.

RDMA, "Remote Direct Memory Access", stands in contrast to the much more common local DMA. DMA is a mechanism by which a computer can coordinate local subsystem access to main memory. For example, if a CPU (Central Processing Unit) had to read a packet from, or write a packet to, a network interface's internal buffer byte by byte, it would crush the system's performance. With such a design, even the fastest CPUs on the market today would spend almost all of their processing power shuttling data back and forth and do little actual processing. DMA allows the CPU to tell a DMA controller to autonomously transfer sections of memory to and from peripherals such as network interfaces, disk controllers, or user interface devices. This gives peripherals direct access to memory, hence the name.
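
To make the difference concrete, here is a toy model in Python. It does not touch real hardware; it only counts how many "steps" the CPU spends moving one packet either byte by byte or by handing a single transfer description to a DMA engine. The names (`programmed_io_copy`, `ToyDmaController`, `dma_copy`) and the one-step cost of programming a transfer are assumptions made up for illustration.

```python
# Toy model only: it counts "CPU steps" and does not touch real hardware.

def programmed_io_copy(device_buffer: bytes, memory: bytearray, offset: int) -> int:
    """The CPU copies every byte itself; returns the CPU steps spent."""
    cpu_steps = 0
    for i, byte in enumerate(device_buffer):
        memory[offset + i] = byte
        cpu_steps += 1
    return cpu_steps


class ToyDmaController:
    """Hypothetical DMA engine: the CPU only describes the transfer."""

    def transfer(self, device_buffer: bytes, memory: bytearray, offset: int) -> None:
        # Bulk copy done by the "controller", so it is not counted as CPU work.
        memory[offset:offset + len(device_buffer)] = device_buffer


def dma_copy(controller: ToyDmaController, device_buffer: bytes,
             memory: bytearray, offset: int) -> int:
    """The CPU programs one transfer (source, destination, length), then is free."""
    controller.transfer(device_buffer, memory, offset)
    return 1  # assumed cost: one step to program the transfer


packet = bytes(range(256)) * 6      # a 1536-byte "packet"
mem_a = bytearray(4096)
mem_b = bytearray(4096)

pio_steps = programmed_io_copy(packet, mem_a, 0)
dma_steps = dma_copy(ToyDmaController(), packet, mem_b, 0)

print("byte-by-byte CPU steps:", pio_steps)
print("DMA CPU steps:", dma_steps)
```

Both paths leave the same bytes in memory; the only thing that changes is who did the copying. Real DMA involves setup, interrupts, and cache effects that this sketch ignores.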

"Remote-Direct" is something of an oxymoron, but it just means that instead of a CPU directing a DMA controller to copy a section of memory to or from a local peripheral, a network interface copies data from one machine's main memory to another machine's main memory without either CPU handling the individual bytes. This is extremely powerful technology and may be slightly ahead of its time for servers in 2009. As clusters of machines become more commonplace, so will this technology. So RDMA is an accurate acronym, although at first analysis it may appear nonsensical.
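
The same kind of toy model can stretch across two machines. In this sketch the receiving host's CPU registers a region of memory once, and after that the "network interface" writes into it without the receiving CPU doing anything. `ToyHost`, `register_region`, and `rdma_write` are hypothetical names for illustration, not a real RDMA API, and the step counts are assumptions.

```python
# Toy model of an RDMA write between two hosts; no real networking involved.

class ToyHost:
    def __init__(self, name: str, size: int) -> None:
        self.name = name
        self.memory = bytearray(size)
        self.cpu_steps = 0      # work done by this host's CPU
        self.registered = {}    # key -> (offset, length) exposed to the NIC

    def register_region(self, key: str, offset: int, length: int) -> None:
        """One-time CPU setup: expose a memory region to the network interface."""
        self.cpu_steps += 1
        self.registered[key] = (offset, length)


def rdma_write(src: ToyHost, src_offset: int, length: int,
               dst: ToyHost, dst_key: str) -> None:
    """The 'NICs' move the bytes; only the initiating CPU posts a request."""
    dst_offset, dst_length = dst.registered[dst_key]
    if length > dst_length:
        raise ValueError("transfer exceeds the registered region")
    src.cpu_steps += 1  # assumed cost: one step to post the request
    dst.memory[dst_offset:dst_offset + length] = \
        src.memory[src_offset:src_offset + length]


host_a = ToyHost("a", 1024)
host_b = ToyHost("b", 1024)

host_b.register_region("buf", 128, 512)   # setup, done once
host_a.memory[0:5] = b"hello"
rdma_write(host_a, 0, 5, host_b, "buf")

print(bytes(host_b.memory[128:133]))
print("host_b CPU steps:", host_b.cpu_steps)
```

After the write, host B's memory holds the data even though its CPU did nothing beyond the initial registration, which is the whole point of the "remote direct" part.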

And finally there's "Protocol", which is a common industry term for "standard rules of communication", but you already knew that.

So to pull it all together, "Small Computer System Interface Remote Direct Memory Access Protocol" or "SRP" is a standard set of communication rules that allows the transfer of a storage protocol and its associated data between two machines' main memory, over a network interface, and without burdening the CPU with the explicit transfer of every byte of data. It sounds like a simple idea that would be very complex to implement, and that's definitely the case. But when combined with a network technology like InfiniBand, it allows for very low-latency and extremely high-bandwidth storage access for our clients. This is a critical piece of what we offer in our cloud storage products to differentiate us from our competitors in performance and reliability.
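
For a feel of what the "storage protocol" half looks like on the wire, here is a small sketch that builds a standard 10-byte SCSI READ(10) command block in Python. The field layout (opcode 0x28, a 4-byte big-endian logical block address, a 2-byte block count) follows the SCSI command set; the function name `build_read10` is made up for this example, and how SRP wraps such a command in its own messages is deliberately left out.

```python
import struct

READ_10_OPCODE = 0x28  # SCSI READ(10) operation code


def build_read10(lba: int, blocks: int) -> bytes:
    """Build a 10-byte READ(10) command descriptor block.

    Layout: opcode, flags, 4-byte LBA, group number, 2-byte length, control.
    Flags, group number, and control are left at zero in this sketch.
    """
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("LBA must fit in 32 bits")
    if not 0 <= blocks <= 0xFFFF:
        raise ValueError("block count must fit in 16 bits")
    # ">" selects big-endian byte order with no padding.
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, blocks, 0)


cdb = build_read10(lba=2048, blocks=8)
print(cdb.hex())  # 28000000080000000800
```

A command like this is tiny. The data it asks for is the bulky part, and that is the part RDMA lets the network hardware place directly into memory.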

Easy to say, very difficult to explain precisely. But it seems that's the way with anything in "cloud computing". ;)