The Fake Synchronous SAN

In response to this article on “EnterpriseStorageForum”:

 Synchronous SAN Sets Fibre Channel Distance Record

My Response:

True synchronous transmission works over any distance, if you can live with the latency. The problem is that most host operating systems can’t. So different buffering schemes are cooked up to fool the host into thinking the write is complete on both sides when in fact it’s not.

Once you get past about 30 km, the latency (that is, the time it takes for the I/O to be transmitted, acknowledged, and released) approaches that of a normal unbuffered physical drive, about 20-30 ms.

Any further and you will start seeing slower and slower response times and eventually IO timeouts on the source hosts.

For a storage system to be truly “synchronous”, the array cannot acknowledge the I/O to the host until it has received the write ACK from both the source *AND* the target array. If there is buffering going on between point A and point B, such as a Cisco MDS with the buffer credits cranked up or a Nishan IPS3300, it is not truly synchronous replication, because the failure of the switch on the source (or target) side will cause the target array to have missed I/Os that have already been acknowledged to the host as complete.
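To make that failure mode concrete, here is a toy Python sketch. It is not how any real array or switch is implemented. `Array`, `BufferingLink`, `replicate`, and `run` are hypothetical names invented for illustration, and the "link device" simply ACKs a write as soon as it is queued. The only point is to show that an early ACK plus a device failure leaves the target missing writes the host already believes are complete.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Toy storage array: remembers the IDs of writes it has committed."""
    committed: list = field(default_factory=list)

    def write(self, io_id: int) -> bool:
        self.committed.append(io_id)
        return True  # write ACK, sent only after the data is committed


@dataclass
class BufferingLink:
    """Hypothetical in-path device that ACKs a write before the target has it."""
    target: Array
    queue: list = field(default_factory=list)

    def write(self, io_id: int) -> bool:
        self.queue.append(io_id)
        return True  # early ACK: the target array has not seen this write yet

    def forward(self, count: int) -> None:
        """Deliver up to `count` queued writes to the target array."""
        for _ in range(min(count, len(self.queue))):
            self.target.write(self.queue.pop(0))

    def fail(self) -> None:
        """Simulate the device dying: anything still queued is lost."""
        self.queue.clear()


def replicate(io_id: int, source: Array, remote) -> bool:
    """ACK the host only when both the source and the remote path have ACKed."""
    return source.write(io_id) and remote.write(io_id)


def run(buffered: bool, writes: int = 5, forwarded: int = 3):
    """Return (writes ACKed to the host, ACKed writes missing on the target)."""
    source, target = Array(), Array()
    remote = BufferingLink(target) if buffered else target
    acked = [i for i in range(writes) if replicate(i, source, remote)]
    if buffered:
        remote.forward(forwarded)  # some writes make it across...
        remote.fail()              # ...then the link device fails
    missing = [i for i in acked if i not in target.committed]
    return acked, missing


if __name__ == "__main__":
    print("synchronous:", run(buffered=False))
    print("buffered:   ", run(buffered=True))
```

In the unbuffered run, every write the host saw acknowledged is on the target. In the buffered run, the last two writes were acknowledged to the host but never reached the target, which is exactly the inconsistency described above.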

Sorry, but this test, as it appears to have been run, was obviously designed by the various vendors to accentuate their hardware without showing the failures and flaws in the logic. I’m sure I could walk in there and in about 5 minutes simulate a link failure that would leave the remote site in an inconsistent state at worst, or having to roll back journaled I/Os at best.
