To Cloud, or not to Cloud…

It really does seem to be the question…  the sad part is how many people I talk to in my travels don’t really understand what cloud even is, let alone what the pros and cons are of moving your applications into it.

Background – a company is considering moving probably 3,000-5,000+ users to Gmail as a ‘corporate’ email system…  They are running Exchange currently…

Apparently, they don’t read the news and have missed out on the multiple spectacular failures of services like Google, Amazon and the like.

Cloud services are GREAT if you are running a small business, don’t want to / can’t afford an IT budget, or just plain don’t want to deal with it.

If you’re a billion dollar corporation with a multi-million dollar IT infrastructure already in place, outsourcing email seems a bit…odd.

Granted, if you are this company, you are obviously going to get the top-of-the-line service, dedicated support personnel, etc.  You’re also buying plausible deniability should data-loss put you in jeopardy under subpoena. (While “I disposed of the data” is bad, “The company I was outsourcing to lost it” is not as bad.)

“Honest your honor, we had the emails but Google deleted them by accident.”

*DISCLAIMER – I’m not implying that Google would ever do something like this on purpose; I’m using them as a generic, like Xerox.

** It’s Google’s fault…they’re big enough to have become the verb.

***Does anyone actually own a Xerox branded machine anymore?

So if you’re SuperMegaCorp, LLC…you pay for the real service.  You get dedicated support staff, a private line to call, etc.  But to be honest, you might as well keep it in house because hey, you already have the staff, the datacenter, the VMware farm, etc.  At that point you’re talking a few dollars in licensing and you’ve got email addresses for your thousands of employees for pennies each.  (Ok, yes, add in replication, backup, etc and it gets a bit higher, but the point is you’ve already commoditized it. (is too a word))

But think about it this way.  The company you’re contracting to has to pay for the same things *YOU* have to pay for.  *PLUS* they have to make enough of a profit to keep their shareholders off their back.  They do get a bit of a discount for bulk licensing, hardware, etc…

But what you GET for hosting it in house is immeasurable.  You get control.

At my last gig I heard the following phrase over and over again.  “I want one neck to choke.” (Oddly enough it was the argument given for moving AWAY from their previously preferred vendor, but you get the idea.)

When the email admin works for you, you have one neck to choke.  You get immediate results. Or you get the pleasure of firing someone.  (Can be fun in the right circumstances, ask The Donald.)

Now say you hosted with Amazon, just for grins.

Not only are your hosts down, potentially THOUSANDS of other hosts are down as well.  Now while we would like to believe they have a thousand techs on staff to give each customer equal time…let’s face it.  It’s not going to happen.  They have, EXTREMELY generously, 10 technicians per thousand customers.  The techs will bring hosts up as soon as they can…

In an egalitarian society, odds are quite simply about 1000:1 against your site being the first one brought up…  990:1 against it being the second, etc.  See where I’m getting?  Eventually they’ll get around to it, but unless they figured out time travel and can loop back and do them all at the same point in time…you’re out of luck.  Yes, you’ve probably got a 99.999% uptime guarantee…but read the small print of your contract…  Their liability to you cannot exceed the cost of the hosting, if that, or some similar legalese that limits their liability for downtime and, god forbid, data loss.
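To put a number on what a “five nines” guarantee actually promises, here is a quick back-of-the-envelope sketch.  It assumes a plain 365-day year, and the function name is just something made up for illustration.  It says nothing about how any particular provider measures or credits downtime.

```python
# Rough arithmetic: how much downtime per year an uptime percentage allows.
# Assumes a 365-day year; the function name is illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% uptime allows about {minutes:,.1f} minutes of downtime per year")
```

At 99.999% that works out to a little over five minutes a year, which is why the liability cap in the small print matters far more than the headline number.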

But this is not an egalitarian society…  Pure capitalism and “he who has the most gold gets their email back first.” If you’re with Amazon, well they host some PRETTY big sites…including their own.  Netflix comes to mind.  So in a downtime event if it comes down to bringing Joe the Plumber’s CRM app or Netflix’s east-coast streaming…which one do you think is going to get priority?

Right.

I have one neck to choke…  50Micron is hosted by Catbytes… the company that I do my consulting through.  Reason being that I maintain the lab anyway for “play” (officially: self-education and training) purposes, it’s easy for me to spin up an extra VM and put Exchange on it, a couple of CentOS Mailscanners, a few webservers, etc, even off-site replication of backups over a 10MBit link to a “DR” site (that happens to be in my basement)  (If someone wants to donate another CX3-20i or a couple of FCIP bridges I’ll have block-level replication. 😉 )

When Amazon EC2 had their issues, suspiciously I had a pretty major crash as well… (As did the customer I was working for at the time, don’t get me started on my paranoid theories.)

But when my stuff breaks… It’s my fault, it’s my responsibility, and *I* am the only one in line.  If I had hosted with Google or Amazon I might have been down for weeks…

I was back up in about 2 hours.  The time it took me to cycle the environment remotely. 🙂

Yes…building an IT infrastructure if you don’t already have one can be pricey.  Paying someone else for hosting when you already HAVE an IT infrastructure just plain doesn’t make sense.

P.S. The funniest part is I’m now hosting about a half-dozen servers for friends/family (not free, I’m ugly, not stupid; and co-lo cages are NOT cheap) and about 40-50 websites that I’ve gotten via friends and word-of-mouth…

Of course my guarantee is as follows:

“Best effort, and you have to realize I have a day job that by its very nature comes first.”  🙂

In case you’re wondering…

Point of reference – A few months ago I wrote a post that I never ended up publishing that started with the line:

“My gods I need to work with technology that wasn’t conceived of in the 1990s.”

With that in mind, in case you’re wondering where I’ve been this past month or so…

I’ve been playing with this beast…

8 Engine VMAX

225 400G SSD Drives (90 TB Raw)

Direct Attached to *ONE* host.

Biggest.  Thumbdrive.  Ever.

Well I was saying I needed to get some serious hands-on VMAX experience.  When you put a request like that out there, sometimes the universe answers LOUDLY. 😉

VMware Host Isolation Response…

I learned what “Host Isolation Response” was today.  Well I already knew what it was, but I learned that in a VMware cluster, if you leave it at the default, then if the network goes away between the clustered hosts, the HOST then RESPONDS to this ISOLATION by shutting your entire environment down.

Oops.

Not that anyone would notice, but from 1:30 to 2:00pm EST the system was offline because I (ironically) unplugged the switch briefly to put a battery behind it.  Needless to say it’s better now, but it wasn’t quite the “momentary interruption” I had hoped for.

Not going to be long with this one, needless to say I’m prepping to start traveling again.  Not totally excited about it, but I hear Seattle is nice during the summer in much the same way Virginia isn’t, so at least there’s an upside.

And suddenly…(redux)

**ALERT** I’ve had to…modify this post so it won’t offend someone who doesn’t realize that the storage community is very small and that word will get out regardless…

I’m unemployed.

Unexpectedly too.  Unexpected because right up until the day they told me to go home because I wasn’t getting paid, everyone assured me that the contract renewal was in the bag.

I’m such a sucker.  Believing people like that.  Never again.  I’ll also never believe anyone who tells me “don’t worry about it, I’ve got you covered if there’s a gap.”

It’s ok, next gig is on the horizon already…  And it looks like it will be something that while geographically unpleasant, will be a great job I can learn a LOT from, and truly excel at, which for me is key, because I’ve spent the past two years trying to shoe-horn new ideas into the heads of people who think a new idea is like anthrax, to be avoided at all costs.

(And with that I’d like to say hello to the nice folks at the NSA.  Please forgive me, it was an analogy, if a badly placed one.)

Consulting sucks sometimes.  The worst part of course is not knowing where you’ll be working from year to year, or the fact that you have to keep your eyes open, in permanent recruiter mode.

Of course the money is great, and if you tend to go stagnant on doing the same thing over, and over, and over again…It’s nice to be able to change.

It’s a pity that with being yanked out of an environment with no notice comes no turnover on the projects, and that there are a few implementations that I was in the middle of that might blow up if not tended to properly and in the right time-frame, which sadly isn’t far off.

(Ok, first anthrax and then the phrase ‘blow-up’ – the boys in black are DEFINITELY knocking on my door tonight.)

So the real question is…who is going to get saddled with picking up where I left off, *AND* are they going to ask me to help…

Can’t wait until that happens to give me the opportunity to lecture someone on the value of giving notice. 🙂

Good Cloud, Bad Cloud, a Titanic story…

This week’s abject failure of Amazon.com’s EC2 hosting environment has caused quite the stir.  There are those who say that this incident “Proves Cloud Failure Recovery is a Myth” and others who say that we should just give it a chance.

Facts are facts.  Amazon screwed the pooch big-time last week.  Their outage caused ripple effects nation-wide.  But while it’s easy to throw the blame at Amazon for the failure, it’s important to remember that cloud computing is still only in its infancy, and this mad rush to adopt it is part and parcel of the reason these problems are happening.  Customers rushing for a new product creates demand, companies looking to be the first to capitalize on that demand create a product that may or may not be ready for prime time.

But because no-one ever (because it’s impossible) thought to test the kind of cascade failure they experienced, they were pushing the high-availability envelope right out of the gate.

So no big deal, right?  Foursquare, parts of Netflix, etc. were down due to the outage.  Other than inconvenience and the inability of narcissistic people to let the world know where they are and what they’re doing, it’s not really that big a deal (for us).

And then this came out: https://forums.aws.amazon.com/thread.jspa?threadID=65649&tstart=0

Specifically this line:

“We are a monitoring company and are monitoring hundreds of cardiac patients at home.  We were unable to see their ECG signals since 21st of April.”

Really?  You have a life-critical application and you hosted it “in the cloud”?  Did it never occur to you that it’s probably *NOT* a good place for a life-or-death application?  While I would consider it as a backup, definitely not my one and only.

People who know me know I have a rule.  I don’t say it works until I’ve seen it work at least once, and even then I’ll qualify my statement with “well I saw it work under THESE conditions.”  I do *NOT* say something works based on what some sales or marketing person tells me works.  (Trust me, this has been a major sticking point between me and my sales team. 😉 )

That being said, you have to accept that if you put your critical apps in “the cloud” by its very nature you are abdicating your control over it, and putting your full faith in someone ELSE to fix the problem.  Someone who may not think your application is as important as the one in the rack next to yours.

Are you going to take someone’s word that something is “Highly Available” if you haven’t actually pulled the plug yourself and watched it fail over?  I won’t.  I will candidly couch my answer in “That’s the way it’s supposed to work” or “That’s the way it’s designed to work”  But until you see a failover, that’s not the way it DOES work, because it never has.

I run my own email, my own webserver, my own infrastructure. I prefer it this way, because now if the system goes down, I know exactly whose butt to kick.

As a rule, if I’m paying someone else to provide a service… I make sure I know where, how, and who to call when it blows up.  It’s probably the best advice I can give.

Amazon billed this as being “highly available” and maybe it is, for the most part.  But obviously if you think of a million ways for something to go wrong, you can bet even money on there being at least a million and one ways for it to fail.

Instead of EC2, they should have named it “Titanic” because everyone knows the easiest way to invite disaster is to tell the world you’re immune to it.

IBM XiV – Real-Life impressions…

The Ethernet back-end on an XiV will still be its undoing

That's a lotta ethernet...

First impression of the XiV in “action”

The GUI is fancy.  Looks like a Mac turned on its side. The GUI is also NOT web-based.  It’s an app-install. I do believe however it’s available for multiple platforms.

It really does seem to take all of the guess work out of provisioning since you don’t really have any say on what goes where in your array.

Our first use?  Backing up 6+ TB that was stored on Clariion and moving it to XiV…

Now first off, I’m glad it was decided to do it this way.  Whereas a copy straight from one to the other is possible, utilizing both arrays at the same time, it wouldn’t have provided any comparison as to performance.

The backup was done using Veritas NetBackup, over the network.  The data consisted of a pair of hosts running an extensive XML-type database used for indexing and categorization of unstructured content.  The backup and restore were both done to the same host, over the same network, and the storage was addressed over the same switches, just zoned to different arrays.  The only significant difference was that while the backup was done multiplexed, the restore had to be done single-threaded…(because NBU multiplexed both backups to the same tape)

I have to get the final start/stop-times out of NBU, but from the hallway conversation I had with the NBU guy, the backup took 6-8 hours (for both hosts), the restore took 21+ hours…
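For a rough sense of scale, those times can be turned into average throughput.  This is a sketch only: it assumes exactly 6 TB (the post says “6+”), decimal units, and the hallway-conversation times rather than the real NBU start/stop times, so treat the output as ballpark figures.

```python
# Ballpark average throughput from the quoted backup/restore times.
# Assumptions: exactly 6 TB of data, decimal units (1 TB = 10**12 bytes),
# and the approximate durations quoted above.
TB = 10**12
MB = 10**6


def throughput_mb_per_s(total_bytes: float, hours: float) -> float:
    """Average transfer rate in MB/s for a job of the given size and duration."""
    return total_bytes / (hours * 3600) / MB


data_bytes = 6 * TB
for label, hours in (("backup, 6 hours", 6), ("backup, 8 hours", 8), ("restore, 21 hours", 21)):
    print(f"{label}: about {throughput_mb_per_s(data_bytes, hours):.0f} MB/s")
```

Under those assumptions the backup averaged somewhere in the low-to-high 200s of MB/s, while the restore averaged under 100 MB/s.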

The most interesting part of it was the first restore took almost the same amount of time as the backup, which is kind of what we would expect.  The second host took dramatically longer to restore than to back-up.

This would indicate to me that, as expected, the XiV didn’t handle the long, sequential write very well.  Since the host only connects to two of the six data nodes, virtually 100% of writes have to be destaged over the Gig-E backend.  My guess is we nailed the cache to the wall with the first restore, and then kept it pegged with the second one.

I like sequential write-tests on this scale because it shows without a doubt whether the cache is masking a back-end issue or not.  If it is, this is exactly what you’ll see.  An initial burst of writes followed by a sharp drop as cache is saturated.  This is even more pronounced in a more utilized array (rather than an idle one) because a certain percentage of cache will already be utilized by host reads/writes.
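The burst-then-drop pattern is easy to see in a toy model of a write-back cache.  Everything here is hypothetical: the cache size, the ingest rate, and the destage rate are made-up numbers for illustration, not XiV specifications, and a real array is far more complicated than a single bucket draining at a fixed rate.

```python
# Toy model of a write-back cache in front of a slower back end.
# All figures are invented for illustration; none are XiV specifications.

def simulate_writes(cache_gb, ingest_gb_s, destage_gb_s, seconds):
    """Return the write rate (GB/s) the host actually achieves each second."""
    used = 0.0
    rates = []
    for _ in range(seconds):
        # The back end drains whatever it can from cache.
        used = max(0.0, used - destage_gb_s)
        # The host can only write as much as there is free cache to absorb.
        accepted = min(ingest_gb_s, cache_gb - used)
        used += accepted
        rates.append(accepted)
    return rates


rates = simulate_writes(cache_gb=10, ingest_gb_s=1.0, destage_gb_s=0.25, seconds=30)
for second, rate in enumerate(rates):
    print(f"t={second:2d}s  host write rate: {rate:.2f} GB/s")
```

While the cache has room, the host writes at full speed.  Once it fills, the host is throttled to whatever the back end can destage.  Starting the model with the cache already partly full (a busy array) just brings the drop forward.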

This doesn’t bode well for an application that requires occasional complete reloads of the XML database…

I can’t wait to see it in action.

The Macintosh Experiment – Final

Well, my 30 days are up.

I enjoyed using it, and I definitely see the upside in Apple computers over PC’s.  But I’m going back to my Dell Precision690. (Already have actually)

Most of the “failings” of the Mac G5 Pro I was using can probably be attributed to the fact that it’s a G5.  So much software doesn’t work on the PowerPC’s, developers have given up on them.. (as is probably justified, they’re old)  and upgrading to a MacIntel would probably solve a few (but not all) of the problems I was having with compatibility.

A few points:

Negative:

  • MS Entourage had significant issues.  I was forced to use the EWS (Exchange Web Services) client instead of the standard, because my Exchange environment is Exchange 2010.  Maybe I jumped the gun in upgrading to Exchange 2010.  Entourage 2008 doesn’t work with Exchange 2010, because Microsoft did away with WebDAV.
  • The MS RDP Client for Mac (v1.0 due to PPC Support) only supports a single session.  I usually have 3-4 RDP sessions open at a time, so this was a significant limitation.
  • TimeMachine doesn’t like to back up to a network drive.  I found a few workarounds but was never able to try and get it working.  I prefer to backup to an offsite location.
  • NTFS read-write support doesn’t exist in Leopard (10.5.8)  Though read-only support exists, if I can’t write to an NTFS formatted thumbdrive this is useless to me.  I’ve found some third-party drivers but they are both expensive and buggy.  I’m told this exists in SnowLeopard (10.6.x) but again, not willing to shell out that kind of money for a computer to do something I can do with windows.
  • Software is expensive…  The version of Quickbooks that I paid $99 for on Windows was $299 on Mac.  WTF is up with that?!

Positive:

  • I love having a native BASH shell.  I do a *LOT* of scripting, and it’s nice to be able to do it hands on.
  • The GUI is very intuitive, I like the Dock (Akin to Cairo-Dock for Linux)
  • I enjoyed iPhoto – the face-recognition, while imperfect, was interesting to play with.
  • Application installations were easy, and almost NEVER required a reboot.
  • It’s mostly quiet.  I love a computer I don’t hear running.  Though the Precision is pretty quiet too.  And the Mac “Jet-Engines” when you put it under load whereas the Dell doesn’t.

– And finally:

  • The start-up chime the mac makes *REALLY* annoys my eldest son, who for some reason (couldn’t be his dad, could it?) HATES apple products.  I must have rebooted it ten times one night while he was in the other room playing BlackOps just to hear him complain.

Bottom line, I work with EMC products.  Much of the software I use in my work runs on Windows by virtue of the fact that EMC writes it that way.  (Why Symmwin hasn’t been ported to CentOS or some such yet is beyond me….would save the company MILLIONS every year in software licensing)

But it all comes down to cost.  The starting price of a new Mac Pro is $2499 (Source: Apple)  That’s for a ‘simple’ box with a quad-core processor.  The higher-end systems (12-core, 2x 6-core CPU’s) run $4,999.

Macs are more expensive.  As a side-note, I walked into Micro-Center to buy memory for it.  The G5 uses standard DDR, PC3200 memory.  In the *SAME STORE* memory was two different prices, depending on whether you were in the Mac side or the PC side.  For PC’s the 1GB PC3200 memory was $29/ea.

On the Mac side, it was $59/ea.  What amazed me mostly was the fact that the guy behind the counter said that people would GLADLY pay the extra $30 for the exact same memory because it said “Mac Ready” on the label.  (It was even the same manufacturer)

Wow.  That’s all I can say about that.  Wow.  That’s abusive.  That’s taking advantage of people who don’t know any better.  Double?  Really Apple?  (Well this wasn’t Apple, but it is the general problem.)

Let’s put this into perspective.  The Dell Precision 690 I have runs 2x Dual-Core 3.0Ghz Xeon CPU’s, 8G of ram, and it cost me less than $1,000 when bought separately.  It’s a faster box.  (Twice as many CPU cores, DDR2, PC5300 memory, etc)

Now I’m not the type to buy the latest and greatest.  I’ve never bought a “new” laptop in my life, (I prefer refurbs, especially since Dell sells them with the exact same warranty as new at half the price.) I drive a 6-year old Prius, my wife drives a 10-year old Chevy.  I have a modest house in the suburbs that’s slightly crooked but fits my needs, but isn’t flashy by any stretch.  And every piece of computer equipment I buy for the datacenter is second-hand.  (we just acquired a pair of Cisco 9140 Switches, how many generations back is that?)

To go out and buy a “NEW” Mac for those prices is completely INSANE.  Now I could probably buy one used on ebay.  (Apple people tend to upgrade often, so there are lots of them out there.)

So in my humble view – Macs are great personal computers, and wonderful graphics arts systems.  They *CAN* be used in business if you’re willing to make some sacrifices, but again, if you want stuff to just work, Windows is still the way to go for business.

I *MAY* consider a used MacBook Pro though.  I can see where the portable version would come in VERY handy, and you can get Intel-based MacBooks on Ebay (lease-returns) pretty cheap.  (I’m amazed Apple doesn’t have an outlet store like Dell does)

This concludes my latest experiment.

P.S. For Sale – Mac Pro G5 Tower.  Dual 2.5Ghz PPC, 8GB Ram, 2x 250G HardDisks, dual-port Video, Keyboard/Mouse (new).  MacOS 10.5.6 Leopard (Installed, no media)

Make me an offer.

Day-24 (Mac Experiment)

I told you I had no concept of days right?

Well I think it’s an “I can use this” thing.  The only downside I’ve found so far probably has more to do with my outdated hardware than anything else.

I’ve since upgraded the Single Processor 1.6Ghz G5 to a dual-processor 2.5Ghz G5.  The difference in performance is obviously grand, plus the dual 2.5 has 8 DIMM slots for memory instead of 4.

So now I’m up to 8G of Ram.

What I found most interesting is that to move from the old system to the new it was simply a matter of moving the drives over.  I guess simplification and standardization of the hardware means that unlike windows/PC hosts, you never have to worry about whether or not the drivers are installed when you upgrade.

I also had a pretty good time with “TimeMachine”

It seems like it does a great “Grandfather-Father-Son” backup automatically, and without the user having to understand what a “GFS” backup is.  So you can restore to any hour in the last 24, day in the last month, or month in the last (however much disk-space you’ve got.)
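For anyone who hasn’t met a “GFS” scheme before, the thinning logic can be sketched in a few lines.  This is only an illustration of the idea as described above (hourly for a day, daily for a month, then one per month).  It is not how TimeMachine is actually implemented, and the function name and the 30-day cutoff are my own assumptions.

```python
# Sketch of Grandfather-Father-Son style snapshot thinning, following the
# description above.  Not TimeMachine's real logic; the function name and
# the 30-day cutoff are assumptions for illustration.
from datetime import datetime, timedelta


def thin_snapshots(snapshots, now):
    """Return the snapshots a GFS-style policy would keep, oldest first."""
    keep = {}
    for snap in sorted(snapshots):
        age = now - snap
        if age <= timedelta(hours=24):
            # "Son": keep one snapshot per hour for the last day.
            bucket = ("hour", snap.year, snap.month, snap.day, snap.hour)
        elif age <= timedelta(days=30):
            # "Father": keep one snapshot per day for the last month.
            bucket = ("day", snap.year, snap.month, snap.day)
        else:
            # "Grandfather": keep one snapshot per month beyond that.
            bucket = ("month", snap.year, snap.month)
        # Snapshots are visited oldest to newest, so the newest one wins.
        keep[bucket] = snap
    return sorted(keep.values())


now = datetime(2011, 3, 1, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(24 * 90)]  # 90 days of hourly snapshots
kept = thin_snapshots(hourly, now)
print(f"{len(hourly)} snapshots taken, {len(kept)} kept")
```

The user never sees any of this.  They just pick a point in time, which is the whole appeal.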

What I liked is that nothing special was required to restore from disk.  Just the OSX boot CD.  Boot, select “restore from backup” and poof, or tah-dah, or whichever.  Windows7 has something fairly similar, but you have to build a recovery CD for it to work, probably because it has to store whatever raid-specific drivers you’re using.

All in all, a positive experience.  I may still go back to my Precision690 though…Dual Xeon 2.8Ghz processors and 8G of ram can run circles around the older G5 hardware.

I haven’t decided.

 

Day-5 (Mac Experiment)

Ok, I might be sold.

Though the outdated hardware has posed a few limitations, I’m not so worried about that.  I did just order a Dual 2.5Ghz G5 off Ebay for $300 because the one thing I *AM* driven nuts by is the pounding that this single 1.6Ghz PPC chip is completely incapable of taking.  I’m hoping that the new one has more than the 4 DIMM slots this one has…more memory couldn’t hurt. 🙂

I’ve obtained a copy of Mac:Office 2008, Photoshop, and a few other neat pieces to play with, but so far I’ve not dove completely into it.  (My laptop is still on my desk as well, just in case I should need it)  Entourage 2008 Web Service edition was added because of course, I’m running Exchange2010, and WebDAV has been removed after Exchange2k7.

The other thing I’ve noticed is the backup/restore process worked wonderfully.  I wiped the drives and built a new 1.8TB RaidSet on the new drives, which finally gave me a partition of appropriate size, and booted from the CD, Selecting “Restore from TimeMachine backup”

Impressively enough it took less than 2.5 hours to restore the OS and everything I had done on it to that point.  When it rebooted, there were no strange messages, though on opening the “Mail” app, I found that it had to go and re-import all of the mail that had already been downloaded.

Oh well, not THAT huge a deal.

I’ve done something similar with Windows7 recently, but it required a “RecoveryCD” be made before you could run the restore.

The best part of this new setup, by far, is access to BASH. I *HATE* that there doesn’t seem to be a decent Cygwin shell anywhere on the market for Windows.  I do a *LOT* of shell scripting both for work and because I find it fun, and this makes life very easy.

We interrupt this experiment to bring you this special bulletin…

The government’s “Continuing Resolution” will be expiring a week from today.  As a government contractor, this directly affects me.

They have two choices.  They can pass ANOTHER C.R. or they can actually pass a budget.

I don’t post political statements here too often.  However I don’t know about you, but from where I stand this travesty that the House has floated is a disaster.  1.2 million jobs lost by the estimates I’m hearing, and to top it all off, it doesn’t do SQUAT to balance the budget because the places that need to be cut / reformed, i.e. Defense, etc. are off the table.  So this will be for nothing.

If the posturing peacocks on Capitol Hill don’t get their collective crap together and one side (or the other) forces the government to shut down, I may have some time on my hands.

Part of me is hoping that cooler heads prevail.

Part of me is looking forward to a little time off. 😉  I’m told it is actually a CRIMINAL offense for me to work if the government shuts down.

Bring it.