How to get fired within a short keystroke


JustRalph
04-30-2011, 02:01 AM
http://newenterprise.allthingsd.com/20110429/amazon-details-last-weeks-cloud-failure-and-apologizes/


Amazon Cloud Services failed miserably last week. This was just a few weeks into what has been a very nice little "cloud crazy" marketing campaign on many fronts and by many companies. It's a little hard to understand, if you aren't a tech type, but "The Cloud" lost a ton of ground last week. Lots of naysayers coming out of the woodwork now.

One step forward, ten back............

[Embedded video: -HRrbLA7rss]

Microsoft has been pushing "the cloud" for a while now....... Amazon hurt not only themselves, but Microsoft too...........

Stevecsd
05-02-2011, 11:27 AM
Whatever happened that first day, Amazon failed as a cloud provider, at least for those 2 days. From my quick review of the news, it looks like the failure was in their back-end relational database servers and traffic management systems.

As a former Info Systems Manager, I think this should be a firing offense for whoever was in charge of that location. One of the main reasons to use a cloud is reliability and 99.5+% uptime.
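To put the 99.5% figure in perspective, here is a rough sketch of the downtime such a target allows. It assumes the percentage is measured over a 365-day year, which the post does not state; the function name is made up for illustration. Under that assumption the budget is about 43.8 hours a year, so a 2-day (48-hour) outage would exceed it on its own.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day measurement period


def allowed_downtime_hours(uptime_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted by an uptime target over the period."""
    if not 0 <= uptime_pct <= 100:
        raise ValueError("uptime_pct must be between 0 and 100")
    return (100 - uptime_pct) / 100 * period_hours


budget = allowed_downtime_hours(99.5)
outage = 2 * 24  # the roughly 2-day outage mentioned above

print(f"Yearly downtime budget at 99.5%: {budget:.1f} hours")
print(f"A {outage}-hour outage exceeds it: {outage > budget}")
```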

The company I work for now is using a cloud provider (not Amazon). I have visited their location. They have redundancy built in everywhere. They even have their own generators and water coolers for their whole site. We are in Nashville, and when they had the floods last May, the provider automatically re-routed all traffic to their Atlanta hub without a hiccup.

Here is a link to a somewhat technical article on what happened:

http://itmanagement.earthweb.com/netsys/article.php/3932316/Amazon-Describes-Cloud-Outage-Fix---Apologizes.htm

The lesson for anyone contemplating using cloud services for their business operations is that you need to plan for this to occur at some point and figure out how to build redundancy within your own system.

Stevecsd
05-02-2011, 12:05 PM
I just spoke with our VP of IT. He thinks the web sites that went offline didn't have a fully formed disaster recovery plan. That is, they didn't implement the parts from Amazon that say: if Alexandria goes offline for whatever reason, automatically reroute our traffic to another site (Chicago, New York, Atlanta, or whatever other sites Amazon has available). So when Alexandria went offline, they went offline too. Obviously you have to pay extra for that option.

The ones that already had that implemented in their disaster recovery plans were the ones that stayed online and working.
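The difference between the two kinds of customers can be sketched in a few lines. This is only an illustration of the rerouting idea, not how Amazon actually does it: the site list reuses the names guessed at above, and the function name and health table are hypothetical.

```python
from typing import Dict, List, Optional


def pick_site(preferred: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first site in preference order that is up, or None."""
    for site in preferred:
        # A site missing from the health table is treated as down.
        if healthy.get(site, False):
            return site
    return None


# Hypothetical health snapshot: the primary site is down.
health = {"Alexandria": False, "Chicago": True, "Atlanta": True}

# Customer without a failover plan: only the primary site is listed.
print(pick_site(["Alexandria"], health))

# Customer with a failover plan: backup sites are listed in order.
print(pick_site(["Alexandria", "Chicago", "Atlanta"], health))
```

The first customer gets no site at all and goes dark with the primary; the second is sent to the next site on its list.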

We have 2 parts for that in our contingency plans. One is the part where the cloud vendor does a full copy of our data every X hours (I forget if this is every 8 or every 24 hours). They then copy that data to their other operations site. The second part is the rerouting of traffic in case their Nashville center goes offline. If it does go offline, we are rerouted to their Atlanta location, using backup data that is up to X hours old.
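How stale the data is after a failover depends on where the failure falls in the backup cycle. A minimal sketch, assuming backups run on a fixed schedule and ignoring the time it takes to copy them to the second site (the function name and the dates are made up for illustration):

```python
from datetime import datetime, timedelta


def data_age_at_failover(first_backup, interval, failure_time):
    """Age of the most recent scheduled backup at the moment of failure."""
    if failure_time < first_backup:
        raise ValueError("failure happened before the first backup")
    # Number of whole backup intervals completed before the failure.
    completed = (failure_time - first_backup) // interval
    last_backup = first_backup + completed * interval
    return failure_time - last_backup


start = datetime(2011, 5, 1, 0, 0)     # hypothetical first backup
failure = datetime(2011, 5, 1, 13, 30)  # hypothetical failure time

for hours in (8, 24):
    age = data_age_at_failover(start, timedelta(hours=hours), failure)
    print(f"Backup every {hours} h: failover data is {age} old")
```

In the worst case the failure hits just before the next backup, so the work lost approaches the full interval. That is why the choice between 8 and 24 hours matters.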

In computers and software, failures are not an if but a when.

Dave Schwartz
05-02-2011, 01:29 PM
In my own little version of cyber-space, I have recently moved to an internal "cloud model" of sorts.

My work area actually consists of 3 computers:
http://www.horsestreet.com/BBSImages/Dave'sOffice/Office-200711-03.jpg

(This is not actually a current picture as the blank space has been replaced by a small TV for watching RTN.)

Those 3 monitors in front of me are actually shared by 2 computers. Computer #1 uses monitors 1 and 2, while computer #2 uses monitors 2 and 3. Computer #3 has its own 2 monitors.

Like most people I use one computer most of the time. When that computer has a failure, I am stuck until it gets fixed. Well, I got sick of that and committed to moving EVERYTHING to a server.

This turned out to be a huge process. It has been 5 months and I am still moving things. In the long run it makes me much more portable.

Like Amazon, I have another level of problem if the server fails. But I do have a contingency plan: Restore the backup to another computer, and remap to that computer.

If I work at it, I can even access much of the stuff remotely (if I choose to put it in a shared place).

Cloud concept=good :ThmbUp: