At the O’Reilly Velocity conference we attended last month, Hyperic was there to hype the launch of CloudStatus, which aspires to become for cloud providers the monitoring tool that Netcraft is for ‘classic’ hosting providers. Get the lowdown on Hyperic and CloudStatus in this two-part video from Jon Travis (Principal Engineer) and Javier Soltero (co-founder and CEO).
See part 1 of the movie here on Vimeo, but scroll down for the best part!
As we are living in the ice age of cloud computing, glitches (like the recent outage of Amazon S3) are to be expected, and it must be said that Amazon managed to fix its ecosystem relatively quickly and reported openly on the underlying problem.
On his blog, Reuven Cohen raises an interesting question about the use of federated network protocols within cloud services, and about the gossip protocol that caused the Amazon Web Services downtime on June 24.
“…We have been big fans of use of XMPP for federated communications within our Enomalism cloud platform for multi cloud communications (Wide Area Cloud). XMPP is interesting because it natively solves a number of federation problems within a tried and tested framework. One of the biggest benefits to the use of a gossip protocol lies in the robust spread of information and the exponential nature of its sharing of information within a large number of machines…”
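To make the "exponential" spread in that quote concrete, here is a minimal simulation sketch of a generic push-style gossip round. It is not Amazon's or Enomalism's implementation; the function name `gossip_rounds`, the `fanout` parameter and the uniform random peer selection are all assumptions made for illustration.

```python
import random


def gossip_rounds(num_nodes, fanout=1, seed=0):
    """Simulate push gossip and return how many nodes know the rumor
    after each round (index 0 is the starting state).

    Hypothetical model: every informed node picks `fanout` peers
    uniformly at random each round and tells them the rumor.
    """
    rng = random.Random(seed)   # seeded so a run is repeatable
    informed = {0}              # node 0 starts out knowing the rumor
    history = [len(informed)]
    while len(informed) < num_nodes:
        newly_told = set()
        for _ in informed:
            for _ in range(fanout):
                # A peer may already be informed; that message is wasted.
                newly_told.add(rng.randrange(num_nodes))
        informed |= newly_told
        history.append(len(informed))
    return history


if __name__ == "__main__":
    for round_no, count in enumerate(gossip_rounds(1024)):
        print(f"round {round_no}: {count} nodes informed")
```

In this model the informed count can at most double per round when `fanout` is 1, so the early rounds grow roughly geometrically and the tail slows down as most messages land on nodes that already know. The same mechanics explain why a bad piece of state can spread through a large fleet quickly.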
At Virtualization.com, we intend to report on cloud initiatives too, since all these Platform-as-a-Service providers (Google App Engine being the exception to this rule) are enabled by virtualization technology. We expect several more competing statistical analysis tools for the various cloud service providers to emerge in the near future. With Amazon Web Services (AWS) blazing the cloud trail, Hyperic picked it as the first service to report on via CloudStatus, but Google App Engine and (Sales)force.com seem to be the next target platforms. The trouble with being first, for Amazon, is that it is also first in line to be publicly reported on. This also means the PR and sales people at CloudStatus will be kept busy issuing press releases and contacting impacted prospects whenever Amazon experiences a glitch or failure.
Stacey Higginbotham at GigaOm voiced a common fear:
“… Amazon or another cloud provider could shut the service down, either by offering their own status service or by stopping the Hyperic agent. Given the rush to provide dashboards, application-testing products and other services on top of established computing services, I’m eager to see how startups keep their footing in the clouds.”
Being curious and knowing Amazon only speaks through CEO Jeff Bezos or CTO Werner Vogels, we walked up to the latter and were happy to learn Amazon actually loves CloudStatus. He took a step back right afterwards, but why not just watch the video to see his response to the CloudStatus launch?
On a side note: Hyperic’s newly launched CloudStatus detected the outage at 8:45am PDT, a full 20 minutes before Amazon posted at 9:05am PDT on http://status.aws.amazon.com/ that it was aware of the issue. CloudStatus saw server errors coming from the majority of its S3 and SQS monitoring agents, in addition to other problems with EC2 (lots of EC2 zombies being created) that may have been related.
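The idea of flagging an outage once most monitoring agents report server errors can be sketched as a simple majority check. This is only an illustration of the principle, not how CloudStatus actually works; the function `outage_suspected`, the agent names and the 50% threshold are all hypothetical.

```python
def outage_suspected(agent_results, threshold=0.5):
    """Return True when more than `threshold` of the agents saw an error.

    `agent_results` maps a (hypothetical) agent name to True if that
    agent's latest probe ended in a server error, False otherwise.
    """
    if not agent_results:
        return False  # no data, so nothing to conclude
    failing = sum(1 for failed in agent_results.values() if failed)
    return failing / len(agent_results) > threshold


if __name__ == "__main__":
    # Made-up probe results for illustration only.
    latest = {"s3-agent-1": True, "s3-agent-2": True, "s3-agent-3": False}
    print(outage_suspected(latest))
```

Requiring a majority rather than a single failing agent keeps one misbehaving probe from raising a false alarm, at the cost of reacting slightly later.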
Like a hurricane warning system, Hyperic CloudStatus could not have prevented the S3 outage, but it was able to provide enough of a “storm” warning for users to take action. The company will be adding more cloud services to CloudStatus in the coming months; next up is Google App Engine.