By Sramana Mitra and guest author Shaloo Shalini
SM: What are the vendors that are involved in this process at HMS? What roles and responsibilities are the vendors playing, and how do you evaluate them and bring them in? What are some of your criteria for vendor evaluation?
MA: One of the vendors involved is Platform Computing; they are the vendor for our cloud and cluster management software. Most of the hardware on the computational side is IBM BladeCenter servers. We have Cisco as the network fabric provider, and we use high-performance network attached storage (NAS) from a company called Isilon. So that is pretty much the mix of hardware vendors that we have.
Our process for evaluating vendors is primarily about piloting. We take an agile development approach with our cloud implementation, similar to the one people are taking in software development. We start with the design requirements; we meet with users and then look at what our software options are. Next, we try it out and see how it works, and we build on our successes. We do change course where things are not meeting our needs. There are specific things we knew needed to happen. We knew we needed a hardware vendor that had reliable hardware and would give us the support services we needed. We needed cloud software that could support scale from hundreds to thousands of nodes and give us the flexibility to provide these services in a way that lets us guarantee certain parts of the cloud to certain researchers, and that sort of thing. Overall, it is an iterative process of finding the requirements, piloting them, and seeing how it helps the research work at HMS.
[Note to readers: You can read more about how HMS is using Platform Computing’s scheduling solution for helping research jobs here.]
SM: How do you view yourselves against the rest of the industry’s adoption of cloud computing? What are the configuration and the type of environment that you describe? Is that something that you see in other high-end research labs?
MA: Yes, a lot of folks are moving toward this sort of cloud computing–based approach. They are also figuring out how to leverage the public cloud; maybe they will get to that once other questions about public cloud adoption are answered. One thing we at HMS are excited about is potentially being able to provide our researchers with the capability to purchase ad hoc, on-demand capacity from, say, Amazon EC2 or one of the other cloud vendors and have it still look and feel like a part of the cloud they are used to working in at the med school.
[Note to readers: You can read more about Harvard Medical School’s use of Amazon EC2 in this case study here.]
SM: So, you aren’t yet integrating your private cloud with the public cloud or looking at supplementing the private cloud with public cloud infrastructure?
MA: Right. We are not yet doing that, but we are working on it. That is one of the things we will be going through and doing pilots on, to find use cases where people really need it, and then we can get it working.
SM: What about public cloud-like applications running on your private cloud? For instance, there are a variety of software as a service (SaaS) applications all over the place. Are there any SaaS companies that address your types of applications and that are interesting to you to tap into from a public cloud point of view?
MA: On the high-end research side, the answer is no. There isn't really anything out there yet available for us to use in terms of SaaS. There are some challenges and obstacles that make [medical research] a harder place for SaaS applications. The challenge with SaaS here is mostly about data volume. Say you have an instrument in the lab that just pulled off a terabyte of data, and now you want to run some analysis on it. If you are using SaaS on the public cloud, then the service needs access to that terabyte of data. Unless you are on a gigabit network, it is going to take up to twenty hours to move that file to the cloud before any cloud-based computation can happen on that large data set.
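[Note to readers: For a rough sense of the bottleneck MA describes, here is a back-of-envelope sketch. The helper function and the throughput figures are illustrative assumptions, not HMS's actual numbers; real sustained throughput is often a fraction of line rate.]

```python
# Rough estimate of how long moving a large dataset to a public cloud
# takes before any cloud-based computation can start.
# Illustrative only; actual throughput varies with the network path.

def transfer_hours(data_terabytes: float, effective_mbps: float) -> float:
    """Hours to move `data_terabytes` (decimal TB) at a sustained
    effective throughput of `effective_mbps` (megabits per second)."""
    bits = data_terabytes * 1e12 * 8          # decimal TB -> bits
    seconds = bits / (effective_mbps * 1e6)   # Mbps -> bits per second
    return seconds / 3600

# At full gigabit line rate, 1 TB moves in a bit over two hours:
print(round(transfer_hours(1, 1000), 1))  # ~2.2 hours
# At a more typical sustained ~100 Mbps, it approaches a full day,
# in line with the "up to twenty hours" figure mentioned above:
print(round(transfer_hours(1, 100), 1))   # ~22.2 hours
```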
That being said, this is a very interesting time in the biomedical high-performance computing space, because on the one hand this part of the HPC world is growing faster than any other sector. Biomedicine is moving from what used to be primarily a wet lab occupation to one that is significantly digital and computationally intensive. On the other hand, you have a set of people who are not really trained in computer science or in best practices of computation. So there is an opportunity for entrepreneurs to come up with SaaS models for these research pipelines, whether they run in a private cloud or a public cloud. Right now, our cloud has to know how to get the researchers' code running in our environment, and they have to have some pretty reasonable technical skills.
SM: Yes, I second what you are saying here. How can someone be a very high-level researcher in the biomedical sciences and also know enough computer science to design simulation systems and analytics systems on the fly? Frankly, that is too much to deal with. It is not clear to me that this is the best use of time for a biomedical research scientist, right?
MA: No, it is not the best use of their time. It also leads to a lot of reinventing of the wheel, because folks who have postdoctoral training in the life sciences end up writing computer code and handling things like version control and other such activities, which are standard practice in the computer science domain. As such, there is a definite opportunity out there to take the hard technical side out of some of the computational work that biomedical researchers have to deal with today and implement some kind of SaaS, especially for things that sit much lower in the stack than applications.
[Note to readers: “We spend almost $5 for every $100 in national health expenditures on biomedical research.” said URMC neurologist Ray Dorsey, M.D. More on the topic and funding data is available here.]