My last blog, Business Analytics as a High-Value Career Opportunity, examined the growth and career opportunities inherent in using analytics to improve business functions and processes. However, analytics applications—and the increasingly powerful tools that enable them—are also creating incredible new opportunities for graduates in a broad range of technical disciplines.
Two leading technology vendors, IBM and Google, in cooperation with a government agency, have taken an important step in helping students prepare for careers in these promising new fields. Not coincidentally, this work will also help these vendors enhance the IT architectures, cultivate the application development skills and build the base of developers that will be instrumental in creating the new-generation computing infrastructures and applications on which they hope to build their own futures.
Building the Foundation for Large-Scale Internet Computing
There is nothing new about large-scale technical computing. Scientists and engineers have long used the world’s most powerful supercomputers to perform complex calculations on huge data sets—the type of computations required to model and visualize complex interactions and simulate outcomes.
What is new is that a growing portion of this work is migrating from huge, expensive and traditionally proprietary supercomputers and software to distributed, cloud-based architectures. These architectures consist of clusters of hundreds or thousands of standard PCs, connected through open standard interfaces and running applications developed with open source tools.
In October 2007, IBM and Google partnered to create the IBM/Google Cloud Computing University Initiative, which provided several universities with access to a large cluster running the Hadoop open source distributed computing platform. The companies provided the required hardware, software and services and recruited six leading computer science research organizations (University of Washington, Carnegie Mellon, MIT, Stanford, University of California-Berkeley and University of Maryland) to participate in a pilot program. Then, in February 2008, IBM and Google partnered with the National Science Foundation (NSF) to provide grants to academic researchers to explore large-data architectural issues and create applications that could take advantage of this infrastructure.
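For readers unfamiliar with the map-and-reduce style of processing that Hadoop supports, the following is a minimal single-machine sketch of the idea: split the data into chunks, process each chunk independently, group the intermediate results by key, then combine them. It is not Hadoop code and does not use Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and a thread pool merely stands in for the nodes of a cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Each chunk stands in for a block of data held on a separate node;
    # the thread pool is only a stand-in for real distributed workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ["cloud clusters scale out", "clusters of standard PCs", "cloud computing"]
    print(word_count(sample))
```

Because each chunk is mapped independently, the same logic can in principle be spread across many inexpensive machines, which is the property that makes clusters of standard PCs a plausible alternative to a single supercomputer for this class of problem.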
Technical Analytics Enablement
As of October 2009, the NSF had awarded $5 million in grants to 14 universities for various research projects. Most of these projects have a dual goal of:
- Improving computer science students’ knowledge of highly parallel computing practices; and
- Spurring research into specific aspects of large-scale, data-intensive cloud cluster architectures and application development.
The first award, to the University of Washington, has the broadest, most foundational goal. It is intended to help jumpstart the widespread teaching of large-scale cluster computing to computer science and software engineering teachers and students across many undergraduate universities. The university is creating a 2.5-day workshop that provides course material and curricular support that professors at undergraduate universities around the world can use to develop their own courses.
Most awards, however, are intended to fund advanced research into specific, particularly knotty problems that must be addressed for cloud to become a ubiquitous platform. A number of the initial grants focus on search, the primary horizontal application of cloud technology and the foundation of Google’s market position. For example, Carnegie Mellon, University of California-Santa Barbara and University of Massachusetts-Amherst were each awarded NSF grants for developing more efficient methods of searching and managing queries across the Web. University of California-Irvine received one for research intended to improve the efficiency and accuracy of fuzzy search queries on large text repositories.
A number of awards were focused on issues that underlie a broad range of high-performance, technical computing problems. Examples include grants to:
- MIT, Yale and University of Wisconsin-Madison for studies of tradeoffs associated with using different approaches for analyzing and extracting information from very large collections of data across large-scale clusters of parallel computers; and
- University of California-San Diego for improving the performance of dynamic provisioning of data-intensive applications.
But while most grants focused on broad, infrastructure-related issues, a few delved directly into specific scientific analytic applications. For example:
- One of the University of Washington’s three projects focuses on astrophysics, particularly the analysis of astronomical images, space-time overlaps and the simulation of collisions of galaxies;
- University of Washington and University of Utah each won grants for projects that will allow ad hoc, longitudinal query and visualization of massive ocean simulation results at interactive speeds;
- University of Maryland-College Park is conducting a project to develop parallel algorithms for analyzing DNA sequencing; and
- University of California-San Diego’s aforementioned dynamic provisioning research will include a focus on protein matching in bioinformatics.
Helping Students—Helping Themselves
The IBM/Google Cloud Computing University Initiative, as mentioned, has an immediate objective of stimulating research into areas that will be instrumental in establishing cloud as a ubiquitous computing platform. A few of its projects are intended to promote research in specific technical disciplines, some of which may have direct commercial application and others not.
But regardless of the immediate commercial opportunities, many of these projects will serve as platforms for subsequent research by hundreds of other universities and corporate research labs. Research findings, for example, will be published in scientific journals and disseminated through conferences and by graduates who move to other universities and into the private sector. Some of the projects will result directly in usable products, such as code that will be available under open source licenses.
All of these projects, however, address another of the supporting vendors’ longer-term goals: to create a generation of students who understand the value and application of cloud-based computing and will help drive its adoption.
Some of these students—particularly those in disciplines such as computer science and software engineering—are training to become the systems and application architects of tomorrow. IBM and Google will work with professors to identify the most promising of these students, offer them scholarships and internships, and attempt to recruit them into their organizations. (See my November 11 blog and report on IBM’s Academic Initiative (IBM’s Role in Creating Tomorrow’s Workforce) to understand how such efforts fit into that vendor’s broad employee development strategy.)
But only a small percentage of the students who benefit from the Cloud Computing University Initiative will end up working for IBM or Google. Many are likely to end up working for IT organizations or for other vendors, including competitors of IBM and Google. Meanwhile, students who learn to apply parallel computing tools to other disciplines, such as astrophysics, biochemistry or environmental studies, are likely to apply these techniques in their own private and public sector careers.
All of these graduates, however, can provide at least indirect benefits to the founding vendors. Those who work in customer IT departments will help drive demand for cloud-based solutions. Even those who join competitors have the potential of helping to expand the overall cloud market.
In the end, however, the founding vendors, and all private and public sector participants in all types of technical research, are likely to gain the greatest value from students in non-IT-related technical disciplines who learn to apply high-performance, cloud-based computing clusters to drive innovation in their own fields. Their work, combined with the expansion of the IBM/Google Cloud Computing University Initiative into other academic departments (everything from finance and marketing, through metallurgy and nanotechnology, to architecture and urban planning), will spur new applications and new innovation in all types of fields.
Although not all of this work will directly benefit IBM or Google, it will certainly help to jumpstart the cloud computing market. This will not only provide indirect benefits to the two vendors; it will also create new career opportunities for thousands of IBM/Google Cloud Computing University Initiative graduates, plus millions of others who end up in new jobs created by these graduates’ innovations.