Big Data Research: Will Industry Solve all the Problems?

In her keynote talk at the upcoming Very Large Data Bases Conference, Professor Magdalena Balazinska of the University of Washington poses this controversial question. Is there a place for academic research in Big Data management and analytics, or will industry solve all the problems? The question is particularly pointed given all the new systems and innovation coming out of industry, or from academic projects that quickly became big players in industry.


The ISTC for Big Data blog caught up with Professor Balazinska for a preview of her talk. Here are her answers to our questions.

Why ask this question now?

This year is the 41st annual Very Large Data Bases conference. For the past 40 years, the researchers who attend this conference have consistently presented some of the leading-edge developments in managing big data, long before it was popularly known as big data.

In the last 10 years, things have really sped up. Big data has become a very big problem for industry. As a result, the landscape of data management systems has grown at a rapid pace, with the Hadoop stack being a prime example. And there are other examples. So, it’s a natural question to ask and a natural time to ask it.

So, what’s the answer?

The short answer is “Yes,” there is definitely a place for academic research in big data management and analytics. It’s important to remember that many of the more transformative systems in use today in industry originated in academia. Spark and GraphLab are great examples.

The more accurate question to ask is how best we can contribute to the vibrant area of big data management innovation in the future.

In my keynote, I identify five unique ways that academic research can contribute to innovation in big data management systems and analytics. I illustrate by looking back at innovations coming out of a number of our projects in the database group at the University of Washington. And there are of course others coming out of MIT CSAIL, Brown University, Stanford, CMU and other academic research hot spots in data management.

Can you tell us a few of these ways that academic research can make a unique contribution?

Well, I will tell you one.

Science is becoming increasingly data-driven, and scientists, working in everything from small research labs to large scientific communities, have access to more data than ever before. In our big data research, we’re tapping into the real data and workloads from our science collaborators on campus, in astronomy, oceanography and other disciplines. These applications help us understand some of the challenges related to big data management, across the spectrum: volume, variety, velocity and so on.

One approach to big data research is to apply existing tools to these problems and evaluate how well they work. More often than not, this exercise reveals limitations of existing tools and opens the door to innovation by modifying and extending these tools. Interestingly, we find that the problems we identify through these collaborations with science users are analogous to problems faced in industry and, as a result, our innovations become interesting beyond the academic campus.

One way to contribute to big data research in academia is therefore to work in situ with some of the most advanced, demanding users of complex data, namely scientists, and use the experience to push the boundaries of what’s possible in data management and analytics, for science and for other users as well.

Learn more from Professor Balazinska’s VLDB 2015 keynote on Wednesday, September 2, at 8:30 AM or by downloading her paper, “Big Data Research: Will Industry Solve All the Problems?,” available online later this summer.


University of Washington Database Group

University of Washington eScience Institute

Research projects by Magda Balazinska

ISTC for Big Data blog posts by Magda Balazinska

