Extracting knowledge from a torrent of information
The amount of data has been exploding. Everything from health records, environmental monitoring and agriculture to online behavior, with its clicks, “likes,” tweets and purchases, generates data every second. With this proliferation of data, the ability to analyze large data sets, or big data, has become a basis of competition. It is a key driver of productivity, innovation and market demand. A high-level panel set up by the United Nations Secretary-General recently reported that for too long global development efforts have been hampered by a lack of the most basic data about the social and economic circumstances in which people live. Traditional ways of analyzing and presenting data no longer meet the needs of society. Data science is a key area of growth and investment for the College of Science and for OSU for four reasons: it is highly relevant, we have an obligation to teach it, we have key strengths in it and it offers tremendous opportunity.
First, big data is highly relevant in a 21st century world. Data science satisfies a growing need to manage, analyze and interpret massive, complex data sets to solve problems and to better inform decision makers across disciplines, from policy and industry to education and agriculture. Because data analysis techniques are complex, the meaning of their results can be misunderstood by those charged with prioritizing, designing, leading and implementing public policy.
Angus Deaton, the 2015 Nobel Laureate in Economics, spoke recently about the importance of better data that leads to better lives. As he said in an interview on PBS NewsHour, “…most of the numbers we have are not ‘given.’ They’re produced by statistical offices, many of whom are under terrible budget pressure … and if we don’t know what sort of progress we’re making and how we’re doing, we don’t really know where we are.”
Understanding patterns in large data sets is particularly important. The outcomes have tremendous impact on our world. In its 2015 report, The Internet of Things (or sensors and actuators connected by networks to computing systems), McKinsey & Company advised that “if policy makers and businesses get it right, linking the physical and digital worlds could generate up to $11.1 trillion a year in economic value by 2025.”
OSU’s Strategic Plan 3.0 outlines its commitment to leveraging technology as a strategic asset:
“Technology and information occupy a critical role in a 21st century university….. Greater accountability, enhanced expectations of a current generation and growth in the development, management and delivery of digital resources point to the expanding role that big data, analytics and information technologies provide as a strategic enabling asset.”
By aligning with national and global priorities for big data, the College of Science is able to lead big data analytics at OSU and beyond.
“Data science is the heartbeat of 21st century global economies, and innovations in sciences, engineering, business, and education are becoming increasingly computationally- and data-enabled,” explains Sastry G. Pantula, dean of the College of Science.
“Strategic investments in data analytics research and in training future data scientists will have long-term payoffs not only for our students, but also for industry and society.”
Second, we have an obligation to educate the next generation of data scientists with the computational-thinking and data analytics skills needed to solve our most pressing challenges as part of a 21st century workforce. Tomorrow’s leaders in science will need to manage large data sets by extracting useful, actionable information and by developing new statistical methods, mathematical models, visual analytics and computational algorithms.
Given the interdisciplinary, collaborative research that is a hallmark of the university, Oregon State and the College of Science are well positioned to lead a data science initiative. Data science can expand the university’s footprint and impact in one of the fastest-growing fields and create an area of innovation and distinction in mathematical and statistical science.
In a landmark report on “Big Data: The next frontier for innovation,” McKinsey & Company forecast a national shortfall of 150,000 master’s-level professionals trained in data analytics with the ability to manage big data. To address this shortage, the College is developing undergraduate courses and is creating an online master’s program in data analytics. The business school is developing a business analytics option for its MBA program.
Third, data science capitalizes on our strengths and hallmark collaboration while transcending disciplines, moving seamlessly between research and classrooms. Data science will expand the university’s footprint, positioning it as a leader in the statistical, mathematical and computational sciences. Scientists at OSU’s premier Center for Genomics Research and Biocomputing, who come from a wide range of disciplines and include statisticians, are using large datasets to conduct research in bioinformatics, biological computing and genomic biosciences.
The College is developing a distinct research and education program in data sciences that integrates OSU strengths in computer science, genomics, statistics, mathematics, and applied sciences and policy.
With data sciences programs at Stanford University, University of California-Berkeley and University of Washington, a complementary program at OSU would provide synergistic opportunities with these peers, and attract students and researchers from around the globe to Oregon.
Strategic investments in mathematics, statistics and life sciences faculty have extended the impact of the College’s data science on transdisciplinary research. In a science-without-borders approach, the College is deepening engagement between data science and other sciences, engineering, education, arts and business. Cluster hiring in bioinformatics across disciplines has brought expertise in mathematical biology; ecological, evolutionary, and functional properties of the microbiome; and deep sequencing data.
And finally, data science offers abundant opportunity. By aligning our expertise with market and national needs, federal priorities and funding opportunities, the College and OSU will advance the White House’s Big Data Research and Development Initiative, which seeks to accelerate the pace of discovery in STEM and transform teaching and learning by improving our ability to extract knowledge and insights from large, complex collections of digital data.
Last year, the federal government allocated $200 million for R&D in big data. Our peers are pursuing this growth opportunity. This past year, the University of California, Berkeley was awarded a $10 million grant for an “Expeditions in Computing” project to explore aspects of managing large data sets. Clearly, funding opportunities are available in this burgeoning area that could generate research grants for OSU.
Funding agencies are following suit. NSF has encouraged “research universities to develop interdisciplinary graduate programs to prepare the next generation of data scientists and engineers.” The National Institutes of Health created Big Data to Knowledge (BD2K), a trans-NIH initiative to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge. NIH contends that the ability to harvest the wealth of information contained in biomedical Big Data will advance our understanding of human health and disease.
In a boon to OSU’s marine science and big data initiatives, NSF recently awarded OSU a Research Traineeship award to build cohorts of leaders in marine science, data and policy. The five-year, $3 million award will prepare a new generation of natural resource scientists and managers who will combine mathematics, statistics, and computer science with environmental and social sciences to study, protect and manage ocean systems.
In other words…
“Marine, earth and atmospheric studies of tectonics, ocean acidification and clouds, through the use of massive data collection and search algorithms, are helping us to understand the pace and consequences of climate change.”
“Today, predictive analytics applied to the big data regarding education pathways taken by thousands of students over a dozen years can help us diagnose the education choices made by individual students from diverse backgrounds to determine what they need to change to be successful.” —Ed Ray, President, Oregon State University
“It has been said that we can’t know everything, but we can know quite a lot. How that knowledge comes about is evolving rapidly. We now routinely gather massive amounts of data on our environment, our bodies, and our behaviors. Until the emergence of the field of informatics, much of that knowledge remained locked away and unavailable to scientific study.”—Cynthia Sagers, Vice President of Research, OSU
“If the liberal arts are charged with tackling society’s most challenging problems, big data represents a powerful new tool for manipulating, sorting and analyzing the nearly endless amount of information that humanists and social scientists must sift through and harness in their quest to find answers.”—Larry Rodgers, Dean, College of Liberal Arts
“Precision agriculture will help meet the food, fuel and fiber needs of a growing population. Underpinning precision agriculture is the use of Big Data, which includes many types of sensors collecting soil and field data on increasingly smaller scales, thereby creating more and more data. Data analytics use that information to make smart management decisions which lead to increased production efficiency and higher quality farm gate products.”—Dan Arp, Dean, College of Agricultural Sciences
“Studying learning has always been about the individual and about populations. Today, available data has exploded in both contexts. We can study learning of whole communities as well as millions of social network ripples from a national event. We are also studying student- and teacher-level data across an entire educational pathway.”—Larry Flick, Dean, College of Education
“I believe that big data will become increasingly important to every aspect of engineering, from understanding the Cascadia Subduction Zone to the design of the autonomous systems we use to collect that data.”—Scott Ashford, Dean, College of Engineering
“Big data is transforming higher education, both in the ability to truly and deeply understand what actions impact student success, and in the discovery and creation of new knowledge and insights through research. From the vast arrays of instrumentation in our research enterprise to the interactive systems and smart devices used by faculty and students, we are amassing data at a rate far beyond what we had even a few years ago.
“Our ability to realize the potential of big data in educational and research endeavors depends upon our ability to effectively collect, analyze and leverage this data.”—Lois Brooks, Vice Provost for Information Services
“With the advent of the Internet, it became clear that society would soon be swimming in a sea of data and the role of a university would be to help learners navigate to success through that sea. Now though, it is the university itself that is inundated with data—learner data, demographic data, demand data, market data. Our success will be completely dependent on harvesting decision data, in some cases in real-time, and steering the ship accordingly.”—Dave King, Associate Provost of Outreach and Engagement