Preparing a new generation for leadership in a big data world
Words of wisdom from a university leader
By Stephanie Robertson, SAS Insights Editor
Dr. Michael Rappa is the founding director of North Carolina State University's Institute for Advanced Analytics. As head of the Institute, he leads the nation’s first Master of Science in Analytics as its principal architect. Here, Dr. Rappa shares his ideas on the role of data scientists in organizations today, offers tips on hiring them and explains how the NCSU program helps prepare students for exciting opportunities in the world of big data.
How would you define a data scientist?
Rappa: Data scientist, as it’s used today, is still an evolving job category. Though its popularity has skyrocketed with employers over the last few years, I don’t think there’s a single definition that applies – and perhaps we shouldn’t expect there to be at this early stage. I define a data scientist broadly as someone with the technical knowledge and tool skills for extracting useful insights from the variety of data generated in today’s digital economy. Perhaps the single most important characteristic of a data scientist is a deep passion for grappling with the complexity of data analysis.
And what do you think of the title?
The “data scientist” title is here to stay, and I can understand its popularity because there needed to be a new way to describe the kind of data-savvy talent employers are seeking to hire. But I wish we could have avoided a label that conjures up the lofty feel of white coats and ivory towers. In reality, data scientists are in the trenches, working laboriously to draw insights from the inherently messy array of data that is now called “big data.” It’s arduous work.
We looked at data on your website and noticed that in 2011 and 2012 data scientist jobs weren’t listed. In 2013, 13% of your graduates were hired as data scientists. Last year (2014), that jumped to 33%. Does that surprise you? Do you think it’s a trend that will continue?
Indeed, it was a surprise to see that big jump. I knew a shift was under way, but the rapid rate of change had me do a double take at the numbers. Looking deeper at the data, you could see that employers were not only looking to hire data scientists, and more of them, but they were also reclassifying existing positions as data scientist roles. I expect the trend to continue.
Are your students more interested in landing a data scientist job than other positions?
We coach our students not to get too hung up on a title. Employers will differ in how comparable positions are labeled. Far more important are the exciting nature of the work itself, the chance to work with smart people, and the opportunity to continue to learn on the job. The ability to be a contributor from day one and to know your work can have bottom-line impact on the business is key. As time goes on, I anticipate we’ll encounter young students who set their sights on becoming a data scientist as freshmen, or even earlier, and the position title will be very important to fulfilling their aspiration of landing that dream job.
How is a data scientist different from a statistician or data miner?
Data scientists must have a mastery of statistical concepts to be effective. I don’t think you need a PhD in statistics, but you need a level of understanding that goes beyond the undergraduate level. The data scientist also needs to have strong versatility in computer programming for manipulating and melding data from various sources. This can go beyond what’s common for statisticians, in general. Having said that, some statisticians claim data scientist is just a new label for what they’ve been doing all along, and I certainly understand their sentiment.
Data mining is one important methodology that data scientists use in their work. A data scientist will have several different methods in their toolbox, which can be applied appropriately to the problem at hand. The best data scientists never stop adding tools to their repertoire.
Last year, 22 of your graduates were hired as data scientists or senior associate data scientists (more than any other position). Do you know if some industries hire more data scientists than others?
I look at the diffusion of the data scientist phenomenon primarily from a geographical perspective, starting with the large metropolitan areas along the east and west coasts and heading inward. I haven’t gotten to the point of discerning stable variations across different industries. In the last few years the strongest sector for employment was financial services, and this is driven in part by new regulatory requirements.
If you were to hire a data scientist what key question(s) would you ask?
I would want to hear about their experience working on problems with real-world data. What challenges did they encounter, and how were these challenges overcome? I would also want to understand their propensity to work in teams. I would listen carefully to how a candidate talks about technical matters to see if there’s a comfortable ability to convey the complexity of data analysis in a manner that non-technical decision makers will understand.
Have you made adjustments to your program since you started and if so, what are they?
We make adjustments every year – not just in the content, but also in the structure and delivery of the curriculum. It’s built into the DNA of the Institute. We’re always talking about what we’re doing now and how we can improve it next year. I sometimes call our MSA the first “artisanal” degree because the curriculum is meticulously recreated each year and customized to address the evolving needs of industry. We have a laser-like focus on our students and how their skills resonate with employers.
There are so many incremental changes we make each year, it’s hard to enumerate them. When you look across the years, the degree today is substantially different from the early years. What’s surprising, though, is how the core principles – a completely team-based and fully integrated curriculum, working with real-world data and using industry-leading tools – have remained consistent over time.
Advice for students and graduates
When students enter your program, are there some who set out with the goal of becoming a data scientist? Are there discernible differences in the skills possessed by those who want to become data scientists compared to, say, an analyst or consultant?
Yes, clearly there are students who have set a career goal of becoming a data scientist. Perhaps the most common characteristic in these students is a strong balance between their statistical competency and overall computer programming skills. They tend to excel on both dimensions equally well. That’s rare. We typically see students stronger in one or the other area. This has more to do with the current reality in undergraduate education, where the crossover between statistics and computer science in the curriculum is not as strong as it needs to be.
How would you advise graduates to plan for success in a career as a data scientist?
I would start by advising them to narrow their time horizon to the initial three years after graduation. The first job is a stepping stone, and getting it right will give you the maximum lift from your education. If you can, don’t focus too much on geographic location or the highest salary. I encourage students to think about their first position like a medical residency. You hope to go to a leading hospital where you can benefit from being around experienced surgeons and physicians and learn the profession. After three solid years of experience, you can then consider moving on, if it makes sense, to a geographic location or industry that better suits your needs.
Never stop learning. Especially today – there are so many great opportunities to continue to learn while on the job. Keep adding to the toolbox. Always be professional. Think about how your work is connected to and adds value to the business. Build strong relationships with colleagues – learn from them and share your knowledge. Put faith in the power of teams. Be the consummate team player that everyone wants on their team.
If you were to give one bit of advice to a candidate who was interested in pursuing a future as a data scientist – what would that be?
If you want a clear pathway into the profession, I would recommend university graduate programs like ours, which are specifically designed to help both new graduates and midcareer people become data scientists. Our approach is to provide an intense, immersive learning experience in the shortest time possible – 10 months – so candidates can get into (or back to) the workforce quickly. There are also part-time and online options. Since we launched the first Master of Science in Analytics degree in 2006, more than 100 similar programs have appeared worldwide, so there are plenty to choose from. There are also a number of online options for self-learners, where one can brush up on their knowledge of statistics and improve their programming skills.
I would caution a candidate not to get caught up in the hype. Data scientists have been given the tagline of the “sexy” new career, and truthfully it’s nothing of the sort. Valuable insights from data don’t come easily or quickly. It’s plain hard work, and it takes a serious person with intelligence, skill and determination to see a problem through to completion. If that person is you, it’s an amazing career opportunity and you’ll find yourself with no shortage of work after graduation.
Michael Rappa is the founding director of the Institute for Advanced Analytics and a member of the faculty in the Department of Computer Science at North Carolina State University. As head of the Institute, he leads the nation’s first Master of Science in Analytics as its principal architect. Before joining NC State, for nine years he was a professor at the MIT Sloan School of Management. Dr. Rappa has more than 25 years of experience working across academic disciplines. An accomplished researcher and instructor, his passion is to bring an entrepreneurial and forward-thinking mindset to innovation in higher learning.