Hiring? We have the data scientist interview questions you need
20 questions for recruiting well-rounded data scientists
By Alison Bolen, SAS Insights Editor
There’s no shortage of data scientist interview questions available online. You can find lists and lists of questions to ask data scientist recruits in an interview, but most of the questions focus on the technical and quantitative aspects of the job without considering the softer skills.
Our list takes a different approach. If you’re hiring a data scientist, we recommend focusing on more than technical skills, and your interview questions should reflect that. Data scientists have a unique blend of skills; not only do they solve problems by squeezing information out of data, they also communicate results and persuade others to apply that information to their decisions.
Remember that data scientists aren’t just ninja quantitative analysts, says Wayne Thompson, Chief Data Scientist at SAS. “You’re looking for more than technical skills. Logical reasoning agility, along with storyboarding skills, are critical because these scientists must be able to solve problems iteratively in collaboration with business analysts and decision makers. Their domain experience and communications skills are just as important as their technical skills.”
Taking a hint from Thompson, we’ve broken our list into three categories, with an equal emphasis on each:
- Technical questions.
- Practical experience questions.
- Communication questions.
Consider our questions to be a starting point for your data scientist interviews. They are designed to evaluate the three sets of skills above, but you’ll want to add your own questions to make sure the candidate is a good fit with your organization’s culture and requirements.
Technical data scientist interview questions
Statistics and machine learning are important technical skills for data scientists. These questions help measure knowledge, plus the ability to explain complex topics. Some of the questions are also designed to bring out the art and science of data science.
- What is the curse of dimensionality and how should one deal with it when building machine-learning models?
- Why is a comma a bad record separator/delimiter?
- Explain the difference between a compiled computer language and an interpreted computer language.
- How do you determine “k” for k-means clustering? Or, how do you determine the number of clusters in a data set?
- What’s more important: predictive power or interpretability of a model?
- Explain finite precision. Why is finite precision a problem in machine learning?
- Explain the “bias-variance trade-off” and why it is fundamental to machine learning.
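For interviewers who want a concrete reference point, here is a minimal Python sketch of what a strong answer to the finite precision and comma delimiter questions might touch on. The sample record and variable names are illustrative assumptions, not part of the question list, and a candidate could reasonably make the same points in other ways.

```python
import csv
import io

# Finite precision: 0.1 and 0.2 have no exact binary representation,
# so their sum is not exactly equal to 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)               # False
close_enough = abs(total - 0.3) < 1e-9       # True: compare with a tolerance

# Comma as delimiter: a naive split breaks when a field contains a comma.
# The record below is a made-up example.
line = '42,"Smith, Jane",Durham'
naive_fields = line.split(",")                         # splits inside the name
parsed_fields = next(csv.reader(io.StringIO(line)))    # honors the quoting

print(exactly_equal, close_enough)
print(len(naive_fields), naive_fields)
print(len(parsed_fields), parsed_fields)
```

A candidate who can connect the first half to accumulated rounding error in iterative training, and the second half to free-text fields in real data, is showing the kind of explanation these questions are meant to draw out.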
Practical experience data scientist interview questions
Technical skills are important, but they must be applied to solve problems. Your data science candidates should be able to describe projects they have worked on, and how they turned out. They also should be able to articulate what aspects of their technical training have been important in their day-to-day data scientist tasks, and how they can apply their skills to your business.
“Remember that the candidate’s practical experience does not need to be within your industry,” says Udo Sglavo, Senior Director of Advanced Analytics R&D at SAS. “I’ve heard from hiring managers who say their best hires have been people outside their industries who can look at problems from a fresh perspective.”
Evaluate practical skills by asking some of these questions:
- Describe a recent use of logistic regression.
- Describe an analysis you have recently completed, including strategies and findings. How were the findings used by the business? (This can be from a student research project or thesis if the candidate is a recent graduate.)
- Give examples of data cleaning techniques you have used in the past.
- What subjects would you include in a one-day data science crash course? And why?
- Describe a situation where you had to decide between two different types of analyses – and why you chose the one you did.
- Explain the benefits of test-driven software development or of unit testing.
Communication-focused data scientist interview questions
Last but not least is communication. Even the smartest statistician in the room will fail if they cannot explain the relevance of their results. Data scientists need to understand their data and explain its significance to the problem at hand.
With these questions, you are seeking to evaluate the candidate’s ability to communicate clearly and persuasively.
- Explain to the leaders of this company what model lift is and why they should care.
- How do you identify and overcome obstacles (during projects, with customers, with decision makers, etc.)?
- Tell me about a project you worked on that succeeded in part because of the way results were communicated. What were the factors that made it a success?
- Tell me a compelling story about data that you have analyzed.
- What is your favorite data visualization book or blog? And why?
- How would you design a chart or graph for a color-blind audience?
- Explain to a business analyst the trade-off between the predictive power and the interpretability of a model – and why this matters.
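To judge an answer to the model lift question, it can help to have the arithmetic in mind. The sketch below assumes one common definition: the response rate among the highest-scored fraction of records divided by the overall response rate. The function name, the scores and the outcomes are all hypothetical, chosen only for illustration.

```python
def lift_at_top(scores, outcomes, fraction=0.1):
    """Response rate in the top-scored fraction divided by the overall rate.

    Assumes outcomes are 0/1 and that at least one outcome is 1.
    """
    # Rank records from highest to lowest model score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    return top_rate / overall_rate


# Made-up scores and outcomes for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = lift_at_top(scores, outcomes, fraction=0.2)
print(round(lift, 2))
```

In this made-up data, both of the two highest-scored customers responded while only three in ten responded overall, so targeting by score finds responders at more than three times the rate of picking customers at random. That plain-language framing is what company leaders are likely to care about.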
Ultimately, you are looking for someone who is tech savvy, quant savvy and business savvy. The right candidate should be persuasive and credible, but also creative and passionate.
Data scientists are in short supply, but hiring a good one can help your organization anticipate customer needs, optimize prices, prevent fraud – and more. We hope these data scientist interview questions can help you find someone with a range of technology skills and a knack for communicating complex subjects to a variety of audiences.