Even though the demand for data scientists continues to outpace supply, companies are still picky when it comes to investing in new data scientist hires. If you’re applying for this role, you’ll need to ace questions on a wide variety of topics and demonstrate your aptitude for painstakingly detailed data work.
But it’s not just about showing off your attention to detail. Because data scientists must crunch data for insights that a company can use to advance its overall strategy, you must also show that you grasp the “big picture.” For example, when describing your previous projects, take some time to walk the interviewer through the deployment process and the ultimate goals.
“It’s common to end up with a few models to choose from, but deploying the simpler ones shows that you understand trade-offs,” noted Madeleine Shang, A.I. research engineer and recommenders team lead at OpenMined.
To help you prepare, we’ll walk you through some common questions you can expect during a data scientist interview, along with tips for answering them.
Tell me how you went about selecting the appropriate algorithm (or algorithms) for a recent project that you found technically challenging.
The best way to grab the interviewer’s interest is by describing a previous project that echoes the potential employer’s goals and workflow, down to what the deployment process looked like (or should have looked like). Once you’ve provided a high-level description of that project, data-science experts like Shang may dig deeper with related questions such as:
• Why did you select this particular algorithm?
• What are the underlying assumptions? Are these assumptions respected or violated in the data? How did you verify that?
• What parameters and hyperparameters did you select and optimize? What did each hyperparameter do for the model? Did you have a separate data set for tuning (one that was excluded from both the training and testing sets)?
• How does the algorithm scale with more data? With imbalanced data?
• What is the run-time complexity and memory complexity of this algorithm?
• Were you happy with the results? What would you do differently?
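If the question about a separate tuning set comes up, it helps to be able to sketch the idea quickly. The following is a minimal illustration, not a recommended production setup: it assumes a synthetic one-feature data set and a toy k-nearest-neighbors classifier, and every function name here (`three_way_split`, `knn_predict`, `tune_k`) is hypothetical. The point is only that the hyperparameter `k` is chosen on the validation rows, and the test rows are scored once at the end.

```python
import random


def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of rows and split it into train, validation, and test lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def knn_predict(train, x, k):
    """Predict a 0/1 label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0


def accuracy(train, rows, k):
    """Share of rows whose label the k-NN model (fit on train) predicts correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in rows)
    return correct / len(rows)


def tune_k(train, val, candidates=(1, 3, 5, 7)):
    """Pick the k with the best validation accuracy; the test set is never used here."""
    return max(candidates, key=lambda k: accuracy(train, val, k))


# Synthetic toy data (an assumption for illustration): label is 1 when x > 5,
# with roughly 10% of labels flipped to simulate noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    label = 1 if x > 5 else 0
    if rng.random() < 0.1:
        label = 1 - label
    data.append((x, label))

train, val, test = three_way_split(data)
best_k = tune_k(train, val)
print("best k:", best_k, "test accuracy:", round(accuracy(train, test, best_k), 3))
```

In an interview, the code matters less than the explanation behind it: tuning on the test set would make the final score look better than the model really is.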
Tips: To prepare for questions about your process and results, list all the models that you tried and the associated analysis you did along the way.
“It’s best to start simple,” Shang noted. Describe how you began with very simple models (like gradient-boosted decision tree, or GBDT, models) and tried to overfit them on a small balanced sample. Opting for a complex model right away may strike a hiring manager as a tendency to overcomplicate things, which is a red flag.
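Here is one way the “small balanced sample” step might look in practice. This is a rough sketch under stated assumptions: the data is synthetic and deliberately separable, the model is a single cut-off on one feature rather than a GBDT, and the helper names (`balanced_sample`, `fit_threshold`) are made up for illustration. Deliberately overfitting a tiny sample like this is commonly used as a sanity check, because a model that cannot fit even that often points to a pipeline bug rather than a modeling problem.

```python
import random
from collections import defaultdict


def balanced_sample(rows, per_class, seed=0):
    """Draw the same number of rows from each class (the label is the last field)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[-1]].append(row)
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < per_class:
            raise ValueError(f"class {label!r} has only {len(group)} rows")
        sample.extend(rng.sample(group, per_class))
    rng.shuffle(sample)
    return sample


def fit_threshold(rows):
    """Deliberately simple model: the single cut-off with the best training accuracy."""
    best_t, best_acc = None, -1.0
    for t, _ in rows:
        acc = sum((x >= t) == bool(y) for x, y in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Synthetic, imbalanced toy data (an assumption): 950 negatives, 50 positives,
# with the two classes cleanly separated so the sample can be fit perfectly.
rng = random.Random(7)
rows = [(rng.uniform(0, 4), 0) for _ in range(950)]
rows += [(rng.uniform(6, 10), 1) for _ in range(50)]

sample = balanced_sample(rows, per_class=20)
threshold, train_acc = fit_threshold(sample)
print("sample size:", len(sample), "training accuracy:", train_acc)
```

Being able to say “I first checked that the simplest model could fit 40 balanced rows” is the kind of concrete detail that supports a start-simple story.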
Want to really impress a hiring manager? Dig into the impact a data science project had on the organization, advised Briana Brownell, founder and CEO of Pure Strategy.
“A lot of stories don’t go anywhere,” Brownell pointed out. What separates a great data scientist from one who struggles to gain credibility is their ability to integrate project results into the company’s broader strategy.
What are your hobbies and interests outside of work and technology?
In addition to finding out more about you as a person, hiring managers want to see if you can apply job-related fundamentals, problem-solving and communication skills to outside situations. A discussion of hobbies and outside interests also hints at what you might be like to work with.
For example, if a candidate says that they are into archery, Shang will ask them to teach her the key things that would make her a good archer—what processes would they recommend? This is also a good opportunity for the candidate to show how they can explain even complex topics in a simple and straightforward way, which is essential when it comes to communicating the results of data analysis.
Tips: When Shang asks this question, she’s looking for empathy from the candidate. She’s also evaluating the candidate’s ability to break down a topic and communicate at the right level of complexity and detail.
Here is a piece of ML-related code. Please review it for potential bugs, inefficiencies and improvements.
Hiring managers want more than a data scientist who knows good code design and practices and has a basic familiarity with syntax. They also want someone with soft skills, the ability to work with a team, and an inclination to ask questions and prioritize smartly.
Tips: Because code reviews can sometimes create tension within a team, a simulated review reveals how you would work with teammates.
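To practice, it can help to review a small snippet yourself. The example below is hypothetical and not taken from any real interview: it shows one kind of issue a reviewer might flag, where a feature is standardized using statistics computed from the training and test data together, followed by a possible fix. The function names are invented for illustration.

```python
from statistics import mean, pstdev


def standardize_leaky(train, test):
    """Version under review: scales with statistics from train AND test combined."""
    combined = train + test
    # Review finding: the test values influence mu and sigma, so information
    # about the held-out data leaks into the preprocessing step.
    mu, sigma = mean(combined), pstdev(combined)
    return [(x - mu) / sigma for x in train], [(x - mu) / sigma for x in test]


def standardize(train, test):
    """Suggested fix: fit mean and standard deviation on train only, apply to both."""
    mu, sigma = mean(train), pstdev(train)
    if sigma == 0:
        # Second review finding: a constant feature would cause division by zero.
        raise ValueError("training feature is constant; cannot standardize")
    return [(x - mu) / sigma for x in train], [(x - mu) / sigma for x in test]


train_values = [1.0, 2.0, 3.0]
test_values = [10.0]
print("leaky train mean:", mean(standardize_leaky(train_values, test_values)[0]))
print("fixed train mean:", mean(standardize(train_values, test_values)[0]))
```

When you talk through a review like this, explain why each finding matters and which one you would fix first; that is where the prioritization and communication skills show.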
Tell me about the ethical issues and challenges you faced in the design and implementation of a recent project. How did you resolve them and garner trust?
The way data scientists build models and harvest data can result in bias, discrimination, a lack of transparency and fairness, and privacy issues. However, a recent survey found that many data scientists don’t focus on the ethical conundrums they might encounter.
Tips: To set yourself apart during an interview, be ready to discuss the ethical dilemmas that data scientists might encounter, Brownell advised. Show that you know how to integrate ethics oversight and analysis within a data science project and mitigate potential problems with biased datasets, outliers, and so on.
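One concrete check you could mention when discussing biased datasets is comparing how often a model predicts the positive outcome for each group. The sketch below is illustrative only: the records are made up, the helper names (`selection_rates`, `parity_gap`) are hypothetical, and a gap in selection rates is just one of several possible fairness signals, not a complete ethics review.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of positive predictions per group.

    records is an iterable of (group, prediction) pairs with 0/1 predictions.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, prediction in records:
        totals[group] += 1
        positives[group] += int(prediction)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(records):
    """Difference between the highest and lowest group selection rates."""
    rates = selection_rates(records)
    return max(rates.values()) - min(rates.values())


# Made-up predictions for two groups (an assumption for illustration).
records = [
    ("a", 1), ("a", 1), ("a", 0), ("a", 0),
    ("b", 1), ("b", 0), ("b", 0), ("b", 0),
]
print("selection rates:", selection_rates(records))
print("parity gap:", parity_gap(records))
```

A large gap does not prove discrimination on its own, but it gives you a specific number to investigate and a concrete story to tell about how you caught and addressed a potential problem.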