Building Effective Data Science Teams

After sorting through the buzzwords and defining the technical skills that you actually want in a strong candidate, you extend some offers and have them accepted. Now you’ve got Real Life Data Scientists on your hands. What now? How do you make the most of them, and how do you make them stay? Especially when you’re dealing with Millennials, getting them to hang around is half the battle.

Data Scientists (and You!) Need Structure

If you’re lucky, the hacker mindset is strong in your organization. Curiosity, resourcefulness, and determination come together to produce a sense that no problem is unsolvable; you just haven’t found the right solution yet. This mindset requires a lot of individual drive and initiative, but it’s a misconception that you have to go it alone. Effective teams of Data Science (DS) talent are a force multiplier because they let these highly resourceful and eager people come together to exchange ideas and skills.

Structure also helps prevent your DS practice from becoming a monoculture. A framework within which every member of the group can participate (rather than just the most eager member) keeps you from depending on individuals who may come and go.

Let’s Talk Frameworks

As with so many things in the DS world, there are several frameworks you can use when building your data science teams, and each has trade-offs to take into account when picking one. Which trade-offs are acceptable depends on your organization.

Embedded Teams

Embedded DS teams establish data scientists within business units. They help the business unit identify, understand, and ultimately solve problems that come up. Data scientists on embedded teams are, or have an interest in becoming, experts in the domain of that business unit. They may meet with data scientists in other business units, but the primary reporting relationship is to their own business unit.

To the degree that data science is a single thing that one does (which is not exactly the case), embedded data scientists do their thing alongside the operational staff of the business unit.

Pros:

  • Deep understanding of the problem is a crucial component of successful data projects; this model makes that easier
  • Your DS staff, who are probably approachable and likable people, are more easily accessible to the business and build deeper relationships across the unit
  • In the sense that “Data Scientist” is a stage of analytical prowess, this model helps your analysts advance to that level

Cons:

  • Individual development for DS staff is slower, because they’re not always exposed to people who know things they don’t
  • Results in a greater headcount than other frameworks if your business needs a wide variety of skills
  • If your DS staff manage their own infrastructure, this model may lead to poor interoperability with other groups

Discrete/Specialized Teams

Discrete, or specialized, teams establish “Data Science as a Service” (using this term may lead to audible eyerolls – you’ve been warned) within the organization. The data scientists all work together, sit together, and report to the same manager. Business problems are investigated and solved as a unit.

Pros:

  • Your DS team grows as a group – they all share their skills and develop with each other
  • Technological sophistication tends to be higher, which helps contribute to the state of the art across the industry
  • Responsibility for documentation of complicated solutions can be shared, which leads to more reproducible research

Cons:

  • Trade-off between domain knowledge and technical solutions (this is a hotly debated topic)
  • If management positions DS expertise as a “Unicorn”, this model is difficult to integrate into the organization

Hybrid/Matrix Teams

Hybrid or Matrix-type teams seek to combine the best of both of these approaches by placing a team across multiple reporting lines. This may be the framework of choice if the organization itself already uses matrix management structures. Typically, these implementations place the DS team as a horizontal group supporting multiple verticals.

Pros:

  • Information may be less siloed – multiple lines of reporting allow information to flow in both directions
  • Domain knowledge accumulates across verticals, so staff may develop more quickly

Cons:

Metering the Trade-Offs

There is no easy solution here. As with everything in life, there is instead a series of trade-offs. That said, some organizations are better suited to certain models.

For example, organizations with many varied lines of business may opt for an embedded team model, because it is impractical to expect DS staff to learn the ins and outs of every business. On the other hand, organizations that focus on doing one or a few things and doing them well may choose to invest in a discrete or specialized team for data science, either because the domain is easy to learn or because they prefer more sophisticated solutions. Finally, organizations that already have a matrix organization, or a middling variety of business lines, may opt for the hybrid/matrix framework.

Put succinctly, the choice is this: Simplicity, Sophistication, or Expertise. Choose two.

If you haven’t already, read some of the other pieces that I have written on the topic of managing data scientists, or get in touch if you’re interested in exchanging some ideas.
