So you want to start a data team.
Maybe you need reporting for sales and marketing. Perhaps you want to see if there are insights in the application data you’ve been collecting. Maybe you want to learn more about data science and whether it will help keep investors happy.
Whatever your reasoning, there is some planning to do and some tough questions to ask yourself before you engage a recruiter.
The first hire you should make is not a data scientist or machine learning engineer. It’s time to take a long hard look at your business. The purpose of this step is to understand where the gaps are, how decisions are made and what it is you really need from a data team.
While the idea of having a PhD-level data scientist on board has been romanticised, there must be a reason for them to start digging through data for those elusive insights. And when they have worked through a dataset, what happens next?
Appoint a project manager or team lead to sit down with the leaders of your sales, marketing, finance and product functions to identify where the gaps in knowledge or reporting are, and how the business fits together.
The next step may be a hire, or in a pinch may be something that can be done with your current team: collecting the data from the sources identified in step one. You need something for your data team to analyse, and if there is nothing to look at, they will be stuck.
The data I’m referring to here comes from systems which already exist: log files, production databases if you are building software, CRM systems, billing platforms, financial systems and marketing automation tools. All of these systems tell a story about your organisation and customers.
“If you are not aware of where and how the data is stored, no analysis can take place.”
A data engineer, BI developer, or BI engineer can help get things moving. This role is focused on the ‘plumbing’ of the data world and sets up pipelines to collect data in a constant stream or in batches throughout the day. However, their first job will be to work with your project manager or team lead to determine what is available and how easy it is to get at.
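To make the ‘plumbing’ idea concrete, here is a minimal sketch of a single batch load. Everything in it is an assumption for illustration: the CRM export, the `customers` table and the `load_batch` helper are hypothetical, and an in-memory SQLite database stands in for whatever platform you eventually choose.

```python
import csv
import io
import sqlite3

# Hypothetical CRM export; in practice this would come from a file or an API.
CRM_EXPORT = """customer_id,signup_date,plan
1,2023-01-05,basic
2,2023-02-11,pro
3,2023-02-11,basic
"""


def load_batch(conn, raw_csv):
    """Load one batch of CSV rows into the customers table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, signup_date TEXT, plan TEXT)"
    )
    rows = [
        (int(r["customer_id"]), r["signup_date"], r["plan"])
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]
    # INSERT OR REPLACE means re-running the same batch does not duplicate rows.
    conn.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for a real data platform
loaded = load_batch(conn, CRM_EXPORT)
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(f"Loaded {loaded} rows; table now holds {total}")
```

A real pipeline would add scheduling, error handling and monitoring, but the shape is the same: pull from a source system, tidy the rows, and land them somewhere an analyst can query.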
Data teams can struggle to produce reports, insights and models because they simply don’t have the data they need. If you have the luxury of starting a team from scratch, make this your priority.
Data may not be structured in a way that’s right for modelling, or it may be so granular that it needs processing first. At this stage, a decision needs to be made on where the data is going to go and how it will be processed.
“Just because you can get at the data, doesn’t mean you can get to work on analysis.”
Will the data flow into a data warehouse, data lake, or database for the next step? Or does it make more sense to leave it where it is and query it there using another tool? A data engineer will understand the complexity of landing and processing large amounts of data and the best way to get it where it needs to go. This repository becomes your ‘single source of truth’.
This isn’t the end of the story for the collection and cleansing of data. Changes at source and changes in what is required for analysis mean there is always a need for a data engineer.
Traditionally, data or business intelligence teams have reported into the IT function, as they are heavy users of databases and have specialised infrastructure needs. These teams generally work closely with product or sales without being a part of them, but they have the advantage of being able to lean on their IT teammates for support.
“Data is becoming its own centralised team and is seen as a service function.”
If the team are reporting into a product or sales function, this changes their focus. They have much more of an inside view of the team they are producing reports and analysis for. However, they may not be able to share knowledge as easily with other analysts. This also brings up the topic of career progression. If you are the only analyst in the sales team, there is no obvious place to move up.
The third possible structure is a hybrid. Analysts share knowledge with their peers but continue to report to their functional teams.
Each structure has a tradeoff, so once the team is big enough to need a formal structure, it’s important to decide which focus it will take.
- Is the functional area specialised? If the analysis requires deep knowledge of the product or market it could be worth embedding analysts in teams.
- How complex is the data? If there are gotchas and a steep learning curve just to get to grips with the data, a centralised function is best for learning and onboarding.
- How distributed is the team? If you have regional offices it makes sense for the analyst to report locally.
Once a reliable flow of data is arriving on your chosen data platform, it’s time to start analysing. The place to start is descriptive analytics, and with it your first data analyst.
Descriptive analytics is all about getting useful data in front of stakeholders and decision-makers. This answers questions like ‘Which customers have churned?’ and ‘Do products sell faster in certain markets?’.
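As a rough illustration of what answering ‘Which customers have churned?’ can look like, here is a small sketch using SQL from Python. The `subscriptions` table, its columns and the rows in it are made up for the example, and treating a non-empty `cancelled_on` date as churn is an assumed definition that your business may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions (customer TEXT, market TEXT, cancelled_on TEXT)"
)
# Made-up rows; a NULL cancelled_on means the customer is still active.
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?)",
    [
        ("Acme", "UK", None),
        ("Birch", "UK", "2023-03-01"),
        ("Cedar", "NZ", "2023-03-15"),
        ("Dune", "NZ", None),
        ("Elm", "NZ", "2023-04-02"),
    ],
)

# Which customers have churned?
churned = [
    row[0]
    for row in conn.execute(
        "SELECT customer FROM subscriptions "
        "WHERE cancelled_on IS NOT NULL ORDER BY customer"
    )
]

# What share of customers has churned in each market (as a percentage)?
churn_by_market = dict(
    conn.execute(
        "SELECT market, "
        "100.0 * SUM(CASE WHEN cancelled_on IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*) "
        "FROM subscriptions GROUP BY market"
    )
)

print(churned)
print(churn_by_market)
```

The query itself is the easy part. Agreeing with the requester on what counts as ‘churned’ is usually where the analyst’s time goes.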
“The most important part is to be able to communicate their findings in a way that makes sense for the requester.”
Data analysts specialise in visualising and describing data using SQL, BI tools and spreadsheets. This role is responsible for taking deep dives into the data to answer business questions and for creating regular reporting when needed. In reality, data professionals’ skills sit on a spectrum, and analysts are sometimes also interested in data science, data engineering, visualisation or business decisions.
It’s an unfortunate fact that analysis is often ad hoc and never used again. While it is tempting to move on to the next thing, that code block might be useful down the track. While not strictly a part of creating a team, building a culture of sharing and documenting findings is an important first step.
- Is data being captured? You cannot expect an analyst to create reports without access to data.
- How will insights be used? There should be a relationship between the analyst and the requester and some kind of feedback loop when they produce work.
- Are you likely to need reports, graphs, or simply a number for a slide deck? Analysts are not data vending machines for dashboards.
As the demand for analysis and reporting grows, you may consider self-service analytics. These are tools designed to take the complexity away from the data itself and build in business rules.
“Anyone with a bit of training and enthusiasm should be able to open the tool and get the answers they need.”
Using tools rather than sharing spreadsheets around provides a level of tracking to see what is actually being used. They also build in data governance and security, something you can’t get from emailing a report around every Monday morning.
While this is great in theory, it requires ‘buy in’ from the teams using the tool and preparation of the data so it’s ready to go. Consider bringing on ‘power users’ from functional areas to test the waters before making any investment in this area.
- Are your Execs/Board asking for it? Better get to it then.
- Are you getting the same kind of ad hoc requests again and again? If there’s a way to pull together a ‘drag and drop’ dashboard to answer the most common questions, it’s worth investigating.
- Is there a desire for visualisation/automation? If these things will help drive action and business decision making it’s worth looking into a tool to provide a solution.
Predictive analytics is the practice of building custom models that can help business decision-makers predict what is coming next. This answers questions like ‘Which customers are likely to churn this month?’, ‘What will sales be over the next year?’. This can also be seen in recommendation systems, suggestions in your email subject line and image recognition.
- Is there data to support a data scientist’s work? Data science often requires more granular, and potentially sensitive, data to get going.
- Is there a need for this kind of analysis? Data scientists who end up doing regular reporting will be neither well utilised nor satisfied in their work.
- How will the models be put into production? Is there a mechanism for their hard work to go anywhere beyond a sandpit and dummy data?
Creating a data team isn’t easy, especially if you are starting from scratch. These guidelines will help you better understand the workflow and building blocks that need to be in place to start your data journey.
This is the first part of a series on Designing a Data Team. I hope you join me for the next post on Hiring a Data Analyst.