5.28.24

How to get more out of your startup’s data strategy

Fifteen tips and techniques to improve your data stack, KPIs, product roadmap, AI initiatives, and more from former Shopify VP Solmaz Shahalizadeh

The importance of data-driven decision-making at work feels like conventional wisdom. Yet the share of Fortune 1000 companies that have actually managed to build data-driven organizations is low and declining, even as those same companies have increased their investments in data over the same period.

For startups, setting the foundation of your data strategy and analytics framework is still challenging, but far less so, for two reasons: first, because startups tend to have less data debt (the compounding issues that arise from shortcuts and compromises made on data infrastructure and processes), and second, because startups have fewer and less established processes, which means that new processes, standards, and tools are easier to introduce and more likely to be adopted.

To help you get the most out of an early investment in data, we’re bringing you guidance from Operating Advisor Solmaz Shahalizadeh, the data leader who grew Shopify’s data organization from a handful of people to 500 and led the team that built the company’s suite of machine learning products.

In part one of our data series, Solmaz gave startup leaders a 101 course on how to build a high-performing data team. In part two, she shares fifteen practical tips and techniques to help startups get more out of their data, covering topics ranging from how to create trust in data to how to use it to drive product innovation.

Chock-full of practical advice for founders, functional leaders, and data teams, Solmaz’s guide can benefit anyone working to create a company where data is accurate, accessible, understood, and, most importantly, used to make decisions and influence the product and business.

How to set up your initial data infrastructure

Data infrastructure is the foundation of any good data strategy. Advancements in the cloud data stack have abstracted away complexities and reduced technical barriers to using data, meaning most startups can get started setting up data infrastructure even before hiring a data team. “There’s usually no need to build your data infrastructure from scratch, no matter how tempting that may sound. Great tools already exist for every layer of the stack,” says Solmaz.

There's no need to build your stack from scratch. 

The right amount of time, engineering resources, and budget to invest in your data infrastructure up front is going to vary from team to team, but Solmaz recommends following three principles regardless of how you approach your initial set-up.

Tip #1: Collect the right data from the beginning

“Think hard about what data you want to collect from your business processes and product and why. Make sure to consider as many current and future use cases as possible because you won’t be able to go back and capture data later,” says Solmaz.

Deciding what data to collect is an ongoing, team effort. “Ideally, data tracking is treated as part of product development, and the engineering and data teams have a single process for managing it. Otherwise, you may end up shipping a feature and not know its impact because you didn’t implement tracking for the right events.”

When collecting data, it’s important to also consider and capture its context. “Make sure to capture not only the fact that an event happened but also relevant information about the event — the product version, system, browser, referrer, etc. That information may prove useful later, and it’s not something you can go back and get after the fact.”
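As a rough sketch, here’s what capturing an event together with its context might look like. The event name, fields, and values below are illustrative, not tied to any particular analytics tool.

```python
import uuid
from datetime import datetime, timezone

def track_event(name: str, user_id: str, properties: dict) -> dict:
    """Build a tracking event that carries its context with it.
    In practice, this payload would be sent to your analytics pipeline."""
    return {
        "event_id": str(uuid.uuid4()),  # unique ID makes events deduplicable
        "name": name,
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Context captured at emit time -- you can't reconstruct this later.
        "context": {
            "product_version": "2.14.0",
            "platform": "web",
            "browser": "Firefox 126",
            "referrer": "https://example.com/blog",
        },
        "properties": properties,  # event-specific details
    }

event = track_event("design_exported", user_id="u_123", properties={"format": "png"})
```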

Tip #2: Unify data from multiple sources in one place

Once you’ve decided what data you want to collect, it’s time to bring that data together into a warehouse or lake. Consolidating data from all your major business functions will help you prevent data silos, perform more complex (and insightful) analyses, and get a 360-degree view of performance across the company.

Which tool you decide to use will depend on the type of data you're storing, your budget, and the resources available to maintain it. Regardless, Solmaz suggests choosing something out-of-the-box (either from a vendor or open-source), at least initially. “Choose a tool that meets standard data privacy and security requirements and then implement it with the default settings to start.”
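As a minimal illustration of that consolidation step, the sketch below uses SQLite as a stand-in for a real cloud warehouse; the sources and schema are hypothetical.

```python
import sqlite3

# SQLite stands in here for a cloud warehouse (BigQuery, Snowflake, Redshift, ...).
warehouse = sqlite3.connect("warehouse.db")
warehouse.execute("""
    CREATE TABLE IF NOT EXISTS events (
        source TEXT, user_id TEXT, event TEXT, occurred_at TEXT
    )
""")

# Hypothetical extracts from two separate systems.
billing_rows = [("billing", "u_123", "invoice_paid", "2024-05-01T10:00:00Z")]
product_rows = [("product", "u_123", "design_exported", "2024-05-01T09:58:12Z")]

# Land everything in one table so billing and product events can be joined.
warehouse.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", billing_rows + product_rows)
warehouse.commit()

# One query now spans business functions -- no silos.
for row in warehouse.execute("SELECT * FROM events WHERE user_id = 'u_123' ORDER BY occurred_at"):
    print(row)
```

In practice, a managed ELT tool would land the data from each source, but the principle is the same: everything ends up in one queryable place.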

Tip #3: Enable teams to answer their own questions

While many startups opt to build a custom analytics or business intelligence solution in-house, Solmaz finds that these often don’t scale with the company and its data needs. For the sake of simplicity, speed, and sustainability, Solmaz recommends purchasing a tool (open-source tools are another option, but deploying and maintaining them requires resources that are typically scarce in the early days of a startup). Using a vendor that specializes in data will help you keep focus on the goals of your product and business and avoid spending precious resources trying to do what analytics and data infrastructure companies can do for you.

Providing a platform that allows non-technical users to answer simple questions on top of well-defined, company-wide metrics also prevents your data team from turning into a service organization that has to answer the same questions over and over.

With the recent evolution of AI and LLMs, we’re witnessing more and more tools that allow non-technical users to interact with data in natural language, including deriving insights from defined KPIs and metrics, querying data, and creating simple visualizations and dashboards. Whether you choose to use one of these platforms or not, it’s important to provide people across the organization with the basic knowledge and tools they need to interact with data independently. This can help the entire company move faster and allow data scientists and analytics teams to focus on more complex projects and analyses.
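One lightweight way to make self-serve work is to define each company-wide metric once, centrally, so every dashboard, notebook, and natural-language interface reads the same definition. A hedged sketch, continuing the SQLite stand-in from above:

```python
import sqlite3

# Define the metric once, as a view; column and metric names are illustrative.
warehouse = sqlite3.connect("warehouse.db")
warehouse.execute("""
    CREATE VIEW IF NOT EXISTS weekly_active_users AS
    SELECT strftime('%Y-%W', occurred_at) AS week,
           COUNT(DISTINCT user_id)        AS wau
    FROM events
    GROUP BY week
""")

# A non-technical user (or an LLM-powered interface) only ever sees the view,
# so every answer to "what's our WAU?" comes from the same formula.
print(warehouse.execute("SELECT * FROM weekly_active_users").fetchall())
```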

How to decide which metrics to track

All companies — but especially early ones — should be selective about which metrics to track. Defining, maintaining, and deeply understanding a metric and its drivers takes time and resources, and if a metric isn’t going to influence decisions and lead to action, it’s usually not worth the effort. Reporting too many metrics can also reduce focus and create confusion about what good performance really looks like and what the team should do in response to a given change in a metric.

“Don’t waste energy on metrics that aren’t actionable.”

“Don’t waste energy on metrics that aren’t actionable,” says Solmaz. “At Shopify, we curated a set of company-wide, well understood, and actionable metrics that we looked at every day. Tracking these metrics and their drivers, and then forecasting their expected trends, allowed us to detect unexpected changes in the metrics quickly. Periodically, we would refine our metric sets to ensure that everything we were tracking reflected the current state of the product or business and would allow everyone to use data to make informed decisions.”

The metrics that are worth tracking will depend on your business model and vertical and will change as the business grows and evolves. You can use Solmaz’s “three Cs” criteria as a rubric when assessing potential options.

Tip #4: Check for the three Cs of an actionable metric

    1. Computable: The metric is calculated using one specific formula and method, and it’s easy to reproduce results (see the sketch after this list).
    2. Clear: The metric definition is clear and simple enough that people outside the data team can understand and remember what it captures.
    3. Cross-functional: The metric is relevant and valuable to more than just one business unit.
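To make the “computable” criterion concrete, here’s a hypothetical sketch of a single canonical metric definition: one formula, in one place, so anyone who recomputes the number gets the same result.

```python
# One canonical definition -- the function *is* the formula. Names are illustrative.
def website_conversion_rate(signups: int, unique_visitors: int) -> float:
    """Share of unique website visitors who signed up, as a percentage."""
    if unique_visitors == 0:
        return 0.0
    return 100.0 * signups / unique_visitors

assert round(website_conversion_rate(42, 1_000), 1) == 4.2
```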

Metric matrix for SaaS companies

For each stage of your product and business model, identify the metric that will be the most informative and actionable for your team. Make sure you’re collecting the data you need to measure it and then track and report on it consistently.

Exercise: Use this simple grid to help you select and organize the metrics that matter to your company.

Template

Stage        | Product KPI | Business KPI | Data to track | Insights / Outcomes
Attraction   |             |              |               |
Acquisition  |             |              |               |
Engagement   |             |              |               |
Conversion   |             |              |               |
Retention    |             |              |               |
Expansion    |             |              |               |
Evangelism   |             |              |               |

Example: SaaS Design Tool

Attraction
  Product KPI: N/A
  Business KPIs: # of press hits; Social following; Web traffic

Acquisition
  Product KPI: # of sign-ups (free / paid)
  Business KPIs: # of users; Website conversion rate (%)

Engagement
  Product KPIs: Value moments (usage): Create first design; Share design with team; Export first design
  Business KPIs: Daily Active Users (DAUs); Monthly Active Users (MAUs); Time in app; Cohort retention; # of exports; # of designs per user; # of shared designs

Conversion
  Product KPI: Upgrade to paid account
  Business KPI: Upgrade to paid account

Retention
  Product KPI: Continues to use / pay for the product
  Business KPI: Net dollar retention (NDR)

Expansion
  Product KPI: # of new seats added
  Business KPI: # of new seats added

Evangelism
  Product KPIs: Social evangelism; Customer testimonials; NPS
  Business KPIs: Social evangelism; Customer testimonials; NPS

How data teams can create trust in data

Increasing data usage is one of the most important roles of the data team, and you can’t do that without creating trust in data and putting it into its proper context. “You can have high quality data and derive really interesting insights from it, but if other teams don’t quite see how it could help their team or company improve, it won’t be used and, in the end, won’t have an impact on the product and business,” says Solmaz.

To some extent, how data gets used — or not — is up to leadership and sometimes, your company’s organizational structure. But data teams certainly play an important part in influencing the role of data in decision-making, strategy, and execution.

Tip #5: When sharing insights from data, always answer, “So what?”

In grad school, Solmaz worked as a bioinformatician with molecular oncologists. Early on, she found that they rarely studied the reports she produced. “I eventually realized something that now seems obvious: they were interested in their research — not in pure data. When I started to present the insights within the context of biology behind the data, the dynamic changed.”

Speak the language of your collaborators. 

Solmaz calls this “speaking the language of your collaborators” and considers it an essential skill for all data scientists and leaders. “As a data team, you have to tailor your work to the strategic priorities of the company and be able to present data in the context of and using the language of those strategic priorities,” says Solmaz. “If your CEO is thinking heavily about user growth, you want to situate your analysis within the context of that objective, not give them a soup of numbers and force them to fish out whatever insights and ideas they can.”

At Shopify, Solmaz would coach data scientists before they met with members of the leadership team. When they practiced presenting an analysis, Solmaz would follow up with questions: “So what? What does this analysis tell us about the business, our progress toward goals, and what to do next?” As Solmaz puts it, “If everything is going as expected, you can just send a report. I wanted to prompt my team to go a level deeper and really put the findings within the context of the goals and challenges of the specific leader to make the data actionable for them.”

Tip #6: Be transparent about data quality

Nothing creates trust quite like honesty, so Solmaz encourages data teams to proactively inform others of the quality of any metric or analysis. “There are always going to be trade-offs between speed and accuracy. Some numbers (like growth rate) don’t need to be perfectly accurate and reproducible to the minute, whereas others (like revenue) do. But it’s up to the data team to clearly explain this calculus and share their confidence level every time they present a finding.”

Data teams should always assess how accurate a number needs to be for a given use case or audience. A marketing team may need a metric to decide whether to invest more in a given channel during a promotional campaign. In this case, a metric that’s 90% of the actual figure is likely sufficient to give direction, especially if a decision needs to be made quickly. In other cases — say, the CFO reporting revenue to the board — the data team needs to put in the time to get a number that is exact, and if there are concerns, be the first to raise them.
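One lightweight convention, sketched below purely as an illustration, is to never pass around a bare number: every reported metric carries its confidence level and caveats with it.

```python
from dataclasses import dataclass

@dataclass
class ReportedMetric:
    name: str
    value: float
    confidence: str  # e.g. "directional (~90% of actual)" or "audited, exact"
    caveats: str = ""

channel_roas = ReportedMetric(
    name="paid_social_roas",
    value=3.1,
    confidence="directional (~90% of actual)",
    caveats="Excludes the last 24h of conversions; 7-day attribution window.",
)
```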

Tip #7: Catch and raise issues fast

Another tried and true way to instill trust is to admit when you’re wrong. “It’s much better for the data team to inform someone that there’s a mistake or issue with data rather than for that person to discover it for themselves,” says Solmaz.

“Teams may worry that proactively sharing issues damages trust, but really the opposite is true. Proactively and systematically sharing issues with data quality and changes to metric calculations is essential for building trust. Data teams should anticipate and be able to answer follow-up questions too. For instance: What does this change mean and how does it impact historical analysis and past decisions? Why is this change or issue happening? Who needs to know about it? What is the plan going forward to prevent issues or implement changes in the future?”

It’s really easy for teams to lose trust in data, but it’s very hard for that trust to be regained — especially within a company where data informs important decisions that can have a big impact on the business.

How to use data to drive growth and product innovation

Data is only as valuable as the action it drives and opportunities it creates. “Data teams should not only inform but also influence and be used to execute strategy,” says Solmaz. This work can take many forms — identifying product changes that can increase engagement, proposing adjustments to the pricing model to enhance product-led growth initiatives, or assessing the risks and rewards of going after a new market, to name a few.

In our conversations, Solmaz discussed four common business initiatives that benefit when data and the data team play a key role.

Tip #8: Use data to refine your ideal customer profile

Using a well-defined ideal customer profile (ICP) to refine your go-to-market strategy and product roadmap can help you more effectively attract and retain the customers who will benefit most from your product and are likely to become its strongest advocates.

Typically, startups take one of two approaches to get to an ICP. In one scenario, one team (whether that’s product or GTM) takes a first pass at defining the ICP based on user research and the team’s past experience and intuition. Once they have a first draft, they loop in the data team to compare the draft against behavioral data of actual users on the platform. 

“The data team will come in and identify the users who have high product engagement, retention, and reach, and in many cases, the attributes of this cohort of users will differ, at least somewhat, from an original conception of the ideal customer. As the product grows, it also becomes harder to solely rely on findings from user research. The data team’s findings end up being a reality check, and the team can then refine the ICP based on those findings,” explains Solmaz. 
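As an illustration of that reality check, here is a hypothetical pandas sketch that surfaces the attributes of the best existing users; the columns, thresholds, and segments are invented for the example.

```python
import pandas as pd

users = pd.DataFrame({
    "user_id":         ["u1", "u2", "u3", "u4"],
    "segment":         ["agency", "freelancer", "agency", "in-house"],
    "weekly_sessions": [14, 2, 11, 6],
    "retained_90d":    [True, False, True, True],
})

# "Best" users: highly engaged and still around after 90 days.
best = users[(users["weekly_sessions"] >= 10) & users["retained_90d"]]

# Which segments dominate the top cohort? Compare this against the draft ICP.
print(best["segment"].value_counts())
```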

Looping in the data team from the beginning can eliminate this kind of back and forth, and help you get to an accurate ICP faster.

“In most cases, your ICP is really just the best customers you already have. Good collaboration between your user research, product, and data functions can help clearly define an ICP that maps onto a real population who are using your product in the most ideal way and will enable you to attract more users like them.” This process works especially well for consumer, SMB, or PLG-focused businesses, but larger organizations might want to rely on a structured validation process and qualitative data before investing too heavily in the product.

The data team can also help with the ongoing work of identifying new classes of customers you should try to attract to the platform. “The ICP is not set in stone and changes with product growth and expansion. If you want to go up market, for example, the data team can answer important questions like: Is our current marketing attracting any enterprise leads? How are they converting (or not)? How does their behavior differ from our existing SMB customers? And then other teams can use those findings to inform the strategies that are used to go after that new market.”

Tip #9: Design better experiments

A/B testing and other forms of experimentation can help teams determine which version of an experience is more effective at driving a particular behavior, such as starting a free trial or making a purchase. Solmaz has seen a rise in early startups running A/B tests, even before they’ve built out product analytics. She sees this as a good thing, with one caveat: “The results will only ever be as good as the test’s design.”

That’s why it’s really important to bring in a data scientist to provide input early on. “For example, a data scientist can tell you whether you have enough users for a given experiment to be able to actually get statistically significant results. Or, the product team might want to change a button in the sign-up workflow and then measure whether it increased sales a month later. But someone with data expertise is going to be able to tell you that you can’t tie a small change in the sign-up workflow to that level of impact later on. Bringing in the data team early in the process ensures you have sound experimental design and metrics that will allow you to make product changes based on the experiment’s results.”
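To make the sample-size point concrete, here is the kind of quick power calculation a data scientist might run before launch, sketched with statsmodels; the baseline conversion rate and target lift are illustrative.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.040  # current sign-up rate: 4.0%
target = 0.044    # smallest lift worth detecting: 4.4%

effect = proportion_effectsize(target, baseline)
n_per_group = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"Need ~{n_per_group:,.0f} users per variant")  # often far more than expected
```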

Tip #10: Leverage data to size the TAM of a new feature

Proactive, strategic data teams offer their company a competitive advantage that reactive, service-oriented data teams can’t. Assessing product opportunities by helping estimate the total addressable market (TAM) of a proposed feature or service is a great way for data teams to help inform the company’s roadmap.

“At Shopify, we were considering offering shipping on the platform. Since we’d never offered shipping before, we wanted to get a sense of the TAM before we made a big investment,” recalls Solmaz. “We started by looking at the shipping addresses of the users we already had on the platform to assess how many of them could have benefitted from a shipping feature, assuming we started with a proposed set of initial geographies. We also looked at the size of the orders they made, how their usage had grown (or not), and other types of behavior. All this data factored into our final recommendation on whether this was a good bet.”
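A much-simplified, hypothetical sketch of that kind of sizing analysis in pandas; the data and launch geographies are invented for the example.

```python
import pandas as pd

orders = pd.DataFrame({
    "merchant_id":  ["m1", "m1", "m2", "m3", "m4"],
    "ship_country": ["US", "US", "CA", "DE", "US"],
    "order_value":  [120.0, 80.0, 45.0, 200.0, 60.0],
})

# How many existing merchants could a shipping feature serve if we launched
# in a proposed initial set of geographies?
launch_geos = {"US", "CA"}
in_scope = orders[orders["ship_country"].isin(launch_geos)]

addressable = in_scope["merchant_id"].nunique()
volume = in_scope["order_value"].sum()
print(f"{addressable} merchants and ${volume:,.0f} in order volume fall in launch geos")
```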

To influence high-stakes decisions like these, the data team has to be brought in during the exploratory phase, and that requires having the trust and ear of leadership. “Notice that the team didn’t say: we’re going to do shipping, and now we need to bring in the data team to provide numbers that back up this decision. Data was a deciding factor.”

Tip #11: Measure a new feature’s performance

Once your data team has grown beyond the initial hire and you have enough users on the platform, you can start investing in product analytics, using behavioral segmentation, cohort analyses, and other methods to understand feature and product usage. This is particularly effective for identifying issues within the current customer experience, assessing whether new features are working as expected, and finding opportunities for improvement.
 
“A data team can help make sure a new feature is achieving its purpose. They can answer questions like: ‘Are the users solving the intended problem with this feature? Are these users the ones that we thought should be using this feature? Where are users dropping off? Where are users spending time? Is the feature attracting any new users to the platform? Are the new users (for whom the new feature was part of their original understanding of the product) adopting the feature faster than older ones?’”
 
Building features that no one ends up using drains scarce developer resources and contributes to product complexity, so it’s important to invest in analytics resources to determine how a new feature is performing and what to do if performance is low. 
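For instance, a simple drop-off analysis can show where users abandon a new feature. A minimal sketch with invented events and steps:

```python
import pandas as pd

events = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2", "u3"],
    "step":    ["opened", "configured", "completed", "opened", "configured", "opened"],
})

funnel = ["opened", "configured", "completed"]
reached = {s: events.loc[events["step"] == s, "user_id"].nunique() for s in funnel}

# Step-to-step continuation rates reveal where users drop off.
for prev, curr in zip(funnel, funnel[1:]):
    rate = reached[curr] / reached[prev] if reached[prev] else 0.0
    print(f"{prev} -> {curr}: {rate:.0%} continue")
```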
 

Lay the foundation to build with ML/AI 

At Shopify, Solmaz led the team that helped build a number of machine learning and AI-powered products and features for the e-commerce platform, including Shopify Capital, which uses machine learning to offer cash advances to merchants in real time and has advanced more than $5.1B since its inception in 2016.

The recent rise of advanced, open-source large language models (LLMs) has significantly lowered the barrier to experimenting and building products with machine learning and AI. These widely available models have also inspired others to start thinking about employing their own data to create unique solutions to their users’ problems.

“Teams that have the baseline capabilities to use data to inform their product and business can start using that data to build their own products — creating with data instead of just using it to describe what already exists,” says Solmaz. 

Tip #12: Choose a problem that’s suitable for tackling with AI

Building with AI will introduce its own challenges and uncertainties, so you want to experiment with a use case that is worthwhile, feasible, and low risk. “Start with a real user problem that your company is uniquely positioned to solve,” says Solmaz. “Make sure that you already have the data you need to create something differentiated, and that you have proper feedback loops in place inside your product to learn and improve your models and determine whether your solution is having the desired impact.”

Solmaz is speaking from experience. At Shopify, she and her team built machine learning models that powered multiple data products. “One of the early use cases we tackled was order fraud detection. We noticed that merchants didn’t have the time and expertise to assess the fraud risk on every single order in a timely manner. Our order fraud detection product would recommend that the merchant fulfill, cancel, or investigate further. Because we had a decade’s worth of order and payment data, and because we knew reducing fraud was a worthwhile goal for merchants and the company, this proved to be a good initial data product. And having measurable success with that initial product opened the door for more investments in AI products, including Shopify Capital.”

Tip #13: Identify the right product metrics to track

“Remember: your goal is not to build an AI product for its own sake. Your goal is to provide value to users and give them a delightful experience. With that in mind, make sure that the metrics you track will give you a good sense of how things are working (or not working) within the product,” says Solmaz.

Your goal is to provide value to users and give them a delightful experience.

A common mistake when building products with AI is to focus purely on AI-related metrics such as model accuracy and not measure whether the product enabled your users to achieve their goals. It’s very important to develop AI products with clearly defined product and user metrics that are independent of the models and algorithms used. As an example, for a feature that makes product recommendations, be sure to track how many users accept the recommendation that’s made, click on it, add it to cart, and convert.
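A tiny illustration of tracking that funnel independently of the model; the counts are invented.

```python
# Recommendation funnel: did users act on what the model suggested?
shown, clicked, added_to_cart, purchased = 10_000, 1_800, 600, 240

print(f"click-through rate: {clicked / shown:.1%}")
print(f"add-to-cart rate:   {added_to_cart / shown:.1%}")
print(f"conversion rate:    {purchased / shown:.1%}")
# Model accuracy can improve while these numbers fall -- track both.
```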

Tip #14: Go with the simplest model to start

“Some may disagree, but I always recommend starting with the simplest AI model that can solve your problem, at least until the new feature or product has product-market fit; you can always opt for more complex models after that,” says Solmaz.

Otherwise, you may be unnecessarily sinking resources into a product that doesn’t actually solve the user need you thought it would. “Many models are available off the shelf now with great APIs, and you can start with something approachable and then adapt it once you have a sense of what the experience is like for users in production.”
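As a sketch of what starting simple can look like (using scikit-learn and synthetic data purely for illustration), a logistic-regression baseline is cheap to train, easy to explain, and sets the bar that any more complex model must beat.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for your real labeled data.
X, y = make_classification(n_samples=1_000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baseline = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
print(f"Baseline accuracy: {baseline.score(X_test, y_test):.2f}")
```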

Tip #15: Make the user experience of AI products failure-proof

Don’t just think about the ideal customer experience — the case where you have the required data and the model returns results with high confidence. Also think through what will happen if the experience goes sideways.

“Because the models are probabilistic, there may be many scenarios when they aren't able to provide a reliable response. Make sure you map out what those scenarios are, how you’ll know if it's happening, and how you’ll manage that in the product experience and as a team.”
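A minimal sketch of that kind of graceful degradation, with a hypothetical confidence threshold and fallback:

```python
CONFIDENCE_THRESHOLD = 0.85  # illustrative cut-off

def log_low_confidence(output: dict) -> None:
    # Feeds the team's monitoring so they know how often the model punts.
    print(f"low-confidence response ({output['confidence']:.2f}); served fallback")

def recommend(model_output: dict, fallback: list[str]) -> list[str]:
    """Show the model's answer only when it's confident; otherwise degrade safely."""
    if model_output["confidence"] >= CONFIDENCE_THRESHOLD:
        return model_output["items"]
    log_low_confidence(model_output)
    return fallback  # e.g. best-sellers, never a blank or broken state

result = recommend({"confidence": 0.41, "items": ["x"]}, fallback=["best-seller-1"])
```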

Keep in mind: even if you do all the necessary prep work in advance, the first use case you test for AI may not end up being the right one. If that happens, the data team should be the first to sound the alarm, and either advocate for changes to the product or propose deprecating it if the data shows the feature has degraded the user experience.

At Bessemer, many of the SaaS businesses in our portfolio are starting or are already integrating AI into their products. In our conversations with these teams, and in our writing on recent advancements in AI, we emphasize the importance of making sure the use case for an LLM is really germane to the product and not “bolted on.”

We also encourage product leaders to think about how embedded AI or LLMs could give more users access to the product, help the product address more use cases, or even generate a tangible work asset. If the answers to these questions naturally lead to LLMs, then design with those specifics in mind.

Building with machine learning and AI may feel like an imperative in the current moment, but customers don’t really care about LLMs per se; they care about the business problems the technology solves. So, much like data insights, the technology is only as useful as what you do with it — not an end unto itself.

Further reading

For more practical advice on building a high-impact data function, check out part one of this series where Solmaz covers:

  • The key objectives of the data team
  • When to make your first data hire
  • What to look for in your first data IC
  • What to look for in your first data leader
  • How to structure your data organization
  • Ways to “measure the measurers”