Introduction
Snowflake is a company that I have “special insight” into because I have used the product for a data migration project at a large company. My goal in this article is to explain, in layman’s terms that someone with a non-technical background can easily understand, why Snowflake is such an incredible company. I hope that after reading this article, readers will at a minimum understand why a rational investor would invest in Snowflake at ~25X trailing sales.
Basic Business Background
Snowflake is a multi-cloud data platform provider. A multi-cloud data platform means Snowflake is able to connect data from multiple different data silos (storage locations) including on-premise and multiple different clouds. Essentially, Snowflake can be the central point to connect data from an Oracle on-premise server, an AWS Cloud, and an Azure Cloud. This is extremely important because 93% of organizations using the cloud in 2020 used multiple cloud providers. Having all the enterprise’s data in the same location allows a user to make insights using all of the organization’s data.
Different Types of Clouds
Source: Xorlogics
Data Warehouse
While Snowflake also has data lake offerings (pre-processing), it is best known for its data warehouse, and this writeup will focus on that offering. A data warehouse is a central location where organizations are able to run queries and machine learning algorithms and gain other insights into the data stored. Other major data warehouses include AWS Redshift and Azure Synapse. These are all hybrid-cloud offerings, which means they only support on-premise data sources and their respective parent company’s cloud. Google’s BigQuery has recently become a multi-cloud offering. Databricks considers itself a data lakehouse (in between a data lake and a data warehouse).
More on Snowflake’s specific use cases, its competitive advantage, and details about Databricks will be discussed later.
Industry Trends
Snowflake is benefitting from tailwinds in the cloud computing industry. Cloud computing is a safer, cheaper, and easier way to store data than legacy on-premise systems and organizations are investing in this superior technology. Gartner expects the total cloud spend to increase by almost 50% from 2021 to 2023.
Gartner Cloud Spend
Source: Gartner
This trend still has an extremely long runway. Only 60% of all data is currently stored in the cloud and the total amount of data in the world is growing at a rapid pace. Snowflake will benefit as more data is moved into the cloud and generated in the cloud. Snowflake stated that most of its customers still rely upon on-premise servers.
The reality is, is we have signed hundreds and hundreds of customers to migrate from on-prem to the cloud. The reality is, only about 50 of our customers have fully shut down their legacy system. - Koyfin
Percent of Data Stored in the Cloud
Source: Statista
During the Morgan Stanley technology conference, Snowflake’s management team boasted that they had 184 customers paying over one million dollars, yet not a single one of them was saturated. Management said, “So we really don't know what a mature, fully saturated customer looks like because we haven't gotten there.”
Competitive Advantage
Many companies start using Snowflake as a data migration tool and then discover all the other use cases with Snowflake. Management constantly referenced the increased adoption of many of their sticky products during their earnings reports. Their largest competitive advantages include economies of scale, network effects, and switching costs.
Architecture
Snowflake’s product separates computing resources from storage, which is different from software like Hadoop which ties computing resources and storage together. The advantage of having separate computing resources is you can automatically scale up (and down) resources to complete your workload so you are not wasting resources. This saves money and increases performance. Other companies have started copying Snowflake’s architecture; however, it is extremely expensive to rebuild a product, and Snowflake has scale advantages over new entrants.
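To make the cost argument concrete, here is a toy Python sketch comparing a cluster sized for peak demand with compute that scales hour by hour. The workload profile and the price are made-up assumptions for illustration only; they are not Snowflake's pricing, and real billing is more granular than this.

```python
# Toy model: every number here is hypothetical, not Snowflake pricing.
HOURLY_DEMAND = [2, 2, 2, 8, 8, 2, 2, 2]  # compute units needed in each hour
PRICE_PER_UNIT_HOUR = 3.0  # made-up price of one compute unit for one hour


def coupled_cost(demand, price):
    """Cost of a fixed cluster sized for the peak hour and left running."""
    peak = max(demand)
    return peak * len(demand) * price


def decoupled_cost(demand, price):
    """Cost when compute is resized every hour to match the workload."""
    return sum(demand) * price


fixed = coupled_cost(HOURLY_DEMAND, PRICE_PER_UNIT_HOUR)
elastic = decoupled_cost(HOURLY_DEMAND, PRICE_PER_UNIT_HOUR)
print(f"Fixed cluster:   ${fixed:.2f}")
print(f"Elastic compute: ${elastic:.2f}")
print(f"Savings:         {1 - elastic / fixed:.0%}")
```

The spikier the workload, the larger the gap between the two numbers, which is the intuition behind paying only for the compute a job actually uses.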
As Snowflake generates more revenue, they’re able to pass on more savings to customers as they’re a fixed-cost heavy business. Newer companies that do not have positive or breakeven FCF are not able to do this. Snowflake is able to charge less than new entrants and offer a fuller suite of product offerings making the barriers for new entrants high.
Also, 70% of their customers’ workloads are driven by machines, not people. This adds to Snowflake’s stickiness because companies would have to stop an automatic process and move it to another platform. Significant inertia exists in these processes.
Snowflake also constantly makes performance improvements to its product and passes the resulting savings on to customers. Snowflake expects to save customers $162 million next year through these optimizations. It projected that this would create a $97 million revenue headwind, which will be partially offset by increased usage of the platform due to the performance improvement. Many analysts were spooked by this, but Snowflake had been warning analysts about it. Process improvements and passing savings on to customers are industry standards.
Data Sharing
Snowflake has introduced data sharing to its customers which allows Snowflake customers to share data with other Snowflake customers. Snowflake allows data sharing in two different forms: Data Marketplace and cross-organization data sharing.
Snowflake’s Data Marketplace allows organizations to purchase data directly from vendors and have it shared into their cloud. For example, an organization could buy S&P Capital IQ data sets from S&P Global and have that data in its data warehouse. This allows the organization to integrate these datasets directly with its own data, making them easier to use and easier to mine for insights. Snowflake’s Data Marketplace is growing rapidly, with listings growing “195% this year, now with more than 1,100 data listings from over 230 providers.”
Snowflake Datasets
Source: Snowflake
Snowflake also allows organizations to share data with outside organizations. This could be extremely beneficial for problems such as supply chain optimization. Once a supply chain becomes reliant on this data-sharing feature, it would be nearly impossible to rip out. It is well known how hard it is to rip out a single enterprise software system; ripping out systems across multiple organizations at once must be nearly impossible.
Snowflake’s data sharing feature has been rapidly adopted. Snowflake had almost a 100% increase in customers that used the data sharing feature. The stickiness of Snowflake’s platform is increasing as Snowflake scales.
Snowflake Data Sharing
Source: Author Calculations, Data from Snowflake Earnings Call
Snowpark
Snowflake has also recently unveiled Snowpark, which currently supports Java and Python (the latter in private preview). Snowpark will allow companies to build complex machine learning applications directly in Snowflake. Once it is adopted, there will be high switching costs in moving applications. The switching costs will increase with the complexity and number of applications built. The costs include application downtime, risk of application failure, time, and money. As Snowflake’s customers build out their applications, it becomes much harder to leave for another platform.
Product Advantage
While it is possible to imitate another product, I would be remiss not to mention how Snowflake’s product is better than the competition. In the Dresner Advisory Survey, 100% of customers said they would recommend Snowflake, and in my own experience, it was really intuitive to learn. Snowflake’s language is called Snowflake SQL, which is based heavily on SQL. Almost every developer knows SQL, which makes Snowflake SQL very easy to learn.
Snowflake should continue to execute partially because of its consumption-based model. The consumption-based model better aligns their incentives with customers which encourages them to continue innovating.
Management/Capital Allocation
Frank Slootman is the CEO of Snowflake, and he made his name through his previous successes at Data Domain and ServiceNow. He is one of the most well-respected software CEOs and is a great leader for Snowflake. Snowflake is not reliant on acquisitions, and most of its acquisitions are made to hire good engineering teams:
Yes, we will continue to acquire most of our M&A to date other than Streamlit have really been around acquihire. There's been maybe a little bit of technology, but it's really more about finding people with great domain expertise, good engineering teams. - Koyfin
Snowflake is just starting to break even. One of management’s stated goals is to slow dilution. They have guided to less than 1% dilution which means they have started to get share-based compensation under control.
Risks
Obviously, Snowflake is in a very attractive industry, and competition is always a threat in attractive industries. Snowflake’s three biggest threats are the large public clouds’ data warehousing systems, BigQuery, and Databricks.
Public Cloud Data Warehouses
Amazon and Microsoft have hybrid-cloud data warehousing products that support on-premise data and their own cloud. If they were to switch to a multi-cloud offering similar to Snowflake’s and invest heavily in their products, they could be a formidable threat to Snowflake. I find this transition extremely unlikely because supporting a multi-cloud offering would benefit Amazon’s and Microsoft’s main competitors: each other. For example, if AWS were to create a multi-cloud offering, it would be compatible with Microsoft and Google, giving them business and cannibalizing Amazon’s core AWS business. It does not make strategic sense for Amazon and Microsoft to do this; however, it is still a risk. Amazon and Microsoft have transitioned from being direct competitors of Snowflake to co-selling Snowflake. Amazon has even been known to throw money at deals to help Snowflake win new business. Also, Amazon has recently allowed its e-commerce data to become available on Snowflake. Google’s failed attempt at making BigQuery a multi-cloud platform also serves as a warning to both Amazon and Microsoft. A significant shift in Amazon’s and Microsoft’s strategy would need to occur for them to become direct competitors again.
BigQuery
The three cloud providers have had different strategies with Snowflake. Google has taken the route of trying to fully integrate its cloud offering and adopting a multi-cloud strategy. In terms of product, Snowflake is well ahead of BigQuery and Snowflake executives stated that they win “almost every deal” when head-to-head with BigQuery.
Snowflake execs seem perplexed that Google doesn’t want to partner with them.
Partnering with them and showing them that having Snowflake in AWS will lead to you selling more of your software around us, such as SageMaker. It's funny, I had a call 3 weeks ago with the people at Google, and I pointed out to them there were 300 instances where you guys compete it to the very end with BigQuery and we won. And all of those customers ended up in AWS or Azure when they all could have been in Google if you would had just partnered with us and you would have been able to sell some of your AI or ML technologies around it. - Koyfin
While Snowflake currently has a large lead on BigQuery, Google has deep pockets and could continue to innovate its product. I believe Snowflake’s competitive advantage is too strong for BigQuery to catch up.
Databricks
Databricks is another multi-cloud offering and, ironically, both one of Snowflake’s biggest competitors and one of its biggest complements. Databricks is a data lakehouse, which means it focuses on the preprocessing and analysis stage. Snowflake’s executives say the two coexist in about 80% of accounts and compete in the other 20%.
And then within our existing customers, we do see some of these emerging, call them, like Databricks. They coexist in many of our accounts. I would say 80% we don't overlap. Like for instance, Morgan Stanley, you use both Snowflake and Databricks. You talk to the guy at AT&T who's very, you talk to customers and they don't see the competitive nature between the 2. There's about 20% where we kind of bump heads a little bit, but the vast majority of customers we coexist, and they see a very defined use case where they use Snowflake and a very defined use case where they use Databricks. - Koyfin
Historically, Databricks and Snowflake were more complements than competitors. Snowflake’s recent release of Snowpark competes in Databricks’ data science area, and Databricks is supporting more data warehousing use cases. While Databricks and Snowflake are not very competitive currently, competition could increase, especially as the industry matures. I don’t think competition will increase for a considerable time, and I believe both companies have strong moats and will each be able to carve out their niches with high pricing power.
Valuation
Snowflake’s product revenue accounts for the vast majority of its total revenue and is the key to the company. In the Q4 earnings call, management guided to a product revenue Net Retention Rate (NRR) between 150% and 170%. NRR measures how much existing customers pay this year relative to last year, so an NRR of 150% means they pay 1.5 times as much, and that revenue carries very little incremental cost. NRR is the key driver of Snowflake’s stock price. Thereafter, I expect Snowflake’s NRR to decelerate as it becomes harder to grow off of large numbers and customers eventually reach saturation.
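As a rough illustration of what an NRR in that range implies, the Python sketch below treats NRR as this year's revenue from an existing customer cohort divided by what the same cohort paid a year earlier. This is a simplified definition, and the $100 starting cohort and the decelerating NRR path are hypothetical inputs chosen for illustration, not company guidance or the author's model.

```python
def net_revenue_retention(cohort_revenue_prior, cohort_revenue_now):
    """Simplified NRR: revenue from the same customers now vs. a year ago."""
    if cohort_revenue_prior <= 0:
        raise ValueError("prior-year cohort revenue must be positive")
    return cohort_revenue_now / cohort_revenue_prior


def project_cohort_revenue(starting_revenue, nrr_by_year):
    """Compound a cohort's revenue through a sequence of yearly NRR values."""
    revenue = starting_revenue
    path = []
    for nrr in nrr_by_year:
        revenue *= nrr
        path.append(revenue)
    return path


# A cohort that paid $100 last year and $160 this year has an NRR of 160%.
print(f"NRR: {net_revenue_retention(100.0, 160.0):.0%}")

# Hypothetical decelerating NRR path (assumption, not guidance).
hypothetical_nrr_path = [1.60, 1.45, 1.30]
projected = project_cohort_revenue(100.0, hypothetical_nrr_path)
for year, revenue in enumerate(projected, start=1):
    print(f"Year {year}: ${revenue:,.1f}")
```

Even with NRR falling every year in this made-up path, revenue from the original cohort keeps compounding, which is why the rate of deceleration matters so much to the valuation.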
Product Revenue
Source: Author, Historical Data from SEC Filings and Koyfin
I expect Snowflake’s total company margins to expand as product revenue becomes a larger percentage of the product mix. I expect a modest expansion in product gross profit margins this year due to performance improvements, but it will accelerate in 2024 as similar improvements are not made.
Total Revenue
Source: Author, Historical Data from SEC Filings and Koyfin
Snowflake’s revenue per employee has quadrupled since 2019, showing impressive operating leverage. Snowflake has significant economies of scale and should be able to keep widening operating margins materially. Its operating margin has improved by 126 percentage points, from negative 185% in 2019 to negative 59% last year. While margins are still very negative, they have improved significantly and should continue to improve.
Source: Author, Historical Data from SEC Filings
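The 126 figure for operating margins is a difference between two margins, so it is best read as percentage points instead of a relative change. A short Python check using only the two margins quoted in the article shows both ways of describing the improvement; the variable names are illustrative.

```python
margin_2019_pct = -185      # operating margin in 2019, from the article
margin_last_year_pct = -59  # operating margin last year, from the article

# Absolute improvement: the gap between the two margins, in percentage points.
improvement_points = margin_last_year_pct - margin_2019_pct

# Relative improvement: how much the loss per dollar of revenue shrank.
loss_reduction = 1 - abs(margin_last_year_pct) / abs(margin_2019_pct)

print(f"Improvement: {improvement_points} percentage points")
print(f"Operating loss per dollar of revenue shrank by {loss_reduction:.0%}")
```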
Stock-based compensation is adjusted out of the operating income. While it is a real expense, it is already factored into the diluted shares outstanding projections in the final table.
Margins
Source: Author, Historical Data from SEC Filings and Koyfin
Dilution should slow considerably for Snowflake. Dilution “has already been running below 1% year-on-year on a fully diluted basis”. Snowflake management has guided to 360m shares outstanding in 2023 and there is no reason to doubt Frank Slootman’s word given his track record. I expect a modest 1% dilution until 2025, and for Snowflake to eventually buy back a small percentage of shares outstanding given their capital-light model.
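For a sense of scale, the Python sketch below compounds the guided 360m share count at 1% a year. It assumes the 360m figure applies to 2023 and that dilution is exactly 1% in each of the following two years; eventual buybacks are not modeled.

```python
def project_shares(start_shares_m, annual_dilution, years):
    """Return the share count (in millions) for the start year and each later year."""
    shares = [start_shares_m]
    for _ in range(years):
        shares.append(shares[-1] * (1 + annual_dilution))
    return shares


# 360m shares guided for 2023 (from the article), 1% assumed dilution per year.
for year, count in zip(range(2023, 2026), project_shares(360.0, 0.01, 2)):
    print(f"{year}: {count:.1f}m shares")
```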
Given the estimates below and an exit multiple of 25 EV/EBIT, I expect Snowflake to deliver an IRR of 19%. While this may seem like a lot, given the risky nature of the business, the expected IRR should be significantly higher than for a typical company. I do believe these estimates are conservative, and Snowflake has enough margin for error that it is a good investment at this price level.
Projected IRR
Source: Author, Historical Data from SEC Filings and Koyfin
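For readers who want to see the mechanics behind an exit-multiple IRR, here is a minimal Python sketch. Apart from the 25x EV/EBIT multiple, every input in the example call is a hypothetical placeholder and not one of the author's estimates, so the output will not match the 19% figure. The sketch also assumes a single purchase today and a single sale at exit, with net cash defaulting to zero.

```python
def implied_irr(current_price, exit_ebit_m, exit_multiple,
                exit_shares_m, years, net_cash_m=0.0):
    """Annualized return from buying today and selling at an exit-multiple value.

    exit_ebit_m, exit_shares_m, and net_cash_m are in millions.
    """
    exit_enterprise_value = exit_ebit_m * exit_multiple
    exit_equity_value = exit_enterprise_value + net_cash_m
    exit_price = exit_equity_value / exit_shares_m
    return (exit_price / current_price) ** (1 / years) - 1


# Hypothetical inputs for illustration only (not the author's projections).
example = implied_irr(
    current_price=130.0,   # assumed share price today
    exit_ebit_m=6_000.0,   # assumed EBIT in the exit year
    exit_multiple=25,      # exit EV/EBIT multiple used in the article
    exit_shares_m=370.0,   # assumed diluted shares at exit
    years=8,               # assumed holding period
)
print(f"Implied IRR: {example:.1%}")
```

Because the exit price scales linearly with both EBIT and the multiple, a small change in either assumption moves the implied return noticeably, which is why the margin for error matters.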
Disclosure: I/we have a beneficial long position in the shares of SNOW either through stock ownership, options, or other derivatives. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it. I have no business relationship with any company whose stock is mentioned in this article.
More of a big picture question, but is cloud storage predicted to just endlessly grow? In other words, will there ever be an alternative that is as cheap and effective as cloud storage?
Hey, liked the article but I have some thoughts:
What do you mean when you say Snowflake can connect to Oracle or GCP/AWS/Azure sources? As far as I know, there's no real external/federated query capabilities on Snowflake. If you were trying to read directly out of an Oracle DB on prem / a CloudSQL GCP MySQL instance / an RDS AWS Postgres instance, you'd have to replicate the data into Snowflake using some process (script, Snowpipe, 3rd party ETL tool, etc.)
You mentioned in the article that the three majors only can read on-prem data + whatever's in their cloud - I don't think that's really true. BigQuery Omni can read across all three major cloud object storage platforms (https://cloud.google.com/blog/products/data-analytics/analyze-data-across-clouds-with-bigquery-omni), and while I don't have experience with it, it looks like Athena has multi-cloud capabilities on the AWS side, too.
Finally, I'd just mention that as of right now (6/8/22), Python Snowpark is in private preview, AKA not generally available. Hopefully that changes soon, but as of right now I think most customers either don't have access, or would be skeptical to build production workflows on a product which is essentially in beta.