Scaling Smarter: The Growth of Serverless Compute
The growth of usage-based pricing and gen AI is bringing serverless capabilities front and center in purchase decisions and product roadmaps.
As companies go through cloud migration journeys, one decision they have to make is how much compute capacity to reserve with their cloud service providers (CSPs). For businesses with stable use cases and scale, this may be relatively straightforward. But for growing companies or companies with “spiky” usage, this can get quite complicated.
Let’s say you are a bank and every week you have to update your risk prediction models with the latest data. You only run this workload a few times a month, but today you have to reserve the full capacity of dedicated servers from AWS for this work. You don’t want to keep these servers turned on all the time because you’d be paying for idle hours. But each time you do want to run something, you have to turn on the servers, wait for them to warm up, run your work, then turn them off. One time, you forgot to spin down your servers and got a larger-than-expected AWS bill (your manager was not too happy about that).
In another scenario, let’s say you are an enterprise company building an internal chatbot. You need a large amount of compute capacity to fine-tune an LLM, but everyone else had the same idea, so your CSP will not have capacity available for another month. Your project is now delayed (and once again your boss is not too happy).
Serverless solutions resolve these issues by reducing complexity, providing flexibility, and lowering costs. Instead of managing separate contracts and relationships with a third-party (3P) compute provider and an application provider, you only need one contract with the application provider, because they handle provisioning your compute on the backend.
Server capacity is always available, so there is no need to wait for servers to warm up. Additionally, the application automatically shuts down your compute instance when you are done with your work. TL;DR: serverless options typically lead to faster compute times and lower costs. Our channel checks suggest that on a like-for-like basis, serverless options typically save users 15-40% on all-in costs for various workloads.
Now, serverless is not always the right answer. You may be better off sourcing your own compute in certain scenarios. For example:
Companies with stringent regulatory requirements, such as those in healthcare, may need to run workloads on owned or directly managed hardware.
Companies with very consistent, predictable needs may be able to optimize their server spend independently.
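To make the trade-off between spiky and steady workloads concrete, here is a rough back-of-the-envelope sketch. All of the rates and hours below are hypothetical numbers picked for illustration, not actual AWS or vendor pricing, and the function names are my own. It assumes reserved capacity is billed for every hour of the month, while serverless is billed only for hours used, at a higher hourly rate.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def monthly_cost_reserved(hourly_rate: float) -> float:
    """Reserved capacity is paid for every hour, whether it is used or not."""
    return hourly_rate * HOURS_PER_MONTH


def monthly_cost_serverless(hourly_rate: float, hours_used: float) -> float:
    """Serverless is paid for only while the workload is running."""
    return hourly_rate * hours_used


def break_even_hours(reserved_rate: float, serverless_rate: float) -> float:
    """Monthly usage at which both options cost the same."""
    return monthly_cost_reserved(reserved_rate) / serverless_rate


# Hypothetical rates: serverless carries a premium per hour of use.
RESERVED_RATE = 1.00    # dollars per hour, assumed
SERVERLESS_RATE = 2.50  # dollars per hour, assumed

if __name__ == "__main__":
    # Spiky usage (a few runs a month) versus near-constant usage.
    for label, hours in [("spiky", 24), ("steady", 600)]:
        reserved = monthly_cost_reserved(RESERVED_RATE)
        serverless = monthly_cost_serverless(SERVERLESS_RATE, hours)
        cheaper = "serverless" if serverless < reserved else "reserved"
        print(f"{label}: reserved ${reserved:.0f}, "
              f"serverless ${serverless:.0f} -> {cheaper}")
    print(f"break-even: {break_even_hours(RESERVED_RATE, SERVERLESS_RATE):.0f} hours/month")
```

Under these assumed rates, the workload that runs a few times a month is far cheaper on serverless, while the near-constant workload is cheaper on reserved capacity. The real break-even point depends entirely on the actual pricing a company can negotiate.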
As companies continue to optimize their cloud spend and gen AI applications grow, I expect more tools and applications to provide serverless offerings, driven by customer demand. For example, Databricks announced general availability of its serverless option about 6 months ago. Elastic recently announced the general availability of its Elastic Cloud Serverless platform. Industry feedback on these capabilities has been positive and suggests a growing attach rate in their customer bases. In the long term, I expect customers to have more choice and flexibility in how they manage their compute spend.
Sources: expert calls, company websites, company public filings.
Disclaimers: The information presented in this newsletter is the opinion of the author and does not necessarily reflect the view of any other person or entity, including Altimeter Capital Management, LP ("Altimeter"). The information provided is believed to be from reliable sources but no liability is accepted for any inaccuracies. This is for information purposes and should not be construed as an investment recommendation. Past performance is no guarantee of future performance. Altimeter is an investment adviser registered with the U.S. Securities and Exchange Commission. Registration does not imply a certain level of skill or training.
This post and the information presented are intended for informational purposes only. The views expressed herein are the author’s alone and do not constitute an offer to sell, or a recommendation to purchase, or a solicitation of an offer to buy, any security, nor a recommendation for any investment product or service. While certain information contained herein has been obtained from sources believed to be reliable, neither the author nor any of his employers or their affiliates have independently verified this information, and its accuracy and completeness cannot be guaranteed. Accordingly, no representation or warranty, express or implied, is made as to, and no reliance should be placed on, the fairness, accuracy, timeliness or completeness of this information. The author and all employers and their affiliated persons assume no liability for this information and no obligation to update the information or analysis contained herein in the future.