Assessing the Financial Sustainability of AI

In my April column, I talked about the true cost of AI and how poorly it is understood, a blind spot that could be detrimental to the technology's long-term profitability. Interestingly, in the two months since then, we've seen some striking headlines from the tech industry that may bear out my argument to a dramatic degree.

It feels like the winds in the AI industry are changing direction so quickly that it's hard to keep track. For the past few months, technology companies and other businesses have been cracking the whip to get employees to use AI more, demanding that teams integrate it into their workflows regardless of whether they have a clear need or a specific desire for the software.

Hindsight is 20-20

As anyone thinking about it might have predicted, when you tie people's livelihoods to using something more, a larger proportion of people will, in fact, use that thing more. This has led to “tokenmaxxing”, token usage leaderboards within companies like Amazon, and shocking quarterly AI token cost figures from places like Uber (and other unnamed companies). It is not clear to me why these companies are surprised by these results, but nevertheless, this has led to a pivot in the instructions to employees, both because these costs cannot be sustained for any length of time and because the use of AI has not produced astonishing business results.

It's possible that senior leadership believed that some miraculous burst of results was about to come from the use of AI, but if so, they hadn't done their homework. Many of us in the field, and people in the media covering the industry, have warned that AI is a tool that can be used effectively or ineffectively, and that expecting miracles will always lead to disappointment.

I've used this kind of metaphor before, but imagine if these companies were in the construction business, and electric power tools had just been invented, making incredible productivity improvements in construction possible. The correct response would not be to buy as many drills as possible, to the point of making drill parts scarce and driving up their price, then instruct workers to use the drill for every job and post leaderboards showing who used the drills for the most minutes per day. You'd have buildings with Swiss cheese patterns of holes in them, you'd spend a lot of money on drills and the electricity to power them, and you'd have about as much to show for it as tech companies have from AI now.

Money Isn't Unlimited

In any case, reality has begun to set in, and at least it was a quick return to earth. Some businesses are still buying drills, but the big players have realized that the ratio of costs to benefits here is absurd, and they are course correcting. However, as I explained in April, this will not be as easy as they think. Some companies are starting to tell their teams that AI should be used for productive purposes, not just tokenmaxxing, to try to reduce costs while still reaping the benefits of the technology where it can generate value.

What they don't understand is that budgeting for tokens, and clearly defining when AI will help with a problem, is a much less quantifiable task than it is for other types of technology. Let's go back to my April article and recall the experience of using AI as an individual.

“[Y]ou can clearly control how many tokens you send, and thus control your costs, but that control is limited. You can make your prompts shorter, limit extraneous instructions, and lower your input costs as a result. However, when agent tools are involved, and the LLM creates prompts to pass to other LLMs, you no longer manage the length of the prompts. More importantly, you have very little control over the number of tokens that any model responds with (beyond, say, asking it to be “brief”). For the most part, the number of output tokens is part of the unknown unknowns I described earlier. And, you will notice, an output token costs 5 times as much as an input token.”

Extending this further, any time you use AI, there is a chance it will fail to answer your question. So the slot machine aspect adds to the problem. The user does not know (A) how many tokens any given prompt will return or (B) how many times the prompt will need to be submitted (possibly with edits) to get a successful response to the query. To calculate the cost, we need to sum all the input token counts and all the output token counts (A, unknown) across the number of attempts required (B, also unknown). A and B vary widely based on the design of the model, the problem at hand, the randomness of the model, and other factors behind the scenes that we may not even be aware of. Then we multiply by the price per token of whatever model or models are being used, which, as I explained in April, also varies.
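To make that arithmetic concrete, here is a minimal Python sketch of the calculation for a single query. The function name, the token counts, and the per-million-token prices are all made up for illustration; the only thing taken from the text above is the 5-to-1 ratio between output and input prices.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """Token counts for one submission of a prompt (hypothetical values)."""
    input_tokens: int
    output_tokens: int  # "A" in the text: not known in advance


def query_cost(attempts, input_price_per_mtok, output_price_per_mtok):
    """Dollar cost of one query, summed over every attempt it took.

    len(attempts) is "B" in the text: also not known in advance.
    Prices are expressed in dollars per million tokens.
    """
    total_in = sum(a.input_tokens for a in attempts)
    total_out = sum(a.output_tokens for a in attempts)
    return (total_in * input_price_per_mtok
            + total_out * output_price_per_mtok) / 1_000_000


# Illustrative only: three attempts were needed before the answer was usable.
attempts = [Attempt(2_000, 800), Attempt(2_300, 1_500), Attempt(2_300, 1_200)]

# Assumed prices: $1 per million input tokens, $5 per million output tokens.
cost = query_cost(attempts, input_price_per_mtok=1.0, output_price_per_mtok=5.0)
print(f"${cost:.4f}")  # $0.0241
```

The formula itself is trivial. The trouble is that every quantity fed into it, other than the first prompt's length, is only known after the fact.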

So, if you're in the finance department of a tech company, and you need to budget for next year's AI tokens, good luck. Even with estimates based on past usage, or with very good information about the company's production goals, your chances of budgeting the right amount seem pretty slim to me. However, you have to set some kind of limit, because this cannot be a blank check, so you will have to cut people off at some point.
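One rough way to see why the budget is so hard to pin down is to simulate it. The sketch below is a toy Monte Carlo exercise, not a forecasting method: the success rates, token ranges, query volume, and prices are all invented assumptions, chosen only to show how uncertainty about A and B turns into a wide range of plausible bills.

```python
import random


def simulate_annual_spend(n_queries, rng,
                          input_price_per_mtok=1.0,
                          output_price_per_mtok=5.0):
    """Return one simulated yearly token bill in dollars.

    Every distribution here is an assumption made for illustration.
    """
    # We don't know how often a prompt succeeds or how verbose the model is,
    # so each simulated "year" draws its own values for both.
    success_prob = rng.uniform(0.3, 0.9)
    output_log_mean = rng.uniform(5.5, 7.5)

    total = 0.0
    for _ in range(n_queries):
        # B: resubmit until the answer is usable, capped at 5 attempts.
        attempts = 1
        while attempts < 5 and rng.random() > success_prob:
            attempts += 1
        for _ in range(attempts):
            input_tokens = rng.randint(500, 5_000)
            # A: output length drawn from a heavy-tailed distribution.
            output_tokens = int(rng.lognormvariate(output_log_mean, 1.0))
            total += (input_tokens * input_price_per_mtok
                      + output_tokens * output_price_per_mtok) / 1_000_000
    return total


rng = random.Random(42)  # fixed seed so the run is repeatable
runs = sorted(simulate_annual_spend(2_000, rng) for _ in range(100))
low, median, high = runs[5], runs[50], runs[94]
print(f"low ${low:,.2f} / median ${median:,.2f} / high ${high:,.2f}")
```

The gap between the low and high figures comes almost entirely from not knowing the success rate and the typical response length ahead of time, which is exactly the position a finance team is in.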

Practical Implications

How will this actually work? Is it “manual coding” in the second half of the year, after spending the first half using AI intensively? Are all our emails and marketing documents written by hand in Q3 and Q4? Do we shut down our AI transcription tools and speech-to-text software once the limit is hit? This is an interesting question to me, because I've seen firsthand how different the experience of coding with AI is from coding without it, and going back and forth between the two processes can be incredibly disorienting.

This also raises the question of how cost-cutting on AI will affect companies that provide AI-based solutions. Last October I discussed how hyperscalers (Anthropic, OpenAI, Google, etc.) are pushing companies to build AI-based features into their products, in an attempt to deliver returns to the investors who have sunk billions of dollars into the industry. As the cost of providing AI features rises, and companies move towards a pay-per-use model, this flywheel will begin to come apart. If companies start using AI-based tools less because their budgets can't support the ongoing costs, the revenue pipeline back to hyperscalers will dry up. Anthropic and OpenAI are planning IPOs this year, both with highly uncertain revenue streams and hundreds of billions of dollars owed to investors, so a slowdown in AI use is the last thing they need.

It's also worth mentioning that Apple announced its AI product plans last week at WWDC, and the reviews have been very positive so far. The new Siri, built on Google Gemini technology, will have greater privacy protection (on-device processing, private cloud computing, and less data storage) and will not cost users more. With this development, and if the quality lives up to expectations, general consumer use of ChatGPT and Claude may be in jeopardy.

Conclusion

Watch this space, because while the stories of “corporations shocked by AI bills” and “OpenAI and Anthropic shooting for the biggest IPOs in history” are often reported separately, they are really the same narrative seen from different sides. Even though tech companies feel that AI offers real benefits and productivity gains, they don't have unlimited budgets to spend on it. If they don't have unlimited budgets (and consumers certainly don't, with CPG pricing squeezing household budgets and consumer sentiment at its lowest in nearly a century of tracking), we have to go back and ask where the millions and billions in revenue that OpenAI, Anthropic, and others expect to generate will come from. Combine this with public pushback against data centers and negative sentiment about AI in general, and hyperscalers have a real problem on their hands.


Read more about my work at www.stephaniekirmer.com

