Making (Software) Hay While The (AI) Sun Shines
anelson June 28, 2026 #genai
As I've been working with SOTA models from Anthropic and OpenAI practically all day every day for the last ~2.5 years now, the idiom "to make hay while the sun shines" has repeatedly come to my mind, seemingly of its own accord. In the midwestern American countryside milieu in which I was forged, this was a common expression, meaning to take advantage of favorable but fleeting conditions that allow one to accomplish some task. I can't shake the feeling that the current state of inference subscriptions from Anthropic and OpenAI is exactly the "sun" that we should take advantage of while we have it.
Today, I (or rather, my company) pay $200/mo for a Claude Max subscription from Anthropic, and another $200/mo for OpenAI's ChatGPT Pro. These subscriptions allow me to use the Claude Code and Codex coding agents and also the corresponding desktop and mobile apps, powered by the latest SOTA models, with very generous usage limits that for me so far have been indistinguishable from unlimited. Using these tools, I'm noticeably more productive (at least I feel more productive, but I lack an objective productivity metric with which to substantiate my feeling), along more than one axis.
I don't want this post to devolve into the specific value I get out of agentic coding tools; that's a topic for another day. But briefly, my experience matches that of other competent engineers whose work I admire, to wit:
- On a given task, implementing a new feature or fixing a bug, chatting with the LLM helps me work through the contours of the problem and identify solutions, ultimately producing a plan for the agent to execute. In this way, a good agent makes me maybe 2x more productive on the day-to-day software engineering tasks I was already able to perform on my own.
- LLMs are great research assistants, which let me explore domains I don't know well and learn more about the world (obviously they hallucinate so this is a particularly lossy form of learning). This lets me indulge my wild ideas for self-edification more or less on a whim, without spending a lot of my own time sifting through search results and skimming papers. I can't say that this is a 2x or 10x or 100x multiplier, because it's a net-new capability that I didn't have before and which I greatly enjoy. There are also professional rewards here, as my self-edification journeys often have unexpected professional applications.
- Agents are great at overcoming my own inertia, especially on side projects. When I don't have a professional obligation to do something, the only motivation is my own innate desire to do it. That's a very uneven phenomenon, as anyone with dozens or hundreds of discarded side projects can attest. But with agentic tools, the activation energy to take the next step in a project is so much lower! I forgot what I was working on? Ask the agent to look at recent activity and tell me. I'm not sure how to fix an issue or even what to fix? Tell the agent to do it, it will do a shitty job that I will hate, this will get me engaged in doing it the right way, and off we go! Not sure which way is best? Spawn multiple agents to go implement all of the ways and pick the best parts from each.
  That also is simply not a capability that I had before. I would be sleeping in on a Saturday, or nursing a hangover on a long weekend, feeling guilty for not working on that project that I was so passionate about just a short while ago. Coding agents don't do anything to treat the underlying psychological problem (yet! 🤦‍♂️) but now all I have to be able to muster is a lazy inquiry to an eager and always-on clanker sidekick to move a project forward, albeit ever so slightly.
- Agents amplify my existing Dunning-Kruger to weapons-grade purity. That's very much a double-edged sword at the societal level, but for me personally I feel (again with those unfalsifiable feels!) that I am able to be creative in many more domains with the assistance of a SOTA LLM that has (lossily) compressed all of human knowledge. Not only because they unlock knowledge in an unfamiliar domain, but also because, given the right tooling, the cost in terms of my time to try out an idea is so much lower that I can just indulge my curiosity without having to convince myself that there's a monetization strategy to justify the cost.
Taking the above as a given, that is the software hay that I'm making. So what's the sun?
The current market price for SOTA model inference tokens is whatever OpenAI and Anthropic charge on their usage-based API plans. When looking at my token usage (which, again, is unmetered on the subscription plans that I use), it's easy to calculate how much that usage would have cost, had I paid per token for it. I don't monitor this that closely, but I pulled the data for this article. Here's my usage for June 2026, with the caveats that the month isn't over yet, and I only ran ccusage on my main Hetzner dev server, so this doesn't reflect token usage in the desktop and mobile apps or the smaller amount of agentic coding I do on my local Mac:
$ npx ccusage@latest monthly
╭─────────────────────────────────────────────╮
│                                             │
│  Coding (Agent) CLI Usage Report - Monthly  │
│  Detected: Claude, Codex, Gemini CLI        │
│                                             │
╰─────────────────────────────────────────────╯
┌─────────┬──────────┬────────────────┬─────────────┬────────────┬──────────────┬───────────────┬───────────────┬────────────┐
│ Month   │ Agent    │ Models         │ Input       │ Output     │ Cache Create │ Cache Read    │ Total Tokens  │ Cost (USD) │
├─────────┼──────────┼────────────────┼─────────────┼────────────┼──────────────┼───────────────┼───────────────┼────────────┤
[snipped]
├─────────┼──────────┼────────────────┼─────────────┼────────────┼──────────────┼───────────────┼───────────────┼────────────┤
│ 2026-06 │ All      │                │  70,033,557 │ 30,047,962 │  115,063,898 │ 4,753,772,196 │ 4,968,917,613 │   $4523.53 │
├─────────┼──────────┼────────────────┼─────────────┼────────────┼──────────────┼───────────────┼───────────────┼────────────┤
│         │ - Claude │ - fable-5      │   4,824,550 │ 25,271,829 │  115,063,898 │ 3,781,169,956 │ 3,926,330,233 │   $3568.09 │
│         │          │ - haiku-4-5    │             │            │              │               │               │            │
│         │          │ - opus-4-7     │             │            │              │               │               │            │
│         │          │ - opus-4-8     │             │            │              │               │               │            │
├─────────┼──────────┼────────────────┼─────────────┼────────────┼──────────────┼───────────────┼───────────────┼────────────┤
│         │ - Codex  │ - gpt-5.4-mini │  65,209,007 │  4,776,133 │            0 │   972,602,240 │ 1,042,587,380 │    $955.44 │
│         │          │ - gpt-5.5      │             │            │              │               │               │            │
├─────────┼──────────┼────────────────┼─────────────┼────────────┼──────────────┼───────────────┼───────────────┼────────────┤
│ Total   │          │                │ 238,950,689 │ 47,564,864 │  133,778,816 │ 8,359,726,986 │ 8,780,021,355 │   $6908.88 │
└─────────┴──────────┴────────────────┴─────────────┴────────────┴──────────────┴───────────────┴───────────────┴────────────┘
So that's over $4.5K worth of tokens, for $400 in total actual dollars spent. OMG the labs subsidized over $4K of use in just one month for just one subscriber, this is unsustainable! Or is it?
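The arithmetic behind that comparison is easy to check. Here is a minimal Python sketch that uses only the June 2026 rows from the table and the two $200/mo subscription prices mentioned earlier. The dictionary layout and the `summarize` helper are illustrative assumptions, not anything ccusage itself emits, and the "subsidy" is only implied by list prices, since the labs' real costs are unknown.

```python
# June 2026 rows copied from the ccusage table above (partial month, one server).
JUNE_2026 = {
    "Claude": {"input": 4_824_550, "output": 25_271_829,
               "cache_create": 115_063_898, "cache_read": 3_781_169_956,
               "cost_usd": 3568.09},
    "Codex": {"input": 65_209_007, "output": 4_776_133,
              "cache_create": 0, "cache_read": 972_602_240,
              "cost_usd": 955.44},
}

# Monthly subscription prices stated in the post.
SUBSCRIPTIONS_USD = {"Claude Max": 200.0, "ChatGPT Pro": 200.0}

TOKEN_KINDS = ("input", "output", "cache_create", "cache_read")


def summarize(usage, subscriptions):
    """Compare API-equivalent cost against what was actually paid (hypothetical helper)."""
    api_cost = round(sum(row["cost_usd"] for row in usage.values()), 2)
    paid = sum(subscriptions.values())
    total_tokens = sum(row[kind] for row in usage.values() for kind in TOKEN_KINDS)
    cache_read = sum(row["cache_read"] for row in usage.values())
    return {
        "api_equivalent_usd": api_cost,
        "paid_usd": paid,
        "implied_subsidy_usd": round(api_cost - paid, 2),
        "multiple": round(api_cost / paid, 1),
        "total_tokens": total_tokens,
        "cache_read_share": round(cache_read / total_tokens, 3),
    }


summary = summarize(JUNE_2026, SUBSCRIPTIONS_USD)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Two things stand out when this runs against the numbers above: the API-equivalent cost is roughly eleven times the subscription spend, and cache reads make up the overwhelming majority of the token count, so the headline total says more about long agent sessions than about freshly generated text.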
Over in the comment threads on Hacker News, there's constant bickering over whether or not per-token API pricing is profitable right now. Many commenters claim that, even at the current per-token pricing, VC money is subsidizing inference in an Uber-like play for market share at any cost, and that prices must necessarily increase once investors remember that realized gains are a thing that interests them. Others claim that actually, comparing model performance and token price over time, inference keeps getting cheaper and will continue to do so as models get even more capable. I've seen plausible-sounding claims that inference is actually profitable at current pricing, and is used to pay for the (possibly not yet profitable) training of new models. Then there are always a few guys who run some tiny Qwen model locally on their Macs and claim that's all they need and thus frontier labs are cooked, apparently willfully ignorant of what actual SOTA models can do.
Suffice it to say, I have no idea of the economics that OpenAI and Anthropic are working with, nor do I have any idea how those economics will evolve over time. I also don't think it matters either way, because whether or not the labs will need to jack up pricing to reach break-even, I don't see any long-term incentive for them to keep subsidizing tokens as generously as they do now.
One doesn't need any fancy mathematical analysis to see that the adoption of these tools is so rapid, enthusiastic, and (in many cases) mindless, that the vast majority of corporate activity and especially software engineering activity is already becoming utterly dependent upon GenAI tools, with no sign of that adoption slowing down. Already, it's not at all unusual for an engineering team to have no idea what is in their codebase, having vibe-coded themselves into such a hazy, tenuous grasp of their own product that one could be forgiven for wondering what intoxicant they are all smoking. Many teams literally cannot do their jobs without access to coding agents and the SOTA models that power them, with token budgets well into the billions. And that's just engineering. If the proliferation of AI slop befouling my inbox (and, sadly, the web as a whole) is anything to go by, I may be among the last humans alive producing text with my own thinkmeat, fleshsticks, and ocular juice bags.
You can agree or disagree with my mostly-negative framing of the situation, but I don't think a reasonable person can refute the mere fact that GenAI tooling is being adopted much faster than any technology humanity has yet conceived, to the point that I think many orgs would be unable to function without their clankers. What do you suppose are the odds that Anthropic and OpenAI will continue to leave money on the table once their tentacles are wrapped around all of an organization's essential functions?
You might well counter that market forces will prevent that from happening, since if one of the labs jacks up prices then the others will just take more of the market. After all, the models are not that different in capability. Sometimes Anthropic's is best, other times OpenAI's is a bit better, but there's not anything you can do with one model and not the other. And that's not even taking into account the open-weight models, especially the Chinese labs with their definitely-not-distilled-from-US-models offerings. You might then sit back righteously and bask in the smug glow of your own brilliance, like the insufferable little strawman that you are.
Far be it from me to let a strawman argument go by unremarked. You see, that's just not how enterprises buy technology. If you were around during the transition from on-prem virtualization to cloud workloads, you probably know what I'm talking about. Is it pants-on-head retarded to move your on-prem VMs one-for-one into EC2 instances so you can pay 10x the cost? Yes, if you look at it the way you look at your own personal spending. But almost everyone went all-in on cloud, and continues to do so, some more mindlessly and profligately than others. Why is that? Azure didn't suddenly offer VM compute for 10% of what Amazon charged, thereby killing the AWS business. Why is that? The answer, then as now, is much the same.
The reasons are complex, but my take boils down to the fact that decisions are usually made on vibes, the principal-agent problem is very much a thing, and keeping the systems that you utterly depend on vendor-neutral is a frustratingly hard problem that almost no one has the desire, discipline, or budget to solve. So once an org is utterly dependent upon, say, Anthropic's AI tooling (which will be way more than just inference API endpoints; they are smart enough to make sure of that!), it will be very hard for them to switch to, say, OpenAI (and anyway they could only contemplate it at the end of the enterprise license term, and that should be multiple years if Anthropic's sales team have half a brain).
All of which brings me back to the premise that started this post. I'm trying to take full advantage of the cheap LLM subscriptions while I still can, because I do not expect this to last. The labs will enshittify, rents will be sought, and intelligence will absolutely be metered.
I'm not saying that you or I will lose access to SOTA models. Continuing the cloud analogy, thanks to usage-based pricing models I can rent an hour on a server that would cost ~$10K to buy, for the price of a fancy coffee. If I only need it for an hour, this is a huge win compared to a capex-based hosting model in which I have to buy the whole server. But if I want that ~$10K server for a whole month, it'll cost me ~$5K/mo in AWS. Likewise, I can get a million Opus 4.8 output tokens for just $25. But look at my usage above: I don't need 1M tokens, I need 70M input and 30M output tokens in a month! And no doubt usage will go up in July!
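To put rough numbers on the analogy, here is a small Python sketch under stated assumptions. The 730-hour month is my approximation, and the rent-vs-buy comparison deliberately ignores power, hosting, and depreciation. Pricing every June output token at the $25 per million quoted for Opus 4.8 is also a simplification, since the usage above spans several models and no input or cache prices are given here.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a calendar month


def rent_vs_buy(purchase_usd, rent_usd_per_month):
    """Return (hourly rental price, months of renting that equal the purchase price)."""
    hourly_usd = rent_usd_per_month / HOURS_PER_MONTH
    breakeven_months = purchase_usd / rent_usd_per_month
    return hourly_usd, breakeven_months


def metered_output_cost(output_tokens, usd_per_million_tokens):
    """Cost of output tokens alone at a flat per-million price."""
    return output_tokens / 1_000_000 * usd_per_million_tokens


# Figures from the post: ~$10K server, ~$5K/mo to rent the equivalent in the cloud.
hourly_usd, breakeven_months = rent_vs_buy(10_000, 5_000)

# June 2026 output tokens from the table, priced at the quoted $25 per million.
june_output_usd = metered_output_cost(30_047_962, 25.0)

print(f"hourly rental: ${hourly_usd:.2f}")
print(f"renting equals buying after {breakeven_months:.1f} months")
print(f"June output tokens at $25/M: ${june_output_usd:.2f}")
```

An hour really does cost about as much as a fancy coffee, but renting full-time matches the purchase price after two months. The same shape applies to tokens: one million is pocket change, while a month of heavy agentic use is a line item, even before input and cache tokens are counted.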
Right now, I don't care what tokens cost. I don't wonder if the thing I'm going to have the clanker do is worth the cost. I don't wonder which reasoning level is right, or whether the task I have in mind is something Haiku can handle or if it merits Opus. I don't have to constantly prune my AGENTS.md looking for any way I can spend fewer tokens and keep the same performance. I don't have to argue online about which approaches to tooling and which skills are worth their cost, or whether the new version of my preferred coding agent harness sacrifices effectiveness as part of its prompt optimizations.
My thesis is that, soon enough, I'm going to have to deal with those things, and that will mark the end of this glorious moment in which my capabilities are vastly expanded for minimal cost. I would love to be proven wrong on this, but I'm not counting on it. That's why I'm making software hay while the AI sun shines.