Making (Software) Hay While The (AI) Sun Shines

anelson June 28, 2026 #genai

As I’ve been working with SOTA models from Anthropic and OpenAI practically all day every day for the last ~2.5 years now, the idiom “to make hay while the sun shines” has repeatedly come to my mind, seemingly of its own accord. In the midwestern American countryside milieu in which I was forged, this was a common expression, meaning to take advantage of favorable but fleeting conditions that allow one to accomplish some task. I can’t shake the feeling that the current state of inference subscriptions from Anthropic and OpenAI is exactly the “sun” that we should take advantage of while we have it.

Today, I (or rather, my company) pay $200/mo for a Claude Max subscription from Anthropic, and another $200/mo for OpenAI’s ChatGPT Pro. These subscriptions allow me to use the Claude Code and Codex coding agents and also the corresponding desktop and mobile apps, powered by the latest SOTA models, with very generous usage limits that for me so far have been indistinguishable from unlimited. Using these tools, I’m noticeably more productive (at least I feel more productive, but I lack an objective productivity metric with which to substantiate my feeling), along more than one axis.

I don’t want this post to devolve into the specific value I get out of agentic coding tools; that’s a topic for another day. But briefly, my experience matches that of other competent engineers whose work I admire, to wit:

On a given task, implementing a new feature or fixing a bug, chatting with the LLM helps me work through the contours of the problem and identify solutions, ultimately producing a plan for the agent to execute. In this way, a good agent makes me maybe 2x more productive on the day-to-day software engineering tasks I was already able to perform on my own.
LLMs are great research assistants, which let me explore domains I don’t know well and learn more about the world (obviously they hallucinate so this is a particularly lossy form of learning). This lets me indulge my wild ideas for self-edification more or less on a whim, without spending a lot of my own time sifting through search results and skimming papers. I can’t say that this is a 2x or 10x or 100x multiplier, because it’s a net-new capability that I didn’t have before and which I greatly enjoy. There are also professional rewards here, as my self-edification journeys often have unexpected professional applications.
Agents are great at overcoming my own inertia, especially on side projects. When I don’t have a professional obligation to do something, the only motivation is my own innate desire to do it. That’s a very uneven phenomenon, as anyone with dozens or hundreds of discarded side projects can attest. But with agentic tools, the activation energy to take the next step in a project is so much lower! I forgot what I was working on? Ask the agent to look at recent activity and tell me. I’m not sure how to fix an issue or even what to fix? Tell the agent to do it, it will do a shitty job that I will hate, this will get me engaged in doing it the right way, and off we go! Not sure which way is best? Spawn multiple agents to go implement all of the ways and pick the best parts from each.

That also is simply not a capability that I had before. I would be sleeping in on a Saturday, or nursing a hangover on a long weekend, feeling guilty for not working on that project that I was so passionate about just a short while ago. Coding agents don’t do anything to treat the underlying psychological problem (yet! 🤦‍♂️) but now all I have to be able to muster is a lazy inquiry to an eager and always-on clanker sidekick to move a project forward, albeit ever so slightly.
Agents amplify my existing Dunning-Kruger to weapons-grade purity. That’s very much a double-edged sword at the societal level, but for me personally I feel (again with those unfalsifiable feels!) that I am able to be creative in many more domains with the assistance of a SOTA LLM that has (lossily) compressed all of human knowledge. Not only because they unlock knowledge in an unfamiliar domain, but also because, given the right tooling, the cost in terms of my time to try out an idea is so much lower that I can just indulge my curiosity without having to convince myself that there’s a monetization strategy to justify the cost.

Taking the above as a given, that is the software hay that I’m making. So what’s the sun?

The current market price for SOTA model inference tokens is whatever OpenAI and Anthropic charge on their usage-based API plans. When looking at my token usage (which, again, is unmetered on the subscription plans that I use), it’s easy to calculate how much that usage would have cost, had I paid per token for it. I don’t monitor this that closely, but I pulled the data for this article. Here’s my usage for June 2026, with the caveats that the month isn’t over yet, and I only ran ccusage on my main Hetzner dev server, so this doesn’t reflect token usage in the desktop and mobile apps or the smaller amount of agentic coding I do on my local Mac:

$ npx ccusage@latest monthly

╭─────────────────────────────────────────────╮
│                                             │
│  Coding (Agent) CLI Usage Report - Monthly  │
│     Detected: Claude, Codex, Gemini CLI     │
│                                             │
╰─────────────────────────────────────────────╯

┌──────────┬───────────────┬──────────────────────────┬──────────────┬─────────────┬───────────────┬────────────────┬────────────────┬─────────────┐
│ Month    │ Agent         │ Models                   │        Input │      Output │  Cache Create │     Cache Read │   Total Tokens │  Cost (USD) │
├──────────┼───────────────┼──────────────────────────┼──────────────┼─────────────┼───────────────┼────────────────┼────────────────┼─────────────┤
[snipped]
├──────────┼───────────────┼──────────────────────────┼──────────────┼─────────────┼───────────────┼────────────────┼────────────────┼─────────────┤
│ 2026-06  │ All           │                          │   70,033,557 │  30,047,962 │   115,063,898 │  4,753,772,196 │  4,968,917,613 │    $4523.53 │
├──────────┼───────────────┼──────────────────────────┼──────────────┼─────────────┼───────────────┼────────────────┼────────────────┼─────────────┤
│          │ - Claude      │ - fable-5                │    4,824,550 │  25,271,829 │   115,063,898 │  3,781,169,956 │  3,926,330,233 │    $3568.09 │
│          │               │ - haiku-4-5              │              │             │               │                │                │             │
│          │               │ - opus-4-7               │              │             │               │                │                │             │
│          │               │ - opus-4-8               │              │             │               │                │                │             │
├──────────┼───────────────┼──────────────────────────┼──────────────┼─────────────┼───────────────┼────────────────┼────────────────┼─────────────┤
│          │ - Codex       │ - gpt-5.4-mini           │   65,209,007 │   4,776,133 │             0 │    972,602,240 │  1,042,587,380 │     $955.44 │
│          │               │ - gpt-5.5                │              │             │               │                │                │             │
├──────────┼───────────────┼──────────────────────────┼──────────────┼─────────────┼───────────────┼────────────────┼────────────────┼─────────────┤
│ Total    │               │                          │  238,950,689 │  47,564,864 │   133,778,816 │  8,359,726,986 │  8,780,021,355 │    $6908.88 │
└──────────┴───────────────┴──────────────────────────┴──────────────┴─────────────┴───────────────┴────────────────┴────────────────┴─────────────┘

So that’s over $4.5K worth of tokens, for $400 in total actual dollars spent. OMG the labs subsidized over $4K of use in just one month for just one subscriber, this is unsustainable! Or is it?
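As a rough sanity check on those figures, here is a minimal Python sketch that re-adds the two June 2026 agent rows from the report and compares the result with the $400 in subscription fees. It takes the Cost (USD) column exactly as ccusage reported it rather than recomputing from per-token prices, since those prices are not listed here. The `rows` layout and the `summarize` helper are hypothetical names invented for this illustration and are not part of ccusage.

```python
# Figures copied from the 2026-06 rows of the ccusage report above.
# The dict layout and function name are illustrative assumptions.
rows = {
    "Claude": {"input": 4_824_550, "output": 25_271_829,
               "cache_create": 115_063_898, "cache_read": 3_781_169_956,
               "cost_usd": 3568.09},
    "Codex": {"input": 65_209_007, "output": 4_776_133,
              "cache_create": 0, "cache_read": 972_602_240,
              "cost_usd": 955.44},
}

SUBSCRIPTION_USD = 200 + 200  # Claude Max + ChatGPT Pro, per month

TOKEN_COLUMNS = ("input", "output", "cache_create", "cache_read")


def summarize(rows, subscription_usd):
    """Total the per-agent rows and compare them with the flat subscription fee."""
    total_tokens = sum(r[col] for r in rows.values() for col in TOKEN_COLUMNS)
    api_cost = round(sum(r["cost_usd"] for r in rows.values()), 2)
    cache_read = sum(r["cache_read"] for r in rows.values())
    return {
        "total_tokens": total_tokens,
        "api_cost_usd": api_cost,
        # What the usage would have cost per token, minus what was actually paid.
        "implied_subsidy_usd": round(api_cost - subscription_usd, 2),
        # How many times the subscription price the metered cost would be.
        "cost_multiple": api_cost / subscription_usd,
        # Fraction of all tokens that were cache reads.
        "cache_read_share": cache_read / total_tokens,
    }


if __name__ == "__main__":
    for key, value in summarize(rows, SUBSCRIPTION_USD).items():
        print(f"{key}: {value}")
```

Under those assumptions the per-agent rows add back up to the "All" row, the metered cost is a bit over eleven times the subscription price, and the large majority of the tokens are cache reads.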

Over in the comment threads on Hacker News, there’s constant bickering over whether or not per-token API pricing is profitable right now. Many commenters claim that, even at the current per-token pricing, VC money is subsidizing inference in an Uber-like play for market share at any cost, and that prices must necessarily increase once investors remember that realized gains are a thing that interests them. Others claim that actually, comparing model performance and token price over time, inference keeps getting cheaper and will continue to do so as models get even more capable. I’ve seen plausible-sounding claims that inference is actually profitable at current pricing, and is used to pay for the (possibly not yet profitable) training of new models. Then there are always a few guys who run some tiny Qwen model locally on their Macs and claim that’s all they need and thus frontier labs are cooked, apparently willfully ignorant of what actual SOTA models can do.

Suffice it to say, I have no idea of the economics that OpenAI and Anthropic are working with, nor do I have any idea how those economics will evolve over time. I also don’t think it matters either way, because whether or not the labs will need to jack up pricing to reach break-even, I don’t see any long-term incentive for them to keep subsidizing tokens as generously as they do now.

One doesn’t need any fancy mathematical analysis to see that the adoption of these tools is so rapid, enthusiastic, and (in many cases) mindless, that the vast majority of corporate activity and especially software engineering activity is already becoming utterly dependent upon GenAI tools, with no sign of that adoption slowing down. Already, it’s not at all unusual for an engineering team to have no idea what is in their codebase, having vibe-coded themselves into such a hazy tenuous grasp of their own product that one could be forgiven for wondering what intoxicant they are all smoking. Many teams literally cannot do their jobs without access to coding agents and the SOTA models that power them, with token budgets well into the billions. And that’s just engineering. If the proliferation of AI slop befouling my inbox (and, sadly, the web as a whole) is anything to go by, I may be among the last humans alive producing text with my own thinkmeat, fleshsticks, and ocular juice bags.

You can agree or disagree with my mostly-negative framing of the situation, but I don’t think a reasonable person can refute the mere fact that GenAI tooling is being adopted much faster than any technology humanity has yet conceived, to the point that I think many orgs would be unable to function without their clankers. What do you suppose are the odds that Anthropic and OpenAI will continue to leave money on the table once their tentacles are wrapped around all of an organization’s essential functions?

You might well counter that market forces will prevent that from happening, since if one of the labs jacks up prices then the others will just take more of the market. After all, the models are not that different in capability. Sometimes Anthropic’s is best, other times OpenAI’s is a bit better, but there’s not anything you can do with one model and not the other. And that’s not even taking into account the open-weight models, especially the Chinese labs with their definitely-not-distilled-from-US-models offerings. You might then sit back righteously and bask in the smug glow of your own brilliance, like the insufferable little strawman that you are.

Far be it from me to let a strawman argument go by unremarked. You see, that’s just not how enterprises buy technology. If you were around during the transition from on-prem virtualization to cloud workloads, you probably know what I’m talking about. Is it pants-on-head absurd to move your on-prem VMs one-for-one into EC2 instances so you can pay 10x the cost? Yes, if you look at it the way you look at your own personal spending. But almost everyone went all-in on cloud, and continues to do so, some more mindlessly and profligately than others. Why is that? And why didn’t Azure suddenly offer VM compute for 10% of what Amazon charged, thereby killing the AWS business? The answer, then as now, is much the same.

The reasons are complex, but my take boils down to the fact that decisions are usually made on vibes, the principal-agent problem is very much a thing, and keeping the systems that you utterly depend on vendor-neutral is a frustratingly hard problem that almost no one has the desire, discipline, or budget to solve. So once an org is utterly dependent upon, say, Anthropic’s AI tooling (which will be way more than just inference API endpoints; they are smart enough to make sure of that!), it will be very hard for them to switch to, say, OpenAI (and anyway they could only contemplate it at the end of the enterprise license term, and that should be multiple years if Anthropic’s sales team have half a brain).

All of which brings me back to the premise that started this post. I’m trying to take full advantage of the cheap LLM subscriptions while I still can, because I do not expect this to last. The labs will enshittify, rents will be sought, and intelligence will absolutely be metered.

I’m not saying that you or I will lose access to SOTA models. Continuing the cloud analogy, thanks to usage-based pricing models I can rent an hour on a server that would cost ~$10K to buy, for the price of a fancy coffee. If I only need it for an hour, this is a huge win compared to a capex-based hosting model in which I have to buy the whole server. But if I want that ~$10K server for a whole month, it’ll cost me ~$5K/mo in AWS. Likewise, I can get a million Opus 4.8 output tokens for just $25. But look at my usage above: I don’t need 1M tokens, I need 70M input and 30M output tokens in a month! And no doubt usage will go up in July!
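To put rough numbers on both halves of that analogy, here is a small Python sketch using only the figures quoted in this post. It assumes, purely for illustration, that every June output token was billed at the quoted $25 per million Opus 4.8 rate. In reality the month's usage was spread across several models, and input and cache tokens are left out because their prices are not given here. The constant and function names are hypothetical.

```python
# All dollar figures are the approximate ones quoted in the text above.
OUTPUT_PRICE_PER_MTOK = 25.0         # USD per 1M Opus 4.8 output tokens
MONTHLY_OUTPUT_TOKENS = 30_047_962   # June 2026 output tokens from the ccusage report
SERVER_PURCHASE_USD = 10_000         # rough cost to buy the server outright
SERVER_RENTAL_USD_PER_MONTH = 5_000  # rough cost to rent it for a full month


def metered_output_cost(tokens, price_per_mtok):
    """Cost of `tokens` output tokens at a flat per-million-token price.

    Assumption: one price for all tokens, which ignores the real model mix.
    """
    return tokens / 1_000_000 * price_per_mtok


def breakeven_months(purchase_usd, rental_usd_per_month):
    """Months of continuous rental after which buying would have been cheaper."""
    return purchase_usd / rental_usd_per_month


if __name__ == "__main__":
    cost = metered_output_cost(MONTHLY_OUTPUT_TOKENS, OUTPUT_PRICE_PER_MTOK)
    months = breakeven_months(SERVER_PURCHASE_USD, SERVER_RENTAL_USD_PER_MONTH)
    print(f"Output tokens alone, metered: ${cost:,.2f}")
    print(f"Rent-vs-buy break-even: {months:.1f} months")
```

The point of the sketch is the shape of the curve rather than the exact dollars: metered pricing is a bargain for occasional use and gets expensive quickly for sustained use.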

Right now, I don’t care what tokens cost. I don’t wonder if the thing I’m going to have the clanker do is worth the cost. I don’t wonder which reasoning level is right, or whether the task I have in mind is something Haiku can handle or if it merits Opus. I don’t have to constantly prune my AGENTS.md looking for any way I can spend fewer tokens and keep the same performance. I don’t have to argue online about which approaches to tooling and which skills are worth their cost, or whether the new version of my preferred coding agent harness sacrifices effectiveness as part of its prompt optimizations.

My thesis is that, soon enough, I’m going to have to deal with those things, and that will mark the end of this glorious moment in which my capabilities are vastly expanded for minimal cost. I would love to be proven wrong on this, but I’m not counting on it. That’s why I’m making software hay while the AI sun shines.