The Unit of Work Is the Agent-Hour

OpenAI published usage data from inside its own walls. Its 99th-percentile employees now run more than 60 hours of agent work every single day. Sixty hours inside a 24-hour day is not overtime. It is a different unit of work.

The number breaks your intuition on purpose. You cannot fit 60 hours of labor into a day if labor is something a human performs sequentially with two hands and one attention span. You can fit 60 agent-hours into a day trivially, because agent-hours run in parallel and the human is no longer the one doing them. That single figure marks the boundary between the old model of work and the one replacing it.
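
The arithmetic behind that figure is easy to check. Here is a minimal sketch, assuming agent-hours are simply the summed running time of every agent session. The function name and the 8-hour working day are illustrative assumptions, not figures from OpenAI's report.

```python
def average_concurrency(agent_hours: float, wall_clock_hours: float) -> float:
    """Average number of agents that must be running at once to log
    `agent_hours` of agent work within `wall_clock_hours` of real time."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return agent_hours / wall_clock_hours


# 60 agent-hours spread across a full 24-hour day
print(average_concurrency(60, 24))  # 2.5
# The same 60 agent-hours packed into a hypothetical 8-hour working day
print(average_concurrency(60, 8))   # 7.5
```

Any result above 1.0 is impossible for a single sequential worker, which is the sense in which 60 hours a day is a different unit and not overtime.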

The rest of OpenAI's report fills in the shape. The average employee now produces 85% of their output through Codex: delegated, not typed. Across the company, agents already account for 99.8% of weekly output tokens. The humans are still deciding what gets built and whether it is right. They have almost entirely stopped being the ones who produce it.

The work did not get faster. It went parallel.

This is the distinction people miss, and it changes every downstream conclusion. "Faster" is a story about the same sequential process compressed in time: the same person doing the same task in less time. That is a linear improvement, and linear improvements have ceilings set by the human at the center.

Parallel is a different regime entirely. The human stops executing tasks one after another and starts dispatching many at once, each running independently while attention moves elsewhere. The constraint is no longer how fast you work. It is how many streams of work you can start, supervise, and accept. The team-by-team growth data makes the pattern concrete: research teams show 56 times more agent use than seven months ago, customer support 32 times, engineering 27 times, and even legal 13 times. Those are not efficiency gains. Those are step changes in how many things happen at once.

The human stops doing the work and starts approving it. That is not a productivity upgrade. It is a change in what a person is for.

Old company, new company

The old company had a simple production function: headcount times hours. Output was labor, and labor was people multiplied by the time each one worked. It scaled linearly and it scaled by hiring. If you wanted more output, you added heads, onboarded them, managed them, and absorbed the coordination cost of every new person. The ceiling was real, and everyone knew where it was.

The new company has a different production function: agents times parallelism. Output is a function of how many agents you can run and how many you can run at once, and there is no ceiling you can staff your way to. This is not a rhetorical flourish. It is a structural claim about where the limit sits. In the old company, the binding constraint was people. In the new one, the binding constraint is your ability to specify work clearly and verify it correctly at volume. Those are different muscles, and most organizations have only trained the first: hiring and managing people.
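
The two production functions can be set side by side. This is a toy model under stated assumptions, not a formula from OpenAI's report. Output is counted in abstract work-hours, "parallelism" is read as the number of agents each person keeps running at once, and every name and number below is hypothetical.

```python
def old_output(headcount: int, hours_per_person: float) -> float:
    """Old company: person-hours per day, linear in headcount."""
    return headcount * hours_per_person


def new_output(people: int, agents_per_person: float, hours: float) -> float:
    """New company: agent-hours per day. The lever is how many agents
    each person can keep running in parallel, not how many people exist."""
    return people * agents_per_person * hours


# Ten people working eight hours each, by hand
print(old_output(10, 8))       # 80
# The same ten people, each steering 7.5 agents in parallel (assumed figure)
print(new_output(10, 7.5, 8))  # 600.0
```

In the first function the only way to double output is to double headcount or hours. In the second, headcount can stay fixed while `agents_per_person` grows, which is why counting people stops predicting output.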

We have seen this abstraction before

The move is not unprecedented; it is the same move computing has made twice already. Compilers did it to assembly. Programmers stopped hand-writing the instructions the machine executes and started writing intent, letting the compiler generate the instructions. The programmer's job moved up a level, from producing machine code to specifying behavior and checking the result. Nobody mourns hand-written assembly.

The cloud did it to servers. Operations teams stopped racking physical machines and started declaring the infrastructure they wanted, letting the provider produce it. The unit of work stopped being the server you touched and became the capacity you specified. In both cases the human did not become less important. The human moved to a higher level of abstraction and became responsible for more, because each unit of their attention now commanded far more underlying work. Agents are the third instance of the same pattern, applied to knowledge work itself.

What the agent-hour measures, and what breaks

When the unit of work is the agent-hour, the metrics that ran the old company stop describing the new one. Headcount measured the old company because headcount was the input that produced output. Throughput measures this one, because output is now decoupled from the number of people. A ten-person team running thousands of agent-hours a day is not a ten-person team in any meaningful sense. It is a throughput engine with ten people steering it. Counting the people tells you almost nothing about what it produces.

Two things break as this lands, and both are worth naming before they surprise you:

Verification becomes the bottleneck. When agents produce 99.8% of output, the scarce human resource is the judgment that accepts or rejects it. I have argued this at length in "verification cost is the new bottleneck": the constraint moves from producing work to confirming it is correct, and that cost does not fall as fast as generation cost.
Org charts stop mapping to output. If throughput is agents times parallelism, then seniority, span of control, and headcount budgets are measuring the wrong thing. The high-leverage person is the one who specifies and verifies the most agent-hours, not the one who manages the most people.
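
The verification bottleneck described above can be sketched as a simple cap. This assumes, purely for illustration, that each agent-hour of output needs a fixed amount of human review time before it counts as accepted. The parameter names and the 15-minute review cost are hypothetical.

```python
def accepted_agent_hours(generated: float, reviewers: int,
                         review_hours_each: float,
                         review_hours_per_agent_hour: float) -> float:
    """Agent-hours accepted per day when every agent-hour of output
    must be reviewed by a human before it counts."""
    if review_hours_per_agent_hour <= 0:
        raise ValueError("review_hours_per_agent_hour must be positive")
    # Total agent-hours the humans are able to review in a day
    review_capacity = reviewers * review_hours_each / review_hours_per_agent_hour
    # Accepted output is capped by whichever side is smaller
    return min(generated, review_capacity)


# 600 agent-hours generated, 10 reviewers with 8 hours each,
# 0.25 review-hours (15 minutes) per agent-hour
print(accepted_agent_hours(600, 10, 8, 0.25))   # 320.0
# Doubling generation changes nothing: review is the binding constraint
print(accepted_agent_hours(1200, 10, 8, 0.25))  # 320.0
# Halving review cost lifts the cap above what is generated
print(accepted_agent_hours(600, 10, 8, 0.125))  # 600
```

Under these assumptions, adding agents past the review capacity produces unverified work rather than output, so the leverage sits in lowering the review cost per agent-hour.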

The cost of delegation compounds too. Every agent-hour is inference, and inference at this volume is a cost curve, not a fixed line. That is why the biggest operators are moving to own the silicon underneath it, a shift I traced in "the biggest customer becomes the competitor". The agent-hour is both the new unit of work and the new unit of spend, and the two are the same number read from opposite ends.

How to operate when the unit changes

If the unit of work is the agent-hour, the skills that matter shift accordingly. The premium moves to specification, decomposition, and verification, the three things a human still does that an agent cannot yet do for itself. Writing a clear enough instruction that an agent produces the right thing is a skill. Breaking a large goal into parallelizable pieces is a skill. Judging correctness at the rate agents generate output is the scarcest skill of all, and the one most organizations have not started training.

The forward-looking version is uncomfortable and worth sitting with. If output is agents times parallelism, then the competitive gap between two companies is no longer a hiring gap. It is a gap in how well each one specifies and verifies work at scale. That gap is invisible on an org chart and enormous in throughput. The company that learns to run agent-hours well will out-produce the company that keeps counting heads, and the head-counting company will not understand why until it is far behind.

Key takeaways

OpenAI's 99th-percentile employees run 60+ agent-hours per day, and agents produce 99.8% of the company's weekly output tokens. Work did not get faster; it went parallel.
The old production function was headcount times hours, which scales linearly by hiring. The new one is agents times parallelism, with no ceiling you can staff your way to.
Compilers did this to assembly and the cloud did it to servers: the human moves up a level of abstraction and becomes responsible for more.
The human stops producing work and starts approving it, which makes verification the scarce resource.
Headcount measured the old company; throughput measures this one. Org charts stop mapping to output.
The competitive gap is now a specification-and-verification gap, invisible on an org chart and enormous in throughput.

The industrial era measured work in hours because a person was the engine. That era is closing. The unit of work is no longer the hour; it is the agent-hour, and the companies that learn to count it will look nothing like the ones that keep counting people. For the wider argument about where capability, cost, and control are heading, start with the manifest and the Joule Wars thesis. Measure throughput, not headcount, and you will see the new company before it is obvious.