A comprehensive analysis of approximately 400,000 interactive programming sessions conducted between October 2025 and April 2026 reveals a fundamental shift in the human-AI collaboration model. The data indicates that in agentic programming, the division of labor is distinct: humans predominantly determine 'what to do' through planning decisions, while the AI agent assumes responsibility for 'how to do it' via execution. This dynamic suggests that AI is increasingly absorbing implementation tasks such as writing code, modifying files, running commands, and debugging, whereas goal setting and outcome judgment remain firmly in human hands. Data compiled by Woofun AI shows that the effectiveness of these tools does not strictly correlate with a user's background as a professional programmer. Instead, success rates in code generation tasks for non-technical professionals in fields like law, finance, management, and research are nearly on par with those of software engineers. The critical differentiator is the user's understanding of the specific problem they aim to solve, implying that AI reduces the implementation threshold without lowering the judgment threshold.
The study categorizes interactions into nine work modes, with approximately 56% of sessions involving direct coding activities, including building new features, fixing broken code, testing, and orchestrating automation pipelines. Another 17% involve operating software through deployment and configuration, while 14% focus on planning and exploration. Over the seven-month observation period, the nature of these tasks evolved significantly: the proportion of sessions dedicated to debugging decreased by almost half as usage shifted toward end-to-end agentic work such as deploying code, analyzing data, and generating non-code documentation.
Concurrently, the estimated economic value of typical tasks increased by an average of 27%, with building, operating, and fixing tasks seeing value increases of 43%, 34%, and 32% respectively. This trend indicates that AI assistants are being integrated into more complex and valuable workflows beyond simple script generation.
To quantify the autonomy and decision-making dynamics, researchers developed a privacy-preserving classifier to attribute decisions within sessions. On average, humans make about 70% of planning decisions but only 20% of execution decisions. In a typical session cycle, which averages four rounds of interaction, a single user prompt triggers the AI to perform approximately 10 actions, sometimes exceeding 100 actions, and generate around 2,400 words of output. The volume of work executed by the AI between user checks is heavily dependent on the level of control retained by the user. When users retain control over execution, the AI performs fewer actions per round, whereas when the AI holds planning control, it undertakes significantly more actions. Woofun AI notes that the strength of a user's domain-specific knowledge directly correlates with the workload triggered for the AI, as experts provide more precise instructions that allow the agent to execute longer, more complex action chains.
The relationship between user expertise and session outcomes was quantified using a five-level scale ranging from novice to expert, based on instruction precision, validation requests, and correction patterns. In novice conversations, each prompt triggers about 5 actions and 600 words of output, whereas in expert conversations each prompt triggers about 12 actions and 3,200 words of output. This gap persists across all task types and value ranges. Success is measured by 'Verified Success', which requires both a positive judgment and hard evidence such as git commits or passing tests, and the resulting rates show a clear gradient. Novice sessions achieved a validated success rate of 15%, while intermediate and expert sessions reached rates between 28% and 33%.
Notably, the gap between intermediate and expert users is not significant, suggesting that sufficient domain proficiency is enough to leverage these tools effectively, even without deep technical mastery.
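The 'Verified Success' criterion is a conjunction of two conditions, which a few lines of Python can make concrete. This is only a sketch under stated assumptions, not the study's classifier: the `Session` fields and the function name are hypothetical, and the two evidence flags simply mirror the two examples the text names (other kinds of hard evidence may also count). The ratios at the end use only the novice and expert figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical fields; the study's actual schema is not described.
    positive_judgment: bool  # the outcome was judged successful
    has_git_commit: bool     # example of hard evidence named in the text
    tests_passed: bool       # example of hard evidence named in the text


def is_verified_success(s: Session) -> bool:
    """True only with a positive judgment AND at least one piece of hard evidence."""
    hard_evidence = s.has_git_commit or s.tests_passed
    return s.positive_judgment and hard_evidence


# Figures quoted in the text for novice and expert conversations.
novice = {"actions": 5, "words": 600}
expert = {"actions": 12, "words": 3200}

# How much more work an expert conversation triggers, relative to a novice one.
action_ratio = expert["actions"] / novice["actions"]  # 2.4
word_ratio = expert["words"] / novice["words"]        # roughly 5.3

print(f"actions: {action_ratio:.1f}x, output words: {word_ratio:.1f}x")
```

On these quoted figures, the gap in output volume (about 5.3x) is wider than the gap in action count (2.4x).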
Occupational analysis inferred from session contexts revealed that while 'Computer and Mathematical Occupations' remain the largest user group, the fastest-growing non-software groups include management, sales, and legal professionals. In code generation sessions, software-related roles had a validated success rate of about 30%, compared to 26% for other occupations.
However, under a more lenient definition of success, the gap narrows to just one percentage point, with 89% of software users and 88% of non-software users achieving at least partial success. Management occupations actually recorded the highest validated success rate, slightly surpassing software engineering, potentially reflecting transferable skills in directing intelligent agents. Woofun AI analysis suggests that the ability to steer an AI toward success stems more from mastering a specific domain than from the ability to write code, allowing individuals with business context to accomplish technical work previously out of reach.
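The narrowing of the gap between software and non-software users can be checked directly from the quoted rates. The sketch below uses only the percentages given above, which the text itself presents as approximate; the variable and function names are illustrative.

```python
# Success rates (percent) quoted in the text for code generation sessions.
validated = {"software": 30, "non_software": 26}
lenient = {"software": 89, "non_software": 88}


def gap_points(rates: dict) -> int:
    """Absolute gap between the two groups, in percentage points."""
    return rates["software"] - rates["non_software"]


def relative_gap(rates: dict) -> float:
    """Gap expressed as a fraction of the software group's rate."""
    return gap_points(rates) / rates["software"]


print(gap_points(validated))               # 4
print(gap_points(lenient))                 # 1
print(round(relative_gap(validated), 3))   # 0.133
print(round(relative_gap(lenient), 3))     # 0.011
```

In relative terms, the four-point gap under the strict definition is about 13% of the software group's rate, while the one-point gap under the lenient definition is about 1%.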
The implications for the labor market are profound, as the data indicates that AI programming tools are absorbing task-oriented work while rewarding those who comprehend the underlying problems. Novice users facing issues are significantly more likely to abandon sessions, with an abandonment rate of 19% compared to 5% to 7% for other groups. This highlights that the value of expertise lies in the ability to guide the agent back on track when errors occur. As models and user behaviors evolve, the landscape will continue to shift. If the proportion of non-software professionals successfully completing coding sessions continues to rise, it may signal that software production is becoming a ubiquitous part of everyday work across various fields, fundamentally altering the most valued skills in the future economy.