TODAY IN 30 SECONDS

Welcome back. Today's insights focus on how automation is reshaping operations across various sectors.

  • Automation Tools: Companies are increasingly adopting automation tools to enhance efficiency and reduce operational costs.

  • AI-Driven Insights: Businesses leveraging AI for data analysis report improved decision-making capabilities.

  • Integration Challenges: Organizations are facing integration challenges as they implement new AI systems alongside legacy platforms.

  • Workforce Adaptation: Teams are adapting to new automated workflows, emphasizing the need for training in AI literacy.

  • Regulatory Considerations: As AI use expands, companies are navigating evolving regulatory frameworks to ensure compliance.

LEAD SIGNAL

NVIDIA's Vera CPU Is Built for the Agent Era, Not the Cloud Era

NVIDIA has published initial benchmark results for its Vera CPU, a processor built specifically around what the company calls "agentic AI" workloads. The chip packs 88 custom Olympus cores, 1.2TB/s of memory bandwidth, and a high-speed on-chip fabric into a 450-watt thermal envelope. Per the Phoronix review cited in NVIDIA's announcement, Vera delivered strong performance across code compilation, file compression, video transcoding, Python, Java, and database management. Those aren't random benchmarks: they're the exact tasks AI agents run constantly when orchestrating software stacks, executing code in sandboxes, and querying data stores.

The signal here is architectural, not just competitive. For years, the AI infrastructure conversation has centered almost entirely on GPU performance for model training and inference. Vera is NVIDIA making a public argument that the CPU tier matters now too. Agentic AI (meaning AI systems that take multi-step actions autonomously, rather than just answering a question) puts a very different load on infrastructure than a simple query-response model does. Branch-heavy logic, parallel orchestration, sustained all-core activity: these are CPU problems, not GPU problems. The industry is starting to catch up to what practitioners running real agent pipelines already know: the bottleneck moves around.

For operators running AI-assisted workflows at the 10-200 person scale, none of this requires a hardware purchase decision today. What it does require is updated mental models about where your cost and latency problems actually live. If your team is running agents through cloud APIs, your infrastructure choices are abstracted away. But as agentic workloads mature and get pushed toward on-premise or private cloud deployments for cost or compliance reasons, the CPU tier becomes a real variable. The companies that understand this architecture now will make smarter procurement and vendor decisions later. So what: The GPU obsession in AI infrastructure is real but incomplete, and Vera is the clearest signal yet that the full stack is getting a rethink.

WHAT HAPPENED

NVIDIA released initial benchmark data for its Vera CPU, built around 88 custom Olympus cores and 1.2TB/s memory bandwidth, targeting the orchestration and code-execution demands of agentic AI workloads.

WHY IT MATTERS

It suggests that agentic AI creates a distinct infrastructure tier, separate from GPU-heavy inference: one where CPU throughput, memory bandwidth, and sustained all-core performance are the binding constraints.

THE BREAKDOWN

Operators running agent pipelines today are largely insulated by cloud abstraction, but understanding this infrastructure shift matters when evaluating vendor lock-in, pricing models, and future deployment options.

Bottom line: You don't need to buy hardware, but you do need to understand that agentic AI has a full-stack cost structure, and the CPU tier is no longer an afterthought.

LATEST DEVELOPMENTS

Development

AI Engineering Is Growing. Software Engineering Hiring Hasn't Recovered.

The Pragmatic Engineer's 2026 job market analysis, built on exclusive data from hiring platforms tracking open roles across Big Tech and top startups, paints a picture that defies the usual boom-or-bust framing. AI engineering roles are up. Traditional software engineering hiring remains soft. The paradox from 2025 hasn't resolved: job seekers still report poor response rates, while hiring managers still say finding good candidates is harder than it should be. Both things are true simultaneously. What's shifting is the composition of demand, not its overall volume. For operators running teams or making hiring decisions, that distinction matters: the market isn't recovering uniformly, it's sorting. Roles tied to AI tooling and infrastructure are attracting attention; general engineering headcount is not bouncing back at the same pace.

So what: Watch whether the AI engineering surge is expanding total engineering budgets at your vendors and tool providers, or simply cannibalizing headcount from adjacent roles. The answer will tell you a lot about where product investment is actually flowing.

Development

The Agentic Gap: 85% Want It, 76% Can't Support It

According to a Celonis report cited in MIT Technology Review, 85% of organizations aim to be fully agentic within three years. Yet, 76% admit their current operations and infrastructure can't support this ambition. The gap isn't about technology. It's about design. Companies are cramming AI agents into workflows meant for humans. PwC UK's global CTO for workforce consulting compares it to "adding sticky tape" to something already breaking. The agents aren't failing; the operating model is. Early deployments in customer service, HR, and sales show AI agents could speed up business processes by 30–50% and cut low-value work time by 25–40%. But only if the organization is rebuilt around the agents, not vice versa.

So what: Before adding another agent to your stack, map whether the workflow was designed for manual human tasks. If it was, the agent will hit the same ceiling.

Infrastructure

NVIDIA's Vera CPU Is Built for the Workload Running Your Agents

Most CPU conversations focus on raw compute. Vera reframes the question around what agentic AI actually does all day: branch-heavy runtimes, sandboxed code execution, database queries, and orchestration across large software stacks. NVIDIA's new Vera CPU packs 88 custom Olympus cores, 1.2TB/s of memory bandwidth, and a high-speed on-chip fabric into a single-socket design rated at 450 watts. Initial benchmarks published by Phoronix cover workloads that map closely to real agent infrastructure: code compilation, Python and Java runtimes, file compression, video transcoding, and database management. Performance held up with all 88 cores active simultaneously, which is exactly the condition where most CPU designs falter under sustained agentic load. Memory power came in under 30 watts within that envelope.

So what: If you're sizing or advising on AI infrastructure, watch whether sustained all-core performance under agentic load becomes the benchmark criterion that displaces peak single-core scores as the standard procurement question.

THE LENS

NVIDIA's Vera CPU Is Built for the Agent Era

Source: NVIDIA Blog · Vera CPU Benchmark Coverage · May 2026

NVIDIA's first public Vera CPU benchmarks (per Phoronix testing) show a chip purpose-built for the workloads that AI agents actually run: code compilation, database queries, runtime orchestration, and data compression. These aren't GPU tasks. They're the CPU-heavy glue work that every agentic pipeline depends on.

What nobody's telling you: Per the source material, the bottleneck in most AI deployments isn't the model inference chip. It's the CPU.

AI finds the signal. Human judgment sharpens it. Same workflow we'd build for your team.

LAUNCH PAD

⚙️

Claude Code

AI Tool · Now available

Claude Code changes how you interact with LLMs. Programmable agents with memory and custom commands boost coding efficiency. That's it.

🎟️

TechCrunch Conference 2026

Event · Early Bird Tickets

Grab your spot in San Francisco. Save up to $410 on tickets before May 29. Don't wait.

🌍

Google's Genie World Model

Simulation Tool · Now available

With Street View integration, create immersive simulations. Perfect for robotics and gaming. Dive in.

TOOL WE USE

🔮

Gemini API

AI Inference / API

Google's Gemini API offers two inference tiers: Flex and Priority. Flex trades latency for cost. Priority speeds up responses when throughput is key. For operators running LLM (large language model) calls at volume, this cost-speed dial is what's been missing from most APIs. Full stop.

Most teams stick to one configuration. They overpay on batch jobs or underperform on real-time tasks. Tiered inference makes that a choice, not a compromise.
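The choice between tiers can be sketched as a small routing rule. This is an illustrative sketch only: the tier names "flex" and "priority" come from the description above, while the `Job` fields, the `choose_tier` function, and the 60-second cutoff are hypothetical. It makes no call to the Gemini API and does not show how a tier is actually selected in a request.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    interactive: bool        # True when a person is waiting on the response
    deadline_seconds: float  # how long the result is allowed to take


# Hypothetical cutoff: jobs that can wait at least this long use the cheaper tier.
FLEX_MIN_DEADLINE_SECONDS = 60.0


def choose_tier(job: Job) -> str:
    """Return 'priority' for latency-sensitive jobs and 'flex' for everything else."""
    if job.interactive or job.deadline_seconds < FLEX_MIN_DEADLINE_SECONDS:
        return "priority"
    return "flex"


jobs = [
    Job("nightly-summary", interactive=False, deadline_seconds=3600),
    Job("chat-reply", interactive=True, deadline_seconds=5),
]
plan = {job.name: choose_tier(job) for job in jobs}
print(plan)
```

The point of the sketch is that the decision lives in one place: batch work defaults to the cheaper tier, and only jobs with a real latency requirement pay for speed.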

REPORTS & RECIPES

Route Agent Workloads to the Right Infrastructure Before You Scale

Most ops teams running LLM-based agents hit a wall. Not because the model's weak, but because the infrastructure can't handle the orchestration layer. Agent pipelines are CPU-heavy: they compile code, query databases, compress outputs, and coordinate runtimes in rapid sequence. If your host environment can't sustain that load across all cores at once, your agents queue up and slow down.

  1. Audit your current agent tasks: List every step your automation stack runs between LLM calls. Flag anything CPU-bound: database queries, file compression, script execution, runtime coordination. These are your bottleneck candidates.

  2. Separate model inference from orchestration: Run your LLM calls on GPU-focused infrastructure. Route the surrounding orchestration logic (scheduling, data processing, tool calls) to CPU setups with high memory bandwidth. Don't mix them on the same node.

  3. Test sustained load, not peak load: Benchmark your orchestration layer with all cores active simultaneously. Single-core burst numbers are misleading for agentic workloads. What matters is consistent throughput under concurrent agent sessions.

  4. Right-size before you scale replicas: Adding more agent instances on under-specced hardware multiplies the bottleneck. Fix the per-node architecture first, then scale horizontally.
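The audit in step 1 can be roughed out by comparing CPU time to wall-clock time for each step. This is a minimal sketch under stated assumptions: the 0.5 cutoff, the function names, and the two stand-in steps are all hypothetical, and `time.process_time()` only counts CPU used by the current process, so steps that shell out to other programs would need a different measurement.

```python
import time
import zlib
from typing import Callable

# Hypothetical cutoff: a step that spends at least half its wall time on CPU is flagged.
CPU_BOUND_RATIO = 0.5


def classify(cpu_seconds: float, wall_seconds: float) -> str:
    """Label a step by the share of its wall-clock time spent on the CPU."""
    if wall_seconds <= 0:
        return "unknown"
    if cpu_seconds / wall_seconds >= CPU_BOUND_RATIO:
        return "cpu-bound"
    return "waiting"


def profile_step(step: Callable[[], object]) -> tuple[float, float]:
    """Run one step and return (cpu_seconds, wall_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    step()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return cpu_seconds, wall_seconds


def compress_output() -> bytes:
    # Stand-in for a CPU-heavy step between LLM calls.
    return zlib.compress(b"agent output " * 200_000, level=9)


def wait_on_api() -> None:
    # Stand-in for a network-bound step such as waiting on a model response.
    time.sleep(0.05)


for name, step in [("compress_output", compress_output), ("wait_on_api", wait_on_api)]:
    cpu, wall = profile_step(step)
    print(f"{name}: cpu={cpu:.3f}s wall={wall:.3f}s -> {classify(cpu, wall)}")
```

Steps labeled cpu-bound are the bottleneck candidates from step 1. Steps labeled waiting are limited by something else, usually the network or the model endpoint.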

Result: Agent pipelines that stop stalling mid-task, with orchestration infrastructure matched to the actual workload profile rather than inherited from a general-purpose server setup.
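The sustained-load test from step 3 can be sketched with the Python standard library. Treat this as a rough illustration, not a real benchmark: `busy_work` is a hypothetical stand-in for orchestration work, the task counts are arbitrary, and the timing includes process startup, which penalizes short runs.

```python
import math
import os
import time
from concurrent.futures import ProcessPoolExecutor


def busy_work(iterations: int) -> float:
    """Hypothetical CPU-bound task standing in for one unit of orchestration work."""
    total = 0.0
    for i in range(1, iterations + 1):
        total += math.sqrt(i)
    return total


def tasks_per_second(workers: int, tasks: int, iterations: int) -> float:
    """Run `tasks` copies of busy_work across `workers` processes and return throughput."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(busy_work, [iterations] * tasks))
    return tasks / (time.perf_counter() - start)


def scaling_efficiency(single_rate: float, all_core_rate: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved when every worker is busy."""
    return all_core_rate / (single_rate * workers)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    single = tasks_per_second(1, tasks=4, iterations=200_000)
    loaded = tasks_per_second(cores, tasks=4 * cores, iterations=200_000)
    efficiency = scaling_efficiency(single, loaded, cores)
    print(f"{cores} workers: {efficiency:.0%} of ideal scaling")
```

An efficiency well below 100% means the node loses throughput per core as more cores get busy. That is the number to fix before adding replicas, and one that a single-core burst figure will not reveal.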

Signals

  • Another customer of Delve has experienced a significant security incident, raising further concerns about the company's compliance standards. · TechCrunch

  • Meta plans to lay off 10 percent of its workforce, cutting around 8,000 jobs as it shifts focus following significant AI investments. · The Verge

  • Despite fears of AI-induced job losses, current data shows no significant impact on employment in white-collar roles. · MIT Technology Review

