ShipTalk - SRE, DevOps, Platform Engineering, Software Delivery
ShipTalk is the podcast series on the ins, outs, ups, and downs of software delivery. The series dives into the vast ocean of software delivery, bringing aboard industry tech leaders, seasoned engineers, and insightful customers to navigate the currents of the ever-evolving software landscape. Each session explores the real-world challenges and victories encountered by today’s tech innovators.
Whether you’re an Engineering Manager, a Software Engineer, or simply an enthusiast of software delivery, you’ll gain invaluable insights and equip yourself with the knowledge to sail through the complex waters of software delivery.
Our seasoned guests are here to share their stories, shining a light on the do’s, don’ts, and the “I wish I knew” moments of the tech world. If you would like to be a guest on ShipTalk, send an email to podcast@shiptalk.io. Be sure to check out our sponsor’s website, Harness.io.
Special ShipTalk Episode from DND NYC 2026
This is a special episode where we sat down with the speakers at DevOpsNotDead NYC 2026 to hear their perspectives on how AI is transforming software delivery.
Connect with our guests:
https://www.linkedin.com/in/diamondbishop
https://www.linkedin.com/in/sadiojonas/
https://www.linkedin.com/in/joshuamlee/
https://www.linkedin.com/in/akash-thakur-00367a155/
https://www.linkedin.com/in/mahender-mangalasri/
https://www.linkedin.com/in/jamesbrookbank/
0:00:00 — Dewan Ahmed:
Good morning. Happy Friday. My name is Dewan Ahmed, your host for the ShipTalk podcast, and we're bringing you a special episode from New York. Now, we've talked about agents fixing code. Sometimes agents are also deleting all the emails in our inbox.
0:00:17 — Dewan Ahmed:
But who's watching the agents? Our next guest has been building AI products for 15 years and is currently Director of Engineering and AI at Datadog. Welcome, Diamond Bishop.
0:00:26 — Diamond Bishop:
Hey, happy to be here. Yeah, I work on the AI Skunkworks team or labs team, kind of a product labs group at Datadog. But I've also worked across the board on our AI agents products, Bits AI, which is a variety of agents—and happy to chat about it.
0:00:39 — Dewan Ahmed:
Can’t wait to talk. I think this is something everyone is wondering: what's happening next. You call 2026 the year of the enterprise agent. What's the biggest difference between the agent hype of last year and the agents that are actually running in production today?
0:01:00 — Diamond Bishop:
Yeah, I think we've all kind of had teams that built something for a quick demo last year. We showed a lot of really cool ideas that can work. But one of the big things is that it takes a lot longer to go from demo to production than to go from zero to demo.
And what we're seeing this year—and one of the reasons I think of this year as really the enterprise push for agents—is that we're doing a better job with securely deploying agents into the ecosystem. So actually using them for tasks that you do every day at work, not just things that you can share on Twitter.
0:01:42 — Diamond Bishop:
And a big part of that is really understanding how to evaluate, observe, deal with monitoring, and actually improve your agents over time, because the first version is not going to work that well.
0:02:23 — Dewan Ahmed:
And I want to dive deeper into the process itself, because as software engineers and DevOps engineers, we've spent decades perfecting the flow: build, test, deploy, monitor. Now that you have these agents in the flow, how hard was it to adapt?
0:02:36 — Diamond Bishop:
Yeah, this is definitely complicated. AI applications are more stochastic—meaning they can behave differently each time.
So we had to rethink observability. Instead of looking at single failures, we now track patterns across many runs and contexts.
0:03:24 — Diamond Bishop:
We spent a lot of time building offline evaluation systems—how do you prove something works before production?
And then we built online evaluation systems that monitor behavior at scale.
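The offline evaluation gate Diamond describes can be sketched roughly like this: replay a fixed "golden" dataset against the agent and require a minimum pass rate before a change ships. The dataset, the `run_agent` stub, and the keyword check below are all illustrative stand-ins, not Datadog's actual system.

```python
# Sketch of an offline evaluation gate: before deploying an agent change,
# replay a fixed "golden" dataset and require a minimum pass rate.
# GOLDEN_SET, run_agent, and the keyword check are hypothetical stand-ins.

GOLDEN_SET = [
    {"input": "disk full on web-01", "expected_keyword": "disk"},
    {"input": "5xx spike after deploy", "expected_keyword": "deploy"},
]

def run_agent(task: str) -> str:
    # Placeholder for the real agent; echoes the task so the sketch runs.
    return f"Investigating: {task}"

def offline_eval(dataset, min_pass_rate=0.9):
    """Return (pass_rate, ok): ok is True only if the agent's pass rate
    over the golden dataset meets the release threshold."""
    passed = sum(
        1 for case in dataset
        if case["expected_keyword"] in run_agent(case["input"])
    )
    rate = passed / len(dataset)
    return rate, rate >= min_pass_rate
```

In a real pipeline this gate would run in CI, with the online system (monitoring behavior at scale) picking up where the offline replay leaves off.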
0:03:51 — Dewan Ahmed:
So are humans still interpreting agent decisions, or are agents evaluating other agents?
0:04:00 — Diamond Bishop:
It’s a mix. Humans still do spot-checking, but that doesn’t scale.
So we use LLMs as judges—models that evaluate outputs using a rubric. Combine that with human alignment and you get scalable evaluation.
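The "LLM as judge" pattern described here can be sketched as follows: a judge model grades each agent output against a rubric, and periodic human spot-checks verify the judge stays aligned. The `call_llm` function is a placeholder for any real model client; the rubric and alignment check are illustrative assumptions.

```python
# Minimal sketch of LLM-as-judge evaluation with human alignment checks.
# call_llm is a stand-in for a real model API client (hypothetical).

RUBRIC = """Score the answer from 1 to 5:
5 = correct, grounded, and actionable
3 = partially correct or vague
1 = wrong or hallucinated"""

def call_llm(prompt: str) -> str:
    # Placeholder: a real deployment would call a model API here.
    # Returns a fixed score so the sketch stays self-contained.
    return "4"

def judge(question: str, agent_answer: str) -> int:
    """Ask the judge model to grade one agent output against the rubric."""
    prompt = f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {agent_answer}\nScore:"
    return int(call_llm(prompt).strip())

def aligned(judge_scores, human_scores, tolerance=1):
    """Spot-check calibration: the judge counts as aligned if every
    judge score is within `tolerance` of the human grade."""
    return all(abs(j - h) <= tolerance for j, h in zip(judge_scores, human_scores))
```

The human spot-checks don't disappear; they shrink to a sample used to keep the automated judge honest.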
0:05:45 — Dewan Ahmed:
What about security—like prompt injection or jailbreaks?
0:05:52 — Diamond Bishop:
We rely heavily on sandboxing. Agents only have access to specific tools and environments.
We also use:
- AI gateways for observability
- MCP gateways for controlled integrations
This ensures agents can operate safely without unrestricted access.
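The tool-level sandboxing described above can be sketched as a gateway that only routes calls to explicitly allowlisted tools. The class and tool names below are illustrative, not any actual gateway product's API.

```python
# Sketch of tool-level sandboxing: the agent can only invoke tools on an
# explicit allowlist, mirroring the "specific tools and environments"
# restriction described above. ToolGateway and the tools are hypothetical.

class ToolGateway:
    """Routes agent tool calls, rejecting anything not allowlisted."""

    def __init__(self, allowed_tools):
        self._tools = dict(allowed_tools)  # name -> callable

    def invoke(self, name, *args, **kwargs):
        if name not in self._tools:
            raise PermissionError(f"tool '{name}' is not allowlisted")
        return self._tools[name](*args, **kwargs)

# Example: the agent may query a metric, but has no shell access at all.
gateway = ToolGateway({"get_metric": lambda m: {"metric": m, "value": 42}})
```

A gateway like this is also a natural choke point for the observability side: every allowed call passes through one place where it can be logged and traced.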
0:08:36 — Dewan Ahmed:
How do you convince leadership and customers to adopt this?
0:08:43 — Diamond Bishop:
It used to be very hard. Now it's more of a “yes, but safely” conversation.
Customers want to try AI—but within strict boundaries.
0:11:09 — Dewan Ahmed:
Build vs buy—what’s your advice?
0:11:15 — Diamond Bishop:
Start with first-party agents where your data already lives.
Then:
- Learn the boundaries
- Identify gaps
- Build custom logic only where needed
A small team shouldn’t try to build everything from scratch.
0:13:15 — Dewan Ahmed:
Where can people find you?
0:13:18 — Diamond Bishop:
You can find me on X at @DiamondBishop, and check out our YouTube series “Unhobbling AI.”
0:13:41 — Dewan Ahmed:
Welcome back. I'm Dewan Ahmed, your host for ShipTalk Podcast.
Our next guest is a 20-year tech veteran who says the linear DevOps pipeline is collapsing. Please welcome Sadio Jonas.
0:14:01 — Sadio Jonas:
Nice to be here. Thank you for having me.
0:14:08 — Dewan Ahmed:
You argue AI is breaking the traditional plan-build-deploy cycle. What does that mean for SREs?
0:14:16 — Sadio Jonas:
It actually makes things better.
The original DevOps problem was miscommunication—requirements were unclear, development was misaligned, and operations had to deal with the fallout.
0:15:32 — Sadio Jonas:
With AI, all stakeholders can collaborate earlier. You can prototype in real time and validate ideas immediately.
0:16:02 — Dewan Ahmed:
Release cycles have gone from quarterly to near continuous. Are we moving too fast?
0:16:10 — Sadio Jonas:
Yes—and that’s dangerous.
We need intentional pacing. Just because we can deploy continuously doesn’t mean we should.
0:17:32 — Dewan Ahmed:
Why do AI transformations fail?
0:17:40 — Sadio Jonas:
Leadership gaps.
Leaders focus too much on tools instead of strategy. They need to:
- Define organizational goals
- Build an AI strategy
- Then choose tools
0:19:54 — Dewan Ahmed:
Is AI creating more tech debt?
0:20:00 — Sadio Jonas:
It can.
Best approach:
- Use AI for MVP and clarity
- Then switch to structured engineering practices
0:21:32 — Dewan Ahmed:
Is there still a human in the loop?
0:21:38 — Sadio Jonas:
Absolutely.
AI amplifies humans—it doesn’t replace them. Human judgment, creativity, and nuance are still essential.
0:23:14 — Dewan Ahmed:
Where can people learn more about you?
0:23:18 — Sadio Jonas:
Visit aivantageconsulting.com. We help organizations with end-to-end AI adoption.