KubeCon + CloudNativeCon Europe 2025 recap
General
All in all it was another great experience. Big shout out to all the folks at CNCF who make that happen!
Coffee was still awful though. I wonder if they will ever realize that filtered coffee is just undrinkable and that self-service espresso machines are the way to go. 😜☕
More serious were the acoustics problems in Halls S10 and N10. Each hall was divided into multiple rooms by curtains, which provided nearly no sound insulation. That made talks in these rooms hard to follow. Interestingly, at one point there was some kind of fan humming in the background, and that made the problem almost disappear! So maybe noise cancellation would provide a solution here?
I also wish people would manage to stay put during the closing remarks of the keynotes. The folks on stage are part of the organization that makes the event you’re here for happen. Not being willing to spend 2 minutes of your time listening to them wish you a great time is incredibly rude!
But these minor complaints aside, as I said it was a great experience and I’d encourage everyone interested in Kubernetes, whether expert or beginner, to go. There’s simply no better way to stay informed on what’s happening in the industry.
The following sums up my takeaways from some of the talks I’ve listened to.
Tuesday
That’s Just My Cup of Tea: Configuring Cilium for Performance and Scale - Liz Rice, Isovalent at Cisco & Neha Aggarwal, Microsoft
Aside from hearing that there seem to be more “right” ways of enjoying tea than there are Brits in the world, we learned about various options that need to be turned on to utilize the full performance potential of Cilium. It’s good to keep in mind that Cilium by default is configured for compatibility, not performance!
Main takeaway: take more care in tuning my Cilium installations; compatibility is not as important for my use cases.
⚡ Lightning Talk: Locals Only: Patterns and Anti-Patterns in OpenTofu Local Variables - Robbie Glenn, Glennium
Very interesting talk that echoed much of what I’ve encountered myself with locals. The only drawback was that Robbie went way too fast, switching slides (which had lots of example code) before it was possible to grasp them.
Main takeaway: trust my own feelings about locals; think about refactoring tf modules to have all locals in a separate locals.tf file.
Don’t Stop at the Cloud - Connecting all the Dots with Infrastructure-as-Code - Anuraag Agrawal, CurioSwitch
Kind of a déjà vu from the previous talk: again a very interesting topic, echoing my own experience, but going way too fast to fully appreciate the content.
Main takeaway: yes, terraform all the things!
Wednesday
Keynote: AI Enabled Observability Explainers - We Actually Did Something With AI! - Vijay Samuel, Principal MTS, Architect, eBay
Interesting insights into how eBay managed to utilize LLMs to speed up finding the root cause of incidents. I especially liked how he didn’t present “AI” as a magic cure-all, but instead emphasized how you must not overwhelm the model with too much data and should preferably feed it pre-analyzed data to keep it from hallucinating.
Main takeaway: when using LLMs, consider how to prepare your input so that you will be able to trust the output.
Taking Care of Your Control Plane With API Priority and Fairness and Resource Quotas - Matteo Ruina & Ayaz Badouraly, Datadog
Very interesting talk about a topic I didn’t know anything about yet. I’d only seen that there’s such a thing as “priority and fairness” from kubectl messages.
Matteo and Ayaz showed how that control mechanism actually works and how they used it to fight various classes of incidents they met at Datadog. They also pointed out the pitfalls hidden behind the mechanism of “borrowing” capacity from other queues, which prevented their tarpit policy from working as expected.
Main takeaway: you can use priority and fairness to protect your cluster from escalating deployment mistakes.
Was Leslie Lamport Right? - Sarah Christoff, Edera & Nic Jackson, Hashicorp
An enthusiastic introduction to the Byzantine Generals problem and the solutions Leslie Lamport proposed. All systems today using Raft/Paxos are built on this work!
Main takeaway: I honestly didn’t know that the historic situation Lamport’s story is based on is the very same one portrayed in Netflix’s Vikings: Valhalla! 😳
Don’t Write Controllers Like Charlie Don’t Does: Avoiding Common Kubernetes Controller Mistakes - Nick Young, Isovalent at Cisco
A must-hear as I’m just making my first attempts at writing a controller. For me the good news was that by choosing controller-runtime as a framework, I’m already avoiding many of Charlie’s mistakes.
Main takeaway: watch out that your controller doesn’t overwhelm the API server!
gRPC: 5 Years Later, Is It Still Worth It? - Konstantin Ostrovsky, Torq.io
I definitely fall into the group of “protocol-curious” people that Konstantin mentioned as a possible target audience for his talk, as I’ve never written anything involving gRPC, but the talk still provided many interesting insights.
Main takeaway: given the right project it makes sense to look into gRPC for internal communications, but stick with REST for customer-facing APIs.
Flux Ecosystem Evolution - Stefan Prodan, ControlPlane & Sanskar Jaiswal, Kong
Sanskar showed the latest improvements in Flagger (which I currently don’t have any use case for) and Stefan presented his new baby, Flux Operator.
Main takeaway: Flux is mature (and backwards compatible) enough for automated upgrades, so use Flux Operator to always run the latest version!
Thursday
Identity-based Trust - Till Death Do We Part? - John Kjell, ControlPlane & Kairo De Araujo, Independent
This talk was a little bit over my head in parts. John and Kairo talked about how Fulcio and Rekor attestation works and also showed how they used a tool called witness to point out that a dev had “illegally” set an ignore flag on the container linter, thus smuggling a root container through the build process.
Main takeaway: I should look more into attestation and signing, and witness looks really interesting. But somebody will still have to write all these policies…
TUF-en up Your Software Supply Chain - Marina Moore, Edera & Kairo De Araujo, Independent
Inspired by my difficulty following parts of the previous talk, I went and listened to Marina and Kairo on the maintainer track. It helped, but I still feel I’m missing part of the picture. Interestingly enough, the presenters had no real answer to the audience question on how to best utilise TUF in the context of Kubernetes.
Main takeaway: hmm 🤔
Beyond Security: Leveraging OPA for FinOps in Kubernetes - Sathish Kumar Venkatesan, DevOpsCloudJunction Foundation Inc.
Another talk that I think will help me clarify thoughts I already had around a subject. In this case, how OPA is a general framework that admits or denies workloads in a cluster based on rules. So these rules should be able to specify anything, not just security-related stuff. In Sathish’s case it’s FinOps. He showed various ways to calculate and limit the cost a pod might accrue based on the resource requests in the pod spec, or to restrict the use of expensive resources like high-performance storage to certain namespaces.
Main takeaway: yes, it does make sense to use OPA for any kind of policy enforcement. Not just security.
Friday
Container Runtimes… on Lockdown: The Hidden Costs of Multi-tenant Workloads - Lewis Denham-Parry, Edera & Caleb Woodbine, ii.nz
Lewis and Caleb gave an overview of the history and the different possibilities for workload isolation: from separate machines to VMs to containers, with their various pros and cons.
Main takeaway: have a look at Edera’s project, since I didn’t quite catch how it’s supposed to be the best of both worlds between VMs and containers.
Beyond Classical Cryptography: Building Quantum-Resistant Cloud Native Infrastructure With SPIFFE - Andrés Vega, M42 & Hugo Landau, Messier42
One more talk that went way too fast for me. That might be down to the advanced topic, but it also wasn’t possible to read the slides before they switched.
Main takeaway: SPIFFE/SPIRE is supposed to be the way to implement quantum secure encryption.
Taming the Beast: Advanced Resource Management With Kubernetes - Lucy Sweet, Uber & Dawn Chen, Google
Interesting insights by Lucy and Dawn into new K8S resource management features like pod resources, dynamic pod resizing and node swap handling.
Main takeaway: pod resources will finally be a thing! 🥳
Wait! Can Your Pod Survive a Restart? - Aya Ozawa, CloudNatix Inc.
A much-needed refresher on what happens when a pod shuts down, as well as a recap on PDBs. Most surprising fact mentioned: nginx expects SIGQUIT for a graceful shutdown! 😯
Main takeaways: look into nginx signal settings and use lease release in controllers!
Stateful Connections in Kubernetes: The Scaling Secrets Nobody Talks About - André Mocke & Rodrigo Fior Kuntzer, Miro
This talk tied nicely into the previous one as it too was about how to deal with pods shutting down. In this case it was about the perils of getting websocket connections to fail over nicely.
Main takeaway: hard to say as I don’t currently use websockets 🤔 but still a mighty interesting talk. 👍