Deployment per component
When La Redoute first started using Kubernetes, we chose Helm as our YAML packaging and deployment tool.
This technology lets us create a “Chart” that describes all the objects of an application that will be installed in the Kubernetes cluster. Following that model, every application and component has its own chart in its own repository. Each chart starts from a template, but teams can modify it to fit their needs.
The problem here? Once it works, no one touches the chart. Because most engineers are not experts in Kubernetes specifications, the chart quickly became an “Ops box”: developers did not know how or why they should manage it. So, we created the platform team. This team managed the Helm template and every update that had to be made in the application repositories. But the chart kept growing and getting more complicated as new tools and practices were added, and new applications and microservices kept being deployed in the cluster.
In the end, we had many templates (one per application) that were all very similar. The default configuration was copied into every project, and nearly all projects used the same defaults. As a result, whenever an operation was needed, the platform team had to update every application repository to apply the change to the deployment. Engineering teams then never touched the chart again and lost track of what those Helm charts and values meant (a vicious circle). The platform team could not keep managing everything, because the platform had grown so much that there were too many services to update.
First steps in GitOps
At that time, deployments were handled by jobs in GitLab, which only told us whether the YAML manifest had been applied correctly and whether the linked pods were ready. The lack of real deployment visibility, the number of values needed to manage deployments, and the heavy work required for any global evolution or change became a big pain point for all teams (Engineering & DevOps).
To resolve that, the Kubernetes integration team came up with what they thought was a “magic” solution: ArgoCD. It is a tool that manages k8s deployments, including Helm, provides a GUI showing the state of the Kube objects, and makes rollbacks possible without relaunching all the jobs in the pipeline. It looked like the perfect answer to all the pain points. However, it required changing the deployment pattern to a GitOps model, which meant a total split of CI and CD and a complete change in the existing application deployment workflow (testing jobs in the pipeline, a button inside GitLab to do everything). It was a hard fight, and too big a step to take in the blink of an eye.
The developers were not ready to change their workflow completely, and the change had not been thought through and prepared well enough: the integration team had not expected it to be so blocking on the engineering side. So, we temporarily abandoned this target.
One Helm to rule them all
The goal of this “try again” was to find an easy path and an early win that addressed our pain points.
First, we moved the Helm chart out of the application repositories. With that, the Integration team could propagate new features or tools to everyone instantly and in a versioned manner. We then simplified the template by keeping only the mandatory values and clarified how to use it, so that developers better understand the values they are setting and can go straight to what they want.
The concrete results came early because we had upcoming projects, such as Jaeger Tracing and Istio, to roll out on our applications. Just by activating a few flags, you could enable those features in the blink of an eye!
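To make the idea of a shared chart with feature flags more concrete, here is a minimal sketch in Python of how shared defaults and a small per-application values file could be combined. It only illustrates the principle of layering application values over defaults, in the spirit of what Helm does with values files. All key names (`tracing.enabled`, `istio.enabled`, `replicaCount`, and so on) and the `merge_values` helper are hypothetical and are not taken from our actual chart.

```python
from copy import deepcopy


def merge_values(defaults, overrides):
    """Recursively layer `overrides` on top of a copy of `defaults`.

    Nested dictionaries are merged key by key; any other value in
    `overrides` replaces the default. Neither input is modified.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults owned by the integration team in the shared chart.
SHARED_DEFAULTS = {
    "replicaCount": 2,
    "tracing": {"enabled": False},
    "istio": {"enabled": False},
    "livenessProbe": {"path": "/health", "periodSeconds": 10},
}

# Hypothetical values kept in one application repository:
# only what is specific to the application, plus one flag.
APP_VALUES = {
    "image": {"repository": "example/my-app", "tag": "1.0.0"},
    "tracing": {"enabled": True},
}

effective = merge_values(SHARED_DEFAULTS, APP_VALUES)
print(effective)
```

In this sketch the application repository carries only a handful of values, while everything else comes from the shared, versioned defaults. Turning on a feature is a one-line change for the developer.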
But the problem of the “ops box” was still present. So, the K8S integration team was in a way “integrated” into the engineering teams: a key contact person was assigned to each team, and bi-weekly meetings were organised. Those meetings allowed the teams to share their roadmaps, needs, and pain points directly with one another. Another benefit of this new organization was that the Kube contact got to know the application environment they were working on much better. We were also able to merge the platform team into the Kubernetes Integration team. This was the start of the two sides working together.
The Kubernetes integration team ran workshops to show developers how to work with the values of their applications, what all those values mean, and how they impact the cluster. We decided to create KPIs to track the improvements (for example, proper use of replicas or liveness probes), but that is always complex to do. As of today, these KPIs still do not exist.
It has now been several months since we started working this way, and we know that we can still improve a lot.
First, by including the Kube Team in the design of applications because, for now, we follow a single application template and a single guideline. We need to widen our options and provide proper guidelines when new types of applications arrive.
Another big project on the way is the move to GitOps. To achieve that goal, we first need to align on some principles, such as a unified way of using tags and the split between CI and CD.
When those subjects are settled, we will have a concrete view of which team manages which code, and we will have full control of our applications’ code and deployment steps.