The last few years

Jan 10, 2022 · 14 min read

It’s nice to reflect on what you’ve achieved over a period of time. I switched roles within the same business in 2018, going from managing and coding in a software engineering team I built from scratch, to an Individual Contributor role across a few other engineering teams. This post shows, at a high level, what I’ve been working on since then. It’s roughly in chronological order (from memory!). I’ve written it as a positive way to reflect on achievements that would otherwise be forgotten. This isn’t everything I did, but it is some of the highlights. Some of these items were very tough to work on and see through to completion.

But where there is pain, there is growth.

A UI Kit

Two of us had the same idea, but came at it in different ways. My mate wanted to produce some style guides on the intranet for engineers to use in their web applications. As an engineer with limited design skills on my own team, I saw the value in this. We got chatting and realised we both wanted a set of UI components that any engineering team could use. I wanted to move away from Bootstrap and utilise the design expertise of others within the business.

We ideally wanted the UI components to be JavaScript-framework agnostic. I was a firm believer in letting teams use the frameworks they were comfortable with, be it React or Angular (the two frameworks in use at the time).

We agreed the scope of what we wanted to achieve, then presented it to the Engineering and UX directors. We were lucky enough to convince them, and we both set about forming a team around the idea, picking the right people for the job.

I was there at the beginning: I helped form the team and the engineering processes, and was lucky enough to spend a few months coding CSS/HTML/React whilst the team got off the ground. I also helped with the infrastructure in AWS, and spun up a simple Jenkins instance so we had a decent CI tool.

It was such a buzz working in an incredibly creative team. We recorded podcast-style talks to showcase sprint demos and the like. Such fun.

We used BEM, Jenkins, AWS technologies, Storybook, React, Hugo and a raft of other tech.

CLI tools

I’m a massive advocate of the CLI. I love writing CLI tools, and I love using them. Prior to 2018, the team I led was the only team on GitHub within the business. Around the 2018 mark, GitHub was suddenly home to around 100 repos, with more arriving daily. We were onboarding hundreds of developers who hadn’t used GitHub before.

As you can imagine, scaling hurt. Inspired by Jess Frazelle’s Pepper, I started to write a Go tool to help audit and maintain some kind of structure in GitHub.

I was lucky enough to find a bunch of engineers who liked the concept, and they mucked in, adding more and more rules and self-healing fixes for the repos. It ended up as a GitHub application installed in our organisation, running in an AWS Lambda, and reporting back as a Pull Request check that failed if a repo did not meet the agreed standards.
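To give a flavour of the idea, here is a minimal sketch of a config-driven audit in Go. The rule set, struct fields, and messages are all hypothetical illustrations of the approach, not the real tool’s schema:

```go
package main

import "fmt"

// Repo holds the subset of repository metadata the audit rules inspect.
// The field names here are illustrative only.
type Repo struct {
	Name                string
	Description         string
	HasBranchProtection bool
	HasCodeowners       bool
}

// Rule is a single audit check: it returns a violation message,
// or "" if the repo passes.
type Rule func(Repo) string

// Example rules, in the spirit of the tool described above.
var rules = []Rule{
	func(r Repo) string {
		if r.Description == "" {
			return "repository has no description"
		}
		return ""
	},
	func(r Repo) string {
		if !r.HasBranchProtection {
			return "default branch is not protected"
		}
		return ""
	},
	func(r Repo) string {
		if !r.HasCodeowners {
			return "missing CODEOWNERS file"
		}
		return ""
	},
}

// audit runs every rule and collects the violations; an empty result means
// the repo meets the agreed standards (the PR check would pass).
func audit(r Repo, rules []Rule) []string {
	var violations []string
	for _, rule := range rules {
		if msg := rule(r); msg != "" {
			violations = append(violations, msg)
		}
	}
	return violations
}

func main() {
	repo := Repo{Name: "acme-service", HasBranchProtection: true}
	for _, v := range audit(repo, rules) {
		fmt.Printf("%s: %s\n", repo.Name, v)
	}
}
```

The real tool layered this kind of rule evaluation behind a GitHub App webhook and surfaced the violations as a failed PR check.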

My aim was to influence key people within the business to get the tool open sourced. We had written it in a config-based, company-agnostic way, so it would have been beneficial to other organisations too. Sadly, this never came to fruition.

It was built with Go and Cobra, which is a nice combination of tech. The infrastructure for running it as a hosted application was all written in AWS CDK (TypeScript).

Tech Radar

I’ve written about this before. This fell out of favour with the business for various reasons that had nothing to do with the idea.

I still think a Tech Radar is a great tool for many reasons. So, as of January 2022, I managed to convince one or two other engineers to help me convert the site from Hugo over to Svelte. This was nothing against Hugo; it was more about challenging ourselves to do something new that was semi-relevant. Having a break from React and writing in Svelte was cathartic.

I’ve now streamlined our Tech Radar and reined it in, where previously it went too far. It’s now more manageable. I’ve started to introduce this back into the teams within my sphere as a single decent place to document stuff that will help other teams.

Decision making for Tech Leads

Around the time my role changed, the business had restructured. My role of Tech Lead had gone; you were either an “Expert” or a “Pod Leader”. To everyone else these were a “Tech Lead” and an “Engineering Manager”. The “Experts” did not line manage, so we were left to be influencers. I use “Technical Lead” throughout the rest of this document, rather than Expert; it’s hopefully obvious why.

Anyway, I digress.

The “business” wanted the “Experts” to lead on technical decisions as a group. At the time there were 13 or 14 of us, from differing backgrounds.

To help us document our thinking and provide a mechanism for reaching agreement, I set up an RFC process in GitHub. It allowed us to publicly (within the business) discuss issues and ideas, document them, and then vote via a Pull Request.

It sounds simpler than it really was. Buy me a beer if you want to know more.

Design Authority

I’ve also written about this before. This was a peer reviewed, community based initiative to help teams bake quality into their software. I was one of the Technical Leads who contributed content, and helped drive the process to get engagement.

An auditing solution

This was a standard engineering project. I was the lead in the team, but didn’t line manage anyone. We were tasked with building a highly available auditing and logging solution: other teams would send us their audit and log data, we would store it, and we would provide an API for retrieval. When I joined the team, the solution used a plethora of languages, which I was able to streamline to Go and TypeScript. We were responsible for the data stores (Elasticsearch and AWS S3), ingestion of data (AWS Kinesis), APIs (Go and AWS API Gateway), SDKs for consuming teams, and the front-end application to view the data.

The back end was written in Go, utilising AWS Lambdas and API Gateway. The front-end application was written in TypeScript and React. This was a fun project and a fun team to work in.

Solution engineering

From the auditing solution I moved over to a lead role in a Solution Engineering team (a System Team if you are familiar with SAFe). This was a wide ranging role within the department. I was able to shape the projects we worked on, and the backlogs we worked through. We were lucky enough to be able to collaborate with many teams, and add value to their products along the way. My solution team focussed on the platform/backend teams, including infrastructure projects.

We quickly defined and published our mission statement, so we could hold ourselves to account.

Our teams’ mission is to support, guide, and provide engineering solutions to the domain teams, enabling efficient and high-quality software to be developed - resulting in a consistent, maintainable, high quality, and well-engineered solution.

This team utilised many technologies in order to add value, such as AWS Cloud Formation, CDK, Terraform, Ansible, Go, TypeScript, Bash, GitHub Actions, Jenkins, etc.

Some of the main projects we ran with are documented below.

Defining a release process

The business was moving from a well defined Windows Desktop based software deployment model to a web based model. This meant that the release processes needed to reflect new technologies. The solution engineering team were tasked with leading on the technical aspects of deploying cloud based web products.

This factored in quality gates, canary deployments, auditing of deployments, release notes, company wide notifications, etc.

Building an enterprise Jenkins deployment

For an understanding of why we did this, see the comparison we did. My personal experience of deploying Jenkins had always been directly onto VMs via Ansible, and over the years the Ansible scripts became more polished. I had previously deployed “Hudson” (as it was called then) at Plusnet, using Capistrano. Those deployments were used by tens of engineers.

This time the business needed a cloud-based solution that could scale to hundreds of developers. Luckily, the team managed to rope in a very knowledgeable AWS engineer who came in and led the design for us. It was fantastic working and learning at such a fast pace. A highlight.

The solution ended up being an Elastic Container Service (ECS) service backed by EC2 nodes. It was all built with Terraform and partitioned into different clusters (Jenkins instances) for different teams, so IAM permissions could be restricted to assume different roles and deploy into different accounts.

To manage this infrastructure in a production environment, and react to situations as they cropped up, we built a little Go CLI tool called jenk: a neat wrapper around the Go AWS SDK that pulled out the key pieces of information we needed.

We built the CLI to cut down the number of clicks needed in the AWS console: it gave us holistic information in one place, without all the drama of using the UI. Of course, this was all backed with dashboards in Grafana.
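The shape of a tool like jenk can be sketched by putting the SDK calls behind a small interface, so the summarising logic is separate from AWS itself. Everything here is hypothetical: the interface, the `staticAPI` stand-in, and the output format are illustrations, with the real implementation wrapping the Go AWS SDK’s ECS client:

```go
package main

import (
	"fmt"
	"strings"
)

// clusterAPI abstracts the handful of AWS SDK calls this kind of tooling
// needs, so the summarising logic can run without touching AWS.
type clusterAPI interface {
	ListClusters() []string
	RunningTasks(cluster string) int
}

// summarise produces one line per cluster: the whole picture at a glance,
// instead of clicking through the console.
func summarise(api clusterAPI) string {
	var b strings.Builder
	for _, c := range api.ListClusters() {
		fmt.Fprintf(&b, "%s: %d running tasks\n", c, api.RunningTasks(c))
	}
	return b.String()
}

// staticAPI is a stand-in implementation; the real one would wrap the
// Go AWS SDK's ECS client and make the ListClusters/DescribeServices calls.
type staticAPI struct{ tasks map[string]int }

func (s staticAPI) ListClusters() []string {
	clusters := make([]string, 0, len(s.tasks))
	for c := range s.tasks {
		clusters = append(clusters, c)
	}
	return clusters
}

func (s staticAPI) RunningTasks(c string) int { return s.tasks[c] }

func main() {
	fmt.Print(summarise(staticAPI{tasks: map[string]int{"team-a-jenkins": 3}}))
}
```

Keeping the AWS client behind an interface like this also made the reporting logic trivially testable.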

Supporting a monitoring platform

The monitoring solution had started out as a one-man project outside of my team. It ultimately needed a home, so it came to us. Two of the engineers on my team led on uplifting the project to a more productionised quality. I provided minimal support and coding, so they could lead on their own thing.

This is currently the de facto stack in the business. It’s an ECS solution akin to the Jenkins solution above, running Grafana. We had hopes of taking it further, but currently we don’t have the capacity.

Peer reviews

As mentioned above, we had a “Design Authority” which is essentially a guide to help engineers provide consistent quality and solutions across the business. My proposal was for my team to conduct weekly reviews and write up reports to help teams achieve the standards required. We notified teams we were doing this, but it was passive. We didn’t get in their way, and we provided constructive suggestions and feedback where it was appropriate.

When you’re in an overarching role, you really can provide more context to teams. You have more suggestions based on what other teams are doing. Since we were also coding on the solutions day to day too, we were still very close to the detail. It’s a privileged role to have.

acme.json

This is a JSON Schema specification that lets engineering teams expose their internals to an automated tooling set. Each GitHub repo has a specific [company].json file in its root.

The project had one overarching aim.

This should be seen as a public contract between a team and the rest of the engineering department. The schema is there to expose internals (the way your team wants to work) to other teams, and tooling, so they can consistently understand information they need, without dictating an implementation to you.
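To make the contract concrete, here is an illustrative file of this kind. Every field name and value below is hypothetical; the point is that tooling can rely on a well-known, schema-validated location for this information:

```json
{
  "$schema": "https://example.com/schemas/acme.schema.json",
  "team": "platform-auditing",
  "slack_channel": "#platform-auditing",
  "languages": ["go", "typescript"],
  "build": { "tool": "make", "ci": "jenkins" },
  "deploy": { "method": "cdk", "environments": ["dev", "staging", "prod"] }
}
```

With a file like this in every repo, automation can answer questions such as “how does this repo build?” or “who do I contact?” without dictating an implementation to the team.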


Technical Roadmap

I’ve written about this in more detail here. The basic concept is making sure engineering teams know what to work on, why they are working on it, and where they are going. It’s slightly more nuanced, so check out the post.

Weekly workshops

I was lucky enough to have a mate who was willing to run weekly workshops over Slack/Teams with me. We decided on a topic the week before and let the rest of the department know. We would then spend time building an agenda and discussion points, and preparing demos and coding examples. On the day, one of us would be the compère for the session, and the other would drive the screen. It was really enjoyable, and produced a load of valuable content for us and for our colleagues.

Sadly, my mate has left the business now, so I’m hoping for a YouTube channel or Twitch stream one day with him.

Centralised Dockerfiles

About three and a bit years ago, a few of us had a similar idea: centralise and standardise the Dockerfiles we were using. The idea was, once again, a complete copy of one of Jess Frazelle’s. It was a pet project with minimal momentum.

However, as we started to containerise more and more of our applications, this project became a business need. It was no longer a pet project.

I was tasked with leading on it, so I quickly formed a team from the original group of interested engineers. Luckily, I convinced them all to help me out. We wrote a build process in JavaScript using Commander.js, since JavaScript was the most common language in the team. The work built on top of what my team had done in the Jenkins solution for the different build agents. This meant we provided Production, Testing, and Development (dev container) Dockerfiles for the main languages used in the business, pushing the images to Azure Container Registry, Elastic Container Registry, and the GitHub Package Registry.

We used Trivy to scan all the images we built. Centralising also gave us a single touch point for any other business-specific requirements, which was deemed easier than handling them per team.
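As a flavour of what a centralised production Dockerfile might look like, here is a minimal multi-stage sketch for a Go service. The base images, Go version, and the `./cmd/app` path are assumptions for illustration, not the actual files we shipped:

```dockerfile
# Build stage: pin the toolchain so every team builds against the same version.
FROM golang:1.17 AS build
WORKDIR /src
COPY . .
# Static binary; ./cmd/app is a hypothetical main package path.
RUN CGO_ENABLED=0 go build -o /bin/app ./cmd/app

# Runtime stage: minimal distroless image running as a non-root user,
# containing nothing but the binary.
FROM gcr.io/distroless/static:nonroot
COPY --from=build /bin/app /app
ENTRYPOINT ["/app"]
```

Keeping patterns like this in one repo meant version bumps, scanning, and hardening changes rolled out to every team at once.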

JAMStack hosting

After the solution engineering team was disbanded (buy me a Jack Daniel’s and we can discuss this), I found myself in a front-end team writing React applications again. I joined the party late, so the team was already formed.

So, as you can see, I’ve almost come full circle, which brings me to the conclusion of this post.

Having to write Infrastructure as Code (IaC) in the Jenkins project was fun and exhilarating. Writing it to host a JAMStack site is not. I set out to demonstrate that, as a team, we could move at a faster pace if we used more modern hosting providers. I used Netlify (other providers exist in the same space; I just happened to advocate for Netlify). I demoed it back to the team, who loved it, and then to other stakeholders. In the end, the team agreed to write a white paper on why Netlify was a key technology to embrace.

The team have been able to remove 28% of the code they write and manage by using Netlify. We are able to spend that saving on adding more value to our customers.

Conclusion

You forget, during the day to day grind, what you’ve achieved. As you can imagine, I cannot go into all the details. The projects above represent a volume of work that I’m really proud of. I’m proud of the teams I worked with to deliver all of this. Building software is a team sport, and that should never be forgotten.

If you want to go fast, go alone; if you want to go far, go together.

Thanks