The panic and adventure start on a day like any other: I go to work, we have our daily standup, I write some code, and I even answer a few emails. The next part, however, really surprised me. I get a heads-up from a friend in one of our overseas offices telling me that something really big just happened and that I should start poking around. As suggested, I check with some of my colleagues in other offices, and sure enough we shortly find out: our CTO just sent in his resignation. I’d had several calls with him in the past and some in-person meetings, and we had worked together on a few projects. I can’t say I knew him well personally, but he didn’t seem like the type of person to get up and go without some really good reasons. Naturally, I created a few Google Alerts to keep an eye out for any new developments, and a few days later we find a copy of his resignation letter, which sheds light on the many reasons behind his departure. I’m not sure if it was the act of leaving itself or the letter, but a couple of weeks after that we get word that our CIO has resigned as well, and then I hear that the head of development of our accounting software team just left. It was really puzzling, since I knew these people, and it felt like everyone around me was part of some strange game that I didn’t understand. Little did I imagine at the time that this would become a story about engineers from all parts of the world coming together and becoming innovative heroes who revolutionized one giant company.
The team I’m part of was called Special Projects at the time, because we had a different job than most other teams in the company. The company was started 25 years ago by two people and grew continuously year after year, until by that day it consisted of around 4,000 people in 90+ offices around the world. It grew organically as demand for its services grew, and it branched out into many other business areas based on what our clients needed. We had very specialized teams, which were very good at what they did. The culture itself was quite interesting as well: every team had its own energy and way of doing things, and each was well optimized for what it needed. However, each team was optimized for something different. While we had really important skills throughout the company, and each team could have used something that another team was really good at, they just didn’t know about each other. As we started to reach a critical size, we began to see new requirements from clients that could only be met by multiple teams working together. This was a challenge, because our teams and tools didn’t really talk to each other; that had never been a requirement until then, so we had some growing pains as a company. That’s where my team comes in. We were created to solve this very problem, about 8.5 years ago. My boss took an engineer from the company, and then contracted me, through the company I owned at the time, to start building different tools to solve specific problems. About a year after this, I come on board as a full-time employee, and the three of us start our journey. We get our accounting software team to build some workflows that communicate across systems, we get our production software to open APIs to us, and we build what later became the company’s flagship customer portal, which integrated all these tools together.
Over the next few years, we grow our tools to the point where they touch almost every employee’s work in some way: document conversion tools, interactions with clients, process optimizations, better UX, and a consistent brand across our software. Everyone benefited. We built the company’s distributed document analysis and conversion system, making all the company’s specialized tools available to every team that needed them. We built the company’s corporate SSO server, federating authentication to clients that supported it, and onboarding onto this one platform almost every internal and external application our employees use or we offer to our clients. Because we respected standards, ran training, and published documentation, applications were developing the capability to integrate directly with each other without their teams even realizing it.
With all this, though, and our team growing from 3 people to 30+, we were starting to experience growing pains of our own. We weren’t really created to be a software development team, but we had built software that was now in use by tens of thousands of people and had become critical to the business. And we were still developing new solutions.
This, I think, was the reason we were shielded from what was going on between our CEOs. I think they made the right call on that, and I’m grateful to them for it. Our CEOs had been in litigation against each other for a couple of years by that day, with hundreds of millions of dollars spent on the lawsuit. With that constant stress, I can only imagine what my colleagues were going through. After our CTO’s and CIO’s departures, other people in upper management started leaving. Over the next few months we lost most of our VPs, directors, and managers. Three levels of management just got up and left, and the exodus concluded with my boss and the other software manager on my team leaving.
I found myself with all engineering on our team reporting to me, supporting all the tools we had ever built, and because our CEOs were deadlocked, I basically found myself... in charge. I didn’t have anyone to report to. Technically, we could do whatever we wanted, and no one would really have time to care. It was an interesting position: on one hand we had that freedom, but on the other hand we had all the company’s customers to worry about, and if we needed to make any major purchase, there was no one to approve it.
Before this role, I used to run a software company. I had recently studied Systems Engineering in a 6-month course at MIT, which basically teaches architectural frameworks you can use to efficiently run teams of thousands of people building a massively complex project like an airplane. I always loved security, and I was training all our engineers on OWASP, which originally started as just showing junior engineers why certain development practices are bad. I was also a certified Scrum Master. And, well, a workaholic. Until then, I don’t think anyone in the company even had a count of how many software engineers worked there. But all I saw was an opportunity to change every little thing I didn’t like in the company, with no one to get in my way. This wasn’t something I could achieve by coding. I loved coding everything myself, but for this I needed an army, and that meant raising every single person I could find up to know what I knew. I started with Scrum training: I invited my team and our accounting software team, and opened it up to any engineer or tech manager in the company. Then we started showing teams the different automation we had, and opened our project management system up to anyone who wanted it. At the time most teams were using a really old and slow version of Jira, while we had the latest and greatest Team Foundation Server, with build pipelines, quality control, and all sorts of customization and automation that I had set up. I also started training anyone who was curious on how TFS worked and how they could use it on their own team. I was having the time of my life.
Next, one thing I really wanted was a way to get code quality up throughout the company. We had been using SonarQube for a few months, and we loved it. There had been a big fear in the company about imposing any type of rules about coding, because no one wants to make engineers upset, and telling engineers how to write code is at the top of the “don’t do” list. But to me this was different. Imposing spaces over tabs or C# over Java can literally cause wars to start, but within each domain there are still best practices, quality can be defined for each, and I think SonarQube nails it. So I email our corporate QA team with a plan to try this. We email every team we know of and ask them to send 1-2 senior engineers to participate in a 2-hour call to see if we can all agree on a code quality standard for the company, and to invite any other teams they might know. To this day, I think this is still the best call I ever had. We had code veterans and new teams; C#, Java, JavaScript, PHP teams and more; teams just starting out with proof-of-concept tools and teams with software that had been in development for the previous 10 years. All of them were simply agreeing with each other, raising problems they foresaw, and brainstorming solutions to those problems together. By the time the call was over, we had done it. We had a Code Quality SOP verbally agreed to by every team, and shortly after that we had a written SonarQube SOP, which was now just a standard part of the process for every team. I talked more about this in a separate post here: https://scatteredcode.net/code-quality-using-sonarqube/ and about how we actually set it up here: https://scatteredcode.net/installing-and-configuring-sonarqube-with-azure-devops-tfs/
Then I opened our OWASP resources to everyone, and decided to run the training on Webex. I had probably trained around 60 engineers by then on security threats and best practices, but I thought I could reach more. I got in touch with our corporate QA team again. They were also the go-to for audits, so I knew they understood the benefit, and they started putting me in touch with other managers who were interested. And suddenly I see the invite list at over 450 people. I started to worry whether our conferencing software could even handle that, and I now had to figure that out. Luckily, I found out that our HR team had done the research, and we could accommodate 800+. About 150 engineers and managers showed up, and it was great. We had people asking questions and being engaged, and I received a lot of follow-ups with questions and new scenarios teams had come up with, which meant they got something out of it, so I was thrilled. I then realized we actually had one of the best tech cultures of any company I knew of.
In different parts of the company, similarly sized initiatives were taking place, and we were invited to those as well. We had teams demoing new tools, building new systems, and advancing their technology, because now they had the contacts and motivation to make it happen. On the IT side, we were getting new hardware, and new security processes were being put in place. We were now looking back on previous audits, not to make a good impression or pass them, but to take everything we had heard our customers asking about and make it known to teams, which in turn shared their own experiences and solutions. Our new group even came up with a new corporate SDLC, a Secure Development SDLC, and the Code Quality SOP I mentioned before. We even passed new security and access control validation procedures, and came up with really creative ways to make the process airtight and painless for teams, to the point where we were talking about encryption key storage in FIPS 140-2 Hardware Security Modules and key rotation as routine practice. It was a new era.
We had launched our in-house developed Single Sign-On server around that time as well, and the demand for it was really something. At that year’s company New Year’s party, I probably knew everyone in the room, and the room was bursting with new ideas and technical discussions.
Our team even changed the way it approached software. We had been a task force before: we got a problem and ran with it until we found a solution. Now we started to come together more, and introduced our concept of engineering meetings. These were meetings on any engineering topic, they could be called at any time, and the whole team was welcome. One problem we solved this way was a performance problem in how we sent email notifications. It took around 15 minutes just to come up with the list of people who should get an email, because our rules needed to be dynamic, and the existing solutions we ran across were falling short of our use case. I recall going into this with no one even understanding the whole business logic of how users needed to be selected. We spent the first part on discovery: we interviewed our QA and support teams about what they knew, and worked backward from the code to reconstruct the business logic. When discovery was done, we wrote up our business rules, put them on the board, and scratched our heads over whether an optimization was even possible.
I remember staring at an almost blank whiteboard when our support lead comes and stands next to me, and all I say as I stare at the board is “I think it involves math.” He stays next to me for another couple of minutes and then leaves. For the next day, we just look at the board, have half a thought, say “no, hmm,” and go back to staring. Then it hits us. We finally have an idea, and it’s oh so simple. I’ll go into the details in another post, since this one is getting a bit long. Our solution reduced the time and space complexity to basically nothing. We went from 15 minutes to 3 milliseconds, and it took 2 hours to write the code, maybe around 100 lines, which replaced the previous 10,000 lines of business logic. We suddenly had the biggest optimization we had ever done. The impact was so big that this is now the default way we approach complex problems. We have no fear of putting an impossible problem on the board and just staring at it, hoping that a solution will come to us, and so far it’s worked great.
We were now mature enough to realize that we lacked proper observability. I went into research mode, and found that the best tech for the job was something called the ELK stack from Elastic. It had quite a steep learning curve at the time, but after a lot of long nights reading docs, we had a SIEM. In a matter of weeks, we developed and deployed a central log aggregation and security information and event management system, with unattended anomaly detection based on machine learning, plus Slack and email integration. We were of course very excited to grow this and open it up to the rest of the company. From useful dashboards to alerts, everyone from developers to IT to support had tailored access to the information they needed. In some cases we’ve been able to bring the response time for support questions down to about 1 minute, with an answer detailed enough about what happened to be completely satisfactory.
The list of projects goes on. We started using new technologies like Neo4j, we created our own discovery / gossip protocol for distributed system communication, we came up with multi-datacenter replication and hot-hot strategies, and we even got into fancy technologies like Vectors of Trust and Software Defined Perimeter.
We probably did more in the 8 months after our management left than we could ever have imagined possible in that time frame. I’ve only talked about the things I was directly involved in that came to mind, but so many other teams did the same, and I made a lot of friends.
Finally, we get word that the lawsuit between our CEOs had come to a conclusion and a winner was decided. Our guy had won, and he was a champion to the company. Over the next few months, he personally went out and got almost everyone who had left to come back. We were really excited that our colleagues were returning, but at the same time we were worried that we might have overstepped our authority by basically changing everything we thought needed changing. Those worries were unfounded: they all came back with new energy, rewarded us for what we did, and made a lot more contributions themselves. Still, I don’t think the company would be the market leader it is today if things hadn’t played out in exactly this way.
I’ve been pondering why it happened this way. The people in management are all very smart, and it’s not as if we had come up with ideas that were rejected. We all worked toward the same goal, and ideas do flow freely here. Now that I’ve thought about it for a couple of years, I think people hesitate to make decisions, especially when they respect someone else’s authority, so as not to accidentally make a decision that would upset the other person or ruin some other initiative their boss has. I’m sure this isn’t universally true, but I felt I needed to share this experience, since it had such an impact on the way I see the world today, and maybe this story will help someone going through something similar. I am confident that when I leave, every member of my team will rise as I did.