At Austin DevOps the other night, Charity Majors, founder of Honeycomb, stated several times, “Your Engineering time is the most scarce and the most precious resource of your company.” She further stated that solving the same problem more than once is wasteful.
These statements seem obvious on their face, but how often do decision makers consider the value of an engineer’s time? How often do we consider the opportunity cost in using engineers’ time on problems for which solutions already exist? Do we look for opportunities to remove this waste from the system?
Words matter because time is money. I have kept time in meetings to see how long we spend simply defining terms. Here is an example:
Engineer A: “We will need a bigger server for that.” Engineer B: “But we are not CPU bound, we don’t need a bigger server.” Engineer A: “Not the web server, the database server.” Engineer B: “Ah, that makes sense. Do you think it’s the physical server or the virtual private server?” Engineer A: “Physical, we will have to rack and cable a newer, bigger one.”
Granted, the above is a contrived exchange, but it shows the word server in four different contexts and two distinct meanings. In that meeting you paid for the definition. Worse, your engineers are doing this all the time!
I have kept time in meetings on these “definition activities.” In general we lose anywhere from 10% to 25% of our time to defining terms and retracing our steps due to this confusion. It provides no value. Worse, it will happen in the next meeting around many of the same terms!
Let’s take a look at this problem in the real world.
The Language of Logging
The language of logging, like most technology verticals, is rife with overloaded terms. During Charity’s thought-provoking presentation, however, I was constantly parsing the various meanings of terms. This is known as cognitive interference. I would much rather have been focused on her main points.
Here are a couple of examples.
Event vs. Entry
Charity tossed out the question to the crowd, “What do I mean by an event?” It was clear from the question and context she had a specific meaning in mind.
From the crowd: “An event is something like an outage that would cause me to go look at the logs.” In the context of logging, this is a reasonable meaning for the term, “event.”
Charity, an expert in logging, stopped and considered the answer. It was clear the answer was not what she was looking for. Another person chimed in, “It’s more general than that. It could be anything really. Like a file upload or a write to the database.” An excellent generalization, but it was clear this was not what Charity was looking for either.
To Charity, an event is a single entry in a log file. For most of our careers this was likely a single line in a text file. Some vendors in the space, such as Splunk and Sumo Logic, use this terminology.
However, I would like to offer the following context. When William Shatner said, “Captain’s log, stardate ….” he was not recording an event. He was making an entry. That log entry was composed of a series of events recorded for a given time period.
Honeycomb looks for key-value input and shreds it to a proprietary store so searches on the keys are lightning fast. It is each of these JSON documents that Honeycomb defines as an event. This is a new way to think about the term as well.
Of the 75-minute presentation, this exchange took about 5 minutes. That is 5 minutes spent defining and agreeing on the meaning of the word event. If we made that investment only once, that would be fine. However, I knew walking out of the venue that night that the next time logging came up at Austin DevOps, we would be spending another 5 minutes agreeing on the same terminology.
Picture the same 5 minutes in a hypothetical meeting room with six people, each fully loaded at $100 per hour: your company just spent $50 to define a single term. If this happens 4 times per day, you are paying $200 per day just to define terms, knowing you will have to spend another $200 on those same definitions in the future. Worse, the definition might even change the next time.
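The arithmetic behind those figures can be sketched in a few lines of Python. The function name and parameters are my own, chosen for illustration; plug in your own head count, loaded rate and minutes.

```python
def definition_cost(people: int, hourly_rate: float, minutes: float) -> float:
    """Cost of one discussion: people x loaded hourly rate x hours spent."""
    return people * hourly_rate * minutes / 60


# Figures from the example above: six people, $100 per hour, 5 minutes.
per_discussion = definition_cost(people=6, hourly_rate=100, minutes=5)

# Four such discussions per day.
per_day = per_discussion * 4

print(f"per discussion: ${per_discussion:.2f}")
print(f"per day:        ${per_day:.2f}")
```

With the numbers above this prints $50.00 per discussion and $200.00 per day.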
Structured vs. Unstructured
Charity explained that Honeycomb uses unstructured data. She suggested that applications should emit JSON in key-value pairs rather than raw text lines. She suggested that software engineers use linting tools to ensure it’s good JSON.
This seems oxymoronic. The very fact that Honeycomb uses key-value pairs implies a structure. The fact that the log entry (event?) can be linted means there is predictable regularity (structure).
When asked what a linter might do, Charity suggested it could ensure the presence of keys and even perform light type checking. This sounded very similar to an XML DTD (Document Type Definition). In other words, it was structured.
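To make the idea concrete, here is a minimal sketch of what such a linter might look like, using only Python's standard json module. It is not Honeycomb's tooling or API. The key names (timestamp, service, duration_ms) and the sample log lines are hypothetical, invented purely for illustration.

```python
import json

# Hypothetical rules: keys every event must carry, and the types allowed for each.
REQUIRED_KEYS = {
    "timestamp": str,
    "service": str,
    "duration_ms": (int, float),
}


def lint_event(line: str) -> list[str]:
    """Return the problems found in one JSON log line (empty list if clean)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["event is not a JSON object"]

    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in event:
            problems.append(f"missing key: {key}")          # presence check
        elif not isinstance(event[key], expected):
            actual = type(event[key]).__name__
            problems.append(f"wrong type for {key}: {actual}")  # light type check
    return problems


# Made-up sample lines, one clean and one with two problems.
good = '{"timestamp": "2018-03-01T12:00:00Z", "service": "web", "duration_ms": 12.5}'
bad = '{"timestamp": "2018-03-01T12:00:00Z", "duration_ms": "slow"}'

print(lint_event(good))
print(lint_event(bad))
```

Note that the sketch only checks the keys it knows about and leaves any additional keys alone, so an event can still carry as much extra metadata as you like.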
At an after-presentation social, I asked people if they were confused by this. Many were, and some realized that Charity was using the word “structure” in a different context. She had explained this by pointing out that Splunk limits the number of usable keys, and Honeycomb does not. To her, the term structure meant the artificial limit placed on the important metadata. A valid meaning for the term.
I am not saying either definition of structure is correct or canonical. Charity is an expert and I am not. I am pointing out that this very confusion made it difficult for Charity to make her point: Honeycomb gives you an immense power not found in other log aggregation systems.
I estimate this topic was covered three different times for about 10 minutes total. How much would it cost in your organization to define the word structure?
Give Your Engineers Their Time Back
We cannot control people outside our organization, but we can limit the problem locally. As a leader in your company, pay attention to the act of clarifying terms during meetings.
- Keep a small notebook with the terms, their definitions and the amount of time spent coming to agreement.
- Watch for confusion and miscommunication caused by these overloaded terms.
- After a month, ask yourself, “Is it worth solving this problem?”
It is our experience at Victory CTO that the answer is, “Yes.”
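If the notebook ends up in a spreadsheet or a script, the month-end tally might look like the sketch below. The entries, the tuple layout and the $100 rate are all hypothetical; substitute what you actually recorded.

```python
from collections import defaultdict

# Hypothetical notebook entries: (term, minutes spent agreeing, people in the room).
entries = [
    ("server", 4, 6),
    ("event", 5, 6),
    ("structure", 10, 6),
    ("event", 5, 4),
]

HOURLY_RATE = 100  # assumed fully loaded rate in dollars per person-hour

# Sum the cost of every definition discussion, grouped by term.
cost_by_term = defaultdict(float)
for term, minutes, people in entries:
    cost_by_term[term] += people * HOURLY_RATE * minutes / 60

# Most expensive terms first: these are the ones worth standardizing.
for term, cost in sorted(cost_by_term.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term:<10} ${cost:.2f}")
```

Sorting by cost rather than by count keeps the focus on the terms that consume the most paid time, which may not be the ones that come up most often.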
What can you do to recover your engineers’ time? Ultimately, the solution is to standardize on a meaning for a given term. Your architecture team is the source of many standards. Standard usage of terms should be added to their duties. Typically they will:
- Document the agreed-upon meaning of a term
- Broadcast that meaning in documentation and meetings
- Enforce the meaning of the term
- Recruit engineers to reinforce the usage of a term
At Victory CTO we recognize that leadership isn’t only about vision. Leadership is also about identifying and removing obstacles for your teams. Taking the time to identify, measure and remove the right obstacles allows your teams to spend your most scarce and precious resource on the most important problems.