New version of Software Operability book – chapters on core practices, logging & metrics


Published by Matthew Skelton (@matthewpskelton) on May 29, 2018, 7:53 pm

We have published a new version of the book Team Guide to Software Operability by Matthew Skelton and Rob Thatcher with two new chapters: Core Operability Practices and Logging & Metrics. There is also a new Appendix with technical details on logging for different technologies (cloud, Serverless, IoT, etc.). We think that logging and metrics are absolutely fundamental to good software operability, so we are pleased to publish this material early.

The book and a sample chapter are available on LeanPub. All purchases are covered by the LeanPub money-back guarantee.


For more information on the book, head to

Existing subscribers will receive the new version automatically via LeanPub. We use LeanPub for publishing so we can deliver value incrementally as the chapters are written. It allows us to incorporate feedback from readers and change the material before we go to print.

New chapters – May 2018

Core practices for good software operability

2.1 Logging and metrics are the first features to implement
2.2 Use well-defined, meaningful event identifiers
2.3 Include operational hooks as first-class features
2.4 ‘DONE’ means working correctly in Production
2.5 Treat Operations as a high-skill activity
2.6 The software development team writes a draft Run Book
2.7 Avoid a separate ‘Production-ization’ or ‘Hardening’ phase
2.8 Avoid Production-specific tools
2.9 Talk about ‘operational features’, not ‘non-functional requirements’
2.10 Developers and Product Owners should be on-call
2.11 Make operational problems visible
2.12 Test for operability in a deployment pipeline

Use modern log aggregation and metrics for deep operational insights

4.1 Use logging to help design and understand distributed systems
4.2 Collect and aggregate logs and metrics centrally using standard tools & software
4.3 Focus on collaboration, design decisions, and team experience
4.4 Identify 2 or 3 key application metrics and test these early on
4.5 Run log aggregation and metrics locally on development workstations
4.6 Hide sensitive information at the point of logging
4.7 Use Structured Logging for greater meaning
4.8 Use Event IDs for visibility of application behaviour
4.9 Collaborate on Event IDs to enhance operability
4.10 Test your logging and metrics
4.11 Trace operations across system boundaries with correlation IDs
4.12 Adapt your logging and metrics techniques to the technology characteristics
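
To give a flavour of the techniques named in sections 4.6, 4.7, 4.8 and 4.11, here is a minimal Python sketch that combines structured (JSON) logging, event IDs, a correlation ID, and masking of sensitive fields at the point of logging. It is not taken from the book: the names `EventId`, `SENSITIVE_KEYS`, `JsonFormatter`, `build_logger` and `log_event`, the event numbers, and the field names are all hypothetical and chosen only for illustration.

```python
import enum
import json
import logging
import sys
import uuid


class EventId(enum.IntEnum):
    """Hypothetical event identifiers, agreed between Dev and Ops."""
    ORDER_RECEIVED = 1001
    PAYMENT_FAILED = 1002


# Assumed names of fields that must never reach the logs in clear text.
SENSITIVE_KEYS = {"card_number", "password"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (structured logging)."""

    def format(self, record):
        event = record.event_id
        # Mask sensitive values at the point of logging.
        fields = {
            key: "***" if key in SENSITIVE_KEYS else value
            for key, value in record.fields.items()
        }
        return json.dumps({
            "level": record.levelname,
            "event_id": int(event),
            "event_name": event.name,
            "correlation_id": record.correlation_id,
            "message": record.getMessage(),
            **fields,
        })


def build_logger(stream):
    """Create a logger that writes one JSON line per event to `stream`."""
    logger = logging.getLogger("operability-demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_event(logger, event, correlation_id, message, **fields):
    """Log one event with its ID, correlation ID, and extra fields."""
    logger.info(
        message,
        extra={
            "event_id": event,
            "correlation_id": correlation_id,
            "fields": fields,
        },
    )


if __name__ == "__main__":
    demo_logger = build_logger(sys.stdout)
    # One correlation ID would normally be passed across service boundaries.
    request_id = str(uuid.uuid4())
    log_event(demo_logger, EventId.ORDER_RECEIVED, request_id,
              "order received", order_id=42, card_number="4111111111111111")
```

Because each entry is a JSON object with a stable `event_id` and `correlation_id`, a log aggregation tool can filter on a single event type or follow one request across components, and the logging itself can be asserted on in automated tests.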

Buy the book or download a sample

The book and a sample chapter are available on LeanPub. All purchases are covered by the LeanPub money-back guarantee.