What tools have you used in your development?
- standard unittest library coming with python
- pycallgraph and then some custom bits of code to analyze the nesting
- coverage
- nose as a test runner
- mox for mocking
- tox (which uses virtualenv) to test and collect metrics from all possible combinations of libraries and python versions
- pep8 and pylint for style and code correctness checking
- lettuce for behaviour-driven development
- distribute and/or pip to build and install code (check this)
- Jenkins for Continuous Integration (I used to use buildbot; both are fine tools)
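As a rough illustration of how a test run can double as a source of metrics, here is a minimal sketch using only the standard unittest library. It is not the setup from the talk: `ExampleTest` and `run_and_measure` are hypothetical names made up for this example, and the metrics chosen (test count, failures, errors, duration) are just an assumption about what might be worth tracking.

```python
import time
import unittest


class ExampleTest(unittest.TestCase):
    # Hypothetical test case standing in for a real suite.
    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_and_measure(test_case):
    """Run one TestCase class and return a few simple metrics as a dict."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    started = time.perf_counter()
    suite.run(result)
    elapsed = time.perf_counter() - started
    return {
        "tests_run": result.testsRun,
        "failures": len(result.failures),
        "errors": len(result.errors),
        "duration_seconds": elapsed,
    }


metrics = run_and_measure(ExampleTest)
```

The resulting dict could then be pushed to whatever metrics store is in use, once per commit or per build.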
How did you store the metrics? What options are out there?
How can you handle a lot of metrics?
- chained rrdcached with shards: quite new and still experimental, but working and certainly pretty interesting.
- a key-value store, probably redis. This is a somewhat arbitrary recommendation based on my experience with it; most likely mongodb or cassandra would also scale well. I believe twitter has actually released something based on it to store their metrics.
- mysql, most likely with shards, probably on hostname, but I’m not a fan.
- OpenTSDB by StumbleUpon, an interesting time series database built on top of HBase.
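Sharding on hostname, as mentioned in the options above, can be sketched in a few lines. This is only an assumption about how one might do it, not a description of any of the tools listed: the shard names and the `shard_for` helper are hypothetical.

```python
import zlib

# Hypothetical shard names; in practice these would be database hosts or connections.
SHARDS = ["metrics-db-0", "metrics-db-1", "metrics-db-2"]


def shard_for(hostname, shards=SHARDS):
    """Map a hostname to one shard, always the same one for the same hostname."""
    # crc32 gives the same value in every process, unlike the built-in hash() on strings.
    index = zlib.crc32(hostname.encode("utf-8")) % len(shards)
    return shards[index]
```

All metrics for a given host end up on the same shard, so per-host queries touch a single store. The trade-off with a plain modulo is that changing the number of shards moves most hosts to a different shard.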
Developers don’t want to be ops.
This was more of a statement than a question from someone in the audience. He commented that, while he appreciated the goal, he didn’t see it as feasible because it required one group to acquire competencies it has no interest in, i.e. developers don’t want to know about operational details.
But there is no such requirement. A lot of value can be gained just by listening and contributing from one’s own realm. The work that devs and ops do should be considered not just on its own terms, but also in terms of the value it generates for the other group. To follow on from the talk, developers should not be required to understand how monitoring systems work or how to collect metrics from their test runs; they should get that as a service provided by operations. That becomes a point of contact from which discussions can start to improve the application and to collaborate further on that project and others. In turn, developers can better understand the needs of a production application and enhance the code to expose self-tests and metrics via some kind of API that operations can tie into a monitoring system.
Testing becomes a point of contact where the two groups meet and have a conversation in a common language.
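To make the idea of code exposing self-tests and metrics a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `status_report` function, the `config_loaded` check and the counter names are hypothetical, and the JSON layout is not any monitoring system’s actual format.

```python
import json
import time

_STARTED = time.time()
_COUNTERS = {"requests_served": 0}  # hypothetical application counter


def check_config_loaded():
    # Hypothetical self-test; a real one would inspect application state.
    return True


SELFTESTS = {"config_loaded": check_config_loaded}


def status_report():
    """Run every self-test and return results plus metrics as a JSON string."""
    results = {}
    for name, check in SELFTESTS.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A self-test that crashes is reported as failed, not propagated.
            results[name] = False
    return json.dumps({
        "healthy": all(results.values()),
        "selftests": results,
        "metrics": dict(_COUNTERS, uptime_seconds=time.time() - _STARTED),
    })
```

Developers own the checks and counters, while operations only need to poll the report and feed it into their monitoring system, for instance through an HTTP endpoint that returns this string.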
How do you prove the value of metrics like these to management?
This is an evergreen, and for good reasons. Without management buy-in, getting the time to implement proper testing environments is difficult, albeit not impossible. I must say that of all the topics you can bring to a manager for approval, metrics is probably one of the easiest to communicate the value of. That said, implementing something like what has been discussed is not cheap, and it is not an easy task to get approved.
Retro-analysis can be very powerful in this instance. Companies that care tend to have some kind of record of outages and at least a rough estimate of their impact. If the code was tracked in a version control system, you can run a set of tests (think integration tests on virtual systems simulating production) against whichever commit was released at the time of an incident. It is possible that, by graphing and analysing various metrics for a set of commits up to the day of an outage, you could have predicted it. This becomes strong leverage for gaining buy-in for the project.
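A retro-analysis of this kind could start from something as simple as the sketch below, which flags commits whose metric jumps well above the recent baseline. The `flag_regressions` helper, the window size, the threshold and the sample numbers are all made up for illustration; they are not data from the talk.

```python
from statistics import mean


def flag_regressions(history, window=3, threshold=1.5):
    """Return commit ids whose metric exceeds `threshold` times the mean
    of the previous `window` commits.

    `history` is a list of (commit_id, metric_value) pairs in commit order.
    """
    flagged = []
    for i in range(window, len(history)):
        baseline = mean(value for _, value in history[i - window:i])
        commit, value = history[i]
        if baseline > 0 and value > threshold * baseline:
            flagged.append(commit)
    return flagged


# Made-up numbers: e.g. seconds taken by an integration test run per commit.
history = [("c1", 1.0), ("c2", 1.1), ("c3", 0.9), ("c4", 1.0), ("c5", 2.4)]
suspects = flag_regressions(history)
```

If the flagged commits line up with the release that preceded a recorded outage, that is the kind of evidence that can be shown to management.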
Thanks everybody, it was a blast, I loved FOSDEM!
4 Responses to Slides and notes from my FOSDEM talk – I’m going M.A.D. – monitoring aided development
About Me
Hi, my name is Spike Morelli and this is my thinking lab. Over a 13-year career in the tech industry I've been a developer, a system engineer, a devops person, a manager and a startup owner. I've taken the best from each experience and brought it into the next, innovating and focusing on delivering value. I have a passion for sociology and communication, but above all I care about making people happy: it's incredibly rewarding, and happy folks do the best work.
Most of us wouldn't have done what we have done if we didn't have people around us to learn from. Their experiences are what helped us grow, their passion our fuel. If that's also your experience, let's make that circle bigger: reach out to me at fsm@spikelab.org or on twitter





[...] This post was mentioned on Twitter by jtimberman and patrickdebois, Spike Morelli. Spike Morelli said: http://bit.ly/gjKofs slides&notes from my #fosdem talk on monitoring aided dev, trust your code and your ppl #devops [...]
Just watched your talk at http://video.fosdem.org/2011/ -precisely http://video.fosdem.org/2011/maintracks/mad.xvid.avi
It was inspiring, thank you.
Hi Feth,
thank you very much for your comment. Feel free to reach out if you’d like to chat about it, I’d love to hear from people applying metrics to their workflow.
Excellent Spike
Love the anecdotes and I especially feel it for TDD.