LISA’11: Realworld Configuration Management


Back to LISA conference again! Whoop! It’s Sunday and my favourite session on Config Management (of course).

This session is in workshop format, which is basically a load of people (around 40 this time, fewer than last year) sitting around a table with a few loose topics to start us off. The workshop is led by Kent Skaar (or Skaar Skaar Skaar) from VMware, with Cory Lueninghoener from Los Alamos National Lab and Narayan Desai from Argonne National Lab.

We were asked to give a little intro, stating our interests. Mine consisted of collaboration work (disparate groups using a core/base repo), tools to manage the tools (automating the writing of the spec), resource discovery, namespace management and integration with other tools. I described how our environment is a mix of tools built for Windows, HPC, wireless/residential and a mixture out in the Zones.

Points raised by others in the room around the table as areas of interest:

  • recognising non-existence of crystalline perfection.
  • rpath for package management at AMD (Mike seemed to have some similar problems and solutions to us.)
  • tool migration: many are looking for new tools (move from cf2->3 or puppet) (move from cf2->bcfg2 whoop!)
  • collaboration seems to be a common problem (differing systems, techniques, requirements).

Components recognised that can be considered as separate or independent configuration management systems:

  1. Package management (at OS level)
  2. HPC or configuration file management
  3. Application layer deployment

Mike (from AMD) described how they are considering a move to cf3. With cf2 they had taken an image, then added to it and distilled it down to what they wanted. This didn’t work so well for them, so they are splitting the work: component 1 (above) is managed by rPath, component 2 by cf3 and component 3 by a release manager. Their environment is similar to ours, but they have a more solid process for rolling out software patches via dev->qa->prod on a weekly rotation.

He has a general desire to make things modular. They want to allow users to cherry-pick configuration classes and software groups without needing to contact the admin, which is along the same lines as what I was thinking this morning. You have a catalogue which states what the user can select (groups/packages) whenever they want them. In Bcfg2 this could be as simple as a file on the host, listing the classes that you want installed, that is slurped in by a probe. Replacing this file on rebuild would need to be part of the build process (on the release management side) and not part of config management (for us at least).
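A minimal sketch of what such a probe could look like, written in Python. It assumes the usual Bcfg2 probe convention of printing `group:<name>` lines to assign groups; the file path and the catalogue contents are hypothetical and only there for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a Bcfg2-style probe: turn a local wish list into group lines."""
import os

# Hypothetical path and catalogue; neither was named in the workshop.
WISHLIST = "/etc/site/selected-classes"
CATALOGUE = {"dev-tools", "latex", "matlab"}


def selected_groups(text, catalogue):
    """Return the requested class names that the catalogue allows, in order."""
    groups = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if name and name in catalogue and name not in groups:
            groups.append(name)
    return groups


def probe_output(text, catalogue):
    """Format one 'group:<name>' line per selected class."""
    return "\n".join("group:" + g for g in selected_groups(text, catalogue))


if __name__ == "__main__":
    # On a host without a wish list the probe simply prints nothing.
    if os.path.exists(WISHLIST):
        with open(WISHLIST) as handle:
            output = probe_output(handle.read(), CATALOGUE)
        if output:
            print(output)
```

Filtering against the catalogue is the point of the design: users can pick whatever the admins have published, but a typo or an unlisted class is silently ignored rather than handed to the server.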

A common solution to collaborative work is to use version control with a code review tool, e.g. Git and Gerrit. This was mentioned last year too, I think.

Orchestration is still a common problem: the ordering of changes and releases. Mike said they pull stages of a release into the live repo rather than the whole code revision repo in one go. Cisco Orchestrator was mentioned as a possible solution for this in the future. There was a discussion about having the orchestrator as a separate entity with a single management viewpoint, which was described as “encoding business logic” in a single unique stream. Too many cooks spoil the broth: configuration management, build and deployment tools can be many and varied. Bcfg2 had a pilot run at creating an orchestration tool in its early stages. It needed to “map the state machine”, then build it, get feedback on atomic operations and react to them. It’s really costly to do this, and often cheaper to orchestrate changes manually in the real world. The solution, it seems, is adding layers of abstraction to the model: one to probe the state of a machine, another to overlay a view of the infrastructure, and another again for the business model. This is similar to something Nikki Rogers (our Enterprise Architect) and I have discussed in the past. The further up the layers of abstraction you go, the closer you get to the business logic. Something the tools are lacking at the moment is the ability to perform atomic operations (if something fails, stop and roll back) or to use feedback and recorded state, utilising those layers of abstraction.
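To make the “if something fails, stop and roll back” idea concrete, here is a small Python sketch. It is not how any of the tools discussed work; the `apply_atomically` helper and the step names are invented for illustration, and it assumes every step comes with an undo action that cannot itself fail.

```python
def apply_atomically(steps):
    """Run (name, do, undo) steps in order.

    On the first failure, stop and undo the completed steps in reverse
    order. Returns (ok, log) where log records what happened.
    """
    done = []
    log = []
    for name, do, undo in steps:
        try:
            do()
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            for prev_name, prev_undo in reversed(done):
                prev_undo()
                log.append(f"rolled back {prev_name}")
            return False, log
        done.append((name, undo))
        log.append(f"applied {name}")
    return True, log


# Toy usage: the second step fails, so the first one is undone.
state = {"pkg": "1.0"}


def upgrade():
    state["pkg"] = "2.0"


def downgrade():
    state["pkg"] = "1.0"


def broken_config():
    raise RuntimeError("disk full")


ok, log = apply_atomically([
    ("upgrade", upgrade, downgrade),
    ("config", broken_config, lambda: None),
])
```

The log doubles as the “feedback and recorded state” part: a higher layer could read it to decide whether to retry, alert or carry on.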

After break…

Tip: If a tool doesn’t do the job, document what it should do. If your asset db isn’t good, make the documentation generate the spec (or vice versa).

Tool Poll

Number | Tool | Comments
4 | Home grown |
0 | Chef | Has a DSL but allows free-form Ruby; more explicit in ordering.
6 | Cf2 | Most are intending to move away (those that are not are cowards!)
1 | Cf3 | Only used by the Cfe guys! (Side discussion of convergence model*.) KM (knowledge map) and spec are explicitly related.
9 | Puppet | Most popular one right now; most were happy with using it, but there were some grumbles when it was mentioned.
4 | Bcfg2 | Some organisational political problems motivated some of the design of Bcfg2.

* the choice of models is:

  • golden (i.e. reinstall every time)
  • incremental (change aspects when needed)
  • convergence (steps of fixed point promises)
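The convergence model is the least obvious of the three, so here is a toy Python sketch of it: idempotent “promises” are applied repeatedly until a full pass changes nothing, i.e. a fixed point is reached. The `converge` function and the two NTP promises are made up for illustration and are not taken from Cfengine itself.

```python
def converge(state, promises, max_passes=10):
    """Apply promises repeatedly until a full pass changes nothing.

    Each promise takes a state dict and returns a (possibly new) state.
    Returns (final_state, passes_needed).
    """
    for passes in range(1, max_passes + 1):
        changed = False
        for promise in promises:
            new_state = promise(state)
            if new_state != state:
                state = new_state
                changed = True
        if not changed:
            return state, passes
    raise RuntimeError("no fixed point reached")


def ntp_installed(state):
    """Promise: the ntp package is present."""
    return {**state, "packages": state["packages"] | {"ntp"}}


def ntp_running(state):
    """Promise: ntpd runs, but it can only be kept once ntp is installed."""
    if "ntp" in state["packages"]:
        return {**state, "services": state["services"] | {"ntpd"}}
    return state


start = {"packages": set(), "services": set()}
# Deliberately listed in the "wrong" order: repetition sorts it out.
final, passes = converge(start, [ntp_running, ntp_installed])
```

Because the promises are listed in the wrong order, the first pass only installs the package, the second starts the service, and a third pass confirms nothing is left to do. That is the trade against explicit ordering: no dependency graph to maintain, at the cost of extra passes.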

Cory described how, in their environment, installation is based on a golden image. It boots fast and you can make changes at run time (but they won’t stick), so you apply changes to the master and reboot to make use of them.

After lunch…

Skaar and Mike picked up on the point of “Tools that manage tools” for discussion. A lot of the time it’s necessary to check the scope of a change: what will it affect? Often this is done outside of the main CM tool. There are many tools for testing deployment; some simple guidelines are KISS, be prescriptive and don’t allow for deviation from the norms. Recognise that you can’t engineer something that can only be confirmed via human interaction. If you’re going to write a change without having written a validation check, then you take the flak if it breaks. It was mentioned that rspec can be useful for validation (and that Cucumber is not). It was suggested that you should add a monitor for your monitoring system, or populate a separate source of metadata for monitoring: accidental removal of metadata at the source can be missed when the monitoring is removed along with it.
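The “separate source of metadata” point can be sketched as a simple cross-check in Python. The host names and the `audit_monitoring` function are hypothetical; the only assumption is that you can get one host list from an inventory that is maintained independently of the CM spec and another from the monitoring system.

```python
def audit_monitoring(inventory, monitored):
    """Compare an independent inventory with what monitoring knows about."""
    inventory, monitored = set(inventory), set(monitored)
    return {
        # Hosts that exist but that nobody is watching.
        "missing_from_monitoring": sorted(inventory - monitored),
        # Hosts being watched that the inventory has never heard of.
        "unknown_to_inventory": sorted(monitored - inventory),
    }


# Toy data: db01 was dropped from the CM spec by accident, which
# silently removed its monitoring too. The inventory still lists it.
report = audit_monitoring(
    inventory=["web01", "web02", "db01"],
    monitored=["web01", "web02", "old-test-box"],
)
```

If both lists were generated from the same spec the check would always pass, which is exactly the failure mode described above.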

A big topic for contention, which Skaar was reluctant to go into, is the concept of “editfiles” (a sin?). Augeas, a configuration file API and syntax parser, is an attempt to make this a more stable process. The practice is still identified as risky but, depending on your risk assessment, it could be useful, though you can and probably will shoot yourself in the foot.

Chris St. Pierre expressed that most (if not all) open source cfgmgt tools do not do full asset/CMDB management: they know about host info but they don’t act as a main source for it. Why? After the last break I mentioned using RDF again as a tool to represent the relationships between configuration items. This might need more investigation on my part, but it was believed that this sort of representation can often become very messy and sounds like the nightmare of HP SIM.
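For a feel of what representing relationships between configuration items as triples means, here is a deliberately tiny Python sketch. It is not real RDF (no URIs, no schema; a library such as rdflib would be the proper route), and the hosts and predicate names are invented.

```python
# Each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("web01", "runs", "apache"),
    ("web01", "hostedOn", "esx03"),
    ("db01", "hostedOn", "esx03"),
    ("apache", "dependsOn", "openssl"),
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "What is affected if esx03 goes down?"
on_esx03 = [s for s, _, _ in match(TRIPLES, p="hostedOn", o="esx03")]
```

The appeal is that any question becomes a pattern match; the messiness people warned about shows up once there are hundreds of predicate names and nobody agrees on what they mean.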

Starting with a clean base and introducing tweaks from the norm causes expense in maintenance over time. Try to keep things simple and stick to the designed norm. Having a complete rewrite or migration to a new tool might be helpful. There’s a lot of focus on cloud and virtual environments. It seems that the notion that a developer can build a new environment freely on a cloud solution is a business risk. It’s almost on a parallel with developing on a desktop: once the service needs to go into production, it will be difficult to support. Alva Couch (is here yay!) says that he usually asks for applications to be reimplemented by an engineer before they can go into production.

The topic moved on to what we think the future will be.

  • generating documentation from the spec (or the spec is the documentation).
  • orchestration is the next grail. Cisco Orchestrator: each team can “code” plugins which provide a function, e.g. vm build, apply config.
  • representing /etc as a database: some scoffed at this idea and suggested it is like Plan 9.
  • more layers of abstraction, more dynamic “on the fly” configuration.

My final comments to take back to work:

  • Happy that I’m one of 100% happy customers of bcfg2! Loads of people want to look at it again.
  • Interested in living knowledge in configuration specification driving documentation.
  • Glad to hear the recognition of dynamic and stateful machines. Will work more on this and avoid static configuration where possible.
  • The workshop had a good balance of theory and practical coverage.
  • It’s good to hear that people feel that they’re more in control; in previous years they seemed to be losing the battle.
  • There’s a general desire for simplicity but the future will essentially still be complex.
  • Mentioned Windows System Center (as someone asked about what was going on).