mgmt/docs/puppet-guide.md
James Shubin 96dccca475 lang: Add module imports and more
2018-12-21 06:22:12 -05:00

Puppet guide

mgmt can use Puppet as its source for the configuration graph. This document goes into detail on how this works, and lists some pitfalls and limitations.

For basic instructions on how to use the Puppet support, see the main documentation.

Prerequisites

You need Puppet installed on your system. It is not important how you get it. On the most common Linux distributions, you can use packages from the OS maintainer, or upstream Puppet repositories. An alternative that will also work on OSX is the puppet Ruby gem. The gem also has the advantage that you can install any desired version in your home directory or any other location.

Any release of Puppet's 3.x and 4.x series should be suitable for use with mgmt. Most importantly, make sure to install the ffrank-mgmtgraph Puppet module (referred to below as "the translator module").

puppet module install ffrank-mgmtgraph

Please note that the module is not required on your Puppet master (if you use a master/agent setup). It's needed on the machine that runs mgmt. You can install the module on the master anyway, so that it gets distributed to your agents through Puppet's pluginsync mechanism.

Testing the Puppet side

The following command should run successfully and print a YAML hash on your terminal:

puppet mgmtgraph print --code 'file { "/tmp/mgmt-test": ensure => present }'

You can use this CLI to test any manifests before handing them straight to mgmt.
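Before running the test command, it can help to confirm that both prerequisites are in place. The following is a minimal sketch for a POSIX shell. The function name `check_translator` is made up for this example; the sketch relies only on `puppet module list` and on the `puppet mgmtgraph print` command shown above.

```shell
#!/bin/sh
# Sketch: check the prerequisites, then run the smoke test from above.
# check_translator is a hypothetical helper name, not part of mgmt or Puppet.
check_translator() {
	if ! command -v puppet >/dev/null 2>&1; then
		echo "puppet not found in PATH"
		return 1
	fi
	# `puppet module list` prints the modules visible to the current user.
	if ! puppet module list 2>/dev/null | grep -q 'ffrank-mgmtgraph'; then
		echo "translator module missing; run: puppet module install ffrank-mgmtgraph"
		return 1
	fi
	# Should print a YAML hash if everything is set up correctly.
	puppet mgmtgraph print --code 'file { "/tmp/mgmt-test": ensure => present }'
}

if check_translator; then
	status="ok"
else
	status="incomplete"
fi
echo "translator setup: $status"
```

If the last line reports `incomplete`, the message printed before it says which prerequisite is missing.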

Writing a suitable manifest

Unsupported attributes

mgmt inherited its resource model from Puppet, so by and large, it's quite possible to express mgmt graphs in terms of Puppet manifests. However, there isn't (and likely never will be) full feature parity between the respective resource types. As a consequence, a manifest can have semantics that cannot be transferred to mgmt.

For example, at the time of writing this, the file type in mgmt had no notion of permissions (the file mode) yet. This led to the following warning (among others that will be discussed below):

$ puppet mgmtgraph print --code 'file { "/tmp/foo": mode => "0600" }'
Warning: cannot translate: File[/tmp/foo] { mode => "600" } (attribute is ignored)

This is a heads-up for the user, because the resulting mgmt graph will in fact not pass this information to the /tmp/foo file resource, and mgmt will ignore this file's permissions. Including such attributes in manifests that are written expressly for mgmt is not sensible and should be avoided.

Unsupported resources

Puppet has a fairly large number of built-in types, and countless more are available through modules. It's unlikely that all of them will eventually receive native counterparts in mgmt.

When encountering an unknown resource, the translator module will replace it with an exec resource in its output. This resource will run the equivalent of a puppet resource command to make Puppet apply the original resource itself. Performance is quite abysmal, because processing such a resource requires forking at least one Puppet process (two if the resource is found to be out of sync). On most systems, starting up any Puppet command takes several seconds. Compared to the split second that the actual work usually takes, this overhead can amount to several orders of magnitude.

Avoid Puppet types that mgmt does not implement (yet).

Avoiding common warnings

Many resource parameters in Puppet take default values. For the most part, the translator module just ignores them. However, there are cases in which Puppet will default to convenient behavior that mgmt cannot quite replicate. For example, translating a plain file resource will lead to a warning message:

$ puppet mgmtgraph print --code 'file { "/tmp/mgmt-test": }'
Warning: File[/tmp/mgmt-test] uses the 'puppet' file bucket, which mgmt cannot do. There will be no backup copies!

The reason is that by default, Puppet assumes the following parameter value (among others):

file { "/tmp/mgmt-test":
	backup => 'puppet',
}

To avoid this, specify the parameter explicitly:

puppet mgmtgraph print --code 'file { "/tmp/mgmt-test": backup => false }'

This is tedious in a more complex manifest. A good simplification is to place the following resource default anywhere in the top scope of your manifest:

File { backup => false }

If you encounter similar warnings from other types and/or parameters, use the same approach to silence them if possible.
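As an illustration of the resource default described above, here is a minimal sketch that writes a small manifest and translates it. The second file path and the use of a temporary file are assumptions made for this example; only the `--code` flag shown earlier in this guide is used.

```shell
#!/bin/sh
# Sketch: a manifest that sets the resource default once at top scope.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
# Applies to every file resource in this manifest.
File { backup => false }

file { "/tmp/mgmt-test": ensure => present }
file { "/tmp/mgmt-test2": ensure => present }
EOF

# Translate it if Puppet is available. With the default in place, neither
# file resource should trigger the file bucket warning.
if command -v puppet >/dev/null 2>&1; then
	puppet mgmtgraph print --code "$(cat "$manifest")" || echo "translation failed"
else
	echo "puppet not installed; manifest written to $manifest"
fi
```

The default is declared once, so new file resources added later do not need their own `backup => false`.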

Configuring Puppet

Since mgmt uses an actual Puppet CLI behind the scenes, you might need to tweak some of Puppet's runtime options in order to make it do what you want. Possible reasons include:

  • You use the --puppet agent variant and need to configure servername, certname and other master/agent-related options.
  • You don't want runtime information to end up in the vardir that is used by your regular puppet agent.
  • You install specific Puppet modules for mgmt in a non-standard location.

Rather than exposing each Puppet option individually, mgmt offers a single option, --puppet-conf, that lets you control all of them. It specifies which puppet.conf file should be used during translation.

mgmt run puppet --puppet /opt/my-manifest.pp --puppet-conf /etc/mgmt/puppet.conf

Within this file, you can just specify any needed options in the [main] section:

[main]
server=mgmt-master.example.net
vardir=/var/lib/mgmt/puppet
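The steps above can be sketched as a short script. This is an illustration only: a temporary directory stands in for /etc/mgmt so that the script runs without root privileges, and the optional check assumes that `puppet config print` honors the `--config` setting in the same way other Puppet subcommands do.

```shell
#!/bin/sh
# Sketch: create a dedicated puppet.conf for mgmt.
# A temporary directory stands in for /etc/mgmt (assumption for this example).
confdir="$(mktemp -d)"
conf="$confdir/puppet.conf"
cat > "$conf" <<'EOF'
[main]
server=mgmt-master.example.net
vardir=/var/lib/mgmt/puppet
EOF

# Optional check: ask Puppet which vardir it resolves from this file.
if command -v puppet >/dev/null 2>&1; then
	puppet config print vardir --config "$conf" || echo "could not read $conf"
fi

# The mgmt invocation from above, pointed at the new file (printed, not run).
echo "mgmt run puppet --puppet /opt/my-manifest.pp --puppet-conf $conf"
```

Keeping the vardir separate from the one your regular puppet agent uses avoids mixing runtime information from the two.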

Caveats

Please see the README of the translator module for the current state of supported and unsupported language features.

Make sure to always use the latest release of both ffrank-mgmtgraph and ffrank-yamlresource (the latter is pulled in as a dependency of the former).

Using Puppet in conjunction with the mcl lang

The graph that Puppet generates for mgmt can be united with a graph that is created from native mgmt code in its mcl language. This is useful when you are in the process of replacing Puppet with mgmt. You can translate your custom modules into mgmt's language one by one, and let mgmt run the current mix.

Instead of the usual --puppet, --puppet-conf, and --lang for mcl, you need to use alternative flags to make this work:

  • --lp-lang to specify the mcl input
  • --lp-puppet to specify the puppet input
  • --lp-puppet-conf to point to the optional puppet.conf file

mgmt will derive a graph that contains all edges and vertices from both inputs. You essentially get two unrelated subgraphs that run in parallel. To form edges between these subgraphs, you have to define special vertices that will be merged. This works through a hard-coded naming scheme.

Mixed graph example 1 - No merges

# lang
file "/tmp/mgmt_dir/" { state => "present" }
file "/tmp/mgmt_dir/a" { state => "present" }

# puppet
file { "/tmp/puppet_dir": ensure => "directory" }
file { "/tmp/puppet_dir/a": ensure => "file" }

These very simple inputs (including implicit edges from each directory to its file) result in two unrelated subgraphs.

File[/tmp/mgmt_dir/] -> File[/tmp/mgmt_dir/a]

File[/tmp/puppet_dir] -> File[/tmp/puppet_dir/a]

Mixed graph example 2 - Merged vertex

In order to have merged vertices in the resulting graph, you will need to include special resources and classes in the respective input code.

  • On the lang side, add noop resources with names starting with puppet_.
  • On the Puppet side, add empty classes with names starting with mgmt_.

# lang
noop "puppet_handover_to_mgmt" {}
file "/tmp/mgmt_dir/" { state => "present" }
file "/tmp/mgmt_dir/a" { state => "present" }

Noop["puppet_handover_to_mgmt"] -> File["/tmp/mgmt_dir/"]

# puppet
class mgmt_handover_to_mgmt {}
include mgmt_handover_to_mgmt

file { "/tmp/puppet_dir": ensure => "directory" }
file { "/tmp/puppet_dir/a": ensure => "file" }

File["/tmp/puppet_dir/a"] -> Class["mgmt_handover_to_mgmt"]

The new noop resource is merged with the new class, resulting in the following graph:

File[/tmp/puppet_dir] -> File[/tmp/puppet_dir/a]
				|
				V
		Noop[handover_to_mgmt]
			|
			V
	File[/tmp/mgmt_dir/] -> File[/tmp/mgmt_dir/a]

With this merged vertex in place, all resources from the Puppet input run before those from the mcl input.

Note: The names of the noop and the class must be identical after the respective prefix. The common part (here, handover_to_mgmt) becomes the name of the merged resource.
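The naming rule in the note can be written down as a small shell function. This is a sketch of the rule only, not mgmt's actual merge code, and `merged_name` is a hypothetical helper invented for this example.

```shell
#!/bin/sh
# Sketch of the naming rule only; this is not mgmt's actual merge code.
# merged_name is a hypothetical helper: it prints the name of the merged
# vertex, or fails if the two names do not form a merge pair.
merged_name() {
	noop="$1"   # name of the noop resource on the lang side
	class="$2"  # name of the empty class on the Puppet side
	# Both names must carry their prefix plus at least one more character.
	case "$noop" in puppet_?*) ;; *) return 1 ;; esac
	case "$class" in mgmt_?*) ;; *) return 1 ;; esac
	noop_rest="${noop#puppet_}"
	class_rest="${class#mgmt_}"
	# The parts after the prefixes must be identical.
	[ "$noop_rest" = "$class_rest" ] || return 1
	echo "$noop_rest"
}

merged_name "puppet_handover_to_mgmt" "mgmt_handover_to_mgmt"     # prints handover_to_mgmt
merged_name "puppet_handover" "mgmt_handback" || echo "no merge"  # prints no merge
```

The second call fails because `handover` and `handback` differ, so those two vertices would stay separate.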

Mixed graph example 3 - Multiple merges

In most scenarios, it will not be possible to define a single handover point as in the previous example. For example, if some Puppet resources need to run in between two stages of native resources, you need at least two merged vertices:

# lang
noop "puppet_handover" {}
noop "puppet_handback" {}
file "/tmp/mgmt_dir/" { state => "present" }
file "/tmp/mgmt_dir/a" { state => "present" }
file "/tmp/mgmt_dir/puppet_subtree/state-file" { state => "present" }

File["/tmp/mgmt_dir/"] -> Noop["puppet_handover"]
Noop["puppet_handback"] -> File["/tmp/mgmt_dir/puppet_subtree/state-file"]

# puppet
class mgmt_handover {}
class mgmt_handback {}

include mgmt_handover, mgmt_handback

class important_stuff {
	file { "/tmp/mgmt_dir/puppet_subtree":
		ensure => "directory"
	}
	# ...
}

Class["mgmt_handover"] -> Class["important_stuff"] -> Class["mgmt_handback"]

The resulting graph looks roughly like this:

File[/tmp/mgmt_dir/] -> File[/tmp/mgmt_dir/a]
	|
	V
Noop[handover] -> ( class important_stuff resources )
			|
			V
		Noop[handback]
			|
			V
File[/tmp/mgmt_dir/puppet_subtree/state-file]

You can add arbitrary numbers of merge pairs to your code bases, with relationships as needed. From our limited experience, however, code readability suffers quite a lot from these. We advise keeping these structures simple.
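With several merge pairs, it is easy to declare one half of a pair and forget the other. The following sketch lists merge-point names that appear on only one side. It is not an mgmt feature: `unpaired_merges` is a hypothetical helper, and it assumes one declaration per line, written as in the examples above.

```shell
#!/bin/sh
# Sketch: list merge-point names that appear on only one side.
# unpaired_merges is a hypothetical helper, not an mgmt command.
unpaired_merges() {
	mcl_file="$1"
	pp_file="$2"
	tmp="$(mktemp -d)"
	# noop "puppet_<name>" {}  ->  <name>
	sed -n 's/^[[:space:]]*noop "puppet_\([^"]*\)".*/\1/p' "$mcl_file" | sort -u > "$tmp/lang"
	# class mgmt_<name> {}  ->  <name>
	sed -n 's/^[[:space:]]*class mgmt_\([A-Za-z0-9_]*\).*/\1/p' "$pp_file" | sort -u > "$tmp/puppet"
	# comm -3 drops the names common to both files and keeps the rest;
	# tr removes the tab that comm uses to indent its second column.
	comm -3 "$tmp/lang" "$tmp/puppet" | tr -d '\t'
	rm -rf "$tmp"
}

# Demo inputs: the Puppet side is missing the mgmt_handback class.
dir="$(mktemp -d)"
cat > "$dir/code.mcl" <<'EOF'
noop "puppet_handover" {}
noop "puppet_handback" {}
EOF
cat > "$dir/manifest.pp" <<'EOF'
class mgmt_handover {}
include mgmt_handover
EOF

unpaired="$(unpaired_merges "$dir/code.mcl" "$dir/manifest.pp")"
echo "unpaired: $unpaired"  # prints "unpaired: handback"
```

An empty result means every noop has a matching class and vice versa.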