The new version of the urfave/cli library is moving to generics, and
it's completely unclear to me why this is an improvement. Their new API
is very complicated to understand, which for me, defeats the purpose of
golang.
In parallel, I needed to do some upcoming cli API refactoring, so this
was a good time to look into new libraries. After a review of the
landscape, I found the alexflint/go-arg library which has a delightfully
elegant API. It does have a few rough edges, but it's otherwise very
usable, and I think it would be straightforward to add features and fix
issues.
Thanks Alex!
Standalone etcd is useful for when we don't want to use the embedded
version to make it easier to deploy somewhere or for testing.
This pulls in about the same amount of code since we already embedded
etcd previously. Since the embedded etcd feature of mgmt is not very
stable, we'll add this for now.
The old system with vendor/ and git submodules worked great,
unfortunately FUD around git submodules seemed to scare people away and
golang moved to a go.mod system that adds a new lock file format instead
of using the built-in git version. It's now almost impossible to use
modern golang without this, so we've switched.
So much for the golang compatibility promise-- turns out it doesn't
apply to the useful parts that I actually care about like this.
Thanks to frebib for his incredibly valuable contributions to this
patch. This snide commit message is mine alone.
This patch also mixes in some changes due to legacy golang as we've also
bumped the minimum version to 1.16 in the docs and tests.
Lastly, we had to disable some tests and fix up a few other misc things
to get this passing. We've definitely hot bugs in the go.mod system, and
our Makefile tries to workaround those.
This is a giant cleanup of the etcd code. The earlier version was
written when I was less experienced with golang.
This is still not perfect, and does contain some races, but at least
it's a decent base to start from. The automatic elastic clustering
should be considered an experimental feature. If you need a more
battle-tested cluster, then you should manage etcd manually and point
mgmt at your existing cluster.
This giant patch makes some much needed improvements to the code base.
* The engine has been rewritten and lives within engine/graph/
* All of the common interfaces and code now live in engine/
* All of the resources are in one package called engine/resources/
* The Res API can use different "traits" from engine/traits/
* The Res API has been simplified to hide many of the old internals
* The Watch & Process loops were previously inverted, but is now fixed
* The likelihood of package cycles has been reduced drastically
* And much, much more...
Unfortunately, some code had to be temporarily removed. The remote code
had to be taken out, as did the prometheus code. We hope to have these
back in new forms as soon as possible.
This splits the recursive watching bit of the file file resource into
it's own package. This also de-duplicates the configwatch code and puts
it into the same package. With these bits refactored, it was also easier
to clean up the error code in main.
This makes this logically more separate! :) As an aside...
I really hate the way golang does dependencies and packages. Yes, some
people insist on nesting their code deep into a $GOPATH, which is fine
if you're a google dev and are forced to work this way, but annoying for
the rest of the world. Your code shouldn't need a git commit to switch
to a a different vcs host! Gah I hate this so much.
All resources can now set a retry limit (-1 for infinite) and a delay
between retries. This applies to both the CheckApply methods, and the
Watch methods as well. They each have their own separate counts, but use
the same input meta param, since I decided it wouldn't be useful to have
a separate watchRetry and watchDelay set of meta parameters.
In the process, we got rid of about 15 error cases which would normally
panic.
This patch required a slight overhaul of the Event system.
The previous commit is an earlier version of this patch which I decided
to leave in to "show my work" as I used to have to do in math class.
It's slightly more correct with the current event system, and this
version is less correct and has a few bugs, but that is because the
event system needs a massive overhaul, and once that's done this should
all work properly for the corner cases.
This patch extends the --converged-timeout argument so that when used
with --remote it waits for the entire set of remote mgmt agents to
converge simultaneously before exiting.
purpleidea says: This particular part of the patch probably took as much
work as all of the work required for the initial remote patches alone!
This adds a new method of marking whether a particular UUID has
converged or not. You can now Start, Stop, or Reset a convergence timer
on the individual UUID's. This wraps the existing SetConverged calls
with a hidden go routine. It is not recommended to use the SetConverged
calls and the Timer calls on the same UUID.
This is a new mode to be used for bootstrapping mgmt clusters or in
situations with tight operational restrictions.
This includes the basics, additional functionality will follow!
This monster patch embeds the etcd server. It took a good deal of
iterative work to tweak small details, and survived a rewrite from the
initial etcd v2 API implementation to the beta version of v3.
It has a notable race, and is missing some features, but it is ready for
git master and external developer consumption.
Puppet can be used on the basis of the ffrank-mgmtgraph module.
There are three modes available:
* fetching catalogs from the master (--puppet agent)
* compiling a manifest from a local file (--puppet /path/to/file.pp)
* compiling a manifest from the cli (--puppet "<manifest>")
Catalogs from the master are currently never refreshed. We should
add some more code to re-run the parsing function at an interval
equal to Puppet's local 'runinterval' setting.
There is also still a distinct lack of tests.
Still, this fixes#8
This also adds a cleaner exit for the inner main loop. I'm not sure if
it's absolutely needed, but this will give me more confidence that we
won't end in the middle of some action.
Use `git show -w` to inspect this commit diff since
it changes whitespace due to `gofmt`.
This commit allows to use environment variables in place of
CLI parameters to change program behavior. This supports the
notion of [12-factor apps](http://12factor.net/)
and makes it easier to dockerize the app as well as create
a systemd unit file.
It establishes a pattern of `MGMT_*` naming convention
for environment variables.
The old converged detection was hacked in code, instead of something
with a nice interface. This cleans it up, splits it into a separate
file, and removes a race condition that happened with the old code.
We also take the time to get rid of the ugly Set* methods and replace
them all with a single AssociateData method. This might be unnecessary
if we can pass in the Converger method at Resource construction.
Lastly, and most interesting, we suspend the individual timeout callers
when they've already converged, thus reducing unnecessary traffic, and
avoiding fast (eg: < 5 second) timers triggering more than once if they
stay converged!
A quick note on theory for any future readers... What happens if we have
--converged-timeout=0 ? Well, for this and any other positive value,
it's important to realize that deciding if something is converged is
actually a race between if the converged timer will fire and if some
random new event will get triggered. This is because there is nothing
that can actually predict if or when a new event will happen (eg the
user modifying a file). As a result, a race is always inherent, and
actually not a negative or "incorrect" algorithm.
A future improvement could be to add a global lock to each resource, and
to lock all resources when computing if we are converged or not. In
practice, this hasn't been necessary. The worst case scenario would be
(in theory, because this hasn't been tested) if an event happens
*during* the converged calculation, and starts running, the exit command
then runs, and the event finishes, but it doesn't get a chance to notify
some service to restart. A lock could probably fix this theoretical
case.
This might not be fully correct, but it seems to be accurate so far. Of
particular note, the vertex order needs to be deterministic for this
algorithm, which isn't provided by a map, since golang intentionally
randomizes it. As a result, this also adds a sorted version of
GetVertices called GetVerticesSorted.
Sorry for the size of this patch, I was busy hacking and plumbing away
and it got out of hand! I'm allowing this because there doesn't seem to
be anyone hacking away on parts of the code that this would break, since
the resource code is fairly stable in this change. In particular, it
revisits and refreshes some areas of the code that didn't see anything
new or innovative since the project first started. I've gotten rid of a
lot of cruft, and in particular cleaned up some things that I didn't
know how to do better before! Here's hoping I'll continue to learn and
have more to improve upon in the future! (Well let's not hope _too_ hard
though!)
The logical goal of this patch was to make logical grouping of resources
possible. For example, it might be more efficient to group three package
installations into a single transaction, instead of having to run three
separate transactions. This is because a package installation typically
has an initial one-time per run cost which shouldn't need to be
repeated.
Another future goal would be to group file resources sharing a common
base path under a common recursive fanotify watcher. Since this depends
on fanotify capabilities first, this hasn't been implemented yet, but
could be a useful method of reducing the number of separate watches
needed, since there is a finite limit.
It's worth mentioning that grouping resources typically _reduces_ the
parallel execution capability of a particular graph, but depending on
the cost/benefit tradeoff, this might be preferential. I'd submit it's
almost universally beneficial for pkg resources.
This monster patch includes:
* the autogroup feature
* the grouping interface
* a placeholder algorithm
* an extensive test case infrastructure to test grouping algorithms
* a move of some base resource methods into pgraph refactoring
* some config/compile clean ups to remove code duplication
* b64 encoding/decoding improvements
* a rename of the yaml "res" entries to "kind" (more logical)
* some docs
* small fixes
* and more!
Naming the resources "type" was a stupid mistake, and is a huge source
of confusion when also talking about real types. Fix this before it gets
out of hand.