Commit Graph

108 Commits

Author SHA1 Message Date
Guillaume Herail
3575d597f7 resources: Add User/Group to ExecRes 2017-11-24 10:38:16 -05:00
James Shubin
81daf10157 test: Fix linter issues
These are some linter issues that were found in a new version of the
linter. Let's fix them now before that linter hits our test suite.
2017-09-26 19:38:53 -04:00
James Shubin
0df4824a56 test: Increase timeouts for slow travis
Should prevent more intermittent failures.
2017-09-09 15:31:06 -04:00
James Shubin
653c76709a test: Fix another intermittent failure
Some of the tests had very precise timeouts, which weren't very
important. Here's another one that timed out early.
2017-09-04 16:39:01 -04:00
James Shubin
6c8588c019 test: Increase timeouts because travis is slow
Should hopefully prevent some intermittent failures.
2017-09-04 13:02:05 -04:00
James Shubin
0edba74091 etcd: Bump to version 3.2.6 and update all the grpc deps
Note: When go-grpc-prometheus was in the main $gopath (even at this
version) and everyone else was where they always were in vendor/ this
didn't build! It gave errors like:

	have SendHeader("github.com/purpleidea/mgmt/vendor/google.golang.org/grpc/metadata".MD) error
	want SendHeader("google.golang.org/grpc/metadata".MD) error

and I got frustrated. Putting it "next" to the other vendored deps seems
to have fixed this. Where are the golang docs that explain this
phenomenon?

This also requires golang 1.8+ as that is a requirement for etcd. It's
probably a reasonable thing for us too.

Note the older versions of etcd had some bugs with the concurrency
package and other things, so this is a necessary bump.
2017-08-30 14:16:02 -04:00
James Shubin
0dadf3d78a resources: Add NewNamedResource helper
This makes the common pattern of NewResource, SetName, easier. It also
makes it less likely for you to forget to use SetName.
2017-06-17 18:09:49 -04:00
James Shubin
9f5057eac7 resources: Do not panic on autogrouped graph switches
Graph changes from autogrouped -> not autogrouped or vice versa cause a
panic (or I assume a leak) because we compared the auto grouped graph to
the ungrouped one, which would cause an Exit on an unstarted Vertex.
This includes a test that seems to reliably reproduces the issue.
2017-06-08 01:05:58 -04:00
James Shubin
b8ff6938df resources: Unify resource creation and kind setting
This removes the duplication of the kind string and cleans up things for
resource creation.
2017-06-07 03:07:02 -04:00
James Shubin
9cbaa892d3 gapi: Allow the GAPI implementer to specify fast and exit
This allows the implementer of the GAPI to specify three parameters for
every Next message sent on the channel. The Fast parameter tells the
agent if it should do the pause quickly or if it should finish the
sequence. A quick pause means that it will cause a pause immediately
after the currently running resources finish, where as a slow (default)
pause will allow the wave of execution to finish. This is usually
preferred in scenarios where complex graphs are used where we want each
step to complete. The Exit parameter tells the engine to exit, and the
Err parameter tells the engine that an error occurred.
2017-06-02 04:03:10 -04:00
James Shubin
0545c4167b pgraph: Remove NewVertex and NewEdge methods and fix examples
Since the pgraph graph can store arbitrary pointers, we don't need a
special method to create the vertices or edges as long as they implement
the String() string method. This cleans up the library and some of the
examples which I let rot previously.
2017-05-31 18:04:58 -04:00
James Shubin
d11854f4e8 pgraph: Clean up pgraph module to get ready for clean lib status
The graph of dependencies in golang is a DAG, and as such doesn't allow
cycles. Clean up this lib so that it eventually doesn't import our
resources module or anything else which might want to import it.

This patch makes adjacency private, and adds a generalized key store to
the graph struct.
2017-05-13 13:28:41 -04:00
James Shubin
a4858be967 lib, gapi: Next method of GAPI should generate first event
This puts the generation of the initial event into the Next method of
the GAPI. If it does not happen, then we will never get a graph. This is
important because this notifies the GAPI when we're actually ready to
try and generate a graph, rather than blocking on the Graph method if we
have a long compile for example.

This is also required for the etcd watch cleanup.
2017-04-10 03:20:58 -04:00
James Shubin
66d9c7091c lib: examples: Update to most recent API
At some point in the past the API changed. Fixed now.
2017-04-10 03:20:58 -04:00
Julien Pivotto
33d20ac6d8 prometheus: Add detailed metrics
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-03-16 14:18:46 +01:00
James Shubin
cd5e2e1148 pgraph: Add fast pausing and exiting of graphs
This causes a graph to actually stop processing part way through, even
if there are poke's that want to continue on. This is so that the user
experience of pressing ^C actually causes a shutdown without finishing
the graph execution. It might be preferred to have this be a user
defined setting at some point in the future, such as if the user presses
^C twice. As well, we might want to implement an interrupt API so that
individual resource execution can be asked to bail out early if
requested. This could happen on a third ^C press.
2017-03-13 07:54:03 -04:00
James Shubin
91af528ff8 pgraph: Move the quiesce done indicator to avoid deadlock
This avoids a deadlock on resource failure when retry==0. Without this
we would never exit. This adds a test in too!
2017-03-12 13:52:35 -04:00
James Shubin
8ff048d055 test: Disable prometheus-3.sh test temporarily
It seems to be failing, and I'm not sure where the regression is, or if
there is a race. Sorry roidelapluie.
2017-03-09 11:46:21 -05:00
James Shubin
22b48e296a resources, yamlgraph: Drop the kind capitalization
This stopped making sense now that we have a resource with two primary
capitals. It was just a silly formatting hack anyways. Welcome kv!
2017-03-09 02:50:55 -05:00
James Shubin
d8e19cd79a semaphore: Create a semaphore metaparam
This adds a P/V style semaphore mechanism to the resource graph. This
enables the user to specify a number of "id:count" tags associated with
each resource which will reduce the parallelism of the CheckApply
operation to that maximum count.

This is particularly interesting because (assuming I'm not mistaken) the
implementation is dead-lock free assuming that no individual resource
permanently ever blocks during execution! I don't have a formal proof of
this, but I was able to convince myself on paper that it was the case.

An actual proof that N P/V counting semaphores in a DAG won't ever
dead-lock would be particularly welcome! Hint: the trick is to acquire
them in alphabetical order while respecting the DAG flow. Disclaimer,
this assumes that the lock count is always > 0 of course.
2017-02-27 02:57:06 -05:00
James Shubin
757cb0cf23 test: Small fixups to t4 and a rename 2017-02-26 20:54:22 -05:00
Julien Pivotto
7d92ab335a prometheus: Add mgmt_pgraph_start_time_seconds metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 15:28:43 +01:00
James Shubin
8be09eadd4 test: Fix up probable timeout failures due to slow ci
We had *occasional* failures most likely due to slow ci and compounded
by low entropy. We also weren't pointing at the right test!
2017-02-22 23:08:09 -05:00
James Shubin
98bc96c911 golint: Fixup issues found in the report
This also increases the max allowed to 5% -- I'm happy to make this
lower if someone asks.
2017-02-22 22:18:55 -05:00
James Shubin
3bd37a7906 test: Don't block on graph transitions
Improvements in the engine have uncovered some annoying race conditions
which would cause the engine to block between transitions. This is a
test which catches the most obvious file based ones.

This requires inotify to work in the test environment.
2017-02-22 17:45:16 -05:00
James Shubin
18ea05c837 pgraph, resources: Add proper start/stop signals
We need to perform some operations in lock step between graph
transitions. This should help with that!
2017-02-21 18:48:27 -05:00
James Shubin
02dddfc227 test: Fix yamlfmt test
Last chance before we kill this entirely.
2017-02-21 16:16:41 -05:00
Mildred Ki'Lya
d0d62892c8 resources: file: Allow creation of empty directories 2017-02-16 20:41:59 -05:00
Julien Pivotto
be5040e7a8 prometheus: Remove mgmt_process_start_time_seconds metric
That metric is useless as by default the prometheus golang client
provides the `process_start_time_seconds` metric.

This reverts commit 25e2af7c89.
2017-02-14 22:56:12 +01:00
Julien Pivotto
25e2af7c89 prometheus: Add mgmt_process_start_time_seconds metric 2017-02-14 22:14:59 +01:00
James Shubin
e9adbf18d3 test: prometheus: Fix up test 2017-02-14 12:10:54 -05:00
Sean Jones
610202097a test: Enable macOS shell testing
* Check and install libvirt with Homebrew

  macOS does not have apt, dnf or yum. Add checking for homebrew for
  installing libvirt.

* Use platform timeout for tests
    * Add timeout detection to test/util.sh
    * Use $timeout for shell test requiring timeout
2017-02-14 11:59:44 -05:00
Mildred Ki'Lya
8c2c552164 resources: file: Implement file attributes
Add owner which must be username or uid of the file owner, group which is
the group name or gid of the file, and mode which is the octal unix file
permissions.

Add separate implementation for Go 1.6 and lower.
2017-02-14 11:55:00 -05:00
Julien Pivotto
e8855f7621 prometheus: Implement mgmt_checkapply_total metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:45:47 +01:00
Julien Pivotto
bdb8368e89 resources: augeas: New resource
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:02:12 +01:00
Julien Pivotto
de9a32a273 recwatch: Remove watcher on file move
Fix #120

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-11 13:45:32 +01:00
Julien Pivotto
72873abe05 test: file: test the behaviour of inotify on parent dir moves
This is a test for #124. It is disabled until #124 is fixed, so it can
already me merged.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-09 16:01:09 +01:00
Julien Pivotto
7b7c765d78 prometheus: Add a new test, with --prometheus-listen
Also: rename t9 to prometheus-1

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-09 00:05:19 +01:00
Julien Pivotto
1af67e72d4 prometheus: Implement basic Prometheus support
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-08 12:13:33 +01:00
James Shubin
11b40bf32f resources: Update state checks
The mgmt graph depends on state tracking to eliminate redundant pokes.
With the Watch loop now able to produce events quickly, it should no
longer play a part in determining the vertex state. This simplifies the
resource API as well!
2017-01-25 09:13:59 -05:00
James Shubin
1370f2a76b gapi: Split out graph generation into a proper graph API
This is a monster patch that splits out the yaml and puppet based graph
generation and pushes them behind a common API. In addition alternate
pluggable GAPI's can be easily added! The important side benefit is that
you can now write a custom GAPI for embedding mgmt!

This also includes some slight clean ups that I didn't find it worth
splitting into separate patches.
2016-11-03 03:56:16 -04:00
James Shubin
66fbbb940a test: temporarily disable test 2016-09-18 03:46:42 -04:00
James Shubin
1cf88d9540 test: Increase timeouts of t8
Increase the timeouts in the rare chance that this is slow performing
travis, and not just an etcd regression.
2016-09-02 02:27:39 -04:00
James Shubin
9260066fa3 tests: Workaround regression in two host etcd clusters
If you don't give your two host cluster enough time to "feel healthy",
it will generate an error if you do operations within five seconds. This
is a regression and the five seconds is also quite arbitrary. This is
detailed at: https://github.com/coreos/etcd/issues/6305

This seems to be a bit of a race condition, even with a 10s timer, so
this also disables the StrictReconfigCheck. Re-enable this as soon as
possible.
2016-08-31 21:55:19 -04:00
James Shubin
db4de12767 Add more flexibility around the prefixes available
This allows you to specify a custom prefix, or a tmp prefix which is
chosen automatically.
2016-08-31 21:55:19 -04:00
James Shubin
276219a691 Run tests with a tmp prefix
This will avoid failures if /var/lib/mgmt/ isn't writable.
2016-08-31 21:55:19 -04:00
James Shubin
fa4f5abc78 Fix up tests, and one small bug 2016-08-30 17:47:25 -04:00
James Shubin
0c7b05b233 Check for exit status of tests 2016-08-30 17:47:25 -04:00
James Shubin
24ba6abc6b Don't block on member exits
The MemberList() function which apparently looks at all of the endpoints
was blocking right after a member exited, because we still had a stale
list of endpoints, and while a member that exited safely would update
the endpoints lists, the other member would (unavoidably) race to get
that message, and so it might call the MemberList() with the now stale
endpoint list. This way we invalidate an endpoint we know to be gone
immediately.

This also adds a simple test case to catch this scenario.
2016-07-26 04:23:46 -04:00
James Shubin
c1605a4f22 Add test case for urfave regression
Credit to jerith for helping me hack this together :)
2016-07-25 20:25:08 -04:00