Commit Graph

43 Commits

Author SHA1 Message Date
James Shubin
cd5e2e1148 pgraph: Add fast pausing and exiting of graphs
This causes a graph to actually stop processing part way through, even
if there are poke's that want to continue on. This is so that the user
experience of pressing ^C actually causes a shutdown without finishing
the graph execution. It might be preferred to have this be a user
defined setting at some point in the future, such as if the user presses
^C twice. As well, we might want to implement an interrupt API so that
individual resource execution can be asked to bail out early if
requested. This could happen on a third ^C press.
2017-03-13 07:54:03 -04:00
James Shubin
91af528ff8 pgraph: Move the quiesce done indicator to avoid deadlock
This avoids a deadlock on resource failure when retry==0. Without this
we would never exit. This adds a test in too!
2017-03-12 13:52:35 -04:00
James Shubin
8ff048d055 test: Disable prometheus-3.sh test temporarily
It seems to be failing, and I'm not sure where the regression is, or if
there is a race. Sorry roidelapluie.
2017-03-09 11:46:21 -05:00
James Shubin
22b48e296a resources, yamlgraph: Drop the kind capitalization
This stopped making sense now that we have a resource with two primary
capitals. It was just a silly formatting hack anyways. Welcome kv!
2017-03-09 02:50:55 -05:00
James Shubin
d8e19cd79a semaphore: Create a semaphore metaparam
This adds a P/V style semaphore mechanism to the resource graph. This
enables the user to specify a number of "id:count" tags associated with
each resource which will reduce the parallelism of the CheckApply
operation to that maximum count.

This is particularly interesting because (assuming I'm not mistaken) the
implementation is dead-lock free assuming that no individual resource
permanently ever blocks during execution! I don't have a formal proof of
this, but I was able to convince myself on paper that it was the case.

An actual proof that N P/V counting semaphores in a DAG won't ever
dead-lock would be particularly welcome! Hint: the trick is to acquire
them in alphabetical order while respecting the DAG flow. Disclaimer,
this assumes that the lock count is always > 0 of course.
2017-02-27 02:57:06 -05:00
James Shubin
757cb0cf23 test: Small fixups to t4 and a rename 2017-02-26 20:54:22 -05:00
Julien Pivotto
7d92ab335a prometheus: Add mgmt_pgraph_start_time_seconds metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 15:28:43 +01:00
James Shubin
8be09eadd4 test: Fix up probable timeout failures due to slow ci
We had *occasional* failures most likely due to slow ci and compounded
by low entropy. We also weren't pointing at the right test!
2017-02-22 23:08:09 -05:00
James Shubin
98bc96c911 golint: Fixup issues found in the report
This also increases the max allowed to 5% -- I'm happy to make this
lower if someone asks.
2017-02-22 22:18:55 -05:00
James Shubin
3bd37a7906 test: Don't block on graph transitions
Improvements in the engine have uncovered some annoying race conditions
which would cause the engine to block between transitions. This is a
test which catches the most obvious file based ones.

This requires inotify to work in the test environment.
2017-02-22 17:45:16 -05:00
James Shubin
18ea05c837 pgraph, resources: Add proper start/stop signals
We need to perform some operations in lock step between graph
transitions. This should help with that!
2017-02-21 18:48:27 -05:00
James Shubin
02dddfc227 test: Fix yamlfmt test
Last chance before we kill this entirely.
2017-02-21 16:16:41 -05:00
Mildred Ki'Lya
d0d62892c8 resources: file: Allow creation of empty directories 2017-02-16 20:41:59 -05:00
Julien Pivotto
be5040e7a8 prometheus: Remove mgmt_process_start_time_seconds metric
That metric is useless as by default the prometheus golang client
provides the `process_start_time_seconds` metric.

This reverts commit 25e2af7c89.
2017-02-14 22:56:12 +01:00
Julien Pivotto
25e2af7c89 prometheus: Add mgmt_process_start_time_seconds metric 2017-02-14 22:14:59 +01:00
James Shubin
e9adbf18d3 test: prometheus: Fix up test 2017-02-14 12:10:54 -05:00
Sean Jones
610202097a test: Enable macOS shell testing
* Check and install libvirt with Homebrew

  macOS does not have apt, dnf or yum. Add checking for homebrew for
  installing libvirt.

* Use platform timeout for tests
    * Add timeout detection to test/util.sh
    * Use $timeout for shell test requiring timeout
2017-02-14 11:59:44 -05:00
Mildred Ki'Lya
8c2c552164 resources: file: Implement file attributes
Add owner which must be username or uid of the file owner, group which is
the group name or gid of the file, and mode which is the octal unix file
permissions.

Add separate implementation for Go 1.6 and lower.
2017-02-14 11:55:00 -05:00
Julien Pivotto
e8855f7621 prometheus: Implement mgmt_checkapply_total metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:45:47 +01:00
Julien Pivotto
bdb8368e89 resources: augeas: New resource
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:02:12 +01:00
Julien Pivotto
de9a32a273 recwatch: Remove watcher on file move
Fix #120

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-11 13:45:32 +01:00
Julien Pivotto
72873abe05 test: file: test the behaviour of inotify on parent dir moves
This is a test for #124. It is disabled until #124 is fixed, so it can
already me merged.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-09 16:01:09 +01:00
Julien Pivotto
7b7c765d78 prometheus: Add a new test, with --prometheus-listen
Also: rename t9 to prometheus-1

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-09 00:05:19 +01:00
Julien Pivotto
1af67e72d4 prometheus: Implement basic Prometheus support
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-08 12:13:33 +01:00
James Shubin
11b40bf32f resources: Update state checks
The mgmt graph depends on state tracking to eliminate redundant pokes.
With the Watch loop now able to produce events quickly, it should no
longer play a part in determining the vertex state. This simplifies the
resource API as well!
2017-01-25 09:13:59 -05:00
James Shubin
1370f2a76b gapi: Split out graph generation into a proper graph API
This is a monster patch that splits out the yaml and puppet based graph
generation and pushes them behind a common API. In addition alternate
pluggable GAPI's can be easily added! The important side benefit is that
you can now write a custom GAPI for embedding mgmt!

This also includes some slight clean ups that I didn't find it worth
splitting into separate patches.
2016-11-03 03:56:16 -04:00
James Shubin
66fbbb940a test: temporarily disable test 2016-09-18 03:46:42 -04:00
James Shubin
1cf88d9540 test: Increase timeouts of t8
Increase the timeouts in the rare chance that this is slow performing
travis, and not just an etcd regression.
2016-09-02 02:27:39 -04:00
James Shubin
9260066fa3 tests: Workaround regression in two host etcd clusters
If you don't give your two host cluster enough time to "feel healthy",
it will generate an error if you do operations within five seconds. This
is a regression and the five seconds is also quite arbitrary. This is
detailed at: https://github.com/coreos/etcd/issues/6305

This seems to be a bit of a race condition, even with a 10s timer, so
this also disables the StrictReconfigCheck. Re-enable this as soon as
possible.
2016-08-31 21:55:19 -04:00
James Shubin
db4de12767 Add more flexibility around the prefixes available
This allows you to specify a custom prefix, or a tmp prefix which is
chosen automatically.
2016-08-31 21:55:19 -04:00
James Shubin
276219a691 Run tests with a tmp prefix
This will avoid failures if /var/lib/mgmt/ isn't writable.
2016-08-31 21:55:19 -04:00
James Shubin
fa4f5abc78 Fix up tests, and one small bug 2016-08-30 17:47:25 -04:00
James Shubin
0c7b05b233 Check for exit status of tests 2016-08-30 17:47:25 -04:00
James Shubin
24ba6abc6b Don't block on member exits
The MemberList() function which apparently looks at all of the endpoints
was blocking right after a member exited, because we still had a stale
list of endpoints, and while a member that exited safely would update
the endpoints lists, the other member would (unavoidably) race to get
that message, and so it might call the MemberList() with the now stale
endpoint list. This way we invalidate an endpoint we know to be gone
immediately.

This also adds a simple test case to catch this scenario.
2016-07-26 04:23:46 -04:00
James Shubin
c1605a4f22 Add test case for urfave regression
Credit to jerith for helping me hack this together :)
2016-07-25 20:25:08 -04:00
James Shubin
94b447a9c5 Remove manual etcd usage. No longer needed. 2016-07-05 06:58:04 -04:00
James Shubin
9aea95ce85 Bump go versions in travis
Failures might be caused by inotify not working correctly in travis.
This is probably because travis has things mounted over NFS.
https://github.com/travis-ci/travis-ci/issues/2342
2016-02-28 18:44:19 -05:00
James Shubin
1a164cee3e Fix file resource regression
I added a regression to the file resource. This was caused by two
different bugs that I added when I switched the API to use checkapply. I
would have caught these issues, except my test cases *also* had a bug! I
think I've fixed all three issues now.

Lastly, when running on travis, the tests behave very differently! Some
of the tests actually fail, and it's not clear why. As a result, we had
to disable them! I guess you get what you pay for.
2016-02-28 02:18:46 -05:00
James Shubin
3a85384377 Rename type to resource (res) and service to svc
Naming the resources "type" was a stupid mistake, and is a huge source
of confusion when also talking about real types. Fix this before it gets
out of hand.
2016-02-21 15:51:52 -05:00
James Shubin
655d527d5f Add a fan in, fan out example and test 2016-02-02 08:52:32 -05:00
James Shubin
ff838700d0 Add a fan in example and test 2016-02-02 04:36:12 -05:00
James Shubin
358604def2 Enable shell tests
We need to use sudo: required, and dist: trusty to avoid old versions of
bash in travis which don't support the -n argument to the `wait` shell
built-in.

We had to disable the -e checks in etcd.sh since the killall || killall
parts were causing those to trigger in travis.
2016-01-28 09:37:43 -05:00
James Shubin
d5367b7a1c Add shell based test harness
This allows you to simulate one or more simultaneously running mgmt
processes. It should be easy to use by following the test cases provided.
2016-01-21 00:23:25 -05:00