Commit Graph

64 Commits

Author SHA1 Message Date
James Shubin
d4e815a4cb resources: Clean up converger and make it easier for tests
This cleans up the resource converger code slightly and makes it easier
to write resource specific test cases.
2017-06-02 01:15:25 -04:00
James Shubin
a87288d519 pgraph, resources: Major refactoring continued
There was simply some technical debt I needed to kill off. Sorry for not
splitting this up into more patches.
2017-05-31 15:36:14 -04:00
James Shubin
3cf9639e99 pgraph, resources: Major refactor to remove pgraph to resource dep
This is the mechanical port of the remaining bits. Next to clean it up a
bit.
2017-05-29 15:43:50 -04:00
James Shubin
fbcb562781 pgraph: Move the timestamp storage into the resource 2017-05-29 15:43:50 -04:00
James Shubin
11c3a26c23 pgraph: Move the AutoEdges mechanism into the resource package
Remove the pgraph->resource dependency.
2017-05-29 15:43:50 -04:00
James Shubin
0af9af44e5 etcd, resources, world: Add World API for shared keys
It's up to the end user to decide who is writing and/or overwriting
them.

It could also be useful to reimplement (refactor) some of the existing
World API's to be implemented in terms of these primitives.
2017-04-17 07:03:29 -04:00
James Shubin
9b9ff2622d resources: Make resource kind and baseuid fields public
This is required if we're going to have out of package resources. In
particular for third party packages, and also for if we decide to split
out each resource into a separate sub package.
2017-04-11 01:52:21 -04:00
James Shubin
6fd5623b1f gapi: Move separate etcd Watch method into GAPI
This cleans up the API to not have a special case for etcd anymore. In
particular, this also adds the requirement that the GAPI must generate
an event on startup as soon as it is ready to generate a graph.
2017-04-10 03:20:58 -04:00
Mildred Ki'Lya
525a1e8140 yamlgraph: Refactor parsing for dynamic resource registration
Avoid use of the reflect package, and use an extensible list of registred
resource kinds. This also has the benefit of removing the empty VirtRes and
AugeasRes struct types when compiling without libvirt and libaugeas.
2017-03-24 22:38:06 +01:00
James Shubin
028ef14cc0 misc: Replace sloppy use of %v with %s 2017-03-16 13:18:36 -04:00
Julien Pivotto
33d20ac6d8 prometheus: Add detailed metrics
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-03-16 14:18:46 +01:00
James Shubin
074da4da19 pgraph, resources: Run the resource Setup in parallel
This is a reasonable thing to do at this time.
2017-03-13 07:54:03 -04:00
James Shubin
95a1c6e7fb pgraph, resources: Discard BackPokes during pause and resume
This prevents some nasty races where a BackPoke could arrive on a paused
vertex either during a resume or pause operation. Previously we might
also have poked an excessive number of resources on resume.

The solution was to discard BackPokes during pause or resume. On pause,
they can be discarded because we've asked the graph to quiesce, and any
further work can be done on resume, and on resume we ignore them because
this should only happen during the unrolling (reverse topological resume
of the graph) and at the end of this the indegree == 0 vertices will
initiate a series of pokes which should deal with any BackPoke that was
possibly discarded.

One other aspect of this which is important: if an indegree == 0 vertex
is poked (Process runs) but it's already in the correct state, it should
still transmit the Poke through itself so that subsequent vertices know
to run. Currently this is done correctly in Process().

I'm a bit ashamed that this wasn't done properly in the engine earlier,
but I suppose that's what comes out of running fancier graphs and really
thinking in detail about what's truly correct. Hopefully I got it right
this time!
2017-03-09 06:35:15 -05:00
James Shubin
0b1a4a0f30 pgraph, resources: Quiesce when pausing or exiting the resource
This prevents a nasty race that can happen in a graph with more than one
resource. If a resource has someone that it can BackPoke, and then
suppose an event comes in. It runs the obj.Event() method (from inside
its Watch loop) and then *before* the resulting Process method can run
it receives a pause event and pauses. Then the parent resource pauses as
well. Finally (it's a race) the Process gets around to running, and
decides it needs to BackPoke. At this point since the parent resource is
paused, it receives the BackPoke at a time when it can't handle
receiving one, and it panics!

As a result, we now track the number of running Process possibilities
via a WaitGroup which gets incremented from the obj.Event() and we don't
finish our pause or exit operations until it has quiesced and our
WaitGroup lets us know via Wait(). Lastly in order to prevent repeated
replays, we detect when we're quiescing and suspend replaying until post
pause. We don't need to save the replay (playback variable) explicitly
because its state remains during pause, and on exit it would get
re-checked anyways.
2017-03-09 02:50:55 -05:00
James Shubin
e97ac5033f resources: Split util functions into separate file
This also adds errwrap to their implementation.
2017-03-08 19:23:33 -05:00
James Shubin
44771a0049 gapi: Move the World interface into resources
This was necessary to fix some "import cycle" errors I was having when
adding the World api to the resource Data struct.

I think this is a good hint that I need to start splitting up existing
packages into sub files, and cleaning up and inter-package problems too.
2017-03-08 19:23:33 -05:00
James Shubin
32aae8f57a lib, pgraph, resources: Refactor data association API
This should make things cleaner and help avoid as much churn every time
we change a property.
2017-03-07 22:51:11 -05:00
James Shubin
018399cb1f semaphore, pgraph: Add semaphore grouping and tests
If two resources are grouped, then the result should contain the
semaphores of both resources. This is because the user is expecting
(independently) resource A and resource B to have a limiting choke
point. If when combined those choke points aren't preserved, then we
have broken an important promise to the user.
2017-02-28 16:40:53 -05:00
James Shubin
d8e19cd79a semaphore: Create a semaphore metaparam
This adds a P/V style semaphore mechanism to the resource graph. This
enables the user to specify a number of "id:count" tags associated with
each resource which will reduce the parallelism of the CheckApply
operation to that maximum count.

This is particularly interesting because (assuming I'm not mistaken) the
implementation is dead-lock free assuming that no individual resource
permanently ever blocks during execution! I don't have a formal proof of
this, but I was able to convince myself on paper that it was the case.

An actual proof that N P/V counting semaphores in a DAG won't ever
dead-lock would be particularly welcome! Hint: the trick is to acquire
them in alphabetical order while respecting the DAG flow. Disclaimer,
this assumes that the lock count is always > 0 of course.
2017-02-27 02:57:06 -05:00
James Shubin
2462ea0892 pgraph, resources: Wait for innerWorker to exit cleanly
Don't run the Close() method until the innerWorker has exited cleanly.
This is a guarantee which we make to the resources.
2017-02-25 21:00:38 -05:00
James Shubin
98bc96c911 golint: Fixup issues found in the report
This also increases the max allowed to 5% -- I'm happy to make this
lower if someone asks.
2017-02-22 22:18:55 -05:00
James Shubin
49594b8435 pgraph, resources: Clean up the event system around the resources
This cleans up some of the resource events and also reorganizes the
struct for simplicity. This should hopefully kill off at least one race
which would cause unnecessary blocking!

Yes this patch is a bit yucky, but so was the bug I was fighting with!
2017-02-22 17:45:16 -05:00
James Shubin
18ea05c837 pgraph, resources: Add proper start/stop signals
We need to perform some operations in lock step between graph
transitions. This should help with that!
2017-02-21 18:48:27 -05:00
James Shubin
fccf508dde resources, pgraph: Refactor Worker and simplify API
I'm still working on reducing the size of the monster patches that I
land, but I'm exercising the priviledge as the initial author. In any
case, this refactors worker into two, and cleans up the passing around
of the processChan. This puts common code into Init and Close.
2017-02-21 18:42:07 -05:00
James Shubin
2da21f90f4 pgraph, resources: Improve Init/Close and Worker status
This should do some rough cleanups around the Init/Close of resources,
and tracking of Worker function status.
2017-02-21 18:42:07 -05:00
James Shubin
b7948c7f40 resources: Specify defaults for the MetaParams
When creating new resources, we didn't specify the defaults, which for
the limit metaparam caused invalid resources by default. It would be
nice to change the limit param to have the 1/X (reciprocal) as the
default, although the problem with that is that (1) it is illogical, and
(2) it's not clear if the precision for the common cases is enough.

If someone wants to investigate this further, please do! Zero value
structs are definitely more useful! In any case, we can now specify the
default. It's not entirely obvious to me if this is the best way to do
it, or if there is a superior method.
2017-02-16 21:08:46 -05:00
James Shubin
a981cfa053 legal: Oh yeah, it is 2017 2017-02-16 01:34:32 -05:00
Julien Pivotto
e8855f7621 prometheus: Implement mgmt_checkapply_total metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:45:47 +01:00
James Shubin
68a8649292 resources: Parse YAML infinity specifications correctly
This makes it easier to specify an infinite rate.
2017-02-05 21:01:52 -05:00
James Shubin
c247cd8fea resources: Don't double close on Running restart
If we use the "retry" metaparam on a Watch, we want to avoid a double
close due to the second Running() signal. This avoids this with a simple
flag.
2017-02-05 18:47:40 -05:00
James Shubin
9421f2cddd resources: Rename GetUIDs to UIDs
This is more in line with the style guide for golang.
2017-01-25 14:51:23 -05:00
James Shubin
54296da647 converger: Remove converger boilerplate from the resources
This simplifies the resource code by now removing all the converger
related material. Happy resource writing!
2017-01-25 11:30:47 -05:00
James Shubin
11b40bf32f resources: Update state checks
The mgmt graph depends on state tracking to eliminate redundant pokes.
With the Watch loop now able to produce events quickly, it should no
longer play a part in determining the vertex state. This simplifies the
resource API as well!
2017-01-25 09:13:59 -05:00
James Shubin
9ecc49e592 resources: Force a sane default for zero value limiting
The default UnmarshalYAML on *BaseRes doesn't work properly at the
moment, so hack in a default so that we don't need to specify one if the
MetaParams struct isn't specified. The problem is that if there isn't a
meta value added, its UnmarshalYAML doesn't get a chance to run.
2017-01-25 09:13:59 -05:00
James Shubin
4f34f7083b resources: rate limiting: Implement resource rate limiting
This adds rate limiting with the limit and burst meta parameters. The
limits apply to how often the Process check is called. As a result, it
might get called more often than there are Watch events due to possible
Poke/BackPoke events.

This system might need to get rethought in the future depending on its
usefulness.
2017-01-25 09:13:59 -05:00
James Shubin
2a6df875ec resources: Improve composition of Validate API in resources
This now appropriately calls the Base method.
2017-01-22 06:00:37 -05:00
James Shubin
51c83116a2 resources: Overhaul legacy code around the resource API
This patch makes a number of changes in the engine surrounding the
resource API. In particular:

* Cleanup of send/read event.
* Cleanup of DoSend (now Event) in the Watch method.
* Events are now more consistently pointers.
* Exiting within Watch is now done in a single place.
* Multiple incoming events will be combined into a single action.
* Events in flight during an action are played back after CheckApply.
* Addition of Close method to API

This gets things ready for rate limiting and semaphore metaparams!
2017-01-22 05:59:15 -05:00
James Shubin
56efef69ba resources: Officially add Validate method
This officially adds the Validate method to the resource API, and also
cleans up the ordering in existing resources.
2017-01-09 05:10:26 -05:00
James Shubin
60912bd01c resources: Add a Default method to the resource API
This provides sensible defaults for when they're not the zero value.
2017-01-09 04:35:48 -05:00
James Shubin
b921aabbed resources: Add poll metaparameter
This allows a resource to use polling instead of the event based
mechanism. This isn't recommended, but it could be useful, and it was
certainly fun to code!
2016-12-24 00:51:39 -05:00
James Shubin
45820b4ce3 resources: Rename Converger to ConvergerUID
This is more correct since we need the Converger method to return the
actual converger!
2016-12-23 23:32:05 -05:00
James Shubin
0009d9b20e pgraph, resources: Integrate properly with the startup logic
This signals which resources have to run their initial pokes, and
removes the racy retry timer. We actually get a proper signal when
things are running too!
2016-12-20 05:49:17 -05:00
James Shubin
067932aebf resources: Remove SetWatching/IsWatching code from Watch
This removes some boilerplate from the Watch methods which can be baked
into the engine instead.

This code should be checked for races and locks to make sure we only
start resources when it makes sense to.
2016-12-20 05:47:40 -05:00
James Shubin
36b916f27f resources: Simplify resource Converger and Startup code
This takes the Converged initialization and Startup patterns that are
common in all resources, and bakes it into the core engine. This way
resource writing is much more concise and there is less boilerplate!
2016-12-20 05:47:40 -05:00
James Shubin
4803be1987 misc: Rename mgmtmain to lib and remove global package
This refactor should make it cleaner to use mgmt.
2016-12-08 23:31:45 -05:00
James Shubin
597ed6eaa0 resources: Polish the password PoC and build out send/recv
This polishes the password resource so that it can actually avoid
writing the password to disk, and so that the work actually happens in
CheckApply where it can properly interact with the graph. This resource
now re-generates the password when it receives a notification.

The send/recv plumbing has been extended so that receivers can detect
when they're receiving new values. This is particularly important if
they might otherwise not expect those values to change and cache them
for efficiency purposes.
2016-12-06 02:29:47 -05:00
James Shubin
2e718c0e9d resources: Improve notification system and notify refreshes
Resources can send "refresh" notifications along edges. These messages
are sent whenever the upstream (initiating vertex) changes state. When
the changed state propagates downstream, it will be paired with a
refresh flag which can be queried in the CheckApply method of that
resource.

Future work will include a stateful refresh tracking mechanism so that
if a refresh event is generated and not consumed, it will be saved
across an interrupt (shutdown) or a crash so that it can be re-applied
on the subsequent run. This is important because the unapplied refresh
is a form of hysteresis which needs to be tracked and remembered or we
won't be able to determine that the state is wrong!

Still to do:
* Update the autogrouping code to handle the edge notify properties!
* Actually finish the stateful bool code
2016-12-03 01:35:31 -05:00
James Shubin
b0a8fc165c resources: Improve the state/cache system
Refactor the state cache into the engine. This makes resource writing
less error prone, and paves the way for better notifications.
2016-12-03 00:07:29 -05:00
James Shubin
ba6044e9e8 resources, pgraph: split logical chunks into separate files 2016-12-03 00:07:29 -05:00
James Shubin
7f1c13a576 resources: Implement Send -> Recv
This is a new design idea which I had. Whether it stays around or not is
up for debate. For now it's a rough POC.

The idea is that any resource can _produce_ data, and any resource can
_consume_ data. This is what we call send and recv. By linking the two
together, data can be passed directly between resources, which will
maximize code re-use, and allow for some interesting logical graphs.

For example, you might have an HTTP resource which puts its output in a
particular file. This avoids having to overload the HTTP resource with
all of the special behaviours of the File resource.

For our POC, I implemented a `password` resource which generates a
random string which can then be passed to a receiver such as a file. At
this point the password resource isn't recommended for sensitive
applications because it caches the password as plain text.

Still to do:
* Statically check all of the type matching before we run the graph
* Verify that our autogrouping works correctly around this feature
* Verify that appropriate edges exist between send->recv pairs
* Label the password as generated instead of storing the plain text
* Consider moving password logic from Init() to CheckApply()
* Consider combining multiple send values (list?) into a single receiver
* Consider intermediary transformation nodes for value combining
2016-12-03 00:07:29 -05:00