lourenco/mgmt - mgmt - código do assilvestrar

Author	SHA1	Message	Date
James Shubin	9b9ff2622d	resources: Make resource kind and baseuid fields public This is required if we're going to have out of package resources. In particular for third party packages, and also if we decide to split out each resource into a separate sub package.	2017-04-11 01:52:21 -04:00
James Shubin	028ef14cc0	misc: Replace sloppy use of %v with %s	2017-03-16 13:18:36 -04:00
Julien Pivotto	33d20ac6d8	prometheus: Add detailed metrics Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2017-03-16 14:18:46 +01:00
James Shubin	cd5e2e1148	pgraph: Add fast pausing and exiting of graphs This causes a graph to actually stop processing part way through, even if there are pokes that want to continue on. This is so that the user experience of pressing ^C actually causes a shutdown without finishing the graph execution. It might be preferred to have this be a user defined setting at some point in the future, such as if the user presses ^C twice. As well, we might want to implement an interrupt API so that individual resource execution can be asked to bail out early if requested. This could happen on a third ^C press.	2017-03-13 07:54:03 -04:00
James Shubin	074da4da19	pgraph, resources: Run the resource Setup in parallel This is a reasonable thing to do at this time.	2017-03-13 07:54:03 -04:00
James Shubin	e5dbb214a2	pgraph: Move the BackPoke to before the semaphores I can't think of a reason we should grab a semaphore before backpoking. The semaphore is intended to block around the actual work in CheckApply, not the dependency resolution of the correct vertex.	2017-03-13 07:49:29 -04:00
James Shubin	91af528ff8	pgraph: Move the quiesce done indicator to avoid deadlock This avoids a deadlock on resource failure when retry==0. Without this we would never exit. This adds a test in too!	2017-03-12 13:52:35 -04:00
James Shubin	95a1c6e7fb	pgraph, resources: Discard BackPokes during pause and resume This prevents some nasty races where a BackPoke could arrive on a paused vertex either during a resume or pause operation. Previously we might also have poked an excessive number of resources on resume. The solution was to discard BackPokes during pause or resume. On pause, they can be discarded because we've asked the graph to quiesce, and any further work can be done on resume, and on resume we ignore them because this should only happen during the unrolling (reverse topological resume of the graph) and at the end of this the indegree == 0 vertices will initiate a series of pokes which should deal with any BackPoke that was possibly discarded. One other aspect of this which is important: if an indegree == 0 vertex is poked (Process runs) but it's already in the correct state, it should still transmit the Poke through itself so that subsequent vertices know to run. Currently this is done correctly in Process(). I'm a bit ashamed that this wasn't done properly in the engine earlier, but I suppose that's what comes out of running fancier graphs and really thinking in detail about what's truly correct. Hopefully I got it right this time!	2017-03-09 06:35:15 -05:00
James Shubin	0b1a4a0f30	pgraph, resources: Quiesce when pausing or exiting the resource This prevents a nasty race that can happen in a graph with more than one resource. If a resource has someone that it can BackPoke, and then suppose an event comes in. It runs the obj.Event() method (from inside its Watch loop) and then before the resulting Process method can run it receives a pause event and pauses. Then the parent resource pauses as well. Finally (it's a race) the Process gets around to running, and decides it needs to BackPoke. At this point since the parent resource is paused, it receives the BackPoke at a time when it can't handle receiving one, and it panics! As a result, we now track the number of running Process possibilities via a WaitGroup which gets incremented from the obj.Event() and we don't finish our pause or exit operations until it has quiesced and our WaitGroup lets us know via Wait(). Lastly in order to prevent repeated replays, we detect when we're quiescing and suspend replaying until post pause. We don't need to save the replay (playback variable) explicitly because its state remains during pause, and on exit it would get re-checked anyways.	2017-03-09 02:50:55 -05:00
James Shubin	d8e19cd79a	semaphore: Create a semaphore metaparam This adds a P/V style semaphore mechanism to the resource graph. This enables the user to specify a number of "id:count" tags associated with each resource which will reduce the parallelism of the CheckApply operation to that maximum count. This is particularly interesting because (assuming I'm not mistaken) the implementation is dead-lock free assuming that no individual resource permanently ever blocks during execution! I don't have a formal proof of this, but I was able to convince myself on paper that it was the case. An actual proof that N P/V counting semaphores in a DAG won't ever dead-lock would be particularly welcome! Hint: the trick is to acquire them in alphabetical order while respecting the DAG flow. Disclaimer, this assumes that the lock count is always > 0 of course.	2017-02-27 02:57:06 -05:00
Julien Pivotto	46260749c1	prometheus: Move the prometheus nil check inside the prometheus function That pattern will be reused in future metrics. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2017-02-26 09:33:34 +01:00
James Shubin	12160ab539	pgraph: Wait for Process routine to exit Wait for the innerWorker's Process routine to exit, before we exit too.	2017-02-25 21:01:02 -05:00
James Shubin	2462ea0892	pgraph, resources: Wait for innerWorker to exit cleanly Don't run the Close() method until the innerWorker has exited cleanly. This is a guarantee which we make to the resources.	2017-02-25 21:00:38 -05:00
James Shubin	49594b8435	pgraph, resources: Clean up the event system around the resources This cleans up some of the resource events and also reorganizes the struct for simplicity. This should hopefully kill off at least one race which would cause unnecessary blocking! Yes this patch is a bit yucky, but so was the bug I was fighting with!	2017-02-22 17:45:16 -05:00
James Shubin	18ea05c837	pgraph, resources: Add proper start/stop signals We need to perform some operations in lock step between graph transitions. This should help with that!	2017-02-21 18:48:27 -05:00
James Shubin	fccf508dde	resources, pgraph: Refactor Worker and simplify API I'm still working on reducing the size of the monster patches that I land, but I'm exercising the privilege as the initial author. In any case, this refactors worker into two, and cleans up the passing around of the processChan. This puts common code into Init and Close.	2017-02-21 18:42:07 -05:00
James Shubin	2da21f90f4	pgraph, resources: Improve Init/Close and Worker status This should do some rough cleanups around the Init/Close of resources, and tracking of Worker function status.	2017-02-21 18:42:07 -05:00
James Shubin	a981cfa053	legal: Oh yeah, it is 2017	2017-02-16 01:34:32 -05:00
Julien Pivotto	e8855f7621	prometheus: Implement mgmt_checkapply_total metric Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2017-02-12 23:45:47 +01:00
James Shubin	dd8454161f	pgraph: Set the false starter value too We might have left re-used nodes as true, even if they no longer were anymore due to graph changes, which would have caused additional pokes.	2017-01-27 17:43:24 -05:00
James Shubin	357102fdb5	converger: Block converging in the engine This more appropriately blocks converging in the engine, since we are now 1-1 decoupled from the Watch resource. This simplifies resource writing, and should be more accurate around small converged timeouts. We don't block in the Worker routine when we are polling, because we expect to get constant poll events, and we can instead be more careful about these by looking at CheckApply results. If we can do this for all resources in the future, it would be excellent!	2017-01-25 11:06:02 -05:00
James Shubin	7e15a9e181	pgraph: Add debug messages These two messages in particular make graph analysis easier.	2017-01-25 09:52:34 -05:00
James Shubin	12e0b2d6f7	pgraph: Parallelize the BackPoke facility I can't guarantee this has a significant effect, but it's likely to add some efficiency when sending multiple BackPokes at the same time, so that erroneous ones can be cancelled out more easily.	2017-01-25 09:13:59 -05:00
James Shubin	11b40bf32f	resources: Update state checks The mgmt graph depends on state tracking to eliminate redundant pokes. With the Watch loop now able to produce events quickly, it should no longer play a part in determining the vertex state. This simplifies the resource API as well!	2017-01-25 09:13:59 -05:00
James Shubin	8d2b53373f	pgraph: Remove unnecessary indentation in Process code path We can indent the simple BackPoke code path instead!	2017-01-25 09:13:59 -05:00
James Shubin	4f34f7083b	resources: rate limiting: Implement resource rate limiting This adds rate limiting with the limit and burst meta parameters. The limits apply to how often the Process check is called. As a result, it might get called more often than there are Watch events due to possible Poke/BackPoke events. This system might need to get rethought in the future depending on its usefulness.	2017-01-25 09:13:59 -05:00
James Shubin	51c83116a2	resources: Overhaul legacy code around the resource API This patch makes a number of changes in the engine surrounding the resource API. In particular: * Cleanup of send/read event. * Cleanup of DoSend (now Event) in the Watch method. * Events are now more consistently pointers. * Exiting within Watch is now done in a single place. * Multiple incoming events will be combined into a single action. * Events in flight during an action are played back after CheckApply. * Addition of Close method to API This gets things ready for rate limiting and semaphore metaparams!	2017-01-22 05:59:15 -05:00
James Shubin	b921aabbed	resources: Add poll metaparameter This allows a resource to use polling instead of the event based mechanism. This isn't recommended, but it could be useful, and it was certainly fun to code!	2016-12-24 00:51:39 -05:00
James Shubin	5b3425a689	pgraph: Remember to unpause the vertices! Forgot this part earlier, sorry! Should work correctly now :)	2016-12-21 02:39:54 -05:00
James Shubin	2c8c9264a4	pgraph: Simplify graph exit waiting I think the vertex resource exiting can be done in a single stage instead of the previous two stage exit.	2016-12-20 05:49:17 -05:00
James Shubin	0009d9b20e	pgraph, resources: Integrate properly with the startup logic This signals which resources have to run their initial pokes, and removes the racy retry timer. We actually get a proper signal when things are running too!	2016-12-20 05:49:17 -05:00
James Shubin	dd8d17232f	pgraph: Build the sync group into the graph structure This hides the sync/wait logic inside the graph itself.	2016-12-20 05:49:17 -05:00
James Shubin	067932aebf	resources: Remove SetWatching/IsWatching code from Watch This removes some boilerplate from the Watch methods which can be baked into the engine instead. This code should be checked for races and locks to make sure we only start resources when it makes sense to.	2016-12-20 05:47:40 -05:00
James Shubin	36b916f27f	resources: Simplify resource Converger and Startup code This takes the Converged initialization and Startup patterns that are common in all resources, and bakes it into the core engine. This way resource writing is much more concise and there is less boilerplate!	2016-12-20 05:47:40 -05:00
James Shubin	4803be1987	misc: Rename mgmtmain to lib and remove global package This refactor should make it cleaner to use mgmt.	2016-12-08 23:31:45 -05:00
James Shubin	6edb5c30d5	resources: Actually verify which send/recv elements changed When updating the code, I forgot to actually verify if there were changes or not. This caused erroneous changed messages when none were actually sent.	2016-12-06 14:22:34 -05:00
James Shubin	597ed6eaa0	resources: Polish the password PoC and build out send/recv This polishes the password resource so that it can actually avoid writing the password to disk, and so that the work actually happens in CheckApply where it can properly interact with the graph. This resource now re-generates the password when it receives a notification. The send/recv plumbing has been extended so that receivers can detect when they're receiving new values. This is particularly important if they might otherwise not expect those values to change and cache them for efficiency purposes.	2016-12-06 02:29:47 -05:00
James Shubin	07fd2e88a2	resources: Fix poke/refresh race Clearly the use of errgroup is flawed. 1) You can't pass in variables, so this is likely to race. 2) You can't get a set of errors, so this is a bad API. For the second problem, it would be much more sane to return a multierr or a list of errors. If there's no fix for the first, I think it should be removed from the lib.	2016-12-04 21:06:08 -05:00
James Shubin	2e718c0e9d	resources: Improve notification system and notify refreshes Resources can send "refresh" notifications along edges. These messages are sent whenever the upstream (initiating vertex) changes state. When the changed state propagates downstream, it will be paired with a refresh flag which can be queried in the CheckApply method of that resource. Future work will include a stateful refresh tracking mechanism so that if a refresh event is generated and not consumed, it will be saved across an interrupt (shutdown) or a crash so that it can be re-applied on the subsequent run. This is important because the unapplied refresh is a form of hysteresis which needs to be tracked and remembered or we won't be able to determine that the state is wrong! Still to do: * Update the autogrouping code to handle the edge notify properties! * Actually finish the stateful bool code	2016-12-03 01:35:31 -05:00
James Shubin	b0a8fc165c	resources: Improve the state/cache system Refactor the state cache into the engine. This makes resource writing less error prone, and paves the way for better notifications.	2016-12-03 00:07:29 -05:00
James Shubin	ba6044e9e8	resources, pgraph: split logical chunks into separate files	2016-12-03 00:07:29 -05:00
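
Commit 0b1a4a0f30 describes tracking in-flight Process runs with a WaitGroup that is incremented from obj.Event(), so that a pause does not complete until the resource has quiesced. The following is a minimal sketch of that idea only, not the mgmt engine code: the vertex type, its fields, and the Pause, Resume and Processed methods are hypothetical names chosen for illustration, and the single pending flag is a simplified stand-in for the replay handling the commit mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// vertex is a hypothetical stand-in for one resource in the graph.
type vertex struct {
	mu        sync.Mutex
	wg        sync.WaitGroup // counts Process runs that are still in flight
	quiescing bool           // set while a pause is in progress
	pending   bool           // an event arrived while quiescing; replay on resume
	processed int
}

// Event registers an incoming event. It reports whether a Process run was
// started; while quiescing, the event is only remembered for later replay.
func (v *vertex) Event() bool {
	v.mu.Lock()
	if v.quiescing {
		v.pending = true
		v.mu.Unlock()
		return false
	}
	v.wg.Add(1) // done under the lock, so it always precedes Pause's Wait
	v.mu.Unlock()
	go v.process()
	return true
}

// process stands in for the real Process work (CheckApply, BackPoke, ...).
func (v *vertex) process() {
	defer v.wg.Done()
	v.mu.Lock()
	v.processed++
	v.mu.Unlock()
}

// Pause blocks until every in-flight Process run has finished.
func (v *vertex) Pause() {
	v.mu.Lock()
	v.quiescing = true
	v.mu.Unlock()
	v.wg.Wait()
}

// Resume leaves the quiescing state and replays one deferred event, if any.
func (v *vertex) Resume() {
	v.mu.Lock()
	v.quiescing = false
	replay := v.pending
	v.pending = false
	v.mu.Unlock()
	if replay {
		v.Event()
	}
}

// Processed returns how many Process runs have completed.
func (v *vertex) Processed() int {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.processed
}

func main() {
	v := &vertex{}
	v.Event()
	v.Event()
	v.Pause() // returns only after both Process runs are done
	fmt.Println("processed after pause:", v.Processed())
	fmt.Println("event accepted while paused:", v.Event())
	v.Resume() // replays the deferred event
	v.Pause()
	fmt.Println("processed after replay:", v.Processed())
}
```

In this sketch the counter is incremented under the same lock that guards the quiescing flag, so no Process run can start after Pause has begun waiting.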

41 Commits
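
Commit d8e19cd79a describes "id:count" semaphore tags that cap the parallelism of CheckApply, with the hint that the semaphores are acquired in alphabetical order. The following is a minimal sketch of that ordering trick under stated assumptions, not the mgmt implementation: parseTag, Semaphores, Acquire and peakParallelism are hypothetical names, buffered channels stand in for the P/V counting semaphores, and the DAG ordering from the commit message is not modelled.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
	"sync"
	"time"
)

// parseTag splits an "id:count" tag; the count must be greater than zero.
func parseTag(s string) (string, int, error) {
	i := strings.LastIndex(s, ":")
	if i < 0 {
		return "", 0, fmt.Errorf("tag %q has no count", s)
	}
	n, err := strconv.Atoi(s[i+1:])
	if err != nil || n < 1 {
		return "", 0, fmt.Errorf("tag %q needs a count > 0", s)
	}
	return s[:i], n, nil
}

// Semaphores is a hypothetical registry of counting semaphores keyed by id.
type Semaphores struct {
	mu   sync.Mutex
	sems map[string]chan struct{}
}

// Acquire takes every tagged semaphore in alphabetical id order and returns
// a function that releases them in reverse order.
func (s *Semaphores) Acquire(raw []string) (func(), error) {
	counts := map[string]int{}
	for _, r := range raw {
		id, n, err := parseTag(r)
		if err != nil {
			return nil, err
		}
		counts[id] = n
	}
	ids := make([]string, 0, len(counts))
	for id := range counts {
		ids = append(ids, id)
	}
	sort.Strings(ids) // one global order for all callers avoids lock cycles
	held := make([]chan struct{}, 0, len(ids))
	for _, id := range ids {
		s.mu.Lock()
		c, ok := s.sems[id]
		if !ok { // the first tag seen for an id fixes its capacity
			c = make(chan struct{}, counts[id])
			s.sems[id] = c
		}
		s.mu.Unlock()
		c <- struct{}{} // P: blocks while count holders are active
		held = append(held, c)
	}
	return func() {
		for i := len(held) - 1; i >= 0; i-- {
			<-held[i] // V
		}
	}, nil
}

// peakParallelism runs workers that list the same tags in different orders
// and returns the highest number that held all semaphores at once.
func peakParallelism(workers int) int {
	sems := &Semaphores{sems: map[string]chan struct{}{}}
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			tags := [][]string{{"net:2", "disk:3"}, {"disk:3", "net:2"}}[i%2]
			release, err := sems.Acquire(tags)
			if err != nil {
				panic(err)
			}
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()
			time.Sleep(time.Millisecond) // stand-in for the CheckApply work
			mu.Lock()
			running--
			mu.Unlock()
			release()
		}(i)
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak parallelism:", peakParallelism(8))
}
```

Because every worker sorts its ids before acquiring, two workers that list "net" and "disk" in opposite orders still take them in the same sequence, so neither can hold one semaphore while waiting on the other in a cycle. The reported peak never exceeds the smallest count among the tags, here 2.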