Commit Graph

5 Commits

Author SHA1 Message Date
James Shubin
6794aff77c miscellaneous cleanups and fixes 2016-08-31 21:55:19 -04:00
James Shubin
eee652cefe converger: Add new timer system for determining convergence
This adds a new method of marking whether a particular UUID has
converged or not. You can now Start, Stop, or Reset a convergence timer
on the individual UUID's. This wraps the existing SetConverged calls
with a hidden go routine. It is not recommended to use the SetConverged
calls and the Timer calls on the same UUID.
2016-08-31 21:55:19 -04:00
James Shubin
6d45cd45d1 converger: Add new methods to the API
This adds new helper methods to the API such as the ability to query the
set timeout, and to set the state change function after initialization.
The first makes it easier to pass in the timeout value to functions and
structs, because you don't need to pass it separately from the converger
object. The second makes it easy to specify the state change functon
when you don't know what it is at creation time. In addition, it is more
powerful now, and tells you when we converge or un-converge in case we
want to take different actions on each.
2016-08-31 21:55:19 -04:00
James Shubin
f5fb135793 converger: Update the API for errors and naming
We updated the API and behaviour to make two changes:

1) We remove the log.* stuff, and replace the Fatal calls with straight
calls to panic(). These are meant to be like an assert, and shouldn't
happen unless there is a user error or a bug.

2) We made the !uuid.IsValid() checks return an error instead of causing
a panic. These returns errors instead, and makes the process safer for
users who are okay calling a function that won't have an effect.
2016-08-31 21:55:19 -04:00
James Shubin
6f3ac4bf2a Rework the converged detection and provide a clean interface
The old converged detection was hacked in code, instead of something
with a nice interface. This cleans it up, splits it into a separate
file, and removes a race condition that happened with the old code.

We also take the time to get rid of the ugly Set* methods and replace
them all with a single AssociateData method. This might be unnecessary
if we can pass in the Converger method at Resource construction.

Lastly, and most interesting, we suspend the individual timeout callers
when they've already converged, thus reducing unnecessary traffic, and
avoiding fast (eg: < 5 second) timers triggering more than once if they
stay converged!

A quick note on theory for any future readers... What happens if we have
--converged-timeout=0 ? Well, for this and any other positive value,
it's important to realize that deciding if something is converged is
actually a race between if the converged timer will fire and if some
random new event will get triggered. This is because there is nothing
that can actually predict if or when a new event will happen (eg the
user modifying a file). As a result, a race is always inherent, and
actually not a negative or "incorrect" algorithm.

A future improvement could be to add a global lock to each resource, and
to lock all resources when computing if we are converged or not. In
practice, this hasn't been necessary. The worst case scenario would be
(in theory, because this hasn't been tested) if an event happens
*during* the converged calculation, and starts running, the exit command
then runs, and the event finishes, but it doesn't get a chance to notify
some service to restart. A lock could probably fix this theoretical
case.
2016-03-29 20:27:38 -04:00