Commit Graph

39 Commits

Author SHA1 Message Date
Julien Pivotto
fdce9d6a6a prometheus: Initialize mgmt_checkapply_total metrics
It is recommended by Prometheus to initialize metrics:

https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics

This commits initialize the mgmt_checkapply_total metric
for each registered resource.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-11-23 15:23:41 +01:00
Jonathan Gold
0f70c31a30 etcd: Add advertise urls to cli
This patch adds the option to specify URLs to advertise for clients and peers.
This will facilitate etcd communication through nat, where we want to listen
on a local IP, but expose a public IP to clients/peers.
2017-10-28 22:42:27 -04:00
James Shubin
46be83f8f7 legal: Re-license to GPLv3 2017-09-11 18:07:47 -04:00
James Shubin
6b489f71a1 remote: Add a Ready method to know when startup is finished
Previously, there was an extremely rare race where we would startup,
kick off the Run method in a goroutine, and then run Exit before Run got
very far in its execution. If Run ran some early sections of its code
_after_ we had Exited, we would trigger a panic due to the converger UID
being unregistered.

This patch blocks Exit from progressing until Run has started and
finished running. It also adds a Ready method so that you can monitor
this signal yourself if you'd like to add the necessary wait to your
code.
2017-06-08 03:55:03 -04:00
James Shubin
9f5057eac7 resources: Do not panic on autogrouped graph switches
Graph changes from autogrouped -> not autogrouped or vice versa cause a
panic (or I assume a leak) because we compared the auto grouped graph to
the ungrouped one, which would cause an Exit on an unstarted Vertex.
This includes a test that seems to reliably reproduces the issue.
2017-06-08 01:05:58 -04:00
James Shubin
4f420dde05 etcd: Wait for server to start before continuing
I think there was a rare race where we would make use of the etcd server
before it had fully started up. I only ever saw this occur on travis,
and with this fix hopefully we'll never see it again.

It is worth mentioning that much of my etcd code and the lib Run()
function could use a solid cleaning.
2017-06-03 01:00:35 -04:00
James Shubin
4d9d0d4548 resources: Improve AutoEdge API and pkg breakage
I previously broke the pkg auto edges because the package list wasn't
available by the time it was called. This fixes the pkg resource so that
it gets the necessary list of packages when needed. Since this means
that a possible failure could happen, we also update the AutoEdges API
to support errors. Errors can only be generated at AutoEdge struct
creation, once the struct has been returned (right before modification
of the graph structure) there is no possibility to return any errors.

It's important to remember that the AutoEdges stuff gets called because
the Init of each resource, so make sure it doesn't depend on anything
that happens there or that gets cached as a result of Init.

This is all much nicer now and has a test too :)
2017-06-02 22:15:28 -04:00
James Shubin
9cbaa892d3 gapi: Allow the GAPI implementer to specify fast and exit
This allows the implementer of the GAPI to specify three parameters for
every Next message sent on the channel. The Fast parameter tells the
agent if it should do the pause quickly or if it should finish the
sequence. A quick pause means that it will cause a pause immediately
after the currently running resources finish, where as a slow (default)
pause will allow the wave of execution to finish. This is usually
preferred in scenarios where complex graphs are used where we want each
step to complete. The Exit parameter tells the engine to exit, and the
Err parameter tells the engine that an error occurred.
2017-06-02 04:03:10 -04:00
James Shubin
c35916fad1 resources: Rename the Data struct to ResData to avoid ambiguity
There's a similarly named gapi.Data struct which we could also rename.
2017-06-02 02:53:53 -04:00
James Shubin
14c2fd1edd resources: Add proper edge compare method
Might as well do this cleanly in one place.
2017-05-31 17:27:34 -04:00
James Shubin
4150ae7307 pgraph: Replace edge struct with interface
This further cleans up the pgraph lib to be more generic.
2017-05-31 15:36:15 -04:00
James Shubin
a87288d519 pgraph, resources: Major refactoring continued
There was simply some technical debt I needed to kill off. Sorry for not
splitting this up into more patches.
2017-05-31 15:36:14 -04:00
James Shubin
3cf9639e99 pgraph, resources: Major refactor to remove pgraph to resource dep
This is the mechanical port of the remaining bits. Next to clean it up a
bit.
2017-05-29 15:43:50 -04:00
James Shubin
11c3a26c23 pgraph: Move the AutoEdges mechanism into the resource package
Remove the pgraph->resource dependency.
2017-05-29 15:43:50 -04:00
James Shubin
1c59712cbf pgraph: Move AssociateData function out of the package
This removes another dependency on the resource package.
2017-05-15 10:19:46 -04:00
James Shubin
c2cb1c9168 pgraph: Move GraphMetas function out of package
This removes a dependency on the resources package which wasn't
necessary.
2017-05-15 10:06:31 -04:00
James Shubin
70e7ee2d46 pgraph: Remove use of Flags struct in favour of Value API
One small step to completely cleaning up the pgraph package so that we
can eventually fix the code that would otherwise create a cycle!
2017-05-13 13:28:41 -04:00
James Shubin
a4858be967 lib, gapi: Next method of GAPI should generate first event
This puts the generation of the initial event into the Next method of
the GAPI. If it does not happen, then we will never get a graph. This is
important because this notifies the GAPI when we're actually ready to
try and generate a graph, rather than blocking on the Graph method if we
have a long compile for example.

This is also required for the etcd watch cleanup.
2017-04-10 03:20:58 -04:00
James Shubin
6fd5623b1f gapi: Move separate etcd Watch method into GAPI
This cleans up the API to not have a special case for etcd anymore. In
particular, this also adds the requirement that the GAPI must generate
an event on startup as soon as it is ready to generate a graph.
2017-04-10 03:20:58 -04:00
James Shubin
3e001f9a1c main: Update log messages for consistency 2017-03-16 13:14:50 -04:00
James Shubin
cd5e2e1148 pgraph: Add fast pausing and exiting of graphs
This causes a graph to actually stop processing part way through, even
if there are poke's that want to continue on. This is so that the user
experience of pressing ^C actually causes a shutdown without finishing
the graph execution. It might be preferred to have this be a user
defined setting at some point in the future, such as if the user presses
^C twice. As well, we might want to implement an interrupt API so that
individual resource execution can be asked to bail out early if
requested. This could happen on a third ^C press.
2017-03-13 07:54:03 -04:00
James Shubin
a0686b7d2b pgraph: graphviz: Update Graphviz lib to quote names properly
This also moves the library to after the graph starts so that the kind
fields will be visible.
2017-03-08 19:23:33 -05:00
James Shubin
44771a0049 gapi: Move the World interface into resources
This was necessary to fix some "import cycle" errors I was having when
adding the World api to the resource Data struct.

I think this is a good hint that I need to start splitting up existing
packages into sub files, and cleaning up and inter-package problems too.
2017-03-08 19:23:33 -05:00
James Shubin
32aae8f57a lib, pgraph, resources: Refactor data association API
This should make things cleaner and help avoid as much churn every time
we change a property.
2017-03-07 22:51:11 -05:00
James Shubin
8207e23cd9 lib: Refactor instantiation of world API 2017-03-07 22:51:11 -05:00
James Shubin
d8e19cd79a semaphore: Create a semaphore metaparam
This adds a P/V style semaphore mechanism to the resource graph. This
enables the user to specify a number of "id:count" tags associated with
each resource which will reduce the parallelism of the CheckApply
operation to that maximum count.

This is particularly interesting because (assuming I'm not mistaken) the
implementation is dead-lock free assuming that no individual resource
permanently ever blocks during execution! I don't have a formal proof of
this, but I was able to convince myself on paper that it was the case.

An actual proof that N P/V counting semaphores in a DAG won't ever
dead-lock would be particularly welcome! Hint: the trick is to acquire
them in alphabetical order while respecting the DAG flow. Disclaimer,
this assumes that the lock count is always > 0 of course.
2017-02-27 02:57:06 -05:00
Julien Pivotto
7d92ab335a prometheus: Add mgmt_pgraph_start_time_seconds metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 15:28:43 +01:00
James Shubin
98bc96c911 golint: Fixup issues found in the report
This also increases the max allowed to 5% -- I'm happy to make this
lower if someone asks.
2017-02-22 22:18:55 -05:00
James Shubin
e070a85ae0 lib: Misc cleanups and new log message 2017-02-22 17:45:16 -05:00
James Shubin
2da21f90f4 pgraph, resources: Improve Init/Close and Worker status
This should do some rough cleanups around the Init/Close of resources,
and tracking of Worker function status.
2017-02-21 18:42:07 -05:00
James Shubin
a981cfa053 legal: Oh yeah, it is 2017 2017-02-16 01:34:32 -05:00
James Shubin
35d3328e3e etcd: Remove stuttering in package
This is a good first step to cleaning up the package.
2017-02-12 22:51:46 -05:00
Julien Pivotto
e8855f7621 prometheus: Implement mgmt_checkapply_total metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-12 23:45:47 +01:00
Julien Pivotto
1af67e72d4 prometheus: Implement basic Prometheus support
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-08 12:13:33 +01:00
James Shubin
62c3add888 lib: Make initial one-shot start pattern more elegant
This should do the same thing, in a more elegant and efficient way.
2017-02-05 19:00:14 -05:00
James Shubin
2c8c9264a4 pgraph: Simplify graph exit waiting
I think the vertex resource exiting can be done in a single stage
instead of the previous two stage exit.
2016-12-20 05:49:17 -05:00
James Shubin
dd8d17232f pgraph: Build the sync group into the graph structure
This hides the sync/wait logic inside the graph itself.
2016-12-20 05:49:17 -05:00
James Shubin
6312b9225f gapi: Rename SwitchStream to Next
This is more concise and I think more logical. Complains welcome!
2016-12-20 05:49:17 -05:00
James Shubin
4803be1987 misc: Rename mgmtmain to lib and remove global package
This refactor should make it cleaner to use mgmt.
2016-12-08 23:31:45 -05:00