Commit Graph

523 Commits

Author SHA1 Message Date
James Shubin
c696ebf53c resources: svc: Add failed state
Services can be in a failed state too.
2017-03-08 19:23:33 -05:00
James Shubin
a0686b7d2b pgraph: graphviz: Update Graphviz lib to quote names properly
This also moves the library to after the graph starts so that the kind
fields will be visible.
2017-03-08 19:23:33 -05:00
James Shubin
8d94be8924 resources: kv: Add new KV resource which sets key value pairs
This is a new resource for setting key value pairs in our global world
database. Currently only etcd is supported. Some of the implications and
possibilities of this resource will become more obvious with future
commits!

You can bother/test this resource with these commands:

ETCDCTL_API=3 etcdctl get "/_mgmt/strings/" --prefix=true
ETCDCTL_API=3 etcdctl put "/_mgmt/strings/KEY/HOSTNAME" 42

Replace the KEY and HOSTNAME variables with the actual values you'd like
to use. The 42 is the value that is set.
2017-03-08 19:23:33 -05:00
James Shubin
e97ac5033f resources: Split util functions into separate file
This also adds errwrap to their implementation.
2017-03-08 19:23:33 -05:00
James Shubin
44771a0049 gapi: Move the World interface into resources
This was necessary to fix some "import cycle" errors I was having when
adding the World api to the resource Data struct.

I think this is a good hint that I need to start splitting up existing
packages into sub files, and cleaning up and inter-package problems too.
2017-03-08 19:23:33 -05:00
James Shubin
32aae8f57a lib, pgraph, resources: Refactor data association API
This should make things cleaner and help avoid as much churn every time
we change a property.
2017-03-07 22:51:11 -05:00
James Shubin
8207e23cd9 lib: Refactor instantiation of world API 2017-03-07 22:51:11 -05:00
James Shubin
a469029698 etcd: Switch to buffered channel to remove duplicates
Since we don't return the actual values and instead only tell about
events (which leaves the `Get` of the value as a second operation) then
we don't have to use a channel with backpressure since all the events
are identical.
2017-03-07 22:51:11 -05:00
James Shubin
203d866643 gapi, etcd: Define and implement a string sharing API for the World
This adds a new set of methods to the World API (for sharing data
throughout the cluster) and adds an etcd backed implementation.
2017-03-07 22:51:11 -05:00
Michael Borden
1488e5ec4d travis: Add go_import_path
This should fix an issue with turning on travis for any mgmt fork.
2017-03-07 14:17:53 -08:00
Michael Borden
af66138a17 misc: Add pacman support 2017-03-03 21:04:29 -08:00
James Shubin
5f060d60a7 test: Avoid matching three X's
This helps my "WIP" detector script avoid false positives. It is a
simple script which helps me find release critical problems.
2017-03-01 22:37:08 -05:00
James Shubin
73ccbb69ea travis: Ensure recent golang versions in test
This ensures travis picks the latest versions. Apparently writing 1.x is
also a valid choice.
2017-03-01 21:45:02 -05:00
James Shubin
be60440b20 readme: Add new blog post about metaparameters
This also documents the recent semaphore work.
2017-03-01 16:18:14 -05:00
James Shubin
837efb78e6 spelling: Fix typos as found by goreportcard 2017-02-28 23:48:34 -05:00
James Shubin
4a62a290d8 pgraph: Clean up tests
This splits the tests into multiple files.
2017-02-28 16:47:16 -05:00
James Shubin
018399cb1f semaphore, pgraph: Add semaphore grouping and tests
If two resources are grouped, then the result should contain the
semaphores of both resources. This is because the user is expecting
(independently) resource A and resource B to have a limiting choke
point. If when combined those choke points aren't preserved, then we
have broken an important promise to the user.
2017-02-28 16:40:53 -05:00
James Shubin
646a576358 remote: Replace builtin semaphore type with common util lib
Refactor code to use the new fancy lib!
2017-02-27 02:57:06 -05:00
James Shubin
d8e19cd79a semaphore: Create a semaphore metaparam
This adds a P/V style semaphore mechanism to the resource graph. This
enables the user to specify a number of "id:count" tags associated with
each resource which will reduce the parallelism of the CheckApply
operation to that maximum count.

This is particularly interesting because (assuming I'm not mistaken) the
implementation is dead-lock free assuming that no individual resource
permanently ever blocks during execution! I don't have a formal proof of
this, but I was able to convince myself on paper that it was the case.

An actual proof that N P/V counting semaphores in a DAG won't ever
dead-lock would be particularly welcome! Hint: the trick is to acquire
them in alphabetical order while respecting the DAG flow. Disclaimer,
this assumes that the lock count is always > 0 of course.
2017-02-27 02:57:06 -05:00
James Shubin
757cb0cf23 test: Small fixups to t4 and a rename 2017-02-26 20:54:22 -05:00
Julien Pivotto
7d92ab335a prometheus: Add mgmt_pgraph_start_time_seconds metric
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 15:28:43 +01:00
James Shubin
46c6d6f656 compiling: Add -i flag to go build to speed up builds
So builds take about 30s on my shitty machine which is pretty long.
Turns out golang caches things in $GOPATH/pkg/ but is not clever enough
to erase things from there that are out of date. As a result, it was
rebuilding everything (including the unchanged dependencies) every
build!

By completely wiping out $GOPATH/pkg/ and then running `go build -i`,
this now takes builds down to about 8 seconds. (After one full build is
finished.)

This is basically the same as running `go install`, but without copying
junk to $GOPATH/bin.

Hopefully the tooling will be smart enough to know when to throw out
stuff in $GOPATH/pkg automatically and avoid this problem entirely.

Is it wrong to send Google a bill for all the extra cpu cycles I've
used? ;)
2017-02-26 04:06:36 -05:00
Julien Pivotto
46260749c1 prometheus: Move the prometheus nil check inside the prometheus function
That pattern will be reused in future metrics.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 09:33:34 +01:00
Julien Pivotto
50664fe115 compilation: Make build the default target
It is really strange that whe I run make, it does not build mgmt. This
commit makes build the default target, without moving the target,
therefore we keep as much as we can the order of the file.

This also removes the confusion for designers that would run "make"
instead of "make art", whose work would be disrupted when we add a --
let's say -- make alpharelese command.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-26 09:04:20 +01:00
James Shubin
c480bd94db resources: virt: Remove unnecessary early exit from CheckApply
I don't think this early exit is necessary any more, since the main
CheckApply function really just spawns out to the different sub workers
which all individually check the apply variable.

If I'm wrong, we can revert this. It was @roidelapluie that noticed the
check here to begin with.
0.0.9
2017-02-25 21:29:08 -05:00
James Shubin
79923a939b resources: virt: Catch bad calls to CheckApply
If the engine cheats, we'll know!
2017-02-25 21:28:35 -05:00
James Shubin
327b22113a resources: virt: Don't block exit in callbacks
This prevents us blocking an exit if we close when a callback was about
to run. This is because the callbacks are called from the
EventRunDefaultImpl method, which waits for their return to exit and
release the WaitGroup.

I think we should probably get rid of the obj.wg since the engine is
supposed to guarantee that Close doesn't happen before Watch finishes.
2017-02-25 21:04:36 -05:00
James Shubin
12160ab539 pgraph: Wait for Process routine to exit
Wait for the innerWorker's Process routine to exit, before we exit too.
2017-02-25 21:01:02 -05:00
James Shubin
2462ea0892 pgraph, resources: Wait for innerWorker to exit cleanly
Don't run the Close() method until the innerWorker has exited cleanly.
This is a guarantee which we make to the resources.
2017-02-25 21:00:38 -05:00
James Shubin
8be09eadd4 test: Fix up probable timeout failures due to slow ci
We had *occasional* failures most likely due to slow ci and compounded
by low entropy. We also weren't pointing at the right test!
2017-02-22 23:08:09 -05:00
James Shubin
98bc96c911 golint: Fixup issues found in the report
This also increases the max allowed to 5% -- I'm happy to make this
lower if someone asks.
2017-02-22 22:18:55 -05:00
James Shubin
b0fce6a80d travis: Start building golang 1.8 as well
Require success from 1.7 since we'll probably move to it as the default
within six months.
2017-02-22 22:18:55 -05:00
James Shubin
53b8a21d1e resources: virt: Cleanup cleanly on Close
Don't block accidentally on error!
2017-02-22 22:18:55 -05:00
James Shubin
1346492d72 yamlgraph: Close the recwatcher properly after use
Don't leave it running unnecessarily! This might have contributed to a
block, but it was hard to isolate if this was the cause or if this was
one of many causes.
2017-02-22 18:37:43 -05:00
James Shubin
e5bb8d7992 recwatch: Close the ConfigWatch properly
This improves the shutdown process so that there is no change of
blocking if the sender runs close without having emptied the channel.
2017-02-22 17:45:16 -05:00
James Shubin
49594b8435 pgraph, resources: Clean up the event system around the resources
This cleans up some of the resource events and also reorganizes the
struct for simplicity. This should hopefully kill off at least one race
which would cause unnecessary blocking!

Yes this patch is a bit yucky, but so was the bug I was fighting with!
2017-02-22 17:45:16 -05:00
James Shubin
3bd37a7906 test: Don't block on graph transitions
Improvements in the engine have uncovered some annoying race conditions
which would cause the engine to block between transitions. This is a
test which catches the most obvious file based ones.

This requires inotify to work in the test environment.
2017-02-22 17:45:16 -05:00
James Shubin
e070a85ae0 lib: Misc cleanups and new log message 2017-02-22 17:45:16 -05:00
James Shubin
c189278e24 recwatch: Unblock from sending on exit
If we receive an exit signal while we are waiting to send a message, we
should still be able to shut down.
2017-02-22 16:41:47 -05:00
James Shubin
2a8606bd98 recwatch: Close cleanly after Watcher finishes
This cleans up the recwatch code to avoid some possible races. While you
were previously able to call Close more than once, this was really
something that would mask bugs, rather than a useful feature.
2017-02-22 16:41:26 -05:00
James Shubin
18ea05c837 pgraph, resources: Add proper start/stop signals
We need to perform some operations in lock step between graph
transitions. This should help with that!
2017-02-21 18:48:27 -05:00
James Shubin
86c3072515 resources: The user should not call Init
Even in tests...
2017-02-21 18:42:07 -05:00
James Shubin
fccf508dde resources, pgraph: Refactor Worker and simplify API
I'm still working on reducing the size of the monster patches that I
land, but I'm exercising the priviledge as the initial author. In any
case, this refactors worker into two, and cleans up the passing around
of the processChan. This puts common code into Init and Close.
2017-02-21 18:42:07 -05:00
James Shubin
2da21f90f4 pgraph, resources: Improve Init/Close and Worker status
This should do some rough cleanups around the Init/Close of resources,
and tracking of Worker function status.
2017-02-21 18:42:07 -05:00
James Shubin
bec7f1726f resources: virt: Allow hotplugging
This allows hot (un)plugging of CPU's! It also includes some general
cleanups which were necessary to support this as well as some other
features to the virt resource. Hotunplug requires Fedora 25.

It also comes with a mini shell script to help demo this capability.

Many thanks to pkrempa for his help with the libvirt API!
2017-02-21 18:42:07 -05:00
James Shubin
74dfb9d88d test: Make test status more clear 2017-02-21 18:40:31 -05:00
James Shubin
02dddfc227 test: Fix yamlfmt test
Last chance before we kill this entirely.
2017-02-21 16:16:41 -05:00
Julien Pivotto
545016b38f project: Add $me to AUTHORS
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2017-02-21 07:07:41 +01:00
Felix Frank
0ccceaf226 project: Reluctantly add myself as an author
For the record, I currently don't feel that my level of contributions
warrant a spot here, but I'm entering myself upon express invitation
by James =) ...also in anticipation of much more substantial work.
2017-02-20 22:53:30 -05:00
James Shubin
a601115650 test: Fix false negative on go vet
This was my fault, now it is fixed :) It passed locally due to a bug.
2017-02-20 18:12:30 -05:00