Commit Graph

58 Commits

Author SHA1 Message Date
James Shubin
089267837d engine: graph: Add a lock around metas access
We forgot about the concurrent writes. This should fix that.
2023-09-19 13:46:20 -04:00
James Shubin
0d381e4c91 engine: graph: Handle the back poke differently
A back poke is the deferral or delay of a Process/CheckApply. This is
because we notice that we're not truly ready to CheckApply due to some
timestamp issue. When Process errors, we should accept that, but not
treat it as a success.
2023-09-04 15:22:14 -04:00
James Shubin
7ccda7e99b engine: Add a ctx to the CheckApply API
This is just a rough port, there are lots of optimizations to be done
and lots of timeout values that should be replaced by a new timeout meta
param!
2023-09-02 01:34:42 -04:00
James Shubin
cf49a9a6e7 engine: graph: Give retry channel its own signal
This just makes the copy+pasting less confusing.
2023-09-02 00:59:48 -04:00
James Shubin
1d10f85c28 engine: graph: Move misplaced comment 2023-09-01 22:13:34 -04:00
James Shubin
d3d84524f5 engine: graph: Improve limit output
Just show the milliseconds, and round it slightly.
2023-09-01 21:57:05 -04:00
James Shubin
cc04221516 engine: graph: Allow pause/resume while in retry or limit
The retry and limit "satellite" event loops didn't allow pausing or
resuming, and instead you needed to wait until either was done before
you could pause.

The downside of this patch is that for very fast graph transitions, we
wouldn't be really obeying the limits anymore, however now that we have
per resource kind+name uid, we can persist the limits across graph swaps
if we want to.

Most importantly, this allows us to exit entirely when we're stuck in
one of these satellite loops.
2023-09-01 21:57:05 -04:00
James Shubin
f9bc50e262 engine: Retry should be stateful and add RetryReset
Make the retry meta param a bit more sane now that we can persist it
between graph switches. This also unblocks us from pausing during retry
loops.
2023-09-01 21:57:05 -04:00
James Shubin
9545e409d4 engine: Create a resource kind+name specific stateful store
This adds a meta state store that is preserved between graph switches if
the kind and name match. This is useful so that rapid graph changes
don't necessarily reset their retry count if they've only changed one
resource field.
2023-09-01 21:57:05 -04:00
James Shubin
8299c04fc6 engine: graph, util: Clean up error printing
We should improve on this more, but at least as a quick fix, stop
splitting the error across two lines. This makes the logs really ugly.
2023-09-01 16:53:40 -04:00
James Shubin
0b1b0a3f80 engine: graph: Don't deadlock on error
This simplifies the pause mechanism and also avoids a deadlock on error.
If the Worker shuts down completely, but before we've been removed from
the graph, then an attempted pause would deadlock if we didn't have an
escape hatch here.

This removes the unnecessary ack mechanism now that we have a
synchronous channel send to represent the pausing, rather than an
asynchronous channel closing.
2023-09-01 16:53:40 -04:00
James Shubin
2773a621a2 engine: graph: Cleanup pause/resume code
There's always the fear that there is either a panic or a deadlock in
the highly concurrent engine resource code. I have not seen one recently
and I've been running some pretty concurrent tests. In the meantime, and
with my hopefully improved knowledge of concurrency, I decided to
rewrite some of the "uglier" parts of the engine. I think it is a lot
clearer now, and much less likely that there is a concurrency issue.

This has been tested by running the examples/lang/fastcount.mcl example.
2023-08-30 21:58:38 -04:00
James Shubin
c06cf44fd7 lib, engine: graph: Rename the Close method 2023-08-30 21:38:01 -04:00
James Shubin
7288e5d4a4 engine: graph: Improve documentation on concurrent use
Certain other methods should not be called concurrently, but this only
documents the most important cases.
2023-08-30 21:38:01 -04:00
James Shubin
b62f501745 engine: graph: Fix typo
We actually now cleanup instead of closing. It's semantically slightly
different, so be consistent with the error message.
2023-08-30 21:38:01 -04:00
James Shubin
514927c0b3 engine: resources, graph: Change the Close method to Cleanup
This also changes a few similarly named methods. Clearer what it's doing
in terms of cleanup vs. causing some action.
2023-08-08 01:11:29 -04:00
James Shubin
963393e3d9 engine: graph, resources: Change Watch to use ctx
This is a general port. There are many optimizations and cleanups we can
do now that we have a proper context passed in. That's for a future
patch.
2023-08-08 01:11:29 -04:00
James Shubin
53a878bf61 engine: resources, graph: Change the done channel into a ctx
This is part one of porting Watch to context.
2023-08-08 01:11:29 -04:00
James Shubin
c5efe7a17b lang, engine: Remove unneeded error wrapping
These situations basically never fail, and if they do, we certainly
don't need more context. This simplifies things a bit.
2023-04-20 18:02:40 -04:00
James Shubin
c598e4d289 engine, etcd: Update code for latest gofmt fixes
Latest version of golang broken gofmt again...
2023-03-14 16:43:08 -04:00
James Shubin
a7624a2bf9 legal: Happy 2023 everyone...
Done with:

ack '2022+' -l | xargs sed -i -e 's/2022+/2023+/g'

Checked manually with:

git add -p

Hello to future James from 2024, and Happy Hacking!
2023-03-05 18:31:52 -05:00
James Shubin
3cea422365 legal: Happy 2022 everyone...
Done with:

ack '2021+' -l | xargs sed -i -e 's/2021+/2022+/g'

Checked manually with:

git add -p

Hello to future James from 2023, and Happy Hacking!
2022-08-05 23:06:27 -04:00
James Shubin
784d15b012 all: Misc housekeeping for new golang versions 2022-08-04 14:16:33 -04:00
James Shubin
0ab2406db9 engine: Pass through the program version
We forgot this in a few places.
2022-02-17 13:34:26 -05:00
James Shubin
5927a54208 docs: Improve autogenerate godoc
There were a bunch of packages that weren't well documented. With the
recent split up of the lang package, I figured it would be more helpful
for new contributors who want to learn the structure of the project.
2021-10-26 00:12:18 -04:00
James Shubin
4d187419ac engine: Small typo/cleanups in autogrouping code 2021-05-04 05:30:27 -04:00
James Shubin
58998f9cab engine: Transform the send/recv init functions into helpers
Since we'll want to use them elsewhere, we should make these helper
functions. It also makes the code look a lot neater. Unfortunately, it
adds a bit more indirection, but this isn't a critical flaw here.
2021-05-04 05:30:27 -04:00
James Shubin
336a38081a legal: Happy 2021 everyone...
Done with:

ack '2020+' -l | xargs sed -i -e 's/2020+/2021+/g'

Checked manually with:

git add -p

Hello to future James from 2022, and Happy Hacking!
2021-01-31 16:52:46 -05:00
James Shubin
f67ad9c061 test: Add a check for too long or badly reflowed docstrings
This ensures that docstring comments are wrapped to 80 chars. ffrank
seemed to be making this mistake far too often, and it's a silly thing
to look for manually. As it turns out, I've made it too, as have many
others. Now we have a test that checks for most cases. There are still a
few stray cases that aren't checked automatically, but this can be
improved upon if someone is motivated to do so.

Before anyone complains about the 80 character limit: this only checks
docstring comments, not source code length or inline source code
comments. There's no excuse for having docstrings that are badly
reflowed or over 80 chars, particularly if you have an automated test.
2020-01-25 04:43:33 -05:00
James Shubin
579975f08d engine: graph: Don't error when state file is missing
For some reason we get errors when we try to remove a non-existent state
file. There's a slight possibility that it could be a bug we're working
around, but it's not clear that this is the case, and I think it's
possible that a state file could have gotten nuked by the user somehow,
although this was occurring "naturally" when running reverse1.mcl so
let's keep that working for now.
2020-01-12 16:41:09 -05:00
James Shubin
3707b39fef engine: graph: Improve comments
Clarify that we're referring to cycles in the graph, since it needs to
be a DAG.
2020-01-12 16:39:32 -05:00
James Shubin
2648fb1bb1 legal: Happy 2020 everyone...
Done with:

ack '2019+' -l | xargs sed -i -e 's/2019+/2020+/g'

Checked manually with:

git add -p

Hello to future James from 2021, and Happy Hacking!
2020-01-03 20:08:37 -05:00
James Shubin
5526bbba64 engine: resources: Add a tftp server and tftp file resource
This adds a tftp server and tftp file resource to help you run a small
pure golang tftp server embedded inside the mgmt resource model.
2019-12-17 03:41:45 -05:00
James Shubin
eaab1aae28 engine: graph, resources: Add filtered graph function
This lets a resource query the resource graph in a controlled way.
2019-10-29 07:15:43 -04:00
James Shubin
325ca03a13 engine: graph: Pass through the graph struct
We want to use it in the resources.
2019-10-29 07:15:43 -04:00
James Shubin
5c27a249b7 engine: resources: Add reversible API and file resource
This adds the first reversible resource (file) and the necessary engine
API hooks to make it all work. This allows a special "reversed" resource
to be added to the subsequent graph in the stream when an earlier
version "disappears". This disappearance can happen if it was previously
in an if statement that then becomes false.

It might be wise to combine the use of this meta parameter with the use
of the `realize` meta parameter to ensure that your reversed resource
actually runs at least once, if there's a chance that it might be gone
for a while.

This patch also adds a new test harness for testing resources. It
doesn't test the "live" aspect of resources, as it doesn't run Watch,
but it was designed to ensure CheckApply works as intended, and it runs
very quickly with a simplified timeline of happenings.
2019-09-11 03:40:22 -04:00
James Shubin
34d572c523 engine: Improve the way we make a unique res path token
This is needed in the state directory.
2019-09-11 03:16:57 -04:00
James Shubin
12b906eac6 engine: Refactor state dir into a separate function
This lets us re-use it, and know the path is fixed.
2019-09-11 03:16:57 -04:00
James Shubin
7a756cacb9 engine: graph: Add a mutex around waits map access
If you ran some extremely absurd code, it turns out you can cause a
race. This was found by roiedelapluie experimenting! In this case, it
would panic with: fatal error: concurrent map read and map write. This
patch adds the mutex to avoid this rare race.
2019-05-14 10:53:36 -04:00
James Shubin
a5842a41b2 etcd: Rewrite embed etcd implementation
This is a giant cleanup of the etcd code. The earlier version was
written when I was less experienced with golang.

This is still not perfect, and does contain some races, but at least
it's a decent base to start from. The automatic elastic clustering
should be considered an experimental feature. If you need a more
battle-tested cluster, then you should manage etcd manually and point
mgmt at your existing cluster.
2019-04-11 21:43:48 -04:00
James Shubin
07f542b4d7 legal: Happy 2019 everyone...
Done with:

ack '2018+' -l | xargs sed -i -e 's/2018+/2019+/g'

Checked manually with:

git add -p

Hello to future James from 2020, and Happy Hacking!
2019-03-24 15:08:50 -04:00
James Shubin
753d1104ef util: Port all multierr code to new errwrap package
This cleans things up and simplifies a lot of the code. Also it's easier
to just import one error package when needed.
2019-03-12 16:51:37 -04:00
James Shubin
880652f5d4 util: Port all code to new errwrap package
This should keep things more uniform.
2019-03-12 16:49:01 -04:00
James Shubin
de43569fa2 engine, lang: Improve send/recv significantly
Part of this was rotten, and not fully functional. This fixes the rot,
adds some tests, and improves the type checking that occurs when sending
and receiving values. In addition, a significant portion of this happens
at compile time.

There is still more work to be done here, but this should get us a good
chunk of the way for now.
2019-03-09 17:37:58 -05:00
James Shubin
aa165b5e17 engine: Add the retry loop around Process
This adds back the retry loop around Process. This is done as a
separate commit so you can more easily see the logic of the retry magic
This commit is similar but different to the earlier commit adding retry
around Watch.
2019-02-24 12:28:59 -05:00
James Shubin
f06e87377c engine: Add limit delay before Process can run
This adds back the limit delay around Process.
2019-02-24 12:28:59 -05:00
James Shubin
4c3bf9fc7a engine: Add the retry loop around Watch
This adds back the retry loop around Watch. This is done as a separate
commit so you can more easily see the logic of the retry magic.
2019-02-24 12:28:59 -05:00
James Shubin
253ed78cc6 engine: Rewrite the core algorithm
The engine core had some unfortunate bugs that were the result of some
early design errors when I wasn't as familiar with channels. I've
finally rewritten most of the bad parts, and I think it's much more
logical and stable now.

This also simplifies the resource API, since more of the work is done
completely in the engine, and hidden from view.

Lastly, this adds a few new metaparameters and associated code.

There are still some open problems left to solve, but hopefully this
brings us one step closer.
2019-02-24 12:28:59 -05:00
James Shubin
4860d833c7 converger: Rewrite the converger module
I found a deadlock in the converger code, and I realized the code was
sufficiently bad that it needed a good clean up.
2019-02-24 12:28:59 -05:00
James Shubin
bf63d2e844 engine: graph: Avoid a possible panic sending on a closed channel
It's plausible that we send on a closed channel if we're running a back
poke and it tries to send a poke on something that has already closed.
If it detects this condition, it will exit.

Unfortunately, it's not clear if the wait group will protect this case,
but hopefully this will hold us until we can re-write the needed parts
of the engine.
2019-01-17 20:05:49 -05:00