This mega patch primarily introduces a new function engine. The main
reasons for this new engine are:
1) Massively improved performance with lock-contended graphs.
Certain large function graphs could have very high lock-contention which
turned out to be much slower than I would have liked. This new algorithm
happens to be basically lock-free, so that's another helpful
improvement.
2) Glitch-free function graphs.
The function graphs could "glitch" (an FRP term) which could be
undesirable in theory. In practice this was never really an issue, and
I've not explicitly guaranteed that the new graphs are provably
glitch-free, but in practice things are a lot more consistent.
3) Simpler graph shape.
The new graphs don't require the private channels. This makes
understanding the graphs a lot easier.
4) Branched graphs only run half.
Previously we would run two pure side of an if statement, and while this
was mostly meant as an early experiment, it stayed in for far too long
and now was the right time to remove this. This also means our graphs
are much smaller and more efficient too.
Note that this changed the function API slightly. Everything has been
ported. It's possible that we introduce a new API in the future, but it
is unexpected to cause removal of the two current APIs.
In addition, we finally split out the "schedule" aspect from
world.schedule(). The "pick me" aspects now happen in a separate
resource, rather than as a yucky side-effect in the function. This also
lets us more precisely choose when we're scheduled, and we can observe
without being chosen too.
As usual many thanks to Sam for helping through some of the algorithmic
graph shape issues!
This removes the secret channel from the call function. Having it made
it more complicated to write new function engines, and it's not clear
why it was even needed in the first place. It seems that even the
current generation of function engines work just fine without it.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
Most of the time, we don't need to have a dynamic call sub graph, since
the actual function call could be represented statically as it
originally was before lambda functions were implemented. Simplifying the
graph shape has important performance benefits in terms of both keep the
graph smaller (memory, etc) and in avoiding the need to run transactions
at runtime (speed) to reshape the graph.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
I've been waiting to write this patch for a long time. I firmly believe
that the idea of "exported resources" was truly a brilliant one, but
which was never even properly understood by its original inventors! This
patch set aims to show how it should have been done.
The main differences are:
* Real-time modelling, since "once per run" makes no sense.
* Filter with code/functions not language syntax.
* Directed exporting to limit the intended recipients.
The next step is to add more "World" reading and filtering functions to
make it easy and expressive to make your selection of resources to
collect!
This adds an initial implementation of printing line numbers on type
unification errors. It also attempts to print a visual position
indicator for most scenarios.
This patch was started by Felix Frank and finished by James Shubin.
Co-authored-by: Felix Frank <Felix.Frank.de@gmail.com>
Even if they have no side effect, they aren't legal, since it would be
surprising if suddenly something new got added in one which then broke
the imports.
This adds a modern type unification algorithm, which drastically
improves performance, particularly for bigger programs.
This required a change to the AST to add TypeCheck methods (for Stmt)
and Infer/Check methods (for Expr). This also changed how the functions
express their invariants, and as a result this was changed as well.
This greatly improves the way we express these invariants, and as a
result it makes adding new polymorphic functions significantly easier.
This also makes error output for the user a lot better in pretty much
all scenarios.
The one downside of this patch is that a good chunk of it is merged in
this giant single commit since it was hard to do it step-wise. That's
not the end of the world.
This couldn't be done without the guidance of Sam who helped me in
explaining, debugging, and writing all the sneaky algorithmic parts and
much more. Thanks again Sam!
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This removes the exclusive from the res names and edge names. We now
require that the names should be lists of strings, however they can
still be single strings if that can be determined statically.
Programmers should explicitly wrap their variables in a string by
interpolation to force this, or in square brackets to force a list. The
former is generally preferable because it generates a small function
graph since it doesn't need to build a list.
This adds support for `include as <identifier>` type statements which in
addition to pulling in any defined resources, it also makes the contents
of the scope of the class available to the scope of the include
statement, but prefixed by the identifier specified.
This makes passing data between scopes much more powerful, and it also
allows classes to return useful classes for subsequent use.
This also improves the SetScope procedure and adds to the Ordering
stage. It's unclear if the current Ordering stage can handle all code,
or if there exist corner-cases which are valid code, but which would
produce a wrong or imprecise topological sort.
Some extraneous scoping bugs still exist, which expose certain variables
that we should not depend on in future code.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This adds ExprTopLevel and ExprSingleton and ensures that ExprBind is
now monomorphic.
This corrects a previous design bug where it was not monomorphic and
would thus cause spawning of many more copies than necessary. In most
cases this was only harmful to memory and performance, and not
behaviour, since these functions were pure, and we didn't have a test
for this.
This also adds a bunch more tests. Most notably, the graph shape tests
generally produce smaller graphs now.
Lastly, a lambda cannot have two different types when used at two
different call sites. It is rare that this would be used, and when it
would make sense, there are easy workarounds to accomplish equivalent
goals.
This was mostly authored by Sam, James helped with some cleanup and
debugging.
Co-authored-by: James Shubin <james@shubin.ca>
These test both graph shape consistency and single value outputs.
Eventually we want to make the graph shape tests more precise, and also
verify specific outputs how it used to be. For now, this is okay.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This adds a meta state store that is preserved between graph switches if
the kind and name match. This is useful so that rapid graph changes
don't necessarily reset their retry count if they've only changed one
resource field.
This makes the tests easier to read and modify without having out of
order numbers. When writing the tests, you'll remember more easily which
section you're erroring in too!
This teaches the compiler to catch entries with duplicate fields, and
duplicate meta entries, because it could be ambiguous to determine which
should take precedence. For example, if you specified `content` to a
file resource twice, this should error. This is known statically, so we
can catch it. If you specified two `Meta:noop` entries, this can also be
caught.
The interesting part happens when you specify one `Meta:noop` entry, and
one `Meta` entry which happens to contain a noop field in the struct.
For this, we actually have to wait until type unification is finished,
and catch the error there. This is because after type unification we
will know the precise type of the struct being passed to `Meta`, and so
we can look at its field names, even if their values aren't yet known
because the graph hasn't run yet.
This adds a giant missing piece of the language: proper function values!
It is lovely to now understand why early programming language designers
didn't implement these, but a joy to now reap the benefits of them. In
adding these, many other changes had to be made to get them to "fit"
correctly. This improved the code and fixed a number of bugs.
Unfortunately this touched many areas of the code, and since I was
learning how to do all of this for the first time, I've squashed most of
my work into a single commit. Some more information:
* This adds over 70 new tests to verify the new functionality.
* Functions, global variables, and classes can all be implemented
natively in mcl and built into core packages.
* A new compiler step called "Ordering" was added. It is called by the
SetScope step, and determines statement ordering and shadowing
precedence formally. It helped remove at least one bug and provided the
additional analysis required to properly capture variables when
implementing function generators and closures.
* The type unification code was improved to handle the new cases.
* Light copying of Node's allowed our function graphs to be more optimal
and share common vertices and edges. For example, if two different
closures capture a variable $x, they'll both use the same copy when
running the function, since the compiler can prove if they're identical.
* Some areas still need improvements, but this is ready for mainstream
testing and use!
I wanted to make sure that the type unification algorithm restricts the
implementation of the class when included, when one of the polymorphic
types is specified with a fixed type. It seems this works! I had the
idea for this test while walking around aimlessly.
The simple type unification algorithm suffered from some serious
performance and memory problems when used with certain code bases. This
adds some crucial optimizations that improve performance drastically.
This quotes printed strings that contain special characters such as
newline. This changes the output of some tests, but makes future tests
that include a raw \n more appropriate.
When include-ing a class, we propagated the scope of the include into
the class instead of using the correct scope that existed when the class
was defined and instead propagating only the include arguments in.
This patch fixes the issue and adds a ton of tests as well. It also
propagates the scope into the include args, in case that is needed, and
adds a test for that as well.
Thanks to Nicolas Charles for the initial bug report.
If you run an import, you only include everything that's part of a
scope. This includes, variables, classes, and functions. Anything else
should cause a compile error. This cleans up the error by adding a
String() method to each Stmt in our AST.
The engine core had some unfortunate bugs that were the result of some
early design errors when I wasn't as familiar with channels. I've
finally rewritten most of the bad parts, and I think it's much more
logical and stable now.
This also simplifies the resource API, since more of the work is done
completely in the engine, and hidden from view.
Lastly, this adds a few new metaparameters and associated code.
There are still some open problems left to solve, but hopefully this
brings us one step closer.
I found a case where we had two missing unification rules. Now fixed in
the previous commits, and including this test to show I'm responsible.
I've added the same test in two locations for redundancy and as an
example.
These weren't yet exposed in mcl. They're now available under the same
Meta namespace as the normal meta param structs. Even though they live
as a separate trait, they should be exposed together for a consistent
interface in mcl. If autoedge or autogroup ever grow additional params,
we can always add: `Meta:autoedge:something` to break it down further.
This adds a core looping construct by allowing a list of names to build
a resource. They'll all have the same parameters, but they'll
intelligently add the correct list of edges that they'd individually
create.
Constructs like these are one reason we do NOT have actual looping
functionality in the language, and it should stay that way.
These two cases should be allowed in our language. This is something
that puppet got wrong, and hopefully this makes writing modules more
sane in mcl, since two modules both depending on a "cowsay" package
won't cause compile errors.
This only checks the language. The de-duplication is done there. We
don't currently have a check for this in the engine. (We should!)
This enables imports in mcl code, and is one of last remaining blockers
to using mgmt. Now we can start writing standalone modules, and adding
standard library functions as needed. There's still lots to do, but this
was a big missing piece. It was much harder to get right than I had
expected, but I think it's solid!
This unfortunately large commit is the result of some wild hacking I've
been doing for the past little while. It's the result of a rebase that
broke many "wip" commits that tracked my private progress, into
something that's not gratuitously messy for our git logs. Since this was
a learning and discovery process for me, I've "erased" the confusing git
history that wouldn't have helped. I'm happy to discuss the dead-ends,
and a small portion of that code was even left in for possible future
use.
This patch includes:
* A change to the cli interface:
You now specify the front-end explicitly, instead of leaving it up to
the front-end to decide when to "activate". For example, instead of:
mgmt run --lang code.mcl
we now do:
mgmt run lang --lang code.mcl
We might rename the --lang flag in the future to avoid the awkward word
repetition. Suggestions welcome, but I'm considering "input". One
side-effect of this change, is that flags which are "engine" specific
now must be specified with "run" before the front-end name. Eg:
mgmt run --tmp-prefix lang --lang code.mcl
instead of putting --tmp-prefix at the end. We also changed the GAPI
slightly, but I've patched all code that used it. This also makes things
consistent with the "deploy" command.
* The deploys are more robust and let you deploy after a run
This has been vastly improved and let's mgmt really run as a smart
engine that can handle different workloads. If you don't want to deploy
when you've started with `run` or if one comes in, you can use the
--no-watch-deploy option to block new deploys.
* The import statement exists and works!
We now have a working `import` statement. Read the docs, and try it out.
I think it's quite elegant how it fits in with `SetScope`. Have a look.
As a result, we now have some built-in functions available in modules.
This also adds the metadata.yaml entry-point for all modules. Have a
look at the examples or the tests. The bulk of the patch is to support
this.
* Improved lang input parsing code:
I re-wrote the parsing that determined what ran when we passed different
things to --lang. Deciding between running an mcl file or raw code is
now handled in a more intelligent, and re-usable way. See the inputs.go
file if you want to have a look. One casualty is that you can't stream
code from stdin *directly* to the front-end, it's encapsulated into a
deploy first. You can still use stdin though! I doubt anyone will notice
this change.
* The scope was extended to include functions and classes:
Go forth and import lovely code. All these exist in scopes now, and can
be re-used!
* Function calls actually use the scope now. Glad I got this sorted out.
* There is import cycle detection for modules!
Yes, this is another dag. I think that's #4. I guess they're useful.
* A ton of tests and new test infra was added!
This should make it much easier to add new tests that run mcl code. Have
a look at TestAstFunc1 to see how to add more of these.
As usual, I'll try to keep these commits smaller in the future!