This adds a modern type unification algorithm, which drastically
improves performance, particularly for bigger programs.
This required a change to the AST to add TypeCheck methods (for Stmt)
and Infer/Check methods (for Expr). This also changed how the functions
express their invariants, and as a result this was changed as well.
This greatly improves the way we express these invariants, and as a
result it makes adding new polymorphic functions significantly easier.
This also makes error output for the user a lot better in pretty much
all scenarios.
The one downside of this patch is that a good chunk of it is merged in
this giant single commit since it was hard to do it step-wise. That's
not the end of the world.
This couldn't be done without the guidance of Sam who helped me in
explaining, debugging, and writing all the sneaky algorithmic parts and
much more. Thanks again Sam!
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
Originally, I considered having more than one way to express the meta
param. After thinking about it for longer, it probably makes sense to
have a second meta param if necessary, and to avoid the exclusive.
This removes the exclusive from the res names and edge names. We now
require that the names should be lists of strings, however they can
still be single strings if that can be determined statically.
Programmers should explicitly wrap their variables in a string by
interpolation to force this, or in square brackets to force a list. The
former is generally preferable because it generates a small function
graph since it doesn't need to build a list.
If we had a single list wrapped in an interpolated string, it could
sneak through type unification, which is not correct. Wrapping a
variable by interpolation in a string, must force it to be a string.
This optimizes the list of type unification invariants that we generate
for the common case where a resource or edge name is known statically.
For one code base this halved the type unification time in half. More
work can be done though!
If we're importing an embedded module, we need to also include any
possible system scope imports (like functions) that might be using the
same namespace. These might be used by a custom CLI frontend to extend
the code with values from the CLI parsing, for example.
With the recent merging of embedded package imports and the entry CLI
package, it is now possible for users to build in mcl code into a single
binary. This additional permission makes it explicitly clear that this
is permitted to make it easier for those users. The condition is phrased
so that the terms can be "patched" by the original author if it's
necessary for the project. For example, if the name of the language
(mcl) changes, has a differently named new version, someone finds a
phrasing improvement or a legal loophole, or for some other
reasonable circumstance. Now go write some beautiful embedded tools!
This simplifies the API by passing through the filesystem so that
function signatures don't need to be as complicated, and furthermore use
that consistently throughout.
From the Go specification [1]:
"1. ... For a nil slice, the number of iterations is 0."
Therefore, an additional nil check for around the loop is unnecessary.
[1]: https://go.dev/ref/spec#For_range
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The Ordering and DAG detection code is challenging because we need
Ordering to do SetScope, but Ordering itself needs to know about scopes.
This improved variant should hopefully catch all the scenarios of
identically named variables causing invalid loops.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This implements a new type of syntactic sugar for the common pattern of
a base class which returns a child class, and so on. Instead of needing
to repeatedly indent the child classes, we can instead prefix them at
the definition site (where created with the class keyword) with the name
of the parent class, followed by a colon, to get the desired embedded
sugar.
For example, instead of writing:
class base() {
class inner() {
class deepest() {
}
}
}
You can instead write:
class base() {
}
class base:inner() {
}
class base:inner:deepest() {
}
Of course, you can only access any of the inner classes by first
including (with the include keyword) a parent class, and then
subsequently including the inner one.
This adds support for `include as <identifier>` type statements which in
addition to pulling in any defined resources, it also makes the contents
of the scope of the class available to the scope of the include
statement, but prefixed by the identifier specified.
This makes passing data between scopes much more powerful, and it also
allows classes to return useful classes for subsequent use.
This also improves the SetScope procedure and adds to the Ordering
stage. It's unclear if the current Ordering stage can handle all code,
or if there exist corner-cases which are valid code, but which would
produce a wrong or imprecise topological sort.
Some extraneous scoping bugs still exist, which expose certain variables
that we should not depend on in future code.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This adds ExprTopLevel and ExprSingleton and ensures that ExprBind is
now monomorphic.
This corrects a previous design bug where it was not monomorphic and
would thus cause spawning of many more copies than necessary. In most
cases this was only harmful to memory and performance, and not
behaviour, since these functions were pure, and we didn't have a test
for this.
This also adds a bunch more tests. Most notably, the graph shape tests
generally produce smaller graphs now.
Lastly, a lambda cannot have two different types when used at two
different call sites. It is rare that this would be used, and when it
would make sense, there are easy workarounds to accomplish equivalent
goals.
This was mostly authored by Sam, James helped with some cleanup and
debugging.
Co-authored-by: James Shubin <james@shubin.ca>
This is a newer implementation of the panic magic. I kept the old commit
in for posterity and to show the difference. The two versions are
identical to the end-user with one exception: the newer version doesn't
include a useless panic resource in the graph when there is no panic. In
this version, the panic function returns false and the if statement it's
the condition of, doesn't produce the resource within. On error, we
still consume the function in the if expression, and doing so causes
everything to shutdown.
The other benefit is that the implementation is much cleaner and doesn't
need the interpolate hack.
It's valuable to check your runtime values and to shut down the entire
engine in case something doesn't match. This patch adds some magic
plumbing to support a "panic" mechanism.
A new "panic" statement gets transparently converted into a panic
function and panic resource. The former errors if the input is not
empty. The latter must be present to consume the value, but doesn't
actually do anything.
This lets us add a resource that has an implementation with a field
whose type is determined at compile time. This let's us write more
flexible resources.
What's missing is additional type checking so that we guarantee that a
specific resource doesn't change types during run-time.
This is a big giant patch that implements the AST part of lambdas!
I don't know how Sam is able to understand the AST so well, but he does,
and we're all grateful for it. Most of this code was written by him.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This returns the type with the arg names we'll actually use. This is
helpful so we can pass values to the right places. We have named edges
so you can actually see what's going on.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This plumbs through the new Output method signature that accepts a table
of function pointers to values and relies on the previous storing of the
function pointers to be used for the lookup right now. This has the
elegant side-effect that Output generation could run in parallel with
the graph engine, as the engine only needs to pause to take a snapshot
of the current values tables.
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
The Graph signature changes are needed for future function work, and it
also fits in nicely with the need for storing the value pointer for each
function node. These are used to later extract values during the Output
stage.
Sam deserves all of the credit for realizing both of these points and
convincing me to make the change! It worked out great, cheers!
Co-authored-by: Samuel Gélineau <gelisam@gmail.com>
This adds a meta state store that is preserved between graph switches if
the kind and name match. This is useful so that rapid graph changes
don't necessarily reset their retry count if they've only changed one
resource field.
There were some bugs about setting resource fields that were structs
with various fields. This makes things more strict and correct. Now we
check for duplicate field names earlier (duplicates due to identical
aliases) and we also don't try and set private fields, or incorrectly
set partial structs.
Most interestingly, this also cleans up all of the resources and ensures
that each one has nicer docs and a clear struct tag for fields that we
want to use in mcl. These are mandatory now, and if you're missing the
tag, then we will ignore the field.