This isn't perfect yet, but we're trying to do this incrementally, and
merge whatever we can as early as possible.
During this work, I realized that the Simplify method of the exclusive
could probably be improved, and possibly receive a better signature.
This work will have to happen later.
This is meant as an incremental step into the new unification. Hopefully
it doesn't break anything and that we can rip out the old polymorphisms
work soon.
This teaches the compiler to catch entries with duplicate fields, and
duplicate meta entries, because it could be ambiguous to determine which
should take precedence. For example, if you specified `content` to a
file resource twice, this should error. This is known statically, so we
can catch it. If you specified two `Meta:noop` entries, this can also be
caught.
The interesting part happens when you specify one `Meta:noop` entry, and
one `Meta` entry which happens to contain a noop field in the struct.
For this, we actually have to wait until type unification is finished,
and catch the error there. This is because after type unification we
will know the precise type of the struct being passed to `Meta`, and so
we can look at its field names, even if their values aren't yet known
because the graph hasn't run yet.
We should probably move these into the central interfaces package so
that these can be used from multiple places. They don't have any
dependencies, and it doesn't make sense to have the solver code mixed in
to the same package. Overall the interface being implemented here could
probably be improved, but that's a project for another day.
The original string interpolation was based on hil which didn't allow
proper escaping, since they used a different escape pattern. Secondly,
the golang Unquote function didn't deal with the variable substitution,
which meant it had to be performed in a second step.
Most importantly, because we did this partial job in Unquote (the fact
that is strips the leading and trailing quotes tricked me into thinking
I was done with interpolation!) it was impossible to remedy the
remaining parts in a second pass with hil. Both operations needs to be
done in a single step. This is logical when you aren't tunnel visioned.
This patch replaces both of these so that string interpolation works
properly. This removes the ability to allow inline function calls in a
string, however this was an incidental feature, and it's not clear that
having it is a good idea. It also requires you wrap the var name with
curly braces. (They are not optional.)
This comes with a load of tests, but I think I got some of it wrong,
since I'm quite new at ragel. If you find something, please say so =D In
any case, this is much better than the original hil implementation, and
easy for a new contributor to patch to make the necessary fixes.
Replace existing field-mapping code with calls to types.Into() to
reflect the mcl data into the Go resource struct with finer granularity
and accuracy, and less reliance on the magic reflect.Set() function.
One major advantage over reflect.Value.Set() is Into() allows tailoring
the data that is set, specifically when coercing mcl struct values into
Golang struct values, the fields can be appropriately mapped based on
the lang tag on the struct field. With reflect.Value.Set() this was not
at all possible as there was a contradiction of logic given the
following rules:
- mcl struct fields must have lowercase names
- Golang struct fields with lowercase names are unexported
- Golang reflection does not allow modifying unexported fields
Prior to this change, it was impossible to map an mcl inline struct in a
resource to the matched Golang counterpart, even if the lang tag was
present on the struct field. This can be demonstrated with the following
trivial example:
test "name" {
key => struct{
name => "hello",
},
}
and the accompanying Golang resource definition:
type TestRes struct {
traits.Base
traits.Edgeable
Key struct {
Name string `lang:"name"`
} `lang:"key"`
}
Due to the mismatch in field names in the embedded struct, the type
unifier failed and refused to match mcl 'name' to Go 'Name' due to the
missing mapping logic.
Signed-off-by: Joe Groocock <me@frebib.net>
Struct types with duplicate fields are invalid types and weren't caught
by the parser. This fixes the issue and adds some associated tests. It
also checks and tests for duplicate struct value field names.
As a technical side-note, this doesn't change the lang/types/ functions
to remove panics-- the signatures are simplified to make their use
simple, and we intentionally panic if they're used incorrectly. In this
case, one was being used without having previously validated the input.
Thanks to Patrick Meyer for finding this issue via fuzzing!
Since we focus on safety, it would be nice to reduce the chance of any
runtime errors if we made a typo for a resource parameter. With this
patch, each resource can export constants into the global namespace so
that typos would cause a compile error.
Of course in the future if we had a more advanced type system, then we
could support precise types for each individual resource param, but in
an attempt to keep things simple, we'll leave that for another day. It
would add complexity too if we ever wanted to store a parameter
externally.
Lastly, we might consider adding "special case" parsing so that directly
specified fields would parse intelligently. For example, we could allow:
file "/tmp/hello" {
state => exists, # magic sugar!
}
This isn't supported for now, but if it works after all the other parser
changes have been made, it might be something to consider.
This ensures that docstring comments are wrapped to 80 chars. ffrank
seemed to be making this mistake far too often, and it's a silly thing
to look for manually. As it turns out, I've made it too, as have many
others. Now we have a test that checks for most cases. There are still a
few stray cases that aren't checked automatically, but this can be
improved upon if someone is motivated to do so.
Before anyone complains about the 80 character limit: this only checks
docstring comments, not source code length or inline source code
comments. There's no excuse for having docstrings that are badly
reflowed or over 80 chars, particularly if you have an automated test.
Since this was an early form of the modern data struct, remove those and
pass in the correct data. This is also important in case we have
something more complex inside our string interpolation!
This should give us options as to how a function should interact with an
FS. I feel like it's cleaner to go through the World API, and passing in
the FsURI lets us do that, but I passed in the Fs at the same time in
case it's useful for some reason. I think using it is a boundary
violation, but it's just a hunch. Does anything break when we move from
one deploy to the next?
We weren't calling Init on some functions which should have had this
done. I'm not sure whether this is the right place, or if it should be
elsewhere as part of the scope building process. Good enough for now.
Sometimes certain internal functions might want to get some data from
the AST or from something relating to the state of the language. This
adds a method to pass in that data. For now it's a very simple method,
but we could generalize it in the future if it becomes more useful.
This adds a giant missing piece of the language: proper function values!
It is lovely to now understand why early programming language designers
didn't implement these, but a joy to now reap the benefits of them. In
adding these, many other changes had to be made to get them to "fit"
correctly. This improved the code and fixed a number of bugs.
Unfortunately this touched many areas of the code, and since I was
learning how to do all of this for the first time, I've squashed most of
my work into a single commit. Some more information:
* This adds over 70 new tests to verify the new functionality.
* Functions, global variables, and classes can all be implemented
natively in mcl and built into core packages.
* A new compiler step called "Ordering" was added. It is called by the
SetScope step, and determines statement ordering and shadowing
precedence formally. It helped remove at least one bug and provided the
additional analysis required to properly capture variables when
implementing function generators and closures.
* The type unification code was improved to handle the new cases.
* Light copying of Node's allowed our function graphs to be more optimal
and share common vertices and edges. For example, if two different
closures capture a variable $x, they'll both use the same copy when
running the function, since the compiler can prove if they're identical.
* Some areas still need improvements, but this is ready for mainstream
testing and use!
If a test failed in stage 2 (fail2) instead of an expected fail in stage
3 (fail3) then it would continue running, which was an undefined
behaviour in our API. IOW we should not run Unify if SetScope failed.
This patch adds these additional checks to ensure our tests are more
robust.
The simple type unification algorithm suffered from some serious
performance and memory problems when used with certain code bases. This
adds some crucial optimizations that improve performance drastically.
This quotes printed strings that contain special characters such as
newline. This changes the output of some tests, but makes future tests
that include a raw \n more appropriate.
When include-ing a class, we propagated the scope of the include into
the class instead of using the correct scope that existed when the class
was defined and instead propagating only the include arguments in.
This patch fixes the issue and adds a ton of tests as well. It also
propagates the scope into the include args, in case that is needed, and
adds a test for that as well.
Thanks to Nicolas Charles for the initial bug report.
Part of this was rotten, and not fully functional. This fixes the rot,
adds some tests, and improves the type checking that occurs when sending
and receiving values. In addition, a significant portion of this happens
at compile time.
There is still more work to be done here, but this should get us a good
chunk of the way for now.
If you run an import, you only include everything that's part of a
scope. This includes, variables, classes, and functions. Anything else
should cause a compile error. This cleans up the error by adding a
String() method to each Stmt in our AST.
The engine core had some unfortunate bugs that were the result of some
early design errors when I wasn't as familiar with channels. I've
finally rewritten most of the bad parts, and I think it's much more
logical and stable now.
This also simplifies the resource API, since more of the work is done
completely in the engine, and hidden from view.
Lastly, this adds a few new metaparameters and associated code.
There are still some open problems left to solve, but hopefully this
brings us one step closer.
This continues the earlier patch that allowed resource names to be lists
of strings so that edges can now allow the same. This also includes a
new fancy test!
I forgot to ensure that the type of the final expression matched the
type of each of the branches. It's rare, but possible for this to occur.
Luckily, this never would have caused a panic, because the func engine
would have caught the issue anyways, but it's still better we catch it
here first!
I forgot to include these two invariants which are occasionally
necessary, although in most cases they're necessary to prevent incorrect
code from getting past unification. In any case, they would have been
caught by the engine.
These weren't yet exposed in mcl. They're now available under the same
Meta namespace as the normal meta param structs. Even though they live
as a separate trait, they should be exposed together for a consistent
interface in mcl. If autoedge or autogroup ever grow additional params,
we can always add: `Meta:autoedge:something` to break it down further.
This adds a core looping construct by allowing a list of names to build
a resource. They'll all have the same parameters, but they'll
intelligently add the correct list of edges that they'd individually
create.
Constructs like these are one reason we do NOT have actual looping
functionality in the language, and it should stay that way.
Instead of adding complexity to the unification engine, we can add a
fake placeholder expression that is unreachable by the AST, but used for
unification so that we can ensure a "wrap" invariant has some contents.
Ideally we'd improve the unification engine, but we'll leave that for
the future, and it's easy to revert this one commit in the future.
This enables imports in mcl code, and is one of last remaining blockers
to using mgmt. Now we can start writing standalone modules, and adding
standard library functions as needed. There's still lots to do, but this
was a big missing piece. It was much harder to get right than I had
expected, but I think it's solid!
This unfortunately large commit is the result of some wild hacking I've
been doing for the past little while. It's the result of a rebase that
broke many "wip" commits that tracked my private progress, into
something that's not gratuitously messy for our git logs. Since this was
a learning and discovery process for me, I've "erased" the confusing git
history that wouldn't have helped. I'm happy to discuss the dead-ends,
and a small portion of that code was even left in for possible future
use.
This patch includes:
* A change to the cli interface:
You now specify the front-end explicitly, instead of leaving it up to
the front-end to decide when to "activate". For example, instead of:
mgmt run --lang code.mcl
we now do:
mgmt run lang --lang code.mcl
We might rename the --lang flag in the future to avoid the awkward word
repetition. Suggestions welcome, but I'm considering "input". One
side-effect of this change, is that flags which are "engine" specific
now must be specified with "run" before the front-end name. Eg:
mgmt run --tmp-prefix lang --lang code.mcl
instead of putting --tmp-prefix at the end. We also changed the GAPI
slightly, but I've patched all code that used it. This also makes things
consistent with the "deploy" command.
* The deploys are more robust and let you deploy after a run
This has been vastly improved and let's mgmt really run as a smart
engine that can handle different workloads. If you don't want to deploy
when you've started with `run` or if one comes in, you can use the
--no-watch-deploy option to block new deploys.
* The import statement exists and works!
We now have a working `import` statement. Read the docs, and try it out.
I think it's quite elegant how it fits in with `SetScope`. Have a look.
As a result, we now have some built-in functions available in modules.
This also adds the metadata.yaml entry-point for all modules. Have a
look at the examples or the tests. The bulk of the patch is to support
this.
* Improved lang input parsing code:
I re-wrote the parsing that determined what ran when we passed different
things to --lang. Deciding between running an mcl file or raw code is
now handled in a more intelligent, and re-usable way. See the inputs.go
file if you want to have a look. One casualty is that you can't stream
code from stdin *directly* to the front-end, it's encapsulated into a
deploy first. You can still use stdin though! I doubt anyone will notice
this change.
* The scope was extended to include functions and classes:
Go forth and import lovely code. All these exist in scopes now, and can
be re-used!
* Function calls actually use the scope now. Glad I got this sorted out.
* There is import cycle detection for modules!
Yes, this is another dag. I think that's #4. I guess they're useful.
* A ton of tests and new test infra was added!
This should make it much easier to add new tests that run mcl code. Have
a look at TestAstFunc1 to see how to add more of these.
As usual, I'll try to keep these commits smaller in the future!
This adds a new method to the *StmtProg that lets us determine if the
prog contains only what is necessary for a scope and nothing more. This
is useful because that is exactly what is produced when doing an import.
With this detection method, we can know if a module contains dead code
that might mislead the user into thinking it will get run when it won't.