lang: Initial implementation of the mgmt language
This is an initial implementation of the mgmt language. It is a declarative
(immutable) functional, reactive, domain specific programming language. It is
intended to be a language that is:

* safe
* powerful
* easy to reason about

With these properties, we hope this language, and the mgmt engine will allow you
to model the real-time systems that you'd like to automate.

This also includes a number of other associated changes. Sorry for the large
size of this patch.
432
docs/language-guide.md
Normal file
@@ -0,0 +1,432 @@
|
||||
# Language guide
|
||||
|
||||
## Overview
|
||||
The `mgmt` tool has various frontends, each of which may produce a stream of
|
||||
zero or more graphs that are passed to the engine for desired state
|
||||
application. In almost all scenarios, you're going to want to use the language
|
||||
frontend. This guide describes some of the internals of the language.
|
||||
|
||||
## Theory
|
||||
The mgmt language is a declarative (immutable) functional, reactive programming
|
||||
language. It is implemented in `golang`. A longer introduction to the language
|
||||
is coming soon!
|
||||
|
||||
### Types
|
||||
All expressions must have a type. A composite type such as a list of strings
|
||||
(`[]str`) is different from a list of integers (`[]int`).
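
To make the distinction between composite types concrete, here is a minimal sketch of how such types could be modelled and compared. The `Kind`, `Type`, `String` and `Equal` names are assumptions made for this illustration; they are not the actual API found in `lang/types/`.

```golang
package main

import "fmt"

// Kind is a hypothetical tag for the basic shape of a type.
type Kind int

const (
	KindBool Kind = iota
	KindStr
	KindInt
	KindFloat
	KindList
)

// Type is a hypothetical, simplified type description. Val holds the element
// type when Kind is KindList, and is nil otherwise.
type Type struct {
	Kind Kind
	Val  *Type
}

// String renders the type using the notation from this guide, eg: []str.
func (t *Type) String() string {
	switch t.Kind {
	case KindBool:
		return "bool"
	case KindStr:
		return "str"
	case KindInt:
		return "int"
	case KindFloat:
		return "float"
	case KindList:
		return "[]" + t.Val.String()
	}
	return "unknown"
}

// Equal reports whether two types are structurally identical.
func (t *Type) Equal(o *Type) bool {
	if t.Kind != o.Kind {
		return false
	}
	if t.Kind == KindList {
		return t.Val.Equal(o.Val)
	}
	return true
}

func main() {
	listOfStr := &Type{Kind: KindList, Val: &Type{Kind: KindStr}}
	listOfInt := &Type{Kind: KindList, Val: &Type{Kind: KindInt}}
	fmt.Println(listOfStr, listOfInt, listOfStr.Equal(listOfInt))
}
```

Because the element type is part of the list type, the comparison has to recurse into it, which is why `[]str` and `[]int` are never considered equal.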
|
||||
|
||||
There _is_ a _variant_ type in the language's type system, but it is only used
|
||||
internally and only appears briefly when needed for type unification hints
|
||||
during static polymorphic function generation. This is an advanced topic which
|
||||
is not required for normal usage of the software.
|
||||
|
||||
The implementation of the internal types can be found in
|
||||
[lang/types/](https://github.com/purpleidea/mgmt/tree/master/lang/types/).
|
||||
|
||||
#### bool
|
||||
A `true` or `false` value.
|
||||
|
||||
#### str
|
||||
Any `"string!"` enclosed in quotes.
|
||||
|
||||
#### int
|
||||
A number like `42` or `-13`. Integers are represented internally as golang's
|
||||
`int64`.
|
||||
|
||||
#### float
|
||||
A floating point number like: `3.1415926`. Floats are represented internally as
|
||||
golang's `float64`.
|
||||
|
||||
#### list
|
||||
An ordered collection of values of the same type, eg: `[6, 7, 8, 9,]`. It is
|
||||
worth mentioning that empty lists have a type, although without type hints it
|
||||
can be impossible to infer the item's type.
|
||||
|
||||
#### map
|
||||
An unordered set of unique keys of the same type and corresponding value pairs
|
||||
of another type, eg: `{"boiling" => 100, "freezing" => 0, "room" => 25, "house" => 22, "canada" => -30,}`.
|
||||
That is to say, all of the keys must have the same type, and all of the values
|
||||
must have the same type. You can use any type for either, although it is
|
||||
probably advisable to avoid using very complex types as map keys.
|
||||
|
||||
#### struct
|
||||
An ordered set of field names and corresponding values, each of their own type,
|
||||
eg: `struct{answer => "42", james => "awesome", is_mgmt_awesome => true,}`.
|
||||
These are useful for combining more than one type into the same value. Note the
|
||||
syntactical difference between these and maps: the keys in maps have types,
|
||||
and as a result, string keys are enclosed in quotes, whereas struct _fields_ are
|
||||
not string values, and as such are bare and specified without quotes.
|
||||
|
||||
#### func
|
||||
An ordered set of optionally named, differently typed input arguments, and a
|
||||
return type, eg: `func(s str) int` or:
|
||||
`func(bool, []str, {str: float}) struct{foo str; bar int}`.
|
||||
|
||||
### Expressions
|
||||
Expressions, and the `Expr` interface need to be better documented. For now
|
||||
please consume
|
||||
[lang/interfaces/ast.go](https://github.com/purpleidea/mgmt/tree/master/lang/interfaces/ast.go).
|
||||
These docs will be expanded on when things are more certain to be stable.
|
||||
|
||||
### Statements
|
||||
Statements, and the `Stmt` interface need to be better documented. For now
|
||||
please consume
|
||||
[lang/interfaces/ast.go](https://github.com/purpleidea/mgmt/tree/master/lang/interfaces/ast.go).
|
||||
These docs will be expanded on when things are more certain to be stable.
|
||||
|
||||
### Stages
|
||||
The mgmt compiler runs in a number of stages. In order of execution they are:
|
||||
* [Lexing](#lexing)
|
||||
* [Parsing](#parsing)
|
||||
* [Interpolation](#interpolation)
|
||||
* [Scope propagation](#scope-propagation)
|
||||
* [Type unification](#type-unification)
|
||||
* [Function graph generation](#function-graph-generation)
|
||||
* [Function engine creation and validation](#function-engine-creation-and-validation)
|
||||
|
||||
All of the above needs to be done every time the source code changes. After this
|
||||
point, the [function engine runs](#function-engine-running-and-interpret) and
|
||||
produces events. On every event, we "[interpret](#function-engine-running-and-interpret)"
|
||||
which produces a resource graph. This series of resource graphs is passed
|
||||
to the engine as they are produced.
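
The ordering above can be pictured as a simple pipeline that stops at the first failing stage. The following is only a sketch: the `stage` type, the `runStages` function and the `errConflict` value are hypothetical names for this illustration, and the real compiler is not structured as a list of closures.

```golang
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical error used to simulate a failing stage.
var errConflict = errors.New("conflict")

// stage is a hypothetical named compilation step.
type stage struct {
	name string
	run  func() error
}

// runStages runs each stage in order and stops at the first failure. It returns
// the names of the stages that completed successfully.
func runStages(stages []stage) ([]string, error) {
	done := []string{}
	for _, s := range stages {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s failed: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	fail := func() error { return errConflict }
	stages := []stage{
		{name: "lexing and parsing", run: ok},
		{name: "interpolation", run: ok},
		{name: "scope propagation", run: ok},
		{name: "type unification", run: fail},
		{name: "function graph generation", run: ok},
	}
	done, err := runStages(stages)
	fmt.Println(done, err)
}
```

Here the simulated type unification failure means that function graph generation never runs, which mirrors the requirement that every earlier stage must succeed before the function engine can start.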
|
||||
|
||||
What follows are some notes about each step.
|
||||
|
||||
#### Lexing
|
||||
Lexing is done using [nex](https://github.com/blynn/nex). It is a pure-golang
|
||||
implementation which is similar to _Lex_ or _Flex_, but which produces golang
|
||||
code instead of C. It integrates reasonably well with golang's _yacc_ which is
|
||||
used for parsing. The token definitions are in:
|
||||
[lang/lexer.nex](https://github.com/purpleidea/mgmt/tree/master/lang/lexer.nex).
|
||||
Lexing and parsing run together by calling the `LexParse` method.
|
||||
|
||||
#### Parsing
|
||||
The parser used is golang's implementation of
|
||||
[yacc](https://godoc.org/golang.org/x/tools/cmd/goyacc). The documentation is
|
||||
quite abysmal, so it's helpful to rely on the documentation from standard yacc
|
||||
and trial and error. One small advantage goyacc has over standard yacc is that it
|
||||
can produce error messages from examples. The best documentation is to examine
|
||||
the source. There is a short write up available [here](https://research.swtch.com/yyerror).
|
||||
The yacc file exists at:
|
||||
[lang/parser.y](https://github.com/purpleidea/mgmt/tree/master/lang/parser.y).
|
||||
Lexing and parsing run together by calling the `LexParse` method.
|
||||
|
||||
#### Interpolation
|
||||
Interpolation is used to transform the AST (which was produced from lexing and
|
||||
parsing) into a new AST, which may be identical to the original. It expands strings
|
||||
which might contain expressions to be interpolated (eg: `"the answer is: ${foo}"`)
|
||||
and can be used for other scenarios in which one statement or expression would
|
||||
be better represented by a larger AST. Most nodes in the AST simply return their
|
||||
own node address, and do not modify the AST.
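
As a rough illustration of the string case, the sketch below splits an interpolated string into literal and variable pieces. The `part` type and `split` function are hypothetical names for this example, and the sketch does not show how the real implementation rebuilds those pieces into AST nodes.

```golang
package main

import (
	"fmt"
	"strings"
)

// part is a hypothetical piece of an interpolated string: either a literal
// chunk of text, or the name of a variable that was written as ${name}.
type part struct {
	text  string
	isVar bool
}

// split breaks a string such as "the answer is: ${foo}" into literal and
// variable parts. An unterminated "${" is kept as literal text.
func split(s string) []part {
	parts := []part{}
	for len(s) > 0 {
		start := strings.Index(s, "${")
		if start < 0 {
			break
		}
		end := strings.Index(s[start:], "}")
		if end < 0 {
			break
		}
		end += start // make the index relative to s again
		if start > 0 {
			parts = append(parts, part{text: s[:start]})
		}
		parts = append(parts, part{text: s[start+2 : end], isVar: true})
		s = s[end+1:]
	}
	if len(s) > 0 {
		parts = append(parts, part{text: s})
	}
	return parts
}

func main() {
	for _, p := range split("the answer is: ${foo}") {
		fmt.Printf("%q var=%t\n", p.text, p.isVar)
	}
}
```

A string without any `${...}` sequence comes back as a single literal part, which corresponds to the common case where a node is returned unchanged.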
|
||||
|
||||
#### Scope propagation
|
||||
Scope propagation passes the parent scope (starting with the top-level, built-in
|
||||
scope) down through the AST. This is necessary so that child nodes can access
variables in the scope if needed. Most AST nodes simply pass on the scope
without making any changes. The `ExprVar` node naturally consumes scopes and
|
||||
the `StmtProg` node cleverly passes the scope through in the order expected for
|
||||
the out-of-order bind logic to work.
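
The following sketch shows the general shape of passing a scope down a tree. All of the names here (`scope`, `node`, `varNode`, `listNode`, `setScope`) are hypothetical, and values are plain integers to keep the example short; it does not attempt to model the out-of-order bind logic of `StmtProg`.

```golang
package main

import "fmt"

// scope is a hypothetical mapping from variable names to values.
type scope map[string]int

// node is a hypothetical AST node that can receive a scope from its parent.
type node interface {
	setScope(scope) error
}

// varNode consumes the scope: it looks up its own name.
type varNode struct {
	name  string
	value int
}

func (v *varNode) setScope(s scope) error {
	val, ok := s[v.name]
	if !ok {
		return fmt.Errorf("variable %s is not in scope", v.name)
	}
	v.value = val
	return nil
}

// listNode does not change the scope, it only passes it on to its children.
type listNode struct {
	children []node
}

func (l *listNode) setScope(s scope) error {
	for _, c := range l.children {
		if err := c.setScope(s); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	x := &varNode{name: "x"}
	root := &listNode{children: []node{&listNode{children: []node{x}}}}
	err := root.setScope(scope{"x": 42})
	fmt.Println(x.value, err)
}
```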
|
||||
|
||||
#### Type unification
|
||||
Each expression must have a known type. The unpleasant option is to force the
|
||||
programmer to specify by annotation every type throughout their whole program
|
||||
so that each `Expr` node in the AST knows what to expect. Type annotation is
|
||||
allowed in situations when you want to explicitly specify a type, or when the
|
||||
compiler cannot deduce it; however, most types can usually be inferred.
|
||||
|
||||
For type inference to work, each node in the AST implements a `Unify` method
which is able to return a list of invariants that must hold true. This starts at
the topmost AST node, and gets called through to its children to assemble a
|
||||
giant list of invariants. The invariants can take different forms. They can
|
||||
specify that a particular expression must have a particular type, or they can
|
||||
specify that two expressions must have the same types. More complex invariants
|
||||
allow you to specify relationships between different types and expressions.
|
||||
Furthermore, invariants can allow you to specify that only one invariant out of
|
||||
a set must hold true.
|
||||
|
||||
Once the list of invariants has been collected, they are run through an
|
||||
invariant solver. The solver can either return successfully or with an
error. If the solver returns successfully, it means that it has found a trivial
mapping between every expression and its corresponding type. At this point it
|
||||
is a simple task to run `SetType` on every expression so that the types are
|
||||
known. If the solver returns in error, it is usually due to one of two
|
||||
possibilities:
|
||||
|
||||
1. Ambiguity
|
||||
|
||||
The solver does not have enough information to make a definitive or
|
||||
unique determination about the expression to type mappings. The set of
|
||||
invariants is ambiguous, and we cannot continue. An error will be
|
||||
returned to the programmer. In this scenario the user will probably need
|
||||
to add a type annotation, possibly because of a design bug in the user's
|
||||
program.
|
||||
|
||||
2. Conflict
|
||||
|
||||
The solver has conflicting information that cannot be reconciled. In
|
||||
this situation an explicit conflict has been found. If two invariants
|
||||
are found which both expect a particular expression to have different
|
||||
types, then it is not possible to find a valid solution. This almost
|
||||
always happens if the user has made a type error in their program.
|
||||
|
||||
Only one solver currently exists, but it is possible to easily plug in an
|
||||
alternate implementation if someone more skilled in the art of solver design
|
||||
would like to propose a more logical or performant variant.
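
To give a feel for the two failure modes, here is a deliberately tiny solver sketch. It only understands two kinds of invariant (an expression has a given type, or two expressions share a type), and types are plain strings. The `invariant` type, the `solve` function and both error values are assumptions for this illustration and do not reflect the real solver's API or algorithm.

```golang
package main

import (
	"errors"
	"fmt"
)

// invariant is a hypothetical constraint. If typ is set, expression a must have
// that type. Otherwise expressions a and b must have the same type.
type invariant struct {
	a, b string
	typ  string
}

var (
	errAmbiguous = errors.New("ambiguous: not enough information")
	errConflict  = errors.New("conflict: incompatible types")
)

// solve returns a mapping from every expression to its type, or an error.
func solve(exprs []string, invs []invariant) (map[string]string, error) {
	solved := map[string]string{}
	// set records a type for an expression and detects conflicts.
	set := func(expr, typ string) (bool, error) {
		if old, ok := solved[expr]; ok {
			if old != typ {
				return false, fmt.Errorf("%w: %s is %s and %s", errConflict, expr, old, typ)
			}
			return false, nil
		}
		solved[expr] = typ
		return true, nil
	}
	// keep applying invariants until nothing new is learned
	for changed := true; changed; {
		changed = false
		for _, inv := range invs {
			var c bool
			var err error
			ta, okA := solved[inv.a]
			tb, okB := solved[inv.b]
			switch {
			case inv.typ != "":
				c, err = set(inv.a, inv.typ)
			case okA:
				c, err = set(inv.b, ta)
			case okB:
				c, err = set(inv.a, tb)
			}
			if err != nil {
				return nil, err
			}
			changed = changed || c
		}
	}
	for _, e := range exprs {
		if _, ok := solved[e]; !ok {
			return nil, fmt.Errorf("%w: %s", errAmbiguous, e)
		}
	}
	return solved, nil
}

func main() {
	exprs := []string{"$x", "$y"}
	invs := []invariant{
		{a: "$x", b: "$y"},    // $x and $y have the same type
		{a: "$y", typ: "int"}, // $y is an int
	}
	solution, err := solve(exprs, invs)
	fmt.Println(solution, err)
}
```

Removing the second invariant leaves nothing to pin down either type, which is the ambiguity case. Adding an invariant that `$x` is a `str` contradicts what was learned through `$y`, which is the conflict case.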
|
||||
|
||||
#### Function graph generation
|
||||
At this point we have a fully typed AST. The AST must now be transformed into a
|
||||
directed, acyclic graph (DAG) data structure that represents the flow of data as
|
||||
necessary for everything to be reactive. Note that this graph is *different*
|
||||
from the resource graph which is produced and sent to the engine. It is just a
|
||||
coincidence that both happen to be DAGs. (You don't freak out when you see a
|
||||
list data structure show up in more than one place, do you?)
|
||||
|
||||
To produce this graph, each node has a `Graph` method which it can call. This
|
||||
starts at the topmost node, and is called down through the AST. The edges in
|
||||
the graphs must represent the individual expression values which are passed
|
||||
from node to node. The names of the edges must match the function type argument
|
||||
names which are used in the definition of the corresponding function. These
|
||||
corresponding functions must exist for each expression node and are produced by
|
||||
calling that expression's `Func` method. These are usually called by the
|
||||
function engine during function creation and validation.
|
||||
|
||||
#### Function engine creation and validation
|
||||
Finally we have a graph of the data flows. The function engine must first
|
||||
initialize, which creates references to each of the necessary function
|
||||
implementations, and gets information about each one. It then needs to be type
|
||||
checked to ensure that the data flows all correctly match what is expected. If
|
||||
you were to pass an `int` to a function expecting a `bool`, this would be a
|
||||
problem. If all goes well, the program should get run shortly.
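
A minimal sketch of this kind of check follows. It models a function graph whose edges are named after the argument they feed, and verifies that the value flowing along each edge has the expected type. The `vertex`, `edge` and `validate` names are hypothetical, and types are plain strings for brevity.

```golang
package main

import "fmt"

// vertex is a hypothetical function node. args maps each argument name to the
// type it expects, and out is the type of the value that it produces.
type vertex struct {
	name string
	args map[string]string
	out  string
}

// edge carries the output of from into the argument of to named arg.
type edge struct {
	from, to *vertex
	arg      string
}

// validate checks that every edge points at a real argument, and that the
// value flowing along it has the type which that argument expects.
func validate(edges []edge) error {
	for _, e := range edges {
		want, ok := e.to.args[e.arg]
		if !ok {
			return fmt.Errorf("%s has no argument named %s", e.to.name, e.arg)
		}
		if e.from.out != want {
			return fmt.Errorf("%s.%s expects %s, but %s produces %s",
				e.to.name, e.arg, want, e.from.name, e.from.out)
		}
	}
	return nil
}

func main() {
	answer := &vertex{name: "answer", out: "int"}
	not := &vertex{name: "not", args: map[string]string{"b": "bool"}, out: "bool"}
	// an int is being passed to a function which expects a bool
	fmt.Println(validate([]edge{{from: answer, to: not, arg: "b"}}))
}
```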
|
||||
|
||||
#### Function engine running and interpret
|
||||
At this point the function engine runs. It produces a stream of events which
|
||||
cause the `Output()` method of the top-level program to run, which produces the
|
||||
list of resources and edges. These are then transformed into the resource graph
|
||||
which is passed to the engine.
|
||||
|
||||
### Function API
|
||||
If you'd like to create a built-in, core function, you'll need to implement the
|
||||
function API interface named `Func`. It can be found in
|
||||
[lang/interfaces/func.go](https://github.com/purpleidea/mgmt/tree/master/lang/interfaces/func.go).
|
||||
Your function must have a specific type. For example, a simple math function
|
||||
might have a signature of `func(x int, y int) int`. As you can see, all the
|
||||
types are known _before_ compile time.
|
||||
|
||||
What follows are each of the method signatures and a description of each.
|
||||
Failure to implement the API correctly can cause the function graph engine to
|
||||
block, or the program to panic.
|
||||
|
||||
### Info
|
||||
```golang
Info() *Info
```
|
||||
|
||||
The Info method must return a struct containing some information about your
|
||||
function. The struct has the following type:
|
||||
|
||||
```golang
type Info struct {
	Sig *types.Type // the signature of the function, must be KindFunc
}
```
|
||||
|
||||
You must implement this correctly. Other fields in the `Info` struct may be
|
||||
added in the future. This method is usually called before any other, and should
|
||||
not depend on any other method being called first. Other methods must not depend
|
||||
on this method being called first.
|
||||
|
||||
#### Example
|
||||
```golang
func (obj *FooFunc) Info() *interfaces.Info {
	return &interfaces.Info{
		Sig: types.NewType("func(a str, b int) float"),
	}
}
```
|
||||
|
||||
### Init
|
||||
```golang
Init(*Init) error
```
|
||||
|
||||
Init is called by the function graph engine to create an implementation of this
|
||||
function. It is passed in a struct of the following form:
|
||||
|
||||
```golang
type Init struct {
	Hostname string           // uuid for the host
	Input    chan types.Value // Engine will close `input` chan
	Output   chan types.Value // Stream must close `output` chan
	World    resources.World
	Debug    bool
	Logf     func(format string, v ...interface{})
}
```
|
||||
|
||||
These values and references may be used (wisely) inside your function. `Input`
|
||||
will contain a channel of input structs matching the expected input signature
|
||||
for your function. `Output` will be the channel which you must send values to
|
||||
whenever a new value should be produced. This must be done in the `Stream()`
|
||||
function. You may carefully use `World` to access functionality provided by the
|
||||
engine. You may use `Logf` to log informational messages, however there is no
|
||||
guarantee that they will be displayed to the user. `Debug` specifies whether the
|
||||
function is running in a user-requested debug mode. This might cause you to want
|
||||
to print more log messages for example. You will need to save references to any
|
||||
or all of these info fields that you wish to use in the struct implementing this
|
||||
`Func` interface. At a minimum you will need to save `Output`, since at least
one value must be produced.
|
||||
|
||||
#### Example
|
||||
Please see the example functions in
[lang/funcs/public/](https://github.com/purpleidea/mgmt/tree/master/lang/funcs/public/).
|
||||
|
||||
### Stream
|
||||
```golang
Stream() error
```
|
||||
|
||||
Stream is called by the function engine when it is ready for your function to
|
||||
start accepting input and producing output. You must always produce at least one
|
||||
value. Failure to produce at least one value will probably cause the function
|
||||
engine to hang waiting for your output. This function must close the `Output`
|
||||
channel when it has no more values to send. The engine will close the `Input`
|
||||
channel when it has no more values to send. This may or may not influence
|
||||
whether or not you close the `Output` channel.
|
||||
|
||||
#### Example
|
||||
Please see the example functions in
[lang/funcs/public/](https://github.com/purpleidea/mgmt/tree/master/lang/funcs/public/).
|
||||
|
||||
### Close
|
||||
```golang
Close() error
```
|
||||
|
||||
Close asks the particular function to shutdown its `Stream()` function and
|
||||
return.
|
||||
|
||||
#### Example
|
||||
Please see the example functions in
[lang/funcs/public/](https://github.com/purpleidea/mgmt/tree/master/lang/funcs/public/).
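
To show how `Init`, `Stream` and `Close` might fit together, here is a self-contained sketch. It is not taken from the mgmt source: the `Init` struct below is a cut-down stand-in that carries `int` values instead of `types.Value`, and `DoubleFunc` and its `closeChan` field are hypothetical names. The real interface also requires `Info`, which is omitted here.

```golang
package main

import "fmt"

// Init is a simplified, hypothetical stand-in for the struct described above.
// It uses int values instead of types.Value to stay self-contained.
type Init struct {
	Input  chan int // the engine closes this channel
	Output chan int // Stream must close this channel
}

// DoubleFunc is a hypothetical function that doubles each input value.
type DoubleFunc struct {
	init      *Init
	closeChan chan struct{}
}

// Init saves the references that Stream will need later.
func (obj *DoubleFunc) Init(handle *Init) error {
	obj.init = handle
	obj.closeChan = make(chan struct{})
	return nil
}

// Stream reads inputs and sends outputs until the input is closed or Close is
// called. It always closes Output when it returns.
func (obj *DoubleFunc) Stream() error {
	defer close(obj.init.Output)
	for {
		select {
		case v, ok := <-obj.init.Input:
			if !ok {
				return nil // no more inputs
			}
			select {
			case obj.init.Output <- v * 2:
			case <-obj.closeChan:
				return nil
			}
		case <-obj.closeChan:
			return nil
		}
	}
}

// Close asks Stream to shut down and return. It must only be called once.
func (obj *DoubleFunc) Close() error {
	close(obj.closeChan)
	return nil
}

func main() {
	handle := &Init{Input: make(chan int), Output: make(chan int)}
	f := &DoubleFunc{}
	if err := f.Init(handle); err != nil {
		panic(err)
	}
	go f.Stream()
	go func() {
		handle.Input <- 21
		close(handle.Input)
	}()
	for v := range handle.Output {
		fmt.Println(v)
	}
}
```

The inner `select` matters: without it, `Stream` could block forever on a send to `Output` after `Close` has been requested. In this sketch the caller plays the role of the engine by closing `Input`, and `Stream` responds by closing `Output`.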
|
||||
|
||||
### Polymorphic Function API
|
||||
For some functions, it might be helpful to be able to implement a function once,
|
||||
but to have multiple polymorphic variants that can be chosen at compile time.
|
||||
For this more advanced topic, you will need to use the
|
||||
[Polymorphic Function API](#polymorphic-function-api). This will help with code
|
||||
reuse when you have a small, finite number of possible type signatures, and also
|
||||
for more complicated cases where you might have an infinite number of possible
|
||||
type signatures. (eg: `[]str`, or `[][]str`, or `[][][]str`, etc...)
|
||||
|
||||
Suppose you want to implement a function which can assume different type
|
||||
signatures. The mgmt language does not support polymorphic types: you must use
static types throughout the language. However, it is legal to implement a
|
||||
function which can take different specific type signatures based on how it is
|
||||
used. For example, you might wish to add a math function which could take the
|
||||
form of `func(x int, y int) int` or `func(x float, y float) float` depending on
|
||||
the input values. You might also want to implement a function which takes an
|
||||
arbitrary number of input arguments (the number must be statically fixed at the
|
||||
compile time of your program though) and which returns a string.
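
The idea of picking one static signature from a small, finite set can be sketched as follows. This is not the `PolyFunc` interface: the `signatures` table and the `choose` function are hypothetical, and types are represented as plain strings.

```golang
package main

import "fmt"

// signatures lists the hypothetical variants that one implementation supports,
// keyed by the type shared by both input arguments.
var signatures = map[string]string{
	"int":   "func(x int, y int) int",
	"float": "func(x float, y float) float",
}

// choose picks the single static signature that matches the argument types
// seen at compile time, or returns an error if none of the variants fit.
func choose(argTypes []string) (string, error) {
	if len(argTypes) != 2 || argTypes[0] != argTypes[1] {
		return "", fmt.Errorf("no variant accepts arguments %v", argTypes)
	}
	sig, ok := signatures[argTypes[0]]
	if !ok {
		return "", fmt.Errorf("no variant accepts arguments %v", argTypes)
	}
	return sig, nil
}

func main() {
	sig, err := choose([]string{"float", "float"})
	fmt.Println(sig, err)
}
```

Once a variant has been chosen, the function behaves like any other statically typed function for the rest of the compile.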
|
||||
|
||||
The `PolyFunc` interface adds additional methods which you must implement to
|
||||
satisfy such a function implementation. If you'd like to implement such a
|
||||
function, then please notify the project authors, and they will expand this
|
||||
section with a longer description of the process.
|
||||
|
||||
#### Examples
|
||||
|
||||
What follows are a few examples that might help you understand some of the
|
||||
language details.
|
||||
|
||||
##### Example Foo
|
||||
TODO: please add an example here!
|
||||
|
||||
##### Example Bar
|
||||
TODO: please add an example here!
|
||||
|
||||
## Frequently asked questions
|
||||
(Send your questions as a patch to this FAQ! I'll review it, merge it, and
|
||||
respond by commit with the answer.)
|
||||
|
||||
### What is the difference between `ExprIf` and `StmtIf`?
|
||||
|
||||
The language contains both an `if` expression, and an `if` statement. An `if`
|
||||
expression takes a boolean conditional *and* it must contain exactly _two_
|
||||
branches (a `then` and an `else` branch) which each contain one expression. The
|
||||
`if` expression _will_ return the value of one of the two branches based on the
|
||||
conditional.
|
||||
|
||||
#### Example:
|
||||
```
# this is an if expression, and both branches must exist
$b = true
$x = if $b {
	42
} else {
	-13
}
```
|
||||
|
||||
The `if` statement also takes a boolean conditional, but it may have either one
|
||||
or two branches. Branches must only directly contain statements. The `if`
|
||||
statement does not return any value, but it does produce output when it is
|
||||
evaluated. The output consists primarily of resources (vertices) and edges.
|
||||
|
||||
#### Example:
|
||||
```
# this is an if statement, and in this scenario the else branch was omitted
$b = true
if $b {
	file "/tmp/hello" {
		content => "world",
	}
}
```
|
||||
|
||||
### I don't like the mgmt language, is there an alternative?
|
||||
|
||||
Yes, the language is just one of the available "frontends" that passes a stream
|
||||
of graphs to the engine "backend". While it _is_ the recommended way of using
|
||||
mgmt, you're welcome to either use an alternate frontend, or write your own. To
|
||||
write your own frontend, you must implement the
|
||||
[GAPI](https://github.com/purpleidea/mgmt/blob/master/gapi/gapi.go) interface.
|
||||
|
||||
### I'm an expert in FRP, and you got it all wrong; even the names of things!
|
||||
|
||||
I am certainly no expert in FRP, and I've certainly got lots more to learn. One
|
||||
thing FRP experts might notice is that some of the concepts from FRP are either
|
||||
named differently, or are notably absent.
|
||||
|
||||
In mgmt, we don't talk about behaviours, events, or signals in the strict FRP
|
||||
definitions of the words. Firstly, because we only support discretized streams
|
||||
of values with no plan to add continuous semantics. Secondly, because we prefer
|
||||
to use terms which are more natural and relatable to what our target audience is
|
||||
expecting. Our users are more likely to have a background in Physiology, or
|
||||
systems administration than a background in FRP.
|
||||
|
||||
Having said that, we hope that the FRP community will engage with us and help
|
||||
improve the parts that we got wrong. Even if that means adding continuous
|
||||
behaviours!
|
||||
|
||||
### This is brilliant, may I give you a high-five?
|
||||
|
||||
Thank you, and yes, probably. "Props" may also be accepted, although patches are
|
||||
preferred. If you can't do either, [donations](https://purpleidea.com/misc/donate/)
|
||||
to support the project are welcome too!
|
||||
|
||||
### Where can I find more information about mgmt?
|
||||
|
||||
Additional blog posts, videos and other material
|
||||
[is available!](https://github.com/purpleidea/mgmt/blob/master/docs/on-the-web.md).
|
||||
|
||||
## Suggestions
|
||||
|
||||
If you have any ideas for changes or other improvements to the language, please
|
||||
let us know! We're still pre 1.0 and pre 0.1 and happy to change it in order to
|
||||
get it right!
|
||||