lang: Add module imports and more

This enables imports in mcl code, and is one of the last remaining blockers
to using mgmt. Now we can start writing standalone modules, and adding
standard library functions as needed. There's still lots to do, but this
was a big missing piece. It was much harder to get right than I had
expected, but I think it's solid!

This unfortunately large commit is the result of some wild hacking I've
been doing for the past little while. It's the result of a rebase that
reworked many "wip" commits that tracked my private progress into
something that's not gratuitously messy for our git logs. Since this was
a learning and discovery process for me, I've "erased" the confusing git
history that wouldn't have helped. I'm happy to discuss the dead-ends,
and a small portion of that code was even left in for possible future
use.

This patch includes:

* A change to the cli interface:
You now specify the front-end explicitly, instead of leaving it up to
the front-end to decide when to "activate". For example, instead of:

mgmt run --lang code.mcl

we now do:

mgmt run lang --lang code.mcl

We might rename the --lang flag in the future to avoid the awkward word
repetition. Suggestions welcome, but I'm considering "input". One
side-effect of this change is that flags which are "engine" specific
must now be specified with "run", before the front-end name. E.g.:

mgmt run --tmp-prefix lang --lang code.mcl

instead of putting --tmp-prefix at the end. We also changed the GAPI
slightly, but I've patched all code that used it. This also makes things
consistent with the "deploy" command.

* The deploys are more robust and let you deploy after a run:
This has been vastly improved and lets mgmt really run as a smart
engine that can handle different workloads. If you've started with `run`
and don't want a new deploy to take over, you can use the
--no-watch-deploy option to block new deploys.

* The import statement exists and works!
We now have a working `import` statement. Read the docs, and try it out.
I think it's quite elegant how it fits in with `SetScope`. Have a look.
As a result, we now have some built-in functions available in modules.
This also adds the metadata.yaml entry-point for all modules. Have a
look at the examples or the tests. The bulk of the patch is to support
this.
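
As a quick illustration, importing a core package from mcl might look roughly like this (a sketch only; check the docs and examples for the exact package layout, metadata.yaml format, and function names such as `fmt.printf`):

```mcl
# illustrative sketch: import a built-in package and use it in a resource
import "fmt"

$msg = fmt.printf("hello from %s", "mcl")

file "/tmp/mgmt-hello" {
	content => $msg,
}
```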

* Improved lang input parsing code:
I rewrote the parsing that determines what runs when we pass different
things to --lang. Deciding between running an mcl file or raw code is
now handled in a more intelligent, reusable way. See the inputs.go
file if you want to have a look. One casualty is that you can't stream
code from stdin *directly* to the front-end; it's encapsulated into a
deploy first. You can still use stdin, though! I doubt anyone will
notice this change.
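
The decision logic can be imagined as something like the following Go sketch. This is purely illustrative (the function name `classifyInput` and its cases are hypothetical); the real, more complete implementation lives in inputs.go:

```go
package main

import (
	"fmt"
	"strings"
)

// classifyInput is a toy sketch of deciding what kind of input was passed
// to --lang: stdin, an mcl file, a module directory, or raw code.
func classifyInput(s string) string {
	switch {
	case s == "-":
		return "stdin" // now encapsulated into a deploy first
	case strings.HasSuffix(s, ".mcl"):
		return "file" // lex/parse this file from the filesystem
	case strings.HasSuffix(s, "/"):
		return "directory" // a module dir with a metadata.yaml entry point
	default:
		return "code" // treat the string itself as raw mcl code
	}
}

func main() {
	fmt.Println(classifyInput("code.mcl"))    // file
	fmt.Println(classifyInput("if true { }")) // code
}
```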

* The scope was extended to include functions and classes:
Go forth and import lovely code. All these exist in scopes now, and can
be re-used!

* Function calls actually use the scope now. Glad I got this sorted out.
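
Conceptually, the extended scope is just a set of named maps that lookups walk through. Here's a toy Go sketch (this is not the real interfaces.Scope, which also carries classes and handles shadowing across imports):

```go
package main

import "fmt"

// Scope is a simplified model of the language scope, which now carries
// variables *and* functions (and, in the real thing, classes too).
type Scope struct {
	Variables map[string]string
	Functions map[string]func() string
}

func main() {
	scope := &Scope{
		Variables: map[string]string{"purpleidea": "hello world!"},
		Functions: map[string]func() string{
			"greet": func() string { return "hi" },
		},
	}
	// function calls now resolve through the scope, so imported
	// functions can be reused just like variables
	fmt.Println(scope.Variables["purpleidea"])
	fmt.Println(scope.Functions["greet"]())
}
```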

* There is import cycle detection for modules!
Yes, this is another DAG. I think that's #4. I guess they're useful.
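
A cycle check over a graph of imports boils down to a depth-first search that flags back edges. A generic Go sketch (not mgmt's pgraph code) might look like:

```go
package main

import "fmt"

// hasCycle reports whether the directed graph (as an adjacency list of
// module names) contains a cycle, using a three-color depth-first search:
// white = unvisited, grey = on the current DFS stack, black = done.
func hasCycle(adj map[string][]string) bool {
	const (white, grey, black = 0, 1, 2)
	color := map[string]int{} // zero value is white
	var visit func(string) bool
	visit = func(n string) bool {
		color[n] = grey
		for _, m := range adj[n] {
			if color[m] == grey {
				return true // back edge: an import cycle
			}
			if color[m] == white && visit(m) {
				return true
			}
		}
		color[n] = black
		return false
	}
	for n := range adj {
		if color[n] == white && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	ok := map[string][]string{"a": {"b"}, "b": {"c"}, "c": nil}
	bad := map[string][]string{"a": {"b"}, "b": {"a"}} // a <-> b
	fmt.Println(hasCycle(ok), hasCycle(bad))           // false true
}
```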

* A ton of tests and new test infra was added!
This should make it much easier to add new tests that run mcl code. Have
a look at TestAstFunc1 to see how to add more of these.

As usual, I'll try to keep these commits smaller in the future!
Author: James Shubin
Date: 2018-11-22 16:48:10 -05:00
parent 948a3c6d08
commit 96dccca475
146 changed files with 5301 additions and 1112 deletions


@@ -18,24 +18,33 @@
package lang
import (
"bytes"
"fmt"
"strings"
"sync"
"github.com/purpleidea/mgmt/engine"
"github.com/purpleidea/mgmt/gapi"
"github.com/purpleidea/mgmt/lang/funcs"
"github.com/purpleidea/mgmt/lang/interfaces"
"github.com/purpleidea/mgmt/lang/unification"
"github.com/purpleidea/mgmt/pgraph"
"github.com/purpleidea/mgmt/util"
multierr "github.com/hashicorp/go-multierror"
errwrap "github.com/pkg/errors"
"github.com/spf13/afero"
"github.com/urfave/cli"
)
const (
// Name is the name of this frontend.
Name = "lang"
// Start is the entry point filename that we use. It is arbitrary.
Start = "/start." + FileNameExtension // FIXME: replace with a proper code entry point schema (directory schema)
// flagModulePath is the name of the module-path flag.
flagModulePath = "module-path"
// flagDownload is the name of the download flag.
flagDownload = "download"
)
func init() {
@@ -48,57 +57,321 @@ type GAPI struct {
lang *Lang // lang struct
// this data struct is only available *after* Init, so as a result, it
// can not be used inside the Cli(...) method.
data *gapi.Data
initialized bool
closeChan chan struct{}
wg *sync.WaitGroup // sync group for tunnel go routines
}
// CliFlags returns a list of flags used by the specified subcommand.
func (obj *GAPI) CliFlags(command string) []cli.Flag {
result := []cli.Flag{}
modulePath := cli.StringFlag{
Name: flagModulePath,
Value: "", // empty by default
Usage: "choose the modules path (absolute)",
EnvVar: "MGMT_MODULE_PATH",
}
// add this only to run (not needed for get or deploy)
if command == gapi.CommandRun {
runFlags := []cli.Flag{
cli.BoolFlag{
Name: flagDownload,
Usage: "download any missing imports (as the get command does)",
},
cli.BoolFlag{
Name: "update",
Usage: "update all dependencies to the latest versions",
},
}
result = append(result, runFlags...)
}
switch command {
case gapi.CommandGet:
flags := []cli.Flag{
cli.IntFlag{
Name: "depth, d",
Value: -1,
Usage: "max recursion depth limit (-1 is unlimited)",
},
cli.IntFlag{
Name: "retry, r",
Value: 0, // any error is a failure by default
Usage: "max number of retries (-1 is unlimited)",
},
//modulePath, // already defined below in fallthrough
}
result = append(result, flags...)
fallthrough // at the moment, we want the same code input arg...
case gapi.CommandRun:
fallthrough
case gapi.CommandDeploy:
flags := []cli.Flag{
cli.StringFlag{
Name: fmt.Sprintf("%s, %s", Name, Name[0:1]),
Value: "",
Usage: "code to deploy",
},
// TODO: removed (temporarily?)
//cli.BoolFlag{
// Name: "stdin",
// Usage: "use passthrough stdin",
//},
modulePath,
}
result = append(result, flags...)
default:
return []cli.Flag{}
}
return result
}
// Cli takes a cli.Context, and returns our GAPI if activated. All arguments
// should take the prefix of the registered name. On activation, if there are
// any validation problems, you should return an error. If this was not
// activated, then you should return a nil GAPI and a nil error. This is passed
// in a functional file system interface. For standalone usage, this will be a
// temporary memory-backed filesystem so that the same deploy API is used, and
// for normal clustered usage, this will be the normal implementation which is
// usually an etcd backed fs. At this point we should be copying the necessary
// local file system data into our fs for future use when the GAPI is running.
// IOW, running this Cli function, when activated, produces a deploy object
// which is run by our main loop. The difference between running from `deploy`
// or from `run` (both of which can activate this GAPI) is that `deploy` copies
// to an etcdFs, and `run` copies to a memFs. All GAPI's run off of the fs that
// is passed in.
func (obj *GAPI) Cli(cliInfo *gapi.CliInfo) (*gapi.Deploy, error) {
c := cliInfo.CliContext
cliContext := c.Parent()
if cliContext == nil {
return nil, fmt.Errorf("could not get cli context")
}
fs := cliInfo.Fs // copy files from local filesystem *into* this fs...
prefix := "" // TODO: do we need this?
debug := cliInfo.Debug
logf := func(format string, v ...interface{}) {
cliInfo.Logf(Name+": "+format, v...)
}
if !c.IsSet(Name) {
return nil, nil // we weren't activated!
}
// empty by default (don't set for deploy, only download)
modules := c.String(flagModulePath)
if modules != "" && (!strings.HasPrefix(modules, "/") || !strings.HasSuffix(modules, "/")) {
return nil, fmt.Errorf("module path is not an absolute directory")
}
// TODO: while reading through trees of metadata files, we could also
// check the license compatibility of deps...
osFs := afero.NewOsFs()
readOnlyOsFs := afero.NewReadOnlyFs(osFs) // can't be readonly to dl!
//bp := afero.NewBasePathFs(osFs, base) // TODO: can this prevent parent dir access?
afs := &afero.Afero{Fs: readOnlyOsFs} // wrap so that we're implementing ioutil
localFs := &util.Fs{Afero: afs} // always the local fs
downloadAfs := &afero.Afero{Fs: osFs}
downloadFs := &util.Fs{Afero: downloadAfs} // TODO: use with a parent path preventer?
// the fs input here is the local fs we're reading to get the files from
// this is different from the fs variable which is our output dest!!!
output, err := parseInput(c.String(Name), localFs)
if err != nil {
return nil, errwrap.Wrapf(err, "could not activate an input parser")
}
// no need to run recursion detection since this is the beginning
// TODO: do the paths need to be cleaned for "../" before comparison?
logf("lexing/parsing...")
ast, err := LexParse(bytes.NewReader(output.Main))
if err != nil {
return nil, errwrap.Wrapf(err, "could not generate AST")
}
if debug {
logf("behold, the AST: %+v", ast)
}
var downloader interfaces.Downloader
if c.IsSet(flagDownload) && c.Bool(flagDownload) {
downloadInfo := &interfaces.DownloadInfo{
Fs: downloadFs, // the local fs!
// flags are passed in during Init()
Noop: cliContext.Bool("noop"),
Sema: cliContext.Int("sema"),
Update: c.Bool("update"),
Debug: debug,
Logf: func(format string, v ...interface{}) {
// TODO: is this a sane prefix to use here?
logf("get: "+format, v...)
},
}
// this fulfills the interfaces.Downloader interface
downloader = &Downloader{
Depth: c.Int("depth"), // default of infinite is -1
Retry: c.Int("retry"), // infinite is -1
}
if err := downloader.Init(downloadInfo); err != nil {
return nil, errwrap.Wrapf(err, "could not initialize downloader")
}
}
importGraph, err := pgraph.NewGraph("importGraph")
if err != nil {
return nil, errwrap.Wrapf(err, "could not create graph")
}
importVertex := &pgraph.SelfVertex{
Name: "", // first node is the empty string
Graph: importGraph, // store a reference to ourself
}
importGraph.AddVertex(importVertex)
logf("init...")
// init and validate the structure of the AST
data := &interfaces.Data{
Fs: localFs, // the local fs!
Base: output.Base, // base dir (absolute path) that this is rooted in
Files: output.Files,
Imports: importVertex,
Metadata: output.Metadata,
Modules: modules,
Downloader: downloader,
//World: obj.World, // TODO: do we need this?
Prefix: prefix,
Debug: debug,
Logf: func(format string, v ...interface{}) {
// TODO: is this a sane prefix to use here?
logf("ast: "+format, v...)
},
}
// some of this might happen *after* interpolate in SetScope or Unify...
if err := ast.Init(data); err != nil {
return nil, errwrap.Wrapf(err, "could not init and validate AST")
}
logf("interpolating...")
// interpolate strings and other expansionable nodes in AST
interpolated, err := ast.Interpolate()
if err != nil {
return nil, errwrap.Wrapf(err, "could not interpolate AST")
}
// top-level, built-in, initial global scope
scope := &interfaces.Scope{
Variables: map[string]interfaces.Expr{
"purpleidea": &ExprStr{V: "hello world!"}, // james says hi
// TODO: change to a func when we can change hostname dynamically!
"hostname": &ExprStr{V: ""}, // NOTE: empty b/c not used
},
// all the built-in top-level, core functions enter here...
Functions: funcs.LookupPrefix(""),
}
logf("building scope...")
// propagate the scope down through the AST...
// We use SetScope because it follows all of the imports through. I did
// not think we needed to pass in an initial scope because the download
// operation should not depend on any initial scope values, since those
// would all be runtime changes, and we do not support dynamic imports,
// however, we need to since we're doing type unification to err early!
if err := interpolated.SetScope(scope); err != nil { // empty initial scope!
return nil, errwrap.Wrapf(err, "could not set scope")
}
// apply type unification
unificationLogf := func(format string, v ...interface{}) {
if debug { // unification only has debug messages...
logf("unification: "+format, v...)
}
}
logf("running type unification...")
if err := unification.Unify(interpolated, unification.SimpleInvariantSolverLogger(unificationLogf)); err != nil {
return nil, errwrap.Wrapf(err, "could not unify types")
}
// get the list of needed files (this is available after SetScope)
fileList, err := CollectFiles(interpolated)
if err != nil {
return nil, errwrap.Wrapf(err, "could not collect files")
}
// add in our initial files
// we can sometimes be missing our top-level metadata.yaml and main.mcl
files := []string{}
files = append(files, output.Files...)
files = append(files, fileList...)
// run some copy operations to add data into the filesystem
for _, fn := range output.Workers {
if err := fn(fs); err != nil {
return nil, err
}
}
// TODO: do we still need this, now that we have the Imports DAG?
noDuplicates := util.StrRemoveDuplicatesInList(files)
if len(noDuplicates) != len(files) {
// programming error here or in this logical test
return nil, fmt.Errorf("duplicates in file list found")
}
// sort by depth dependency order! (or mkdir -p all the dirs first)
// TODO: is this natively already in a correctly sorted order?
util.PathSlice(files).Sort() // sort it
for _, src := range files { // absolute paths
// rebase path src to root file system of "/" for etcdfs...
dst, err := util.Rebase(src, output.Base, "/")
if err != nil {
// possible programming error
return nil, errwrap.Wrapf(err, "malformed source file path: `%s`", src)
}
if strings.HasSuffix(src, "/") { // it's a dir
// TODO: add more tests to this (it is actually CopyFs)
if err := gapi.CopyDirToFs(fs, src, dst); err != nil {
return nil, errwrap.Wrapf(err, "can't copy dir from `%s` to `%s`", src, dst)
}
continue
}
// it's a regular file path
if err := gapi.CopyFileToFs(fs, src, dst); err != nil {
return nil, errwrap.Wrapf(err, "can't copy file from `%s` to `%s`", src, dst)
}
}
// display the deploy fs tree
if debug || true { // TODO: should this only be shown on debug?
logf("input: %s", c.String(Name))
tree, err := util.FsTree(fs, "/")
if err != nil {
return nil, err
}
logf("tree:\n%s", tree)
}
return &gapi.Deploy{
Name: Name,
Noop: c.GlobalBool("noop"),
Sema: c.GlobalInt("sema"),
GAPI: &GAPI{
InputURI: fs.URI(),
// TODO: add properties here...
},
}, nil
}
// Init initializes the lang GAPI struct.
func (obj *GAPI) Init(data *gapi.Data) error {
if obj.initialized {
return fmt.Errorf("already initialized")
}
@@ -117,20 +390,21 @@ func (obj *GAPI) LangInit() error {
if obj.lang != nil {
return nil // already ran init, close first!
}
if obj.InputURI == "-" {
return fmt.Errorf("stdin passthrough is not supported at this time")
}
fs, err := obj.data.World.Fs(obj.InputURI) // open the remote file system
if err != nil {
return errwrap.Wrapf(err, "can't load code from file system `%s`", obj.InputURI)
}
// the lang always tries to load from this standard path: /metadata.yaml
input := "/" + interfaces.MetadataFilename // path in remote fs
obj.lang = &Lang{
Fs: fs,
Input: input,
Hostname: obj.data.Hostname,
World: obj.data.World,
Debug: obj.data.Debug,
@@ -293,3 +567,127 @@ func (obj *GAPI) Close() error {
obj.initialized = false // closed = true
return nil
}
// Get runs the necessary downloads. This basically runs the lexer, parser and
// sets the scope so that all the imports are followed. It passes a downloader
// in, which can be used to pull down or update any missing imports. This will
// also work when called with the download flag during a normal execution run.
func (obj *GAPI) Get(getInfo *gapi.GetInfo) error {
c := getInfo.CliContext
cliContext := c.Parent()
if cliContext == nil {
return fmt.Errorf("could not get cli context")
}
prefix := "" // TODO: do we need this?
debug := getInfo.Debug
logf := getInfo.Logf
// empty by default (don't set for deploy, only download)
modules := c.String(flagModulePath)
if modules != "" && (!strings.HasPrefix(modules, "/") || !strings.HasSuffix(modules, "/")) {
return fmt.Errorf("module path is not an absolute directory")
}
osFs := afero.NewOsFs()
readOnlyOsFs := afero.NewReadOnlyFs(osFs) // can't be readonly to dl!
//bp := afero.NewBasePathFs(osFs, base) // TODO: can this prevent parent dir access?
afs := &afero.Afero{Fs: readOnlyOsFs} // wrap so that we're implementing ioutil
localFs := &util.Fs{Afero: afs} // always the local fs
downloadAfs := &afero.Afero{Fs: osFs}
downloadFs := &util.Fs{Afero: downloadAfs} // TODO: use with a parent path preventer?
// the fs input here is the local fs we're reading to get the files from
// this is different from the fs variable which is our output dest!!!
output, err := parseInput(c.String(Name), localFs)
if err != nil {
return errwrap.Wrapf(err, "could not activate an input parser")
}
// no need to run recursion detection since this is the beginning
// TODO: do the paths need to be cleaned for "../" before comparison?
logf("lexing/parsing...")
ast, err := LexParse(bytes.NewReader(output.Main))
if err != nil {
return errwrap.Wrapf(err, "could not generate AST")
}
if debug {
logf("behold, the AST: %+v", ast)
}
downloadInfo := &interfaces.DownloadInfo{
Fs: downloadFs, // the local fs!
// flags are passed in during Init()
Noop: cliContext.Bool("noop"),
Sema: cliContext.Int("sema"),
Update: cliContext.Bool("update"),
Debug: debug,
Logf: func(format string, v ...interface{}) {
// TODO: is this a sane prefix to use here?
logf("get: "+format, v...)
},
}
// this fulfills the interfaces.Downloader interface
downloader := &Downloader{
Depth: c.Int("depth"), // default of infinite is -1
Retry: c.Int("retry"), // infinite is -1
}
if err := downloader.Init(downloadInfo); err != nil {
return errwrap.Wrapf(err, "could not initialize downloader")
}
importGraph, err := pgraph.NewGraph("importGraph")
if err != nil {
return errwrap.Wrapf(err, "could not create graph")
}
importVertex := &pgraph.SelfVertex{
Name: "", // first node is the empty string
Graph: importGraph, // store a reference to ourself
}
importGraph.AddVertex(importVertex)
logf("init...")
// init and validate the structure of the AST
data := &interfaces.Data{
Fs: localFs, // the local fs!
Base: output.Base, // base dir (absolute path) that this is rooted in
Files: output.Files,
Imports: importVertex,
Metadata: output.Metadata,
Modules: modules,
Downloader: downloader,
//World: obj.World, // TODO: do we need this?
Prefix: prefix,
Debug: debug,
Logf: func(format string, v ...interface{}) {
// TODO: is this a sane prefix to use here?
logf("ast: "+format, v...)
},
}
// some of this might happen *after* interpolate in SetScope or Unify...
if err := ast.Init(data); err != nil {
return errwrap.Wrapf(err, "could not init and validate AST")
}
logf("interpolating...")
// interpolate strings and other expansionable nodes in AST
interpolated, err := ast.Interpolate()
if err != nil {
return errwrap.Wrapf(err, "could not interpolate AST")
}
logf("building scope...")
// propagate the scope down through the AST...
// we use SetScope because it follows all of the imports through. i
// don't think we need to pass in an initial scope because the download
// operation shouldn't depend on any initial scope values, since those
// would all be runtime changes, and we do not support dynamic imports!
if err := interpolated.SetScope(nil); err != nil { // empty initial scope!
return errwrap.Wrapf(err, "could not set scope")
}
return nil // success!
}