etcd: Wait for server to start before continuing

I think there was a rare race where we would use the etcd server
before it had fully started up. I only ever saw this occur on Travis,
and with this fix we should hopefully never see it again.

It is worth mentioning that much of my etcd code and the lib Run()
function could use a solid cleaning.
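
For readers who want the idea in isolation, here is a minimal sketch of the wait-for-ready pattern this commit applies: block on a readiness channel, but give up after a timeout. It is not the project's code. The `server` type, `newServer`, `waitForReady`, `errStartupTimeout`, and `demoTimeout` are hypothetical names made up for the illustration, and it assumes the readiness channel is closed once startup has finished.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStartupTimeout is a hypothetical sentinel error for this sketch.
var errStartupTimeout = errors.New("startup timeout")

// demoTimeout is an arbitrary short timeout used only in this demo.
const demoTimeout = 50 * time.Millisecond

// server is a hypothetical stand-in for an embedded server.
type server struct {
	ready chan struct{} // closed once startup has finished (assumption)
}

// newServer starts a fake server that becomes ready after startupDelay.
func newServer(startupDelay time.Duration) *server {
	s := &server{ready: make(chan struct{})}
	go func() {
		time.Sleep(startupDelay) // simulate startup work
		close(s.ready)           // signal readiness to all waiters
	}()
	return s
}

// ServerReady returns a channel that is closed when the server is ready.
func (s *server) ServerReady() <-chan struct{} { return s.ready }

// waitForReady blocks until ready is closed or until timeout elapses,
// whichever happens first.
func waitForReady(ready <-chan struct{}, timeout time.Duration) error {
	select {
	case <-ready:
		return nil
	case <-time.After(timeout):
		return errStartupTimeout
	}
}

func main() {
	// A server that starts quickly, well within the timeout.
	fast := newServer(10 * time.Millisecond)
	fmt.Println("fast:", waitForReady(fast.ServerReady(), demoTimeout))

	// A server that takes far longer than the timeout allows.
	slow := newServer(time.Second)
	fmt.Println("slow:", waitForReady(slow.ServerReady(), demoTimeout))
}
```

Closing the channel, rather than sending a value on it, lets any number of callers wait on the same signal, and a receive on an already closed channel returns immediately, so a caller that arrives late does not block.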
James Shubin
2017-06-03 01:00:35 -04:00
parent d9601471df
commit 4f420dde05
2 changed files with 27 additions and 10 deletions

@@ -346,6 +346,16 @@ func (obj *Main) Run() error {
	} else if err := EmbdEtcd.Startup(); err != nil { // startup (returns when etcd main loop is running)
		obj.Exit(fmt.Errorf("Main: Etcd: Startup failed: %v", err))
	}
	// wait for etcd server to be ready before continuing...
	select {
	case <-EmbdEtcd.ServerReady():
		log.Printf("Main: Etcd: Server: Ready!")
		// pass
	case <-time.After(((etcd.MaxStartServerTimeout * etcd.MaxStartServerRetries) + 1) * time.Second):
		obj.Exit(fmt.Errorf("Main: Etcd: Startup timeout"))
	}
	convergerStateFn := func(b bool) error {
		// exit if we are using the converged timeout and we are the
		// root node. otherwise, if we are a child node in a remote