Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S23 azure #177

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

S23 azure #177

wants to merge 25 commits into from

Conversation

keting-ai
Copy link
Contributor

The azure part of s23 works

src/bench/bench.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/azure_worker.go Outdated Show resolved Hide resolved
src/boss/cloudvm/azure.go Outdated Show resolved Hide resolved
src/boss/cloudvm/worker.go Outdated Show resolved Hide resolved
src/common/config.go Outdated Show resolved Hide resolved
src/boss/azure_vm.go Outdated Show resolved Hide resolved
@@ -164,19 +164,40 @@ func create_lambdas(ctx *cli.Context) error {

// simple pandas operation (correlation between two columns in 1000x10 DataFrame)
path = filepath.Join(common.Conf.Registry, fmt.Sprintf("bench-pd-%d.py", i))
code = `# ol-install: numpy,pandas
code = `# ol-install: numpy,pandas,matplotlib,scipy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to repurpose the existing benchmark. Can you add a new one if needed?

@@ -112,6 +115,15 @@ func randomString() string {
return strconv.Itoa(r.Int())
}

func AzureCreateVM(worker *Worker) (*AzureConfig, error) {
subscriptionId = os.Getenv("AZURE_SUBSCRIPTION_ID")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be part of the config file instead of an environment variable?


new_vm.Subnet = *subnet

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't leave in commented out code


new_disk, err := createDisk(ctx, conn, *snapshot.ID, newDiskName)
if err != nil {
log.Print(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of printing each time, maybe its better for the caller to print this?

log.Println(err.Error())
return conf, err
}
create_lock.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't it be left locked if an error is returned? Can you find a better way, probably using the defer unlock pattern?

}
}

var conf_lock sync.Mutex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be global, and shared across pools? Or should it be part of the pool?


// TODO: Rightnow we default to have only one resource group
type rgroups struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this type add anything useful beyond a regular slice of rgroup's?

if err != nil {
log.Fatalf(err.Error())
}
if pool.platform == "gcp" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need different logic for these two? Can we handle errors consistently?

@@ -339,7 +353,12 @@ func (w *Worker) runCmd(command string) {

tries := 10
for tries > 0 {
sshcmd := exec.Command("ssh", user.Username+"@"+w.workerIp, "-o", "StrictHostKeyChecking=no", "-C", cmd)
var sshcmd *exec.Cmd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it feels like we should have a generic ssh-with-retry kind of function used by both platforms

}

func LoadDefaults() error {
Conf = &Config{
Platform: "mock",
Scaling: "manual",
API_key: "abc", // TODO: autogenerate a random key
API_key: "abc", // TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you deleting the more detailed comment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants