Multi-tenant Tensorboard server #101

Open
mohammedri opened this issue Mar 13, 2020 · 4 comments

Comments

@mohammedri

Currently, if you go into a project and click "Send to Tensorboard", Atlas creates a server that runs Tensorboard for that specific job. However, this is not compatible with a multi-user, multi-tenant Atlas hosted on a cluster: since there is only one instance of the Tensorboard service, all users will clash.

@mohammedri mohammedri added this to To do in Atlas 🚀 via automation Mar 13, 2020
@mohammedri mohammedri moved this from To do to Backlog in Atlas 🚀 Mar 13, 2020
@mohammedri mohammedri moved this from Backlog to To do in Atlas 🚀 Apr 28, 2020
@amackillop
Contributor

amackillop commented Apr 28, 2020

#123 should be completed first, as this will inherently depend on how many users there are.

@amackillop
Contributor

My initial thoughts on accomplishing this:

  • We should only need to scale the tb_server container with the number of users.
  • The Tensorboard API should just forward the request to the correct tb_server container (based on the user) instead of creating the links itself.
  • The logic for actually creating the symlinks should live within an API running in the same container as the server.
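
A minimal sketch of the last bullet, assuming each tb_server container watches a single log directory and that job logs are archived under one directory per job. The function name `link_job_logs` and both directory parameters are hypothetical and not taken from the Atlas codebase:

```python
from pathlib import Path


def link_job_logs(archive_root: Path, logdir: Path, job_id: str) -> Path:
    """Symlink one job's archived logs into the directory Tensorboard watches.

    Hypothetical helper: `archive_root` holds one sub-directory per job,
    `logdir` is the directory this container's Tensorboard was started on.
    """
    source = archive_root / job_id
    if not source.is_dir():
        raise FileNotFoundError(f"no archived logs for job {job_id!r}")

    logdir.mkdir(parents=True, exist_ok=True)
    link = logdir / job_id
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier request
    link.symlink_to(source, target_is_directory=True)
    return link
```

Because the links are created inside the per-user container, two users sending jobs to Tensorboard at the same time would never write into the same log directory.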

Alternative:

  • Merge the two containers so that the server and API both run in the same container.
  • Scale this merged container with the number of users.
  • The REST API (send_to_tensorboard endpoint) can decide which container to forward to based on the logged-in user.
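
The forwarding decision in the last bullet could be sketched roughly as below. This assumes the REST API already knows the logged-in user and has some mapping from user to container address. The mapping, the hostnames, and the function name `tb_container_url` are all hypothetical:

```python
from typing import Dict


class UnknownUserError(KeyError):
    """Raised when no container has been provisioned for a user."""


def tb_container_url(user: str, containers: Dict[str, str], path: str = "/") -> str:
    """Return the URL on the given user's container to forward a request to.

    `containers` maps a username to the base URL of that user's merged
    Tensorboard container (hypothetical; how it is populated is out of scope).
    """
    try:
        base = containers[user]
    except KeyError:
        raise UnknownUserError(user) from None
    # Join without doubling or dropping the slash between base and path.
    return base.rstrip("/") + "/" + path.lstrip("/")
```

For example, with `containers = {"alice": "http://tb-alice:5000"}`, a request from `alice` to a hypothetical `create_sym_links` route would be forwarded to `http://tb-alice:5000/create_sym_links`, while a request from a user without a container fails loudly rather than landing on someone else's server.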

@amackillop
Contributor

amackillop commented Apr 28, 2020

@ekhl See above, I just got that in ahead of your question lol. I can look into multi-tenancy in the underlying tb_server itself.

@ekhl
Contributor

ekhl commented May 2, 2020

For reference, an old issue that planned to productionize Tensorboard and mentions multi-tenancy: tensorflow/tensorboard#92. Unfortunately, the issue was closed because the planned features were "too ambitious and potentially overlap with the work other folks are doing".
