Deployment on custom IP, problem with worker connection #391

Open
MrStrzelec opened this issue Oct 30, 2023 · 31 comments
Labels
bug, deployment, help wanted

Comments

@MrStrzelec

Hi there!
First of all I'd like to say that I really appreciate your software.
But I'm having trouble with a worker.

Basics:
We don't use any kind of Hugging Face or Docker deployment, so it only works on localhost.
We've run everything as you said in the tutorial, but we're not able to get it running correctly.

We run the server via systemd: python manage.py runserver
We also run the worker via systemd: celery -A server worker --loglevel=info -P gevent --concurrency 1 -E
(so that Mercury comes back up whenever the server is restarted)

And yet we're still stuck waiting for the worker.
[Screenshot: Mercury Worker]

Do you have any idea what's going on?
Thank you very much!

@pplonski
Contributor

Hi @MrStrzelec,

Have you tried starting with mercury run? You can also start the server in verbose mode with mercury run --verbose.

If you would like to run each part of the architecture on your own, I recommend checking the Docker entrypoint script: https://github.com/mljar/mercury/blob/main/docker/mercury/entrypoint.sh

In your case, it looks like you are missing -Q celery,ws. Please try running: celery -A server worker --loglevel=info -P gevent --concurrency 1 -E -Q celery,ws

Please let me know if it works for you.

@MrStrzelec
Author

Hey!

The server is working.
I also changed the celery script as you advised, but the worker is still not working. I get this error from celery (systemd):

Okt 30 09:22:45 Servername celery[1155]: return request("get", url, params=params, **kwargs)
Okt 30 09:22:45 Servername celery[1155]: File "/opt/jupyterhub/lib/python3.10/site-packages/requests/api.py", line 59, in request
Okt 30 09:22:45 Servername celery[1155]: return session.request(method=method, url=url, **kwargs)
Okt 30 09:22:45 Servername celery[1155]: File "/opt/jupyterhub/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
Okt 30 09:22:45 Servername celery[1155]: resp = self.send(prep, **send_kwargs)
Okt 30 09:22:45 Servername celery[1155]: File "/opt/jupyterhub/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
Okt 30 09:22:45 Servername celery[1155]: r = adapter.send(request, **kwargs)
Okt 30 09:22:45 Servername celery[1155]: File "/opt/jupyterhub/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
Okt 30 09:22:45 Servername celery[1155]: raise ConnectionError(e, request=request)
Okt 30 09:22:45 Servername celery[1155]: requests.exceptions.ConnectionError: HTTPConnectionPool(host='Servername', port=5668): Max retries exceeded with url: /api/v1/worker/18db8815-183d-4f84-929b-40522256a3e2/387/12/nb (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb144005870>: Failed to establish a new connection: [Errno 111] Connection refused'))
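
A "Connection refused" like the one above usually means nothing is listening at the host and port the worker is calling. The following is a minimal sketch, using only the Python standard library, for checking that from the machine that runs the celery worker. The helper names endpoint_from_url and can_connect are hypothetical and not part of Mercury.

```python
import socket
from urllib.parse import urlsplit


def endpoint_from_url(url: str) -> tuple[str, int]:
    """Extract (host, port) from a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"URL has no host part: {url!r}")
    default_port = 443 if parts.scheme in ("https", "wss") else 80
    return parts.hostname, parts.port or default_port


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures
        return False
```

For the traceback above, that would mean calling can_connect(*endpoint_from_url("http://Servername:5668/")) with the real host name. A result of False suggests the server is bound to a different address or port than the one the worker uses.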

@pplonski
Contributor

I have no idea right now what might be the issue. Is it running on 127.0.0.1?

@MrStrzelec
Author

It's running on a different IP address. I thought I had added this information.

python manage.py runserver IPADDRESS:PORT

@pplonski
Contributor

Could you please add the --verbose argument to the server start command? Let's check whether the worker is connecting to the correct IP address.

@MrStrzelec
Author

MrStrzelec commented Oct 30, 2023

user@server:/opt/jupyterhub/lib/python3.10/site-packages/mercury$ python manage.py runsever --verbose IP:PORT
Traceback (most recent call last):

  File "/usr/lib/python3.10/logging/config.py", line 565, in configure
    handler = self.configure_handler(handlers[name])
  File "/usr/lib/python3.10/logging/config.py", line 746, in configure_handler
    result = factory(**kwargs)
  File "/usr/lib/python3.10/logging/__init__.py", line 1169, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib/python3.10/logging/__init__.py", line 1201, in _open
    return open_func(self.baseFilename, self.mode,
PermissionError: [Errno 13] Permission denied: '/opt/jupyterhub/lib/python3.10/site-packages/mercury/django-errors.log'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/opt/jupyterhub/lib/python3.10/site-packages/mercury/manage.py", line 22, in <module>
    main()
  File "/opt/jupyterhub/lib/python3.10/site-packages/mercury/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/opt/jupyterhub/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/opt/jupyterhub/lib/python3.10/site-packages/django/core/management/__init__.py", line 416, in execute
    django.setup()
  File "/opt/jupyterhub/lib/python3.10/site-packages/django/__init__.py", line 19, in setup
    configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
  File "/opt/jupyterhub/lib/python3.10/site-packages/django/utils/log.py", line 76, in configure_logging
    logging_config_func(logging_settings)
  File "/usr/lib/python3.10/logging/config.py", line 811, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.10/logging/config.py", line 572, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'file'

I also tried running mercury run --verbose IP:PORT and the output is similar.

@pplonski
Contributor

pplonski commented Oct 30, 2023

It looks like the process that runs the server doesn't have permission to write logs to that directory. This might be connected to #384. I will need some time to make it configurable.

Have you tried to deploy with docker-compose?

Have you already developed the notebook with Mercury, so that deployment is the last step?
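
To confirm a permission problem like the PermissionError on django-errors.log, a small check can be run as the same user that the systemd service uses. This is a minimal sketch with the standard library only; can_write_file is a hypothetical helper, not part of Mercury or Django.

```python
import os


def can_write_file(path: str) -> bool:
    """Return True if the current user could append to or create the file at `path`."""
    if os.path.exists(path):
        # The file exists: it must be writable itself
        return os.access(path, os.W_OK)
    # The file does not exist yet: the parent directory must allow creating it
    parent = os.path.dirname(os.path.abspath(path))
    return os.path.isdir(parent) and os.access(parent, os.W_OK | os.X_OK)
```

Calling can_write_file with the log path from the traceback shows whether the service user can write there. Note that os.access answers for the user running the check, so it has to be run as the service user to be meaningful.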

pplonski changed the title from "Worker not working" to "Deployment on custom IP, problem with worker connection" on Oct 30, 2023
@MrStrzelec
Author

MrStrzelec commented Oct 30, 2023

PermissionError: [Errno 13] Permission denied: '/opt/jupyterhub/lib/python3.10/site-packages/mercury/django-errors.log'

I've just changed the permissions on this file for everyone and now it works. The server and the worker are both running correctly, without Docker or Hugging Face. For some reason, though, Mercury reports that it cannot find the file, while the notebook works correctly in Jupyter Notebook. Do you know why?

[Screenshot: File not found]

@pplonski
Contributor

Great that it is working!

Do you have the data file in the same directory as the notebook? Mercury should be able to read the file if it is in the same directory.

An alternative might be to provide the full path to the file.

@MrStrzelec
Author

Hey!

Sorry, I had a day off yesterday.

Yep, the data is in the same directory as the notebook. I also tried with the full path and that didn't work either.

Any other idea?
Thanks a Ton :)

@pplonski
Contributor

pplonski commented Nov 1, 2023

This is strange. Could you provide the full error message? Maybe try printing the working directory before loading files, just to check the path. Try adding this code:

import os
print(os.getcwd())
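
Building on the getcwd check above, the following sketch also shows how a relative file name would be resolved from the current working directory and whether the file exists there. describe_path is a hypothetical helper, and data.csv in the usage note is a placeholder file name.

```python
from pathlib import Path


def describe_path(name: str) -> dict:
    """Report the working directory, the resolved location of `name`, and whether it exists."""
    path = Path(name)
    # Relative names are resolved against the current working directory
    resolved = path if path.is_absolute() else Path.cwd() / path
    return {
        "cwd": str(Path.cwd()),
        "resolved": str(resolved),
        "exists": resolved.exists(),
    }
```

Adding print(describe_path("data.csv")) to the first notebook cell, and comparing the output in Jupyter and in Mercury, shows whether the two environments start from different directories.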

@MrStrzelec
Author

MrStrzelec commented Nov 2, 2023

I think the worker was not working at all. I've checked the logs and found this:

[2023-11-02 12:33:20,930: INFO/MainProcess] Task apps.ws.tasks.task_start_websocket_worker[1cfbef57-5ef1-4ea6-b307-f1380f3a20aa] received
[2023-11-02 12:33:20,976: INFO/MainProcess] Task apps.ws.tasks.task_start_websocket_worker[1cfbef57-5ef1-4ea6-b307-f1380f3a20aa] succeeded in 0.04001394798979163s: None
NB 2023-11-02 12:33:22,661 Exception when notebook load, quit
Traceback (most recent call last):
  File "/opt/jupyterhub/lib/python3.10/site-packages/mercury/apps/../apps/nbworker/rest.py", line 36, in load_notebook
    raise Exception("Cant load notebook")
Exception: Cant load notebook

I think the worker couldn't find the file because it has a problem loading the notebook.
I've checked everything twice and didn't find anything.

@pplonski
Contributor

pplonski commented Nov 7, 2023

Do you have the notebook file in the same directory in which you start the worker?

@keyvan-najafy

Hi,
I am having the same problem.

I created a private fork without any changes and deployed it for HTTP via docker-compose on my local computer with the default settings, but I am getting errors similar to this #391 (comment).

I used the mercury-deploy-demo repo for notebooks, and everything seems to be working: the admin panel, welcome.md, and the notebook icons on the welcome page are displayed properly. But when I click on any of the notebooks, I get the following errors in the terminal. I am redirected to the notebook's page, where the notebook is visible in the background, but the page stays in the loading state forever.

What should I do?

mercuryprivate-mercury-1  |   NB 2023-11-23 22:43:06,960 Exception when notebook load, quit
mercuryprivate-mercury-1  |   Traceback (most recent call last):
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
mercuryprivate-mercury-1  |     conn = connection.create_connection(
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
mercuryprivate-mercury-1  |     raise err
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
mercuryprivate-mercury-1  |     sock.connect(sa)
mercuryprivate-mercury-1  | ConnectionRefusedError: [Errno 111] Connection refused
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | During handling of the above exception, another exception occurred:
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | Traceback (most recent call last):
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen
mercuryprivate-mercury-1  |     httplib_response = self._make_request(
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 416, in _make_request
mercuryprivate-mercury-1  |     conn.request(method, url, **httplib_request_kw)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 244, in request
mercuryprivate-mercury-1  |     super(HTTPConnection, self).request(method, url, body=body, headers=headers)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/http/client.py", line 1283, in request
mercuryprivate-mercury-1  |     self._send_request(method, url, body, headers, encode_chunked)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/http/client.py", line 1329, in _send_request
mercuryprivate-mercury-1  |     self.endheaders(body, encode_chunked=encode_chunked)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/http/client.py", line 1278, in endheaders
mercuryprivate-mercury-1  |     self._send_output(message_body, encode_chunked=encode_chunked)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/http/client.py", line 1038, in _send_output
mercuryprivate-mercury-1  |     self.send(msg)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/http/client.py", line 976, in send
mercuryprivate-mercury-1  |     self.connect()
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
mercuryprivate-mercury-1  |     conn = self._new_conn()
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
mercuryprivate-mercury-1  |     raise NewConnectionError(
mercuryprivate-mercury-1  | urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f16bc30e3e0>: Failed to establish a new connection: [Errno 111] Connection refused
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | During handling of the above exception, another exception occurred:
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | Traceback (most recent call last):
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
mercuryprivate-mercury-1  |     resp = conn.urlopen(
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 799, in urlopen
mercuryprivate-mercury-1  |     retries = retries.increment(
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
mercuryprivate-mercury-1  |     raise MaxRetryError(_pool, url, error or ResponseError(cause))
mercuryprivate-mercury-1  | urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /api/v1/worker/dd83117b-dcd9-4c54-8782-aa60c094f6ac/16/1/nb (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f16bc30e3e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | During handling of the above exception, another exception occurred:
mercuryprivate-mercury-1  |
mercuryprivate-mercury-1  | Traceback (most recent call last):
mercuryprivate-mercury-1  |   File "/app/mercury/apps/../apps/nbworker/rest.py", line 32, in load_notebook
mercuryprivate-mercury-1  |     response = requests.get(
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/api.py", line 73, in get
mercuryprivate-mercury-1  |     return request("get", url, params=params, **kwargs)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/api.py", line 59, in request
mercuryprivate-mercury-1  |     return session.request(method=method, url=url, **kwargs)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
mercuryprivate-mercury-1  |     resp = self.send(prep, **send_kwargs)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
mercuryprivate-mercury-1  |     r = adapter.send(request, **kwargs)
mercuryprivate-mercury-1  |   File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
mercuryprivate-mercury-1  |     raise ConnectionError(e, request=request)
mercuryprivate-mercury-1  | requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /api/v1/worker/dd83117b-dcd9-4c54-8782-aa60c094f6ac/16/1/nb (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f16bc30e3e0>: Failed to establish a new connection: [Errno 111] Connection refused'))

@kskadart

kskadart commented Nov 26, 2023

Hello folks, I'm facing the same issue as @keyvan-najafy.
I copied 2 repos: mercury and mercury-deploy-demo.
Copied .env to the mercury folder.
Changed the path to the notebooks in .env.
Ran docker compose build, then docker compose up.

@pplonski
Contributor

Hi @keyvan-najafy, @kskadart,

What operating system are you using? How can I reproduce your environment?

@keyvan-najafy

keyvan-najafy commented Nov 27, 2023

Here is my environment:

  • Windows 10 Enterprise, latest update (also checked with previous versions, got the same errors)
  • docker desktop
  • WSL v2 + ubuntu 22.04 for containers

An update on the issue:

When I change MERCURY_SERVER_URL in mercury/apps/nbworker/rest.py and set it to http://localhost:9000 or http://mercury:9000 (instead of the default http://localhost:8000), I get past the connection refused error, but I get another error regarding the websocket:

[2023-11-27 09:52:34,548: INFO/MainProcess] Task apps.ws.tasks.task_start_websocket_worker[eca29bc6-8f84-4a6f-b98d-78e8126bcb81] received
2023-11-27 13:22:34 mercuryprivate-mercury-1  | [2023-11-27 09:52:34,789: INFO/MainProcess] Task apps.ws.tasks.task_start_websocket_worker[eca29bc6-8f84-4a6f-b98d-78e8126bcb81] succeeded in 0.23135460000000307s: None
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,433 Start NBWorker with arguments ['/app/mercury/apps/nbworker', '2', '67ace814-d60a-4ab0-a92f-d70c229cccb3', '225', 'ws://localhost']
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,433 Load notebook id=2
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,438 Starting new HTTP connection (1): 127.0.0.1:9000
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,471 http://127.0.0.1:9000 "GET /api/v1/worker/67ace814-d60a-4ab0-a92f-d70c229cccb3/225/2/nb HTTP/1.1" 200 918
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,472 Load owner and user
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,475 Starting new HTTP connection (1): 127.0.0.1:9000
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,504 http://127.0.0.1:9000 "GET /api/v1/worker/67ace814-d60a-4ab0-a92f-d70c229cccb3/225/2/owner-and-user HTTP/1.1" 200 93
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,505 WS connect to ws://localhost/ws/worker/2/67ace814-d60a-4ab0-a92f-d70c229cccb3/225/
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,509 [Errno 99] Cannot assign requested address - goodbye
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,509 WS on_error, [Errno 99] Cannot assign requested address
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,509 Delete Worker output directory: /app/mercury/media/67ace814-d60a-4ab0-a92f-d70c229cccb3
2023-11-27 13:22:36 mercuryprivate-mercury-1  | NB 2023-11-27 09:52:36,509 WS close connection, status=None, msg=None 

So according to this log, I get past the connection error in mercury/apps/nbworker/rest.py and the notebook loads, but now the websocket can't connect.

@pplonski
Contributor

pplonski commented Nov 27, 2023

Please try setting REACT_APP_SERVER_WS to your address as well, for example ws://mercury:9000. REACT_APP_SERVER_WS sets the websocket server address, which is shared between the client and the worker.

@keyvan-najafy

keyvan-najafy commented Nov 27, 2023

@pplonski adding REACT_APP_SERVER_WS did not help.
I managed to fix it by editing mercury/apps/nbworker/main.py.
The default server URL is ws://localhost, which is passed to main.py by entrypoint.sh (I think so!).
I simply changed this to ws://localhost:9000 and everything started working:

if __name__ == "__main__":
    log.info(f"Start NBWorker with arguments {sys.argv}")
    server_url = 'ws://localhost:9000' #### added this line
    for _ in range(CONNECT_MAX_TRIES):
        nb_worker = NBWorker(
            f"{server_url}/ws/worker/{notebook_id}/{session_id}/{worker_id}/",
            notebook_id,
            session_id,
            worker_id,
        )
        time.sleep(RECONNECT_WAIT_TIME)

So basically the idea is that, somehow, either entrypoint.sh or docker-compose.yml causes a mismatch in the ports given to the mercury container.
To sum up everything I have done:

  • add MERCURY_SERVER_URL: ${MERCURY_SERVER_URL} to docker-compose file under environment section
  • add MERCURY_SERVER_URL=http://127.0.0.1:9000 to .env file
  • add server_url = 'ws://localhost:9000' to mercury/apps/nbworker/main.py under if __name__ == "__main__":

This is just a temporary fix, as it is pretty bad practice to change main.py directly, so perhaps entrypoint.sh or docker-compose.yml should get an update, or a new issue should be created.

I am planning to deploy this on a local cloud platform which does not support docker compose (so I have to run all containers separately). I hope these fixes don't create any more complications.

@pplonski
Contributor

Thank you, @keyvan-najafy, for describing the steps and the fix. Yes, there is a mismatch that should be fixed.

pplonski added the help wanted and deployment labels on Nov 28, 2023
@ghansham

ghansham commented Dec 1, 2023

@MrStrzelec, did you succeed? I am getting exactly the same error, also running on JupyterHub. The websocket connection is successful, but the worker stays queued and I get the same error in rest.py, in the load_notebook method at line 32.

@mariliaribeiro

mariliaribeiro commented Dec 3, 2023

In reply to @keyvan-najafy's fix above (hard-coding server_url = 'ws://localhost:9000' in mercury/apps/nbworker/main.py and setting MERCURY_SERVER_URL=http://127.0.0.1:9000 in .env):

Thanks, it's working for me!

I think the main problem is that the server port is not present in the sys.argv passed to mercury/apps/nbworker/main.py, and MERCURY_SERVER_URL is not used to define the server_url variable. The log shows:

NB 2023-12-03 15:25:01,455 Start NBWorker with arguments ['/app/mercury/apps/nbworker', '1', '6913ad3b-9da1-40af-9305-fec1915ef334', '736', 'ws://127.0.0.1']

How is mercury/apps/nbworker/main.py called? Is there any other file that sets sys.argv?

@pplonski
Contributor

pplonski commented Dec 4, 2023

Hi @mariliaribeiro,

Here is code that creates job for worker

job_params = {
    "notebook_id": self.notebook_id,
    "session_id": self.session_id,
    "worker_id": worker.id,
    #
    # ugly hack for docker deployment
    #
    "server_url": self.server_address
    if "0.0.0.0" not in self.server_address
    else self.server_address + ":9000",
}
transaction.on_commit(lambda: task_start_websocket_worker.delay(job_params))

Here is code that starts worker process based on job params

command = [
    sys.executable,
    os.path.join(directory, "nbworker"),
    str(job_params["notebook_id"]),
    str(job_params["session_id"]),
    str(job_params["worker_id"]),
    job_params["server_url"],
]
log.debug("Start " + " ".join(command))
worker = subprocess.Popen(command)

Sorry that not everything is working smoothly... I'm focused right now on ipywidgets support in Mercury.

Anyway, have you successfully created an app in Mercury? What is your use case? Do you enjoy working with Mercury?
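
The conditional in the job_params snippet above can be isolated to see what each kind of address turns into. The sketch below copies only that expression into a standalone function; job_server_url is a hypothetical name, and the sample addresses are the values seen in the worker logs earlier in this thread plus two assumed 0.0.0.0 variants.

```python
def job_server_url(server_address: str) -> str:
    """Standalone copy of the "server_url" expression from the job_params snippet."""
    if "0.0.0.0" not in server_address:
        # Any other host is passed through unchanged, without a port
        return server_address
    return server_address + ":9000"


for address in ("ws://localhost", "ws://127.0.0.1", "ws://0.0.0.0", "ws://0.0.0.0:8000"):
    print(f"{address} -> {job_server_url(address)}")
# ws://localhost -> ws://localhost
# ws://127.0.0.1 -> ws://127.0.0.1
# ws://0.0.0.0 -> ws://0.0.0.0:9000
# ws://0.0.0.0:8000 -> ws://0.0.0.0:8000:9000
```

The port is appended only when the address contains 0.0.0.0, which is consistent with the workers in the logs above being started with ws://localhost and ws://127.0.0.1 and no port. If the address already carried a port (an assumption, since the thread does not show such a value), the result would be malformed.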

@mariliaribeiro

mariliaribeiro commented Dec 19, 2023

In reply to @pplonski's comment above (the job_params and worker start command snippets):

Hi @pplonski!

Thanks a lot! Yeah, I enjoy working with Mercury. It is really helpful when we are building and validating an MVP.

I did some tests on mercury/apps/ws/client.py to build an HTTP application with Docker. I realised that server_url in the job_params dictionary in mercury/apps/ws/client.py was not being built correctly when the address and port were concatenated. I tried to open an MR with the changes, but I think I don't have permission.

Here are some changes. I hope it works for other cases!

# mercury/apps/ws/utils.py
import re
...

def get_client_server_url(server_address: str) -> str:
    """
    Normalize the server address passed to the WS client job_params
    so that it always points at port 9000.
    """
    # Greedy match: everything up to the last ":" that is followed by a digit
    regex = r"(.*):([0-9].*)"

    server_url = server_address
    if len(server_address.split("://")[-1].split(":")) == 1:
        # No port in the address: append the default one
        server_url = server_address + ":9000"
    elif re.search(regex, server_address):
        # A port is present: replace it with 9000
        server_url = re.sub(regex, r"\1:9000", server_address)
    return server_url

# mercury/apps/ws/client.py
from apps.ws.utils import get_client_server_url

...

def need_worker(self):
    ...
    job_params = {
        ....,
        "server_url": get_client_server_url(self.server_address)
    }

With this code we don't need to apply the change in server_url from mercury/apps/nbworker/main.py under if __name__ == "__main__" (#391 (comment)).
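
The same normalization can also be written without a regular expression. The sketch below is an alternative illustration only, not the code proposed above and not Mercury's implementation. set_port is a hypothetical name, the default port 9000 is taken from the snippets in this thread, and the function assumes the address always carries a scheme such as ws://.

```python
from urllib.parse import urlsplit, urlunsplit


def set_port(server_address: str, port: int = 9000) -> str:
    """Return server_address with its port replaced by, or set to, `port`."""
    parts = urlsplit(server_address)
    if not parts.hostname:
        raise ValueError(f"Address must include a scheme and host: {server_address!r}")
    host = parts.hostname
    if ":" in host:
        # IPv6 literals must be wrapped in brackets inside a URL
        host = f"[{host}]"
    netloc = f"{host}:{port}"
    # Path, query and fragment are kept; user info, if any, is dropped
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

With this version, set_port("ws://localhost") gives ws://localhost:9000 and set_port("ws://0.0.0.0:8000") gives ws://0.0.0.0:9000. An address that includes a path keeps the path after the port.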

@nagatushar

Regarding the File not found error that @MrStrzelec reported above after fixing the PermissionError on django-errors.log: try prefixing the path with /app/, for example /app/"your file path".

@somdipdatta

First: I love how easy it is to build apps from a notebook using Mercury.

The problem I am having is likely related to this thread, hence posting here.
I deployed my Mercury app on render.com. They do not allow 127.0.0.1, hence I am running
mercury run 0.0.0.0:8000

The client shows "Waiting for Worker.." and never connects.
I tried setting the environment variables REACT_APP_SERVER_WS and MERCURY_SERVER_URL; neither helped.

@pplonski
Contributor

pplonski commented Mar 1, 2024

This might be an issue with the hard-coded worker address.

pplonski added the bug label on Mar 1, 2024
@ghansham

ghansham commented Mar 2, 2024 via email

@somdipdatta

Yes it is. We had to modify that

Will it be modified in an upcoming release?

@AGrzes

AGrzes commented Apr 5, 2024

I'm facing the same issue.

My understanding is that the root cause is having a single server_url / server_address that is used by both the front-end and nbworker, while

  • the front-end may go through an arbitrary number of reverse proxies and load balancers, have SSL termination in front, and so on
  • nbworker should take the shortest possible path: loopback, direct machine-to-machine, or a single load balancer

At the moment, the method that determines the first address also determines the second. A simple solution would be to allow overriding it for deployments that need it.

The issue is marked help wanted. Would a PR be welcome?

@AGrzes

AGrzes commented Apr 5, 2024

I think PR #435 addresses exactly this.

Maybe the name MERCURY_TASK_SERVER_URL is not clear, but it is "The address nbworker processes used to connect to main server through websockets".
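
The override idea discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the variable name MERCURY_TASK_SERVER_URL comes from the comment above, but how PR #435 actually reads it is not shown in this thread, and resolve_worker_url with its fallback behavior is hypothetical.

```python
import os


def resolve_worker_url(frontend_address: str) -> str:
    """Pick the address nbworker should connect to.

    An explicit MERCURY_TASK_SERVER_URL wins; otherwise fall back to the
    address derived from the front-end request (assumed behavior).
    """
    override = os.environ.get("MERCURY_TASK_SERVER_URL", "").strip()
    return override or frontend_address
```

With the variable unset, the worker keeps using the front-end address. A deployment behind a reverse proxy could set it to a direct address such as ws://127.0.0.1:9000 so that the worker bypasses the proxy.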
