SOCKServer freeze after some create #98

Open
GindaChen opened this issue Mar 5, 2020 · 1 comment
@GindaChen
Member

The following will make the SOCK server freeze indefinitely:

> ol worker -d -o limits.installer_mem_mb=250,server_mode="sock",mem_pool_mb=500

# Create a sandbox from echo a few times...
# /path/to/echo refers to the test-registry/echo
> curl -X POST http://localhost:5000/create -d '{"parent": "", "leaf": true, "code": "/path/to/echo"}'
> curl -X POST http://localhost:5000/create -d '{"parent": "", "leaf": false, "code": "/path/to/echo"}'

Then run:

> curl -X POST http://localhost:5000/create -d '{"parent": "1", "leaf": false, "code": "/path/to/echo"}'
> curl -X POST http://localhost:5000/create -d '{"parent": "1", "leaf": true, "code": "/path/to/echo"}'

Both create requests fail, and the worker freezes with this log:

2020/03/04 20:35:44 POST /create
2020/03/04 20:35:44 Parsed Args: map[code:/root/SOCKexp/use-sock/test-registry/echo leaf:false parent:1]
2020/03/04 20:35:44 <sandboxes>.Create(<SB 1>, false, /root/SOCKexp/use-sock/test-registry/echo, /root/SOCKexp/use-sock/default-ol/worker/scratch/dir-1008, <installs=[], imports=[], mem-limit-mb=50>)=8... [SOCK POOL sandboxes]

If we then try to kill the worker, it does not shut down properly. It freezes after printing these logs:

^C2020/03/04 20:33:00 received kill signal, cleaning up
2020/03/04 20:33:00 Destroy() [SB 1]
Traceback (most recent call last):
Traceback (most recent call last):
  File "sock2.py", line 175, in <module>
  File "sock2.py", line 175, in <module>
Traceback (most recent call last):
Traceback (most recent call last):
  File "sock2.py", line 175, in <module>
  File "sock2.py", line 175, in <module>
    main()
  File "sock2.py", line 171, in main
    main()
  File "sock2.py", line 171, in main
    start_container()
  File "sock2.py", line 136, in start_container
    main()
  File "sock2.py", line 171, in main
    start_container()
  File "sock2.py", line 136, in start_container
    main()
  File "sock2.py", line 171, in main
    exec(code)
  File "<string>", line 1, in <module>
    start_container()
  File "sock2.py", line 136, in start_container
    exec(code)
  File "<string>", line 1, in <module>
  File "sock2.py", line 52, in web_server
    start_container()
  File "sock2.py", line 136, in start_container
    exec(code)
  File "<string>", line 1, in <module>
  File "sock2.py", line 52, in web_server
    tornado.ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 863, in start
  File "sock2.py", line 63, in fork_server
    tornado.ioloop.IOLoop.instance().start()
    exec(code)
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 863, in start
  File "<string>", line 1, in <module>
    client, info = file_sock.accept()
  File "/usr/lib/python3.6/socket.py", line 205, in accept
  File "sock2.py", line 63, in fork_server
    client, info = file_sock.accept()
  File "/usr/lib/python3.6/socket.py", line 205, in accept
    fd, addr = self._accept()
KeyboardInterrupt
    event_pairs = self._impl.poll(poll_timeout)
KeyboardInterrupt
    fd, addr = self._accept()
    event_pairs = self._impl.poll(poll_timeout)
KeyboardInterrupt
KeyboardInterrupt
2020/03/04 20:33:00 ...returns <SB 5>, <nil> [SOCK POOL sandboxes]
2020/03/04 20:33:00 Save ID '5' to map
2020/03/04 20:33:00 parent.fork returned connection refused [SOCK POOL sandboxes]
2020/03/04 20:33:00 Destroy() [SB 6]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 6]
2020/03/04 20:33:00 parent.fork returned connection refused [SOCK POOL sandboxes]
2020/03/04 20:33:00 Destroy() [SB 7]
2020/03/04 20:33:00 CG ref count decremented to 1 [SOCK 1]
2020/03/04 20:33:00 Destroy() [SB 2]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 7]
2020/03/04 20:33:00 killed PIDs [] in CG [SOCK 6]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 6]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 2]
2020/03/04 20:33:00 killed PIDs [] in CG [SOCK 7]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 7]
2020/03/04 20:33:00 waiting for 1 procs in cg-2 to die [CGROUP POOL default-ol-sandboxes]
2020/03/04 20:33:00 ...returns <nil>, Fork from parent Sandbox failed [SOCK POOL sandboxes]
2020/03/04 20:33:00 Request Handler Failed: Fork from parent Sandbox failed
2020/03/04 20:33:00 ...returns <nil>, Fork from parent Sandbox failed [SOCK POOL sandboxes]
2020/03/04 20:33:00 Request Handler Failed: Fork from parent Sandbox failed
2020/03/04 20:33:00 killed PIDs [28427] in CG [SOCK 2]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 2]
2020/03/04 20:33:01 Destroy() [SB 3]
2020/03/04 20:33:01 CG ref count decremented to 0 [SOCK 3]
2020/03/04 20:33:01 killed PIDs [] in CG [SOCK 3]
2020/03/04 20:33:01 unmount and remove dirs [SOCK 3]
2020/03/04 20:33:01 Destroy() [SB 4]
2020/03/04 20:33:01 CG ref count decremented to 0 [SOCK 4]
2020/03/04 20:33:01 killed PIDs [] in CG [SOCK 4]
2020/03/04 20:33:01 unmount and remove dirs [SOCK 4]
2020/03/04 20:33:01 make sure all memory is free [SOCK POOL sandboxes]
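
For convenience, the curl steps above can be scripted. This is only a minimal sketch: it assumes the worker is listening on localhost:5000 as in the commands above, keeps `/path/to/echo` as a placeholder for the test-registry/echo directory, and sends each root create once rather than "a few times". The helper names (`build_requests`, `create`, `main`) are made up for this example and are not part of the project. The client-side timeout is an addition, so that a frozen worker shows up as an error instead of a hung script.

```python
import json
import urllib.request

# Endpoint taken from the curl commands in this report.
WORKER_URL = "http://localhost:5000/create"
# Placeholder, as in the report: point this at test-registry/echo.
ECHO_PATH = "/path/to/echo"


def build_requests(code_path):
    """Return the four create payloads in the order used in the report."""
    return [
        {"parent": "", "leaf": True, "code": code_path},
        {"parent": "", "leaf": False, "code": code_path},
        # These two fork from sandbox "1" and are the ones that fail.
        {"parent": "1", "leaf": False, "code": code_path},
        {"parent": "1", "leaf": True, "code": code_path},
    ]


def create(payload, url=WORKER_URL, timeout=10.0):
    """POST one payload; the timeout turns a frozen worker into an exception."""
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def main():
    for payload in build_requests(ECHO_PATH):
        try:
            print(payload, "->", create(payload))
        except OSError as err:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            print(payload, "-> failed:", err)
```

Calling `main()` with the worker started as shown above should send the same four requests as the curl commands. The script only reports what the worker returns or that a request timed out; it does not diagnose the freeze.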
@GindaChen
Member Author

Will dig deeper into the code and see what happens.

@kaimast added the bug label Apr 19, 2022