This repository has been archived by the owner on Feb 22, 2020. It is now read-only.

Waiting on channel to be ready #323

Open
jloveric opened this issue Oct 11, 2019 · 6 comments

Comments

@jloveric

I'm running the demo-poems-ir and it seems to be stuck at

I:MyClient:[bas:__i:124]:setting up grpc insecure channel...
I:MyClient:[bas:__i:133]:waiting channel to be ready...

Probably something with my setup. Has anyone else seen this?
Thanks
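
A quick way to tell whether anything is listening on the frontend port at all, before the client blocks on the channel, is to probe the port directly. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper (not part of GNES), and 5566 is simply the frontend port that appears in the logs in this thread:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("127.0.0.1", 5566)` on the machine running the demo returns `False` when nothing accepts connections on that port, which would suggest the services were never started. A `True` result only shows that the port accepts TCP connections, not that the whole pipeline is healthy.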

@davidlenz

davidlenz commented Oct 13, 2019

@jloveric If you ran make clean, did you run make index again before make client_index d=10? I forgot once and got stuck at waiting channel to be ready...


Having similar issues though:

I:MyClient:[bas:__i:128]:setting up grpc insecure channel...
I:MyClient:[bas:__i:137]:waiting channel to be ready...
I:MyClient:[bas:__i:141]:create new stub...
C:MyClient:[bas:__i:146]:gnes client ready at 0.0.0.0:5566!
index [=                   ]  elapsed: 0.0s   speed: 0.0 batch/s

and then it stops there forever.

Here's the traceback for KeyboardInterrupt:

Traceback (most recent call last):
  File "app.py", line 41, in <module>
    MyClient(parser.parse_args())
  File "/usr/local/lib/python3.7/site-packages/gnes/client/cli.py", line 33, in __init__
    self.start()
  File "/usr/local/lib/python3.7/site-packages/gnes/client/cli.py", line 51, in start
    getattr(self, self.args.mode)()
  File "/usr/local/lib/python3.7/site-packages/gnes/client/cli.py", line 68, in index
    batch_size=self.args.batch_size)):
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 388, in __next__
    return self._next()
  File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 373, in _next
    _common.wait(self._state.condition.wait, _response_ready)
  File "/usr/local/lib/python3.7/site-packages/grpc/_common.py", line 140, in wait
    _wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
  File "/usr/local/lib/python3.7/site-packages/grpc/_common.py", line 105, in _wait_once
    wait_fn(timeout=timeout)
  File "/usr/local/lib/python3.7/threading.py", line 300, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt
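
The traceback shows the client blocked in `__next__` of the response iterator, waiting for a response that never arrives. When debugging, it can help to fail fast instead of hanging. The sketch below wraps any blocking iterator with a per-item timeout using only the standard library; `iter_with_timeout` is a hypothetical helper, not part of GNES or gRPC, and it does not cancel the underlying call:

```python
import queue
import threading

_DONE = object()  # sentinel marking normal exhaustion of the source iterator


def iter_with_timeout(iterable, timeout):
    """Yield items from `iterable`; raise TimeoutError if the next item takes longer than `timeout` seconds."""
    q = queue.Queue()

    def pump():
        # Drain the blocking iterator in a background thread.
        try:
            for item in iterable:
                q.put((True, item))
        except Exception as exc:
            q.put((False, exc))      # forward errors to the consumer
        else:
            q.put((False, _DONE))    # signal normal completion

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            ok, value = q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no response within {timeout} s") from None
        if ok:
            yield value
        elif value is _DONE:
            return
        else:
            raise value
```

The background thread is a daemon, so a stalled source does not keep the process alive after the timeout fires.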

I'm running on the DO Docker image, and here's the VM setup:

sudo apt-get update
sudo apt-get -y upgrade
apt-get install make
docker swarm init --advertise-addr < VM IP ADDRESS >
git clone https://github.com/gnes-ai/demo-poems-ir.git
cd demo-poems-ir
make build
make index
make client_index d=10

EDIT: Fixed URL, minimized setup example

@hanxiao
Collaborator

hanxiao commented Oct 14, 2019

hmm, give me some time and let me check. Debugging a Docker env is always challenging.

In the meantime, I'd like to give you a sneak peek at the ongoing work on GNES Flow, which provides a pythonic and intuitive interface for building workflows in GNES. You can get some examples from this unit test. Once a flow is built, it can be exported to Docker Swarm, K8S or even an SVG image in a painless way.

This is not a mature feature yet. Our plan is to remove the GNES compose module and use GNES Flow as the main interface in the tutorial and web UI.

If you have any thoughts on the API design of GNES Flow, feel free to give some feedback.

@hanxiao
Collaborator

hanxiao commented Oct 17, 2019

@jloveric @davidlenz thanks again ❤️ for trying GNES and giving feedback at this early stage. I'd like to introduce you to the new GNES Flow API (available since v0.0.46), which enables a pythonic and intuitive way of building workflows in GNES. As an example, an indexing workflow can be defined simply as:

from gnes.flow import Flow

# read_flowers() is assumed to be a user-defined generator yielding raw image bytes
flow = (Flow(check_version=False, ctrl_with_ipc=True)
        .add_preprocessor(name='prep', yaml_path='yaml/prep.yml', replicas=3)
        .add_encoder(yaml_path='yaml/incep.yml', replicas=6)
        .add_indexer(name='vec_idx', yaml_path='yaml/vec.yml')
        .add_indexer(name='doc_idx', yaml_path='yaml/doc.yml', recv_from='prep')
        # the router merges the two indexer branches, hence num_part=2
        .add_router(name='sync', yaml_path='BaseReduceRouter', num_part=2, recv_from=['vec_idx', 'doc_idx']))

# then use it for indexing
with flow(backend='process') as fl:
    fl.index(bytes_gen=read_flowers(), batch_size=64)

🔰 You can find some resources here to help you get started quickly:

🙇 Give it a try; we welcome your feedback and contributions.

@davidlenz

@hanxiao I think this is a truly amazing project, so you and your team are the ones to be thanked. Keep up the great work.

I was able to reproduce the flower example on a cloud machine using the following setup:

sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get install -y python3-pip
sudo apt-get install -y build-essential libssl-dev libffi-dev python-dev

pip3 install tensorflow==1.12
python3 -m pip install jupyterlab
pip3 install gnes[all]
apt-get install libsndfile-dev -y

git clone https://github.com/gnes-ai/demo-gnes-flow.git
cd demo-gnes-flow
TEST_WORKDIR=/tmp/gnes-flow-demo/
mkdir ${TEST_WORKDIR}
curl http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz --output inception_v4_2016_09_09.tar.gz
tar -xvf inception_v4_2016_09_09.tar.gz
mv inception_v4.ckpt ${TEST_WORKDIR}
rm inception_v4_2016_09_09.tar.gz
curl http://www.robots.ox.ac.uk/~vgg/data/flowers/17/17flowers.tgz --output 17flowers.tgz
mv 17flowers.tgz ${TEST_WORKDIR}
jupyter lab --allow-root

Note that I had to install libsndfile manually; otherwise I got an error when indexing the flowers.

How can I query the cloud machine from my local machine? I couldn't figure this out.

@AlexanderKUA

AlexanderKUA commented Dec 4, 2019

Hello,

I have the same issue with demo-poems-ir. I'd like to reproduce this example because it's connected to my current task.
I found the following issue with the router service:

 I:RouterService:[bas:_ho:396]:a message in type: response with route: FrontendService▸SentSplitPreprocessor▸DictIndexer▸BaseReduceRouter
 W:MessageHandler:[bas:get:237]:cant find handler for message type: <class 'gnes_pb2.IndexResponse'>, fall back to the default handler
 I:MessageHandler:[bas:cal:255]:handling message with _handler_default
 I:RouterService:[hel:__e:301]:handling message takes 0.001 secs

It looks like the result should be delivered to the frontend (not to _handler_default).

Should I fix it? How can I fix it?

For your information, here are the frontend logs:

I:FrontendService:[fro:__i: 24]:start a frontend with 10 workers
C:FrontendService:[fro:__e: 33]:listening at: 0.0.0.0:5566
I:ZmqClient:[bas:__i: 66]:current libzmq version is 4.3.2,  pyzmq version is 18.1.0
I:ZmqClient:[bas:__i: 78]:input 0.0.0.0:57908	 output 0.0.0.0:53463
I:FrontendService:[fro:Str:123]:receive request: 0
I:FrontendService:[fro:Str:126]:send new request into 0 appending tasks
I:FrontendService:[fro:Str:123]:receive request: 1
I:FrontendService:[fro:Str:126]:send new request into 1 appending tasks
I:FrontendService:[fro:Str:123]:receive request: 2
I:FrontendService:[fro:Str:126]:send new request into 2 appending tasks
I:FrontendService:[fro:Str:130]:all requests are sent, waiting for the responses...
I:FrontendService:[fro:get:108]:waiting for 3 responses ...

Thanks in advance

@AlexanderKUA

It looks like the problem was in

Router40:
    image: gnes/gnes:latest-alpine
    command: route --num_part 2 --port_in 57909 --socket_in PULL_BIND --port_out 57908 --socket_out
      PUSH_CONNECT --host_out Frontend00 --yaml_path BaseReduceRouter

This argument is either not required or might be wrong:
--num_part 2
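
One way to read this: if a reduce router holds each request until it has collected num_part partial messages, then a router started with --num_part 2 but fed by a single upstream path (as in the route FrontendService▸SentSplitPreprocessor▸DictIndexer▸BaseReduceRouter logged earlier) would never forward anything, and the frontend would keep waiting. The toy model below only illustrates that reasoning. `ToyReduceRouter` is hypothetical, and the buffering behaviour is an assumption, not GNES's actual BaseReduceRouter code:

```python
from collections import defaultdict


class ToyReduceRouter:
    """Toy model only: buffers parts per request and forwards once `num_part` have arrived."""

    def __init__(self, num_part):
        self.num_part = num_part
        self.pending = defaultdict(list)  # request_id -> parts received so far

    def receive(self, request_id, part):
        """Return the collected parts when the request is complete, else None (nothing forwarded)."""
        self.pending[request_id].append(part)
        if len(self.pending[request_id]) < self.num_part:
            return None  # still waiting for more parts
        return self.pending.pop(request_id)


# With a single upstream path, num_part=2 never completes a request...
stuck = ToyReduceRouter(num_part=2)
stuck_result = stuck.receive(0, "DictIndexer")

# ...while num_part=1 forwards immediately.
ok = ToyReduceRouter(num_part=1)
ok_result = ok.receive(0, "DictIndexer")
```

Under this assumption, `stuck_result` is `None` and `ok_result` is `["DictIndexer"]`, which would match the frontend log stopping at "waiting for 3 responses".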

The next issue is with client_query: no results are displayed.
