
Fix retries during inference #523

Draft · borzunov wants to merge 1 commit into main from fix-inference-retry
Conversation

borzunov (Collaborator) commented on Sep 28, 2023

#331 introduced a bug during inference retries that caused this:

[INFO] Route found: 0:18 via …1EBzGt
[WARN] [petals.client.inference_session.step:327] Caught exception when running inference via RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) (retry in 2 sec): AssertionError("Broken input cache: span=RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) shape=torch.Size([1, 579, 8192]) position=0 n_input_tokens=1")
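
For context on why the assertion fires: the replacement server session created during the retry starts from position 0, while the client has already processed hundreds of tokens. Below is a minimal sketch of the broken invariant, using hypothetical names rather than the actual Petals classes:

# Minimal sketch (hypothetical names, not the actual Petals classes) of the
# invariant that fails above: each per-server session tracks how many tokens
# its cache holds, and the client checks that this matches its own global
# position before sending the next token.

class SpanSession:
    def __init__(self):
        self.history = None  # inputs cached for this server's span
        self.position = 0    # tokens this server has already processed


class ClientSession:
    def __init__(self):
        self._position = 579  # e.g., 579 tokens processed before the failure

    def step(self, session: SpanSession, n_input_tokens: int = 1):
        # A replacement session created during a retry still has position=0,
        # so this raises exactly the AssertionError shown in the log above.
        assert session.position == self._position, (
            f"Broken input cache: position={session.position} "
            f"n_input_tokens={n_input_tokens}"
        )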

@@ -359,6 +358,10 @@ def _update_sequence(self, server_idx: int, block_idx: int, attempt_no: int) ->
         # If there is a failed span, this code replaces it, otherwise it just adds new ones
         if server_idx < n_prev_spans:
             updated_sessions[0].history = self._server_sessions[server_idx].history
+            updated_sessions[0].position = self._position
borzunov (Collaborator, Author) commented on this line:

This is the fix.
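
In other words, here is a simplified paraphrase of the repaired branch (surrounding code abridged, not a verbatim excerpt), showing that the replacement session must inherit both pieces of state from the failed one:

# Simplified paraphrase of the repaired branch above (not verbatim):
# when a span fails mid-inference, its replacement must inherit both
# the cached inputs AND the current position.
if server_idx < n_prev_spans:
    # Carry over the inputs cached for the failed span...
    updated_sessions[0].history = self._server_sessions[server_idx].history
    # ...and (the fix) the client's current position, so the new server
    # resumes where generation left off instead of restarting at 0.
    updated_sessions[0].position = self._position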

borzunov marked this pull request as draft on September 28, 2023 at 17:33.
borzunov force-pushed the fix-inference-retry branch 2 times, most recently from 160211a to 567e34b, on September 28, 2023 at 18:40.
borzunov force-pushed the fix-inference-retry branch 2 times, most recently from ef59cf6 to 3f70ab6, on October 23, 2023 at 16:50.