Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HITL - Rearrange session handling #1965

Merged
merged 5 commits into from
May 20, 2024
Merged

HITL - Rearrange session handling #1965

merged 5 commits into from
May 20, 2024

Conversation

0mdc
Copy link
Contributor

@0mdc 0mdc commented May 19, 2024

Motivation and Context

This changeset adds session handling in rearrange_v2.

A session is defined as a sequence of episodes to be done by a fixed set of users.
The session is the unit of work that is collected during a HITL experiment.

Additions:

  • Session management.
  • Episode loading from connection parameters.
  • Load screen.
  • Start screen.
  • End session screen.
state_machine_2-2024-05-19_09.44.03.mp4

How it works

The state machine is extended as such:

Reset (initial state):

  • Kicks all users.
  • Previous session is deleted.
  • Changes to Lobby when all users are disconnected.

Lobby:

  • Blank screen.
  • New connections are only allowed during this state.
  • Changes to Start Session when max_user_count are connected.

Start Session:

  • Parses the episodes field from the users connection parameters.
    • This comes from the URL parameters in WebGL, or a config file on Android.
  • The Session object is created.
  • Switches to Load Episode when successful. Cancels session otherwise.

Load Episode:

  • Increment the session episode index.
  • If the next episode exists:
    • Load it.
    • Show a loading screen
    • Switch to Start Screen
  • If the last episode is completed, change to End Session.
  • If any user disconnects, cancel the session.

Start Screen:

  • Show a top-down view of the scene with a "Start" button.
  • Change to RearrangeV2 when all users pressed the button.
  • If any user disconnects or the state times out, cancel the session.

RearrangeV2:

  • HITL application.
  • Change to Load Episode when all users signaled the episode as completed.
  • If any user disconnects or is AFK, cancel the session.

End Session:

  • Show a "Session ended" screen.
  • [Upcoming PR] Upload the session data to S3.
  • Switch to Reset when upload is finished.

How Has This Been Tested

Tested in multiplayer HITL application.

Types of changes

  • [Development]

Checklist

  • My code follows the code style of this project.
  • I have updated the documentation if required.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes if required.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label May 19, 2024
Copy link
Contributor Author

@0mdc 0mdc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for reviewers.

destination_mask=Mask.from_index(user_index),
)

# Server-only: Press numeric keys to start episode on behalf of users.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for easing local testing.

Attempt to get episodes from client connection parameters.
Episode IDs are indices within the episode sets.

Format: {lower_bound_inclusive}-{upper_bound_exclusive} (e.g. "100-110").
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the current episode sets we use have IDs that match their index within the episode set (e.g. the ID of the first episode is a "0" string).
This approach is error-prone and sloppy.

This will change in the future so that a list of episode IDs can be specified instead of a range of episode indices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where will these sets need to come from? Is this an artifact of supporting how episode-sets were created in the first place and stored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an artifact of how episode sets were created.

Where will these sets need to come from?

They would be generated by the episode generator, possibly in json format, and fed to whichever system is responsible for generating HITL tasks (e.g. Psiturk).

Right now, we are limited to group episodes contiguously to satisfy the range-based system. This is error-prone and inflexible. For example, if we want N sets to start with a "practice episode", we need to copy the tutorial N times so that episode ranges stay contiguous.

examples/hitl/rearrange_v2/util.py Show resolved Hide resolved
# Wait for clients to signal that content finished loading on their end.
# HACK: The server isn't immediately aware that clients are loading. For now, we simply skip some frames.
# TODO: Use the keyframe ID from 'ClientMessageManager.set_server_keyframe_id()' to find the when the loading state is up-to-date.
if self._frame_number > 20:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hack will be addressed in a future PR.

The correct approach to handle this is to look at the frame number when the new scene keyframe is sent, and wait for all clients to echo back this frame number.

This feature is supported but coupled with other systems. See ClientHelper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that both clients are looking at the same state, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that the server doesn't progress to the start screen until all clients finished downloading the content.

@0mdc 0mdc mentioned this pull request May 20, 2024
5 tasks
Copy link
Contributor

@zephirefaith zephirefaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits for readability, let's add explicit issues with BE flag for the few places I flagged. Generally this looks good.

Thanks for your hard-work on this!

Comment on lines +50 to +55
def create_app_state_start_screen(
app_service: AppService, app_data: AppData, session: Session
) -> AppStateBase:
from app_state_start_screen import AppStateStartScreen

return AppStateStartScreen(app_service, app_data, session)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this way of organizing and calling code.
There are a few misdirections at the beginning, but once I understand how code is laid out, easier to follow from there on.

Comment on lines 301 to +312
def get_next_state(self) -> Optional[AppStateBase]:
# If cancelled, skip upload and clean-up.
if self._cancel or self._is_episode_finished():
return create_app_state_reset(self._app_service, self._app_data)
if self._cancel:
return create_app_state_cancel_session(
self._app_service,
self._app_data,
self._session,
"User disconnected",
)
elif self._is_episode_finished():
return create_app_state_load_episode(
self._app_service, self._app_data, self._session
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so rearrange-state exclusively feeds into either loading of next episode or canceling sessions.

Does it matter if the session is canceled due to completion or inaction? Probably handled in app_state_end_session?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these cases, users are kicked, which causes "user disconnected" error.

Not ideal, but could be improved with some refactoring.

examples/hitl/rearrange_v2/util.py Show resolved Hide resolved
examples/hitl/rearrange_v2/util.py Show resolved Hide resolved
episode_ids: List[str],
connection_records: Dict[int, ConnectionRecord],
):
self.success = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this per episode? What does success mean for a session?
If not for session, suggest renaming for clarity, e.g. self.last_episode_success

Comment on lines 54 to 63
# If all users pressed the "Start" button, begin the session.
ready_to_start = True
for user_ready in self._ready_to_start:
ready_to_start &= user_ready
if ready_to_start or SKIP_START_SCREEN:
return create_app_state_rearrange(
self._app_service, self._app_data, self._session
)

return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this keeps getting called in a input--get-next-state--execute loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this is handled by the state machine.

examples/hitl/rearrange_v2/app_state_start_screen.py Outdated Show resolved Hide resolved
examples/hitl/rearrange_v2/app_state_start_screen.py Outdated Show resolved Hide resolved
examples/hitl/rearrange_v2/app_state_start_screen.py Outdated Show resolved Hide resolved
# Wait for clients to signal that content finished loading on their end.
# HACK: The server isn't immediately aware that clients are loading. For now, we simply skip some frames.
# TODO: Use the keyframe ID from 'ClientMessageManager.set_server_keyframe_id()' to find the when the loading state is up-to-date.
if self._frame_number > 20:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that both clients are looking at the same state, right?

@0mdc 0mdc merged commit 11ec7bc into main May 20, 2024
3 of 4 checks passed
@0mdc 0mdc deleted the 0mdc/hitl_session branch May 20, 2024 18:58
0mdc added a commit that referenced this pull request May 23, 2024
* Add session management.

* Formatting changes.

* Add clarifications to episode resolution.

* Document temporary hack to check for client-side loading status.

* Review pass - variable renaming and camera matrix chaching.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants