Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Announcement: Human face metadata entry to WebCodecs VideoFrame Metadata Registry #607

Open
ttoivone opened this issue Nov 15, 2022 · 21 comments · May be fixed by #787
Open

Announcement: Human face metadata entry to WebCodecs VideoFrame Metadata Registry #607

ttoivone opened this issue Nov 15, 2022 · 21 comments · May be fixed by #787
Labels
CR Blocking Needs to resolved for Candidate Recommendation PR exists A PR has been submitted that addresses this issue

Comments

@ttoivone
Copy link

As per requirement of the VideoFrame Metadata Registry, this issue has been filed to allow discussion about the face metadata on being added into VideoFrameMetadata dictionary. The proposal is in the Media capture extensions PR #78 and the face detection explainer shall be updated.

To prevent discussions in multiple places, I do suggest that the discussion about the metadata is actually done in Media Capture Extensions Media capture extensions PR #78 where the explainer and spec will be updated, instead of this issue.

@aboba
Copy link
Collaborator

aboba commented Feb 2, 2023

I'd prefer a more general approach, providing segmentation metadata rather than just metadata specific to face detection. The reason is that encoders can potentially use segmentation metadata to focus their efforts. So if we created face detection metadata, we might also need background blur metadata, whereas if we specify segmentation metadata, we could potentially use that for both face detection and background blur, as well as generic segmentation.

Related: w3c/mediacapture-extensions#79

@aboba
Copy link
Collaborator

aboba commented Mar 1, 2023

FYI, the metadata proposal has been modified to make it more generic so as to make it more useful for segmentation.

This in turn raises the question as to how a WebCodecs Encoder could take advantage of the segmentation information (e.g. per-macroblock QP).

@aboba aboba added the PR exists A PR has been submitted that addresses this issue label Mar 1, 2023
@chrisn
Copy link
Member

chrisn commented Aug 11, 2023

One concern I have is that this adds face detection specific attributes under a top level segments entry, which is a rather generic name. It's still specific because of the SegmentType enum.

So I'd prefer to either rename segments to something indicating face detection as the source of the metadata, or relax the requirements to make SegmentType optional.

Aside from that, I'd be happy to review a PR that adds to the registry.

@youennf
Copy link
Contributor

youennf commented Sep 11, 2023

Discussed at TPAC:

  • segments naming or SegmentType extension point should be discussed in WebRTC WG.
  • above issue is not a blocker for the registry PR.

@aboba aboba added the CR Blocking Needs to resolved for Candidate Recommendation label Apr 17, 2024
@aboba aboba removed the CR Blocking Needs to resolved for Candidate Recommendation label May 8, 2024
@Djuffin
Copy link
Contributor

Djuffin commented May 14, 2024

I've just noticed that this is already a part of Media Capture and Streams Extensions editor draft

w3c/mediacapture-extensions#78

@aboba
Copy link
Collaborator

aboba commented May 14, 2024

@Djuffin If a registry entry would not affect the behavior of WebCodecs or any other MEDIA WG specification, I am wondering why defining the metadata entry in Media Capture and Streams Extensions isn't sufficient. Are we concerned about naming conflicts? That could be addressed without a registry. Or is the concern about poorly designed extensions? If the entries don't affect the behavior of WebCodecs or other MEDIA WG specs, why burden the MEDIA WG with the need to review other WG documents? The lack of convincing motivation could explain why no registry entries have been approved since the registry was created.

@padenot
Copy link
Collaborator

padenot commented May 14, 2024

Monkey patching of other specs is bad practice and frowned upon, they should open a PR here and we should merge their stuff. Let's revert it from their spec.

@chrisn
Copy link
Member

chrisn commented May 14, 2024

@padenot Do you think we should continue with the registry approach here or merge into WebCodecs spec?

@padenot
Copy link
Collaborator

padenot commented May 14, 2024

The registry has all the right characteristics I think.

@chrisn
Copy link
Member

chrisn commented May 14, 2024

AIUI the plan we agreed between WebRTC and Media WGs was to add the face detection metadata as a registry entry, so I'm not clear why that didn't happen and it was added directly to Media Capture and Streams Extensions.

@alvestrand
Copy link

The experience with use of registries is that when review is required, the effective use of registries requires speedy review to acceptance or (justified) rejection.

If registries don't have effective reviewers attached, people will search for other solutions to get their extensions published while they wait for the review to complete.

@youennf
Copy link
Contributor

youennf commented May 15, 2024

I'm not clear why that didn't happen and it was added directly to Media Capture and Streams Extensions.

AIUI, all we seem to be missing is the PR on the metadata registry to add this entry, which would point to the Media Capture and Streams Extensions section.

We could consider moving the metadata definition in a separate document but I do not think this is a blocker.

@padenot
Copy link
Collaborator

padenot commented May 15, 2024

Review times aren't a problem in Web Codecs, reviewers are well known and the issues are triaged every week or even day. Worst case it's fairly trivial to ping one of us to get something merged.

Please do not monkey-patch specifications, and instead do the correct thing, which is to define the interface in a centralized location, which is one of the purpose of a registry. You can then link to it from other specifications, and all is good, and it's easy and clear.

@aboba
Copy link
Collaborator

aboba commented May 15, 2024

@alvestrand One reason that the review was not done is that it is not clear who was to do it or what they were supposed to have done. There is no "Designated Expert" assigned to review VideoFrame registry entries, nor are there guidelines for the review that such an Expert would do. If the registry policy was "WG Consensus" then a WG Consensus call should have been done in the MEDIA WG, but that didn't happen either, and in any case it's not clear what the MEDIA WG would have been asked to come to consensus about.

@youennf
Copy link
Contributor

youennf commented May 15, 2024

Please do not monkey-patch specifications

There is no monkey patching AFAICS.
The only interaction place is the use of partial WebIDL dictionary, which seems perfectly fine.

to define the interface in a centralized location, which is one of the purpose of a registry.

Ah, I do not think we have the same understanding of what the registry is then.
I see the registry as a single place to get the list of all members of the metadata dictionary, just like the webcodecs codec registry lists the available codecs defined for WebCodecs.

The actual definition of the entry, including its interface, can be in another document, which does not necessarily have to be managed by the Media WG.

AISI, what the Media WG adds in terms of coordination is related to the top level member names.

@padenot
Copy link
Collaborator

padenot commented May 15, 2024

It's perfectly conceivable that a VideoFrame that holds human face metadata is sent to an encoder and this encoder uses this information to produce a better encoded video, as mentioned here, so this can clearly change the behaviour of Web Codecs objects.

https://www.w3.org/2023/Process-20231103/#registries has some info on what registries are and are for. Of the list in the introduction, that outlines the purpose of registries:

  • non-collision: isn't guaranteed
  • non-duplication: also isn't guaranteed
  • information: clearly also that didn't happen because Eugene found weeks later that this had been merged
  • submission: fairly trivial, just a PR, since this was discussed with the right stakeholders already
  • consensus: that was done correctly

I'm fine with your PR, at least now it's possible for someone that implements the Web Codecs API to know of the existence of this feature, which is mostly what I care about here, and it's also not possible for someone else to use the same name. I'd say someone that want to extend this interface would click on all members to understand what they are about, so non-duplication is also guaranteed.

It would be nicer to have all the interfaces in a centralized location, since this is what is done for all the other registries in Web Codecs, it really isn't hard. This was discussed by the right people, including a Web Codecs editor in the other PR, everybody agreed.

@chrisn
Copy link
Member

chrisn commented May 15, 2024

Notes from 14 May 2024 Media WG meeting here.

@chrisn
Copy link
Member

chrisn commented May 15, 2024

The key part of the requirments we came up with at the time were:

3 Each metadata entry must be defined by a W3C specification that has reached consensus in the originating Working Group.

5 A candidate registration entry must be announced by filing an issue in the WebCodecs GitHub issue tracker so they can be discussed and evaluated for compliance before being added to the registry. If the Media Working Group reaches consensus to accept the candidate, a pull request should be drafted (either by editors or by the party requesting the candidate registration) to register the candidate. The registry editors will review and merge the pull request.

Item 3 enables WebRTC WG to define its own spec as a registry entry, rather than requiring them all to be Media WG specs, or live in this repository even. But we can change that if it makes sense to, but I worry about the complexity of having multiple WGs managing different specs in this repo.

Item 5 means the Media WG has to agree to add the registry entry, which I hope wouldn't delay things coming from WebRTC WG by too much, but it would give the opportunity for (at least partially) independent review of proposed additions.

We discussed in the meeting yesterday about evolving the requirements as we learn, so if we need to write guidance on what's expected from a metadata spec we can.

On monkey-patching, do we want all possible registry entries to be in a single spec? I thought the partial dictionary approach was a good one, allowing different metadata use cases/applications to have their own spec.

@youennf
Copy link
Contributor

youennf commented May 16, 2024

I'm fine with your PR, at least now it's possible for someone that implements the Web Codecs API to know of the existence of this feature

OK, let's go with this PR in the short term then, this seems inline with the current requirements quoted by @chrisn.
We can always improve upon it.

It would be nicer to have all the interfaces in a centralized location, since this is what is done for all the other registries in Web Codecs, it really isn't hard.

The codecs registry is just a list of pointers to other specs, the current model is following that approach.
If we were to define all metadata in a single location, it would be more a spec than a registry.
This is fine as well, but not what I think we agreed.

It's perfectly conceivable that a VideoFrame that holds human face metadata is sent to an encoder and this encoder uses this information to produce a better encoded video, as mentioned here, so this can clearly change the behaviour of Web Codecs objects.

Sure, but it would probably not change the WebCodecs spec algorithms, so no need for monkey patching.
If a metadata definition requires changes to WebCodecs algorithms, maybe it should not be a metadata and be an API directly at VideoFrame level.

To be noted also that the segment metadata is also related to other getUserMedia APIs hence why its current location is not a bad choice either.

@chrisn
Copy link
Member

chrisn commented May 16, 2024

What level of interest or implementer commitment is needed to add a feature to Media Capture and Streams Extensions? And similar for how the face metadata spec might "graduate" to Media Capture and Streams (or possibly it's own standalone spec?) I'm wondering if we want the registry requirement to require a /TR spec.

@jan-ivar
Copy link
Member

It's the same as for mediacapture-main. New features are merely triaged away from mediacapture-main which is preparing for REC (which unfortunately is taking longer than anticipated).

This is modeled on webrtc-extensions which existed ahead of its present post-REC role, as a way to get webrtc-pc to REC.

If face metadata were to gain two implementations (pre- or post-REC), we'd move it to mediacapture-main, similar to w3c/webrtc-pc#2953.

How does the Media WG manage updates to EME and the two implementation requirement?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CR Blocking Needs to resolved for Candidate Recommendation PR exists A PR has been submitted that addresses this issue
Projects
None yet
8 participants