Patch Subset method for Incremental Font Transfer, Explained

Author

Chris Lilley, W3C

Participate

IFT Issue tracker
IFT Patch-Subset Specification
IFT Range Request Specification

Introduction

Web Fonts allow web pages to download and use fonts on demand, without the fonts needing to be installed.

Incremental transfer allows clients to load only the portions of the font they actually need, which speeds up font loads and reduces the amount of data transferred. A font can be loaded over multiple requests, where each request incrementally adds data.

Motivating Use Cases

WebFont usage is high globally: around 75% of top-level web pages use WebFonts.

However, WebFonts are currently used primarily with simple writing systems such as Latin, Greek and Cyrillic, where the median WOFF2 size is 8.3kB.

For fonts with many glyphs (such as those typically used for Chinese and Japanese), even with the compression provided by WOFF 1 or 2, download sizes are still far too large, with a median WOFF2 size of 1.8MB. Thus, usage of Web Fonts in China and Japan is close to zero.

For languages with a small set of glyphs, static font subsetting is widely deployed.

However, for those languages with complex shaping requirements, static subsetting gives small files (median WOFF2 size of 93.5kB) but is known to sometimes produce malformed, illegible text.

Static subsetting fails when there are complex inter-relationships between different OpenType™ tables, or when characters are shared between multiple writing systems but behave differently in each one.

Non-goals

Changes to the Open Font Format or OpenType specifications are out of scope.

Evaluation Report

A 2020 Evaluation Report simulated and evaluated solutions that would allow WebFonts to be used where slow networks, very large fonts, or complex subsetting requirements currently preclude their use.

Note: At that time, the technology was called Progressive Font Enrichment (PFE). The name has since been changed to Incremental Font Transfer (IFT).

Performance was simulated on different speeds of network (from fast wired to 2G), for three classes of writing system (simple alphabetic, complex shaping, and large) and for two methods (Range Request and Patch Subset, see below).

Both size (total bytes transferred, including overhead) and network cost (impact of latency on time to render) were considered.
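To make the two measures concrete, here is a minimal sketch assuming a deliberately simple cost model: round trips multiplied by latency, plus bytes divided by bandwidth. The function name, the link parameters, and the incremental figures (150 kB over three requests) are illustrative assumptions, not the report's methodology or results. Only the 1.8MB median comes from this explainer.

```python
def load_time_seconds(total_bytes, round_trips, rtt_s, bandwidth_bytes_per_s):
    """Assumed model: each round trip pays the latency once, then bytes stream."""
    return round_trips * rtt_s + total_bytes / bandwidth_bytes_per_s


# Hypothetical slow link: 0.5 s round-trip time, 50 kB/s throughput.
RTT_S = 0.5
BANDWIDTH = 50_000

# Whole font in one request (median large-font WOFF2 size from this explainer).
whole_font = load_time_seconds(1_800_000, 1, RTT_S, BANDWIDTH)

# Incremental load: fewer bytes, but more round trips (both numbers assumed).
incremental = load_time_seconds(150_000, 3, RTT_S, BANDWIDTH)
```

Under this model, extra round trips only pay off when the bytes saved outweigh the added latency, which is why both size and network cost have to be looked at together.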

Range Request vs. Patch Subset

The Patch Subset method requires the server to respond to a PatchRequest by validating the request, computing a binary patch between the current, subsetted font on the client and the desired subset of the original font, and then sending the patch, which the client applies to produce a new, enlarged subset font.

It therefore requires new server capabilities, in addition to client changes.
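The request, patch and apply cycle described above can be sketched as follows. This is a toy model under stated assumptions: a "font" is a plain mapping from code point to glyph bytes, and the "patch" is just the set of missing entries. The real method computes a binary patch over font data. All names here are hypothetical and are not taken from the specification.

```python
# Toy "original font" held by the server: code point -> glyph data.
ORIGINAL_FONT = {cp: f"glyph-{cp:04X}".encode() for cp in range(0x4E00, 0x4E20)}


def server_handle_patch_request(have, need):
    """Validate the request, then return only what the client lacks."""
    known = set(ORIGINAL_FONT)
    have, need = set(have), set(need)
    if not have <= known:
        raise ValueError("client claims code points the font does not contain")
    wanted = (need & known) - have
    return {cp: ORIGINAL_FONT[cp] for cp in sorted(wanted)}


def client_apply_patch(subset, patch):
    """Produce a new, enlarged subset; the old subset is left untouched."""
    enlarged = dict(subset)
    enlarged.update(patch)
    return enlarged


def load_text(subset, text):
    """One round trip: ask for the code points in `text`, apply the reply."""
    need = {ord(ch) for ch in text}
    patch = server_handle_patch_request(subset.keys(), need)
    return client_apply_patch(subset, patch)
```

Calling `load_text` repeatedly with new text grows the client's subset a little at a time, and each reply carries only data the client does not already have.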

The Range Request method relies on the existing HTTP Range Request functionality and therefore can be used with any HTTP server. For best efficiency, the font should be re-ordered before upload to the server. The client still needs to be updated to support this method.
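As a rough illustration of why re-ordering helps, the sketch below builds an HTTP `Range` header value from the byte spans of the glyphs a client needs. The function name and the (offset, length) input format are assumptions for illustration. Only the `bytes=first-last` syntax with inclusive offsets is standard HTTP.

```python
def range_header(spans):
    """Merge (offset, length) byte spans and format them as a Range value."""
    merged = []
    for start, length in sorted(spans):
        end = start + length - 1  # HTTP byte ranges are inclusive
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or directly adjacent: extend the previous range.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return "bytes=" + ",".join(f"{s}-{e}" for s, e in merged)


# Glyphs stored next to each other collapse into a single range...
adjacent = range_header([(0, 100), (100, 50)])
# ...while glyphs scattered through the file need one range each.
scattered = range_header([(0, 10), (500, 10), (1000, 10)])
```

When glyphs that are commonly used together sit next to each other in the file, the client can fetch them with fewer, larger ranges.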

Self-hosting of fonts remains popular and is likely to grow due to privacy concerns over centralized hosting services. Thus, a method that does not require a specialized server is attractive. At the same time, a method that offers no benefit or makes performance much worse is of no use, regardless of ease of deployment.

Thus the IFT specification (and this explainer) focusses on the Patch Subset method, and gives a way to negotiate a method.

Progress on the Range Request method is slower, with more issues, and it is currently in a separate specification.

Detailed design discussion

Why use two methods

Early review by the IETF HTTP WG raised the question of why we need two different methods, rather than just picking the best one. The main issues raised were:

Why do we have two methods
Explain more clearly why we have two methods and what the trade-offs are.

As a result the spec now clearly explains the benefits and trade-offs. The issues were closed to the satisfaction of the commenter.

The overhead of doing method negotiation was also discussed:

Method negotiation has potential time wastes in it

We were able to eliminate that overhead by removing the unneeded PatchRequest message when initiating a range request session.

Why use a URL query parameter

The FPWD used URL query parameters, because we wanted to avoid multiple round trips before getting the font data, and because on the first request the client doesn't know which methods the server supports. This requires sending some binary data.

This was seen as problematic by the HTTP WG, because it impinges on a server's authority over its own URLs.

Query parameters
Negotiation algorithm is not ideal

The HTTP WG introduced us to the HTTP QUERY method, which seemed like a better way to achieve the same result. There is an open issue, as QUERY is still a draft with limited real-world deployment:

Add QUERY as a HTTP method type used for patch subset.

We solved this in two ways. First, we introduced three CSS font-tech keywords: incremental-patch, incremental-range and incremental-auto. Second, we use a Font-Patch-Request HTTP header for the initial request (thus avoiding the overhead of a CORS preflight request). The binary data is compactly encoded in CBOR. Subsequent requests can use HTTP POST.

These changes resolved the issues to the satisfaction of the commenters, and we removed the query parameter.
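To give a feel for how compact a CBOR-encoded request can be, here is a minimal sketch with a hand-written encoder that covers only unsigned integers, arrays and maps. The message layout (key 0 for code points the client has, key 1 for code points it needs) and the base64 wrapping for the header value are assumptions made for illustration. The specification defines the actual fields and encoding.

```python
import base64
import struct


def _head(major, n):
    """CBOR initial byte plus argument for a given major type (RFC 8949)."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if n < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | info]) + struct.pack(fmt, n)
    raise ValueError("integer too large for CBOR")


def cbor(value):
    """Encode non-negative ints, lists and dicts; nothing else is supported."""
    if isinstance(value, int) and value >= 0:
        return _head(0, value)
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(cbor(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(cbor(k) + cbor(v) for k, v in value.items())
        return _head(5, len(value)) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


def font_patch_request_header(have, need):
    """Hypothetical header value: base64 of a two-field CBOR map."""
    payload = {0: sorted(have), 1: sorted(need)}  # keys 0 and 1 are assumed
    return base64.b64encode(cbor(payload)).decode("ascii")


# Example: the client has two code points and needs one more.
value = font_patch_request_header({0x4E00, 0x4E01}, {0x4E03})
```

In this sketch the whole request occupies 14 bytes before base64 encoding, which is small enough to fit comfortably in a header on the initial request.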

Privacy concern: snooping on the user

The Privacy IG raised a concern:

Proposal would allow pages to learn how the user is interacting with the site

There was initially some confusion over terminology such as "third party", and also perhaps a lack of appreciation that HTTPS (which is required for IFT) prevents person-in-the-middle attacks, or that information is required not to leak across origins.

Require that incrementally-loaded fonts not be preserved nor exposed to other origins

Discussion then centered around what an (assumed malicious) IFT font server could learn about the user. For example, in languages with very large character sets, it might be possible to infer the type of content on the web page, by looking for unusual characters whose use tends to be domain specific.

Discussion is ongoing, and the current state of our understanding is in the specification as Content inference from character set and also in Simulating Impact of Noise on Privacy.
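The inference concern, and the general idea of adding noise to counter it, can be sketched as below. This is an illustration only: the hint table, the topic labels and both function names are hypothetical, and the padding strategy is an assumed one rather than the approach analysed in the specification or the simulation.

```python
import random

# Hypothetical table a malicious server might keep: rare code point -> topic.
DOMAIN_HINTS = {0x764C: "medical", 0x8A34: "legal"}


def infer_topics(requested):
    """What the server could guess from the code points in one request."""
    return {DOMAIN_HINTS[cp] for cp in requested if cp in DOMAIN_HINTS}


def add_noise(requested, candidates, extra, rng):
    """Pad a request with up to `extra` code points the page does not use."""
    pool = sorted(set(candidates) - set(requested))
    decoys = rng.sample(pool, min(extra, len(pool)))
    return set(requested) | set(decoys)


rng = random.Random(0)
page_needs = {0x4E00, 0x764C}  # one common and one domain-specific code point

# Without noise, the request points at a single topic.
plain_guess = infer_topics(page_needs)

# With decoys drawn from the hint table, several topics look equally likely.
noisy_request = add_noise(page_needs, DOMAIN_HINTS, 2, rng)
noisy_guess = infer_topics(noisy_request)
```

The trade-off is that every decoy code point adds bytes to the response, so more privacy costs more data transfer.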

Testing

We are working on tests, even at this early stage; the specification marks up each testable assertion.

test suite for servers (in progress)
test suite for clients (not yet)

Stakeholder Feedback / Opposition

Chromium : Positive
WebKit : Positive
Gecko : No signals
Font Vendors (Adobe, Apple, Dalton Maag, Google) : Positive
