Add support for VT paging operations #16615

j4james · 2024-01-29T00:44:52Z

Summary of the Pull Request

This PR adds support for multiple pages in the VT architecture, along
with new operations for moving between those pages: NP (Next Page),
PP (Preceding Page), PPA (Page Position Absolute), PPR (Page
Position Relative), and PPB (Page Position Back).

There's also a new mode, DECPCCM (Page Cursor Coupling Mode), which
determines whether or not the active page is also the visible page, and
a new query sequence, DECRQDE (Request Displayed Extent), which can be
used to query the visible page.
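
As a rough illustration of what an application would send to use these operations, here is a minimal sketch that builds the control sequences as strings. The final bytes and the DECPCCM mode number (64) follow the DEC and ECMA-48 documentation as I understand it; the helper names are hypothetical and are not part of this PR.

```cpp
#include <cassert>
#include <string>

// Control Sequence Introducer: ESC [
const std::string CSI = "\x1b[";

// NP (Next Page): CSI Pn U
std::string nextPage(int n = 1) { return CSI + std::to_string(n) + "U"; }

// PP (Preceding Page): CSI Pn V
std::string precedingPage(int n = 1) { return CSI + std::to_string(n) + "V"; }

// PPA (Page Position Absolute): CSI Pn SP P
std::string pagePositionAbsolute(int page) { return CSI + std::to_string(page) + " P"; }

// PPR (Page Position Relative): CSI Pn SP Q
std::string pagePositionRelative(int n) { return CSI + std::to_string(n) + " Q"; }

// PPB (Page Position Back): CSI Pn SP R
std::string pagePositionBack(int n) { return CSI + std::to_string(n) + " R"; }

// DECPCCM (Page Cursor Coupling Mode): DEC private mode 64, set with h, reset with l
std::string setPageCursorCoupling(bool enabled) { return CSI + "?64" + (enabled ? "h" : "l"); }

// DECRQDE (Request Displayed Extent): CSI " v
std::string requestDisplayedExtent() { return CSI + "\"v"; }
```

With coupling reset (`setPageCursorCoupling(false)`), an application could move to a background page with `pagePositionAbsolute(2)` and draw there without changing what is on screen.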

References and Relevant Issues

When combined with DECCRA (Copy Rectangular Area), which can copy
between pages, you can layer content on top of existing output, and
still restore the original data afterwards. So this could serve as an
alternative solution to #10810.
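
To make the layering idea concrete, here is a hedged sketch of how an application might save a region to a scratch page with DECCRA before drawing over it, and copy it back afterwards. The DECCRA parameter order (source top, left, bottom, right, source page, destination top, left, destination page) is taken from the DEC documentation; the helper names, and the assumption that the visible page is page 1, are mine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A rectangle in 1-based screen coordinates, inclusive on all sides.
struct Rect { int top; int left; int bottom; int right; };

// DECCRA: CSI Pts ; Pls ; Pbs ; Prs ; Pps ; Ptd ; Pld ; Ppd $ v
std::string copyRectangularArea(const Rect& src, int srcPage, int dstTop, int dstLeft, int dstPage)
{
    const int params[] = { src.top, src.left, src.bottom, src.right, srcPage, dstTop, dstLeft, dstPage };
    std::string seq = "\x1b[";
    for (std::size_t i = 0; i < 8; ++i)
    {
        if (i != 0) seq += ';';
        seq += std::to_string(params[i]);
    }
    return seq + "$v";
}

// Save an area of page 1 (assumed to be the visible page) to the same
// position on a scratch page.
std::string saveToPage(const Rect& area, int scratchPage)
{
    return copyRectangularArea(area, 1, area.top, area.left, scratchPage);
}

// Copy the saved area back from the scratch page to page 1.
std::string restoreFromPage(const Rect& area, int scratchPage)
{
    return copyRectangularArea(area, scratchPage, area.top, area.left, 1);
}
```

An application would emit `saveToPage(area, 2)`, draw its overlay inside `area`, and later emit `restoreFromPage(area, 2)` to bring back the original content.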

Detailed Description of the Pull Request / Additional comments

On the original DEC terminals that supported paging, you couldn't have
both paging and scrollback at the same time - only one or the other.
But modern terminals typically allow both, so we support that too.

The way it works, the currently visible page will be attached to the
scrollback, and any content that scrolls off the top will thus be saved.
But the background pages will not have scrollback, so their content is
lost if it scrolls off the top.

And when the screen is resized, only the visible page will be reflowed.
Background pages are not affected by a resize until they become active.
At that point they just receive the traditional style of resize, where
the content is clipped or padded to match the new dimensions.
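
The "traditional" resize for background pages can be pictured with a toy model like the one below. This is only a sketch: it treats a page as a vector of strings, and it assumes clipping and padding happen at the right and bottom edges, which the description above does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy page: one string per row, each row exactly `width` characters wide.
using Page = std::vector<std::string>;

// Resize without reflow: rows beyond the new height are dropped, missing rows
// are added as blanks, and every row is truncated or space-padded on the right.
Page resizeTraditional(Page page, std::size_t width, std::size_t height)
{
    page.resize(height, std::string(width, ' '));
    for (auto& row : page)
    {
        row.resize(width, ' ');
    }
    return page;
}
```

Shrinking a page this way discards the clipped content permanently, which is the main difference from the reflow applied to the visible page.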

I'm not sure this is the best way to handle resizing, but we can always
consider other approaches once people have had a chance to try it out.

Validation Steps Performed

I've added some unit tests covering the new operations, and also done a
lot of manual testing.

PR Checklist

Closes Add support for the VT Paging APIs (NP, PP, PPA) #13892
Tests added/passed

lhecker

Neat! I haven't finished reviewing it yet unfortunately, but here are a couple initial thoughts I had...

src/terminal/adapter/PageManager.hpp

src/terminal/adapter/PageManager.cpp

src/cascadia/TerminalCore/Terminal.hpp

j4james · 2024-03-11T00:15:05Z

I think this is ready for review now. I've been using it for a while, and it works nicely for my use cases, although I'm not sure how much benefit it will be for anyone else.

And I should warn you that it might have some impact on performance, but I'm hoping it isn't significant. I did a test run with the new benchmark tool, and the only tests that looked a lot worse were the 4Ki WriteConsole ones, but the times are in microseconds, so I'm not sure they're really as bad as they look.

lhecker

Honestly, LGTM. I can confirm that there's a consistent performance regression, albeit not a large one. It can probably be fixed by making the code more "data oriented" and by avoiding large struct copies. ("large" for MSVC = anything larger than a pointer, unfortunately.) But I'm not particularly worried about this regression. We have plenty of much lower-hanging fruit we can tackle instead.
I'm currently mostly worried about the overall architecture of the terminal, since that has a much bigger impact on performance than any SIMD, etc., we may ever do, and may also help us keep the code lean.

lhecker · 2024-03-18T17:13:33Z

src/terminal/adapter/PageManager.cpp

+    // If we're changing the visible page, what we do is swap out the current
+    // visible page into its backing buffer, and swap in the new page from the
+    // backing buffer to the main buffer. That way the rest of the system only
+    // ever has to deal with the main buffer.
+    if (makeVisible && _visiblePageNumber != newPageNumber)
+    {
+        const auto& newBuffer = _getBuffer(newPageNumber, pageSize);
+        auto& saveBuffer = _getBuffer(_visiblePageNumber, pageSize);
+        for (auto i = 0; i < pageSize.height; i++)
+        {
+            saveBuffer.GetMutableRowByOffset(i).CopyFrom(visibleBuffer.GetRowByOffset(visibleTop + i));
+        }
+        for (auto i = 0; i < pageSize.height; i++)
+        {
+            visibleBuffer.GetMutableRowByOffset(visibleTop + i).CopyFrom(newBuffer.GetRowByOffset(i));
+        }
+        _visiblePageNumber = newPageNumber;
+        redrawRequired = true;
+    }


More importantly, I'm a little worried about this change. Why don't we swap the visible buffer in the _api here? Is it because of some fundamental reason or did you do it due to our current, rather "intertwined" architecture?

If it's the latter I'm afraid this PR is making things worse. If I'm reading it correctly, the original DECs did not seem to have made a distinction between the "main visible" buffer and other pages. They were all pages, but one of them was simply the visible one. Is my understanding correct? If so, do you think it's feasible to change the architecture in this PR? I'd completely understand if that was too difficult or time consuming.

Why don't we swap the visible buffer in the _api here?

Well the first problem is that anything in the _api needs to be written twice, because the buffer management in Windows Terminal is entirely separate from conhost. In an ideal world we'd have a unified buffer API that covered all the needs of the legacy console API, the XTerm alt buffers, and the VT paging, and which was shared between conhost and WT, but addressing that problem is way more work than I wanted to take on now.

And besides that, I'm not sure there'd be any benefit to it anyway. When swapping the visible page, it wouldn't just be a matter of pointing the renderer at the appropriate buffer - it's kind of a mix. The scrollback (the S in that diagram above) is shared between all pages. It's only the P+V part that changes when you switch the visible page.

the original DECs did not seem to have made a distinction between the "main visible" buffer and other pages.

That is correct, but they didn't support scrollback and paging at the same time, so the S section in the diagram above would not have existed when paging was enabled. But modern terminals do tend to support both, so I thought it would be best if we could handle that too. And if you want to retain scrollback (which is stored in the main buffer) when switching the visible page, there's got to be some form of buffer merging.
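
The buffer merging described here can be illustrated with a toy model. This is a simplified sketch, not the PageManager code: rows are plain strings, and the `ToyTerminal` type and its members are hypothetical. The point it shows is that the scrollback rows in the main buffer stay put while only the visible region is swapped with a backing buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Rows = std::vector<std::string>;

struct ToyTerminal
{
    Rows mainBuffer;             // scrollback rows followed by the visible page rows
    std::size_t visibleTop = 0;  // index of the first visible row in mainBuffer
    std::size_t pageHeight;      // number of rows in a page
    int visiblePage = 1;
    std::map<int, Rows> backing; // backing buffers, created on demand

    explicit ToyTerminal(std::size_t height) : mainBuffer(height), pageHeight(height) {}

    // Returns the backing buffer for a page, creating blank rows if needed.
    // std::map references stay valid when other entries are inserted.
    Rows& backingFor(int page)
    {
        auto& rows = backing[page];
        rows.resize(pageHeight);
        return rows;
    }

    // Swap the visible region out to its backing buffer and swap the new page
    // in. The scrollback rows above visibleTop are never touched.
    void showPage(int newPage)
    {
        if (newPage == visiblePage) return;
        auto& save = backingFor(visiblePage);
        for (std::size_t i = 0; i < pageHeight; ++i)
        {
            save[i] = mainBuffer[visibleTop + i];
        }
        const auto& incoming = backingFor(newPage);
        for (std::size_t i = 0; i < pageHeight; ++i)
        {
            mainBuffer[visibleTop + i] = incoming[i];
        }
        visiblePage = newPage;
    }
};
```

Because the scrollback lives only in `mainBuffer`, whichever page is visible inherits it, which matches the shared-S behavior described above.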

Given what you wrote, and speaking long term, would you be in favor of us splitting the storage of the viewport/page and scrollback into separate buffers? I've been thinking about whether this would allow us to use a more efficient compression scheme for the scrollback, while using a more performant architecture for the viewport (= no run-length compression of attributes, etc.). It may also simplify building an infinite scrollback feature. Now, reading your comment, it seems that this may also simplify the paging implementation...

Yeah, that makes sense to me, at least from a VT point of view. And if that split also enabled us to have the viewport/page storage being a different width to the scrollback buffer, that might be a convenient way to support VT windowing (not an essential feature, but it's something which might be nice to have one day).

The things I think would be most problematic would be the legacy console APIs, and buffer resizing. But if you want to support infinite scrollback, those areas are likely to be complicated regardless of the architecture.

lhecker · 2024-03-18T17:17:26Z

src/terminal/adapter/PageManager.cpp

+ {
+ // Page buffers are created on demand, and are sized to match the active
+ // page dimensions without any scrollback rows.
+ buffer = std::make_unique<TextBuffer>(pageSize, TextAttribute{}, 0, false, _renderer);


(BTW It'd be great if we could move to using shared_ptr<TextBuffer> over time so that we can more safely share buffers with the rendering thread, or simply between components.)

OK. I just thought the unique ptr was more efficient, but I'm happy to change that.

I'm fine with the unique_ptr here and I don't think you need to change it. If we were to change it, I think we should make it consistent across the code base.

The only difference between unique_ptr and shared_ptr is that the latter allocates an 8 byte large control block which contains the strong and weak reference counts. Reference counting itself only occurs on copies and so moving a shared_ptr has the same overhead as a unique_ptr (= none, apart from the struct copy). The allocation of the control block can be avoided by using make_shared. I believe this makes shared_ptr as cost-free as unique_ptr.
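
A small standard-library sketch of the reference-counting behavior mentioned above: moving a `shared_ptr` leaves the count unchanged, while copying increments it. With `std::make_shared`, the object and its control block are typically created in a single allocation. The `Buffer` type is just a placeholder, not the real `TextBuffer`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Placeholder for a buffer type.
struct Buffer { int rows = 0; };

// Moving transfers ownership without touching the reference count.
long useCountAfterMove()
{
    auto a = std::make_shared<Buffer>();
    auto b = std::move(a);
    return b.use_count();
}

// Copying creates a second owner, so the count goes up.
long useCountAfterCopy()
{
    auto a = std::make_shared<Buffer>();
    auto b = a;
    return b.use_count();
}
```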

lhecker · 2024-03-18T17:19:19Z

src/terminal/adapter/PageManager.cpp

+
+Page PageManager::ActivePage() const
+{
+ return Get(_activePageNumber);


You may be able to improve performance quite a bit here by caching the active page and handing out a mutable reference only. (Same for VisiblePage.) Constructing and returning large structs like this is expensive, particularly with MSVC. (GetBufferAndViewport has the same overhead as Get, and we may avoid both overheads this way.)

However, I believe this cannot be done right now since the visibleViewport and isMainBuffer are managed by the _api. If the PageManager was in charge of the active and visible page, as well as the viewport¹ then it could cache information in a robust manner I believe. I suppose this ties into my other comment re: architecture.

Footnotes

"viewports"? I'm not sure if viewport has a plural in the VT world.

However, I believe this cannot be done right now since the visibleViewport and isMainBuffer are managed by the _api.

Yeah, I had hoped that one day we might avoid this issue by having AdaptDispatch register some sort of callback with the api, so it can get notified of buffer and viewport changes, instead of having to query them every time. But we always have to deal with the fact that the legacy console APIs are in control of those aspects of the architecture, so that information has to be passed across the api boundary one way or the other. Again, not something I want to try and solve now.
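
The callback idea could look something like the following sketch. Everything here is hypothetical (`ToyApi`, `ToyDispatch`, and the listener signature are not the real ITerminalApi or AdaptDispatch); it only shows the shape of caching a value and refreshing it on notification instead of querying it on every operation.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

struct Viewport { int top; int height; };

// Stand-in for the api side, which owns the viewport.
class ToyApi
{
public:
    using Listener = std::function<void(const Viewport&)>;

    void onViewportChanged(Listener listener) { _listeners.push_back(std::move(listener)); }

    void setViewport(Viewport v)
    {
        _viewport = v;
        for (auto& listener : _listeners) listener(_viewport);
    }

    // Counts how often the viewport is queried across the api boundary.
    const Viewport& viewport() const { ++queries; return _viewport; }

    mutable int queries = 0;

private:
    Viewport _viewport{ 0, 0 };
    std::vector<Listener> _listeners;
};

// Stand-in for the dispatcher: queries once, then relies on notifications.
// The lambda captures `this`, so instances must not be copied or moved.
class ToyDispatch
{
public:
    explicit ToyDispatch(ToyApi& api) : _cached(api.viewport())
    {
        api.onViewportChanged([this](const Viewport& v) { _cached = v; });
    }

    const Viewport& activeViewport() const { return _cached; }

private:
    Viewport _cached;
};
```

The trade-off is that the api side now has to notify on every change, including those triggered by the legacy console APIs, which is the part described above as hard to solve.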

However, if it's the copying of these page objects that is causing the performance regression, that might still be something I can improve. When I have more time I'll do some experimenting and see if I can find a way to make that any faster.

lhecker · 2024-03-25T15:29:12Z

FWIW I'm still somewhat worried about putting the PageManager into the VT specific code. However, I don't think I could suggest an alternative.
As we've talked about many times before, I would personally greatly prefer if there would not be 2 terminal API implementations, but rather just a single Terminal class. If such a single Terminal class existed, I think the pages should be stored there. I believe that this may long term simplify certain operations that we can't foresee yet.

DHowett · 2024-04-23T20:24:12Z

Yesterday I asked Leonard to explain this to me so I'd be able to review it for 1.21. I'm sorry we let you marinate for 3 months! I'm on this today, for release next week or so.

Thanks so much, James, for everything.

j4james · 2024-04-23T22:44:23Z

@DHowett Honestly I don't mind if you want to leave this for a while. I was hoping there might be a way to improve the performance based on Leonard's suggestion about page caching, but I've been putting off working on that because I was having too much fun actually using the new functionality (e.g. see VT-Rex).

That said, I also wouldn't mind if you chose to merge it as is, but I'm somewhat concerned this might be a feature that nobody else uses besides me, and so for everyone else it would just be an unwanted performance regression.

lhecker · 2024-04-23T23:51:17Z

I don't think you need to worry about the performance too much - The regression is rather small. I forgot how much it was, but it was like -5% or something. The recent cursor caching PR (if you've seen it) alone brought us +20%, and there's a hundred more places like that.

IMO the best thing we can do for perf. is to instead shrink our architecture. If we have fewer interfaces, fewer adapters, fewer classes, and so on, and a more data-driven design overall, we'll probably straight up double our VT performance without even touching the underlying algorithms. In ConPTY for instance, if we ignore VtEngine, the actual text and attribute insertion including width measurement only consumes around 20% of CPU time in the worst case.

BTW if anyone (or you) wants to work on perf., here's something that would probably net us another >10% uplift, with the potential to improve performance much more in the long run (= it's an unblocker for other optimizations): DelayEOLWrap() is currently a boolean flag and it and GetDelayedAtPosition() are somewhat costly. We could implement it this way for instance:

// Returns true when a wrap is pending, i.e. the cursor sits past the last column.
bool Cursor::DelayEOLWrap() const noexcept
{
    return _cPosition.x >= _parentBuffer._width;
}

That is, we represent a delayed cursor by putting it into a column past the right end of the buffer. We would need to modify GetPosition to be something like

til::point Cursor::GetPosition() const noexcept
{
    const auto width = _parentBuffer._width;
    return { std::min(_cPosition.x, width - 1), _cPosition.y };
}
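
For readers who want to try the idea in isolation, here is a self-contained toy version of the proposal. `ToyCursor` and `Point` are hypothetical stand-ins for `Cursor` and `til::point`, and the sketch only models the case where the wrap happens at the right edge of the buffer.

```cpp
#include <algorithm>
#include <cassert>

struct Point { int x; int y; };

class ToyCursor
{
public:
    explicit ToyCursor(int width) : _width(width) {}

    void setPosition(Point p) { _pos = p; }

    // Mark a pending wrap by parking the cursor one column past the right edge.
    void delayEOLWrap() { _pos.x = _width; }

    // The pending-wrap state is derived from the position, with no separate flag.
    bool isDelayedEOLWrap() const { return _pos.x >= _width; }

    // Callers always see a position clamped to the last valid column.
    Point position() const { return { std::min(_pos.x, _width - 1), _pos.y }; }

private:
    int _width;
    Point _pos{ 0, 0 };
};
```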

One major VT perf. cost is that our parser simply isn't a traditional one: It should ideally be implemented as a state machine with a lookup table. Something like Alacritty's parser: https://github.com/alacritty/vte/blob/master/src/table.rs
This is because a major cost resides in StateMachine::_ActionParam and friends, as all those branches (including those abstracted away in the STL, etc.) are fairly costly overall (around 1/3rd of our total CPU cost).
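
To show what "state machine with a lookup table" means in practice, here is a deliberately tiny sketch. It is not the project's StateMachine and not a port of the vte table: it has only three states, handles only numeric CSI parameters, ignores intermediates, and classifies bytes with a function where a real parser would likely use a 256-entry table. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

enum class State : std::uint8_t { Ground, Escape, Csi };
enum class Kind : std::uint8_t { Esc, Bracket, Digit, Semi, Letter, Other };
enum class Action : std::uint8_t { None, Print, Start, AddDigit, NextParam, Dispatch };

struct Entry { State next; Action action; };

// Maps a byte to a small class index used as the table column.
constexpr Kind classify(unsigned char c)
{
    return c == 0x1b ? Kind::Esc
         : c == '[' ? Kind::Bracket
         : (c >= '0' && c <= '9') ? Kind::Digit
         : c == ';' ? Kind::Semi
         : ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) ? Kind::Letter
         : Kind::Other;
}

// table[state][kind] gives the next state and the action to run.
// Column order: Esc, Bracket, Digit, Semi, Letter, Other.
constexpr Entry table[3][6] = {
    // Ground: everything except ESC is printed.
    { { State::Escape, Action::None }, { State::Ground, Action::Print }, { State::Ground, Action::Print },
      { State::Ground, Action::Print }, { State::Ground, Action::Print }, { State::Ground, Action::Print } },
    // Escape: only ESC [ is recognized, anything else is dropped.
    { { State::Escape, Action::None }, { State::Csi, Action::Start }, { State::Ground, Action::None },
      { State::Ground, Action::None }, { State::Ground, Action::None }, { State::Ground, Action::None } },
    // Csi: collect digits and separators until a letter dispatches.
    { { State::Escape, Action::None }, { State::Ground, Action::None }, { State::Csi, Action::AddDigit },
      { State::Csi, Action::NextParam }, { State::Ground, Action::Dispatch }, { State::Ground, Action::None } },
};

struct Sequence { char final; std::vector<int> params; };

struct ToyParser
{
    State state = State::Ground;
    std::vector<int> params;
    std::string printed;
    std::vector<Sequence> dispatched;

    void feed(const std::string& input)
    {
        for (unsigned char c : input)
        {
            // One table lookup per byte replaces a chain of per-state branches.
            const Entry e = table[static_cast<int>(state)][static_cast<int>(classify(c))];
            switch (e.action)
            {
            case Action::Print: printed.push_back(static_cast<char>(c)); break;
            case Action::Start: params.assign(1, 0); break;
            case Action::AddDigit: params.back() = params.back() * 10 + (c - '0'); break;
            case Action::NextParam: params.push_back(0); break;
            case Action::Dispatch: dispatched.push_back({ static_cast<char>(c), params }); break;
            case Action::None: break;
            }
            state = e.next;
        }
    }
};
```

The remaining `switch` is over a handful of actions rather than over every state and byte combination, which is where the branch savings would come from.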

AdaptDispatch::_DoLineFeed is also rather slow. I'm not entirely sure yet why, but maybe it could be "math'ified" a bit more?

P.S.: Another very difficult, but very impactful change would be calling AccessibilityNotifier functions only after StateMachine::ProcessString returns in DoWriteConsole. This would net us +100% perf in conhost (and later in WT if we ever get a conhost.lib). It can be done by accumulating a single til::point_span that covers the entire damaged region, and a total cumulative scroll offset. It's fine if the span covers too much area (for instance because of CUP sequences jumping around and changing things in multiple places), because such apps aren't even accessible now either.
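
The accumulation described in the P.S. might be sketched as follows. `Point` and `Span` are hypothetical stand-ins for `til::point` and `til::point_span`, and the assumption that spans are ordered in reading order (row first, then column) is mine.

```cpp
#include <cassert>

struct Point { int x; int y; };
struct Span { Point start; Point end; }; // inclusive, in reading order

// True if a comes before b in reading order (row first, then column).
inline bool before(Point a, Point b)
{
    return a.y < b.y || (a.y == b.y && a.x < b.x);
}

// Collects all damage from one write into a single covering span plus a
// cumulative scroll offset, so one notification can be sent at the end.
class DamageAccumulator
{
public:
    void addDamage(Point start, Point end)
    {
        if (!_hasDamage)
        {
            _span = { start, end };
            _hasDamage = true;
            return;
        }
        if (before(start, _span.start)) _span.start = start;
        if (before(_span.end, end)) _span.end = end;
    }

    void addScroll(int delta) { _scroll += delta; }

    bool hasDamage() const { return _hasDamage; }
    Span span() const { return _span; }
    int scroll() const { return _scroll; }

    void reset()
    {
        _hasDamage = false;
        _span = { { 0, 0 }, { 0, 0 } };
        _scroll = 0;
    }

private:
    bool _hasDamage = false;
    Span _span{ { 0, 0 }, { 0, 0 } };
    int _scroll = 0;
};
```

The span may cover more than was actually changed, which the comment above argues is acceptable.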

However, I think the other changes above are more important right now as they simplify our architecture.

j4james · 2024-04-24T22:15:04Z

DelayEOLWrap() is currently a boolean flag and it and GetDelayedAtPosition() are somewhat costly.

Just on this point, I don't think the boolean flag is avoidable, because delayed wrapping is not actually limited to the final column of the display. With margins applied, you can have a delayed wrap occurring almost anywhere. And technically I think you can probably set the flag manually with DECCIR, without even being on the margin.

j4james · 2024-05-02T19:48:14Z

FYI, I've had a chance to experiment with the page caching, but I couldn't get it to make much of a difference in terms of performance. So what's here now is as good as it's going to get from me (at least in the short term). If you all are happy with it as it is, it would be nice to have it merged, but I don't mind if you decide against it.

lhecker · 2024-05-02T21:39:39Z

Ah that's fine then. We're about to fork off release-1.21 now and I think afterwards we can merge it (for version 1.22). I can't speak for them, but it may still take a little bit, because Dustin & Mike are quite busy right now with the release of not just this, but sudo, bug fixes, etc.

zadjii-msft

holy crap I reviewed this like a month ago and just never hit ✅. My bad. Well, let's get this in for 1.22 now, shall we?

DHowett · 2024-05-07T22:21:35Z

Alright! Let's get this merge conflict unconflicted and we can land this thing. Thank you for your patience.

James, generally speaking even if you're the only person who would benefit from a particular aspect of VT compatibility I'd still love for us to have it. 🙂

j4james · 2024-05-10T22:17:56Z

@DHowett In case you haven't seen, the merge conflicts have been resolved now.

microsoft-github-policy-service bot added the Issue-Task, Area-VT, Product-Conhost, and Product-Terminal labels Jan 29, 2024

lhecker reviewed Jan 29, 2024

j4james force-pushed the feature-paging branch from da21fd9 to 117323f Compare February 3, 2024 21:44

j4james force-pushed the feature-paging branch from 117323f to f5b0fdd Compare February 25, 2024 21:37

j4james added 14 commits March 9, 2024 22:26

Hook up the paging escape sequences.

5ae9884

Add api returning both buffer and viewport.

6b43e44

Add class for managing VT page buffers.

8c1fd66

Integrate page manager into AdaptDispatch.

af6fd19

Remove unused ITerminalApi methods.

e451c5b

Add support for page cursor coupling mode.

22535b0

Add support for DECRQDE query.

7046768

Use Page instead of TextBuffer in more AdaptDispatch methods.

bdb890f

Handle cursor and attribute access through Page object.

f82f40d

Get buffer/viewport dimensions through Page object.

15a7c46

Add some terms to spellbot dictionary.

79f26f5

Correct viewport positioning on background pages.

6ded041

Use struct in place of tuple.

293a73f

Eliminate some unnecessary textbuffer locals.

69e2037

j4james force-pushed the feature-paging branch from f5b0fdd to 5b1276c Compare March 10, 2024 00:04

j4james added 4 commits March 10, 2024 17:20

Add some more comments.

b35d5ed

Only propagate ED2 through conpty for the active buffer.

d452725

Update the DECXCPR test to check the page number.

c5d0c12

Add some unit tests for the new paging operations.

14de021

j4james force-pushed the feature-paging branch from 5b1276c to 14de021 Compare March 10, 2024 18:17

j4james marked this pull request as ready for review March 11, 2024 00:14

lhecker reviewed Mar 18, 2024

lhecker approved these changes Mar 25, 2024

j4james added 2 commits May 2, 2024 18:48

Merge branch 'main' into feature-paging

e25d792

Replace obsolete GetTextBuffer api calls.

2488e9a

zadjii-msft approved these changes May 7, 2024

Merge branch 'main' into feature-paging

7795bab
