Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add readahead for pagecache #848

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

Sun12551
Copy link

@Sun12551 Sun12551 commented May 15, 2024

This pull request introduces a new feature to the page cache system - a readahead mechanism implemented as ondemand_readahead, replacing the existing commit_page implementation. The enhanced functionality now identifies sequential I/O patterns and performs readahead accordingly.

Before this modification, the bandwidth for sequential read/write operations in exFAT was approximately 50MiB/s. With this update, the corresponding bandwidth can now be boosted to around 200MiB/s, showcasing a significant performance improvement in handling sequential I/O operations.

Copy link
Contributor

@yingdi-shan yingdi-shan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces readahead, which can help to improve the sequential read performance across all filesystems.

Comment on lines 119 to 120
pub fn set_ra_page(&mut self, page: usize) {
self.ra_pages = if page > Self::MAX_RA_PAGES {
Self::MAX_RA_PAGES
} else if page < Self::INIT_RA_PAGES {
Self::INIT_RA_PAGES
} else {
page
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this function?

Copy link
Author

@Sun12551 Sun12551 May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function allows users of the page cache to set the maximum size of the readahead window. I have renamed it to set_max_window_size.


pub fn new() -> Self {
Self {
ra_start: Self::INVALID_PAGE_IDX,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a constant to represent invalid state, I recommend using an Option.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 177 to 179
if nreqs == 0 {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic does not follow the name of this function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Some(self.ra_start..self.ra_start + self.ra_size)
}

fn is_all_req_completed(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use has_completed_all_requests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Ok(())
}

pub fn should_readahead(&mut self, idx: usize, npages: usize) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

npages represents the maximal number of pages for readahead, maybe we should consider changing its name for better clarity.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@yingdi-shan
Copy link
Contributor

LGTM. Thanks for your contribution.

@tatetian
Copy link
Contributor

tatetian commented Jun 4, 2024

@liqinggd Could you take a look at this PR? Your inputs are highly appreciated.

@tatetian
Copy link
Contributor

tatetian commented Jun 4, 2024

Read-ahead is an important I/O optimization technique. Good job on bring it to Asterinas!

My concern is about the downside of read-ahead. It has been shown that the read-ahead mechanism introduced in this PR can bring significant performance boost to sequential reads of ramfs. But what about its impact on the performance of other workload patterns such as random reads and writes? What about its impact on the performance of Ext2 and Ramfs? I think we need real numbers to demonstrate that the read-ahead mechanism only incurs negligible overheads to the "bad" cases.

Ramfs is a particularly interesting case because ramfs is not expected to enjoy any performance benefit from the read-ahead mechanism, but it also utilizes the page cache mechanism. In light of this, I think read-ahead should be a feature that can be opted in by file systems. Individual file systems should decide if read-ahead makes sense for it. Another benefit of making read-ahead optional feature is to enable turning it on/off easily to evaluate its performance impact.

@Sun12551 Sun12551 force-pushed the readahead branch 3 times, most recently from 859a862 to d0d3f2d Compare June 6, 2024 08:08
@@ -85,16 +87,179 @@ impl Debug for PageCache {
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments to explain the structure

}

/// Set the maximum readahead window size.
pub fn set_max_window_size(&mut self, size: usize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API gives the user a surprise because the user cannot set the size as one wanted

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the limit on the maximum window size and now the user is free to set it. The previous maximum window size (MAX_WINDOW_SIZE) was changed to the default maximum size (DEFAULT_MAX_SIZE).

pub fn wait_for_prev_readahead(
&mut self,
pages: &mut MutexGuard<LruCache<usize, Page>>,
force: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add a force argument here? It makes the logic of the code confusing. I suggest to remove this argument.

Copy link
Author

@Sun12551 Sun12551 Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The force parameter is used to indicate whether to wait for the current request to complete. Maybe because I have written some logic that does not belong to this function, making this function a little confused. I've removed that part of the logic.

@@ -85,16 +87,179 @@ impl Debug for PageCache {
}
}

struct ReadaheadWindow {
/// Window start.
start: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is an index, rename it to start_idx. I suggest to use Range<usize> here to represent the window.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

/// Window size.
size: usize,
/// Look ahead position in current window.
lookahead_index: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not see the benefit of lookahead_index, it is just the start of the range.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lookahead_index indicates where the readahead is triggered. We set the lookahead_index to the start of the window for now. This should be adjustable by the user, which is a TODO here.

})
}

pub fn update(&self, max_size: usize, max_page: usize) -> Option<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API gives me a surprise, because update means an in-place modification.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed it to next.

}

impl ReadaheadWindow {
pub fn new(start_page: usize, init_size: usize, max_page: usize) -> Option<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the constructor just use Range as the input argument is enough. The argument max_page is wired.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

pub fn new() -> Self {
Self {
ra_window: None,
max_size: Self::MAX_WINDOW_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest this value is a default window size.

Ok(())
}

pub fn should_readahead(&mut self, idx: usize, max_page: usize) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is intended solely to determine whether readahead should be performed; the operation of updating the window should be placed where readahead is actually executed later on. It seems peculiar that a function meant for making a judgment call would also require a mutable reference (&mut).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. I moved the window update logic to set_up_window, which will be called after should_readhead decided to conduct the readahead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants