Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search pages in all platforms #279

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

marchersimon
Copy link

Fixes #278

I don't know if the implementation is efficient, but at least it feels logical.
What I've done:

  • I created a PriorityQueue
  • The current platforms has the highest priority, followed by common, and then all others
  • Search for a page in this order, until it is found

It seems to work and doesn't break anything (hopefully)

@niklasmohrin
Copy link
Collaborator

Thanks! This seems like a good change, I have some comments about implementation though. Firstly, I don't think we need an efficient priority queue - these smart data structures are sometimes even slower for small inputs than just doing the straight-forward thing (see for example Vec::clear_duplicates in extensions.rs). Using a vector and just sorting it by priority is probably faster and doesn't need the dependency. Secondly, you repeat the possible platform dirs, but this should really be something like Platform::all(). This way, we don't duplicate this knowledge between multiple places.

In theory, we are doing extra IO work because we are checking the current platform and common twice, but I think this shouldn't make much of a difference (and if it does, we can fix that in another PR).

Please let us know if you need any help or have questions, note that both @dbrgn and I might take some time to respond (like with this PR)

@marchersimon
Copy link
Author

Using a vector and just sorting it by priority is probably faster and doesn't need the dependency.

Good point. I'll try to use a vector for it.

In theory, we are doing extra IO work because we are checking the current platform and common twice

Actually not, because when adding a new element to a priority_queue, which does already exist, the priority of the existing element is updated.

I'll make the changes in the coming days, thanks for the feedback

@marchersimon
Copy link
Author

@niklasmohrin @dbrgn I've now made the changes

Copy link
Collaborator

@niklasmohrin niklasmohrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long wait. This looks simpler already, nice work :)

I think this is 100% there logically, however I feel like we can improve the readability a bit when employing some Rust patterns and helpers. Please see my comments below and feel free to ask if something is unclear or if you disagree with any of my suggestions :)

Comment on lines +50 to +53
pub fn get_platforms() -> Vec<&'static str> {
vec![ "linux", "osx", "sunos", "windows", "android" ]
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is closer, but not exactly what I was aiming for - I should have been clearer. Now, we have a different problem - the "knowledge" about how the directories for the cache are called is now spread into the types module. This is the same problem, but more subtle. Instead, Let's make this return a slice of PlatformTypes. I am also in favor of the short all name. We then arrive at this signature: pub const fn all() -> &'static [Self]

Comment on lines +292 to +303
let current_platform = self.get_platform_dir();

let mut platforms = PlatformType::get_platforms();
platforms.push("common");

// remove the current platform from the vector and re-add it to the end
let index = platforms.iter().position(|&p| p == current_platform).unwrap();
platforms.remove(index);
platforms.push(current_platform);

// Cycle through all platforms in reverse order (current_platform - common - all other platforms)
for platform in platforms.iter().rev() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note: this comment is continued below, but I kept it to highlight the process that led me to the solution I propose later)

This can be slightly easier, we need to do a small refactor though. First, there is get_platform_dir which knows the directory name for a platform. Currently, it only allows getting the directory name of the cache's platform, let's change it so that it is an associated function (no taking &self) of Cache and instead gets a PlatfomType as input. Further, let's rename it to directory_name_for, which will make our code here more readable.

Then, we want to do the following: Look for the command for the current platform, for common commands, and then the remaining platforms in any order. With our tools, we can now write this as:

let current_platform_dir = Self::directory_name_for(self.platform);

let mut directories_to_search = vec![current_platform_dir, "common"];
directories_to_search.extend(
    PlatformType::all()
        .iter()
        .copied()
        .filter(|&p| p != self.platform)
        .map(Self::directory_name_for),
);

Note that we now have the first directory to search at the beginning, so we don't need the .rev later

// Did not find platform specific results, fall back to "common"
Self::find_page_for_platform(&page_filename, &cache_dir, "common", &lang_dirs)
.map(|page| PageLookupResult::with_page(page).with_optional_patch(patch_path))
return None;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clippy tells us that this return is unneeded. In general, when handling Some, None, iterators, etc. manually, there is a good chance that there is already some helper function from the standard library that does exactly what we want. In this case, it is find_map. Using that, we can express the above as:

directories_to_search
    .into_iter()
    .find_map(|platform_dir| {
        Self::find_page_for_platform(&page_filename, &cache_dir, platform_dir, &lang_dirs)
    })
    .map(|page| PageLookupResult::with_page(page).with_optional_patch(patch_path))

This is nice! But there is one more flaw in there. We are using into_iter on our freshly created vector. Maybe we can spare the allocation? Of course, the creation of our directories_to_search is basically just building an iterator and collecting it, so we can just skip the "collecting" step. Finally, let's use this opportunity to also give names to the iterators themselves and remove the comments - now, everything is clear from the code:

let main_directories = [current_platform_dir, "common"];
let secondary_directories = PlatformType::all()
    .iter()
    .copied()
    .filter(|&p| p != self.platform)
    .map(Self::directory_name_for);
let mut directories_to_search = main_directories
    .iter()
    .copied()
    .chain(secondary_directories);

Now, we can remove the into_iter below and arrive at what I believe to be the optimal solution.

@marchersimon
Copy link
Author

Thanks for the extensive feedback @niklasmohrin. Unfortunately this is where my Rust knowledge ends. I haven't really worked with slices, lifetime specifiers or other Rust patters yet. I'm still learning and it could take some while 'till i get there. But if you want to, you can finish my PR.

@niklasmohrin
Copy link
Collaborator

niklasmohrin commented Sep 14, 2022

No worries, this is what review is for :) You can take your time if you want to move forward with the PR and understand the proposed changes. Feel free to ask questions at any time, we might take some time to respond though.

if you dont want to work on it anymore at some poibt, I can also apply the review. In this case, let me know :)

@dbrgn dbrgn added the waiting-for-author The PR needs an update before it can be considered for merging. label Sep 24, 2022
@matthew-e-brown
Copy link

I just ran into this issue myself: I use a Windows machine for work, but spend a lot of time SSH'd into remote Linux boxes. So, things like tldr a2enmod don't appear unless I want to manually -p linux each time, or install and run tealdeer on the remote Linux box, which isn't always feasible.

I've given a quick read through this PR, and I must commend @niklasmohrin on his review style—very warm feedback for somebody who is/was new to Rust. So, @marchersimon, I'd like to add my encouragement for you to come back and move forward on this PR now that a few months have passed. How do you feel about moving forwards? It would be great to see it implemented. I have faith in you! 😄

I would of course be willing to help as well, or even take it over from you if that's what you'd like. As someone with ties to teaching, though, I would much prefer to see it be done by somebody who is/was using it as a chance to learn a new language.

Cheers. 🙂

@niklasmohrin
Copy link
Collaborator

@matthew-e-brown Due to the inactivity, I think it would be fair for you to take this PR over if you want to

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-for-author The PR needs an update before it can be considered for merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Search other Platforms when command isn't found
4 participants