Require 'unsafe' keyword for custom implementations of nom traits #90

Open: wants to merge 4 commits into master

@progval (Collaborator) commented Aug 13, 2023

nom_locate progressively advances through a fragment by slicing it, but expects to be able to go backward by as much as it advanced. This is normally fine, but custom implementations of the nom traits could cause undefined behavior (UB) by implementing slicing incorrectly.

After this commit, they will need to implement the unsafe trait RewindableFragment, putting the burden of soundness on these implementations.
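To illustrate the idea, here is a minimal sketch of how an unsafe marker trait shifts the soundness burden onto implementors. Only the trait name `RewindableFragment` comes from this PR; the empty trait body, the safety contract wording, and the helper `fragment_len` are assumptions made for illustration and may not match the actual diff.

```rust
/// Hypothetical sketch of the marker trait (the real definition may differ).
///
/// # Safety
/// Implementors promise that slicing a fragment always yields a sub-range of
/// the same underlying memory, so that stepping backward by the number of
/// bytes consumed so far stays inside the original input.
pub unsafe trait RewindableFragment {}

// SAFETY: slicing `&str` and `&[u8]` always yields a sub-slice of the same
// allocation, so the contract above holds for these built-in types.
unsafe impl RewindableFragment for &str {}
unsafe impl RewindableFragment for &[u8] {}

/// Hypothetical stand-in for a method that needs the guarantee (for example
/// one that rewinds). It only compiles for types whose author has written
/// `unsafe impl RewindableFragment`.
pub fn fragment_len<T: RewindableFragment + AsRef<[u8]>>(fragment: &T) -> usize {
    fragment.as_ref().len()
}
```

Under this sketch, a custom input type cannot reach the bounded function from safe code alone: its author has to write `unsafe impl` and thereby vouch for the slicing behavior.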

@stephaneyfx provides an example of a maliciously constructed fragment type exercising this behavior in GH-88.

I tried to keep the constraints to the bare minimum, but it probably makes sense to also require the trait to be implemented when calling LocatedSpan::new/LocatedSpan::new_extra, so that nothing has to care about it later. Thoughts?
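The constructor-level alternative could look roughly like the sketch below. `Span` is a hypothetical stand-in for `LocatedSpan`, and the field and method layout is an assumption, not the crate's actual code; the point is only that the bound sits on `new`, so every other method can rely on it without repeating it.

```rust
/// Hypothetical marker trait, as described in this PR (body assumed empty).
pub unsafe trait RewindableFragment {}

// SAFETY: slicing a `&str` always yields a sub-slice of the same allocation.
unsafe impl RewindableFragment for &str {}

/// Hypothetical stand-in for `LocatedSpan`.
pub struct Span<T> {
    fragment: T,
    offset: usize,
}

// The bound is enforced once, at construction time.
impl<T: RewindableFragment> Span<T> {
    pub fn new(fragment: T) -> Self {
        Span { fragment, offset: 0 }
    }
}

// Other methods need no bound: a `Span<T>` can only have been built through
// `new`, because the fields are private to the defining module.
impl<T> Span<T> {
    pub fn fragment(&self) -> &T {
        &self.fragment
    }

    pub fn location_offset(&self) -> usize {
        self.offset
    }
}
```

The trade-off is that this design rejects non-rewindable fragment types entirely, even for callers that never use a rewinding method.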

@AngelOfSol @ThePerkinrex @zertosh I believe you all have crates with custom types that implement the nom traits. Would this work for you? (Of course I'll do a major version bump when pushing this change to crates.io)

Closes GH-88.

`nom_locate` progressively advances through a fragment by slicing it,
but expects to be able to go backward by as much as it advanced. This is
normally fine, but custom implementations of the `nom` traits could cause
UB by implementing slicing incorrectly.

After this commit, they will need to implement the unsafe trait
`RewindableFragment`, putting the burden of soundness on these
implementations.

stephaneyfx provides an example of a maliciously constructed fragment type
exercising this behavior:

> This function is called from public and safe functions like get_line_beginning. It assumes that the current fragment is part of a larger fragment and attempts to read before the beginning of the current fragment. This assumption may be incorrect as demonstrated by the following program that exhibits UB without unsafe and outputs garbage (which can change on every run).

```rust
use nom::{AsBytes, InputTake, Offset, Slice};
use nom_locate::LocatedSpan;
use std::{
    cell::Cell,
    ops::{RangeFrom, RangeTo},
    rc::Rc,
};

struct EvilInput<'a>(Rc<Cell<&'a [u8]>>);

impl<'a> AsBytes for EvilInput<'a> {
    fn as_bytes(&self) -> &[u8] {
        self.0.get()
    }
}

impl Offset for EvilInput<'_> {
    fn offset(&self, second: &Self) -> usize {
        self.as_bytes().offset(second.as_bytes())
    }
}

impl Slice<RangeFrom<usize>> for EvilInput<'_> {
    fn slice(&self, range: RangeFrom<usize>) -> Self {
        Self(Rc::new(Cell::new(self.0.get().slice(range))))
    }
}

impl Slice<RangeTo<usize>> for EvilInput<'_> {
    fn slice(&self, range: RangeTo<usize>) -> Self {
        Self(Rc::new(Cell::new(self.0.get().slice(range))))
    }
}

fn main() {
    let new_input = [32u8];
    let original_input = [33u8; 3];
    let evil_input = EvilInput(Rc::new(Cell::new(&original_input)));
    let span = LocatedSpan::new(evil_input).take_split(2).0;
    span.fragment().0.set(&new_input);
    let beginning = span.get_line_beginning();
    dbg!(beginning);
    dbg!(new_input.as_ptr() as usize - beginning.as_ptr() as usize);
}
```

Example output:

```
[src/main.rs:43] beginning = [
    201,
    127,
    32,
]
[src/main.rs:44] new_input.as_ptr() as usize - beginning.as_ptr() as usize = 2
```
Successfully merging this pull request may close these issues.

LocatedSpan::get_unoffsetted_slice can lead to UB