Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safer implementation of the uninit_vector function #170

Open
kitcatier opened this issue Mar 21, 2023 · 2 comments
Open

Safer implementation of the uninit_vector function #170

kitcatier opened this issue Mar 21, 2023 · 2 comments

Comments

@kitcatier
Copy link

pub unsafe fn uninit_vector<T>(length: usize) -> Vec<T> {
let mut vector = Vec::with_capacity(length);
vector.set_len(length);
vector
}

Hello, here's a safer implementation that uses std::mem::MaybeUninit to create an uninitialized Vec:

pub fn uninit_vector<T>(length: usize) -> Vec<T> { 
     let mut vector = Vec::with_capacity(length); 
     unsafe { 
         let data_ptr = vector.as_mut_ptr(); 
         std::ptr::write_bytes(data_ptr, 0, length); 
         vector.set_len(length); 
     } 
     return vector; 
 }

This implementation uses std::ptr::write_bytes to initialize each element in the vector. This ensures that all elements in the vector are properly initialized, avoiding potential memory safety issues.

@irakliyk
Copy link
Collaborator

Hi! Thank you for brining this up. If I understood correctly, the above would actually initialize all the allocated memory to zeros, right? If so, what is the advantage of doing this vs. just creating a vector with initialized default elements. For example, something like vec![T::default(); length];?

@E-Mans-Application
Copy link

Hi,

If I understand well, the purpose of uninit_vector is to pre-allocate the vector but initialize the items later. Given that, it does not seem interesting to initialize the items with zeros (moreover, 0 may not be a valid value for the items either).

However, the current implementation is unsound. If you read documentation of set_len, you can see you break the first safety requirement: "The elements at old_len..new_len must be initialized." Moreover, you can't use res[i] = value to initialize res[i], because this reads res[i] and drops it before writing the new value. Last but not least, if anything wrong happens in the function between set_len and the initialization of the last item and causes the function to return early, the vector will be dropped and will run destructor of uninitialized items. You can notice it if you use an item type with a significant Drop implementation, such as Vec<Vec<u8>>.

For example, this crashes due to heap corruption:

let mut v: Vec<Vec<u8>> = Vec::with_capacity(10);
unsafe { v.set_len(10) };
for i in 0..10 {
  v[i] = vec![i as u8]; // This drops the previous value of `v[i]`, which is not initialized
}

In cases where you simply use a for loop to initialize the items, you should take a look at Vec::resize_with. This does not use unsafe code at all and, on my machine, does not seem slower than methods that use unsafe code:

let mut vec = Vec::new();
let mut i = 0usize;
vec.resize_with(n, || { i += 1; value_depending_on_i });

For places where the above cannot apply easily, here are some things you can do instead:

  1. Use MaybeUninit, and use set_len after the items have been initialized.
let mut vec = Vec::with_capacity(n);
let uninit_vec = vec.spare_capacity_mut();
for i in 0..n {
    uninit_vec[i].write(value);
}
unsafe { vec.set_len(n) };
  1. Use ptr::write, which does not read the uninitialized value, and use set_len after the items have been initialized.
let mut vec: Vec<T> = Vec::with_capacity(n);
for i in 0..n {
    unsafe { ptr::write(vec.as_mut_ptr().add(i), value) };
}
unsafe { vec.set_len(n) };

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants