Skip to content

Boyer-Moore-MagicLen, a fast string search algorithm implemented in Rust.

License

Notifications You must be signed in to change notification settings

magiclen/boyer-moore-magiclen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

62 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Boyer-Moore-MagicLen

CI

This crate can be used to search substrings in a string or search any sub-sequences in any sequence by using boyer-moore-magiclen (which is sometimes faster than boyer-moore and boyer-moore-horspool).

Usage

For binary data and UTF-8 data, use the BMByte struct. For character sequences, use the BMCharacter struct (however it is much slower than BMByte). The BMCharacter struct needs the standard library support, and you have to enable the character feature to make it available.

Every BMXXX has a from associated function to create the instance by a search pattern (the needle).

For example,

use boyer_moore_magiclen::BMByte;

let bmb = BMByte::from("oocoo").unwrap();

Now, we can search any binary data or UTF-8 data for the pattern oocoo.

There are two search modes and two search directions. The first mode is called full text search, which finds the positions of the matched sub-sequences including the overlapping ones.

use boyer_moore_magiclen::BMByte;

let bmb = BMByte::from("oocoo").unwrap();

assert_eq!(vec![1, 4], bmb.find_full_in("coocoocoocoo", 2));

The other mode is called normal text search, which finds the positions of the matched sub-sequences excluding the overlapping ones.

use boyer_moore_magiclen::BMByte;

let bmb = BMByte::from("oocoo").unwrap();

assert_eq!(vec![1, 7], bmb.find_in("coocoocoocoo", 2));

The search direction can be from the head (searching forward, find_xxx) or from the tail (searching backward, rfind_xxx).

use boyer_moore_magiclen::BMByte;

let bmb = BMByte::from("oocoo").unwrap();

assert_eq!(vec![7, 1], bmb.rfind_in("coocoocoocoo", 2));

To search all results at a time, use the find_all_in, rfind_all_in, find_full_all_in or rfind_full_all_in method.

use boyer_moore_magiclen::BMByte;

let bmb = BMByte::from("oocoo").unwrap();

assert_eq!(vec![7, 4, 1], bmb.rfind_full_all_in("coocoocoocoo"));

Benchmark

cargo bench --bench full_text_search

or

cargo bench --bench normal_text_search

Crates.io

https://crates.io/crates/boyer-moore-magiclen

Documentation

https://docs.rs/boyer-moore-magiclen

License

MIT

About

Boyer-Moore-MagicLen, a fast string search algorithm implemented in Rust.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages