Cucumber Expressions AST into Regex expansion #2

Merged
merged 25 commits into from
Nov 22, 2021
1 change: 1 addition & 0 deletions .clippy.toml
@@ -6,5 +6,6 @@ standard-macro-braces = [
{ name = "assert", brace = "(" },
{ name = "assert_eq", brace = "(" },
{ name = "assert_ne", brace = "(" },
{ name = "matches", brace = "(" },
{ name = "vec", brace = "[" },
]
34 changes: 34 additions & 0 deletions .github/workflows/ci.yml
@@ -53,6 +53,39 @@ jobs:
# Testing #
###########

feature:
name: Feature
if: ${{ github.ref == 'refs/heads/main'
|| startsWith(github.ref, 'refs/tags/v')
|| !contains(github.event.head_commit.message, '[skip ci]') }}
strategy:
fail-fast: false
matrix:
feature:
- <none>
- into-regex
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: nightly
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true

- run: cargo +nightly update -Z minimal-versions

- run: cargo check --no-default-features
${{ matrix.feature != '<none>'
&& format('--features {0}', matrix.feature)
|| '' }}
env:
RUSTFLAGS: -D warnings

msrv:
name: MSRV
if: ${{ github.ref == 'refs/heads/main'
@@ -141,6 +174,7 @@ jobs:
name: Release on GitHub
needs:
- clippy
- feature
- msrv
- rustdoc
- rustfmt
9 changes: 8 additions & 1 deletion CHANGELOG.md
@@ -11,9 +11,16 @@ All user visible changes to `cucumber-expressions` crate will be documented in t

### Added

- ???
- [Cucumber Expressions] AST and parser. ([#1])
- Expansion of [Cucumber Expressions] AST into [`Regex`] behind `into-regex` feature flag. ([#2])

[#1]: /../../pull/1
[#2]: /../../pull/2




[`Regex`]: https://docs.rs/regex

[Cucumber Expressions]: https://github.com/cucumber/cucumber-expressions#readme
[Semantic Versioning 2.0.0]: https://semver.org
19 changes: 19 additions & 0 deletions Cargo.toml
@@ -3,6 +3,7 @@ name = "cucumber-expressions"
version = "0.1.0-dev"
edition = "2021"
rust-version = "1.56"
description = "Cucumber Expressions AST and parser."
license = "MIT OR Apache-2.0"
authors = [
"Ilya Solovyiov <[email protected]>",
@@ -16,4 +17,22 @@ categories = ["compilers", "parser-implementations"]
keywords = ["cucumber", "expression", "expressions", "cucumber-expressions"]
include = ["/src/", "/LICENSE-*", "/README.md", "/CHANGELOG.md"]

[package.metadata.docs.rs]
all-features = true
rustdoc-args = ["--cfg", "docsrs"]

[features]
# Enables ability to expand AST into regex.
into-regex = ["either", "regex"]

[dependencies]
derive_more = { version = "0.99.16", features = ["as_ref", "deref", "deref_mut", "display", "error", "from", "into"], default-features = false }
nom = "7.0"
nom_locate = "4.0"

# "into-regex" feature dependencies
either = { version = "1.6", optional = true }
regex = { version = "1.5", optional = true }

# TODO: Remove once `derive_more` 0.99.17 is released.
syn = "1.0.81"
71 changes: 69 additions & 2 deletions README.md
@@ -2,15 +2,76 @@
===============================

[![Documentation](https://docs.rs/cucumber-expressions/badge.svg)](https://docs.rs/cucumber-expressions)
[![CI](https://github.com/cucumber-rs/cucumber-expressions/workflows/CI/badge.svg?branch=master "CI")](https://github.com/cucumber-rs/cucumber-expressions/actions?query=workflow%3ACI+branch%3Amaster)
[![CI](https://github.com/cucumber-rs/cucumber-expressions/workflows/CI/badge.svg?branch=main "CI")](https://github.com/cucumber-rs/cucumber-expressions/actions?query=workflow%3ACI+branch%3Amain)
[![Rust 1.56+](https://img.shields.io/badge/rustc-1.56+-lightgray.svg "Rust 1.56+")](https://blog.rust-lang.org/2021/10/21/Rust-1.56.0.html)
[![Unsafe Forbidden](https://img.shields.io/badge/unsafe-forbidden-success.svg)](https://github.com/rust-secure-code/safety-dance)

- [Changelog](https://github.com/cucumber-rs/cucumber-expressions/blob/main/CHANGELOG.md)

Rust implementation of [Cucumber Expressions].

This crate provides [AST] and parser of [Cucumber Expressions].
This crate provides [AST], parser, and [`Regex`] expansion of [Cucumber Expressions].

```rust
use cucumber_expressions::Expression;

let re = Expression::regex("I have {int} cucumbers in my belly").unwrap();
let caps = re.captures("I have 42 cucumbers in my belly").unwrap();

assert_eq!(&caps[0], "I have 42 cucumbers in my belly");
assert_eq!(&caps[1], "42");
```




## Cargo features

- `into-regex`: Enables expansion into [`Regex`].




## Grammar

This implementation follows a context-free grammar, [which isn't yet merged][1]. The original grammar is impossible to follow while creating a performant parser, as it contains errors and describes not the exact [Cucumber Expressions] language but a superset of it, while also being context-sensitive. If you find any inconsistencies between this implementation and the ones in other languages, please file an issue!

[EBNF] spec of the current context-free grammar implemented by this crate:
```ebnf
expression = single-expression*

single-expression = alternation
| optional
| parameter
| text-without-whitespace+
| whitespace+
text-without-whitespace = (- (text-to-escape | whitespace))
| ('\', text-to-escape)
text-to-escape = '(' | '{' | '/' | '\'

alternation = single-alternation, ('/', single-alternation)+
single-alternation = ((text-in-alternative+, optional*)
| (optional+, text-in-alternative+))+
text-in-alternative = (- alternative-to-escape)
| ('\', alternative-to-escape)
alternative-to-escape = whitespace | '(' | '{' | '/' | '\'
whitespace = ' '

optional = '(' text-in-optional+ ')'
text-in-optional = (- optional-to-escape) | ('\', optional-to-escape)
optional-to-escape = '(' | ')' | '{' | '/' | '\'

parameter = '{', name*, '}'
name = (- name-to-escape) | ('\', name-to-escape)
name-to-escape = '{' | '}' | '(' | '/' | '\'
```
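
As an illustration of how the escaping rules in the grammar above read in practice, here is a minimal hand-written recognizer for the `parameter` rule only. It is a sketch using just the standard library, not the crate's `nom`-based parser; `recognize_parameter` and `NAME_TO_ESCAPE` are hypothetical names introduced for this example.

```rust
/// Characters that must be backslash-escaped inside a parameter name
/// (the `name-to-escape` rule of the grammar).
const NAME_TO_ESCAPE: [char; 5] = ['{', '}', '(', '/', '\\'];

/// Hypothetical helper (not part of the crate): tries to recognize a
/// `parameter` at the start of `input`, returning the raw (still escaped)
/// name and the remaining input.
fn recognize_parameter(input: &str) -> Option<(&str, &str)> {
    let body = input.strip_prefix('{')?;
    let mut chars = body.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            // An unescaped `}` closes the parameter.
            '}' => return Some((&body[..i], &body[i + 1..])),
            // A backslash must be followed by a `name-to-escape` character.
            '\\' => {
                let (_, escaped) = chars.next()?;
                if !NAME_TO_ESCAPE.contains(&escaped) {
                    return None;
                }
            }
            // Any other reserved character is not allowed unescaped.
            c if NAME_TO_ESCAPE.contains(&c) => return None,
            _ => {}
        }
    }
    // Ran out of input without finding the closing `}`.
    None
}
```

Because every rule names its own escape set, each construct can be recognized without looking at surrounding context, which is what makes the grammar context-free.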




## [`Regex`] Production Rules

Follows original [production rules][2].
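
To give a feel for what such an expansion looks like, here is a rough sketch over a simplified AST. It assumes the upstream rules as commonly described: an optional becomes `(?:text)?`, an alternation becomes `(?:a|b)`, a parameter becomes a capture group, and the whole expression is anchored with `^` and `$`. The `Node` type, `escape`, and `to_regex` are hypothetical and are not the crate's API; the regex used for a parameter is a placeholder.

```rust
/// Simplified, hypothetical AST node. The crate's real types are generic
/// over the input and carry span information.
enum Node {
    Text(String),
    Optional(String),
    Alternation(Vec<String>),
    /// A parameter already resolved to a regex (placeholder, e.g. `\d+`).
    Parameter(String),
}

/// Escapes characters that have a special meaning in a regex.
fn escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        if r"\.+*?()|[]{}^$".contains(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Expands the nodes into an anchored regex source string.
fn to_regex(nodes: &[Node]) -> String {
    let mut re = String::from("^");
    for node in nodes {
        match node {
            Node::Text(t) => re.push_str(&escape(t)),
            Node::Optional(t) => {
                // Non-capturing group that may be absent.
                re.push_str("(?:");
                re.push_str(&escape(t));
                re.push_str(")?");
            }
            Node::Alternation(alts) => {
                // Non-capturing group of `|`-separated alternatives.
                let joined = alts
                    .iter()
                    .map(|a| escape(a))
                    .collect::<Vec<_>>()
                    .join("|");
                re.push_str(&format!("(?:{})", joined));
            }
            // Parameters are the only capturing groups.
            Node::Parameter(inner) => re.push_str(&format!("({})", inner)),
        }
    }
    re.push('$');
    re
}
```

Keeping optionals and alternations non-capturing means capture indices line up with parameters only, which matches the README example where `caps[1]` is the `{int}` value.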



@@ -27,5 +88,11 @@ at your option.



[`Regex`]: https://docs.rs/regex

[AST]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
[Cucumber Expressions]: https://github.com/cucumber/cucumber-expressions#readme
[EBNF]: https://en.wikipedia.org/wiki/Extended_Backus–Naur_form

[1]: https://github.com/cucumber/cucumber-expressions/issues/41
[2]: https://github.com/cucumber/cucumber-expressions/blob/main/ARCHITECTURE.md#production-rules
173 changes: 173 additions & 0 deletions src/ast.rs
@@ -0,0 +1,173 @@
// Copyright (c) 2021 Brendan Molloy <[email protected]>,
// Ilya Solovyiov <[email protected]>,
// Kai Ren <[email protected]>
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! [Cucumber Expressions][1] [AST].
//!
//! See details in the [grammar spec][0].
//!
//! [0]: crate#grammar
//! [1]: https://github.com/cucumber/cucumber-expressions#readme
//! [AST]: https://en.wikipedia.org/wiki/Abstract_syntax_tree

use derive_more::{AsRef, Deref, DerefMut};
use nom::{error::ErrorKind, Err, InputLength};
use nom_locate::LocatedSpan;

use crate::parse;

/// [`str`] along with its location information in the original input.
pub type Spanned<'s> = LocatedSpan<&'s str>;

/// Top-level `expression` defined in the [grammar spec][0].
///
/// See [`parse::expression()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Expression<Input>(pub Vec<SingleExpression<Input>>);

impl<'s> TryFrom<&'s str> for Expression<Spanned<'s>> {
type Error = parse::Error<Spanned<'s>>;

fn try_from(value: &'s str) -> Result<Self, Self::Error> {
parse::expression(Spanned::new(value))
.map_err(|e| match e {
Err::Error(e) | Err::Failure(e) => e,
Err::Incomplete(n) => parse::Error::Needed(n),
})
.and_then(|(rest, parsed)| {
rest.is_empty()
.then(|| parsed)
.ok_or(parse::Error::Other(rest, ErrorKind::Verify))
})
}
}
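
Note on the `TryFrom` impl above: it treats any unparsed remainder as an error, using `bool::then` followed by `Option::ok_or`. The following is a minimal, standard-library-only sketch of the same pattern. `digits`, `Number`, and `ParseError` are hypothetical stand-ins for the `nom`-style parser, `Expression`, and `parse::Error`.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum ParseError<'s> {
    /// The parser stopped before the end; carries the unparsed remainder.
    Trailing(&'s str),
    /// The input did not start with a digit.
    NoDigits,
}

/// Hypothetical stand-in for a `nom`-style parser: consumes leading ASCII
/// digits and returns `(rest, parsed)`.
fn digits(input: &str) -> Result<(&str, &str), ParseError<'_>> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(ParseError::NoDigits);
    }
    Ok((&input[end..], &input[..end]))
}

#[derive(Debug, PartialEq)]
struct Number<'s>(&'s str);

impl<'s> TryFrom<&'s str> for Number<'s> {
    type Error = ParseError<'s>;

    fn try_from(value: &'s str) -> Result<Self, Self::Error> {
        digits(value).and_then(|(rest, parsed)| {
            // Succeed only if the whole input was consumed.
            rest.is_empty()
                .then(|| Number(parsed))
                .ok_or(ParseError::Trailing(rest))
        })
    }
}
```

Returning the remainder inside the error keeps the location of the first unparsed character available to the caller.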

impl<'s> Expression<Spanned<'s>> {
/// Parses the given `input` as an [`Expression`].
///
/// # Errors
///
/// See [`parse::Error`] for details.
pub fn parse<I: AsRef<str> + ?Sized>(
input: &'s I,
) -> Result<Self, parse::Error<Spanned<'s>>> {
Self::try_from(input.as_ref())
}
}

/// `single-expression` defined in the [grammar spec][0], representing a single
/// entry of an [`Expression`].
///
/// See [`parse::single_expression()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum SingleExpression<Input> {
/// [`alternation`][0] expression.
///
/// [0]: crate#grammar
Alternation(Alternation<Input>),

/// [`optional`][0] expression.
///
/// [0]: crate#grammar
Optional(Optional<Input>),

/// [`parameter`][0] expression.
///
/// [0]: crate#grammar
Parameter(Parameter<Input>),

/// [`text-without-whitespace+`][0] expression.
///
/// [0]: crate#grammar
Text(Input),

/// [`whitespace+`][0] expression.
///
/// [0]: crate#grammar
Whitespaces(Input),
}

/// `single-alternation` defined in the [grammar spec][0], representing a
/// building block of an [`Alternation`].
///
/// [0]: crate#grammar
pub type SingleAlternation<Input> = Vec<Alternative<Input>>;

/// `alternation` defined in the [grammar spec][0], matching one of its
/// [`SingleAlternation`]s.
///
/// See [`parse::alternation()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Alternation<Input>(pub Vec<SingleAlternation<Input>>);

impl<Input: InputLength> Alternation<Input> {
/// Returns length of this [`Alternation`]'s span in the `Input`.
pub(crate) fn span_len(&self) -> usize {
self.0
.iter()
.flatten()
.map(|alt| match alt {
Alternative::Text(t) => t.input_len(),
Alternative::Optional(opt) => opt.input_len() + 2,
})
.sum::<usize>()
+ self.len()
- 1
}

/// Indicates whether any of the [`SingleAlternation`]s consists only of
/// [`Optional`]s.
pub(crate) fn contains_only_optional(&self) -> bool {
(**self).iter().any(|single_alt| {
single_alt
.iter()
.all(|alt| matches!(alt, Alternative::Optional(_)))
})
}
}
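
Note on `span_len()` above: the `+ 2` accounts for the parentheses around each optional, and `+ self.len() - 1` for the `/` separators between alternatives. Here is a worked sketch with plain `&str` input; `Alt` is a hypothetical simplified mirror of `Alternative`, and the free functions stand in for the methods.

```rust
/// Simplified, hypothetical mirror of `Alternative`, using `&str` as input.
enum Alt<'s> {
    Optional(&'s str),
    Text(&'s str),
}

/// Length of the source text covered by an alternation.
/// Assumes at least one alternative, as the grammar guarantees.
fn span_len(alternation: &[Vec<Alt<'_>>]) -> usize {
    let content: usize = alternation
        .iter()
        .flatten()
        .map(|alt| match alt {
            Alt::Text(t) => t.len(),
            // `+ 2` accounts for the surrounding `(` and `)`.
            Alt::Optional(o) => o.len() + 2,
        })
        .sum();
    // One `/` separator between each pair of neighbouring alternatives.
    content + alternation.len() - 1
}

/// Whether any alternative is made up of optionals only.
fn contains_only_optional(alternation: &[Vec<Alt<'_>>]) -> bool {
    alternation
        .iter()
        .any(|single| single.iter().all(|alt| matches!(alt, Alt::Optional(_))))
}
```

For `a(b)/c` this gives 1 + 3 + 1 for the contents plus 1 separator, i.e. 6, which equals the length of the source text. An alternative such as `(a)` in `(a)/b` consists only of an optional and could match the empty string, which is presumably why the check exists.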

/// `alternative` defined in the [grammar spec][0].
///
/// See [`parse::alternative()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Alternative<Input> {
/// [`optional`][1] expression.
///
/// [1]: crate#grammar
Optional(Optional<Input>),

/// Text.
Text(Input),
}

/// `optional` defined in the [grammar spec][0], matching an optional
/// `Input`.
///
/// See [`parse::optional()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Copy, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Optional<Input>(pub Input);

/// `parameter` defined in the [grammar spec][0], matching some special
/// `Input` described by a [`Parameter`] name.
///
/// See [`parse::parameter()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Copy, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Parameter<Input>(pub Input);