Support back-references with ">>" #49

Open
jameskirkwood opened this issue Feb 2, 2022 · 3 comments


jameskirkwood commented Feb 2, 2022

Because seq produces a Parser that continues to borrow its tag, it's not possible to use the overloaded right shift operator (>>) with seq to create a back-reference to a previously parsed fragment.

As a basic example, the following will not compile because tag does not live long enough:

fn example() -> Parser<u8, Vec<u8>> {
    (sym(b'<') * none_of(b">").repeat(0..) - sym(b'>')) >> |tag| {
        (call(example) | none_of(b"<>").repeat(0..)) - seq(b"</") - seq(&tag) - sym(b'>')
    }
}

One solution is to modify seq so that it makes an internal copy of tag and moves that copy into the closure it generates. I tried this, but it was not quite successful: I also changed the return type to Parser<'a, I, Vec<I>>, which introduced a copy every time the sequence matched (only for the result to be immediately discarded).
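For illustration, the borrow-versus-own distinction can be reproduced without pom. The following is a minimal std-only sketch, not pom's implementation: `Matcher`, `seq_borrowed`, `seq_owning` and `closing_tag` are hypothetical names invented here, and a matcher simply returns the position after the match instead of a parse result.

```rust
// Std-only stand-in for a parser: a boxed closure over a byte slice.
// `Matcher` and all function names below are hypothetical, not part of pom.
type Matcher<'a> = Box<dyn Fn(&[u8], usize) -> Option<usize> + 'a>;

// Borrowing version: the returned matcher cannot outlive `tag`.
fn seq_borrowed<'a>(tag: &'a [u8]) -> Matcher<'a> {
    Box::new(move |input: &[u8], start: usize| {
        let end = start + tag.len();
        if input.get(start..end) == Some(tag) { Some(end) } else { None }
    })
}

// Owning version: the tag is copied once and moved into the closure,
// so the matcher does not borrow from the caller at all.
fn seq_owning(tag: &[u8]) -> Matcher<'static> {
    let tag = tag.to_vec();
    Box::new(move |input: &[u8], start: usize| {
        let end = start + tag.len();
        if input.get(start..end) == Some(tag.as_slice()) { Some(end) } else { None }
    })
}

// Mirrors the closure passed to `>>`: `tag` is a local that is dropped on return.
fn closing_tag(tag: Vec<u8>) -> Matcher<'static> {
    // Returning `seq_borrowed(&tag)` here would not compile, because the
    // matcher would still borrow `tag` after `tag` has been dropped.
    seq_owning(&tag)
}
```

In this sketch the copy happens once, when the matcher is built, and nothing is copied per match because the matcher only returns a position.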

Perhaps there is a way for seq to support both borrowing and owning its tag, or perhaps there is a good case for a new parser factory that matches against an owned tag?

Suggestions for alternatives are welcome.

Owner

J-F-Liu commented Feb 2, 2022

A workaround is:

fn example<'a>() -> Parser<'a, u8, Vec<u8>> {
    (sym(b'<') * none_of(b">").repeat(0..) - sym(b'>'))
        >> |tag| {
            (call(example) | none_of(b"<>").repeat(0..))
                - seq(b"</") - take(tag.len()).convert(move |t| if t == tag { Ok(()) } else { Err(()) })
                - sym(b'>')
        }
}

Alternatively, you could define a new owned version of seq.
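To show what the workaround does step by step, here is a hand-written, std-only sketch of the same grammar. It is an assumption-laden illustration rather than pom code: `element` is a hypothetical function, errors are collapsed into `None`, and each comment names the pom combinator that the step is meant to mirror.

```rust
// Hypothetical helper, not part of pom: parses `<name>body</name>` starting at
// `start` and returns the innermost body plus the position after the closing tag.
fn element(input: &[u8], start: usize) -> Option<(Vec<u8>, usize)> {
    let mut pos = start;
    // sym(b'<')
    if input.get(pos) != Some(&b'<') {
        return None;
    }
    pos += 1;
    // none_of(b">").repeat(0..): collect the tag name as an owned Vec
    let name_start = pos;
    while input.get(pos).map_or(false, |&b| b != b'>') {
        pos += 1;
    }
    let tag: Vec<u8> = input[name_start..pos].to_vec();
    // sym(b'>')
    if input.get(pos) != Some(&b'>') {
        return None;
    }
    pos += 1;
    // call(example) | none_of(b"<>").repeat(0..)
    let (body, mut pos) = match element(input, pos) {
        Some(nested) => nested,
        None => {
            let mut p = pos;
            while input.get(p).map_or(false, |&b| b != b'<' && b != b'>') {
                p += 1;
            }
            (input[pos..p].to_vec(), p)
        }
    };
    // seq(b"</")
    if input.get(pos..pos + 2) != Some(&b"</"[..]) {
        return None;
    }
    pos += 2;
    // take(tag.len()) compared against the owned tag: this is the back-reference
    let end = pos + tag.len();
    if input.get(pos..end) != Some(tag.as_slice()) {
        return None;
    }
    pos = end;
    // sym(b'>')
    if input.get(pos) != Some(&b'>') {
        return None;
    }
    Some((body, pos + 1))
}
```

The key step is the comparison against the owned `tag`: because the name was copied into a `Vec<u8>`, nothing in the closing-tag check borrows from a value that has already gone out of scope.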

@jameskirkwood
Author

I prefer your workaround as I don't need to use Parser::new, but for the record here is an owned version of seq:

fn seq_owned<'a, I>(tag: Vec<I>) -> Parser<'a, I, Vec<I>>
where
    I: PartialEq + Debug + Clone,
{
    Parser::new(move |input: &[I], start: usize| {
        let mut index = 0;
        loop {
            let pos = start + index;
            if index == tag.len() {
                return Ok((tag.to_owned(), pos));
            }
            if let Some(s) = input.get(pos) {
                if tag[index] != *s {
                    return Err(Error::Mismatch {
                        message: format!("seq {:?} expect: {:?}, found: {:?}", tag, tag[index], s),
                        position: pos,
                    });
                }
            } else {
                return Err(Error::Incomplete);
            }
            index += 1;
        }
    })
}

Author

jameskirkwood commented Feb 20, 2022

...And here is a much shorter owned version of seq that encapsulates your workaround, which could be a useful recipe:

fn seq_owned(tag: &[u8]) -> Parser<u8, ()> {
    let tag = tag.to_owned();
    take(tag.len()).convert(move |t| if t == tag { Ok(()) } else { Err(()) })
}
