Correct approach for context-dependent rules #208
Replies: 5 comments
-
Regarding try-catch and keeping the flag consistent: the grammar might throw exceptions, but it doesn't catch them itself. An exception usually aborts the parsing run, which is why we call them "global errors". In that case it shouldn't matter what happens to the flag, because you never backtrack after an exception. Unless you are doing something that makes the grammar continue after an exception, you are done whenever one occurs.
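To make the distinction concrete, here is a minimal sketch that does not use the PEGTL at all. The names flag_guard and match_one_line are hypothetical and only model the idea: a local failure returns false and the caller may backtrack, so the flag has to be back at its old value; an exception leaves the whole run, so nobody looks at the flag afterwards. A scope guard happens to cover both exit paths without any try-catch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string_view>

// Hypothetical helper, not part of the PEGTL: bumps the counter for the
// lifetime of the guard and restores it on every exit path.
struct flag_guard
{
   int& flag;

   explicit flag_guard( int& f ) : flag( f ) { ++flag; }
   ~flag_guard() { --flag; }

   flag_guard( const flag_guard& ) = delete;
   flag_guard& operator=( const flag_guard& ) = delete;
};

// Toy stand-in for a rule: succeeds if the text contains no line break.
// An embedded '\0' plays the role of a "global error" and throws.
bool match_one_line( std::string_view text, int& nocrlf )
{
   const flag_guard guard( nocrlf );
   for( const char c : text ) {
      if( c == '\0' ) {
         throw std::runtime_error( "global error" );
      }
      if( c == '\n' || c == '\r' ) {
         return false;  // local failure, the caller may backtrack
      }
   }
   return true;
}
```

Whether the function returns true, returns false, or throws, nocrlf ends up at the value it had on entry.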
-
Regarding the additional state, you could indeed use something similar to [...]
-
The way your [...] Control< sep_horiz >::template match< A, R, Action, Control >( in, nocrlf, st... ); instead, i.e. routing the invocation of the sub-rules through the Control just like all built-in rules do.
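A small model may help to see why the routing matters. Everything below is a hypothetical miniature of the rule/control split and not PEGTL code: digit, counting_control, one_digit_direct and one_digit_routed are made-up names. The point it tries to show is that a control can only observe (or act on) a sub-rule when the parent rule invokes that sub-rule through the control.

```cpp
#include <cassert>
#include <string_view>

// Toy rule: consumes one decimal digit from the front of the input.
struct digit
{
   static bool match( std::string_view& in )
   {
      if( !in.empty() && in.front() >= '0' && in.front() <= '9' ) {
         in.remove_prefix( 1 );
         return true;
      }
      return false;
   }
};

// Toy control: counts every attempt made through it, then forwards.
template< typename Rule >
struct counting_control
{
   static inline int attempts = 0;

   static bool match( std::string_view& in )
   {
      ++attempts;
      return Rule::match( in );
   }
};

// Calls the sub-rule directly: the control never sees the attempt.
template< template< typename... > class Control >
struct one_digit_direct
{
   static bool match( std::string_view& in )
   {
      return digit::match( in );
   }
};

// Routes the sub-rule through the control.
template< template< typename... > class Control >
struct one_digit_routed
{
   static bool match( std::string_view& in )
   {
      return Control< digit >::match( in );
   }
};
```

Both variants match the same input, but only the routed one increments counting_control< digit >::attempts.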
-
In the PEGTL, each rule should do the minimum of what is required; you then combine those simple rules into more complicated constructs. This means:

struct is_crlf
{
   using subs_t = pegtl::empty_list;

   template< pegtl::apply_mode A, pegtl::rewind_mode R,
             template< typename ... > class Action,
             template< typename ... > class Control,
             typename Input, typename... States >
   static bool match(Input &in, int &disable_crlf, States &&...st)
   {
      // Consumes no input; succeeds only while line breaks are allowed.
      return disable_crlf == 0;
   }
};

and then use [...]. I could also continue to talk about the recently added [...].

It looks to me as if you are marking a whole sub-part of the grammar, and these marked parts can potentially nest. Instead of having a run-time flag, one could also think about using a control class wrapper as a flag. This would automatically give you everything you need. Something along those (untested) lines: https://godbolt.org/z/RaAnM2

#include <type_traits>
#include <tao/pegtl.hpp>

using namespace tao::pegtl;

namespace demo
{
   namespace internal
   {
      template< template< typename... > class Control >
      struct disable_crlf_control
      {
         template< typename Rule >
         struct type
            : Control< Rule >
         {
            static constexpr bool crlf_disabled = true;
         };
      };

      template< typename Control, typename = void >
      struct is_crlf_disabled
         : std::false_type {};

      template< typename Control >
      struct is_crlf_disabled< Control, std::enable_if_t< Control::crlf_disabled > >
         : std::true_type {};

   }  // namespace internal

   template< typename Rule >
   struct disable_crlf
   {
      using subs_t = type_list< Rule >;

      template< apply_mode A,
                rewind_mode M,
                template< typename ... > class Action,
                template< typename ... > class Control,
                typename ParseInput,
                typename... States >
      static bool match( ParseInput& in, States&&... st )
      {
         if constexpr( internal::is_crlf_disabled< Control< void > >::value ) {
            return Control< Rule >::template match< A, M, Action, Control >( in, st... );
         }
         else {
            return internal::disable_crlf_control< Control >::template type< Rule >::template match< A, M, Action, internal::disable_crlf_control< Control >::template type >( in, st... );
         }
      }
   };

   struct is_crlf
   {
      using subs_t = empty_list;

      template< apply_mode A,
                rewind_mode M,
                template< typename ... > class Action,
                template< typename ... > class Control,
                typename ParseInput,
                typename... States >
      static bool match( ParseInput& in, States&&... st )
      {
         return !internal::is_crlf_disabled< Control< void > >::value;
      }
   };

}  // namespace demo
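The compile-time detection in that snippet can be checked in isolation, without the PEGTL. The following sketch copies disable_crlf_control and is_crlf_disabled from the code above and adds a made-up plain_control as a stand-in for an ordinary control class template; plain_control and the two aliases are assumptions for illustration only.

```cpp
#include <type_traits>

// Hypothetical stand-in for an ordinary control class template.
template< typename Rule >
struct plain_control
{};

// Wrapper that marks a control by adding a static member.
template< template< typename... > class Control >
struct disable_crlf_control
{
   template< typename Rule >
   struct type
      : Control< Rule >
   {
      static constexpr bool crlf_disabled = true;
   };
};

// Primary template: chosen when Control has no usable crlf_disabled member.
template< typename Control, typename = void >
struct is_crlf_disabled
   : std::false_type
{};

// Specialisation: chosen when Control::crlf_disabled exists and is true,
// because std::enable_if_t< true > is void and matches the default argument.
template< typename Control >
struct is_crlf_disabled< Control, std::enable_if_t< Control::crlf_disabled > >
   : std::true_type
{};

using wrapped = disable_crlf_control< plain_control >::type< void >;
using wrapped_twice = disable_crlf_control< disable_crlf_control< plain_control >::type >::type< void >;

static_assert( !is_crlf_disabled< plain_control< void > >::value );
static_assert( is_crlf_disabled< wrapped >::value );
static_assert( is_crlf_disabled< wrapped_twice >::value );
```

Since the marker lives in the control type, it is part of the template arguments passed down to the sub-rules, so there is no run-time value that could get out of sync when the parser backtracks.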
-
@samhocevar Did you find a good solution to this issue? Anything worth sharing?
-
I wanted to know whether this is the correct approach to my problem. I am using your Lua grammar to parse a language close to Lua, i.e. mostly whitespace-agnostic, with a few exceptions where carriage returns are disallowed. The rest of the grammar fits the language perfectly, so I have only modified it in the following ways:

add an int nocrlf = 0 parse state
add a disable_crlf<true|false> rule that always succeeds but changes the nocrlf value
in addition to the sep rule for separators, add sep_horiz (which only matches horizontal whitespace) and sep_normal (which matches all separators including carriage returns)
make the sep rule behave differently depending on the value of nocrlf
add a one_line_seq template rule that behaves like seq with carriage returns disabled

This allows me to use all the other grammar rules inside one_line_seq without modifications. But I am not sure I am following the spirit of the PEGTL here.

Is the try_catch idiom the correct way to ensure nocrlf remains consistent with backtracking?
Is there a way to store nocrlf somewhere in the grammar so that callers do not have to instantiate it? Is this what tao::pegtl::state<> is for?
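For readers who want a concrete picture of the run-time flag idea, here is a rough model written without the PEGTL. It is not the code from this post: the three functions only borrow the names sep_horiz, sep_normal and sep, they work on a std::string_view instead of a PEGTL input, and they ignore comments, which a real Lua separator rule would also have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Each function returns how many leading characters it would consume.

// Horizontal whitespace only.
std::size_t sep_horiz( std::string_view in )
{
   std::size_t n = 0;
   while( n < in.size() && ( in[ n ] == ' ' || in[ n ] == '\t' ) ) {
      ++n;
   }
   return n;
}

// All whitespace, including carriage returns and line feeds.
std::size_t sep_normal( std::string_view in )
{
   std::size_t n = 0;
   while( n < in.size() && ( in[ n ] == ' ' || in[ n ] == '\t' || in[ n ] == '\r' || in[ n ] == '\n' ) ) {
      ++n;
   }
   return n;
}

// Picks one of the two depending on the run-time flag.
std::size_t sep( std::string_view in, const int nocrlf )
{
   return ( nocrlf != 0 ) ? sep_horiz( in ) : sep_normal( in );
}
```

With nocrlf at zero the separator runs across line breaks; with a non-zero value it stops in front of the first one.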