Commit

Transitional refactoring of eol.
ColinH committed Nov 15, 2023
1 parent 8a6ae17 commit abcf942
Showing 38 changed files with 270 additions and 385 deletions.
2 changes: 1 addition & 1 deletion doc/Changelog.md
@@ -12,7 +12,7 @@
* Added functions to visit and flatten [nested exceptions](Contrib-and-Examples.md#taopegtlcontribnested_exceptionshpp).
* Added new customization point for error messages.
* Added optional source line output for the tracer.
- * Added new ASCII rules [`cntrl`](Rule-Reference.md#cntrl), [`cr`](Rule-Reference.md#cr), [`esc`](Rule-Reference.md#esc), [`ff`](Rule-Reference.md#ff), [`graph`](Rule-Reference.md#graph), [`ht`](Rule-Reference.md#ht), [`lf`](Rule-Reference.md#lf), [`sp`](Rule-Reference.md#sp), [`vt`](Rule-Reference.md#vt).
+ * Added new ASCII rules [`cntrl`](Rule-Reference.md#cntrl), [`cr`](Rule-Reference.md#cr), [`crlf`](Rule-Reference.md#crlf), [`esc`](Rule-Reference.md#esc), [`ff`](Rule-Reference.md#ff), [`graph`](Rule-Reference.md#graph), [`ht`](Rule-Reference.md#ht), [`lf`](Rule-Reference.md#lf), [`lfcr`](Rule-Reference.md#lfcr), [`sp`](Rule-Reference.md#sp), [`vt`](Rule-Reference.md#vt).
* Added new atomic rule [`everything`](Rule-Reference.md#everything).
* Added new convenience rule [`partial`](Rule-Reference.md#partial-r-).
* Added new convenience rule [`star_partial`](Rule-Reference.md#star_partial-r-).
13 changes: 13 additions & 0 deletions doc/Rule-Reference.md
@@ -786,6 +786,11 @@ ASCII rules do not usually rely on other rules.
* Matches and consumes a single ASCII carriage return character of value `13` or `0x0d`.
* [Equivalent] to `one< '\r' >`.

+ ###### `crlf`
+
+ * Matches and consumes the common ASCII carriage return followed by a line feed.
+ * [Equivalent] to `string< '\r', '\n' >`.
+
###### `digit`

* Matches and consumes a single ASCII decimal digit character.
@@ -859,6 +864,11 @@ ASCII rules do not usually rely on other rules.
* Matches and consumes a single ASCII line feed (new line) character of value `10` or `0x0a`.
* [Equivalent] to `one< '\n' >`.

+ ###### `lfcr`
+
+ * Matches and consumes an uncommon ASCII line feed followed by a carriage return.
+ * [Equivalent] to `string< '\n', '\r' >`.
+
###### `lower`

* Matches and consumes a single ASCII lower-case alphabetic character.
@@ -955,6 +965,7 @@ ASCII rules do not usually rely on other rules.
* [Equivalent] to `seq< one< C >... >`.
* [Meta data] and [implementation] mapping:
    - `ascii::string<>::rule_t` is `internal::success`
+   - `ascii::string< C >::rule_t` is `internal::one< result_on_found::success, internal::peek_char, C >`
    - `ascii::string< C... >::rule_t` is `internal::string< C... >`

###### `TAO_PEGTL_ISTRING( "..." )`
@@ -1565,6 +1576,7 @@ Binary rules do not rely on other rules.
* [`cntrl`](#cntrl) <sup>[(ascii rules)](#ascii-rules)</sup>
* [`control< C, R... >`](#control-c-r-) <sup>[(meta rules)](#meta-rules)</sup>
* [`cr`](#cr) <sup>[(ascii rules)](#ascii-rules)</sup>
+ * [`crlf`](#crlf) <sup>[(ascii rules)](#ascii-rules)</sup>
* [`dash`](#dash) <sup>[(icu rules)](#icu-rules-for-binary-properties)</sup>
* [`decomposition_type< V >`](#decomposition_type-v-) <sup>[(icu rules)](#icu-rules-for-enumerated-properties)</sup>
* [`default_ignorable_code_point`](#default_ignorable_code_point) <sup>[(icu rules)](#icu-rules-for-binary-properties)</sup>
@@ -1613,6 +1625,7 @@ Binary rules do not rely on other rules.
* [`keyword< C... >`](#keyword-c-) <sup>[(ascii rules)](#ascii-rules)</sup>
* [`lead_canonical_combining_class< V >`](#lead_canonical_combining_class-v-) <sup>[(icu rules)](#icu-rules-for-value-properties)</sup>
* [`lf`](#lf) <sup>[(ascii rules)](#ascii-rules)</sup>
+ * [`lfcr`](#lfcr) <sup>[(ascii rules)](#ascii-rules)</sup>
* [`line_break< V >`](#line_break-v-) <sup>[(icu rules)](#icu-rules-for-enumerated-properties)</sup>
* [`list< R, S >`](#list-r-s-) <sup>[(convenience)](#convenience)</sup>
* [`list< R, S, P >`](#list-r-s-p-) <sup>[(convenience)](#convenience)</sup>
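
Reviewer's sketch (not part of the commit): the `crlf` and `lfcr` entries added to doc/Rule-Reference.md above are documented as equivalent to `string< '\r', '\n' >` and `string< '\n', '\r' >`. The following is a minimal standalone illustration of that fixed-sequence matching, assuming a `std::string_view` as input. The names `fixed_string`, `crlf_sketch`, and `lfcr_sketch` are hypothetical, and this is not how PEGTL implements `internal::string`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a fixed-string rule; NOT the PEGTL implementation.
template< char... Cs >
struct fixed_string
{
   // On success, removes sizeof...( Cs ) characters from the front of `in`.
   // On failure, `in` is left unchanged.
   static bool match( std::string_view& in )
   {
      constexpr char expected[] = { Cs..., '\0' };
      constexpr std::size_t n = sizeof...( Cs );
      if( in.substr( 0, n ) != std::string_view( expected, n ) ) {
         return false;
      }
      in.remove_prefix( n );
      return true;
   }
};

using crlf_sketch = fixed_string< '\r', '\n' >;  // mirrors string< '\r', '\n' >
using lfcr_sketch = fixed_string< '\n', '\r' >;  // mirrors string< '\n', '\r' >
```

Both characters must be present and in order for the match to succeed, so a lone `'\r'` does not match `crlf_sketch`.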
1 change: 1 addition & 0 deletions include/tao/pegtl.hpp
@@ -12,6 +12,7 @@
#include "pegtl/version.hpp"

#include "pegtl/ascii.hpp"
+ #include "pegtl/eol.hpp"
#include "pegtl/rules.hpp"
#include "pegtl/utf8.hpp"

2 changes: 1 addition & 1 deletion include/tao/pegtl/argv_input.hpp
@@ -28,7 +28,7 @@ namespace TAO_PEGTL_NAMESPACE

} // namespace internal

- template< tracking_mode P = tracking_mode::eager, typename Eol = eol::lf_crlf >
+ template< tracking_mode P = tracking_mode::eager, typename Eol = ascii::lf_crlf >
struct argv_input
: memory_input< P, Eol >
{
2 changes: 0 additions & 2 deletions include/tao/pegtl/ascii.hpp
@@ -20,7 +20,6 @@ namespace TAO_PEGTL_NAMESPACE
struct any : internal::any< internal::peek_char > {};
struct blank : internal::one< internal::result_on_found::success, internal::peek_char, ' ', '\t' > {};
struct cntrl : internal::ranges< internal::peek_char, static_cast< char >( 0 ), static_cast< char >( 31 ), static_cast< char >( 127 ) > {};
- struct cr : internal::one< internal::result_on_found::success, internal::peek_char, '\r' > {};
struct digit : internal::range< internal::result_on_found::success, internal::peek_char, '0', '9' > {};
struct esc : internal::one< internal::result_on_found::success, internal::peek_char, static_cast< char >( 27 ) > {};
struct ellipsis : internal::string< '.', '.', '.' > {};
@@ -33,7 +32,6 @@ namespace TAO_PEGTL_NAMESPACE
struct identifier : internal::identifier {};
template< char... Cs > struct istring : internal::istring< Cs... > {};
template< char... Cs > struct keyword : internal::seq< internal::string< Cs... >, internal::not_at< internal::identifier_other > > { static_assert( sizeof...( Cs ) > 0 ); };
- struct lf : internal::one< internal::result_on_found::success, internal::peek_char, '\n' > {};
struct lower : internal::range< internal::result_on_found::success, internal::peek_char, 'a', 'z' > {};
template< char... Cs > struct not_one : internal::one< internal::result_on_found::failure, internal::peek_char, Cs... > {};
template< char Lo, char Hi > struct not_range : internal::range< internal::result_on_found::failure, internal::peek_char, Lo, Hi > {};
26 changes: 20 additions & 6 deletions include/tao/pegtl/buffer_input.hpp
@@ -33,14 +33,14 @@

namespace TAO_PEGTL_NAMESPACE
{
- template< typename Reader, typename Eol = eol::lf_crlf, typename Source = std::string, std::size_t Chunk = 64 >
+ template< typename Reader, typename Eol = ascii::lf_crlf, typename Source = std::string, std::size_t Chunk = 64 >
class buffer_input
{
public:
using data_t = char;
using reader_t = Reader;

- using eol_t = Eol;
+ using eol_rule = Eol;
using source_t = Source;

using rewind_position_t = internal::large_position;
@@ -126,7 +126,7 @@ namespace TAO_PEGTL_NAMESPACE

void bump( const std::size_t in_count = 1 ) noexcept
{
- internal::bump( m_current, in_count, Eol::ch );
+ internal::bump( m_current, in_count, '\n' );
}

void bump_in_this_line( const std::size_t in_count = 1 ) noexcept
@@ -214,16 +214,30 @@ namespace TAO_PEGTL_NAMESPACE
return static_cast< std::size_t >( m_buffer.get() + m_maximum - m_end );
}

+ template< apply_mode A,
+ rewind_mode M,
+ template< typename... >
+ class Action,
+ template< typename... >
+ class Control,
+ typename ParseInput,
+ typename... States >
+ [[nodiscard]] static bool match_eol( ParseInput& in, States&&... st )
+ {
+ if( Control< typename Eol::rule_t >::template match< A, M, Action, Control >( in, st... ) ) {
+ // in.template consume< eol_consume_tag >( 0 );
+ return true;
+ }
+ return false;
+ }
+
private:
Reader m_reader;
std::size_t m_maximum;
std::unique_ptr< char[] > m_buffer;
rewind_position_t m_current;
char* m_end;
const Source m_source;
-
- public:
- std::size_t private_depth = 0;
};

} // namespace TAO_PEGTL_NAMESPACE
3 changes: 2 additions & 1 deletion include/tao/pegtl/contrib/parse_tree.hpp
@@ -21,6 +21,7 @@
#include "../apply_mode.hpp"
#include "../config.hpp"
#include "../demangle.hpp"
+ #include "../eol.hpp"
#include "../memory_input.hpp"
#include "../normal.hpp"
#include "../nothing.hpp"
@@ -105,7 +106,7 @@ namespace TAO_PEGTL_NAMESPACE::parse_tree
return { m_begin.data, m_end.data };
}

- template< tracking_mode P = tracking_mode::eager, typename Eol = eol::lf_crlf >
+ template< tracking_mode P = tracking_mode::eager, typename Eol = ascii::lf_crlf >
[[nodiscard]] memory_input< P, Eol > as_memory_input() const
{
assert( has_content() );
4 changes: 2 additions & 2 deletions include/tao/pegtl/contrib/raw_string.hpp
@@ -27,7 +27,7 @@ namespace TAO_PEGTL_NAMESPACE
using subs_t = empty_list;

template< apply_mode A,
- rewind_mode,
+ rewind_mode M,
template< typename... >
class Action,
template< typename... >
@@ -43,7 +43,7 @@ namespace TAO_PEGTL_NAMESPACE
case Open:
marker_size = i + 1;
in.bump_in_this_line( marker_size );
- (void)eol::match( in );
+ (void)in.template match_eol< A, M, Action, Control >( in );
return true;
case Marker:
break;
2 changes: 1 addition & 1 deletion include/tao/pegtl/cstream_input.hpp
@@ -15,7 +15,7 @@

namespace TAO_PEGTL_NAMESPACE
{
- template< typename Eol = eol::lf_crlf, std::size_t Chunk = 64 >
+ template< typename Eol = ascii::lf_crlf, std::size_t Chunk = 64 >
struct cstream_input
: buffer_input< internal::cstream_reader, Eol, std::string, Chunk >
{
38 changes: 16 additions & 22 deletions include/tao/pegtl/eol.hpp
@@ -1,4 +1,4 @@
- // Copyright (c) 2016-2023 Dr. Colin Hirsch and Daniel Frey
+ // Copyright (c) 2014-2023 Dr. Colin Hirsch and Daniel Frey
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

@@ -7,32 +7,26 @@

#include "config.hpp"

- #include "internal/eol.hpp"
-
- #include "internal/cr_crlf_eol.hpp"
- #include "internal/cr_eol.hpp"
- #include "internal/crlf_eol.hpp"
- #include "internal/lf_crlf_eol.hpp"
- #include "internal/lf_eol.hpp"
+ #include "internal/one.hpp"
+ #include "internal/peek_char.hpp"
+ #include "internal/result_on_found.hpp"
+ #include "internal/sor.hpp"
+ #include "internal/string.hpp"

namespace TAO_PEGTL_NAMESPACE
{
inline namespace ascii
{
- // Struct eol is both a rule and a pseudo-namespace for the
- // member structs cr, etc. (which are not themselves rules).
-
- struct eol
- : internal::eol
- {
- // clang-format off
- struct cr : internal::cr_eol {};
- struct cr_crlf : internal::cr_crlf_eol {};
- struct crlf : internal::crlf_eol {};
- struct lf : internal::lf_eol {};
- struct lf_crlf : internal::lf_crlf_eol {};
- // clang-format on
- };
+ // clang-format off
+ struct cr : internal::one< internal::result_on_found::success, internal::peek_char, '\r' > {};
+ struct crlf : internal::string< '\r', '\n' > {};
+ struct lf : internal::one< internal::result_on_found::success, internal::peek_char, '\n' > {};
+ struct lfcr : internal::string< '\n', '\r' > {};
+ struct cr_lf : internal::one< internal::result_on_found::success, internal::peek_char, '\r', '\n' > {};
+ struct cr_crlf : internal::sor< internal::string< '\r', '\n' >, internal::one< internal::result_on_found::success, internal::peek_char, '\r' > > {};
+ struct lf_crlf : internal::sor< internal::one< internal::result_on_found::success, internal::peek_char, '\n' >, internal::string< '\r', '\n' > > {};
+ struct cr_lf_crlf : internal::sor< internal::string< '\r', '\n' >, internal::one< internal::result_on_found::success, internal::peek_char, '\r', '\n' > > {};
+ // clang-format on

} // namespace ascii

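Reviewer's sketch (not part of the commit): the combined end-of-line rules in eol.hpp are now plain compositions of `sor`, `one`, and `string`, so the order of alternatives decides what is consumed. The standalone functions below are an assumed, simplified reading of `lf_crlf` and `cr_lf_crlf` over a `std::string_view`. The names `match_lf_crlf` and `match_cr_lf_crlf` are hypothetical helpers, not PEGTL API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers, not PEGTL API: each returns how many characters the
// end-of-line alternative would consume at the start of `in` (0 = no match).

// Reading of sor< one< '\n' >, string< '\r', '\n' > >.
inline std::size_t match_lf_crlf( const std::string_view in )
{
   if( !in.empty() && in[ 0 ] == '\n' ) {
      return 1;  // first alternative: a single line feed
   }
   if( in.substr( 0, 2 ) == "\r\n" ) {
      return 2;  // second alternative: carriage return + line feed
   }
   return 0;  // a lone carriage return is not an end-of-line here
}

// Reading of sor< string< '\r', '\n' >, one< '\r', '\n' > >.
inline std::size_t match_cr_lf_crlf( const std::string_view in )
{
   if( in.substr( 0, 2 ) == "\r\n" ) {
      return 2;  // tried first, so CR LF counts as one line ending
   }
   if( !in.empty() && ( in[ 0 ] == '\r' || in[ 0 ] == '\n' ) ) {
      return 1;  // otherwise a single CR or a single LF
   }
   return 0;
}
```

In `cr_lf_crlf` the two-character alternative comes first. With the opposite order, a CR LF pair would be consumed as two separate line endings.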
4 changes: 2 additions & 2 deletions include/tao/pegtl/file_input.hpp
@@ -22,14 +22,14 @@
namespace TAO_PEGTL_NAMESPACE
{
#if defined( _POSIX_MAPPED_FILES ) || defined( _WIN32 )
- template< tracking_mode P = tracking_mode::eager, typename Eol = eol::lf_crlf >
+ template< tracking_mode P = tracking_mode::eager, typename Eol = ascii::lf_crlf >
struct file_input
: mmap_input< P, Eol >
{
using mmap_input< P, Eol >::mmap_input;
};
#else
- template< tracking_mode P = tracking_mode::eager, typename Eol = eol::lf_crlf >
+ template< tracking_mode P = tracking_mode::eager, typename Eol = ascii::lf_crlf >
struct file_input
: read_input< P, Eol >
{
23 changes: 16 additions & 7 deletions include/tao/pegtl/internal/action_input.hpp
@@ -62,11 +62,13 @@ namespace TAO_PEGTL_NAMESPACE::internal

[[nodiscard]] std::string string() const
{
+ static_assert( sizeof( data_t ) == 1 );
return std::string( static_cast< const char* >( m_begin.data ), size() );
}

[[nodiscard]] std::string_view string_view() const noexcept
{
+ static_assert( sizeof( data_t ) == 1 );
return std::string_view( static_cast< const char* >( m_begin.data ), size() );
}

@@ -75,34 +77,41 @@ namespace TAO_PEGTL_NAMESPACE::internal
return m_begin.data[ offset ];
}

+ template< typename T >
+ [[nodiscard]] T peek_as( const std::size_t offset = 0 ) const noexcept
+ {
+ static_assert( sizeof( T ) == sizeof( data_t ) );
+ return static_cast< T >( peek( offset ) );
+ }
+
[[nodiscard]] char peek_char( const std::size_t offset = 0 ) const noexcept
{
- return static_cast< char >( peek( offset ) );
+ return peek_as< char >( offset );
}

[[nodiscard]] std::byte peek_byte( const std::size_t offset = 0 ) const noexcept
{
- return static_cast< std::byte >( peek( offset ) );
+ return peek_as< std::byte >( offset );
}

[[nodiscard]] std::uint8_t peek_uint8( const std::size_t offset = 0 ) const noexcept
{
- return static_cast< std::uint8_t >( peek( offset ) );
+ return peek_as< std::uint8_t >( offset );
}

[[nodiscard]] const ParseInput& input() const noexcept
{
return m_input;
}

- [[nodiscard]] const rewind_position_t& rewind_position() const noexcept
+ [[nodiscard]] decltype( auto ) current_position() const
{
- return m_begin;
+ return m_input.previous_position( m_begin ); // NOTE: O(n) with lazy inputs -- n is return value!
}

- [[nodiscard]] decltype( auto ) current_position() const
+ [[nodiscard]] const rewind_position_t& rewind_position() const noexcept
{
- return m_input.previous_position( m_begin ); // NOTE: O(n) with lazy inputs -- n is return value!
+ return m_begin;
}
}

protected:
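
Reviewer's sketch (not part of the commit): the action_input.hpp hunks above route `peek_char`, `peek_byte`, and `peek_uint8` through a single `peek_as< T >` that statically checks `sizeof( T ) == sizeof( data_t )`. The standalone class below shows the same pattern in isolation. `tiny_view` and its `begin` member are hypothetical names and do not mirror the real `action_input` layout.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical minimal view over a buffer; it only illustrates the peek_as pattern.
template< typename Data >
struct tiny_view
{
   const Data* begin;

   [[nodiscard]] Data peek( const std::size_t offset = 0 ) const noexcept
   {
      return begin[ offset ];
   }

   // Single conversion point: refuses to compile for types of a different size.
   template< typename T >
   [[nodiscard]] T peek_as( const std::size_t offset = 0 ) const noexcept
   {
      static_assert( sizeof( T ) == sizeof( Data ) );
      return static_cast< T >( peek( offset ) );
   }

   [[nodiscard]] char peek_char( const std::size_t offset = 0 ) const noexcept
   {
      return peek_as< char >( offset );
   }

   [[nodiscard]] std::uint8_t peek_uint8( const std::size_t offset = 0 ) const noexcept
   {
      return peek_as< std::uint8_t >( offset );
   }
};
```

Keeping the cast in one place means a request such as `peek_as< std::uint32_t >` on a byte-sized view fails at compile time instead of silently widening a single element.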
2 changes: 1 addition & 1 deletion include/tao/pegtl/internal/bump_help.hpp
@@ -14,7 +14,7 @@ namespace TAO_PEGTL_NAMESPACE::internal
template< typename Rule, typename ParseInput >
void bump_help( ParseInput& in, const std::size_t count )
{
- if constexpr( Rule::test_any( ParseInput::eol_t::ch ) ) {
+ if constexpr( Rule::test_any( '\n' ) ) {
in.bump( count );
}
else {
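
Reviewer's sketch (not part of the commit): `bump_help` now asks at compile time whether the rule could ever match a literal `'\n'`, instead of consulting `ParseInput::eol_t::ch`. The hunk is cut off after `else {`, so the second branch below is an assumption: it presumably advances within the current line only. `one_of`, `position_sketch`, and `bump_help_sketch` are hypothetical names used only for this standalone illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical rule stand-in: matches any one of the characters Cs.
template< char... Cs >
struct one_of
{
   [[nodiscard]] static constexpr bool test_any( const char c ) noexcept
   {
      return ( ( c == Cs ) || ... );
   }
};

// Hypothetical position type with 1-based line and column.
struct position_sketch
{
   std::size_t line = 1;
   std::size_t column = 1;
};

template< typename Rule >
void bump_help_sketch( position_sketch& pos, const std::string_view consumed ) noexcept
{
   if constexpr( Rule::test_any( '\n' ) ) {
      // The rule may have consumed a newline: scan for line breaks.
      for( const char c : consumed ) {
         if( c == '\n' ) {
            ++pos.line;
            pos.column = 1;
         }
         else {
            ++pos.column;
         }
      }
   }
   else {
      // The rule can never match '\n': no scan needed (assumed fast path).
      pos.column += consumed.size();
   }
}
```

Because the branch is selected with `if constexpr`, rules that cannot match a newline never pay for the per-character scan.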
34 changes: 0 additions & 34 deletions include/tao/pegtl/internal/cr_crlf_eol.hpp

This file was deleted.
