Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure iterators satisfy input_iterator concept for views and ranges #273

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

wdconinc
Copy link
Contributor

@wdconinc wdconinc commented Mar 28, 2022

Some changes to the iterators to make #272 work. This now enables all except for C++ ranges, and there are unit tests to go with it. C++20 unit tests still fail. This applies to all iterators, both mutable and const, but I think for input_iterators that is fine. For the forward iterators we probably need to split the iterators.jinja2 into two(?).

BEGINRELEASENOTES

  • Allow collections to work with STL container algorithms such as std::find, std::find_if
  • Ensure iterators are input_iterators and support concepts for c++20 views and ranges

ENDRELEASENOTES

tests/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Collaborator

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wdconinc. Very nice work (and it seems like you didn't have much problems with the templates).

A few "higher level" questions:

  • What is currently still missing to making the iterators satisfy the forward iterator concept?
  • Do we still need input iterators if we can support forward iterators? If I understand the iterator nesting correctly, than the latter is a superset of the former functionality wise, right?
  • Do the unittests need I/O at the moment? Or could they be put into Catch2 test cases?
  • Would it be possible to add same static checks to the unittests, that essentially just asser that our iterators fulfill the concepts that we want?

python/templates/macros/iterator.jinja2 Outdated Show resolved Hide resolved
python/templates/macros/iterator.jinja2 Outdated Show resolved Hide resolved
python/templates/Collection.cc.jinja2 Outdated Show resolved Hide resolved
@wdconinc
Copy link
Contributor Author

Thanks for the comments. I had a static_assert on is_forward_iterator that was failing due to failing is_constructible, so something is still going wrong (it might have been in part because of where the static_assert was, since some definitions were in other code units, so it was probably bound to fail). Maybe just adding the definitions (I think it was the iterator reference/dereference and increment/decrement) into the class decaration would allow for a static_assert.

Otherwise, I do think we have all the bits and pieces for forward_iterators and probably even birdirectional_iterator.

I added reverse iterators, not because they are required for these concepts, but because they are typically implemented and may be useful. However, there are some issues with the rend sentinel since we use a size_t for the index. A -- beyond the end just wraps around and needs more testing.

It is true that a lot of this can go into unit tests. I was thinking that some of the memory issues would be picked up better by a full IO test, but then I realized that those are all disabled in the sanitizers so it's not quite as useful and unittests would be just fine.

We should likely add a C++20 workflow to the testing, if that is possible given the LCG upstream.

@tmadlener
Copy link
Collaborator

Maybe just adding the definitions (I think it was the iterator reference/dereference and increment/decrement) into the class decaration would allow for a static_assert.

Ah, so if I understand, you were trying to static_assert in the generated code directly? I would have simply done something like we do for some other type checks as well in the unittests, e.g. (there are a few more involved ones for const-correctness further down)

podio/tests/unittest.cpp

Lines 197 to 198 in 1dc1723

STATIC_REQUIRE(std::is_standard_layout_v<ExampleClusterData>); // Generated data classes do not have standard layout
STATIC_REQUIRE(std::is_trivially_copyable_v<ExampleClusterData>); // Generated data classes are not trivially copyable

However, there are some issues with the rend sentinel since we use a size_t for the index. A -- beyond the end just wraps around and needs more testing.

That is a good point, but I think we could put static_cast<size_t>(-1) as a sentinel value there. I don't think we technically support collections with that size, because the ObjectID uses a signed integer for handling the relations during I/O, so numeric_limits<int>::max() is probably the largest collection we can have before things start to break. We had a bit of discussion about that in #170 if you are interested.

@wdconinc
Copy link
Contributor Author

Apparently, I think, one of the issues I'm currently running into to satisfy static_assert(std::ranges::bidirectional_range<ExampleMCCollection>); is the c++ standard defect report http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2325r3.html (views should not be required to be default constructible), which we do not satisfy for iterators. This has been resolved in gcc-12, but not 11.

@wdconinc
Copy link
Contributor Author

wdconinc commented Mar 29, 2022

For views, we may need one more change: iterators must allow for a const dereference, which is not how we are defining them. What do you think of a mutable m_object?

  {{ prefix }}{{ class.bare_type }} operator*() const {
    m_object.m_obj = (*m_collection)[m_index];
    return m_object;
  };
private:
  size_t m_index;
  mutable {{ prefix }}{{ class.bare_type }} m_object;
  const {{ class.bare_type }}ObjPointerContainer* m_collection;

With that change, I think most of the ranges functionality works.

@wdconinc wdconinc marked this pull request as ready for review March 30, 2022 01:35
@wdconinc
Copy link
Contributor Author

The unit tests pass the more advanced tests, but fail the following ones (doesn't fill correctly, doesn't find correctly), so that needs a bit of investigation.

  // fill from const
  collection.clear();
  collection.create();
  auto const_hit = ExampleHit(0x42ULL, 0., 0., 0., 0.);
  //FIXME this should work but we may need a conversion constructor!
  //std::fill(collection.begin(), collection.end(), const_hit);
  //REQUIRE(collection.begin()->cellID() == 0x42ULL);
  // fill from mutable
  collection.clear();
  collection.create();
  auto mutable_hit = MutableExampleHit(0x42ULL, 0., 0., 0., 0.);
  std::fill(collection.begin(), collection.end(), mutable_hit);
  //FIXME
  //REQUIRE(collection.begin()->cellID() == 0x42ULL);

  // find in collection
  collection.clear();
  collection.create();
  collection.create(0x42ULL, 0., 0., 0., 0.);
  auto found = std::find_if(collection.begin(), collection.begin(),
    [](const auto& h){ return h.cellID() == 0x42ULL; });
  STATIC_REQUIRE(std::is_same_v<decltype(found), ExampleHitMutableCollectionIterator>);
  REQUIRE(found != collection.end());
  //FIXME
  //REQUIRE(found->cellID() == 0x42ULL);

Other outstanding questions:

  • Any concerns with mutable for the m_object pointer in the iterator?

@wdconinc
Copy link
Contributor Author

I guess the fact that std::fill doesn't work is expected in the podio philosophy of not modifying objects when they are inside a container. Still confused by why std::find fails...

tests/unittest.cpp Outdated Show resolved Hide resolved
@tmadlener
Copy link
Collaborator

Regarding making the m_object mutable, I think we could do without that actually since we are in any case returning a value and could in principle skip filling the internal one first. Unless I am missing something here and we actually need to update the internal state in operator* for other functionality. As far as I can see we don't have to do that, right? So I think the following should work:

  {{ prefix }}{{ class.bare_type }} operator*() const {
    return {(*m_collection)[m_index])};
  };

(I am not entirely sure if the return like this works or if the {{ prefix }}{{ class.bare_type }} has to be spelled out.)

@wdconinc wdconinc changed the title input_iterator type and unit tests Ensure iterators satisfy bidirectional_iterator concept for views and ranges Mar 30, 2022
@wdconinc
Copy link
Contributor Author

Regarding making the m_object mutable, I think we could do without that actually since we are in any case returning a value and could in principle skip filling the internal one first. Unless I am missing something here and we actually need to update the internal state in operator* for other functionality. As far as I can see we don't have to do that, right? So I think the following should work:

  {{ prefix }}{{ class.bare_type }} operator*() const {
    return {(*m_collection)[m_index])};
  };

I am not sure how strong the equality preserving constraint on dereferencing iterators is. In this case dereferencing the same (unincremented) iterator twice would result in two different objects (both pointing to the same internal data), not the same object (which we have now due to assignment). I am not sure if our implementation would treat those as equal.

I'll run some tests on this.

@tmadlener
Copy link
Collaborator

In this case dereferencing the same (unincremented) iterator twice would result in two different objects (both pointing to the same internal data), not the same object (which we have now due to assignment). I am not sure if our implementation would treat those as equal.

If this falls back to using operator== of the returned object, than we should be OK, since that makes two objects equal to each other if they point to the same data:

bool operator==(const {{ full_type }}& other) const { return m_obj == other.m_obj; }

@wdconinc
Copy link
Contributor Author

If this falls back to using operator== of the returned object, than we should be OK, since that makes two objects equal to each other if they point to the same data:

bool operator==(const {{ full_type }}& other) const { return m_obj == other.m_obj; }

Ah, so that's where the operator== is :-) I couldn't find it.

@tmadlener
Copy link
Collaborator

I think some of the CI failures are not related to the changes in this PR, e.g.:
https://github.com/AIDASoft/podio/runs/5754875143?check_suite_focus=true#step:4:247

They should be fixed once #253 and #254 are merged. You can try to target the develop branch to see how things look with them merged.

I will merge those this week, also to unclog some of the other PRs that are essentially depending on them.

@wdconinc
Copy link
Contributor Author

I think some of the CI failures are not related to the changes in this PR

Yeah, the cvmfs action on macos is fickle (and it's annoying to debug them).

There are some linux fails too, though also unrelated (unused capture in const auto checkCollections = [nElements](const ExampleHitCollection& hits,).

Copy link
Collaborator

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest additions. For the read tests we usually just throw a std::runtime_error if something is not as expected (at least for the time being). However, I am not entirely sure whether it is possible for all the tests in read_iterators.cpp or read_views.cpp to easily formulate conditions. In any case, since the iterators are directly using collection resources after I/O I wouldn't expect any problems there.

Overall this looks good to me, I just have one minor comment inline for one of the unittests.

tests/unittest.cpp Outdated Show resolved Hide resolved
@wdconinc
Copy link
Contributor Author

Thanks for the latest additions. For the read tests we usually just throw a std::runtime_error if something is not as expected (at least for the time being). However, I am not entirely sure whether it is possible for all the tests in read_iterators.cpp or read_views.cpp to easily formulate conditions. In any case, since the iterators are directly using collection resources after I/O I wouldn't expect any problems there.

I was going to remove those IO tests read_iterators and read_views again since they don't really test anything beyond the unit tests.

@wdconinc
Copy link
Contributor Author

wdconinc commented Apr 1, 2022

The single remaining issue in the unit tests (commented out) is with the std::fill and is essentially the following:

for (auto i = collection.begin(); i != collection.end(); i++) {
  *i = MutableExampleHit(0x42ULL, 0., 0., 0., 0.); 
}

This compiles but fails to modify the collection because operator* returns a valid lvalue reference for the purpose of the output_iterator (and forward_iterator, bidirectional_iterator), but it is not resulting in a proper std::fill functionality since I think the assignment happens to a copy and not to the objects pointed to in the container. I am not sure if this is addressable (no pun intended).

It is unfortunate that std::fill 'seems to work' but just leaves the collection unchanged.

I think the line in the loop calls, in sequence and for the lhs only:

MutableExampleHit ExampleHitMutableCollectionIterator::operator*() const {
  return {(*m_collection)[m_index]};
}

MutableExampleHit::MutableExampleHit( ExampleHitObj* obj) : m_obj(obj) {
  if (m_obj) m_obj->acquire();
}

MutableExampleHit& MutableExampleHit::operator=(MutableExampleHit other) {
  swap(*this, other);
  return *this;
}

With mutable m_object in the iterator dereference we don't get any different results but the chain is different:

MutableExampleHit ExampleHitMutableCollectionIterator::operator*() const {
  m_object.m_obj = (*m_collection)[m_index];
  return m_object;
}

MutableExampleHit& MutableExampleHit::operator=(MutableExampleHit other) {
  swap(*this, other);
  return *this;
}

@tmadlener
Copy link
Collaborator

tmadlener commented Apr 1, 2022

std::fill seems to be a tricky one to solve indeed. I can't see an easy way to propagate the swapping of the internal pointers in the objects to be reflected in the collection. But since std::fill only requires a forward_iterator, I also don't see an easy way to make it fail at compile time.

Can we = delete the overloads for our generated collections? Something along the lines of

namespace std {
template<typename T>
std::fill(CollectionIterator, CollectionIterator, const T&) = delete;
}

(and similar for all of the other iterators as well). I know opening namespace std is not exactly considered best practice, but we could at least make things fail to compile for std::fill instead of silently not doing the right thing. Additionally, this doesn't really scale if we have to = delete half of the std::algorithm or std::ranges functionality.

I have played a bit with a normal std::vector and it seems to work there: https://godbolt.org/z/ohrP333Er

Or would you want std::fill to work properly?

@wdconinc
Copy link
Contributor Author

wdconinc commented Apr 1, 2022

Deleting the std::fill is an interesting suggestion. I wonder which other STL algorithms rely on assigning to dereferenced pointers, and whether we would need to delete them all... I could imagine that a large fraction of the modifying sequence operations are likely affected.

@wdconinc
Copy link
Contributor Author

wdconinc commented Apr 1, 2022

So, I guess I'm ok with std::fill failing until someone figures out how to make it work, as long as we are somehow clear about this.

@tmadlener
Copy link
Collaborator

I could imagine that a large fraction of the modifying sequence operations are likely affected.

Right. That is a lot to delete...

So, I guess I'm ok with std::fill failing until someone figures out how to make it work, as long as we are somehow clear about this.

Failing in the sense of "compiling but not working" or actually deleting it to make it fail at compile time? What we can always do (at least at this point) is to properly document what is and what is not possible with our iterators. Which would essentially boil down to "everything that leaves the collection unchanged", I suppose.

@wdconinc
Copy link
Contributor Author

wdconinc commented Apr 1, 2022

Maybe we should ensure that a dereferenced iterator lvalue is always a const object (even for mutable iterators!) so the assignment implied in the std::fill would have to fail.

@wdconinc wdconinc changed the title Ensure iterators satisfy bidirectional_iterator concept for views and ranges Ensure iterators satisfy input_iterator concept for views and ranges Aug 1, 2022
@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 1, 2022

I've restricted the iterators to just input_iterators, with explicit unit tests to check they are not more than that. Due to the limitation in the iterators, the ranges cannot be used with reverse views so those have been removed from the unit tests.

Copy link
Collaborator

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this. I have essentially just minor comments relating to tagging all related tests with one common tag and one placement of an #if.

From the provided unit tests it looks like the std::fill case still compiles, but then fails at runtime, right? I would have assumed our iterators not being forward iterators would make std::fill fail at compile time, but that doesn't seem to be the case. I am not sure I understand why at this point.

tests/unittest.cpp Outdated Show resolved Hide resolved
tests/unittest.cpp Outdated Show resolved Hide resolved
tests/unittest.cpp Outdated Show resolved Hide resolved
tests/unittest.cpp Outdated Show resolved Hide resolved
tests/unittest.cpp Outdated Show resolved Hide resolved
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
tests/unittest.cpp Outdated Show resolved Hide resolved
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 2, 2022

From the provided unit tests it looks like the std::fill case still compiles, but then fails at runtime, right?

I think this still relates to #281. To prevent std::fill on compile we would probably have to remove the Mutable iterators for regular const collections, which are the ones in master already. This PR doesn't really change that.

@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 2, 2022

Possible next ideas to explore:

  • delete std::fill and all other algorithms that result in incorrect behavior,
  • delete the underlying assignment operator that operates on dereferenced pointer lvalues.

@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 2, 2022

I'm thinking the following is what we may need in python/templates/macros/declarations.jinja2:

{% if prefix == 'Mutable' %}
  /// copy-assignment operator
  {{  full_type }}& operator=({{ type }} other);
  {{  full_type }}& operator=({{ full_type }} other);
{% else %}
  /// copy-assignment operator deleted (use Mutable type)
  {{  full_type }}& operator=({{ full_type }} other) = delete;
{% endif %}

This would remove assignments like object1 = object2, but allow mutableobject1 = mutableobject2 and mutableobject1 = object2. Implementation of this highlights a few other false-positives in the unittests (tests/read_test.h where we reuse the p in auto p = mcps[0] which is an Object, not MutableObject). Ultimately this would also remove this assignment in std::fill or the equivalent assignment in

auto i = collection.begin();
*i = MutableExampleHit;

since const-correctness of i and begin() would block assignment.

Ultimately this should mean that std::fill will work on MutableObjectCollection, but would fail at compilation on ObjectCollection.

@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 2, 2022

Nevermind, removing assignment violates std::assignable_from<T&, T> which is required by std::input_or_output_iterator, so removing assignment is not the right approach (though we are allowed to make it fail).

@wdconinc
Copy link
Contributor Author

wdconinc commented Aug 2, 2022

Other things I tried today that didn't pan out:

  • delete begin and force use of cbegin with as_const (cbegin is required to be what begin returns on const),
  • throw exceptions in problematic code (too many false positives, and didn't even block std::fill).

@tmadlener
Copy link
Collaborator

I just tried to explicitly generate deleted overloads for std::fill for all of the iterators and that seems to do the trick and calls to std::fill no longer compile.

changes on top of this PR

There is also some additional work in there which would in principle allow us to do static_asserts to make sure that calling std::fill really doesn't work. It is pretty esoteric, and I essentially just wanted to see if I can get something like that to work, so we could just pull in parts that generate the deletes for the overloads.

@wdconinc
Copy link
Contributor Author

Do we actually need a mutable iterator on collections? Can we just make it entirely non-mutable?

Right now that will fail the following constructs:

  for (auto&& i : coll) {
    i.energy(42); // make sure that we can indeed change the energy here for
                  // non-const collections
  }
  REQUIRE(hit1.energy() == 42);

Are these used in the wild? It only fails at the unittest stage right now, with these specific cases. Does it make sense to force users to treat collections as always const?

@tmadlener
Copy link
Collaborator

Are these used in the wild?

Good question, I don't know. We could at probably flush out most of the use cases with a deprecation warning to see how many there are.

On the other hand, I am not entirely sure if I like the idea of removing the iterators, since operations done through the iterators are working, but assignments through dereferenced iterators are failing.

@hegner
Copy link
Collaborator

hegner commented May 5, 2023

@tmadlener - should we resurrect this?

@tmadlener
Copy link
Collaborator

tmadlener commented May 5, 2023

I am not entirely sure how easy this is to resurrect, given that doing it “properly” would probably mean quite a bit of work. I will have to look at the discussion in here again and see what the current status is. Maybe there are things that can be salvaged for now and we come back to proper ranges support at a later point

@hegner
Copy link
Collaborator

hegner commented May 5, 2023

OK. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants