.info() method in class Sequence #49

VascoSch92 · 2024-04-14T14:35:36Z

Add a .info() method to the class Sequence which print all the relevant information about the chose sequence.
Infornation to include:

Name of the sequence
Names of the classes implementing the sequence
First 5 (?) terms of the sequence
is finite? If yes also the length
Is periodic
Reference (?)
Is mononotonic increasing? Increasing? Constant?
Is strictly positive? Positive?
...

Open question:

what should be the format of the output?

API:

@property
def info(self) -> str:
    ...

The text was updated successfully, but these errors were encountered:

fkiraly · 2024-04-14T22:48:57Z

If we were using scikit-base, as proposed in #45, most of these I'd model as tags, which you get as a dict via get_tags, and one can use for search and retrieval via all_objects.

e.g.,

class Fibonacci():

_tags = {
    "name": "Fibonacci Sequence",
    "OEIS": True,
    "OEIS_ID": "A000045",
    "is_finite": False,
    "is_periodic": False,
    "is_monotonic_increasing": True,
}

The benefit is, for a new property, a new tag is added easily.

I am also thinking about print_tags or similar, at least for sktime, for a version with explanation of each tag, and meta-tags or tag documentation, see this PR for a draft design covering the second half of this: sktime/sktime#6289

Things I would not make tags or methods:

references, I'd include them in docstring in a Reference section, and use sth like numpydoc/rst.
names of classes implementing - why would ther be multiple?
- either way, related classes can be referenced in the docstring, unless we need to query it at runtime

fkiraly · 2024-04-15T00:16:56Z

Taking inspiration from this discussion, here's an experimental design that allows tags to be accessed as attributes, in BaseObject descendants:

sktime/skbase#313

Reviews of the idea appreciated. This would allow sth like

fi = Fibonacci()

fi.is_periodic
# False

if the tags are set in the dict, as above.

The advantage above attributes, or properties:

the "tags" are localized in one place, separate from other attributes
they are also bundled in the code, in one place (while one could do that with attribs too, the dict enforces it)

Disadvantage is that it is more custom, and not entirely "python zen".

Not too strong arguments either way imo, opinions welcome (especially critical ones).

VascoSch92 · 2024-04-15T20:01:52Z

The benefit is, for a new property, a new tag is added easily.

I see your point. The problem is that I don't know if it is scalable. Let's say we want to delete the property is_finite. With the tag method I have to go through all the 50+ sequences that are already implemented and delete it from _tags. In the current setting. I can just delete 3 methods in 3 different classes and it is done. Same thing if we want to add a new property. I saw that tags are more adapt to a skpro setting where you use it as a label to classify/order statistical models.

names of classes implementing - why would ther be multiple?

The problem is that a sequence of number can have different names. For example, the sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89,...
is called the Fibonacci sequence, or the Fibonacci numbers, or the sequence A000045. Therefore, you can instantiate it using the classes A000045, FibonacciNumbers or FibonacciSequence. Same classes but different names.

fkiraly · 2024-04-17T11:41:57Z

In the current setting. I can just delete 3 methods in 3 different classes and it is done. Same thing if we want to add a new property.

Can you explain that?

I agree that if there is some default logic that infers the property, it's probably not good to make it a tag.
Though, what is this being inferred from? Is being finite not an inherent property of the sequence?

In case you are worried about inheritance: the tag system in scikit-base does support inheritance. That is, if all descendants of a class have a parent where a tag is set, but do not set the tag themselves, then they will behave as if they had the ancestor's tag.

fkiraly · 2024-04-17T11:44:52Z

Same classes but different names.

I would be extra cautious and would strongly recommend not to introduce different class names.

Yes, an object can have different names in literature, that's the same in stats/ML, e.g., Kriging, Gaussian process regression, kernel ridge regression with error variance fitting, all refer to the same algorithm (including parameters) up to renaming of symbols.

I think dealing with different names is more a matter of documentation and tagging, rather than creating multiple classes though, because classes live on the software architecture level, and proliferating references is anti-DRY.

Where I could understand different classes for the same mathematical object is OEIS vs Fibonacci, but here I would argue that the logical object differs, in parameterization. That is, OEIS("A00045") happens to be the same as Fibonacci mathematically, but it is so in different parameterization. OEIS is distinct from Fibonacci, as it parameterizes sequences by the OEIS register identifier.

VascoSch92 · 2024-04-18T12:17:46Z

In case you are worried about inheritance: the tag system in scikit-base does support inheritance. That is, if all descendants of a class have a parent where a tag is set, but do not set the tag themselves, then they will behave as if they had the ancestor's tag.

Ah ok i didn't know. Is there a documentation for this class?

I would be extra cautious and would strongly recommend not to introduce different class names.

The idea was just to make the package more user-firendly.

fkiraly · 2024-04-18T13:02:36Z

The idea was just to make the package more user-firendly.

I get that, but I would argue that proliferation of names makes it less user friendly actually.

See https://peps.python.org/pep-0020/

Violates "There should be one-- and preferably only one --obvious way to do it."

The "why" goes along the lines, users will be confused about which one to pick, and will start wondering whether these are indeed just aliases. There is also a higher risk of errors as you allow inconsistency of imports across different locations. For classes identity matters too, so either you have multiple distinct classes (which is bad as identical objects can arise from non-identical classes, how do you check instances), or all except one of the classes will point to one other class, which users will not expect, in which case one can easily argue for leaving just that one special class in place.

VascoSch92 added the enhancement improvements to existing features that do not introduce new functionality label Apr 14, 2024

VascoSch92 changed the title ~~Info method in Core~~ .info() method in class Sequence Apr 14, 2024

VascoSch92 added the question Further information is requested label Apr 14, 2024

fkiraly mentioned this issue Apr 18, 2024

[DOC] docstring with mathematical description for QPD_Empirical sktime/skpro#255

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.info() method in class Sequence #49

.info() method in class Sequence #49

VascoSch92 commented Apr 14, 2024 •

edited

Loading

fkiraly commented Apr 14, 2024

fkiraly commented Apr 15, 2024 •

edited

Loading

VascoSch92 commented Apr 15, 2024 •

edited

Loading

fkiraly commented Apr 17, 2024

fkiraly commented Apr 17, 2024 •

edited

Loading

VascoSch92 commented Apr 18, 2024

fkiraly commented Apr 18, 2024 •

edited

Loading

.info() method in class Sequence #49

.info() method in class Sequence #49

Comments

VascoSch92 commented Apr 14, 2024 • edited Loading

fkiraly commented Apr 14, 2024

fkiraly commented Apr 15, 2024 • edited Loading

VascoSch92 commented Apr 15, 2024 • edited Loading

fkiraly commented Apr 17, 2024

fkiraly commented Apr 17, 2024 • edited Loading

VascoSch92 commented Apr 18, 2024

fkiraly commented Apr 18, 2024 • edited Loading

VascoSch92 commented Apr 14, 2024 •

edited

Loading

fkiraly commented Apr 15, 2024 •

edited

Loading

VascoSch92 commented Apr 15, 2024 •

edited

Loading

fkiraly commented Apr 17, 2024 •

edited

Loading

fkiraly commented Apr 18, 2024 •

edited

Loading