Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fixes, send/receive debug logs and Unsigned 24 support #267

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

TzviRonen
Copy link

Changes:

  • Added test if callback exist before removing it from network subscribers.
  • Added to sdo client and sdo server, send and receive debug logs.
  • Added pause time before retrying send_request in sdo client request_response.
  • Fixed the var len calculation in pdo base get_data. This one took me hours to debug..
  • Added support for Unsigned 24 data types.

@@ -29,6 +29,9 @@ class SdoClient(SdoBase):
#: Seconds to wait before sending a request, for rate limiting
PAUSE_BEFORE_SEND = 0.0

# Seconds to wait before next read attempt for response in queue. For delayed responses.
PAUSE_BEFORE_READ = 0.2
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps RETRY_DELAY or something would be more easily understandable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of this mechanism? read_response() will already wait up to RESPONSE_TIMEOUT seconds, so this additional delay only slows down communication in case a reply is already available shortly after the first timeout.

If there is a special case that requires such behavior, we should maybe make it optional by defaulting the constant to zero?

Copy link
Author

@TzviRonen TzviRonen Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acolomb If it fails to respond to the send_request that may be a sign that the device isn't ready and not necessarily a communication error.
Why not use RESPONSE_TIMEOUT? I want to know about it in the log's as soon as it happens and yet, wait more time before the next read_response().
I used it only for debugging. I agree with you it should be optional by defaulting it to zero.
And I agree with @christiansandberg about the naming. I changed the name to RETRY_DELAY.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a very slight notion, but time.sleep(0) will allow switching to a different thread AFAIK, while not calling it at all does not. I suppose that doesn't make much difference, but maybe better check the constant first and skip sleep if it's zero?

@@ -42,6 +45,7 @@ def __init__(self, rx_cobid, tx_cobid, od):
self.responses = queue.Queue()

def on_response(self, can_id, data, timestamp):
logger.debug(f"received response in {can_id} data {data}.")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, it is preferable to use old-style string interpolation for logging because then the final message will only be constructed if needed.

https://docs.python.org/3/howto/logging.html#logging-variable-data

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we should keep a consistent style here, meaning %-style format strings and arguments to the debug() function.

Copy link
Collaborator

@acolomb acolomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there was more explanation / better commit messages for these changes. Some cleanup required IMHO and better to keep things separated for different purposes, so I would recommend against squash-merging this as is.

Comment on lines 82 to 83
if callback in self.subscribers[can_id]:
self.subscribers[can_id].remove(callback)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe the circumstances where raising a ValueError exception here would have caused problems?

If a non-registered callback is passed, that is clearly the caller's fault and I think it's correct to raise an exception in response. The main place where this function is used is in remove_network() on the nodes, which in turn gets called when deleting it from the network object. If it is part of the network, these callbacks have been recorded in the list previously, so no exception will be raised.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I removed this change.

@@ -29,6 +29,9 @@ class SdoClient(SdoBase):
#: Seconds to wait before sending a request, for rate limiting
PAUSE_BEFORE_SEND = 0.0

# Seconds to wait before next read attempt for response in queue. For delayed responses.
PAUSE_BEFORE_READ = 0.2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of this mechanism? read_response() will already wait up to RESPONSE_TIMEOUT seconds, so this additional delay only slows down communication in case a reply is already available shortly after the first timeout.

If there is a special case that requires such behavior, we should maybe make it optional by defaulting the constant to zero?

@@ -42,6 +45,7 @@ def __init__(self, rx_cobid, tx_cobid, od):
self.responses = queue.Queue()

def on_response(self, can_id, data, timestamp):
logger.debug(f"received response in {can_id} data {data}.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we should keep a consistent style here, meaning %-style format strings and arguments to the debug() function.

@@ -88,6 +93,7 @@ def request_response(self, sdo_request):
if not retries_left:
self.abort(0x5040000)
raise
time.sleep(self.PAUSE_BEFORE_READ)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an additional delay is appropriate here (see above), it should wait after logging, not before.

@@ -562,7 +562,7 @@ def get_data(self):
data = data | (~((1 << self.length) - 1))
data = od_struct.pack(data)
else:
data = self.pdo_parent.data[byte_offset:byte_offset + len(self.od) // 8]
data = self.pdo_parent.data[byte_offset:byte_offset + self.length // 8]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the cached value here seems like a good idea, but it does change the semantics. That should at least be explained in the commit message.

Why not apply the same treatment in set_data()? Is there maybe some intentional reason len(self.od) and self.length are both used in these functions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't think of any reason len (self. OD) is used here.
Somehow len (self.od) changed during the program execution and wasn't equal to self.length when get_data was called. It made the returned data to be in the wrong size. Changing it to self.length fixed the problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, len(self.od) returns the number of bits (rounded up to full bytes) that the object consists of according to the node's object dictionary, which is constant. While self.length starts off with the same value, it might be overridden when adding the Variable to a PDO mapping, where you can e.g. specify to map only (the lower) 16 bits of a 32 bit unsigned object. That difference is adjusted in pdo.Map.add_variable().

I agree there are some corner cases in the code that don't deal well with mapping only some bits of an object. But to fix that properly, each use within get_data() and set_data() should be reconsidered individually. Just changing one instance might fix your problem, but cause issues for other users.

By the way, my (hacky) solution so far was to adjust the ObjectDictionary instance to artificially overwrite the data_type member when using partial objects in PDO mappings. That breaks SDO access though, and doesn't cope well with signed numbers (two's complement requires changing the higher bytes as well when crossing zero).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acolomb This one

@@ -219,6 +219,17 @@ def add_member(self, variable):
self.subindices[variable.subindex] = variable
self.names[variable.name] = variable

class Unsigned24(struct.Struct):
def __init__(self, *args, **kwargs):
super(Unsigned24, self).__init__("<I", *args, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we still trying to support Python 2? If not, this can be shortened to just super().

Also, why does this use I (unsigned int) whereas regular 32-bit ints use L (unsigned long)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the super thing.
This isn't an attempt to convert the uint24 to a regular 32-bit. The only reason I'm converting it to an I is because that's the closest data types Struct supports.
Then I'm using Struct.unpack("<I) to make that 32-bit int to python int.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to look it up in the Python library docs, but as I see it, the <L and <I formats are equivalent, both using four bytes for the conversion. So your code seems correct, just wondering why not use the same letter when it doesn't really make a difference?

@@ -232,6 +243,7 @@ class Variable(object):
UNSIGNED8: struct.Struct("B"),
UNSIGNED16: struct.Struct("<H"),
UNSIGNED32: struct.Struct("<L"),
UNSIGNED24: Unsigned24(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ordered by size?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -4,6 +4,7 @@
__pycache__/
*.py[cod]
*$py.class
.idea/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a separate section for IDE stuff down below, if such things must be collected here.

This change seems unrelated though, so let's keep it in a separate commit / PR.

@acolomb
Copy link
Collaborator

acolomb commented Sep 8, 2021

Regarding the additional log messages, I think that's too much verbosity. I sometimes rely on the DEBUG log level setting, but this would really spam the log without much additional gain. Is there some timing issue you're trying to debug? Then maybe having these debug messages locally is better than adding them here for every library user.

Copy link
Author

@TzviRonen TzviRonen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the additional log messages, I think that's too much verbosity. I sometimes rely on the DEBUG log level setting, but this would really spam the log without much additional gain. Is there some timing issue you're trying to debug? Then maybe having these debug messages locally is better than adding them here for every library user.

This log very helped me debug as they are at the very core of the library.
But you probably know better about the library logging level so I'll remove them to not spam the log's.
They indeed lead to a huge log file..

Comment on lines 82 to 83
if callback in self.subscribers[can_id]:
self.subscribers[can_id].remove(callback)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I removed this change.

@@ -219,6 +219,17 @@ def add_member(self, variable):
self.subindices[variable.subindex] = variable
self.names[variable.name] = variable

class Unsigned24(struct.Struct):
def __init__(self, *args, **kwargs):
super(Unsigned24, self).__init__("<I", *args, **kwargs)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the super thing.
This isn't an attempt to convert the uint24 to a regular 32-bit. The only reason I'm converting it to an I is because that's the closest data types Struct supports.
Then I'm using Struct.unpack("<I) to make that 32-bit int to python int.

@@ -232,6 +243,7 @@ class Variable(object):
UNSIGNED8: struct.Struct("B"),
UNSIGNED16: struct.Struct("<H"),
UNSIGNED32: struct.Struct("<L"),
UNSIGNED24: Unsigned24(),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -562,7 +562,7 @@ def get_data(self):
data = data | (~((1 << self.length) - 1))
data = od_struct.pack(data)
else:
data = self.pdo_parent.data[byte_offset:byte_offset + len(self.od) // 8]
data = self.pdo_parent.data[byte_offset:byte_offset + self.length // 8]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't think of any reason len (self. OD) is used here.
Somehow len (self.od) changed during the program execution and wasn't equal to self.length when get_data was called. It made the returned data to be in the wrong size. Changing it to self.length fixed the problem.

t
removed from on_response and send_request the debug log's to not spam the logs
Copy link
Collaborator

@acolomb acolomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please still consider breaking up the changes into several commits / PRs with only related changes? I think pinpointing any possible issues will be easier when each commit holds only changes related to the same topic.

data += b'\x00'
else:
logger.error(f"Unsigned24.unpack received wrong type - {type(data)}")
return super(Unsigned24, self).unpack(data, *args, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another super() thing.

@@ -219,6 +219,17 @@ def add_member(self, variable):
self.subindices[variable.subindex] = variable
self.names[variable.name] = variable

class Unsigned24(struct.Struct):
def __init__(self, *args, **kwargs):
super().__init__("<I", *args, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As previously noted, use "<L" for consistency with the UNSIGNED32 type.

@@ -89,6 +92,7 @@ def request_response(self, sdo_request):
self.abort(0x5040000)
raise
logger.warning(str(e))
time.sleep(self.RETRY_DELAY)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
time.sleep(self.RETRY_DELAY)
if self.RETRY_DELAY:
time.sleep(self.RETRY_DELAY)

@@ -172,6 +172,7 @@ def segmented_download(self, command, request):
self.send_response(response)

def send_response(self, response):
logger.debug(f"Sending to {self.tx_cobid} data {response}.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another overly verbose log message.

while len(data) < 4:
data += b'\x00'
else:
logger.error(f"Unsigned24.unpack received wrong type - {type(data)}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use %-style format string.

@raffi-g raffi-g mentioned this pull request Nov 28, 2023
@sveinse
Copy link
Collaborator

sveinse commented May 19, 2024

Isn't this PR outdated? PR #429, PR #435 implements delay on SDO send and retries. PR #406 implements 24-bit support. PR #440 will implement the remaining datatypes. Perhaps the PR can be closed?

@acolomb
Copy link
Collaborator

acolomb commented May 23, 2024

I've skimmed over the changes and converstion again. Not quite sure whether all problems addressed here have been dealt with. One of the main issues was that the PR contains many unrelated changes all-in-one. That now makes it harder to dissect what has already been fixed elsewhere, and where issues remain.

@sveinse if you're motivated to check closely, extracting the relevant problem descriptions into separate issues would help. Pointing back at this PR for tentative fixes. I might get to it sometime, but don't hold your breath...

Copy link
Collaborator

@sveinse sveinse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acolomb I reviewed the patch and I think everything is covered already with one exception which I have commented.

@@ -562,7 +562,7 @@ def get_data(self):
data = data | (~((1 << self.length) - 1))
data = od_struct.pack(data)
else:
data = self.pdo_parent.data[byte_offset:byte_offset + len(self.od) // 8]
data = self.pdo_parent.data[byte_offset:byte_offset + self.length // 8]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acolomb This one

@acolomb
Copy link
Collaborator

acolomb commented May 26, 2024

@sveinse That remaining change should definitely go in. I even found it in one of my older branches, just never got around to making a PR.

@sveinse
Copy link
Collaborator

sveinse commented May 26, 2024

@acolomb Should I embed it into PR #440 ?

@acolomb
Copy link
Collaborator

acolomb commented May 26, 2024

No, that one is unrelated. I'm currently checking what other fixes I had started to work on about 1.5 years ago and will make a separate PR.

@acolomb
Copy link
Collaborator

acolomb commented Jun 4, 2024

The rework of PDO variable decoding (especially for mapped lengths differing from the OD size) is done in #454. Merging that will make the remaining bit of this PR obsolete I suppose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants