Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API cleanup and bugfixes to get minimal functionality #13

Draft
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

puddly
Copy link
Contributor

@puddly puddly commented Apr 10, 2024

This is a fairly large PR but until the radio library is actually functional I don't see a need to split it up.

I've made a few changes:

  1. Adjusted the command serialization/parsing to work with the device's actual serial protocol instead of modifying the deCONZ protocol.
  2. Renamed commands to more closely match what is in the SDK.
  3. Renamed all getter API functions to use a get_ prefix and setters to use set_.

At the moment, a few things do not work:

  1. Radio startup does not work properly without adding fake endpoint objects because endpoint registration doesn't appear to be functional. The endpoints are registered but the coordinator does not respond to the loopback Active_EP_req with anything other than [242, 242] (also a bug?).
  2. Restoring a backup crashes if set_nwk_extended_panid is called. If this call is removed, backup restoration "works" but no new settings are actually written.
  3. permit similarly doesn't work and crashes with the same error as above.
  4. There are no firmware commands to report the current firmware version, board name/model, or manufacturer name.
  5. There is no way to reset the firmware into a known state. I'm intentionally crashing it to do so, but this isn't a very good way.

I believe that many of these issues stem from the network startup/shutdown/settings writing sequence being unclear. I wasn't able to find anything in the documentation about forming a new network with specific settings, especially with the radio already running.

For the library to be functional with ZHA, the following need to work and they are what I am currently trying to fix:

# Leave the current network and erase all settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB reset

# Form a temporary network, perform an energy scan to find a good channel, then use the settings provided by zigpy
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB form

# Dump the current network settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB backup -z > backup.json

# Form a new network with random settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB form

# Restore the old network settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB restore backup.json

# Dump the settings again
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB backup -z > backup2.json

# They should be identical except for the frame counter increasing by 5000 and the backup timestamp changing
diff backup.json backup2.json

# Energy scanning should work
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB energy-scan

# And finally, new devices can join the network and be interviewed by zigpy
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB permit

Could you explain a little more the process to read/write network settings like I have above? The write_network_info function should be pretty simple at this point so let me know what should be changed to get it working. I've added # TODO comments throughout the library in places where I ran into trouble.

Let me know if you have any questions!

@puddly
Copy link
Contributor Author

puddly commented Apr 10, 2024

CC @Maxwelltoo, as you mentioned having interest in helping out.

@Maxwelltoo
Copy link

Understood, I’ll take a look at these issues.
I'm completely new to Zigbee, and I'm more comfortable with C than Python. So, it will take me some time to go through the documentation and source code. Once I make some progress, I'll be sure to let you know.

@lhespress
Copy link
Owner

@puddly Please see the explain as follows:

# TODO: implement a proper software-driven watchdog

A: OK, Let's do it

# TODO: Most commands fail if the network is not formed. Why?

A: The form_network command will initialization Zigbee stack and select the device role, so some commands must run after it.

# TODO: add our registered endpoints manually so things don't crash. These should be discovered automatically.

A: OK, Let's do it

# TODO: Why does setting the extended PAN ID trigger a crash?

A: From the LOG, it seems doesn't include the extended PAN ID in the send Frame.

zigpy_espzb.api DEBUG Sending CommandId.extpanid_set{'panid': d6:75:3d:9a:a8:8a:8c:01} (seq=36)
zigpy_espzb.uart DEBUG Send: 00000e00240000
zigpy_espzb.uart DEBUG Frame received: 0x1000ffffc5010002


# TODO: Network settings do not persist. How do you write them?

A: autostart - true  autostart mod: It initializes, load some parameters from NVRAM and proceed with startup.
             - false no-autostart mode: It initializes scheduler and buffers pool, but not MAC and upper layers.

# TODO: implement firmware commands to read the board name, manufacturer

A: OK, Let's do it

# TODO: implement firmware command to read out the firmware version and build ID

A: OK, Let's do it

# TODO: this does not work, the NCP responds again with:

A: OK, command ID 0x0005 means allow other nodes to join the network with this node as their parent function, 
   it doesn't support on the current NCP firmware, Let's do it.

# TODO: Is this correct? It seems to be used only for loopback

A: Why not use t.NWK(address_short), for APS indication it only support short address mode on NCP firmware.

# TODO: implement a firmware version command

A: OK, Let's do it

# TODO: There is no reset command but we can trigger a crash if we form the network twice

A: OK, Let's do it.

@lhespress
Copy link
Owner

@puddly

# TODO: why does the coordinator respond to the loopback ZDO Active_EP_req with [242, 242]? It should include endpoints 1 and 2, we registered them.

A: I sniff the packet, It seems that 242 endpoint which be green power proxy must include from ZigBee PRO 3.0 as Coordinator/Router device. Could you also check again?
Request
Response

@puddly
Copy link
Contributor Author

puddly commented Apr 16, 2024

From the LOG, it seems doesn't include the extended PAN ID in the send Frame.

Fixed, thanks.

# TODO: Network settings do not persist. How do you write them?

I've tried adding await self._api.start(autostart=True) in various places at the end of form_network but I still cannot get the new network settings to take effect.

We expect that write_network_info configures all of the network settings of a device. This is how the code currently looks:

  1. Reset (via crash, sorry!) to get into a known state
  2. network_init()
  3. form_network(role=DeviceType.COORDINATOR) (without this, I can't send other commands)
  4. start(autostart=False) (without this, I can't send other commands)
  5. Write all of the new settings.
  6. start(autostart=True)
    • I immediately get Received indication CommandId.form_network{'extended_panid': 74:4d:bd:ff:fe:60:3c:ae, 'panid': 0x6EAA, 'channel': 22}, which is not correct. It seems at this point the settings are not written.

After a reset, the old settings are still there. I think there is some issue with the order in which these commands are called. I've tried removing some and changing autostart but it doesn't seem to matter, the settings don't persist.

I sniff the packet, It seems that 242 endpoint which be green power proxy must include from ZigBee PRO 3.0 as Coordinator/Router device. Could you also check again?

We expect that the add_endpoint command registers an endpoint on the device that can then be queried by sending a loopback (0x0000 -> 0x0000) ZDO Active_EP_req. These are the default endpoints](https://github.com/zigpy/zigpy/blob/18a7add71a546ba009e811c153a61c60d7f0987a/zigpy/application.py#L714-L750) zigpy registers so my expectation is that the loopback Active_EP_req would respond with [1, 2] or [1, 2, 242], not [242, 242]. Having the same endpoint twice seems like a bug.

Alternatively, is there a specific time when add_endpoint can be called? Before start(autostart=True), network_init(), or form_network(role=DeviceType.COORDINATOR)?

@puddly
Copy link
Contributor Author

puddly commented Apr 16, 2024

I've cleaned up the command serialization and deserialization. All commands are now in a commands.py module and should have clear structures. This doesn't change the way the code works but it should be much simpler to understand.

@lhespress
Copy link
Owner

lhespress commented Apr 17, 2024

@puddly

  1. Reset (via crash, sorry!) to get into a known state
  2. network_init()
  3. form_network(role=DeviceType.COORDINATOR) (without this, I can't send other commands)
  4. start(autostart=False) (without this, I can't send other commands)
  5. Write all of the new settings.
  6. start(autostart=True)

Start command mean start ZigBee stack, once call, it'll start top level commissioning and indication information when formed network successfully, like this:

Received indication CommandId.form_network{'extended_panid': 74:4d:bd:ff:fe:60:3c:ae, 'panid': 0x6EAA, 'channel': 22},

It'll only return command response but don't do anything if repeat call start command when has called start command. the follow steps is suggestion:

  1. Resetto get into a known state
  2. network_init()
  3. form_network(role=DeviceType.COORDINATOR)
  4. Write all of the new settings.
  5. start(autostart=True) or start(autostart=False)

True: It initializes, load some parameters from NVRAM and proceed with startup.
False: It initializes scheduler and buffers pool, but not MAC and upper layers.

Alternatively, is there a specific time when add_endpoint can be called? Before start(autostart=True), network_init(), or form_network(role=DeviceType.COORDINATOR)?

The add_endpoint must be be called after form_network(role=DeviceType.COORDINATOR), there isn't a specific time for start(autostart=True).

@puddly
Copy link
Contributor Author

puddly commented Apr 17, 2024

Thank you. I've tried the modified startup sequence but there doesn't appear to be a change, settings still aren't persisting.

# 1. Reset to get into a known state
2024-04-16 23:47:23.247 macbook.local zigpy_espzb.api DEBUG Reset complete

# 2. network_init()
2024-04-16 23:47:23.247 macbook.local zigpy_espzb.api DEBUG Sending NetworkInitReq() (seq=27)
2024-04-16 23:47:23.247 macbook.local zigpy_espzb.uart DEBUG Send: 000000001b0000
2024-04-16 23:47:23.260 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100000001b010000
2024-04-16 23:47:23.261 macbook.local zigpy_espzb.api DEBUG Received response NetworkInitRsp(status=<Status.SUCCESS: 0>) (seq 27)

# 3. form_network(role=DeviceType.COORDINATOR)
2024-04-16 23:47:23.261 macbook.local zigpy_espzb.api DEBUG Sending FormNetworkReq(role=<DeviceType.COORDINATOR: 0>, install_code_policy=<Bool.false: 0>, max_children=20) (seq=28)
2024-04-16 23:47:23.261 macbook.local zigpy_espzb.uart DEBUG Send: 000004001c0300000014
2024-04-16 23:47:23.292 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100004001c010000
2024-04-16 23:47:23.293 macbook.local zigpy_espzb.api DEBUG Received response FormNetworkRsp(status=<Status.SUCCESS: 0>) (seq 28)

# 4. Write all of the new settings.
2024-04-16 23:47:25.295 macbook.local zigpy_espzb.api DEBUG Sending NetworkRoleSetReq(role=<DeviceType.COORDINATOR: 0>) (seq=29)
2024-04-16 23:47:25.295 macbook.local zigpy_espzb.uart DEBUG Send: 00001c001d010000
2024-04-16 23:47:25.308 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001c001d010000
2024-04-16 23:47:25.309 macbook.local zigpy_espzb.api DEBUG Received response NetworkRoleSetRsp(status=<Status.SUCCESS: 0>) (seq 29)
2024-04-16 23:47:25.309 macbook.local zigpy_espzb.api DEBUG Sending ShortAddrSetReq(short_addr=0x0000) (seq=30)
2024-04-16 23:47:25.310 macbook.local zigpy_espzb.uart DEBUG Send: 00001e001e02000000
2024-04-16 23:47:25.324 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001e001e010000
2024-04-16 23:47:25.325 macbook.local zigpy_espzb.api DEBUG Received response ShortAddrSetRsp(status=<Status.SUCCESS: 0>) (seq 30)
2024-04-16 23:47:25.325 macbook.local zigpy_espzb.api DEBUG Sending LongAddrGetReq() (seq=31)
2024-04-16 23:47:25.325 macbook.local zigpy_espzb.uart DEBUG Send: 00001f001f0000
2024-04-16 23:47:25.340 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001f001f08000000000000000000
2024-04-16 23:47:25.341 macbook.local zigpy_espzb.api DEBUG Received response LongAddrGetRsp(ieee=00:00:00:00:00:00:00:00) (seq 31)
2024-04-16 23:47:25.341 macbook.local zigpy_espzb.api DEBUG Sending UsePredefinedNwkPanidSetReq(predefined=<Bool.true: 1>) (seq=32)
2024-04-16 23:47:25.341 macbook.local zigpy_espzb.uart DEBUG Send: 00002b0020010001
2024-04-16 23:47:25.356 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10002b0020010000
2024-04-16 23:47:25.357 macbook.local zigpy_espzb.api DEBUG Received response UsePredefinedNwkPanidSetRsp(status=<Status.SUCCESS: 0>) (seq 32)
2024-04-16 23:47:25.357 macbook.local zigpy_espzb.api DEBUG Sending PanidSetReq(panid=0xB1B5) (seq=33)
2024-04-16 23:47:25.357 macbook.local zigpy_espzb.uart DEBUG Send: 00000c00210200b5b1
2024-04-16 23:47:25.372 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10000c0021010000
2024-04-16 23:47:25.372 macbook.local zigpy_espzb.api DEBUG Received response PanidSetRsp(status=<Status.SUCCESS: 0>) (seq 33)
2024-04-16 23:47:25.373 macbook.local zigpy_espzb.api DEBUG Sending ExtpanidSetReq(ieee=2e:80:5a:e0:9c:83:8c:b3) (seq=34)
2024-04-16 23:47:25.373 macbook.local zigpy_espzb.uart DEBUG Send: 00000e00220800b38c839ce05a802e
2024-04-16 23:47:25.388 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10000e0022010000
2024-04-16 23:47:25.389 macbook.local zigpy_espzb.api DEBUG Received response ExtpanidSetRsp(status=<Status.SUCCESS: 0>) (seq 34)
2024-04-16 23:47:25.389 macbook.local zigpy_espzb.api DEBUG Sending NwkUpdateIdSetReq(nwk_update_id=0) (seq=35)
2024-04-16 23:47:25.389 macbook.local zigpy_espzb.uart DEBUG Send: 0000240023010000
2024-04-16 23:47:25.404 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000240023010000
2024-04-16 23:47:25.404 macbook.local zigpy_espzb.api DEBUG Received response NwkUpdateIdSetRsp(status=<Status.SUCCESS: 0>) (seq 35)
2024-04-16 23:47:25.405 macbook.local zigpy_espzb.api DEBUG Sending NetworkKeySetReq(nwk_key=89:e9:fb:14:90:67:a6:ab:15:30:59:e7:1b:fa:66:f0) (seq=36)
2024-04-16 23:47:25.405 macbook.local zigpy_espzb.uart DEBUG Send: 0000180024100089e9fb149067a6ab153059e71bfa66f0
2024-04-16 23:47:25.420 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000180024010000
2024-04-16 23:47:25.420 macbook.local zigpy_espzb.api DEBUG Received response NetworkKeySetRsp(status=<Status.SUCCESS: 0>) (seq 36)
2024-04-16 23:47:25.421 macbook.local zigpy_espzb.api DEBUG Sending NwkFrameCounterSetReq(nwk_frame_counter=0) (seq=37)
2024-04-16 23:47:25.421 macbook.local zigpy_espzb.uart DEBUG Send: 00001a0025040000000000
2024-04-16 23:47:25.436 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001a0025010000
2024-04-16 23:47:25.437 macbook.local zigpy_espzb.api DEBUG Received response NwkFrameCounterSetRsp(status=<Status.SUCCESS: 0>) (seq 37)
2024-04-16 23:47:25.437 macbook.local zigpy_espzb.api DEBUG Sending TrustCenterAddressSetReq(trust_center_addr=00:00:00:00:00:00:00:00) (seq=38)
2024-04-16 23:47:25.437 macbook.local zigpy_espzb.uart DEBUG Send: 000026002608000000000000000000
2024-04-16 23:47:25.452 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000260026010000
2024-04-16 23:47:25.452 macbook.local zigpy_espzb.api DEBUG Received response TrustCenterAddressSetRsp(status=<Status.SUCCESS: 0>) (seq 38)
2024-04-16 23:47:25.453 macbook.local zigpy_espzb.api DEBUG Sending LinkKeySetReq(key=5a:69:67:42:65:65:41:6c:6c:69:61:6e:63:65:30:39) (seq=39)
2024-04-16 23:47:25.453 macbook.local zigpy_espzb.uart DEBUG Send: 000028002710005a6967426565416c6c69616e63653039
2024-04-16 23:47:25.468 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000280027010000
2024-04-16 23:47:25.468 macbook.local zigpy_espzb.api DEBUG Received response LinkKeySetRsp(status=<Status.SUCCESS: 0>) (seq 39)
2024-04-16 23:47:25.469 macbook.local zigpy_espzb.api DEBUG Sending SecurityModeSetReq(security_mode=<SecurityMode.PRECONFIGURED_NETWORK_KEY: 1>) (seq=40)
2024-04-16 23:47:25.469 macbook.local zigpy_espzb.uart DEBUG Send: 00002a0028010001
2024-04-16 23:47:25.484 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10002a0028010000
2024-04-16 23:47:25.484 macbook.local zigpy_espzb.api DEBUG Received response SecurityModeSetRsp(status=<Status.SUCCESS: 0>) (seq 40)
2024-04-16 23:47:25.485 macbook.local zigpy_espzb.api DEBUG Sending CurrentChannelSetReq(channel=15) (seq=41)
2024-04-16 23:47:25.485 macbook.local zigpy_espzb.uart DEBUG Send: 000014002901000f
2024-04-16 23:47:25.500 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000140029010000
2024-04-16 23:47:25.500 macbook.local zigpy_espzb.api DEBUG Received response CurrentChannelSetRsp(status=<Status.SUCCESS: 0>) (seq 41)

# 5. start(autostart=True) or start(autostart=False)
2024-04-16 23:47:25.501 macbook.local zigpy_espzb.api DEBUG Sending StartReq(autostart=<Bool.true: 1>) (seq=42)
2024-04-16 23:47:25.501 macbook.local zigpy_espzb.uart DEBUG Send: 000001002a010001
2024-04-16 23:47:25.533 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100001002a010000
2024-04-16 23:47:25.533 macbook.local zigpy_espzb.api DEBUG Received response StartRsp(status=<Status.SUCCESS: 0>) (seq 42)
2024-04-16 23:47:25.533 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x20000500e5010000
2024-04-16 23:47:25.533 macbook.local zigpy_espzb.api DEBUG Received indication PermitJoiningInd(duration=0) (seq 229)

# ?? `FormNetworkInd` is received but with the wrong PAN ID, EPID, and channel
2024-04-16 23:47:25.548 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x20000400b80b00ae3c60feffbd4d74044d19
2024-04-16 23:47:25.548 macbook.local zigpy_espzb.api DEBUG Received indication FormNetworkInd(extended_panid=74:4d:bd:ff:fe:60:3c:ae, panid=0x4D04, channel=25) (seq 184)
2024-04-16 23:47:26.141 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x20000500220100b4
2024-04-16 23:47:26.143 macbook.local zigpy_espzb.api DEBUG Received indication PermitJoiningInd(duration=180) (seq 34)

# Resetting again
2024-04-16 23:47:27.536 macbook.local zigpy_espzb.api DEBUG Resetting via crash...
2024-04-16 23:47:27.537 macbook.local zigpy_espzb.api DEBUG Sending FormNetworkReq(role=<DeviceType.COORDINATOR: 0>, install_code_policy=<Bool.false: 0>, max_children=20) (seq=43)
2024-04-16 23:47:27.538 macbook.local zigpy_espzb.uart DEBUG Send: 000004002b0300000014
2024-04-16 23:47:29.340 macbook.local zigpy_espzb.api DEBUG No response to 'FormNetworkReq(role=<DeviceType.COORDINATOR: 0>, install_code_policy=<Bool.false: 0>, max_children=20)' command with seq 43
2024-04-16 23:47:31.344 macbook.local zigpy_espzb.api DEBUG Reset complete

# Performing a backup to read current settings
2024-04-16 23:47:31.345 macbook.local zigpy_espzb.api DEBUG Sending NetworkInitReq() (seq=44)
2024-04-16 23:47:31.346 macbook.local zigpy_espzb.uart DEBUG Send: 000000002c0000
2024-04-16 23:47:31.356 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100000002c010000
2024-04-16 23:47:31.357 macbook.local zigpy_espzb.api DEBUG Received response NetworkInitRsp(status=<Status.SUCCESS: 0>) (seq 44)
2024-04-16 23:47:31.357 macbook.local zigpy_espzb.api DEBUG Sending FormNetworkReq(role=<DeviceType.COORDINATOR: 0>, install_code_policy=<Bool.false: 0>, max_children=20) (seq=45)
2024-04-16 23:47:31.357 macbook.local zigpy_espzb.uart DEBUG Send: 000004002d0300000014
2024-04-16 23:47:31.388 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100004002d010000
2024-04-16 23:47:31.389 macbook.local zigpy_espzb.api DEBUG Received response FormNetworkRsp(status=<Status.SUCCESS: 0>) (seq 45)
2024-04-16 23:47:33.393 macbook.local zigpy_espzb.api DEBUG Sending StartReq(autostart=<Bool.true: 1>) (seq=46)
2024-04-16 23:47:33.394 macbook.local zigpy_espzb.uart DEBUG Send: 000001002e010001
2024-04-16 23:47:33.420 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100001002e010000
2024-04-16 23:47:33.421 macbook.local zigpy_espzb.api DEBUG Received response StartRsp(status=<Status.SUCCESS: 0>) (seq 46)
2024-04-16 23:47:33.436 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x200005007f010000
2024-04-16 23:47:33.437 macbook.local zigpy_espzb.api DEBUG Received indication PermitJoiningInd(duration=0) (seq 127)
2024-04-16 23:47:33.452 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x20000400a80b00ae3c60feffbd4d74044d19
2024-04-16 23:47:33.453 macbook.local zigpy_espzb.api DEBUG Received indication FormNetworkInd(extended_panid=74:4d:bd:ff:fe:60:3c:ae, panid=0x4D04, channel=25) (seq 168)
2024-04-16 23:47:34.046 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x200005008a0100b4
2024-04-16 23:47:34.047 macbook.local zigpy_espzb.api DEBUG Received indication PermitJoiningInd(duration=180) (seq 138)
2024-04-16 23:47:35.425 macbook.local zigpy_espzb.api DEBUG Sending NetworkRoleGetReq() (seq=47)
2024-04-16 23:47:35.426 macbook.local zigpy_espzb.uart DEBUG Send: 00001b002f0000
2024-04-16 23:47:35.437 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001b002f010000
2024-04-16 23:47:35.437 macbook.local zigpy_espzb.api DEBUG Received response NetworkRoleGetRsp(role=<DeviceType.COORDINATOR: 0>) (seq 47)
2024-04-16 23:47:35.437 macbook.local zigpy_espzb.api DEBUG Sending ShortAddrGetReq() (seq=48)
2024-04-16 23:47:35.438 macbook.local zigpy_espzb.uart DEBUG Send: 00001d00300000
2024-04-16 23:47:35.452 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001d003002000000
2024-04-16 23:47:35.453 macbook.local zigpy_espzb.api DEBUG Received response ShortAddrGetRsp(short_addr=0x0000) (seq 48)
2024-04-16 23:47:35.454 macbook.local zigpy_espzb.api DEBUG Sending LongAddrGetReq() (seq=49)
2024-04-16 23:47:35.454 macbook.local zigpy_espzb.uart DEBUG Send: 00001f00310000
2024-04-16 23:47:35.468 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001f00310800ae3c60feffbd4d74
2024-04-16 23:47:35.469 macbook.local zigpy_espzb.api DEBUG Received response LongAddrGetRsp(ieee=74:4d:bd:ff:fe:60:3c:ae) (seq 49)
2024-04-16 23:47:35.470 macbook.local zigpy_espzb.api DEBUG Sending PanidGetReq() (seq=50)
2024-04-16 23:47:35.470 macbook.local zigpy_espzb.uart DEBUG Send: 00000b00320000
2024-04-16 23:47:35.484 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10000b00320200044d
2024-04-16 23:47:35.485 macbook.local zigpy_espzb.api DEBUG Received response PanidGetRsp(panid=0x4D04) (seq 50)
2024-04-16 23:47:35.485 macbook.local zigpy_espzb.api DEBUG Sending ExtpanidGetReq() (seq=51)
2024-04-16 23:47:35.485 macbook.local zigpy_espzb.uart DEBUG Send: 00000d00330000
2024-04-16 23:47:35.500 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10000d00330800ae3c60feffbd4d74
2024-04-16 23:47:35.500 macbook.local zigpy_espzb.api DEBUG Received response ExtpanidGetRsp(ieee=74:4d:bd:ff:fe:60:3c:ae) (seq 51)
2024-04-16 23:47:35.501 macbook.local zigpy_espzb.api DEBUG Sending CurrentChannelGetReq() (seq=52)
2024-04-16 23:47:35.501 macbook.local zigpy_espzb.uart DEBUG Send: 00001300340000
2024-04-16 23:47:35.516 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000130034010019
2024-04-16 23:47:35.517 macbook.local zigpy_espzb.api DEBUG Received response CurrentChannelGetRsp(channel=25) (seq 52)
2024-04-16 23:47:35.517 macbook.local zigpy_espzb.api DEBUG Sending PrimaryChannelMaskGetReq() (seq=53)
2024-04-16 23:47:35.517 macbook.local zigpy_espzb.uart DEBUG Send: 00000f00350000
2024-04-16 23:47:35.532 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10000f0035040000000000
2024-04-16 23:47:35.533 macbook.local zigpy_espzb.api DEBUG Received response PrimaryChannelMaskGetRsp(channel_mask=<ShiftedChannels.NO_CHANNELS: 0>) (seq 53)
2024-04-16 23:47:35.533 macbook.local zigpy_espzb.api DEBUG Sending NwkUpdateIdGetReq() (seq=54)
2024-04-16 23:47:35.533 macbook.local zigpy_espzb.uart DEBUG Send: 00002300360000
2024-04-16 23:47:35.548 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000230036010001
2024-04-16 23:47:35.548 macbook.local zigpy_espzb.api DEBUG Received response NwkUpdateIdGetRsp(nwk_update_id=1) (seq 54)
2024-04-16 23:47:35.549 macbook.local zigpy_espzb.api DEBUG Sending NetworkKeyGetReq() (seq=55)
2024-04-16 23:47:35.549 macbook.local zigpy_espzb.uart DEBUG Send: 00001700370000
2024-04-16 23:47:35.565 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10001700371000332555ec57be4b36ec7d087c2dfa1316
2024-04-16 23:47:35.565 macbook.local zigpy_espzb.api DEBUG Received response NetworkKeyGetRsp(nwk_key=33:25:55:ec:57:be:4b:36:ec:7d:08:7c:2d:fa:13:16) (seq 55)
2024-04-16 23:47:35.565 macbook.local zigpy_espzb.api DEBUG Sending NwkFrameCounterGetReq() (seq=56)
2024-04-16 23:47:35.565 macbook.local zigpy_espzb.uart DEBUG Send: 00001900380000
2024-04-16 23:47:35.580 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x1000190038040088130000
2024-04-16 23:47:35.580 macbook.local zigpy_espzb.api DEBUG Received response NwkFrameCounterGetRsp(nwk_frame_counter=5000) (seq 56)
2024-04-16 23:47:35.581 macbook.local zigpy_espzb.api DEBUG Sending LinkKeyGetReq() (seq=57)
2024-04-16 23:47:35.581 macbook.local zigpy_espzb.uart DEBUG Send: 00002700390000
2024-04-16 23:47:35.596 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x10002700391800ae3c60feffbd4d745a6967426565416c6c69616e63653039
2024-04-16 23:47:35.596 macbook.local zigpy_espzb.api DEBUG Received response LinkKeyGetRsp(ieee=74:4d:bd:ff:fe:60:3c:ae, key=5a:69:67:42:65:65:41:6c:6c:69:61:6e:63:65:30:39) (seq 57)
2024-04-16 23:47:35.597 macbook.local zigpy_espzb.api DEBUG Sending TrustCenterAddressGetReq() (seq=58)
2024-04-16 23:47:35.597 macbook.local zigpy_espzb.uart DEBUG Send: 000025003a0000
2024-04-16 23:47:35.612 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100025003a0800ab9809ffff2e2100
2024-04-16 23:47:35.612 macbook.local zigpy_espzb.api DEBUG Received response TrustCenterAddressGetRsp(trust_center_addr=00:21:2e:ff:ff:09:98:ab) (seq 58)
2024-04-16 23:47:35.613 macbook.local zigpy_espzb.api DEBUG Sending SecurityModeGetReq() (seq=59)
2024-04-16 23:47:35.613 macbook.local zigpy_espzb.uart DEBUG Send: 000029003b0000
2024-04-16 23:47:35.628 macbook.local zigpy_espzb.uart DEBUG Frame received: 0x100029003b010000
2024-04-16 23:47:35.628 macbook.local zigpy_espzb.api DEBUG Received response SecurityModeGetRsp(security_mode=<SecurityMode.NO_SECURITY: 0>) (seq 59)

# Same problem, wrong settings are read
2024-04-16 23:47:35.629 macbook.local zigpy.backups DEBUG Adding a new backup NetworkBackup(version=1, backup_time=datetime.datetime(2024, 4, 17, 3, 47, 35, 629149, tzinfo=datetime.timezone.utc), network_info=NetworkInfo(extended_pan_id=74:4d:bd:ff:fe:60:3c:ae, pan_id=0x4D04, nwk_update_id=1, nwk_manager_id=0x0000, channel=25, channel_mask=<Channels.NO_CHANNELS: 0>, security_level=0, network_key=Key(key=33:25:55:ec:57:be:4b:36:ec:7d:08:7c:2d:fa:13:16, tx_counter=5000, rx_counter=0, seq=0, partner_ieee=ff:ff:ff:ff:ff:ff:ff:ff), tc_link_key=Key(key=5a:69:67:42:65:65:41:6c:6c:69:61:6e:63:65:30:39, tx_counter=0, rx_counter=0, seq=0, partner_ieee=00:21:2e:ff:ff:09:98:ab), key_table=[], children=[], nwk_addresses={}, stack_specific={}, metadata={}, source='zigpy-espzb@0.0.1'), node_info=NodeInfo(nwk=0x0000, ieee=74:4d:bd:ff:fe:60:3c:ae, logical_type=<LogicalType.Coordinator: 0>, model='ESP32H2', manufacturer='Espressif Systems', version='0x00000000'))
2024-04-16 23:47:35.629 macbook.local zigpy_espzb.api DEBUG Serial '/dev/cu.usbserial-A94TYUSB' connection lost unexpectedly: None

@lhespress
Copy link
Owner

@puddly

Write all of the new settings.

2024-04-16 23:47:25.309 macbook.local zigpy_espzb.api DEBUG Sending ShortAddrSetReq(short_addr=0x0000) (seq=30)

only return command response because of do not allow set short address, will add it.

2024-04-16 23:47:25.357 macbook.local zigpy_espzb.api DEBUG Sending PanidSetReq(panid=0xB1B5) (seq=33)

only return command response beacuse of do not allow set PAN ID, will add it.

2024-04-16 23:47:25.373 macbook.local zigpy_espzb.api DEBUG Sending ExtpanidSetReq(ieee=2e:80:5a:e0:9c:83:8c:b3) (seq=34)

only return command response beacuse of do not allow set extended PAN ID, could use long_addr_set command?

2024-04-16 23:47:25.389 macbook.local zigpy_espzb.api DEBUG Sending NwkUpdateIdSetReq(nwk_update_id=0) (seq=35)
2024-04-16 23:47:25.405 macbook.local zigpy_espzb.api DEBUG Sending NetworkKeySetReq(nwk_key=89:e9:fb:14:90:67:a6:ab:15:30:59:e7:1b:fa:66:f0) (seq=36)
2024-04-16 23:47:25.421 macbook.local zigpy_espzb.api DEBUG Sending NwkFrameCounterSetReq(nwk_frame_counter=0) (seq=37)
2024-04-16 23:47:25.437 macbook.local zigpy_espzb.api DEBUG Sending TrustCenterAddressSetReq(trust_center_addr=00:00:00:00:00:00:00:00) (seq=38)
2024-04-16 23:47:25.469 macbook.local zigpy_espzb.api DEBUG Sending SecurityModeSetReq(security_mode=<SecurityMode.PRECONFIGURED_NETWORK_KEY: 1>) (seq=40)

only return command response, will add it.

2024-04-16 23:47:25.485 macbook.local zigpy_espzb.api DEBUG Sending CurrentChannelSetReq(channel=15) (seq=41)

it means channel mask, please use primary_channel_mask_set command set it as 1 < 15 if you want to work on 15 channel

@lhespress
Copy link
Owner

@puddly

Performing a backup to read current settings

2024-04-16 23:47:35.437 macbook.local zigpy_espzb.api DEBUG Sending ShortAddrGetReq() (seq=48)

return the generate address

2024-04-16 23:47:35.454 macbook.local zigpy_espzb.api DEBUG Sending LongAddrGetReq() (seq=49)

return the long address by long_addr_set command

2024-04-16 23:47:35.470 macbook.local zigpy_espzb.api DEBUG Sending PanidGetReq() (seq=50)

return the generate PAN ID

2024-04-16 23:47:35.485 macbook.local zigpy_espzb.api DEBUG Sending ExtpanidGetReq() (seq=51)

return the generate EXTENDED PAN ID

2024-04-16 23:47:35.501 macbook.local zigpy_espzb.api DEBUG Sending CurrentChannelGetReq() (seq=52)

return channel which match primary_channel_mask_get

2024-04-16 23:47:35.517 macbook.local zigpy_espzb.api DEBUG Sending PrimaryChannelMaskGetReq() (seq=53)

return default channel mask when reset device

2024-04-16 23:47:35.533 macbook.local zigpy_espzb.api DEBUG Sending NwkUpdateIdGetReq() (seq=54)

return default update ID when reset device

2024-04-16 23:47:35.549 macbook.local zigpy_espzb.api DEBUG Sending NetworkKeyGetReq() (seq=55)

return the generate KEY

2024-04-16 23:47:35.565 macbook.local zigpy_espzb.api DEBUG Sending NwkFrameCounterGetReq() (seq=56)

return default count

2024-04-16 23:47:35.597 macbook.local zigpy_espzb.api DEBUG Sending TrustCenterAddressGetReq() (seq=58)

return default trust center

2024-04-16 23:47:35.613 macbook.local zigpy_espzb.api DEBUG Sending SecurityModeGetReq() (seq=59)

return default secure mode

@lhespress
Copy link
Owner

@puddly Could we don't write these network information but can read it continue because of there are not handle action for some command?

BTW, i checked the function by end device, it can join the network.

@lhespress
Copy link
Owner

lhespress commented Apr 19, 2024

@puddly

There is no reset command but we can trigger a crash if we form the network twice

What means about the reset command? restart or factory the device?

@lhespress
Copy link
Owner

lhespress commented May 14, 2024

@puddly

Please download the firmware as follows which support system frame ID and match with feature/serialization-fixes branch.

build0516.zip

@lhespress
Copy link
Owner

@puddly Any update about this?

@KlimovVladimir
Copy link

@lhespress Are you Espressif developer?)
If yes, could you tell please how soon will be update of esp-zigbee-ncp (with features which you discussed with puddly here)?

@lhespress
Copy link
Owner

@lhespress Are you Espressif developer?) If yes, could you tell please how soon will be update of esp-zigbee-ncp (with features which you discussed with puddly here)?

@KlimovVladimir Yes,It'll be update once the firmware as the attachment has been checked and match the function.

@puddly
Copy link
Contributor Author

puddly commented Jun 4, 2024

@lhespress Sorry for the delay. I've tested the firmware and made some adjustments to this branch. It looks like things are getting better.

Things that now work:

  • Model string, manufacturer string, firmware version, reset, and factory reset commands
  • Network backup and restore!

Things that do not:

  • Factory reset initially works but a new network is formed if you take a backup immediately after. When the stick is factory reset, it should not have any network settings.

I think I am confused about the stack startup sequence. Could you explain how to do the following?

  1. Connect to the stack, reboot, and then load the network settings without starting or forming a network.
  2. Connect to the stack, reset, and then check if there is a formed network without forming a new one.

Other than that, it would be nice if the stack sent some sort of indication upon boot or reset, as this would allow the reset() command to work better (without polling) and also would allow the application to immediately detect when the stack has crashed and reset.

@Hedda
Copy link
Contributor

Hedda commented Jul 12, 2024

@lhespress Any updates on this?

@lhespress
Copy link
Owner

@Hedda There is some changes for startup sequence for the issues @puddly mentioned, i'll update it this week.

@lhespress
Copy link
Owner

@puddly

Factory reset initially works but a new network is formed if you take a backup immediately after. When the stick is factory reset, it should not have any network settings.

That's because of it'll form a network automatic after start command, i changed the startup sequence which can read and write the network information without form command. BTW, it must call factory command when write network information with forming a network.

  1. Connect to the stack, reboot, and then load the network settings without starting or forming a network.

reboot means restart the device, right?

It can't load network setting without forming a network in the build0516.zip firmware. I have changed in the follow firmware.

  1. Connect to the stack, reset, and then check if there is a formed network without forming a new one.

reset means factory reset the device, right?

That's because of it'll form a network automatic after start command, so it'll forming a new network in the build0516.zip firmware.

It'll sent an factory indication without forming a new one in the follow firmware.

Other than that, it would be nice if the stack sent some sort of indication upon boot or reset, as this would allow the reset() command to work better (without polling) and also would allow the application to immediately detect when the stack has crashed and reset.

It'll sent an indication when factory or reset in the follow firmware.

Please download the firmware build0717.zip which change startup sequence and match with fixes-serialization branch.

@lhespress
Copy link
Owner

@puddly Any update about the newest firmware?

@puddly
Copy link
Contributor Author

puddly commented Sep 9, 2024

I will try to work on testing the new firmware some time this week.

@robbawebba
Copy link

Hey everyone, I'm interested in using an ESP32-C6 as a zigbee coordinator radio as part of the zigpy stack. I see this PR is linked in various upstream issues as a blocker and wanted to see if there's some way I can help with testing or bug fixes.

Any specific areas you think are good to jump in and help?

@lhespress
Copy link
Owner

Hey everyone, I'm interested in using an ESP32-C6 as a zigbee coordinator radio as part of the zigpy stack. I see this PR is linked in various upstream issues as a blocker and wanted to see if there's some way I can help with testing or bug fixes.

Any specific areas you think are good to jump in and help?

@robbawebba Thanks for your help firstly. Could you help confirm the function use the firmware build_1016.zip on ESP32-C6 which UART use GPIO4 (RX) and GPIO5 (TX)?

@puddly
Copy link
Contributor Author

puddly commented Oct 16, 2024

@lhespress I have a request. Would it be possible for you to build a version of the ESP32-H2 firmware that uses the USB-C UART port of the device? Testing it with a USB-UART adapter requires extra wires and USB ports compared to other coordinators that can function just over USB.


Unfortunately, the new firmware doesn't appear to work very well. Reset indications are not sent on startup and network formation no longer respects the written settings.

Just to make sure we're on the same page, here is exactly how I am testing:

# In the zigpy-espzb virtualenv
pip install git+https://github.com/lhespress/zigpy-cli@feature/zigpy_espzb
pip install -e .  # To install the development version of zigpy-espzb

# Leave the current network and erase all settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB reset

# If you try to read settings, you should get a `NetworkNotFormed` error
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB backup -z

# Form a temporary network, perform an energy scan to find a good channel, then use the settings provided by zigpy
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB form

# Dump the current network settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB backup -z > backup.json

### At this point, read `backup.json` and make sure the settings match what zigpy wrote

# Form a new network with random settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB form

# Restore the old network settings
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB restore backup.json

# Dump the settings again
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB backup -z > backup2.json

# They should be identical except for the frame counter increasing by 5000 and the backup timestamp changing
diff backup.json backup2.json

# Energy scanning should work
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB energy-scan

# And finally, new devices can join the network and be interviewed by zigpy
zigpy -vvv radio --baudrate 115200 znsp /dev/cu.usbserial-A94TYUSB permit

You will see in the log output what settings the form command is trying to write. Can you explain what exact commands I need to run, from the moment I connect to the serial port, to:

  1. Check if a network exists.
  2. Form a network with the exact, provided settings.
  3. Start the current network, if one exists.
  4. Load the currently-written network settings without starting the network.

I feel there is some mistake with the connect() code or the write_network_info() code.

@lhespress
Copy link
Owner

@puddly Please check build_1022.zip which use the USB-C UART port of the device for command. BTW, if you want to show the log which use GPIO4 (RX) and GPIO5 (TX) as UART.

@robbawebba
Copy link

Hi @lhespress, I was able to test the same functionality outlined above on ESP32-C6. I observed a few results that don't match the expected outcome, but I'll let you be the judge of that since I'm new to this effort.

Below are the test steps and observations. Attached are the zigby cli debug logs.
2024-10-22-build_1016.testlog.txt


  1. Leave the current network and erase all settings
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 reset
    ✔️ Looks like the device rebooted and reset

  2. If you try to read settings, you should get a NetworkNotFormed error
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 backup -z
    ❌ Zigpy dumps some network settings, no NetworkNotFormed error observed

  3. Form a temporary network, perform an energy scan to find a good channel, then use the settings provided by zigpy
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 form
    ✔️ A network appears to be formed

  4. Dump the current network settings
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 backup -z > backup.json
    ⚠️ Settings dumped to JSON, with a warning: "Backup is incomplete, ignoring"

  5. At this point, read backup.json and make sure the settings match what zigpy wrote
    ✔️ settings.json appears to be complete, I'm not sure what's missing or incomplete

  6. Form a new network with random settings
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 form
    ✔️ A network appears to be formed

  7. Restore the old network settings
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 restore backup.json
    ⚠️ This command seemed to fail due to an exception "backup is incomplete". but the next step may indicate that the restore succeeded?

  8. Dump the settings again
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 backup -z > backup2.json
    ✔️ Only difference between the backups is the "backup_time" field.

  9. They should be identical except for the frame counter increasing by 5000 and the backup timestamp changing
    diff backup.json backup2.json
    ⚠️ Only difference between the backups is the "backup_time" field. Frame counter is not increased. Both show a frame counter of 5000.

  10. Energy scanning should work
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 energy-scan
    ❌ Warning observed: WARNING Coordinator does not support energy scanning
    ❌ After mean 5 / 5 finished, the energy scan attempts seem to repeat forever. SIGINT is required to end the command.

  11. And finally, new devices can join the network and be interviewed by zigpy
    $ zigpy -vvv radio --baudrate 115200 znsp /dev/ttyUSB1 permit
    ✔️ I see devices joining the network.


Please let me know if you need additional testing help or more logs! Maybe I'll try setting this up with ZHA too.

@puddly
Copy link
Contributor Author

puddly commented Oct 31, 2024

@lhespress You can see from @robbawebba running the tests that there are still some problems.

Can you explain what exact commands I need to run, from the moment I connect to the serial port, to:

  • Check if a network exists.
  • Form a network with the exact, provided settings.
  • Start the current network, if one exists.
  • Load the currently-written network settings without starting the network.

I feel there is some mistake with the connect() code or the write_network_info() code. From the tests, it's clear that the coordinator is not starting up properly and network settings are not persisting.

@lhespress
Copy link
Owner

@puddly

  • Check if a network exists.
  • What do you mean how to check whether have formed a network or not?
  • Form a network with the exact, provided settings.
  • You can setting any information before call the form command.
  • Start the current network, if one exists.
  • It'll start the network if has formed when call start command.
  • Load the currently-written network settings without starting the network.
  • The newest firmware has supported it.

@lhespress
Copy link
Owner

@robbawebba Thanks for you share the test result.

  1. If you try to read settings, you should get a NetworkNotFormed error
  • Reset command only reboot the device, it doesn't erase all setting, factory command will do it.
  1. They should be identical except for the frame counter increasing by 5000 and the backup timestamp changing
  • The frame counter use a same value now, i can change it on next update.
  1. Energy scanning should work
  • Can you explain what the mean of Energy scan? Active scan available network or Energy detect?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants