-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uselect Poll.ipoll() returns unexpected events with 'userdata' feature #46
Comments
@AndyFug, I would like to thank you for the excellent bug report. It indeed covers all the major points - Unix port tested, detailed description, minimal reproduction testcase provided, and even the problem spot is pinpointed. I understand that it took some time to prepare, and even just write down, this report, and it's fully appreciated. That's the kind of report any open-source maintainer would like to receive (well, I for sure). I have to admit that lately I'm concentrating on Unix port, simply because IMHO no MicroPython microcontroller port is without issues, and my approach (unlike MicroPython) is that the Unix port is the main one, and other ports should follow it, not the other way around. Actually, I'm not even sure if I have an esp8266 hardware here (== where world's lockdown caught me) for testing. So well, I apologize if processing this will take more time. I hope to look into this in more detail on weekend. |
Hi @pfalcon, and many thanks for your response. Glad that my first report on GitHub has covered the main points... I now need to start learning a bit more about git/github etc and attempt to apply it to some of my projects. Totally understand re. Unix port being your main priority. I thought you may say that. I'm keen to use Picoweb (and, in turn, Pycopy etc) so I'll just keep my reverted version of uasyncio running on my current project for now until you get an opportunity to look into this. Cheers, |
@AndyFug, So, I had a quick look so far. The commit which introduces userdata support is 3ae5c19. As you can see, it's marked "WIP", but that's mostly, IIRC, because it didn't have an associated unittest (which is kinda hard to write so it covered every possibility). The changes in that commit are however pretty simple and obvious. It doesn't change any existing algorithms, just add storage for this "userdata", then marshals it back and forth. So, while your testcase demonstrates that the problem is triggered by "userdata" param, I suspect that's a false lead, and there's a different underlying problem. One issue can be that baremetal (unlike Unix) moduselect.c uses mp_map_t to store info about the polled sockets, which is not safe to change (e.g., add, possible, delete), during iteration. And your testcase demonstrates that, in the
etc. So, to verify if that would be the problem, you would need to replace:
with
I.e. wrap ipoll() call in a list(), to make it eager (produce all results in once) instead of lazy (incremental). If you can test such a change, I'd appreciate it, because I still have a bunch of WIP issues in my queue, before I can start to look into this one. |
Hi, while True:
gc.collect()
log.debug("Memory used: {}".format(start_mem - gc.mem_free()))
# One-shot behaviour + eager instead of lazy iteration by wrapping result in a list
res = list(poll.ipoll(-1, 1))
log.debug("ipoll result: {}".format(res))
for s, e, obj in res:
i += 1 Sadly the output is still the same. As soon as I start the server loop with 'userdata' parameter, I get the following output (with no client connections):
For what it's worth (given my lack of experience/understanding at this low level), I'll take a look at that commit you mentioned ( 3ae5c19 ) and see if I can spot anything else or think of another test that would highlight the underlying issue. I'll report back if I discover anything.. (unlikely!) :) Many thanks again for taking the time to have a think about this. Let me know if you want me to conduct any further tests with the test case or experiment with tweaks to other modules. Cheers, |
Hi,
Novice to this world, but hoping that this is a valid contribution. Apologies if I've missed something obvious or if this report isn't clear enough/in the correct place. Sorry that it's a little lengthy but I'm led to believe that more (useful) information is often better than less. Feedback accepted. :)
I was trying to use Picoweb/Uasyncio with Pycopy on an ESP8266, when I stumbled across the EAGAIN issues discussed here:
#30
micropython/micropython#4824
I applied the work-around suggested by catching the exception on s.accept() in uasyncio start_server() loop, but that then unveiled what appeared to be going on (once you enabled verbose logging). It appeared to me that uselect Poll.ipoll() method was returning unexpected events, so when uasyncio's start_server() loop was expecting a POLLIN event after yielding IORead(s) it was, in fact, receiving other events - mostly POLLOUT(4), hence why it was expecting to accept new client connections when there weren't any and raising the EAGAIN exception.
To look into this further, I came up with a minimal test-case for uselect using the same methods as uasyncio, based on others' code from previous posts:
After playing around with this code on the ESP8266, I discovered that the issue described only appeared to come about when adding the 'userdata' argument to the Poll.register() method. The code above can be run with/without 'userdata' and, for me, it only works as expected when the 'userdata' feature isn't used. I used a basic packet sender application as a client to send short ASCII strings. Here are the outputs:
With Poll.register() 'userdata' argument:
(No client connections during this output... just the server running on its own)
Without Poll.register() 'userdata' argument:
(+ Client connection with 3 ASCII strings sent client > server)
Extra info:
For my Picoweb app (which was leaking memory after consecutive client connections), I've reverted uasyncio's application of userdata for callbacks by rolling back commit: pfalcon/pycopy-lib@69fecaf. This seems to have fixed my memory leak issue for now.
Sadly I'm not experienced enough to find the root cause of the issue myself. I suspect it could lie in a C module (Such as this one?) or lwIP stack, but I don't have the experience to follow what's going on at this low level.
Many thanks in advance to anyone who looks into this with me, and for all the fantastic work you guys have done on these projects.
Kind regards,
Andy
The text was updated successfully, but these errors were encountered: