Remove node-calls-python #55

Closed
josephjclark opened this issue May 8, 2024 · 1 comment · Fixed by #61

Comments

@josephjclark
Collaborator

Sadly, node-calls-python isn't going to be a sustainable fit for this project. It's a real shame.

The two issues are:

  1. Logging - how do we get python stdout back into node?
  2. Concurrency - how do we run multiple modules at once?

Admittedly, the concurrency questions are still open, but at the very least it gives us problems with logging.

The problem is this:

  • node-calls-python creates a single thread (so far as I understand it) in which all python executes
  • I think that means that if one module starts blocking, all modules will die
  • There is absolutely no threading/module control for npc. I get one context and I can't create more. Even if I could, I'm not sure
  • I can't at the moment get the stdout for node-calls-python at all
  • But if I had multiple jobs running, I could not tell the logs apart - I wouldn't know which was which

Possible solutions:

  • I could use workerpool and create a new thread with node-calls-python for each job. But what's the point? We may as well just do child_process.exec()

Basically I think that, for now at least, it's going to be just as easy to do child_process.exec and create a new python context each time. At this point I don't even care if that's a little slow in production - but I don't even have a reading on HOW slow.
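A minimal sketch of the one-process-per-job idea, assuming each job can be started as its own command. It uses `child_process.spawn` rather than `exec` so the arguments are passed as an array instead of a shell string. The names `runJob` and `jobId` are hypothetical and only there for illustration. Each job gets a fresh process, and every stdout line is tagged with the job id so concurrent logs can be told apart.

```javascript
const { spawn } = require('node:child_process');

// Run one job in its own child process and collect its stdout.
// `runJob` and `jobId` are illustrative names, not part of any existing API.
function runJob(jobId, command, args) {
  return new Promise((resolve, reject) => {
    // stdin is ignored, stdout is piped back to us, stderr goes to node's stderr.
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });

    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => {
      output += chunk;
    });

    // 'error' fires if the process could not be started at all.
    child.on('error', reject);

    // 'close' fires once the process has exited and its stdio streams are closed.
    child.on('close', (code) => {
      const lines = output
        .split(/\r?\n/)
        .filter((line) => line.length > 0)
        .map((line) => `[${jobId}] ${line}`);
      resolve({ jobId, code, lines });
    });
  });
}
```

Something like `runJob('job-42', 'python3', ['my_module.py'])` would then resolve with the tagged lines for that job, where `my_module.py` stands in for whatever script is being run. This buffers all output until the process exits; streaming the lines as they arrive would need a line splitter on top of the `data` handler.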

TODO:

  • Try and investigate blocking and queuing. Can I run two modules concurrently? Most of the time it's just gonna wait for HTTP. I don't know enough about python to know if that wait is blocking. That should be a quick test. Update: well, that wasn't easy, but yes, I think they run sequentially.
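For comparison, here is a rough sketch of the same kind of timing test against separate child processes. This is not the test that was run against node-calls-python, and `timeChild` and `measureOverlap` are hypothetical helper names. The idea is to start two processes that each wait for a while and compare the total elapsed time with the sum of the two individual durations: if the processes queue up, the total is at least the sum; if they overlap, it is noticeably less.

```javascript
const { spawn } = require('node:child_process');

// Start a child process and resolve with how long it took to finish (ms).
function timeChild(command, args) {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    const child = spawn(command, args, { stdio: 'ignore' });
    child.on('error', reject);
    child.on('close', () => resolve(Date.now() - start));
  });
}

// Launch the same command twice without waiting in between, then compare
// the wall-clock total against the sum of the two individual durations.
async function measureOverlap(command, args) {
  const start = Date.now();
  const durations = await Promise.all([
    timeChild(command, args),
    timeChild(command, args),
  ]);
  const elapsed = Date.now() - start;
  const sum = durations[0] + durations[1];
  return { elapsed, sum, concurrent: elapsed < sum };
}
```

With python on the path, a call such as `measureOverlap('python3', ['-c', 'import time; time.sleep(1)'])` should report `concurrent: true`, since each process has its own interpreter.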
@josephjclark
Collaborator Author

Another issue coming up is that the python modules have to be DESIGNED to be long lived. You have to think very carefully about what gets instantiated where. Right now the logger has to be created inside main and given a unique id. Module-scoped loggers will forever log to the wrong file. And the same goes for any clients: you need to think about when they get instantiated, and be sure you're not accidentally using the wrong client.
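The same pitfall exists in any long-lived runtime, so here is a small JavaScript analogy rather than the actual python code. `createLogger`, `sinks` and the two `main` variants are all hypothetical. A logger created at module scope is bound once, when the module first loads, and keeps writing to the first job's sink; a logger created inside main follows the id of the current call.

```javascript
// Hypothetical logger factory: each logger appends to the sink for one job id.
const sinks = new Map();
function createLogger(jobId) {
  if (!sinks.has(jobId)) sinks.set(jobId, []);
  return (message) => sinks.get(jobId).push(message);
}

// Pitfall: created once at load time, so it stays bound to whichever job
// happened to trigger the load (here 'job-1').
const moduleLogger = createLogger('job-1');

function mainWithModuleLogger(jobId) {
  // Logs for every later job end up in the sink for 'job-1'.
  moduleLogger(`running ${jobId}`);
}

// Safer: build the logger inside main from the id of the current call.
function mainWithLocalLogger(jobId) {
  const log = createLogger(jobId);
  log(`running ${jobId}`);
}
```

Calling `mainWithModuleLogger('job-2')` writes to the `job-1` sink, while `mainWithLocalLogger('job-2')` writes to the `job-2` sink. A fresh process per call sidesteps the question, because nothing survives between jobs.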

This is fairly hard engineering stuff. I'd actually PREFER that each python module gets loaded, run and destroyed on every call - it's so much safer.

I suppose we could keep node-calls-python in always-reload mode, but in local dev that feels a little bit unstable (the server keeps dying and I haven't yet diagnosed why). I also hope that some of the other libraries will make getting the stdout easier.

@github-project-automation github-project-automation bot moved this from Backlog to Done in v2 May 21, 2024