WIP: Feature: fork kernel #410
Conversation
Hey @maartenbreddels! I'm interested in helping you with this. I don't quite get what the strategy is here... Decreasing run time can indeed be done with more than one process, but you did mention this:
If you are waiting for it to be done, then we are waiting for the whole run time, right? Could you please be more specific about this idea? It does sound interesting, though! :)
Hello @maartenbreddels, I'm very interested in this feature. Is there a way I can help, so that we can implement this together?
I currently don't have much time to work on this, but I'm happy for you to take it over a bit, or test it out.
Hey @maartenbreddels, I believe I found out why the child kernel does not receive the messages. That's due to the class Session:

```python
def send(...):
    ...
    if self.check_pid and not os.getpid() == self.pid:
        get_logger().warning("WARNING: attempted to send message from fork\n%s",
            msg
        )
        return
```

If I set `check_pid` to `False`, then this script:

```python
import sys
import logging

from jupyter_client import BlockingKernelClient

logging.basicConfig()

connection_file = sys.argv[1]

client = BlockingKernelClient()
client.log.setLevel('DEBUG')
client.load_connection_file(connection_file)
client.start_channels()
client.wait_for_ready(100)

# fork() is the new message introduced by this PR
# (requires jupyter/jupyter_client#441)
obj = client.fork()
msg = client.shell_channel.get_msg(timeout=100)

print("=========================================")
print(" Client after fork")
print("=========================================")
client_after_fork = BlockingKernelClient()
client_after_fork.log.setLevel('DEBUG')
client_after_fork.load_connection_info(msg["content"]["conn"])
client_after_fork.start_channels()
client_after_fork.wait_for_ready(100)

client_after_fork.execute(code="foo = 5+5", user_expressions={"my-user-expression": "foo"})
msg = client_after_fork.get_shell_msg(timeout=100)
print("Value of last execution is {}".format(msg["content"]["user_expressions"]["my-user-expression"]["data"]["text/plain"]))

client_after_fork.execute(code="bar = 1+2", user_expressions={"my-user-expression": "foo,bar"})
msg = client_after_fork.get_shell_msg(timeout=100)
print("Value of last execution is {}".format(msg["content"]["user_expressions"]["my-user-expression"]["data"]["text/plain"]))
```

will print to the console:
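The effect of that PID guard can be shown with a small stand-alone sketch. `FakeSession` is a hypothetical stand-in, not jupyter_client's class; only the `check_pid` test mirrors the quoted `Session.send`:

```python
import os


class FakeSession:
    """Minimal stand-in for jupyter_client's Session PID guard (illustration only)."""

    def __init__(self):
        self.check_pid = True
        self.pid = os.getpid()  # recorded when the Session is created

    def send(self, msg):
        # The guard quoted above: messages sent from a process other than
        # the one that created the Session are dropped.
        if self.check_pid and os.getpid() != self.pid:
            return None  # the real Session logs a warning and returns
        return msg


session = FakeSession()
print(session.send("hello"))   # same process: prints "hello"

# Pretend the Session was created by a different (parent) process,
# which is exactly the situation inside a forked child.
session.pid = -1
print(session.send("hello"))   # dropped: prints "None"

# Disabling the check, as discussed above, lets the child send again.
session.check_pid = False
print(session.send("hello"))   # prints "hello"
```

This is why flipping `check_pid` off makes the forked child's messages go through.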
I've also tried connecting to a "forked" kernel and it does work!
Awesome work, I'm really glad I put out this half-working PR now. Hooray to open source :)
(from mobile phone)
Testing
I'm now working on getting the kernel to be reused; I've forked the repo here: https://github.com/edisongustavo/ipykernel/. I believe it will take some time, since I can only work on this in my free time. But I'm getting there!
When subprocesses are spawned with `stdout = open('/dev/null')`, PyCharm is not able to attach to them. This is especially painful when debugging the tests in test_kernel(), where the `kernel()` methods spawn kernels in subprocesses if required.
This is the directory created by PyCharm
Issue
For dashboarding, it is useful to have a kernel + the final state ready instantly. If for instance, it takes ~5 minutes to execute a notebook, it does not lead to good user experience.
Solution
Execute a notebook, and after it is done, consider it a template and give each user a `fork()` of this template process.
Implementation
The kernel gets a new message, simply called `fork`. Asyncio/tornado ioloop combined with fork is a no-go; it is basically not supported. To avoid this, we stop the current ioloop and use the ioloop object to communicate the fork request up the call stack. The kernelapp then checks whether a fork was requested and, if so, forks.
Usage (although it does not work yet)
It also needs this PR: jupyter/jupyter_client#441
Shell 1:
Shell 2:
The last command never does anything, so I ctrl-c'ed it; the stack trace is included.
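The implementation idea described above (record the fork request, stop the loop, fork outside it) can be sketched stand-alone. This is a minimal illustration under assumed names (`ForkableApp`, `handle_fork_message`), not the PR's actual code:

```python
import asyncio
import os
import sys


class ForkableApp:
    """Sketch: fork cannot happen inside a running event loop, so the
    handler records the request and stops the loop; the caller forks
    after run_forever() has returned."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.fork_requested = False

    def handle_fork_message(self):
        # Runs inside the loop, standing in for the 'fork' message handler.
        self.fork_requested = True
        self.loop.stop()  # communicate the request up the call stack

    def run(self):
        self.loop.call_later(0.01, self.handle_fork_message)  # simulate a message
        self.loop.run_forever()  # returns once handle_fork_message stops the loop
        if self.fork_requested:
            pid = os.fork()  # safe here: no event loop is running
            if pid == 0:
                print("child: would rebind sockets and start a fresh loop")
                sys.stdout.flush()  # os._exit() does not flush buffers
                os._exit(0)
            os.waitpid(pid, 0)
            print("parent: fork handled, could resume the loop")


ForkableApp().run()
```

The design point is that `run_forever()` doubles as the synchronization barrier: once it returns, no callbacks are in flight, so `os.fork()` does not duplicate a live loop.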
Problem
It doesn't work 😄. The forked child process doesn't seem to be reachable via zmq, and no debug info is printed, so I'm giving up here and hope that someone else has good ideas or wants to pick this up.
Side uses
Forking the kernel could also be used for 'undoing' the execution of a cell/line, by forking after each `execute(..)`. Note that since fork uses copy-on-write, this will be fairly cheap in resource usage.
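To see why copy-on-write makes this cheap, here is a minimal stdlib sketch (illustrative only, not the PR's code): the child mutates its copy of the state, while the parent remains an untouched pre-execution snapshot, so "undo" amounts to going back to the snapshot process.

```python
import os

state = {"cells_executed": 0}

pid = os.fork()
if pid == 0:                       # child: runs the next "cell"
    state["cells_executed"] += 1   # only the touched pages get copied (COW)
    os._exit(0)                    # discarding the child == undoing the cell

os.waitpid(pid, 0)
print(state["cells_executed"])     # parent snapshot is unchanged: prints 0
```

Until a page is written, parent and child share the same physical memory, so keeping a snapshot per executed cell costs far less than a full copy of the kernel's state.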