SshSession::connect_mux hangs on MacOS 14.5 #150

Open
justin-elementlabs opened this issue Aug 7, 2024 · 21 comments

@justin-elementlabs

When running the example code, Rust prints "SSH before" but never prints "SSH after". After a while, I see a warning that my test "has been running for over 60 seconds". It then continues to hang with no error or further output.

Am I missing libraries on my machine? Are there other suggestions on how to find out what is causing this to hang? The russh crate has the same issue.

println!("SSH before");
let result = SshSession::connect_mux(
    format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ),
    KnownHosts::Add,
)
.await;
println!("SSH after");

@NobodyXu
Member

NobodyXu commented Aug 8, 2024

SshSession::connect_mux is usually used when you have already created an ssh multiplex master.

Usually you want to use SessionBuilder instead.

@justin-elementlabs
Author

justin-elementlabs commented Aug 12, 2024

@NobodyXu, thanks for the tip. Could you help with a full example? This is what I have right now.

It's hanging after printing "SFTP 201".

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

let mut b = SshSessionBuilder::default();
b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);
let (b, d) = b.resolve(&dest);
println!("SFTP 201");

let temp_dir = b.launch_master(&dest).await; // What do I do with temp_dir?
println!("SFTP 202");

let result = SshSession::connect(&dest, KnownHosts::Add).await;
println!("SFTP 203");
if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 204");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

@NobodyXu
Member

NobodyXu commented Aug 12, 2024

Connecting is quite simple:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect_mux(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

@justin-elementlabs
Author

justin-elementlabs commented Aug 12, 2024

Following the example provided, it still hangs after printing SFTP 201:

println!("SFTP 200");

let mut b = SshSessionBuilder::default();
let b = b.keyfile("...");
let dest = format!(
    "ssh://{}@{}:{}",
    &credentials.username, &credentials.host, &credentials.port
);

println!("SFTP 201");
let result = b.connect_mux(&dest).await;
println!("SFTP 202");

if result.is_err() {
    return Err(format!(
        "Unable to connect to SSH server at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 209");

let ssh_session = result.unwrap();

let result = _Sftp::from_session(ssh_session, Default::default()).await;
if result.is_err() {
    return Err(format!(
        "Unable to establish to SFTP session from SSH session at {}:{} (error: {})",
        &credentials.host,
        &credentials.port,
        result.err().unwrap()
    ));
}
println!("SFTP 220");

@justin-elementlabs
Author

If I run ps -ax | grep ssh I do see

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

But Rust just hangs.

@NobodyXu
Member

NobodyXu commented Aug 12, 2024

I think it might be the ssh multiplex master that is hanging. Maybe the remote host is hanging somehow?

Can you log in to the remote host using ssh on the command line?

@justin-elementlabs
Author

@NobodyXu, yes, I can ssh fine from the command line.

@NobodyXu
Member

So the multiplex master is not working for some reason...

Can you try connecting to the multiplex master directly, using ssh?

@justin-elementlabs
Author

@NobodyXu, yes, this works:

ssh -S /.../.local/state/.ssh-connectionyNAAbu/master ip

@NobodyXu
Member

Can you try:

let session = SshSessionBuilder::default()
    .keyfile("...")
    .connect(format!(
        "ssh://{}@{}:{}",
        &credentials.username, &credentials.host, &credentials.port
    ))
    .await?;

Maybe it's the openssh-mux-client not working?

@justin-elementlabs
Author

justin-elementlabs commented Aug 12, 2024

@NobodyXu
It's the same, unfortunately. It hangs when connect is called and prints only SFTP 201.

println!("SFTP 201");
let result = SshSessionBuilder::default().keyfile("...").connect(&dest).await;
println!("SFTP 202");

@NobodyXu
Member

OK, I think I misunderstood the issue.

It actually gets stuck in launch_master; it's likely that the ssh command never exits.

We expect ssh to fork, create a server process in the background, and then return.

That's likely not what happens here.

@NobodyXu
Member

ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ssh://...

@justin-elementlabs if you execute this command manually, does it exit immediately or get stuck?

@justin-elementlabs
Author

justin-elementlabs commented Aug 12, 2024

@NobodyXu it runs for 1-2 seconds and then returns to the terminal (so it exits almost immediately):

% ssh -E .../.local/state/.ssh-connectionbb4jRB/log -S .../.local/state/.ssh-connectionbb4jRB/master -M -f -N -o ControlPersist=yes -o BatchMode=yes -o StrictHostKeyChecking=accept-new -p 22 -l ... -o IdentitiesOnly=yes -I ... ...

(1-2 seconds later with no more user input...)

%
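
For a tokio-free baseline, the manual check above could also be scripted with the standard library alone. This is only a rough sketch: `wait_with_deadline` is a hypothetical helper name, and `sh -c 'exit 0'` is a stand-in for the real ssh invocation, whose elided arguments are not reproduced here.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: spawns `program` and polls for its exit until `limit` elapses.
/// Returns `Ok(None)` if the process was still running at the deadline.
fn wait_with_deadline(
    program: &str,
    args: &[&str],
    limit: Duration,
) -> io::Result<Option<ExitStatus>> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let deadline = Instant::now() + limit;
    loop {
        // try_wait never blocks: it reports the exit status only if the child is done.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Give up: stop the child and reap it so no zombie is left behind.
            let _ = child.kill();
            child.wait()?;
            return Ok(None);
        }
        sleep(Duration::from_millis(50));
    }
}

fn main() -> io::Result<()> {
    // Placeholder command; substitute the ssh command line under investigation.
    match wait_with_deadline("sh", &["-c", "exit 0"], Duration::from_secs(10))? {
        Some(status) => println!("process exited: {status}"),
        None => println!("process was still running at the deadline"),
    }
    Ok(())
}
```

If the process exits here but the tokio-based wait does not return, that points at the async side rather than at ssh itself.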

@NobodyXu
Member

In openssh, we just wait for the process to exit and check its status: https://docs.rs/openssh/latest/src/openssh/builder.rs.html#487

Given that we are using tokio::process, I think something might be wrong with it. @justin-elementlabs

@NobodyXu
Member

Which tokio version are you using?

And what is the kernel (linux or macOS)?

I strongly suspect it's a bug in tokio

@justin-elementlabs
Author

[[package]]
name = "tokio"
version = "1.38.0"

macOS 14.5

@NobodyXu
Member

I recommend updating to the latest tokio (1.39.2). If it still doesn't work, try launching that ssh command using tokio::process directly. If it gets stuck under tokio but not on your command line, then it could be a tokio bug.

@justin-elementlabs
Author

@NobodyXu, upgrading tokio didn't help. FYI, I created an issue: tokio-rs/tokio#6770

@sander2

sander2 commented Oct 29, 2024

@justin-elementlabs assuming that this is still an issue, I think it might be caused by your use of executor::block_on: https://github.com/elementlabs42/BitVM-playground/blob/e102f23d88fee5c46f8e6f86442f371e857016ba/src/bridge/client/data_store/sftp.rs#L208. If you call the function from a regular async function in a tokio context, it works for me.

@NobodyXu
Member

If block_on causes it to fail, then maybe you can try tokio::spawn and then block_on the returned handle?
