Redirecting STDOUT and STDERR #68

Open
eric-hemasystems opened this issue Oct 19, 2022 · 9 comments
Labels
❓ question Further information is requested

Comments

@eric-hemasystems

Summary

When I run a WASI binary with this library, its stdout and stderr are directed to the application's stdout and stderr. I would rather capture that output so I can present it in my own way. Is there an option for that which I am missing?

Additional details

Right now I'm working around this issue by futzing with the file descriptors for the program but this seems less than ideal.

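require 'tempfile'

# Runs the block with the process-wide stdout and stderr redirected to
# temp files, then returns whatever was written to stdout.
def capturing_output
  old_stdout = $stdout.dup
  old_stderr = $stderr.dup

  Tempfile.create '' do |stdout|
    $stdout.reopen stdout.path, 'w+'

    Tempfile.create '' do |stderr|
      $stderr.reopen stderr.path, 'w+'

      yield

      stdout.read
    rescue RuntimeError
      # ScriptError is the custom three-argument error class shown later
      # in this thread, not Ruby's built-in ScriptError.
      raise ScriptError.new stdout.read, stderr.read, $!.message
    end
  ensure
    # Always restore the original descriptors, even when the block raises.
    $stdout.reopen old_stdout
    $stderr.reopen old_stderr
  end
end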

capturing_output do
  Wasmer::Instance.new(module_, import).exports._start.()
end

It's both overly complicated and requires files, which is IO I would prefer to avoid. I expect my output to be very small, so it would be ideal if I could provide a StringIO object and avoid that disk access.
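One way to drop the temp files while keeping the same descriptor-level trick is to point the descriptor at an in-memory pipe. This is only a sketch, assuming the WASI program writes to the host process's real file descriptors (as the workaround above implies). `capture_fd` is a hypothetical helper, not part of wasmer-ruby, and it has the same process-wide side effects as the tempfile version.

```ruby
# Hypothetical helper: capture everything written to the given IO's file
# descriptor while the block runs, without touching the disk.
def capture_fd(io)
  saved = io.dup                      # remember the original destination
  reader, writer = IO.pipe
  io.flush
  io.reopen(writer)                   # the descriptor now points at the pipe
  writer.close                        # io holds the only remaining write end
  # Drain concurrently so output larger than the pipe buffer cannot block.
  drain = Thread.new { reader.read }
  begin
    yield
  ensure
    io.flush
    io.reopen(saved)                  # restore; this also closes the write end
    saved.close
  end
  drain.value                         # reader sees EOF once the write end is gone
ensure
  drain&.join
  reader&.close
end
```

With the call from the example above, usage might look like `out = capture_fd($stdout) { Wasmer::Instance.new(module_, import).exports._start.() }`.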

eric-hemasystems added the ❓ question label Oct 19, 2022
@kingdonb

I went the other way – I assumed I wasn't meant to redirect the input/output, since Wasi "has the conn"

But I see you can do it, at least Ruby doesn't stop you!

I went ahead and converted my whole HTML-parsing strawman WASI program into a Wasm library rather than simply redirecting the output and capturing it. Now I come to find out that there is no String type in Wasm, so I have to learn how to use a pointer and pass some memory around. Now I wish I had done the I/O redirection instead. :table_flip:

All this in order to have my strawman example do the Http Client part in Ruby, but have the actual HTML Parsing part happen in my Rust module. (This is actually my first Rust program! Wasm is so cool! I just wish I knew more about Wasm itself before I got this far along building and tilting at windmills... I feel like I have no idea what I'm doing)

I think the example for exported memory says it's going to show how to read from and write into an exported memory:

https://github.com/wasmerio/wasmer-ruby/tree/bcc8a79fa32cc656db063daa785edeba901c7772/examples#exports

but the actual reader only ever calls memory.uint8_view pointer, then take(13), since we know how many bytes are in "Hello World".

I would suggest, rather than redirecting STDOUT and STDERR, perhaps it would be great to have an example that does show how to go about sizing a memory for a string that we know the length of, and writing it into the memory for your Wasm to read on demand?

Or maybe that's not the right pattern. Anyway I think this goes here, the sentiment is there aren't enough examples that show string handling. Can we pass strings into Wasm too? I'm learning this for the first time as a Rubyist and it is hard! :)

@kingdonb

kingdonb commented Apr 25, 2023

I've now read a bit further in the docs, and I found that:

https://docs.wasmer.io/integrations/examples/memory-pointers

All the examples that center around writing have been omitted for Ruby. Can there be a safety reason for this? I think it's probably on purpose, right?

Edit: I read a bit further, I think I get it now. You don't write exported memory in Wasm modules from Ruby. If you were writing exported memory, you'd be writing something like a native (C) extension. The examples for interacting with memory include Rust, Go, and C... which are all languages that either are C, or already support extern C natively one way or another here.

Ruby evidently isn't letting us do all that. I'm not sure why, but I'll keep reading!

@eric-hemasystems
Author

Another problem I've found with my IO redirection strategy is that it doesn't really support multi-threading. If another thread is writing to $stdout or $stderr (say, for logging), you end up capturing content from that other thread.

Since the STDOUT and STDERR file descriptors are process-wide, this really means I may have to fork the process to get unique STDOUT and STDERR file descriptors and then communicate the info back to the parent process when done.
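The fork idea can be sketched roughly as follows. This is not the code from the gist, just a minimal illustration using Ruby's standard `fork`, `IO.pipe`, and `Process.wait2`. `run_isolated` is a hypothetical name, and the sketch assumes a platform where `fork` is available.

```ruby
# Hypothetical helper: run the block in a forked child whose stdout and
# stderr are private pipes, and hand the captured text back to the parent.
def run_isolated
  out_r, out_w = IO.pipe
  err_r, err_w = IO.pipe

  pid = fork do
    out_r.close
    err_r.close
    STDOUT.reopen(out_w)    # only the child's descriptors change
    STDERR.reopen(err_w)
    code = 0
    begin
      yield
    rescue StandardError => e
      STDERR.puts "#{e.class}: #{e.message}"
      code = 1
    end
    STDOUT.flush
    STDERR.flush
    exit!(code)             # skip at_exit hooks inherited from the parent
  end

  out_w.close               # the parent keeps only the read ends
  err_w.close
  # Read both pipes concurrently so neither can fill up and stall the child.
  readers = [out_r, err_r].map { |io| Thread.new { io.read } }
  _, status = Process.wait2(pid)
  [*readers.map(&:value), status]
ensure
  [out_r, out_w, err_r, err_w].each { |io| io.close if io && !io.closed? }
end
```

The parent's own descriptors are never touched, so other threads that log to $stdout or $stderr are unaffected. A call might look like `out, err, status = run_isolated { Wasmer::Instance.new(module_, import).exports._start.() }`.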

On the JS side of things there is wasmfs, which can be used to provide a STDOUT and STDERR to the wasmer process independent of the calling process's STDOUT and STDERR. It seems we need something like that.

@kingdonb

kingdonb commented Jun 1, 2023

I'm not using threads but fibers instead, so that's not really a problem for me, but it is a problem when there is an error

(I wound up doing instead):

raise ScriptError, "#{stdout.read}, #{stderr.read}, #{$!.message}"

because in the version above passing 3 arguments to ScriptError.new looks to be causing some kind of a syntax error.
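Some background that may explain the error: Ruby itself defines a core `ScriptError` class (the parent of `LoadError`, `NotImplementedError`, and `SyntaxError`), and its constructor accepts at most one argument. Unless a custom `ScriptError` is in scope, the three-argument call resolves to the core class and fails with an `ArgumentError`. A small check, assuming no custom class of that name has been defined:

```ruby
# Ruby's core ScriptError descends from Exception, not StandardError,
# so a bare `rescue` will not catch it.
core_parent = ScriptError.superclass
is_standard_error = ScriptError.ancestors.include?(StandardError)

# The core constructor takes zero or one argument; three raise ArgumentError.
arity_error = begin
  ScriptError.new("stdout text", "stderr text", "message")
  nil
rescue ArgumentError => e
  e.message
end
```

It also means that a top-level `class ScriptError < StandardError` raises a `TypeError` (superclass mismatch), so a custom class with that name needs its own module or class namespace.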

But if you drop in a binding.pry statement here, you get the pry in the context where the stdout and stderr have been redirected, so not very useful and the input/output seems frozen unless you know that is what has happened.

I would love to try this again but fork and waiting, or with the wasmfs approach, but I figured out the problem for now by rethrowing that error to outside of the I/O redirection context, and then it seems to be recoverable. (Or in the case of my error, since no recovery was possible, I just fixed it... I needed a 64 bit integer where I had only given a 32 bit one)

I'm developing a wasm module that runs in Ruby, "in anger" here now: https://github.com/kingdonb/stats-tracker-ghcr

It's nice that it caught the type error, but the debugging experience leads me to believe I'm still not doing things the best possible way.

@eric-hemasystems
Author

Ah, I forgot to include the custom error in my example:

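# Exception raised with either a syntax or execution error.
# Note: Ruby already defines a core ScriptError (a subclass of Exception),
# so this class has to live inside a module or class namespace. At the top
# level this definition raises "superclass mismatch for class ScriptError".
class ScriptError < StandardError
  attr_reader :stdout, :stderr

  def initialize stdout, stderr, msg=nil # :nodoc:
    super msg

    @stdout = stdout
    @stderr = stderr
  end

  def to_s # :nodoc:
    <<~OUTPUT
      Standard Out: #{@stdout}
      Standard Error: #{@stderr}
    OUTPUT
  end
end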

If you're curious, my entire module can be found here. Its goal is to execute JS safely server-side. It uses SpiderMonkey (Firefox's JS engine) compiled to Wasm. Feel free to use whatever is useful.

@eric-hemasystems
Author

If you are using my code (from above or the gist) note the GC.start note I have in the gist. I didn't find out about that until later but you will eat memory if you don't do it.
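The gist is not reproduced in this thread, so the following is only a guess at the general shape of that advice: force a full collection after each run so that objects held by the finished instance are released promptly. `with_gc_after` is a hypothetical name, and whether this actually frees the Wasm memory depends on how the extension ties it to Ruby objects.

```ruby
# Hypothetical wrapper: run one invocation, then force a full GC pass
# whether or not the block raised.
def with_gc_after
  yield
ensure
  GC.start
end
```

The block's return value passes through unchanged, so the wrapper can be placed around an existing call without other changes.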

@kingdonb

I will definitely investigate using your forking solution today. I'll copy your version with ScriptError, and definitely take some measurements of before and after I call that GC. Since I've been writing this process as a cronjob, I don't really care to call the GC if I'm just getting immediately shut off when I'm over and done with anyway. But the final design of the operator calls for something other than a cronjob, so it will become important (thanks for the pointers!)

There are now several operations that might be running at the same time, and wasmer is one. I think I finally got the fibers working (⚡ 🚅 ) and I noticed the problems that you mentioned even with fibers, I think it's not only a problem for threads, as I occasionally saw the program exiting with a strange error, like runtime errors or methods missing, finding strings and numbers in places where fibers should be. I'm really honestly not sure how many of the things I'm using are really fiber safe, or if I'm using them in properly fiber-safe ways. Still learning my way around this part of Ruby.

Anyway, I don't see fork and exec in your code, so I'm assuming you just use fork and wait (or detach) somewhere else in your codebase before you call this code via call method? (Or is there something about the call method that does fork and I'm just not seeing it?)

@eric-hemasystems
Author

eric-hemasystems commented Jun 20, 2023

Yea, as of my last post I hadn't yet resolved the IO capturing issue. As a temp solution I just reduced to one thread in the process that runs the code. It was a background process so that is ok.

But a week or so later I did update to do the fork. Code got a bit more ugly but it seems to work. Just updated that gist to show the latest version. This hasn't seen production usage yet unlike the last version.

Updating this library to allow IO to be captured directly would be far superior, but that seems more involved, so the fork hack will have to do for now. The good news is that the manual garbage collection is no longer needed, since the forked process exits completely after running the wasmer binary, reclaiming all memory. I did preload the wasmer binary in the parent process, so copy-on-write memory semantics mean less memory churn.

I'm not sure how forking will work with fibers. I know when you fork it only runs the current thread in the sub-process (which is what I want). I don't know enough about fibers to know if it will work there.
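The thread behavior is easy to check with a small experiment. This sketch uses only core Ruby (`Thread`, `fork`, `IO.pipe`) and says nothing about fibers. The variable names are illustrative.

```ruby
background = Thread.new { sleep }   # stands in for a logging thread
reader, writer = IO.pipe

pid = fork do
  reader.close
  # Report how many threads are still alive inside the child.
  writer.write(Thread.list.count(&:alive?).to_s)
  writer.close
  exit!(0)
end

writer.close
threads_in_child = reader.read.to_i
reader.close
Process.wait(pid)

background.kill
background.join
```

The parent has at least two live threads at the time of the fork, while the child should report only the one that called `fork`.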

@kingdonb

kingdonb commented Jun 21, 2023

I think if I build my fiber to fork and wait, then the Ruby program mostly stays the same. The things that fiber was doing with wasmer, with the HTTP client, or with ActiveRecord models that were unsafe (e.g. the STDIO redirection trick) become safe for fibers: since the fiber is now just forking and waiting, whatever it was doing that was unsafe before gets its own subprocess context, and it should gain safety automatically because it no longer shares the same process space.

I think it's possible I didn't design this model carefully or thoughtfully enough, maybe better add some tests... thanks for the good examples 👍

2 participants