-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
first pass at server-side sockets (+ client comm changes) #5
Conversation
src/server/model_db.ts
Outdated
setupUserModel().then((dict) => { | ||
this.writeNewVars(dict.vars as tf.Tensor[]); | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we start the server for the first time, the effect of these lines will be to:
- create
dist/data/
- load the mnist transfer CNN from the internet
- save its fully connected variable values to
dist/data/<some-timestamp>.json
However, (3) will actually happen asynchronously (I can't await
its completion in the constructor), and the server might start before it finishes. This will only be an issue the very first time we start the server, so it might not be worth addressing, but it would probably be better to wait on starting the server until the database is ready to send weights to the client.
src/server/model_db.ts
Outdated
const DEFAULT_MIN_UPDATES = 10; | ||
const NO_MODELS_IN_FOLDER = '0'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little bit hacky -- I'm having getLatestId
return '0'
if there are no models in the data directory, and then using that as a special indicator that we need to initialize the database. Let me know if you think of a nicer way to structure this. I could have getLatestId
return undefined
instead, I guess.
This is great for a first pass! I left a few comments, but nothing blocking since things are still in flux, and not worth overthinking it in the beginning (thus LGTM) Reviewed 11 of 11 files at r1. src/model.ts, line 27 at r1 (raw file):
you can constrain the return type to be a src/model.ts, line 28 at r1 (raw file):
Move these types interface FederatedModel {
async setup(): ModelDict;
} and implement it here in src/model.ts, line 29 at r1 (raw file):
Maybe don't allow src/serialization.ts, line 68 at r1 (raw file):
Have you tried sending a src/server/fetch_polyfill.ts, line 5 at r1 (raw file):
Curious why you had to do the eval stuff. Anyhow, not blocking - we can chat in person. src/server/model_db.ts, line 27 at r1 (raw file): Previously, asross (Andrew Ross) wrote…
That's ok. Nit: usually returning src/server/model_db.ts, line 70 at r1 (raw file): Previously, asross (Andrew Ross) wrote…
Since c-tors are limited, we usually avoid doing real work in them, and have a setup() method on the class (which can be async) and you can call it from another place. In this case that setup() method will do all of the 3 steps. Comments from Reviewable |
Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. src/model.ts, line 20 at r1 (raw file):
can you import this from the public API? src/model.ts, line 25 at r1 (raw file):
I know we're just getting started, but maybe we could not hard code some examples into src/serialization.ts, line 59 at r1 (raw file):
it's probably a good idea to make this async (return src/serialization.ts, line 71 at r1 (raw file):
it would be good at add test coverage for this file src/server/fetch_polyfill.ts, line 1 at r1 (raw file):
put a license at the top of this file. src/server/fetch_polyfill.ts, line 8 at r1 (raw file):
use spaces not tabs src/server/model_db.ts, line 27 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
+1 to using null here src/server/model_db.ts, line 67 at r1 (raw file):
you dont need parens around dict since it's the only argument src/server/model_db.ts, line 89 at r1 (raw file):
return type on this method Comments from Reviewable |
src/model.ts, line 20 at r1 (raw file): Previously, nsthorat (Nikhil Thorat) wrote…
It doesn't seem to be available from Comments from Reviewable |
src/server/model_db.ts, line 27 at r1 (raw file): Previously, nsthorat (Nikhil Thorat) wrote…
Great, done (and thanks for the heads up about Comments from Reviewable |
src/server/model_db.ts, line 70 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Got it, good to know about that pattern. Comments from Reviewable |
src/serialization.ts, line 68 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
This is for reading/writing from files, but it's possible I could save/load them in a binary format using the Comments from Reviewable |
src/model.ts, line 25 at r1 (raw file): Previously, nsthorat (Nikhil Thorat) wrote…
Done, although I'm not sure if the way I did it is what you had in mind -- I moved Comments from Reviewable |
src/model.ts, line 20 at r1 (raw file): Previously, asross (Andrew Ross) wrote…
(where it := Comments from Reviewable |
src/server/fetch_polyfill.ts, line 5 at r1 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
There's almost certainly a better way, but the reason I did it this way was:
Let me know if you have a simpler solution for any part of that chain of problems... Comments from Reviewable |
src/comm_test.ts, line 116 at r4 (raw file):
Question / possible issue -- Let's imagine a user labels one example, and the client code, fearing the user may disconnect imminently, decides to compute a weight update (following our fit parameters) and send it up to the server with However, the user doesn't disconnect, and instead goes on to label three more examples. The client has at least two options:
(I think in either case, it has to revert to the initial weights before retraining, to ensure that different clients send weight updates generated under consistent conditions.) I've assumed that we're working under option 1, since it's a bit simpler from the server-side and also doesn't require that the client remember any training data after it's been used in SGD. However, what it will probably amount to in practice is that all of the updates will be generated using individual examples. This might not be the end of the world -- the experiment from before shows that, on MNIST, learning from ~30 independent updates with 1 example is only slightly worse than learning from ~10 independent updates with 3 examples. But maybe it's better to have client updates still supersede each other. Let me know what you all think. Comments from Reviewable |
Review status: complete! 1 of 1 LGTMs obtained a discussion (no related file): Comments from Reviewable |
Really nice work! Left some tiny comments. Feel free to submit after! Reviewed 6 of 9 files at r3, 7 of 7 files at r4. demo/mnist_transfer_learning_model.ts, line 20 at r4 (raw file):
To emulate better the usage of the API, introduce a src/comm_test.ts, line 20 at r4 (raw file):
remove the 2 commented out lines src/comm_test.ts, line 116 at r4 (raw file): Previously, asross (Andrew Ross) wrote…
Great observation. Can you file an issue on the repo with this discussion and we can revisit it later? For now #1 sounds good. (we can come back to this if need arises based on the experiments) src/comm_test.ts, line 122 at r4 (raw file):
check out jasmine's spyOn to verify that a callback was called. https://jasmine.github.io/2.0/introduction.html src/server/fetch_polyfill.ts, line 5 at r1 (raw file): Previously, asross (Andrew Ross) wrote…
This is ok for now. Shanqing is working on making node work with https, at which you this hack will go away. Comments from Reviewable |
src/comm_test.ts, line 122 at r4 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Unfortunately, I don't think Comments from Reviewable |
src/comm_test.ts, line 116 at r4 (raw file): Previously, dsmilkov (Daniel Smilkov) wrote…
Yep, done: #7 Comments from Reviewable |
This implements a server websocket API that supports two basic message types: sending a
DownloadMsg
and receiving anUploadMsg
.The serialization/deserialization logic is in flux and will definitely be streamlined, but this should work as a first pass. The main TODO is to get this working with the client at the API level.
This change is