100 trillion parameters?
With the recent discussions of real-time frankenmerges, where inference runs multiple passes over the same layers, I was wondering what the limit would be in terms of size. Could this theoretically scale to an effective 100T parameters and beyond? If speed scales linearly with the number of layer passes, I think it would only take around 50 s per token on a 3090. Is this how we achieve ASI?
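For what it's worth, the core idea can be sketched in a few lines. This is a toy illustration with random linear maps standing in for transformer blocks (not a real LLM or any specific frankenmerge implementation): the same layer weights are reused `repeats` times per forward pass, so effective depth (and compute per token) grows linearly with `repeats` while memory stays constant — which is the assumption behind the linear-speed estimate above.

```python
import numpy as np

def run_with_layer_repeats(x, layers, repeats):
    """Toy sketch of a runtime 'frankenmerge': reuse the same layer
    weights multiple times per forward pass. Effective depth is
    len(layers) * repeats; memory footprint does not change."""
    for _ in range(repeats):
        for layer in layers:
            x = layer(x)
    return x

# Hypothetical toy layers (random linear maps + tanh), purely illustrative.
rng = np.random.default_rng(0)
layers = [lambda x, W=rng.normal(size=(8, 8)) / 8: np.tanh(x @ W)
          for _ in range(4)]

x = rng.normal(size=(1, 8))
out = run_with_layer_repeats(x, layers, repeats=3)
print(out.shape)  # activations keep their shape; only depth grows
```

The catch, of course, is that compute per token grows with the number of passes even though memory does not, so latency — not VRAM — becomes the binding constraint at that scale.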