
why are task2vec embeddings using diagonal of FIM better than using raw activations? #18

Open
brando90 opened this issue May 31, 2023 · 2 comments

Comments

@brando90
I was curious whether there are experiments justifying the use of the diagonal of the FIM over raw activations.

@brando90

brando90 commented Jun 3, 2023

thoughts?

One benefit is that the Task2Vec method does not rely on activations from an arbitrarily selected layer of a network.
Note also that activations may be unreliable for embedding datasets/tasks: large distances between datasets/tasks may reflect well-separated decision boundaries rather than intrinsic semantic properties of the data.
In contrast, the diversity coefficient is well justified and has been extensively tested in our work and in prior work; e.g., it correlates with ground-truth diversities and clusters according to semantics, taxonomy, etc. (see section \ref{appendix:ground_truth_div} and \cite{task2vec, curse_low_div}).
In short, FIM-based representations are motivated by information theory (e.g., the FIM induces a metric on the space of distributions) and have been tested extensively by independent sources \citep{curse_low_div, task2vec, nlp_task2vec}.
% main argument against activations is that the distance can be very large very randomly due to decision boundaries rather than intrinsic data properties e.g. models might try to maximize distance (SVMs)
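To make the idea concrete, here is a minimal NumPy sketch of a Task2Vec-style embedding: the diagonal of the Fisher Information Matrix, estimated for a toy logistic-regression probe. This is an illustrative assumption, not the authors' implementation — Task2Vec computes the diagonal FIM of a pretrained probe network, and the function names here are made up for this sketch.

```python
import numpy as np

def diag_fim_embedding(X, w, b):
    """Diagonal of the Fisher Information Matrix for a logistic-regression
    probe with parameters (w, b), averaged over the inputs X.

    For the Bernoulli likelihood the expected squared gradient of the
    log-likelihood w.r.t. the logit is E_y[(y - p)^2] = p(1 - p), so the
    diagonal FIM has a closed form per parameter.
    """
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))                # model's P(y=1 | x)
    var = p * (1.0 - p)                         # expected squared logit-gradient
    fim_w = (var[:, None] * X**2).mean(axis=0)  # one entry per weight of w
    fim_b = var.mean()                          # entry for the bias
    return np.concatenate([fim_w, [fim_b]])

def task_distance(emb_a, emb_b):
    """Cosine distance between two diagonal-FIM task embeddings."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return 1.0 - float(a @ b)
```

Note that the distance is computed between the FIM diagonals themselves, not between activations, so it reflects which parameters the task's likelihood is sensitive to rather than how far apart a classifier pushed the representations.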

@brando90

@alexachille
