Another benefit is that the Task2Vec method does not rely on activations from an arbitrarily selected layer of a network.
Lastly, note that activations may be unreliable for embedding datasets/tasks, because large distances between datasets/tasks may reflect well-separated decision boundaries rather than intrinsic semantic properties of the dataset/task.
In contrast, the diversity coefficient is well justified and has been extensively tested in our work and in previous work; e.g., it correlates with ground-truth diversities, and the underlying embeddings cluster according to semantics, taxonomy, etc. (see Section~\ref{appendix:ground_truth_div} and \cite{task2vec, curse_low_div}).
In short, FIM-based representations are motivated by information theory (e.g., the FIM defines a metric on the space of probability distributions) and have been extensively tested by independent sources \citep{curse_low_div, task2vec, nlp_task2vec}.
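For concreteness, the quantities involved can be sketched as follows, in notation we assume here for illustration: $p_w$ is a fixed probe network with weights $w$, $B$ is a batch of inputs $x$, and $d$ is the cosine distance. Following the Task2Vec construction \citep{task2vec}, the embedding of a batch is the diagonal of the FIM of the probe network,
\[
\hat{F}_B = \mathrm{E}_{x \sim B,\; y \sim p_w(\cdot \mid x)}\left[ \nabla_w \log p_w(y \mid x)\, \nabla_w \log p_w(y \mid x)^{\top} \right],
\qquad
\vec{f}_B = \mathrm{diag}(\hat{F}_B),
\]
and the diversity coefficient of a dataset $D$ is the expected distance between the embeddings of two batches sampled from it,
\[
\widehat{\mathrm{div}}(D) = \mathrm{E}_{B_1, B_2 \sim D}\left[ d(\vec{f}_{B_1}, \vec{f}_{B_2}) \right].
\]
Under this sketch, $\vec{f}_B$ is indexed by the weights of the probe network rather than by the activations of any single layer.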
% main argument against activations: distances can become very large, essentially arbitrarily, due to decision boundaries rather than intrinsic data properties, e.g. models might try to maximize distance (SVMs)
I was curious whether there are experiments that justify using the diagonal of the FIM rather than activations.