(I do not include the plots here, as these are on the main page.)
First, the result values that Stata produces using the community-contributed package fre, with and without analytic weights:
Next, the result values that the code for the alluvial plot that uses value(wage) produces for the categorical variable race:
White (75.85%)
Black (22.87%)
and the result values that the code that uses [w = wage] produces for the categorical variable race:
White (72.89%)
Black (25.96%)
You note in the help file (without indicating which type of weight is used): Weights are allowed but use them cautiously.
Now, I am a bit puzzled because using [w = wage] appears not to produce the weighted percentages, whereas using value(wage) does.
Maybe this is intentional, but, I fail to grasp why using [w = numvar] does not seem to make a difference for alluvial (yet).
Also, your description of the functional use of value(numvar): Define a numerical variable that will be aggregated over the categories for the flows. The default is the count of rows.
makes me wonder what 'that will be aggregated over the categories' implies: is it weighting - the weighted count of categorical cases?
Dear Eric, I would highly recommend just using value() if that is the key variable over which the items need to be summed.
Weighted sums are very likely to give different estimates, since the formulas for weights do more than simple summations. The weights should only be used if the data genuinely has a specific weight type (aw, pw, fw, iw in Stata). I am planning on writing a note on this.
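The difference between counting rows and summing a variable can be checked by hand. The following is a minimal sketch, not the package's internal code. It assumes the example data are Stata's bundled nlsw88 dataset (the thread does not name the dataset; race and wage are the variable names used in this thread) and that value(wage) amounts to a per-category sum of wage, as the help text's "aggregated over the categories" suggests. The names wage_sum, n, wage_total, n_total, pct_rows and pct_value are made up for illustration.

```stata
* Assumption: the thread's example uses the bundled nlsw88 data
sysuse nlsw88, clear

* One row per race category: sum of wage and number of rows with a wage
collapse (sum) wage_sum = wage (count) n = wage, by(race)

* Grand totals across the categories
egen double wage_total = total(wage_sum)
egen double n_total    = total(n)

* Share of rows (plain count) versus share of summed wage
gen double pct_rows  = 100 * n / n_total
gen double pct_value = 100 * wage_sum / wage_total

list race pct_rows pct_value, noobs
```

If these assumptions hold, pct_rows is the plain row share per category and pct_value is the share of total wage per category, which can then be compared with the two sets of percentages reported in the first post.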
Hmm, I am most interested in an example that shows where things go wrong!
But, what you write does not explain why the alluvial code that uses [w = wage] does not produce weighted result values while fre race [aw = wage] does. That is confusing (to me). I mean, Stata's tab produces the same weighted result:
. tab race [aw = wage]

       Race |      Freq.     Percent        Cum.
------------+-----------------------------------
      White | 1,703.6123       75.85       75.85
      Black | 513.763792       22.87       98.73
      Other | 28.6238923        1.27      100.00
------------+-----------------------------------
      Total |      2,246      100.00
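A short derivation may help to see why tab with [aw = wage] reports these percentages. Stata rescales analytic weights so that they sum to the number of observations, which is why the Total row still shows 2,246. The notation is an assumption for illustration only: $w_i$ is the wage of observation $i$, $N$ the number of observations, and $c$ a category of race.

```latex
\tilde{w}_i = w_i \,\frac{N}{\sum_{j=1}^{N} w_j},
\qquad
\mathrm{Freq}_c = \sum_{i \in c} \tilde{w}_i,
\qquad
\mathrm{Percent}_c
  = 100 \,\frac{\mathrm{Freq}_c}{N}
  = 100 \,\frac{\sum_{i \in c} w_i}{\sum_{j=1}^{N} w_j}
```

So the aw percentage of a category is that category's share of the summed wage. For White, 1,703.6123 / 2,246 is about 0.7585, the 75.85 in the Percent column.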
Dear Asjad,
Thank you for your new version of alluvial. I just replicate your previous example that uses [w = wage] and (want to) compare that with your last example that uses value(wage).