Using [w = wage] or value(wage) #28

ericmelse · 2024-11-12T14:04:53Z

Dear Asjad,

Thank you for your new version of alluvial. I just replicated your previous example that uses [w = wage]:

alluvial race married collgrad smsa union [w = wage], smooth(8) alpha(60) palette(CET C6) valsize(2)  ///
	laba(0) labs(1.6) boxw(11) gap(2) novalues ///
	showtotal wrapcat(20) wraplab(15) catgap(8) plotregion(margin(b+5 l+10 r+10)) ///
	xsize(2) ysize(1) showmiss labprop percent

and want to compare it with your latest example, which uses value(wage):

alluvial race married collgrad smsa union, value(wage) ///
	smooth(8) alpha(60) palette(CET C6) valsize(2)  ///
	laba(0) labs(1.6) boxw(11) gap(2) novalues ///
	showtotal wrapcat(20) wraplab(15) catgap(8) plotregion(margin(b+5 l+10 r+10)) ///
	xsize(2) ysize(1) showmiss labprop percent

(I do not include the plots, as these are on the main page.)
First, the values that Stata produces using the community-contributed package fre, with and without analytic weights:

. fre race [aw = wage]
race -- Race
-------------------------------------------------------------
                |      Freq.    Percent      Valid       Cum.
----------------+--------------------------------------------
Valid   1 White |   1703.612      75.85      75.85      75.85
        2 Black |   513.7638      22.87      22.87      98.73
        3 Other |   28.62389       1.27       1.27     100.00
        Total   |       2246     100.00     100.00           
-------------------------------------------------------------

. fre race 
race -- Race
-------------------------------------------------------------
                |      Freq.    Percent      Valid       Cum.
----------------+--------------------------------------------
Valid   1 White |       1637      72.89      72.89      72.89
        2 Black |        583      25.96      25.96      98.84
        3 Other |         26       1.16       1.16     100.00
        Total   |       2246     100.00     100.00           
-------------------------------------------------------------

Next, the values that the alluvial code using value(wage) produces for the categorical variable race:

White (75.85%)
Black (22.87%)

and the values that the code using [w = wage] produces for the categorical variable race:

White (72.89%)
Black (25.96%)

You note in the help file (without indicating which type of weight is used): Weights are allowed but use them cautiously.

Now, I am a bit puzzled because using [w = wage] appears not to produce the weighted percentages, whereas using value(wage) does.
Maybe this is intentional, but I fail to grasp why using [w = numvar] does not seem to make a difference for alluvial (yet).
Also, your description of value(numvar), "Define a numerical variable that will be aggregated over the categories for the flows. The default is the count of rows.", makes me wonder what "that will be aggregated over the categories" implies: is it weighting, that is, the weighted count of categorical cases?
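
One way to read "aggregated over the categories" is as a plain sum of the value() variable within each category. The sketch below is a minimal check of that reading, not a description of how alluvial works internally. It assumes the examples are based on Stata's shipped nlsw88 dataset (the thread does not name it), and the variable names wage_total, wage_race, and wage_share are hypothetical.

```stata
* Assumption: the examples use nlsw88 (not stated in the thread)
sysuse nlsw88, clear

* Sum wage over all rows and within each race category
egen double wage_total = total(wage)
egen double wage_race  = total(wage), by(race)

* Share of total wage held by each race category, in percent
generate double wage_share = 100 * wage_race / wage_total

* wage_share is constant within a category, so its mean is that constant
tabstat wage_share, by(race) statistics(mean) format(%9.2f)
```

If these shares match the percentages shown for value(wage), then value(wage) reports each category's share of summed wage, which is numerically what fre race [aw = wage] reports as well.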


asjadnaqvi · 2024-11-12T14:10:55Z

Dear Eric, I would highly recommend just using value() if that is the key variable over which the items need to be summed.

Weighted sums are very likely to give different estimates, since the formulas for weights do more than simple summation. Weights should only be used if the data genuinely has a specific weight type (aw, pw, fw, iw in Stata). I am planning on writing a note on this.

ericmelse · 2024-11-12T14:19:08Z

Hmm, I am most interested in an example that shows where things go wrong!

But what you write does not explain why the alluvial code that uses [w = wage] does not produce weighted values while fre race [aw = wage] does. That is confusing (to me). I mean, Stata's tab produces the same weighted result:

. tab race [aw = wage]

       Race |      Freq.     Percent        Cum.
------------+-----------------------------------
      White | 1,703.6123       75.85       75.85
      Black | 513.763792       22.87       98.73
      Other | 28.6238923        1.27      100.00
------------+-----------------------------------
      Total |      2,246      100.00
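
The total of 2,246 in the weighted tables reflects how Stata treats analytic weights: they are rescaled to sum to the number of observations before being tabulated. The sketch below reproduces that rescaling by hand. It again assumes the nlsw88 dataset, and the names aw_scale and w_scaled are hypothetical.

```stata
* Assumption: the examples use nlsw88 (not stated in the thread)
sysuse nlsw88, clear

* Factor that rescales the weights so that they sum to N
quietly summarize wage
scalar aw_scale = r(N) / r(sum)

* Rescaled weight for each observation
generate double w_scaled = wage * scalar(aw_scale)

* Per-category sums of the rescaled weights, plus the overall total
tabstat w_scaled, by(race) statistics(sum) format(%9.4f)
```

The per-category sums can be compared with the Freq. column of tab race [aw = wage]. Dividing each sum by N gives the weighted percentage, so the rescaling changes the frequencies but not the shares.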
