Replies: 11 comments
-
I don't think this is a great fit for seaborn. It's already in pandas (as you note) and also |
-
@mwaskom Coincidentally, I might have an interesting use case for this where it would be beneficial to have an easy way to add additional axes (or at least a second one similar to I want to visualize the result of a grid search on a regression model while tracking two metrics/scores. The catch is that one metric (max_error) is absolute, and the other (MAPE) is a percentage. For me, both metrics are useful because they give me an estimate of both overall performance and worst-case performance. One way I can currently do this is by using a facet over metrics:
(
    so.Plot(grid_result, x="max_depth", y="score")
    .facet(col="metric")
    .add(so.Line(), so.Agg())
    .add(so.Band())
    .share(y=False)
)
This is nice, but a bit hard to read, because I need to go back and forth between figures. With base matplotlib, I can use
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
sns.lineplot(grid_result.query("metric == 'mape'"), x="max_depth", y="score", color="tab:blue", ax=ax1)
sns.lineplot(grid_result.query("metric == 'max_error'"), x="max_depth", y="score", color="tab:red", ax=ax2)
ax1.set_ylabel("mape (blue)")
ax2.set_ylabel("max_error (red)")
It would be nice if we could get this done in seaborn without having to drop down to matplotlib, especially because this would free up the dimensions used by a facet for other variables, e.g., grid search parameters.
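Both snippets above rely on a long-form `grid_result` with `max_depth`, `metric`, and `score` columns. The thread never shows how that frame is built, so the following is only a sketch of one plausible layout; the `wide` frame, the `fold` column, and all numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical wide-form grid-search summary: one row per (max_depth, fold),
# one column per metric. Values are invented for illustration only.
wide = pd.DataFrame({
    "max_depth": [2, 2, 4, 4, 8, 8],
    "fold": [0, 1, 0, 1, 0, 1],
    "mape": [0.31, 0.29, 0.22, 0.24, 0.18, 0.20],
    "max_error": [41.0, 39.5, 30.2, 33.1, 28.7, 35.4],
})

# Melt to long form: one row per (max_depth, fold, metric) with one "score" column.
grid_result = wide.melt(
    id_vars=["max_depth", "fold"],
    var_name="metric",
    value_name="score",
)

# Each metric can then be selected for its own axis, as in the twinx snippet.
mape = grid_result.query("metric == 'mape'")
max_error = grid_result.query("metric == 'max_error'")
```

With this layout, faceting on `metric` and the per-metric `query` calls both operate on the same frame.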
-
I think a |
-
Isn't |
-
I'm having trouble seeing it that way. In a parallel coordinates plot there isn't a separate
(
    sns.load_dataset("iris")
    .rename_axis("example")
    .reset_index()
    .melt(["example", "species"])
    .pipe(so.Plot, x="variable", y="value", color="species")
    .add(so.Lines(alpha=.5), group="example")
)
BTW
This seems to work for me? (Of course it has the same limitations of not playing nicely with faceting, etc., as the function interface)
f, ax1 = plt.subplots()
ax2 = ax1.twinx()
p = so.Plot(healthexp, x="Year", group="Country")
p.add(so.Line(), so.Agg(), y="Spending_USD").on(ax1).plot()
p.add(so.Line(color="r"), so.Agg(), y="Life_Expectancy").on(ax2).plot()
-
In the first plot above, would it be possible to (minmax) normalise the data on the Y-axis? |
-
Right! I have indeed misunderstood the parallel coordinates plot and they are separate things; sorry about that. @mwaskom Should I create a new issue/feature request to track
Cool! Then this was user-error on my side. I didn't call
healthexp = sns.load_dataset("healthexp")
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
(
    so.Plot(healthexp, x="Year", group="Country", y="Spending_USD")
    .add(so.Line(color="tab:blue"), so.Agg())
    .on(ax1)
)
(
    so.Plot(healthexp, x="Year", group="Country", y="Life_Expectancy")
    .add(so.Line(color="tab:red"), so.Agg())
    .on(ax2)
)
@EwoutH Absolutely. Just transform your data before handing it over to the plot :)
import pandas as pd
import seaborn as sns
import seaborn.objects as so

iris: pd.DataFrame = sns.load_dataset("iris")

def normalize(df, columns):
    normalized = df.loc[:, columns].apply(
        # min/max normalization of a column
        lambda data: (data - data.min()) / (data.max() - data.min())
    )
    return df.assign(**{col: normalized[col] for col in normalized})

(
    iris.rename_axis("example")
    .reset_index()
    # pipe passes the whole frame to normalize in one call
    .pipe(
        normalize,
        columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
    )
    .melt(["example", "species"])
    .pipe(so.Plot, x="variable", y="value", color="species")
    .add(so.Lines(alpha=0.5), group="example")
)
-
This isn't good enough tracking for you? :) Line 602 in 021a20f
You don't need to invoke The key thing is explicitly calling |
-
You could also do this with a move transform:
class NormByOrient(so.Move):
    def __call__(self, df, groupby, orient, scales):
        other = {"x": "y", "y": "x"}[orient]
        return df.assign(**{
            other: df.groupby(orient)[other]
            .transform(lambda x: (x - x.min()) / (x.max() - x.min()))
        })

(
    iris
    .rename_axis("example")
    .reset_index()
    .melt(["example", "species"])
    .pipe(so.Plot, x="variable", y="value", color="species", group="example")
    .add(so.Lines(alpha=.5), NormByOrient())
)
I'm 👎 on adding a move transform that does this specifically but open to having it work within a more general operation. The existing But also I suspect that in most cases where you're doing a parallel coordinates plot your data are going to be in "wide form" as that's how you'd hand them to an ML library so the |
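For data that do start out in wide form, one option is to normalize the feature columns before melting. This is a pandas-only sketch, not a seaborn feature; the frame and its values are made up, and `features` is a hypothetical list of the columns to rescale.

```python
import pandas as pd

# Made-up wide-form data: one row per example, one column per feature.
wide = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "sepal_length": [5.0, 6.0, 7.0, 9.0],
    "petal_width": [0.2, 0.4, 1.0, 2.2],
})
features = ["sepal_length", "petal_width"]

# Min-max normalize each feature column independently.
normalized = wide.copy()
normalized[features] = (wide[features] - wide[features].min()) / (
    wide[features].max() - wide[features].min()
)

# Melt to the long form used by the plotting snippets in this thread.
long_form = (
    normalized.rename_axis("example")
    .reset_index()
    .melt(["example", "species"])
)
```

The resulting `long_form` has `example`, `species`, `variable`, and `value` columns, so it can be passed to `so.Plot` the same way as the melted iris frame.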
-
Indeed that's the crux. I actually think the documentation is fine as is; it's just a bit imperceptible because it is part of the detailed explanation of If you are willing to accept a PR for this I can look into that. |
-
Duplication of the information doesn't sound like a great idea but maybe "notes" would be a better section, then again, the numpydoc standard says:
Of course, the docs don't really adhere to that standard religiously... |
-
When visualizing high-dimensional datasets, parallel coordinates plots are sometimes very useful. I would love for Seaborn to have a built-in function to do this!
Resources