trouble with big dataframe #7

Open
auderson opened this issue Nov 8, 2019 · 1 comment

auderson commented Nov 8, 2019

I'm trying to visualize a relatively big dataset (634785 x 282) with your tool, but the plot just won't show up.
In the task manager I can see that the Python console is running, but even after a long while (when Python is no longer using any CPU), the plot still doesn't show, and neither does the toolbar.
Plotting my dataset with matplotlib is pretty fast, but it's not convenient for switching between columns. So I guess there's an efficiency issue here.

bluenote10 (Owner) commented

Hm, rendering almost a million points in JS is tough, but it works for me:

import pandas as pd
import numpy as np

import tabloo

# One million rows of uniformly distributed points.
N = 1000000
df = pd.DataFrame({
    "id": np.arange(N),
    "xs": np.random.uniform(-1, +1, N),
    "ys": np.random.uniform(-1, +1, N),
})
# Inject NaN and +/-inf values at regular strides.
df.loc[::10, "xs"] = np.nan
df.loc[::20, "ys"] = np.nan
df.loc[::47, "xs"] = +np.inf
df.loc[::83, "xs"] = -np.inf
#df["Column with much too long name"] = 0

tabloo.show(df, open_browser=False, debug=True, server_logging=True)

[Screen recording: Peek 2019-11-08 11-59]

The request in the backend takes a few seconds, then the frontend starts drawing the markers left-to-right. Is there anything unusual about your data?
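
To check whether there is anything unusual about the data, a quick summary like the one below might help. This is only a rough sketch using plain pandas and NumPy; `summarize_frame` is a made-up helper name, not part of tabloo, and the small frame is just stand-in data.

```python
import numpy as np
import pandas as pd


def summarize_frame(df: pd.DataFrame) -> dict:
    """Collect a few facts that may explain slow or failing rendering.

    Hypothetical helper for illustration; not part of tabloo.
    """
    # Only numeric columns can hold NaN/inf in the sense checked here.
    numeric = df.select_dtypes(include=[np.number])
    values = numeric.to_numpy(dtype=float, na_value=np.nan)
    return {
        "shape": df.shape,
        "dtype_counts": df.dtypes.astype(str).value_counts().to_dict(),
        "nan_count": int(np.isnan(values).sum()),
        "inf_count": int(np.isinf(values).sum()),
        "memory_mb": float(df.memory_usage(deep=True).sum()) / 1e6,
    }


# Stand-in data; replace with the real DataFrame.
example = pd.DataFrame({
    "a": [1.0, np.nan, np.inf],
    "b": [1, 2, 3],
    "c": ["x", "y", "z"],
})
report = summarize_frame(example)
for key, value in report.items():
    print(f"{key}: {value}")
```

Many object (string) columns or a very large memory footprint would be the first things to look at, since the example above only uses three numeric columns.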

But like I said, browser-based plotting libraries are not made for such a large number of points (and matplotlib also becomes super slow for me when plotting more than 100k points). You could enter a filtering expression in the table view and only switch to the plot once a strong filter is applied. From there you can modify the filter to explore your data further, without drawing the entire dataset.
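
An alternative is to shrink the frame in pandas before handing it to the tool. The sketch below is an assumption-laden illustration: `reduce_for_plot` is a hypothetical helper, the filter uses pandas' own `DataFrame.query` (which may or may not match the filter syntax of the table view), and the 10k cap is an arbitrary choice.

```python
import numpy as np
import pandas as pd


def reduce_for_plot(df, x, y, query=None, max_points=100_000, seed=0):
    """Return a smaller frame holding only the two columns to be plotted.

    Hypothetical helper for illustration; not part of tabloo.
    """
    subset = df
    if query is not None:
        # Row filter via pandas' query syntax, applied before plotting.
        subset = subset.query(query)
    # Keep only the columns needed for this one scatter plot.
    subset = subset[[x, y]]
    if len(subset) > max_points:
        # Random downsample so the browser has fewer markers to draw.
        subset = subset.sample(n=max_points, random_state=seed)
    return subset


# Stand-in data shaped like the example above.
rng = np.random.default_rng(0)
N = 200_000
df = pd.DataFrame({
    "id": np.arange(N),
    "xs": rng.uniform(-1, +1, N),
    "ys": rng.uniform(-1, +1, N),
})

small = reduce_for_plot(df, "xs", "ys", query="xs > 0.5", max_points=10_000)
print(small.shape)
# The reduced frame could then be passed on, e.g. tabloo.show(small)
# (not executed here).
```

With 282 columns, selecting just the pair being plotted also cuts down the amount of data that has to be moved around, at the cost of having to rerun the helper when switching columns.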
