Performance Degradation #49

Open
piacenti opened this issue Jun 23, 2020 · 5 comments

Comments


piacenti commented Jun 23, 2020

I just tried a couple of different versions of this library to test things out. I started with version 0.0.4 and then tried one of the latest ones from JitPack (cd716fb1a9). A test I have that parses a 13MB JSON file, which used to run in about 300-400ms on average (over 10 runs), jumped to over a second with the latest version. The grammar is just the JSON grammar from the ANTLR repository, and the input is simply a long file produced by a JSON file generator. I also ran the same file through the regular Java ANTLR target, and its performance lines up with what I get from version 0.0.4.

The degradation is visible on both JVM and JS, and it is generally 2x slower. Running a profiler seems to show we are spending a lot more CPU time in org.antlr.v4.kotlinruntime.StringCharStreamKt.codePointIndices, 40% of all CPU time in fact:

[Profiler screenshot: codePointIndices at ~40% of total CPU time]

A profiler run for version 0.0.4 doesn't show that hotspot:

[Profiler screenshot for version 0.0.4]
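For reference, a minimal JVM-only timing sketch along the lines of that test (JSONLexer/JSONParser are assumed to be the classes generated from the ANTLR JSON grammar, the file path is a placeholder, and the exact CharStreams factory may differ between runtime versions):

```kotlin
import java.io.File
import kotlin.system.measureTimeMillis
import org.antlr.v4.kotlinruntime.CharStreams
import org.antlr.v4.kotlinruntime.CommonTokenStream

fun main() {
    // ~13 MB of generated JSON; the path is a placeholder.
    val input = File("generated.json").readText()

    repeat(10) {
        val elapsed = measureTimeMillis {
            // JSONLexer/JSONParser are generated from the JSON grammar in the ANTLR repository.
            val lexer = JSONLexer(CharStreams.fromString(input))
            val parser = JSONParser(CommonTokenStream(lexer))
            parser.json() // top-level rule of the JSON grammar
        }
        println("Parse $it: $elapsed ms")
    }
}
```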

ftomassetti (Member) commented

Thank you for reporting this! We should analyze and fix this.

lppedd added a commit to lppedd/antlr-kotlin that referenced this issue Dec 6, 2023

lppedd commented Dec 6, 2023

The root cause was mapNotNull creating an ArrayList under the hood to hold the elements.

The ArrayList (obviously) keeps resizing, especially with big input strings.
So we had multiple primitive int array copies going on, plus the final copy with toIntArray.

This should now be solved by simply creating an int array of the max possible size (string.length), and then slicing it a single time at the end to cut it to the right size.
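Roughly, the pattern changed from buffering through mapNotNull to a single pre-allocated array. A simplified sketch of the idea (the names and the exact surrogate handling here are assumptions, not the actual codePointIndices code):

```kotlin
// Before: mapNotNull buffers indices in an ArrayList that keeps resizing,
// then toIntArray() copies everything one more time.
fun codePointIndicesWithMapNotNull(text: String): IntArray =
    text.indices
        .mapNotNull { i -> if (!text[i].isLowSurrogate()) i else null }
        .toIntArray()

// After: pre-allocate an IntArray of the maximum possible size (text.length),
// fill it in a single pass, and slice it a single time at the end.
fun codePointIndicesPreallocated(text: String): IntArray {
    val indices = IntArray(text.length)
    var count = 0
    var i = 0

    while (i < text.length) {
        indices[count++] = i
        // A surrogate pair occupies two chars but is a single code point.
        i += if (text[i].isHighSurrogate() && i + 1 < text.length && text[i + 1].isLowSurrogate()) 2 else 1
    }

    return indices.copyOf(count) // one final copy instead of repeated resizes
}
```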

Let's see how it goes on JS with big inputs. I know the browser (or Node) will limit the size of arrays.


lppedd commented Dec 6, 2023

As an idea, we could swap it with a proper bit set.
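A hypothetical sketch of that idea, marking code point start positions with one bit per char instead of storing every index in an IntArray (java.util.BitSet is JVM-only, so a multiplatform runtime would need its own bit set):

```kotlin
import java.util.BitSet

// Hypothetical: one bit per char position, set where a code point starts.
fun codePointStarts(text: String): BitSet {
    val starts = BitSet(text.length)
    var i = 0
    while (i < text.length) {
        starts.set(i)
        // A surrogate pair is two chars but a single code point.
        i += if (text[i].isHighSurrogate() && i + 1 < text.length && text[i + 1].isLowSurrogate()) 2 else 1
    }
    return starts
}
```

The trade-off is that random access to the n-th code point then needs a rank-style lookup rather than a plain array index.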


lppedd commented Jan 16, 2024

@piacenti do you think this is still valid?


lppedd commented Feb 13, 2024

The latest version contains another small performance boost.
