String.replace
JVM implementations comparison based on benchmarks
4 implementations of String.replace
were compared:
- 'manual': a manual implementation using
indexOf
andStringBuilder.append
- 'platform': an implementation delegating to
java.lang.String.replace
- 'regex': an implementation based on a literal regex pattern and an escaped literal replacement
- 'regex1': same as above, but using a manual replacement loop
- 'stdlib': the current implementation in the kotlin-stdlib 1.4.10, basically
split(oldValue) + join(newValue)
Varying parameters:
- JVM: 1.8.0_191, 11.0.1
ignoreCase
: false for case-sensitive search, true for case-insensitiveneedle
: the string being searched in a test string- "unique", a string that can be easily found just by its first letter
- ">>back", a string that shares its first two chars with a lot of other substrings in the test string
occurrences
: the number of searched substrings actually found in a test stringtotalLength
: the length of a test string
-
fixed
totalLength=5000
,occurrences=10
; varyingignoreCase
andneedle
-
fixed
ignoreCase=false
,needle=>>back
, varyingtotalLength
(100, 100K) andoccurrences
(0, 1, 10)log scale recommended
-
fixed
ignoreCase=true
,needle=>>back
, varyingtotalLength
(100, 100K) andoccurrences
(0, 1, 10)log scale recommended
With JDK 8:
- 'platform' performs the same as 'regex'
- 'regex' outperforms 'manual' (~x3) on long strings when the string being searched requires a lot of backtracking
- 'regex' takes time to compile the regular expression, thus is slower than 'manual' on short strings
With JDK 11:
- String operations are generally faster than with JDK 8
- 'platform' is roughly the same as 'manual'
- 'regex' is generally slower than 'manual'
- 'stdlib' (split+join) is faster! (~x1.14) than 'platform' on long strings
- no platform method available for case-insensitive replacement
With JDK 8:
- 'regex' outperforms 'manual' (~x1.5-2) even on short strings (except one case: many occurrences in a short string, x2.5 slower)
With JDK 11:
- 'regex' outperforms 'manual' (~x1.1-2) except in the case of many occurrences in a long string (x1.2 slower) and in a short string (x2.5 slower)