cjinhuo/text-search-engine

Text Search Engine

A text search engine that supports mixed Chinese and English fuzzy search

Overview

Chinese README

A dynamic programming-based text search engine that supports mixed Chinese and English fuzzy search, returning the highest-weight matching results.

Who uses it?

Online Demo

Check out this online demo if you are interested.

Installation

npm i text-search-engine

Supported Environments

Supports both Node.js and Web environments.

Usage

search

Pure English Search

import { search } from 'text-search-engine'

const source = 'nonode'

search(source, 'no') // [[0, 1]]
// 'nod' matches contiguously at [2, 4]; contiguous characters have a higher weight
search(source, 'nod') // [[2, 4]]
search(source, 'noe') // [[0, 1], [5, 5]]
search(source, 'oo') // [[1, 1], [3, 3]]

search('nonode', 'noe') Match result: [no]nod[e] (matched characters in brackets)
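
The ranges returned by search are inclusive [start, end] index pairs. As a minimal sketch of how to turn them back into visible text, the helper below wraps each range in square brackets. markMatches is a hypothetical name and not part of text-search-engine, and the sketch assumes the indices are ordinary JavaScript string indices.

```javascript
// Hypothetical helper, not part of text-search-engine.
// Wraps every inclusive [start, end] range of `source` in square brackets.
function markMatches(source, ranges) {
  if (!ranges) return source // search returns undefined when nothing matches
  // Sort a copy so that ranges given out of order are still handled.
  const sorted = [...ranges].sort((a, b) => a[0] - b[0])
  let out = ''
  let cursor = 0
  for (const [start, end] of sorted) {
    out += source.slice(cursor, start) + '[' + source.slice(start, end + 1) + ']'
    cursor = end + 1
  }
  return out + source.slice(cursor)
}

console.log(markMatches('nonode', [[0, 1], [5, 5]])) // [no]nod[e]
```

Sorting a copy first keeps the helper usable when the ranges do not arrive in ascending order.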

Pure Chinese Search

import { search } from 'text-search-engine'

const source = '地表最强前端监控平台'

search(source, 'jk') // [[6, 7]]
search(source, 'qianduapt') // [[4, 5],[8, 9]]

search('地表最强前端监控平台', 'qianduapt') Match result: 地表最强[前端]监控[平台] (matched characters in brackets)

Mixed Chinese and English Search

import { search } from 'text-search-engine'

search('Node.js 最强监控平台 V9', 'nodejk') //[[0, 3],[10, 11]]

const source_2 = 'a_nd你你的就是我的'
search(source_2, 'nd') //[[2, 3]]
// Matches '你你的'
search(source_2, 'nnd') //[[4, 6]]
// Matches 'n' at index 2 and '是我的' at indices 8 to 10
search(source_2, 'nshwode') //[[2, 2],[8, 10]]

search('Node.js 最强监控平台 V9', 'nodejk') Match result: [Node].js 最强[监控]平台 V9 (matched characters in brackets)

Space-separated Search

Adding spaces splits the query into independent terms. Each term is matched from the beginning of the source, and characters already matched by an earlier term are excluded, so later terms ignore them. The returned ranges follow the order of the terms in the query.

const source_1 = 'Node.js 最强监控平台 V9'

search(source_1, 'jknode') // undefined
search(source_1, 'jk node') // [[10, 11],[0, 3]]

search('Node.js 最强监控平台 V9', 'jk node') Match result: [Node].js 最强[监控]平台 V9 (matched characters in brackets)
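
The term-by-term behaviour described above can be sketched roughly as follows. This is an illustration only, not the library's code: searchTerms is a hypothetical name, each term is located as an exact, case-insensitive substring instead of a fuzzy match, and pinyin is not handled. What it does show is the two rules from the description: every term starts from the beginning of the source, and characters consumed by an earlier term are masked out.

```javascript
// Simplified illustration of space-separated matching (an assumption, not the library's code).
function searchTerms(source, query) {
  const chars = source.toLowerCase().split('')
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
  const ranges = []
  for (const term of terms) {
    // Every term is searched from index 0 of the (masked) source.
    const start = chars.join('').indexOf(term)
    if (start === -1) return undefined // one missing term fails the whole query
    const end = start + term.length - 1
    ranges.push([start, end])
    // Mask the matched characters so later terms cannot reuse them.
    for (let i = start; i <= end; i++) chars[i] = '\u0000'
  }
  return ranges
}

console.log(searchTerms('Node.js monitor V9', 'v9 node')) // [ [ 16, 17 ], [ 0, 3 ] ]
console.log(searchTerms('Node.js monitor V9', 'v9node')) // undefined
```

The ranges come back in term order, not in source order, which mirrors the 'jk node' example above.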

Sort of Backtracking

const source_1 = 'zxhxo zhx'
search(source_1, 'zh') // [[6, 7]]
// 'zh' at [6, 7] has the higher weight, but no 'o' follows it, so the search falls back to the earlier match
search(source_1, 'zho') // [[0, 0], [2, 2], [4, 4]]

highlightMatches

This API is for quickly checking match highlights. It returns text wrapped in ANSI escape codes; pass the result to console.log in a Web or Node.js environment to see the highlighted text.

import { highlightMatches } from 'text-search-engine'
console.log(highlightMatches('Node.js 最强监控平台 V9', 'nodev9'))

The console will output: Node.js 最强监控平台 V9
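
For readers unfamiliar with ANSI escape codes, here is a minimal sketch of how a set of hit ranges could be wrapped in them. ansiHighlight is a hypothetical helper and the colour (bold yellow) is an arbitrary choice; the codes that highlightMatches actually emits are not documented here. The sketch assumes the ranges are sorted and do not overlap.

```javascript
// Standard SGR sequences: 1 = bold, 33 = yellow foreground, 0 = reset.
const HIGHLIGHT = '\x1b[1;33m'
const RESET = '\x1b[0m'

// Hypothetical helper: wraps each inclusive [start, end] range in ANSI codes.
function ansiHighlight(source, ranges) {
  let out = ''
  let cursor = 0
  for (const [start, end] of ranges) {
    out += source.slice(cursor, start) + HIGHLIGHT + source.slice(start, end + 1) + RESET
    cursor = end + 1
  }
  return out + source.slice(cursor)
}

// In a terminal that supports ANSI colours, 'Node' and 'V9' appear highlighted.
console.log(ansiHighlight('Node.js 最强监控平台 V9', [[0, 3], [15, 16]]))
```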

options

mergeSpaces

Default: false

const source = 'chrome 应用商店'
search(source, 'meyinyon') // [[4, 5], [7, 8]]
// With mergeSpaces enabled, matched ranges separated only by blank spaces are merged into one range
search(source, 'meyinyon', { mergeSpaces: true }) // [[4, 8]]
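
The effect of mergeSpaces can be reproduced on an existing result with a small post-processing step. The helper below is a sketch under assumptions: mergeSpaceSeparatedRanges is a hypothetical name, the ranges are expected in ascending order, and "blank" is taken to mean any whitespace. How the library performs the merge internally is not shown here.

```javascript
// Hypothetical helper: merges neighbouring ranges when only whitespace lies between them.
function mergeSpaceSeparatedRanges(source, ranges) {
  const merged = []
  for (const [start, end] of ranges) {
    const last = merged[merged.length - 1]
    // Text between the previous range and the current one (empty if there is none).
    const gap = last ? source.slice(last[1] + 1, start) : ''
    if (last && gap.length > 0 && gap.trim() === '') {
      last[1] = end // extend the previous range across the whitespace
    } else {
      merged.push([start, end])
    }
  }
  return merged
}

console.log(mergeSpaceSeparatedRanges('chrome 应用商店', [[4, 5], [7, 8]])) // [ [ 4, 8 ] ]
```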

strictnessCoefficient

Default: undefined

const source = 'Node.js 最强监控平台 V8'
search(source, 'nozjk') //[[0, 1], [8, 8], [10, 11]]
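// With strictnessCoefficient 0.5 and the five-character query 'nozjk', Math.ceil(5 * 0.5) equals 3.
// The match above consists of 3 hit ranges, which does not exceed 3, so it is returned as usual.
// (The results suggest the limit applies to the number of hit ranges, not to single characters.)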
search(source, 'nozjk', { strictnessCoefficient: 0.5 }) //[[0, 1], [8, 8], [10, 11]]
search(source, 'nozjk', { strictnessCoefficient: 0.4 }) //undefined
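
The arithmetic behind the two results above can be checked with a short sketch. It assumes that the limit Math.ceil(target.length * strictnessCoefficient) is compared against the number of hit ranges, which is what the example output suggests; applyStrictness is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper that mimics the documented examples (an assumption, not library code).
function applyStrictness(ranges, target, strictnessCoefficient) {
  if (!ranges || strictnessCoefficient === undefined) return ranges
  // Upper bound on how many separate hit ranges are tolerated.
  const maxRanges = Math.ceil(target.length * strictnessCoefficient)
  return ranges.length <= maxRanges ? ranges : undefined
}

const hit = [[0, 1], [8, 8], [10, 11]] // result of search(source, 'nozjk') above

console.log(applyStrictness(hit, 'nozjk', 0.5)) // limit 3, three ranges: returned
console.log(applyStrictness(hit, 'nozjk', 0.4)) // limit 2, three ranges: undefined
```

A lower coefficient therefore rejects matches that are scattered across many small fragments.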

React Component

Take a look at the CodeSandbox online demo.

HighlightWithTarget

import { HighlightWithTarget } from 'text-search-engine/react'

function Test() {
    return <HighlightWithTarget source='Node.js 最强监控平台 V9' target='nodejk' />
}

HighlightWithRanges

import { HighlightWithRanges } from 'text-search-engine/react'
import { search } from 'text-search-engine'

export default function DemoForHighlightWithRanges() {
	const ranges = search('Node.js 最强监控平台 V9', 'nodejk')
	return <HighlightWithRanges source='Node.js 最强监控平台 V9' hitRanges={ranges} />
}

Performance

         Time Complexity             Space Complexity
Best     O(M(source))                O(M(source))
Worst    O(M(source) * N(target))    O(M(source) * N(target))
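
The worst-case O(M * N) bound is typical of a dynamic programming table with one cell per (source index, target index) pair. The sketch below illustrates that idea with a deliberately simplified matcher; it is not the library's implementation. The name weightedMatch, the scoring rule (one point for every pair of adjacent matched characters) and the tie-break (the earliest match wins) are assumptions made for illustration, and pinyin matching is left out entirely.

```javascript
// Simplified weighted subsequence matcher (illustrative assumption, not library code).
function weightedMatch(source, target) {
  const s = source.toLowerCase()
  const t = target.toLowerCase()
  const M = s.length
  const N = t.length
  if (N === 0 || N > M) return undefined

  const NONE = -1 // marks "no match possible"; real scores are >= 0
  const table = (fill) => Array.from({ length: M }, () => new Array(N).fill(fill))
  const end = table(NONE) // end[i][j]: best score with t[j] matched exactly at s[i]
  const best = table(NONE) // best[i][j]: best score for t[0..j] within s[0..i]
  const bestIdx = table(NONE) // index of t[j] in the match behind best[i][j]
  const contiguous = table(false) // whether end[i][j] extends a match ending at i - 1

  for (let i = 0; i < M; i++) {
    for (let j = 0; j < N; j++) {
      if (s[i] === t[j]) {
        if (j === 0) {
          end[i][j] = 0
        } else if (i > 0) {
          const prevEnd = end[i - 1][j - 1]
          const extend = prevEnd === NONE ? NONE : prevEnd + 1 // adjacency bonus
          const jump = best[i - 1][j - 1] // previous character matched anywhere earlier
          if (extend !== NONE && extend >= jump) {
            end[i][j] = extend
            contiguous[i][j] = true
          } else {
            end[i][j] = jump
          }
        }
      }
      // Keep the earlier match on ties by requiring a strictly better score.
      const prevBest = i > 0 ? best[i - 1][j] : NONE
      if (end[i][j] > prevBest) {
        best[i][j] = end[i][j]
        bestIdx[i][j] = i
      } else {
        best[i][j] = prevBest
        bestIdx[i][j] = i > 0 ? bestIdx[i - 1][j] : NONE
      }
    }
  }
  if (best[M - 1][N - 1] === NONE) return undefined

  // Walk backwards through the tables to recover the matched positions.
  const positions = new Array(N)
  let i = bestIdx[M - 1][N - 1]
  for (let j = N - 1; j >= 0; j--) {
    positions[j] = i
    if (j > 0) i = contiguous[i][j] ? i - 1 : bestIdx[i - 1][j - 1]
  }

  // Collapse consecutive positions into inclusive [start, end] ranges.
  const ranges = []
  for (const p of positions) {
    const last = ranges[ranges.length - 1]
    if (last && last[1] + 1 === p) last[1] = p
    else ranges.push([p, p])
  }
  return ranges
}

console.log(weightedMatch('nonode', 'nod')) // [ [ 2, 4 ] ]
console.log(weightedMatch('nonode', 'noe')) // [ [ 0, 1 ], [ 5, 5 ] ]
```

Each of the four tables holds M * N cells and every cell is filled in constant time, which gives the O(M * N) time and space of the worst case.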

Contributing

Please see the contributing guidelines to learn more.

A big thanks to all of our amazing contributors ❤️

Feel free to join the fun and send a PR!

📞 Contact

Feel free to open an issue. If you have suggestions, you can also contact me on WeChat or by email (please mention text-search-engine in your note).