Use dictionary language for Frequency sorting mode detection #1611

Open
Kuuuube opened this issue Nov 26, 2024 · 0 comments
Labels
area/linguistics The issue or PR is related to linguistics area/ui-ux The issue or PR is related to UI/UX/Design kind/enhancement The issue or PR is a new feature or request

Kuuuube commented Nov 26, 2024

Currently, pressing the Auto button for Frequency sorting mode (under the Frequency sorting dictionary setting) always uses Japanese words to detect whether a frequency dictionary is occurrence-based or rank-based, regardless of the dictionary's language. The detection should use terms from the dictionary's language instead.

    async _getFrequencyOrder(dictionary) {
        const moreCommonTerms = ['来る', '言う', '出る', '入る', '方', '男', '女', '今', '何', '時'];
        const lessCommonTerms = ['行なう', '論じる', '過す', '行方', '人口', '猫', '犬', '滝', '理', '暁'];
        const terms = [...moreCommonTerms, ...lessCommonTerms];
        const frequencies = await this._settingsController.application.api.getTermFrequencies(
            terms.map((term) => ({term, reading: null})),
            [dictionary],
        );
        /** @type {Map<string, {hasValue: boolean, minValue: number, maxValue: number}>} */
        const termDetails = new Map();
        const moreCommonTermDetails = [];
        const lessCommonTermDetails = [];
        for (const term of moreCommonTerms) {
            const details = {hasValue: false, minValue: Number.MAX_SAFE_INTEGER, maxValue: Number.MIN_SAFE_INTEGER};
            termDetails.set(term, details);
            moreCommonTermDetails.push(details);
        }
        for (const term of lessCommonTerms) {
            const details = {hasValue: false, minValue: Number.MAX_SAFE_INTEGER, maxValue: Number.MIN_SAFE_INTEGER};
            termDetails.set(term, details);
            lessCommonTermDetails.push(details);
        }
        for (const {term, frequency} of frequencies) {
            const details = termDetails.get(term);
            if (typeof details === 'undefined') { continue; }
            details.minValue = Math.min(details.minValue, frequency);
            details.maxValue = Math.max(details.maxValue, frequency);
            details.hasValue = true;
        }
        let result = 0;
        for (const details1 of moreCommonTermDetails) {
            if (!details1.hasValue) { continue; }
            for (const details2 of lessCommonTermDetails) {
                if (!details2.hasValue) { continue; }
                result += Math.sign(details1.maxValue - details2.minValue) + Math.sign(details1.minValue - details2.maxValue);
            }
        }
        return Math.sign(result);
    }
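
One possible shape for the change, as a rough sketch rather than a proposed patch: keep the comparison logic as it is, but look the test terms up by language instead of hardcoding them. The names `frequencyTestTerms`, `getTestTerms`, and `detectFrequencyOrder` are hypothetical, and so are the language codes used as keys. The `en` word lists are illustrative placeholders, not taken from any corpus. How the dictionary's language is obtained is left out here.

```javascript
// Hypothetical per-language test terms. The 'ja' lists are the ones currently
// hardcoded in _getFrequencyOrder; the 'en' lists are illustrative placeholders.
const frequencyTestTerms = new Map([
    ['ja', {
        moreCommon: ['来る', '言う', '出る', '入る', '方', '男', '女', '今', '何', '時'],
        lessCommon: ['行なう', '論じる', '過す', '行方', '人口', '猫', '犬', '滝', '理', '暁'],
    }],
    ['en', {
        moreCommon: ['the', 'of', 'and', 'to', 'in'],
        lessCommon: ['aardvark', 'quixotic', 'zeitgeist', 'obsequious', 'perspicacious'],
    }],
]);

// Returns the term lists for a language, or null so the caller can decide on a fallback.
function getTestTerms(language) {
    return frequencyTestTerms.get(language) ?? null;
}

// Same comparison as _getFrequencyOrder, but with the term lists passed in.
// `frequencies` is an array of {term, frequency} objects.
// Returns 1 if more common terms tend to have higher values, -1 if they tend
// to have lower values, and 0 if there is not enough data to tell.
function detectFrequencyOrder(frequencies, moreCommonTerms, lessCommonTerms) {
    const termDetails = new Map();
    for (const term of [...moreCommonTerms, ...lessCommonTerms]) {
        termDetails.set(term, {hasValue: false, minValue: Number.MAX_SAFE_INTEGER, maxValue: Number.MIN_SAFE_INTEGER});
    }
    for (const {term, frequency} of frequencies) {
        const details = termDetails.get(term);
        if (typeof details === 'undefined') { continue; }
        details.minValue = Math.min(details.minValue, frequency);
        details.maxValue = Math.max(details.maxValue, frequency);
        details.hasValue = true;
    }
    let result = 0;
    for (const term1 of moreCommonTerms) {
        const details1 = termDetails.get(term1);
        if (!details1.hasValue) { continue; }
        for (const term2 of lessCommonTerms) {
            const details2 = termDetails.get(term2);
            if (!details2.hasValue) { continue; }
            result += Math.sign(details1.maxValue - details2.minValue) + Math.sign(details1.minValue - details2.maxValue);
        }
    }
    return Math.sign(result);
}
```

With this split, `_getFrequencyOrder` would only need to pick the term lists for the dictionary's language, fetch the frequencies, and pass both to the comparison. A language without an entry returns `null` from `getTestTerms`, which leaves room for either falling back to the Japanese lists or reporting that detection is unavailable.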

@Kuuuube Kuuuube added kind/enhancement The issue or PR is a new feature or request area/linguistics The issue or PR is related to linguistics area/ui-ux The issue or PR is related to UI/UX/Design labels Nov 26, 2024