How to add new language #40
Comments
When I download the dataset with guesslangtools, many repositories no longer exist, and the GitHub server rejects my requests.
Hello @yjmm10
I don't think the current model is well suited to transfer learning. The list of supported languages is embedded in the model graph itself, which means you would have to hack the graph somehow to add information about new languages. Today the only recommended way to add new languages is to build a dataset that includes them with guesslangtools.
Yes, that's expected: the GitHub public repository list that I use was last updated in January 2020 (https://zenodo.org/record/3626071/), so some of the listed repositories no longer exist.
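For anyone writing their own downloader against a stale repository list, here is a minimal sketch of one way to skip missing repositories and back off when GitHub rejects requests. This is not guesslangtools' actual code: `download_all`, the `fetch` callback, and the status groupings are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, List

# Assumed groupings, chosen for illustration only.
SKIP_STATUSES = {404, 410, 451}             # repository deleted, gone, or blocked
RETRY_STATUSES = {403, 429, 500, 502, 503}  # rate limiting or transient errors


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return the URLs fetched successfully, skipping stale entries.

    `fetch` is a hypothetical callback that downloads one URL and
    returns the HTTP status code it received.
    """
    downloaded = []
    for url in urls:
        for attempt in range(max_retries + 1):
            status = fetch(url)
            if status == 200:
                downloaded.append(url)
                break
            if status in SKIP_STATUSES:
                break  # stale entry in the list, move on
            if status in RETRY_STATUSES and attempt < max_retries:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)
                continue
            break  # unknown status or retries exhausted, give up on this URL
    return downloaded
```

Passing `fetch` and `sleep` as parameters keeps the retry logic separate from the network code, so it can be exercised without hitting GitHub.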
Strange... The Guesslangtools main workflow only relies on […]. Can you share the errors that you're getting?
Okay @yjmm10, it looks like you are using an older version of guesslangtools (version < 1.0). You can install the latest version of guesslangtools with the following commands:
# Clone the latest version of the code
git clone https://github.com/yoeo/guesslangtools.git
cd guesslangtools
# Edit the language description file to add the new languages' information
vi data/languages.yaml
# Install the new Guesslangtools on your system
pip install -Ue .
After installing guesslangtools, you can run it to generate the dataset:
# You can change the --nb-xxx parameters to have more or fewer examples in your dataset
gltool /path/to/new/dataset
It will take hours. When it is done, you can train Guesslang:
# Clone Guesslang
git clone https://github.com/yoeo/guesslang.git
cd guesslang
# Install Guesslang in "developer mode"
pip install -Ue .
# Copy the language mapping generated in the dataset (`languages.json`) into the Guesslang repository
cp /path/to/new/dataset/languages.json ./data/languages.json
# Run the training
guesslang --train /path/to/new/dataset/files --steps 10000 --model /path/to/new/model
I'm using Linux command-line syntax here, and I hope it won't be hard to convert these into Windows shell commands.
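Since training takes a long time, it may be worth checking that the generated `languages.json` really contains the new languages before starting. The sketch below assumes the file is a JSON object keyed by language name, which I have not verified against the guesslangtools output. `missing_languages` is a hypothetical helper, not part of either tool.

```python
import json
from pathlib import Path
from typing import Iterable, List


def missing_languages(mapping_path: str, expected: Iterable[str]) -> List[str]:
    """Return the expected language names that are absent from the mapping.

    Assumes the file is a JSON object whose keys are language names.
    """
    with Path(mapping_path).open(encoding="utf-8") as mapping_file:
        mapping = json.load(mapping_file)
    return [name for name in expected if name not in mapping]
```

For example, `missing_languages("/path/to/new/dataset/languages.json", ["MyNewLanguage"])` should return an empty list if the placeholder name `MyNewLanguage` made it into the dataset.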
Hello, if I want to do transfer learning based on your model, can I use the trained model?
I tried to load the trained model, but it had no effect. I hope to get your reply.