Add Solidity (Ethereum smart contract language) #50
Comments
Hello @vbersier, indeed, it would be a good idea to add Solidity, as Ethereum, smart contracts, and NFTs are everywhere lately. I propose that we wait for the number of Solidity projects on GitHub to grow before adding this language.
Hi @yoeo, there are thousands of source code examples available from etherscan.io and bscscan.com. I wonder if it would be possible to scrape them through their API? https://docs.etherscan.io/api-endpoints/contracts#get-contract-source-code-for-verified-contract-source-codes In total, there are already 13k files with an open source license. GitHub search seems glitchy, as the number of code results changes with every refresh, from 700 to 63k. Finally, I think this particular language will require a smaller training set than most other languages because, as I explained, its reserved keywords are very distinctive.
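For illustration, fetching one verified contract through the endpoint linked above might look roughly like the sketch below. The base URL, the `module`/`action`/`address`/`apikey` query parameters, and the `status`/`result`/`SourceCode` response fields are assumptions based on that documentation page and may have changed since; the helper names are hypothetical and are not part of guesslangtools.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL; check the current Etherscan docs before relying on it.
API_URL = "https://api.etherscan.io/api"


def build_source_url(address: str, api_key: str) -> str:
    """Build a 'getsourcecode' query URL for one verified contract address."""
    params = {
        "module": "contract",
        "action": "getsourcecode",
        "address": address,
        "apikey": api_key,
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_sources(payload: dict) -> list:
    """Return the non-empty SourceCode strings from a decoded API response."""
    # Assumption: a successful response carries status "1" and a list of results.
    if payload.get("status") != "1":
        return []
    results = payload.get("result")
    if not isinstance(results, list):
        return []
    return [
        item["SourceCode"]
        for item in results
        if isinstance(item, dict) and item.get("SourceCode")
    ]


def fetch_sources(address: str, api_key: str) -> list:
    """Download and parse the source code of one contract (performs a network call)."""
    with urlopen(build_source_url(address, api_key), timeout=10) as response:
        return extract_sources(json.load(response))
```

A real scraper would also need a list of contract addresses to iterate over, a license filter, and rate limiting, none of which are shown here.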
You're right, now I see that the GitHub search results are not stable for Solidity.
Currently, the dataset is generated by this script: https://github.com/yoeo/guesslangtools/
I would love to see Solidity added as a language, as it usually gets detected as JavaScript, Dart, Lua or other languages.
That language has a set of reserved keywords that are very different from other languages and should enable the training to perform extremely well on it.
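To make the point about distinctive keywords concrete, here is a rough sketch that counts a few Solidity-specific tokens in a source string. It is not how guesslang classifies code (guesslang trains a model on a dataset); the marker list and the function name are hypothetical and chosen only for illustration.

```python
import re

# A handful of constructs that are common in Solidity and rare elsewhere.
# Illustrative only, not an exhaustive or authoritative keyword list.
SOLIDITY_MARKERS = [
    r"\bpragma\s+solidity\b",
    r"\bcontract\s+\w+",
    r"\bmapping\s*\(",
    r"\bmsg\.sender\b",
    r"\bpayable\b",
    r"\buint256\b",
    r"\bmodifier\s+\w+",
    r"\bemit\s+\w+",
]


def solidity_marker_count(source: str) -> int:
    """Count how many distinct Solidity markers appear at least once in the source."""
    return sum(1 for pattern in SOLIDITY_MARKERS if re.search(pattern, source))


solidity_sample = """
pragma solidity ^0.8.0;
contract Token {
    mapping(address => uint256) balances;
    function deposit() public payable { balances[msg.sender] += msg.value; }
}
"""
javascript_sample = "function add(a, b) { return a + b; }"

print(solidity_marker_count(solidity_sample))    # 6
print(solidity_marker_count(javascript_sample))  # 0
```

Even this crude count separates the two samples, which is consistent with the expectation that a trained model could pick up on the same tokens.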