This repository contains the model code from my first Kaggle competition, a music recommendation challenge. The task is described as follows:
The main objective of the project is to predict the chances of a user listening to a song repetitively after the first observable listening event within a time window was triggered. If there are recurring listening event(s) triggered within a month after the user’s very first observable listening event, its target is marked 1, and 0 otherwise in the training set. The same rule applies to the testing set.
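The labelling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event schema `(user_id, song_id, timestamp)`, the function name `label_targets`, and the reading of "a month" as 30 days are all hypothetical and are not taken from the competition data or from this repository.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Hypothetical event schema: (user_id, song_id, timestamp).
Event = Tuple[str, str, datetime]


def label_targets(
    events: List[Event],
    window: timedelta = timedelta(days=30),  # assumption: "a month" = 30 days
) -> Dict[Tuple[str, str], int]:
    """Return 1 for each (user, song) pair that has a repeat listen
    within `window` of its first observed listen, and 0 otherwise."""
    by_pair: Dict[Tuple[str, str], List[datetime]] = {}
    for user, song, ts in events:
        by_pair.setdefault((user, song), []).append(ts)

    targets: Dict[Tuple[str, str], int] = {}
    for pair, times in by_pair.items():
        times.sort()
        first = times[0]
        # Any later event that falls inside the window counts as a repeat.
        repeated = any(t - first <= window for t in times[1:])
        targets[pair] = int(repeated)
    return targets


# Made-up events, for illustration only.
events = [
    ("u1", "s1", datetime(2017, 1, 1)),
    ("u1", "s1", datetime(2017, 1, 10)),  # repeat within the window
    ("u1", "s2", datetime(2017, 1, 1)),
    ("u1", "s2", datetime(2017, 3, 1)),   # repeat, but outside the window
    ("u2", "s1", datetime(2017, 1, 5)),   # single listen
]
targets = label_targets(events)
```

In the competition the labels are already provided in the training set, so a function like this is useful only for understanding what the target means.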
When the competition ended, my team ranked 62nd on the Public Leaderboard and 59th on the Private Leaderboard (out of 1,172 participating teams).
- Python 3.x
- NumPy
- CatBoost
- XGBoost
- LightGBM
- GBDT
- Libffm
- g++ (with C++11 and OpenMP support)
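The data comes from the Kaggle competition dataset, which covers music recommendation.
Exploratory Data Analysis and Feature Engineering are not covered here. If you are interested in how to process the data systematically, see the post "Music Recommendation Challenge" on my personal blog.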
This model is the Field-aware Factorization Machine (FFM). To use it, please download LIBFFM first.
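LIBFFM reads plain-text input in which each line has the form `label field:feature:value ...`, with integer field and feature indices. The sketch below shows one way to encode categorical rows in that format. The helper name `build_ffm_lines` and the column names `user` and `song` are hypothetical, chosen for illustration; this is not the preprocessing code used in this repository.

```python
from typing import Dict, List, Tuple


def build_ffm_lines(
    rows: List[Dict[str, str]],
    labels: List[int],
    fields: List[str],
) -> Tuple[List[str], Dict[Tuple[str, str], int]]:
    """Encode categorical rows as LIBFFM text lines (label field:feature:value)."""
    feature_index: Dict[Tuple[str, str], int] = {}
    lines: List[str] = []
    for row, label in zip(rows, labels):
        parts = [str(label)]
        for field_id, field in enumerate(fields):
            key = (field, row[field])
            # Assign a new integer id the first time a (field, value) pair is seen.
            feature_id = feature_index.setdefault(key, len(feature_index))
            # Categorical features are one-hot, so the value is always 1.
            parts.append(f"{field_id}:{feature_id}:1")
        lines.append(" ".join(parts))
    return lines, feature_index


# Made-up rows, for illustration only.
rows = [
    {"user": "u1", "song": "s1"},
    {"user": "u2", "song": "s1"},
]
lines, index = build_ffm_lines(rows, labels=[1, 0], fields=["user", "song"])
for line in lines:
    print(line)
```

Lines written to a file in this form can then be passed to LIBFFM's `ffm-train` and `ffm-predict` command-line tools. The same `feature_index` mapping should be reused when encoding the test set so that feature ids stay consistent.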
Huang Wei (黄威), Randolph
SCU SE Bachelor; USTC CS Ph.D.
Email: chinawolfman@hotmail.com
My Blog: randolph.pro
LinkedIn: randolph's linkedin