A tool that makes use of sigpy to scrape the ERASMUS page at FEUP into JSON files, for time analysis and other fun ideas people usually have.
- `python extract.py [USERNAME|201403027] [YEAR|2018]` scrapes all the currently hardcoded allocations (the IDs are unpredictable, so they should be updated annually) into `archive/COURSE/YEAR/yyyy_mm_dd_hh_mm.json`. Run it daily (or at the rate the system updates). The `archive` folder is gitignored, so it only persists in your local clone.
- `python anonymize.py` creates a duplicate database in `anonymous/COURSE/YEAR/yyyy_mm_dd_hh_mm.json`, replacing each student with a funny, yet consistent, anonymous animal. This is needed because the identity of students and their GPA are not public information.
- The Jupyter notebook discover previous years can be used to brute-force URL IDs and find valid ones, so that past allocations can be found (I already did this for MIEIC up to 2019).
- Additionally, @antonioalmeida has created a Google Sheets spreadsheet, reusable in future years, that allows real-time updates if all students specify their preferences. The sheet can be copied from here.
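
The "consistent" part of the anonymization (the same student maps to the same animal in every snapshot) can be achieved with a keyed hash. What follows is only a minimal sketch, not the code in `anonymize.py`: the word lists, the `SECRET` key, and the `pseudonym` function are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical word lists; the real script may use entirely different names.
ADJECTIVES = ["brave", "sleepy", "curious", "gentle"]
ANIMALS = ["otter", "lynx", "heron", "badger", "gecko"]

# Assumed secret key, kept out of version control so that pseudonyms are
# harder to reverse by hashing known student numbers.
SECRET = b"change-me"


def pseudonym(student_id: str, secret: bytes = SECRET) -> str:
    """Map a student id to a stable 'adjective animal' pseudonym."""
    digest = hmac.new(secret, student_id.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big")
    # Use different parts of the number to pick each word.
    adjective = ADJECTIVES[number % len(ADJECTIVES)]
    animal = ANIMALS[(number // len(ADJECTIVES)) % len(ANIMALS)]
    return f"{adjective} {animal}"
```

With lists this short, two students can easily end up with the same pseudonym, so a real mapping would need longer lists or an extra disambiguating suffix.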
- 12/13 - 1000861
- 13/14 - 1002664
- 14/15 - 1004742
- 15/16 - 1006885
- 16/17 - 1008349
- 17/18 - 1010349
- 18/19 - 1012045
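
The brute-force search for a new year's ID can be narrowed using the IDs listed above. The sketch below assumes that next year's ID grows by roughly the same amount as in previous years, which is a guess and not a guarantee. `search_window`, `find_valid`, and the `is_valid` probe are hypothetical names; the probe would wrap whatever request the notebook uses to check a URL.

```python
from typing import Callable, Iterable, List

# IDs copied from the list above, keyed by academic year.
KNOWN_IDS = {
    "12/13": 1000861,
    "13/14": 1002664,
    "14/15": 1004742,
    "15/16": 1006885,
    "16/17": 1008349,
    "17/18": 1010349,
    "18/19": 1012045,
}


def search_window(known: Iterable[int], margin: int = 500) -> range:
    """Candidate IDs for the year after the latest known one.

    The window spans the smallest to the largest observed year-to-year gap,
    widened by `margin` on both sides.
    """
    ids = sorted(known)
    gaps = [later - earlier for earlier, later in zip(ids, ids[1:])]
    start = ids[-1] + max(1, min(gaps) - margin)
    stop = ids[-1] + max(gaps) + margin
    return range(start, stop + 1)


def find_valid(candidates: Iterable[int], is_valid: Callable[[int], bool]) -> List[int]:
    """Return every candidate accepted by the probe function."""
    return [candidate for candidate in candidates if is_valid(candidate)]
```

Calling `find_valid(search_window(KNOWN_IDS.values()), is_valid)` would then probe 1680 candidates instead of an open-ended range.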
This is probably a stationary repo, as far as my dedication goes, but...
Here are some ideas for people that might want to improve it:
- Extend to other faculties (it may even work just by changing the URL)
- Run the scraping with a cron job in future years and open a PR
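
For the cron idea, each scheduled run only needs a fresh output path in the `COURSE/YEAR/yyyy_mm_dd_hh_mm.json` layout described earlier. A minimal sketch, assuming a hypothetical `snapshot_path` helper that is not taken from `extract.py`:

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def snapshot_path(course: str, year: int, root: str = "archive",
                  now: Optional[datetime] = None) -> Path:
    """Build ROOT/COURSE/YEAR/yyyy_mm_dd_hh_mm.json for one scrape."""
    # `now` is injectable so the function can be tested with a fixed time.
    moment = now or datetime.now()
    return Path(root) / course / str(year) / moment.strftime("%Y_%m_%d_%H_%M.json")
```

Because the file name includes the minute, runs scheduled at least a minute apart never overwrite each other.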