- Change all `os.Getenv` to `os.LookupEnv`
- Implement graph
- Improve env variables loading
- Use logger object
- Log each service to a different file
- Implement service
- Periodically check proxy health
- Add configuration structs for services
- Fix `msgChan` and `errChan` sizes in order to prevent deadlock
- Improve goroutines tracking with waitgroups because right now it's a mess
- Make `PageStorage` an interface
- Refactor `ElasticPageStorage`
- Save all data on SIGINT
- Make `ElasticPageStorage` concurrent
- Make `MongoJobsStorage` concurrent
- Store response headers
- Save pages in case of error
- Save timed-out links and the number of times each one timed out, and use that to revisit pages
- Open connections only when needed
- Organize data by domain
- Make another collector for URLs added from the webserver, in order to be able to crawl clearnet and subreddits
- Merge the `getCollector` functions and use a flag to choose between an onion collector and a normal one
- Make some periodic collectors for places where links get published