Integrate Kiwix with zsync #12

Open
yeehi opened this issue Dec 10, 2018 · 11 comments

@yeehi
yeehi commented Dec 10, 2018

PROBLEM
Many .zim files, e.g. for Wikipedia, are huge. If you wish to update your Wikipedia ZIM, you must re-download everything. This takes time, uses lots of bandwidth, means that new torrents need to be started frequently, and fragments the people seeding torrents across versions.

PROPOSAL
Wikipedia releases an annual ZIM, e.g. 2019. That is torrented. Then there are, say, four updates per year, which arrive in Spring, Summer, Autumn and Winter. These updates would include the new edits and added material. Something like zsync would be used to update the downloaded torrent to the more recent version.

This would allow an accumulation throughout the year of people seeding the same torrent, the 2019 one, and at the same time allow people to be up to date without having to re-download the entire Wikipedia ZIM.
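
To make the idea concrete, here is a rough sketch of why a block-based tool can reuse most of an older file. It is a deliberately simplified illustration, not how zsync is implemented: real zsync uses rolling checksums so it can also match blocks that have moved to a different offset, whereas this sketch only compares fixed-position blocks. The file names, contents and the 4-byte block size are made up for the demo.

```shell
#!/bin/sh
# Simplified illustration (assumption: fixed-offset blocks only, unlike real
# zsync, which uses rolling checksums and can match shifted data).
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

block=4   # tiny, hypothetical block size so the demo is easy to follow

# Stand-ins for an old ZIM and its updated version (made-up contents).
printf 'AAAABBBBCCCCDDDD' > "$work/old.bin"
printf 'AAAAXXXXCCCCDDDDEEEE' > "$work/new.bin"

# Print one CRC checksum per fixed-size block of the given file.
block_sums() {
    mkdir "$work/parts"
    split -b "$block" "$1" "$work/parts/p."
    for part in "$work"/parts/p.*; do
        cksum < "$part" | cut -d' ' -f1
    done
    rm -rf "$work/parts"
}

block_sums "$work/old.bin" | sort -u > "$work/old.sums"
block_sums "$work/new.bin" > "$work/new.sums"

# Count the blocks of the new file whose checksum already exists locally.
total=$(wc -l < "$work/new.sums" | tr -d ' ')
reused=$(grep -c -x -F -f "$work/old.sums" "$work/new.sums" || true)
echo "blocks reusable from the old file: $reused of $total"
```

With these made-up inputs, three of the five blocks of the new file are already present in the old one, so only the remaining two would need to be downloaded.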

@kelson42
Contributor

@yeehi Interesting, I have run a test and it saves ~60% of the bandwidth!

$ zsyncmake -u 'http://mirror.download.kiwix.org/zim/wikipedia/wikipedia_en_medicine_novid_2018-10.zim' wikipedia_en_medicine_novid_2018-10.zim
$ mv wikipedia_en_medicine_novid_2018-10.zim wikipedia_en_medicine_novid_2018-10.zim.old
$ zsync -i wikipedia_en_medicine_novid_2018-09.zim wikipedia_en_medicine_novid_2018-10.zim.zsync
reading seed file wikipedia_en_medicine_novid_2018-09.zim
Read wikipedia_en_medicine_novid_2018-09.zim. Target 57.6% complete.
downloading from http://mirror.download.kiwix.org/zim/wikipedia/wikipedia_en_medicine_novid_2018-10.zim:
#################### 100.0% 10741.9 kBps DONE     

verifying download...checksum matches OK
used 706150400 local, fetched 519752649
$ ls -la *
-rw-r--r-- 1 kelson kelson 1208191347 Dez 15 16:05 wikipedia_en_medicine_novid_2018-09.zim
-rw------- 1 kelson kelson 1225768642 Dez 15 12:05 wikipedia_en_medicine_novid_2018-10.zim
-rw------- 1 kelson kelson 1225768642 Dez 15 14:05 wikipedia_en_medicine_novid_2018-10.zim.old
-rw-r--r-- 1 kelson kelson    2394378 Dez 15 16:13 wikipedia_en_medicine_novid_2018-10.zim.zsync
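
The saving can be checked from the numbers in the transcript above. This is a small sketch that only redoes the arithmetic; the variable names are mine, and the byte counts are copied from the zsync summary line and the `ls` listing.

```shell
#!/bin/sh
# Figures copied from the transcript above; variable names are illustrative.
local_bytes=706150400      # "used 706150400 local"
fetched_bytes=519752649    # "fetched 519752649"
target_bytes=1225768642    # size of wikipedia_en_medicine_novid_2018-10.zim

# Share of the target that did NOT have to be downloaded.
saved_pct=$(LC_ALL=C awk -v f="$fetched_bytes" -v t="$target_bytes" \
    'BEGIN { printf "%.1f", (1 - f / t) * 100 }')

# Share of the target rebuilt from the local 2018-09 seed file.
reused_pct=$(LC_ALL=C awk -v l="$local_bytes" -v t="$target_bytes" \
    'BEGIN { printf "%.1f", l / t * 100 }')

echo "saved ${saved_pct}% of the download, reused ${reused_pct}% from the seed"
```

Both figures come out at about 57.6%, which matches the "Target 57.6% complete" line printed by zsync after reading the seed file.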

@kelson42 kelson42 self-assigned this Dec 15, 2018
@kelson42
Contributor

@mgautierfr @rgaudin If you are curious, please give it a try too; it might be a way to deal with the problem of incremental updates. That said, if we want to keep the advantages of aria2c, some work to integrate the two would be necessary.

@kelson42
Contributor

kelson42 commented Dec 15, 2018

I have opened a feature request on the aria2c side: aria2/aria2#1320.

@kelson42 kelson42 transferred this issue from kiwix/kiwix-build Dec 25, 2018
@kelson42
Contributor

@yeehi I have found that MirrorBrain supports zsync, so we will try to activate it on the download.kiwix.org end to see if it works fine. See kiwix/container-images#37

@yeehi
Author

yeehi commented Jan 19, 2019

@kelson42 Thank you very much for using your skills to assist! It is great that you were able to check zsync with the wiki already.

@stale

stale bot commented Jan 20, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Jan 20, 2020
@kelson42
Contributor

kelson42 commented Apr 16, 2020

Maybe we should consider IPFS as well, which offers functionality similar to zsync. See ipfs/distributed-wikipedia-mirror#71

@stale

stale bot commented Jun 16, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Jun 16, 2020
@natema

natema commented Aug 21, 2021

@kelson42 any update on this?
Btw, I see that there's now a rewrite of zsync: https://github.com/AppImage/zsync2

@stale stale bot removed the stale label Aug 21, 2021
@kelson42
Contributor

No update; we are looking more in the direction of IPFS for the moment.

@stale

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Apr 16, 2022