I maintain this project to convert my Kindle highlights into a webpage, along with a few intermediate data formats that are easier to parse in code.
The raw format (the Kindle HTML export) is converted to a JSON file and a Markdown file, which are then exported to a Jekyll website, available at http://tarangshah.com/books.
I also had a few older books that I read on Google Play Books, which exports highlights in MS Word format. There is a parser for that as well.
- Conversion scripts for Google Play Books (exported to Drive) and Kindle email export (HTML)
  - `index.js` has the main code
  - The main classes for parsing are in `KindleConvert.js` and `GooglePlayConverter.js`
  - `renderer.js` and `postProcessJson.js` are used for the final document generation tasks
- Raw and intermediate highlight files (these are essentially text files and quite small, so GitHub seemed like the perfect place to store them; you can find them in the "Raw" and "Results" folders)
- Sync all your highlights to Google Drive; Google stores the synced highlights in docx format
- Use Pandoc to convert the docx to HTML: `pandoc -f docx -t html -o file.html file.docx`
- Then use the GooglePlayConverter class to convert the HTML to JSON
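
The Pandoc step above can also be scripted from Node. The following is a minimal sketch, not code from this repository: the helper names `pandocArgs` and `docxToHtml` are hypothetical, and it assumes `pandoc` is installed and on your `PATH`. It only wraps the command shown above and does not call the GooglePlayConverter class.

```javascript
// Sketch only: wraps the pandoc command from the steps above.
// pandocArgs and docxToHtml are hypothetical helper names.
const { execFileSync } = require("child_process");
const path = require("path");

// Build the argument list for:
//   pandoc -f docx -t html -o <name>.html <name>.docx
function pandocArgs(docxPath) {
  const parsed = path.parse(docxPath);
  const htmlPath = path.join(parsed.dir, parsed.name + ".html");
  return ["-f", "docx", "-t", "html", "-o", htmlPath, docxPath];
}

// Run pandoc and return the path of the generated HTML file.
function docxToHtml(docxPath) {
  const args = pandocArgs(docxPath);
  execFileSync("pandoc", args, { stdio: "inherit" });
  return args[5]; // the value passed to -o
}
```

Calling `docxToHtml("file.docx")` would write `file.html` next to the input, which can then be handed to the converter.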
- Using the Kindle Android or iOS app, export your notes for each book to an email. The Kindle app attaches an HTML file of your highlights
- Use this HTML file and the KindleConverter module to convert the highlights to JSON
Special thanks to @sawyerh for the kindle-email-to-json package, from which the Kindle converter was derived. I added location and page number parsing, along with other small parsing updates, mainly for notes.
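
To illustrate the kind of location and page number parsing mentioned above, here is a rough sketch. It is not the project's actual converter code: `parseNoteHeading` is a hypothetical name, and the heading format (for example `Highlight (yellow) - Page 12 · Location 345`) is an assumption about what the Kindle export contains.

```javascript
// Sketch only: the heading format handled here is an assumption,
// not taken from the project's Kindle converter.
function parseNoteHeading(heading) {
  // Collapse whitespace so line breaks inside the heading do not matter.
  const text = heading.replace(/\s+/g, " ").trim();
  const typeMatch = text.match(/^(Highlight|Note|Bookmark)/i);
  const pageMatch = text.match(/Page\s+(\d+)/i);
  const locationMatch = text.match(/Location\s+(\d+)/i);
  return {
    type: typeMatch ? typeMatch[1].toLowerCase() : null,
    page: pageMatch ? Number(pageMatch[1]) : null,
    location: locationMatch ? Number(locationMatch[1]) : null,
  };
}
```

Fields that are missing from a heading come back as `null`, so a note with only a location still produces a usable record.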
- Copy the raw files into the `raw` directory
- Run `./run.sh`
- Once you have the Jekyll data (a bunch of `.md` files in `results/jekyllCollection`), copy them to the Jekyll website (https://github.com/t27/books/, in the `_highlights` folder)
- The updated highlights should now be available at tarangshah.com/books
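
For a sense of what one of the generated Jekyll collection files might look like, here is a minimal sketch that turns a book's highlights into Markdown with YAML front matter. The `renderJekyllPage` name, the shape of the `book` object, and the front matter keys are all hypothetical; the actual output of `renderer.js` may differ.

```javascript
// Sketch only: the book object shape and the front matter keys are
// hypothetical and may not match what renderer.js produces.
function renderJekyllPage(book) {
  // JSON.stringify yields double-quoted strings, which are valid YAML scalars.
  const frontMatter = [
    "---",
    `title: ${JSON.stringify(book.title)}`,
    `author: ${JSON.stringify(book.author)}`,
    "---",
  ];
  // Render each highlight as a blockquote, with its location when known.
  const body = book.highlights.map((h) => {
    const where = h.location != null ? ` (Location ${h.location})` : "";
    return `> ${h.text}${where}`;
  });
  return frontMatter.join("\n") + "\n\n" + body.join("\n\n") + "\n";
}
```

The returned string could be written to a `.md` file and copied into the site's `_highlights` folder like the other generated files.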