Skip to content

Commit

Permalink
doc updates
Browse files Browse the repository at this point in the history
  • Loading branch information
purarue committed May 8, 2024
1 parent a8aefac commit 9aea89f
Show file tree
Hide file tree
Showing 2 changed files with 6 additions and 4 deletions.
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -253,17 +253,17 @@ On certain machines, the giant HTML files may even take so much memory that the

Just to give a brief overview, to add new functionality (parsing some new folder that this doesn't currently support), you'd need to:

- Add a `model` for it in [`models.py`](google_takeout_parser/models.py) subclassing `BaseEvent` and adding it to the Union at the bottom of the file. That should have a `key` property function which describes each event uniquely (used to merge takeout events)
- Add a `model` for it in [`models.py`](google_takeout_parser/models.py) subclassing `BaseEvent` and adding it to the Union at the bottom of the file. That should have a [`key` property function](https://github.com/seanbreckenridge/google_takeout_parser/blob/a8aefac76d8e1474ca2275b4a7c78bbb962c7a04/google_takeout_parser/models.py#L185-L187) which describes each event uniquely (this is used to remove duplicate items when merging takeouts)
- Write a function which takes the `Path` to the file you're trying to parse and converts it to the model you created (See examples in [`parse_json.py`](google_takeout_parser/parse_json.py)). Ideally extract a single raw item from the takeout file add a test for it so its obvious when/if the format changes.
- Add a regex match for the file path to the handler map in [`google_takeout_parser/locales/en.py`](google_takeout_parser/locales/en.py).

Dont feel required to add support for all locales, its somewhat annoying to swap languages on google, request a takeout, wait for it to process and then swap back.

Though, if your takeout is in some language this doesn't support, you can [create an issue](https://github.com/seanbreckenridge/google_takeout_parser/issues/new?title=support+new+locale) with the file structure (run `find Takeout` and/or `tree Takeout`), or contribute a locale file by creating a `path -> function mapping`, and adding it to the global `LOCALES` variables in `locales/all.py` and `locales/main.py`
Though, if your takeout is in some language this doesn't support, you can [create an issue](https://github.com/seanbreckenridge/google_takeout_parser/issues/new?title=support+new+locale) with the file structure (run `find Takeout` and/or `tree Takeout`), or contribute a locale file by creating a `path -> function mapping` ([see locales](https://github.com/seanbreckenridge/google_takeout_parser/tree/master/google_takeout_parser/locales)), and adding it to the global `LOCALES` variables in `locales/all.py` and `locales/main.py`

This is a pretty difficult to maintain, as it requires a lot of manual testing from people who have access to these takeouts, and who actively use the language that the takeout is in. My google accounts main language is English, so I upkeep that locale whenever I notice changes, but its not trivial to port those changes to other locales without swapping my language, making an export, waiting, and then switching back. I keep track of mismatched changes [in this board](https://github.com/users/seanbreckenridge/projects/1/views/1)

Ideally, you would select everything when doing a takeout (not just the `My Activity`/`Chrome`/`Location History` like I suggested above), so [paths that are not parsed can be ignored properly](https://github.com/seanbreckenridge/google_takeout_parser/blob/4981c241c04b5b37265710dcc6ca00f19d1eafb4/google_takeout_parser/locales/en.py#L105C1-L113).
Ideally, when first creating a locale file, you would select everything when doing a takeout (not just the `My Activity`/`Chrome`/`Location History` like I suggested above), so [paths that are not parsed can be ignored properly](https://github.com/seanbreckenridge/google_takeout_parser/blob/4981c241c04b5b37265710dcc6ca00f19d1eafb4/google_takeout_parser/locales/en.py#L105C1-L113).

### Testing

Expand Down
4 changes: 3 additions & 1 deletion google_takeout_parser/parse_csv.py
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,9 @@ def reconstruct_comment_content(
elif "text" in segment:
buf.write(segment["text"])
else:
return ValueError(f"Expected 'text' or 'link' in segment, got {segment}")
return ValueError(
f"Expected 'text' or 'link' in segment, got {segment}"
)
return buf.getvalue()
else:
# this is not a user-facing error, its misconfiguration, so we raise it
Expand Down

0 comments on commit 9aea89f

Please sign in to comment.