Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create and use a standard utility library for handling zip files in the frontend #11539

Merged
merged 9 commits into from
Jan 2, 2024

Conversation

rtibbles
Copy link
Member

Summary

  • Generalizes usage of the fflate library in the H5P runner to create a kolibri-zip library for reading zip files in the browser
  • Allows users of the ZipFile utility to define Mapper classes that have getPaths and replacePaths methods, that then modify the file when it is fetched from the ZipFile
  • Mostly this is done to allow automatic recursing of the file to produce, for example, a CSS file with local file references (to fonts, images, etc) already replaced with a blob URL for a file that has been extracted from the zip file
  • This initial CSS use case used in the H5P Runner is then extended for src and href attributes in HTML and XML files
  • Lastly, the similar replacement done in JSON files in the Perseus Renderer is then updated to use the kolibri-zip library, with it defining its own custom JSONMapper for handle json files

References

Fixes #9157

The HTML and XML utilities were also created in conjunction with work on the draft PR #8070 (making the third use case for the abstraction)

Reviewer guidance

Is kolibri-zip a good enough name?

These changes affect, but in my manual testing do not alter the behaviour of both H5P file rendering and Perseus rendering. Some confirmation of this would be helpful.

It is worth noting that this work is likely to cause merge conflicts with #9759 - but it seemed simpler to do this change now, and resolve any conflicts in that PR.

The last issue that might need to be cleared up is the implicit dependency of a Hashi script on JSZip - this can be resolved within this PR, but is probably best left to a follow up issue, where we also update our bundled H5P static files.


Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Critical and brittle code paths are covered by unit tests

PR process

  • PR has the correct target branch and milestone
  • PR has 'needs review' or 'work-in-progress' label
  • If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
  • If this is an important user-facing change, PR or related issue has a 'changelog' label
  • If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

Copy link
Member

@nucleogenesis nucleogenesis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code and changes make sense throughout. I'm a little fuzzy on exactly how the get/replaceDomPaths functions are working w/ the regex. Maybe an example of what the mapping function is doing w/ the regex there could make it a bit more clear what is going on there?

attributes
.map(a => element.getAttribute(a))
.filter(Boolean)
.map(url => url.replace(queryParamRegex, '$1'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does $1 here refer to the first captured match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcellamaki
Copy link
Member

@nucleogenesis is there a blocker here? I'm not sure if I should review or if you are waiting for something specific. Can you clarify your recommendation a bit, just so we can keep this moving forward as it's gotten rather stale. Thank you!

Copy link
Member

@marcellamaki marcellamaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a few in-line comments/questions for my own understanding, and will re-review after.

Manual QA of perseus and h5p checks out

@@ -547,7 +523,7 @@ export default class H5PRunner {
if (!file) {
return;
}
const json = JSON.parse(strFromU8(file.obj));
const json = JSON.parse(file.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these file.toString() changes in this file are converting what was an object, to a string, which is the same behavior, but just in a different way than before, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the the toString method just uses the fflate strFromU8 method under the hood, this just allows the calling code to be agnostic about the underlying library.

@@ -20,7 +20,7 @@ export default function(path) {
if (xhr.readyState === 4) {
if (xhr.status === 200 || xhr.status === 0) {
try {
resolve(new Uint8Array(xhr.response));
resolve(xhr.response);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because the changes you've added mean that xhr.response is in a different format?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - just because the loadBinary is only used for loading resources into the zip file, and I preferred to delegate the coercion to Uint8Array to the ZipFile class instead, and leave this utility to simply load the binary data.

export function getAbsoluteFilePath(baseFilePath, relativeFilePath) {
// Construct a URL with a dummy base so that we can concatenate the
// dependency URL with the URL relative to the dependency
// and then read the pathname to get the new path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was pre-existing and it's just moving it, so non-blocking.

This comment feels rather hard to wrap my brain around, mostly the
that we can concatenate the dependency URL with the URL relative to the dependency part
The second "URL" (i.e. "the URL relative to the dependency") is where I get lost.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, we are using the JS URL utilities to resolve a relative file path (by turning it into a resolution of a relative URL against a dummy base URL).

So the file that we are currently looking at in the zip file: e.g. style/images.css references another file relatively ../images/cool.png, this will resolve the reference to an absolute file path from the root of the zip file - so images/cool.png.

Copy link
Member

@nucleogenesis nucleogenesis Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing for me as well but I played w/ it a little bit and I think that the thing that isn't clear is that the URL constructor auto-magically resolves the relative path against the second argument.

I played w/ it in the terminal a bit and this kind of made it make sense to me as I kept adding ..s to the relative path:

> new URL("../best/friend.css", "https://a.com/hi/there/my/mortal/enemy").pathname
'/hi/there/my/best/friend.css'

> new URL("../../best/friend.css", "https://a.com/hi/there/my/mortal/enemy").pathname
'/hi/there/best/friend.css'

> new URL("../../../best/friend.css", "https://a.com/hi/there/my/mortal/enemy").pathname
'/hi/best/friend.css'

I think that maybe the comment could use some clarification or an example like the one you gave, Richard, re: styles/header/logo.css has a reference to ../../assets/logo.svg so we need to get a pathname assets/logo.svg out of that. Maybe that's more complicated though 😄

packages/kolibri-zip/src/index.js Show resolved Hide resolved
// Additional file types for Kolibri
'html',
'htm',
'xhtml',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why these are added. Not that they shouldn't be, but more in the benefit/purpose way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I note in the PR description:

The HTML and XML utilities were also created in conjunction with work on the draft PR #8070 (making the third use case for the abstraction)

I preferred to include these edits here, as it helps to triangulate the use case for the abstraction.

Copy link
Member

@nucleogenesis nucleogenesis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one comment but looking through the code again everything looks good. I don't need anything in particular for this to be merged although it may be worth updating the comment on getAbsolutePath to a JSDoc.

That's not blocking for me so as @marcellamaki plans to re-review I'll leave final approval to her

Copy link
Member

@marcellamaki marcellamaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rtibbles - no blockers from me, but one remaining curiosity-only question

Khan.imageUrls = {};

function getImagePaths(itemResponse) {
const graphieMatches = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's graphie? Is this a perseus specific image term?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - it allows loading of either a JSON blob or an SVG file to display graph style images.

@rtibbles rtibbles merged commit a855679 into learningequality:develop Jan 2, 2024
35 checks passed
@rtibbles rtibbles deleted the zippidy-doodah branch January 2, 2024 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants