Grab images and text from HathiTrust for the literary collection at the same time #664

Open
3 tasks
mnaydan opened this issue May 8, 2024 · 0 comments
mnaydan commented May 8, 2024

  • Image data and XML data need to have the same version date
  • Include a timestamp when downloading images
  • Get a sense of how often the image and text data change
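
One way to picture the three tasks above is a small download manifest. This is only a rough sketch, not an existing part of the project: `DownloadRecord`, `make_record`, `versions_match`, `changed_since`, and the volume id are all hypothetical names, and it assumes the version dates for the text and image data have already been obtained from HathiTrust by some other means.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DownloadRecord:
    """Hypothetical manifest entry for one downloaded volume."""
    volume_id: str
    text_version: date       # version date of the XML/text data
    image_version: date      # version date of the page images
    downloaded_at: datetime  # when the images were fetched (UTC)


def make_record(volume_id: str, text_version: date, image_version: date,
                now: Optional[datetime] = None) -> DownloadRecord:
    """Build a manifest entry, stamping it with the download time."""
    if now is None:
        now = datetime.now(timezone.utc)
    return DownloadRecord(volume_id, text_version, image_version, now)


def versions_match(record: DownloadRecord) -> bool:
    """True when image and text data share the same version date."""
    return record.text_version == record.image_version


def changed_since(previous: DownloadRecord, current: DownloadRecord) -> bool:
    """True when either version date differs between two snapshots."""
    return (previous.text_version != current.text_version
            or previous.image_version != current.image_version)


if __name__ == "__main__":
    # "example.001" is a made-up id, not a real HathiTrust volume.
    record = make_record("example.001", date(2024, 5, 1), date(2024, 5, 1))
    print(record.downloaded_at.isoformat(), versions_match(record))
```

Comparing records from successive downloads with `changed_since` would give a rough count of how often volumes change between runs.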
mnaydan transferred this issue from Princeton-CDH/ppa-nlp Jul 26, 2024
mnaydan changed the title from "Grab images and text from HathiTrust for the entire corpus at the same time" to "Grab images and text from HathiTrust for the literary collection at the same time" Nov 15, 2024
Labels
None yet
Projects
Status: IceBox
Development

No branches or pull requests

3 participants