New reasons prevent pages in a sitemap from being indexed by Google on site https://era.library.ualberta.ca/ #3289
Not found (404) Google Search Console error
Todo in 2024:
2024-01-10: The above suspicion seems to be wrong, as the log analysis shows the 404 problem predates the Oct 2023 Sidekiq problem. For example:
Summary:
From the
No other Items appear invalid (run on staging and prod 2024-01-12)
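The kind of log analysis mentioned above (checking whether 404s predate a given incident) could be sketched roughly as follows. This is a minimal illustration, not the script that was actually run: it assumes an Apache/nginx "combined" access log format, and the name `first_404_by_path` is hypothetical.

```ruby
require "time"

# Assumption: "combined" access log format; the real ERA logs may differ.
LOG_LINE = /\A\S+ \S+ \S+ \[(?<time>[^\]]+)\] "(?:GET|HEAD) (?<path>\S+) [^"]*" (?<status>\d{3}) /

# Hypothetical helper: returns { path => earliest Time a 404 was served }.
# Lines that do not match the assumed format are skipped.
def first_404_by_path(lines)
  lines.each_with_object({}) do |line, earliest|
    m = LOG_LINE.match(line)
    next unless m && m[:status] == "404"

    seen_at = Time.strptime(m[:time], "%d/%b/%Y:%H:%M:%S %z")
    path = m[:path]
    earliest[path] = seen_at if earliest[path].nil? || seen_at < earliest[path]
  end
end
```

Comparing each path's earliest 404 timestamp against the incident date would show whether the errors started before it.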
1b17b01c-4eda-4453-95c9-27c764c5b69d 2024-01-30: fixed irb(main):006:0> i = Item.find('1b17b01c-4eda-4453-95c9-27c764c5b69d')
Google Search Console "Duplicate without user-selected canonical" category analysis. This category doesn't seem to impact end users; however, below is the cause and an improvement. One cause is that the "view" and "download" links lead to the same canonical file but no
When the sitemap filter is applied to Google Search Console "Duplicate without user-selected canonical", three items appeared where Google thinks the content is similar to another item in the sitemap. Upon investigating the Google Search Console URL inspection, the "User-declared canonical" and "Google-selected canonical" appear very similar. E-mail sent to the erahelp team for advice.
Todo: is there a better way to find duplicates from the Jupiter/ERA backend?
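One possible starting point for the todo above is to group records by a normalized title and report groups with more than one member. The sketch below is only an illustration: the record shape (`:id`, `:title`) and the helper names are hypothetical, it does not use the actual Jupiter models, and title matching alone will miss duplicates whose titles differ.

```ruby
# Hypothetical record shape: { id: "...", title: "..." }.
# Lowercases, drops punctuation, and collapses whitespace so that
# trivially different titles compare equal.
def normalize_title(title)
  title.downcase.gsub(/[^\p{Alnum}\s]/, "").split.join(" ")
end

# Returns { normalized_title => [ids] } for titles shared by 2+ records.
def likely_duplicates(records)
  records
    .group_by { |r| normalize_title(r[:title]) }
    .select { |_title, group| group.size > 1 }
    .transform_values { |group| group.map { |r| r[:id] } }
end
```

In a Rails console the input could presumably be built from the real items, but which attributes to compare (title, creators, file checksums) is a judgment call for the team.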
Discovered - currently not indexed: "The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report." Perhaps related to:
Soft 404: https://support.google.com/webmasters/answer/7440203#soft_404
Possible fix:
Todo list 2024-03-06 (first round of changes to address the bulk of the errors; further scheduled rounds required)
Error when no filter selected
https://search.google.com/search-console/index?resource_id=https://era.library.ualberta.ca/&utm_source=wnc_20237597&utm_medium=gamma&utm_campaign=wnc_20237597&utm_content=msg_110624660&hl=en-CA
Let @pgwillia know if you don't have access.
There was a major incident on Sept 2nd that may be related (TicketID=67376).
[Jeff] 2023-10-19
The page was crawled by Google but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling. https://support.google.com/webmasters/answer/7440203#crawled