Opened 4 years ago

Last modified 2 years ago

#32723 assigned Cleanup/optimization

Add a GitHub action to run the Sphinx linkcheck builder.

Reported by: Nick Pope Owned by: Sarah Abderemane
Component: Documentation Version: dev
Severity: Normal Keywords: linkcheck, github, action
Cc: Triage Stage: Someday/Maybe
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

This is a follow on step from #32720. (Migrated to this new ticket as suggested by Mariusz.)

Add a scheduled GitHub action to check for broken links, or redirects that could be simplified, on a weekly/monthly basis.

This would need to wait for sphinx-doc/sphinx#6525 to be addressed so that we can treat desired redirections as "working" links instead of "redirected", e.g. https://docs.djangoproject.com/en/stable/https://docs.djangoproject.com/en/3.2/.

The linkcheck builder generates docs/_build/linkcheck/output.{json,txt} which could be filtered and attached as an artifact from the GitHub action to make it easier to provide a report on what needs fixing.

Change History (12)

comment:1 by Nick Pope, 4 years ago

Triage Stage: UnreviewedSomeday/Maybe

comment:2 by Sarah Abderemane, 4 years ago

Owner: changed from nobody to Sarah Abderemane
Status: newassigned

comment:3 by Sarah Abderemane, 3 years ago

sphinx-doc/sphinx#6525 has been closed and merged in the last release of sphinx. Does the triage stage will be updated to accepted according to this ?

in reply to:  3 ; comment:4 by Nick Pope, 3 years ago

Replying to Sarah Abderemane:

sphinx-doc/sphinx#6525 has been closed and merged in the last release of sphinx. Does the triage stage will be updated to accepted according to this ?

I'll need to have a look as the entire proposal wasn't implemented with some work being left for a follow-up.

We'll also need to wait for any changes to be released.

in reply to:  4 comment:5 by Sarah Abderemane, 3 years ago

Nick Pope any news on this subject ?

comment:6 by Jacob Walls, 3 years ago

Hey Sarah. Took a look myself. I believe Nick was hoping for changes in Sphinx to allow regexes like these like these with backreferences, but testing against the latest version of Sphinx, they aren't supported:

WARNING: Failed to compile regex in linkcheck_allowed_redirects: '^https://\\1\\.readthedocs\\.io/[-a-z]+/(?:master|latest|stable)/$' invalid group reference 1
WARNING: Failed to compile regex in linkcheck_allowed_redirects: '^https://docs\\.djangoproject\\.com/\\1/\\d+\\.\\d+/' invalid group reference 1

However the released Sphinx changes would allow regexes without the backreferences. So if we were to get started now with implementing this in Django, we would, say, in the case of the readthedocs.io links, either need to use looser regexes (possibly allowing one readthedocs site to redirect to another readthedocs site, etc) or need to write regexes for each readthedocs site (ouch?)

I don't have a strong opinion, but I would say the former option (just remove the backreferences and allow looser matches) seems tolerable. We're perfectionists who don't mind using imperfect tools, right? :wink: We could then tighten up the regexes once someone submits a follow-up PR to Sphinx. A second opinion would be good to get before setting to work, though.

I think the triage stage is set to Maybe because there might be objections to running this on CI vs. manually, but that doesn't bear directly on whether we should get the regexes written to use the new Sphinx feature.

comment:7 by Nick Pope, 3 years ago

Thanks for looking into this Jacob.

So, yes, I was waiting for the feature to be added to Sphinx, but there were some complications over the implementation, so it is only part of what would be needed. It's on my very long list of things to look into. The main problem is that substituting the captures from the source pattern into the target pattern using back-references precludes the use of back-references for repeating captured groups within the target pattern.

At this stage I think we still need to wait until something further is implemented in Sphinx. That is why I flagged this "Someday/Maybe". I think we'd require a great many more entries because we can't make it generic enough. On top of that, one of the reasons for wanting to be able to substitute from the source to the target is to ensure that we're actually redirected to the right page. This whole thing is intended to make linkcheck useful for documentation sites that redirect for language and version, but not mask other potential issues.

comment:8 by Sarah Abderemane, 3 years ago

Thanks Nick and Jacob for your replies. I understand better what it implies to do before this ticket, I'll follow up to see how it will involves.
I agree, the triage stage is confusing, even more for me when I didn't know what it implies really. I think it will be good to have many more entries for better understanding for everyone.

comment:9 by Ralph Reid, 2 years ago

Hi Sarah, I have some spare cycles to help out. Has the closed reference ticket https://github.com/sphinx-doc/sphinx/issues/6525 moved this ticket further?
Thanks.

comment:10 by Sarah Abderemane, 2 years ago

Hi Ralph, I was waiting this issue to be resolved https://github.com/sphinx-doc/sphinx/issues/9234 and the triage stage to be changed.
Nick or Jacob, or someone else who have the knowledge, do you think it's ok to change the triage stage or there are some stuff to be checked before to change the status of this ticket ?

comment:11 by Nick Pope, 2 years ago

My previous comment above still stands - it was posted after that change to Sphinx was merged.

The change made was only part of the solution, so we cannot really progress here yet.

comment:12 by Sarah Abderemane, 2 years ago

Sorry for the inconvenience Nick, thanks for the quick response. I will check if the triage status only changes.

Note: See TracTickets for help on using tickets.
Back to Top