
Internationalizes section handouts with Sphinx, reST, and gettext. #1

Open
wants to merge 6 commits into
base: master

sredmond

Overview

This PR introduces the Sphinx documentation tool, and its friends, to autogenerate HTML and PDF handouts in multiple languages.

Changes Proposed

At a more detailed level, what changes does this PR make to the codebase?

  • Section handouts can (and should?) be written in reST so that they can produce both HTML and PDF output.
    • Seriously, reST with Sphinx is fantastic for technical writing about Python. It's like Markdown with loads of extra features.
  • Sphinx is used to autogenerate documentation targeting HTML and PDF output in all languages.
  • Sphinx is configured for internationalization.
  • Handouts are localized into Spanish, French, and German.
  • There's an index page (for HTML) which points to all of the section handouts.
  • Translations are stored in gettext-friendly .po files, which could eventually be integrated with a public-facing crowdsourced translation service like Transifex or Crowdin.
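
For reference, configuring Sphinx for internationalization usually comes down to a few settings in `conf.py`. The fragment below is a sketch based on Sphinx's documented options, not the configuration committed in this PR; in particular, the `locale/` directory name is an assumption.

```python
# conf.py (fragment): hypothetical values, not copied from this PR.

# Default language for builds; a per-build `-D language=...` overrides it.
language = "en"

# Directories (relative to conf.py) where Sphinx looks for translated
# message catalogs, laid out as <locale_dir>/<lang>/LC_MESSAGES/.
# The name "locale/" is an assumption.
locale_dirs = ["locale/"]

# Emit one message catalog per document instead of merging documents
# that share a top-level directory into a single catalog.
gettext_compact = False
```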

Caveats

  • I don't speak French or German.
  • The LaTeX header looks a little janky, and doesn't have the usual "Quarter / Date / Handout #" format.
  • The HTML build output contains loads of garbage that could be pruned depending on the eventual landing spot for the HTML bodies.

Mentions

@brahmcapoor

For technical writing involving Python, RST is a neat choice, because it can
be used by Sphinx to autogenerate files in a variety of output formats [1].

Moreover, it's the "fancy-Markdown" equivalent that has enough expressiveness
to meaningfully work for technical writing, shoring up areas where Markdown
itself fails to provide useful primitives (such as embedding images with set
attributes).

Section headings adhere to the same convention as Python's official docs [2].

[1]: https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html
[2]: https://devguide.python.org/documenting/#sections
Sphinx [1] is "a tool that makes it easy to create intelligent and beautiful
documentation." It also has great support for internationalization [2].

Sphinx is a common tool for writing about and documenting Python code, even
used by the Python language itself.

The TL;DR of using Sphinx: to make HTML documents, run `make html`. To make
PDF documents, run `make latexpdf`. It's as easy as that!

[1]: https://www.sphinx-doc.org/en/master/
[2]: https://www.sphinx-doc.org/en/master/usage/advanced/intl.html
The index reST file currently just contains a TOC listing all documents that
look like "section*/section*".

For more information on the `toctree` directive, see [1].

[1]: https://www.sphinx-doc.org/en/1.8/usage/restructuredtext/directives.html#directive-toctree
With this commit, the power of Sphinx's gettext integration begins to shine.

Translators (or in this case, just me) populate `.po` files autogenerated by
`sphinx-intl`, and then Sphinx can build documentation in any of those
languages.

To extract translatable messages into `pot` files, use:

    $ make gettext

These `pot` files are build artifacts generated by Sphinx, so they are kept
out of version control.

To generate `po` files from the `pot` files created by the previous step, use:

    $ sphinx-intl update -p _build/gettext -l es -l de -l fr

Any valid language codes can be used in that command.

Later, when the content of the translatable strings changes, you'll have to run:

    $ sphinx-intl update -p _build/gettext

This will update the `po` files with the new strings for translators.

Finally, to make the documentation in a specific language (say, Spanish), use:

    $ make -e SPHINXOPTS="-D language='es'" html
    $ make -e SPHINXOPTS="-D language='es'" latexpdf

That's somewhat hard to remember, so a helper script is on its way.
The `makedoc.sh` script will make HTML and PDF versions of the handouts in all
languages it knows about and store the output in `handouts-html` and
`handouts-pdf`. It's a pretty janky script though, so manually check the
output before committing.
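
The loop that such a script performs can be sketched in Python. This is not the contents of `makedoc.sh`; it only reuses the `make` invocations quoted earlier in this PR. The function names, the `dry_run` flag, and the exact language list are assumptions made for illustration.

```python
import subprocess

# Languages and make targets taken from the commands and output paths
# mentioned in this PR; the list itself is an assumption.
LANGUAGES = ["en", "es", "fr", "de"]
TARGETS = ["html", "latexpdf"]


def make_command(language, target):
    """Return the argv for building one target in one language.

    Mirrors: make -e SPHINXOPTS="-D language='es'" html
    """
    return ["make", "-e", f"SPHINXOPTS=-D language='{language}'", target]


def build_all(languages=LANGUAGES, targets=TARGETS, dry_run=True):
    """Collect every build command; run them only when dry_run is False."""
    commands = [make_command(lang, tgt) for lang in languages for tgt in targets]
    if not dry_run:
        for argv in commands:
            # check=True stops at the first failed build.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    for argv in build_all():
        print(" ".join(argv))
```

The sketch leaves out copying each build's output into `handouts-html` and `handouts-pdf`, which the real script handles.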

The README contains instructions for handling translations and building to
multiple languages.
The PDFs can be found at:

* `handouts-pdf/de/Section1.pdf`
* `handouts-pdf/en/Section1.pdf`
* `handouts-pdf/es/Section1.pdf`
* `handouts-pdf/fr/Section1.pdf`

The HTML files can be found at:

* `handouts-html/de/html/section1/section1.html`
* `handouts-html/en/html/section1/section1.html`
* `handouts-html/es/html/section1/section1.html`
* `handouts-html/fr/html/section1/section1.html`
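
Since the build output should be checked by hand before committing, a small helper can at least confirm that the files listed above exist. This is a sketch only: the function names are hypothetical, and extending the path pattern to sections other than 1 is an assumption.

```python
from pathlib import Path

# Language codes from the output paths listed in this PR.
LANGUAGES = ["de", "en", "es", "fr"]


def expected_outputs(languages=LANGUAGES, section=1, root="."):
    """Return the PDF and HTML paths expected for one section handout."""
    base = Path(root)
    name = f"section{section}"
    paths = []
    for lang in languages:
        paths.append(base / "handouts-pdf" / lang / f"Section{section}.pdf")
        paths.append(base / "handouts-html" / lang / "html" / name / f"{name}.html")
    return paths


def missing_outputs(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not p.is_file()]


if __name__ == "__main__":
    for path in missing_outputs(expected_outputs()):
        print(f"missing: {path}")
```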

Admittedly, there's a lot of HTML garbage floating around and it would be nice
to find a way to clean that all up.