Re: [PATCH v2 1/2] Documentation: Start translations to Spanish

From: Matthew Wilcox
Date: Mon Oct 24 2022 - 15:21:56 EST


On Mon, Oct 24, 2022 at 08:40:42AM -0500, Carlos Bilbao wrote:
> > > I don't know what standard we're actually following. RFC5646 suggests
> > > simply using "es", with "es-419" for Latin America specialisation or
> > > "es-ES" for Spain. I don't know how much variation there is between
> > > different Spanish dialects for technical documents; as I understand it,
> > > it's worth supporting two dialects of Chinese, but we merrily mix &
> > > match en_US and en_GB spellings. Similarly, I wouldn't suggest that we
> > > have separate translations for fr_CA, fr_CH, fr_FR, just a single 'fr'
> > > would be fine.
> > >
> > > We do need to be careful here; people are rightfully sensitive about
> > > being incorrectly grouped together. If possible we should find a
> > > standard to follow that's been defined by experts in these matters.
> > > https://en.wikipedia.org/wiki/IETF_language_tag may be a good place to
> > > start looking.
> > I think generic "es" is OK, especially if "es_ES" can have such a
> > negative connotation to some. I just wanted to point out "sp_SP"
> > looks wrong.
> >
> > Carlos, if you go the "es" way, it would be better to mention the
> > reason for the choice in the Changelog for future reference.
> >
> > Subdirectories "ja_JP", "ko_KR", and "zh_CN" were added under
> > Documentation/ way back in 2007 (v2.6.23).
> >
> > As you might see, two of the three language codes needed region
> > distinction and they were reasonable choices at the time.
> >
> > Thanks, Akira
>
> Replying to Akira and Matthew below. Thanks to both for the valuable feedback.
>
> I made the conscious choice of not using es_ES, because as mentioned, it
> references a standard that I don’t intend to follow myself or enforce on
> Spanish translations. es_ES is a standard that comes from “Esp”-aña (Spain,
> the country) whereas “sp_SP” is as in "Sp"-anish, the language, not the
> country. Regarding this, I took the liberty of adding an extra paragraph to
> index.rst. I would translate it to English like:
>
> "Many countries speak Spanish, each one with its own culture, expressions,
> and sometimes significant grammatical differences. The translators are free
> to use the version of Spanish which they are most comfortable with. In
> principle, these small differences should not pose a great barrier for
> speakers of different versions of Spanish, albeit in case of doubt, you can
> ask the maintainers."
>
> I also opted for not using es_ES due to its geographical connotations. If
> someone from Peru, Mexico, Argentina, … submits a translation tomorrow, I
> would review it and we would understand each other just fine. Even within
> “Spain” there are many dialects and things change within regions. I
> reiterate that all dialects should be allowed in this directory.
>
> Fortunately for us, versions of Spanish differ much more in spoken form
> than they do when written. This does not happen between traditional and
> simplified Chinese.
>
> On top of everything else, using locale es_ES may imply that spell checks
> on that directory using the locale es_ES would be clean, but this is very
> far from reality, among other things, because of all the English terms we
> inherit regarding computers. As Miguel Ojeda pointed out somewhere in this
> thread, there are terms that are better left untranslated, to favor
> understanding of code/other documents.
>
> I will update the corresponding commit message to clarify why we are using
> sp_SP format in this particular case.

I think we're better off following BCP 47:
https://www.rfc-editor.org/info/bcp47 rather than the libc locale format.
That will imply renaming it_IT to simply "it", ja_JP to "ja" and
ko_KR to "ko". The two Chinese translations we have might be called
"zh-Hant" and "zh-Hans", if the distinction is purely Traditional vs
Simplified script. If they really are region based, then they'd be
zh-CN and zh-TW.
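
As a rough sketch of the rename this would imply (the helper name and the
script-vs-region flag are made up for illustration; nothing like this exists
in the tree, and the Hans/Hant assignment for CN/TW is an assumption that only
holds if the split really is by script):

```python
def to_bcp47(dirname: str, chinese_by_script: bool = True) -> str:
    """Map a libc-style directory name (e.g. "ja_JP") to a BCP 47 tag.

    Hypothetical helper: drops the region subtag for every language
    except Chinese, where the two translations must stay distinct.
    """
    language, _, region = dirname.partition("_")
    if language == "zh":
        if chinese_by_script:
            # Assumption: zh_CN is Simplified script, zh_TW is Traditional.
            scripts = {"CN": "zh-Hans", "TW": "zh-Hant"}
            return scripts.get(region, f"zh-{region}")
        # Region-based distinction: keep the region, use a hyphen.
        return f"zh-{region}"
    # Everything else collapses to the bare language subtag.
    return language


for name in ("it_IT", "ja_JP", "ko_KR", "zh_CN", "zh_TW"):
    print(name, "->", to_bcp47(name))
```

Calling to_bcp47("zh_CN", chinese_by_script=False) gives the region-based
form instead.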

I think you're right to conflate all dialects of Spanish together, just
as we do all dialects of English.

Jon, this feels like policy you should be setting. Are you on board
with this, or do you want to retain the mandatory geography tag that
we've been using up to now?