Re: [RFC 0/2] Add a new translation tool scripts/trslt.py

From: Wu X.C.
Date: Fri Apr 16 2021 - 04:45:23 EST


Hi Federico,

On Wed, Apr 14, 2021 at 01:27:23AM +0200, Federico Vaga wrote:
> Hi,
>
> Yes, you are touching a good point where things can be improved. I admit that I
> did not have a look at the code yet, if not very quickly. Perhaps I'm missing
> something. However, let me give you my two cents based on what I usually do.
>
> I do not like the idea of adding tags to the file and having tools to modify it.
> I would prefer to keep the text as clean as possible.

Yeah, I considered that too, which is why the tag was designed as a
single-line comment, in the hope of keeping the text clean.
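
To show how little a one-line tag costs a tool, here is a minimal sketch
of reading it back. It only assumes the tag format from the cover letter
(a 40-hex commit ID after ".. translation_origin_commit:"); the helper
name is hypothetical and this is not the code in trslt.py.

```python
import re

# Hypothetical helper, not taken from trslt.py. The tag format is the one
# shown in the cover letter: an RST comment line carrying a 40-hex commit ID.
TAG_RE = re.compile(
    r"^\.\. translation_origin_commit: ([0-9a-f]{40})\s*$",
    re.MULTILINE,
)


def read_origin_commit(text):
    """Return the commit ID stored in the tag, or None if there is no tag."""
    match = TAG_RE.search(text)
    return match.group(1) if match else None
```

A file without the tag simply yields None, so untagged (old) translations
can be reported separately.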

>
> Instead, what can be done without manipulating the text file is to do
> something like this:
>
> # Take the commit ID of the last time a document was translated
> LAST_TRANS=$(git log -n 1 --oneline Documentation/translations/<lang>/<path-to-file> | cut -d " " -f 1)
>
> # Take the history of the same file in the main Documentation tree
> git log --oneline $LAST_TRANS..doc/docs-next Documentation/<path-to-file>
>
> This will give you the list of commits that changed <path-to-file>, and that
> probably need to be translated. The problem of this approach is that by the time
> you submit a translation, other people may change the very same files. The
> correctness of this approach depends on patch order in docs-next, and this can't
> be guaranteed.

Thanks for sharing your experiences!

Yes, the ordering problem is exactly why I started thinking about version
control for translations. It gets really messy, especially when a file is
updated frequently, and some old files are also hard to maintain.

>
> So, instead of relying on LAST_TRANS, I rely on a special git branch that acts
> as a marker. But this works only for me and not for other translators of the
> same language, so you can run into trouble in this case as well.
>
> What we can actually do is to exploit the git commit message to store the tag
> you mentioned. Hence, we can get the last Id with something like this:
>
> LAST_ID=$(git log -n 1 Documentation/translations/<lang>/<path-to-file> | grep -E "Translated-on-top-of: commit [0-9a-f]{12}")
>
> The ID we store in the tag does not need to be the commit ID of the last change
> to <path-to-file>, but just the commit on which you were when you did the
> translation. This is because it will simplify the management of this tag when
> translating multiple files/patches in a single patch (to avoid spamming the
> mailing list with dozens of small patches).

Yes, I also thought about storing the corresponding commit ID in the commit
message. Doing that in a git hook is easy for now, but if we want to add
something in the future, it may need another script. Or should it just be a
tool which shows the corresponding information and lets translators add it
themselves?
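
As a rough sketch of that last option, assuming the
"Translated-on-top-of: commit <id>" line from your grep example and the
doc/docs-next ref from your git log example. The function names are
hypothetical, and nothing like this exists in the posted patches.

```python
import re
import subprocess

# Trailer format taken from the grep pattern quoted above; an ID of 12 to 40
# hex digits is accepted so full commit IDs work too (an assumption).
TRAILER_RE = re.compile(
    r"^\s*Translated-on-top-of: commit ([0-9a-f]{12,40})\b",
    re.MULTILINE,
)


def base_commit_from_message(message):
    """Return the commit ID recorded in a commit message, or None."""
    match = TRAILER_RE.search(message)
    return match.group(1) if match else None


def pending_commits_cmd(base, source_path, upstream="doc/docs-next"):
    """Build the git command listing source changes made since `base`."""
    return ["git", "log", "--oneline", f"{base}..{upstream}", "--", source_path]


def last_commit_message(translated_path):
    """Fetch the message of the last commit touching the translated file."""
    result = subprocess.run(
        ["git", "log", "-n", "1", "--format=%B", "--", translated_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

The translator would run the command built by pending_commits_cmd() to see
what still needs translating, and write the new trailer by hand.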

But to be honest, I'd like the tool to gain more functions in the future,
such as automatically starting the workflow. More and more people will
join the translation work, and some new developers also start out here.
There is a clear need to make the work more standardized and easier.

Thanks!

Wu X.C.

>
> On Mon, Apr 12, 2021 at 03:04:03PM +0800, Wu XiangCheng wrote:
> > Hi all,
> >
> > This set of patches aims to add a new translation tool - trslt.py, which
> > can track which version of the source files a translation corresponds to.
> >
> > For a long time, kernel documentation translations have lacked a way to track
> > the version of the source files they correspond to. If you translate a file
> > and then someone updates the source file, there will be a problem. It's hard
> > to know which version the existing translation corresponds to, and even
> > harder to sync them.
> >
> > The common way now is to check the date, but this is not accurate,
> > especially for documents that are often updated. Some translators write the
> > corresponding commit ID in the commit log for reference. That is a good way,
> > but still a little troublesome.
> >
> > Thus, the purpose of ``trslt.py`` is to add a new annotation tag to the file
> > to indicate the corresponding version of the source file::
> >
> > .. translation_origin_commit: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > The script will automatically copy the file and generate the tag when
> > creating a new translation, and give update suggestions based on those tags
> > when updating translations.
> >
> > For more details, please read the doc in [Patch 2/2].
> >
> > Still needs work:
> > - improve verbose mode
> > - test on more Python 3.x versions
> > - only supports Linux now; needs testing on Mac OS; Windows is not supported
> > due to '\'
> >
> > Any suggestions are welcome!
> >
> > Thanks!
> >
> > Wu XiangCheng (2):
> > scripts: Add new translation tool trslt.py
> > docs: doc-guide: Add document for scripts/trslt.py
> >
> > Documentation/doc-guide/index.rst | 1 +
> > Documentation/doc-guide/trslt.rst | 233 ++++++++++++++++++++++++++
> > scripts/trslt.py | 267 ++++++++++++++++++++++++++++++
> > 3 files changed, 501 insertions(+)
> > create mode 100644 Documentation/doc-guide/trslt.rst
> > create mode 100755 scripts/trslt.py
> >
> > --
> > 2.20.1
> >
>