RE: [PATCH v2 7/7] udf: Merge linux specific translation into CS0 conversion function

From: Andrew Gabbasov
Date: Mon Jan 11 2016 - 08:33:11 EST


Hi Jan,

> -----Original Message-----
> From: Jan Kara [mailto:jack@xxxxxxx]
> Sent: Monday, January 04, 2016 4:25 PM
> To: Gabbasov, Andrew
> Cc: Jan Kara; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 7/7] udf: Merge linux specific translation into CS0
conversion function

[skipped]

> > + u_ch = cmp_id >> 3;
> > +
> > + ocu++;
> > + ocu_len--;
> > +
>
> Can we just check whether ocu_len is a multiple of u_ch here
> and bail out if not? That would save us some rounding below
> and also fix possible access beyond the end of the string
> in case fs is corrupted.

Yes, I think it can be done. I just wasn't sure if having
unaligned length is allowed or not, and tried to support
this case too. We can definitely return -EINVAL if ocu_len
is not a multiple of u_ch.

[skipped]

> > - for (i = 1; (i < ocu_len) && (str_o_len < str_max_len);) {
> > /* Expand OSTA compressed Unicode to Unicode */
> > - uint32_t c = ocu[i++];
> > - if (cmp_id == 16)
> > + c = ocu[i++];
> > + if (u_ch > 1)
> > c = (c << 8) | ocu[i++];
> >
> > + if (translate) {
> > + if ((c == '.') && (firstDots >= 0))
> > + firstDots++;
> > + else
> > + firstDots = -1;
> > +
> > + if (c == '/' || c == 0) {
> > + if (illChar)
> > + continue;
> > + illChar = 1;
> > + needsCRC = 1;
> > + c = ILLEGAL_CHAR_MARK;
> > + } else {
> > + illChar = 0;
> > + }
> > + }
> > +
> > len = conv_f(c, &str_o[str_o_len], str_max_len - str_o_len);
> > /* Valid character? */
> > - if (len >= 0)
> > + if (len >= 0) {
> > str_o_len += len;
> > - else
> > + } else {
> > str_o[str_o_len++] = '?';
> > + needsCRC = 1;
> > + }
>
> Umm, can you try factoring out body of this loop into a helper function
> and using it both during extension parsing and full name parsing?

I'll try to think about it. Although I'm not sure if it could become
a little overcomplicated since that helper could require too much internals
of this calling function.

> Also I think that checking whether the name is '.' or '..' could be moved
> just into the if (translate) below where you could just directly check it
like:
> if (str_o_len <= 2 && str_o[0] == '.' &&
> (str_o_len == 1 || str_o[1] == '.'))

Yes, you are right, it will be simpler.
I'll do it (and get rid of firstDots) in version 3 of the patch.

I'll also make all your suggested changes with empty lines and += u_ch.

Thanks.

Best regards,
Andrew