Re: [PATCH v7 1/8] unicode: Add utf8_casefold_iter
From: Gabriel Krisman Bertazi
Date: Mon Feb 17 2020 - 14:02:23 EST
Daniel Rosenberg <drosen@xxxxxxxxxx> writes:
> On Tue, Feb 11, 2020 at 7:38 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>>
>> Indirect function calls are expensive these days for various reasons, including
>> Spectre mitigations and CFI. Are you sure it's okay from a performance
>> perspective to make an indirect call for every byte of the pathname?
>>
>> > +typedef int (*utf8_itr_actor_t)(struct utf8_itr_context *, int byte, int pos);
>>
>> The byte argument probably should be 'u8', to avoid confusion about whether it's
>> a byte or a Unicode codepoint.
>>
Just for the record, we use int in utf8byte because it can fail with
error codes, but that is not the case here. It should be u8.
>
> Gabriel, what do you think here? I could change it to either exposing
> the things necessary to do the hashing in libfs, or instead of the
> general purpose iterator, just have a hash function inside of unicode
> that will compute the hash given a seed value.
Sorry for the delay, I'm away on a long vacation and intentionally
staying away from my laptop :)
Eric has a very good point: even if it is not prohibitively expensive,
an indirect call per byte is unnecessarily expensive for a hot path.
Why not expose utf8ncursor and utf8byte to libfs and implement the hash
in libfs?
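
To illustrate the shape I have in mind, here is a rough userspace sketch,
not kernel code. Everything named demo_* is hypothetical: demo_cursor and
demo_byte only mimic the cursor-plus-next-byte pattern, ASCII lowercasing
stands in for real casefolding, and FNV-1a stands in for whatever hash
libfs would actually use. The point is only that the caller pulls bytes
through direct calls in a plain loop instead of being called back once
per byte through a function pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cursor over a length-bounded name. */
struct demo_cursor {
	const unsigned char *s;
	size_t len;
	size_t pos;
};

static void demo_cursor_init(struct demo_cursor *c, const char *s, size_t len)
{
	c->s = (const unsigned char *)s;
	c->len = len;
	c->pos = 0;
}

/*
 * Returns the next folded byte (> 0), 0 at end of input, or -1 on error.
 * The int return type is what leaves room for the error value.
 * Assumptions for this sketch: ASCII lowercasing replaces real Unicode
 * casefolding, and an embedded NUL byte is treated as invalid input.
 */
static int demo_byte(struct demo_cursor *c)
{
	unsigned char b;

	if (c->pos >= c->len)
		return 0;
	b = c->s[c->pos++];
	if (b == 0)
		return -1;
	if (b >= 'A' && b <= 'Z')
		b += 'a' - 'A';
	return b;
}

/*
 * Hash the folded name with seeded 32-bit FNV-1a (a stand-in hash).
 * Every per-byte step is a direct call; no function pointer is involved.
 * Returns 0 and stores the hash in *out, or -1 on invalid input.
 */
static int demo_casefold_hash(const char *name, size_t len, uint32_t seed,
			      uint32_t *out)
{
	struct demo_cursor cur;
	uint32_t h = 2166136261u ^ seed;
	int c;

	demo_cursor_init(&cur, name, len);
	while ((c = demo_byte(&cur)) > 0) {
		h ^= (uint8_t)c;
		h *= 16777619u;
	}
	if (c < 0)
		return -1;
	*out = h;
	return 0;
}
```

With this, "README" and "readme" hash to the same value for a given seed,
and the caller sees invalid input as a plain -1 return rather than through
an actor callback.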
--
Gabriel Krisman Bertazi