Re: [PATCH 2/2] exfat: unify name extraction

From: Tetsuhiro Kohada
Date: Tue Aug 25 2020 - 06:15:50 EST



+ exfat_free_dentry_set(es, false);
+
+ if (!exfat_uniname_ncmp(sb,
+ p_uniname->name,
+ uni_name.name,
+ name_len)) {
+ /* set the last used position as hint */
+ hint_stat->clu = clu.dir;
+ hint_stat->eidx = dentry;

eidx and clu of hint_stat should have one for the next entry we'll
start looking for.
Did you intentionally change the concept?

Yes, this is intentional.
Essentially, the "Hint" concept is to reduce the next seek cost with
minimal cost.
There is a difference in the position of the hint, but the concept is the
same.
As you can see, the patched code strategy doesn't move from current
position.
Basically, the original code strategy is advancing only one dentry.(It's
the "minimum cost") However, when it reaches the cluster boundary, it gets
the next cluster and error handling.

I didn't get exactly what "original code" is.
Do you mean whole code lines for exfat_find_dir_entry()?
Or just only for handling the hint in it?

My intention is the latter.


The strategy of original code for hint is advancing not one dentry but one dentry_set.

That's the strategy as a whole code.
But all it does to get a hint after "found" is to advance one entry.
In the original code, the 'dentry' variable points to the end of the EntrySet when "found",
so it can point to the next EntrySet by simply advancing one entry.
(However, it may need to scan the cluster chain)


If a hint position is not moved to next like the patched code,
caller have to start at old dentry_set that could be already loaded on dentry cache.

Let's think the case of searching through all files sequentially.
The patched code should check twice per a file.

This is the case when all requests find the specified file, right?

Sure, the request will evaluate the same EntrySet as before found.
However, the cost to spend is different from the last time.
The current request looks for a different name than the last request.
In most cases, length and hash are different from the last EntrySet.
Therefore, the last EntrySet just skips dir-entries by num_ext.
There is no string comparison with ignores cases. <- This cost is high
The cost of skipping dir-entries is much less than the string comparison.

No better than the original policy.

In this patch, when "found", the 'dentry' variable still points to the beginning of the EntrySet.
In this case, I thought "stay here" was a very efficient hint at a minimal cost.
As a whole, I think that the cost has been reduced...

Getting the next cluster The error handling already exists at the end of
the while loop, so the code is duplicated.
These costs should be paid next time and are no longer the "minimum cost".

I agree with your words, "These costs should be paid next time".
If so, how about moving the cluster handling for a hint dentry to
the beginning of the function while keeping the original policy?

My first idea was
hint_stat->eidx = dentry + 1 + num_ext;

However, in the current hint, offset ((hint_stat->eidx) and cluster number (hint_stat->clu) in the directory are paired.
It was difficult to change only one of values.
So I'm trying to make a 'new hint' where the offset and cluster number aren't linked.


BTW, this patch is not related to the hint code.
I think it would be better to keep the original code in this patch and improve it with a separate patch.

I think so, too.
I'll try another patch.


BR
---
Tetsuhiro Kohada <kohada.t2@xxxxxxxxx>