DragonFly kernel List (threaded) for 2008-11
DragonFly BSD


From: "Dennis Melentyev" <dennis.melentyev@xxxxxxxxx>
Date: Sat, 15 Nov 2008 21:22:28 +0200

Hi Matt,

2008/11/15 Matthew Dillon <dillon@apollo.backplane.com>:
> :It might be a good idea to make a small survey, i.e. find
> :people who actually _do_ have directories with a huge
> :number of files in them (and I mean more than just a few
> :thousands), and ask them what the filenames typically look
> :like.
>    That is a very good idea.
> :An obvious improvement would be to store name[d-2] and
> :name[d-1] in y[] and z[], respectively, where d is the
> :location of the last dot in the filename, if any, or the
> :location of the terminating zero if there is no dot.
> :In other words:  Ignore the extension when identifying
> :y[] and z[].  Finding the last dot shouldn't be more
> :computationally expensive than strlen(name), so this
> :shouldn't be a problem.
> :
> :Best regards
> :   Oliver
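
For illustration, Oliver's suggestion above could be sketched roughly as
follows. The helper name pick_yz() is hypothetical, and the behaviour for
very short base names (using 0 when there are fewer than two characters
before the dot) is an assumption, since the proposal does not say what to
do in that case.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: choose the two characters just before the
 * extension.  d is the index of the last dot in the name, or the index
 * of the terminating zero if the name has no dot.
 */
static void
pick_yz(const char *name, unsigned char *y, unsigned char *z)
{
	const char *dot = strrchr(name, '.');
	size_t d = dot ? (size_t)(dot - name) : strlen(name);

	/* Assumption: fall back to 0 when the base name is too short. */
	*y = (d >= 2) ? (unsigned char)name[d - 2] : 0;
	*z = (d >= 1) ? (unsigned char)name[d - 1] : 0;
}
```

With this sketch, "file0042.txt" yields y = '4' and z = '2', so the
extension no longer dominates the last two characters.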
>    Another thing I was thinking about was dividing the filename
>    into four zones, and CRCing each zone.
>    The zones could be based on dashes and dots, and secondarily on
>    alpha-numeric transitions. If there are fewer than four zones
>    we would simply cut the pieces we do have down the middle, or into
>    quarters.  If there are more than four zones we would combine two
>    or more zones together to fit.
>    Here is an off-the-cuff structure:  Four zones, each zone CRC'd,
>    laid out using 16-bit CRCs for each zone ('d' is 15 bits so we
>    can set the LSB to zero to guarantee the iteration space).
>    aaaaaaaabbbbbbbb ccccccccdddddddd aaaaaaaabbbbbbbb ccccccccddddddd0
>    The problem with the zone idea is that it might not work too well
>    if the filenames have varying lengths... though now that I think about
>    it if the filename is otherwise unstructured (no dots, dashes, etc),
>    we could restrict zone A to the first 2-3 chars and zone D to the last
>    2-3 chars, and use zones B and C to split everything left in the middle.
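
As I read the layout quoted above, each zone's 16-bit CRC is split into
two bytes: one byte per zone in the upper 32 bits and one byte per zone
in the lower 32 bits, with the lowest bit of 'd' dropped. A minimal sketch
of just the packing step is below. The function name pack_zone_key() is
hypothetical, and putting the high byte of each CRC in the upper word is
my assumption; the mail does not say which byte goes where. How the zones
are chosen and which CRC is used are left out.

```c
#include <stdint.h>

/*
 * Hypothetical packing of four 16-bit zone CRCs into a 64-bit key:
 *
 *   aaaaaaaabbbbbbbb ccccccccdddddddd aaaaaaaabbbbbbbb ccccccccddddddd0
 *
 * Assumption: the upper 32 bits take the high byte of each CRC and the
 * lower 32 bits take the low byte.
 */
static uint64_t
pack_zone_key(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
	uint64_t key = 0;

	/* Upper 32 bits: high byte of each zone CRC. */
	key |= (uint64_t)(a >> 8) << 56;
	key |= (uint64_t)(b >> 8) << 48;
	key |= (uint64_t)(c >> 8) << 40;
	key |= (uint64_t)(d >> 8) << 32;

	/* Lower 32 bits: low byte of each zone CRC. */
	key |= (uint64_t)(a & 0xff) << 24;
	key |= (uint64_t)(b & 0xff) << 16;
	key |= (uint64_t)(c & 0xff) << 8;

	/* Zone 'd' keeps 15 bits; the LSB of the key is forced to zero. */
	key |= (uint64_t)(d & 0xfe);

	return key;
}
```

Because the LSB is always zero, every key produced this way is even, which
leaves the odd value next to it free for iteration.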

Please consider making this tunable in some way. No doubt you have a
great deal of experience, but I'm not sure you can anticipate every
possible situation, so this could be left to the administrator, who
really knows what he needs in each particular case.

Dennis Melentyev
