Sunday, 10 March 2013

Re: [gccsdk] unixlib directory iteration doesn't work on Fat32FS

In message <53021f2a53.Matthew@sinenomine.freeserve.co.uk>
on 10 Mar 2013 Matthew Phillips wrote:

> Still not having looked at the code, my impression is that the routine in
> question is returning a single 32-bit word to *its* caller which attempts,
> in one word, to keep the position in the directory and the subdirectory.
> See the start of the thread where Alan quoted from the code:
>
> /* After preaching the passion, now the hack : when we're going to do
> the reverse suffix swapping, we will have up to two outstanding OS_GBPB
> sessions : one in the main dir and another in the suffix dir.
> How we're going to give an 'off_t' result in telldir describing these
> two offsets in OS_GBPB ? Well, we are going to rely on unity monotonic
> increase of the offsets and this up to GBPB_MAX_ENUM. If these
> conditions are not fullfilled, we stop the enumeration.
> Another solution is to return table index numbers which, when presented
> to seekdir, will get looked up in a table giving the two internal dir
> offsets. */
>
> How exactly it relies on unity monotonic increasing, I don't know. Perhaps
> I'd better download unix/dirent.c to see.

OK, take a look at the code here:

http://www.riscos.info/websvn/filedetails.php?repname=gccsdk&path=%2Ftrunk%2Fgcc4%2Frecipe%2Ffiles%2Fgcc%2Flibunixlib%2Funix%2Fdirent.c

The problems come from the way the standard Unix functions telldir and
seekdir are implemented. These rely on the use of a long int to record the
position we are up to in traversing the directory. Unixlib is pretending
that the files contained in subdirectories named "c", "h", "o" etc. are
actually files in the main directory with suffixes instead, so if you had a
directory containing (RISC OS style)

a_file
c
c.source
c.source2
h
h.header

it would return filenames (Unix style) of

a_file
source.c
source2.c
header.h
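
To make the renaming concrete, here is a minimal sketch of the mapping for one entry. The helper name swap_suffix is made up for illustration and is not taken from dirent.c; the real code has to do this while enumerating, which is where the trouble starts.

```c
#include <stdio.h>

/* Hypothetical helper, not UnixLib's actual code: combine the name of a
   suffix directory ("c") and a leaf name found inside it ("source") into
   the Unix-style name "source.c".
   Returns 0 on success, -1 if the output buffer is too small.  */
static int
swap_suffix (const char *suffix_dir, const char *leaf,
             char *out, size_t out_size)
{
  int n = snprintf (out, out_size, "%s.%s", leaf, suffix_dir);

  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

Calling swap_suffix ("c", "source", buf, sizeof buf) leaves "source.c" in buf, matching the listing above.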

The telldir function has to return a long int which says where we currently
are in reading the directory and these special subdirectories. The seekdir
function has to take that value and turn it back into a GBPB position in the
directory and in the corresponding subdirectory if we are in the middle of
one.

I think the difficulty mainly comes from the OS_GBPB call in readdir_r
reading information on several files at once. If it encounters a suitable
subdirectory part way through the batch it has read, it cannot know what
value of R4 would take it straight back to that entry unless it assumes that
R4 increases by one for each entry. It needs to be able to get back to that
entry to recover the subdirectory name, because otherwise it doesn't know
which subdirectory it's getting the files from.

I've not scoured the source thoroughly, but the only bit of arithmetic on R4
values I can see is in telldir:

long int
telldir (DIR *stream)
{
  if (!__validdir (stream))
    return -1;

  if (stream->suffix && stream->suffix->dd_off != GBPB_END_ENUM)
    return (stream->dd_off - 1) + (stream->suffix->dd_off << 16);

  return stream->dd_off + (stream->dd_suf_off << 16);
}

If OS_GBPB 10 were used to read one file at a time, it would be possible to
alter the code to store the proper R4 values and avoid this hack. But this
might be rather slower, and I am not sure whether the difference would be
noticeable.

I would need to read a bit more about how these Unix functions are supposed
to behave before being able to suggest a solution.

The code also relies on R4 return values from OS_GBPB 10 being in the range
-1 to 65535, as it packs two of them, 16 bits each, into a single word. That
should not be a problem for FileCore filing systems or for FAT32FS, but it is
not true in general.
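
As a rough illustration of that packing and of where the 16-bit limit comes from, here is a sketch with an explicit range check. The function names and the use of uint32_t are my own assumptions rather than what dirent.c does, and the -1 end-of-enumeration value is deliberately left out.

```c
#include <stdint.h>

#define OFFSET_MAX 0xFFFFu   /* largest offset that fits in 16 bits */

/* Hypothetical sketch: pack a main-directory offset (low 16 bits) and a
   suffix-directory offset (high 16 bits) into one 32-bit word.
   Returns 0 on success, -1 if either offset does not fit.  */
static int
pack_offsets (uint32_t main_off, uint32_t suffix_off, uint32_t *packed)
{
  if (main_off > OFFSET_MAX || suffix_off > OFFSET_MAX)
    return -1;

  *packed = main_off | (suffix_off << 16);
  return 0;
}

/* Reverse of pack_offsets: split the word back into its two offsets.  */
static void
unpack_offsets (uint32_t packed, uint32_t *main_off, uint32_t *suffix_off)
{
  *main_off = packed & OFFSET_MAX;
  *suffix_off = packed >> 16;
}
```

A filing system which hands back R4 values above 65535 would make pack_offsets fail here; in the existing code there is no such check, so the two halves would simply run into each other.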

The implementation does check that R4 values are going up by one per
directory entry, and if it finds this not to be the case, enumeration is
stopped to be on the safe side. Removing this check, which can be found in
this block of code

    /* Check if we have a monotonic unit increase OS_GBPB
        offset, if not, we don't do anything.  */
    if (stream->gbpb_off + regs[3] != regs[4])
      {
        stream->gbpb_off = stream->dd_off = GBPB_END_ENUM;
        return 0;
      }

could well cure the problem as far as FAT32FS goes, because that filing
system copes with R4 being stepped back by one, but in general it is a
useful safety check which needs to be kept while the functions are
implemented in this manner.
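
For anyone who wants to see the test in isolation, it amounts to the following. This is only a sketch: the function name is invented, I am assuming regs[3] holds the number of entries read and regs[4] the next offset on exit from OS_GBPB 10, and the -1 end-of-enumeration value is ignored.

```c
#include <stdbool.h>

/* Hypothetical restatement of the check in readdir_r.
   prev_off     - offset passed to OS_GBPB in R4
   entries_read - number of entries returned in R3
   next_off     - offset handed back in R4
   Returns true when the offset went up by exactly one per entry read.  */
static bool
offsets_advance_by_one (int prev_off, int entries_read, int next_off)
{
  return prev_off + entries_read == next_off;
}
```

A filing system which returns, say, byte positions rather than entry counts in R4 would fail this test on the first batch, and the enumeration would stop there.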

--
Matthew Phillips
Durham

_______________________________________________
GCCSDK mailing list gcc@gccsdk.riscos.info
Bugzilla: http://www.riscos.info/bugzilla/index.cgi
List Info: http://www.riscos.info/mailman/listinfo/gcc
Main Page: http://www.riscos.info/index.php/GCCSDK
