Storing data in case insensitive format #232

Open
mobedoor opened this issue Aug 25, 2024 · 17 comments
Labels
enhancement New feature or request fixready

@mobedoor

Hello. I use dwarfs to store Windows games and use a script to mount them via FUSE and run them via wine. As Windows does not care about case sensitivity, game devs occasionally use the incorrect case, which causes no problem on Windows but causes issues on Linux. To that end, I would like to request an option to store files in dwarfs in a case insensitive format.
My current (unsuccessful) attempt to get it working:

  1. Store game in dwarfs
  2. Mount dwarfs to folder A
  3. Mount folder A via cicpoffs to Folder B
  4. Mount the folder via fuse-overlayfs lowerdir="Folder B" -o upperdir=overlay -o workdir=work game-root
  5. Failure
@mhx
Owner

mhx commented Aug 25, 2024

> 5. Failure

This fails why exactly?

You can't really store the files case-insensitive. What you rather want is case-insensitive lookup when accessing the data. This could definitely be an option to the FUSE driver, but I wouldn't mind if there was an off-the-shelf solution for the problem.
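
To illustrate the distinction between storing names case-insensitively and looking them up case-insensitively, here is a minimal Python sketch. It is not DwarFS code: `resolve_case_insensitive` is a hypothetical helper that walks an ordinary directory tree and matches each path component by folded case, while the names on disk keep their original spelling.

```python
import os


def resolve_case_insensitive(root, relpath):
    """Resolve relpath under root, matching each component ignoring case.

    Returns the real on-disk path, or None if a component has no match.
    Hypothetical helper for illustration only; not part of DwarFS.
    """
    current = root
    for part in relpath.split("/"):
        if not part:
            continue
        if not os.path.isdir(current):
            return None
        wanted = part.casefold()
        # Names keep their stored case; only the comparison is folded.
        matches = [n for n in os.listdir(current) if n.casefold() == wanted]
        if not matches:
            return None
        # Prefer an exact match when several names differ only by case.
        chosen = part if part in matches else sorted(matches)[0]
        current = os.path.join(current, chosen)
    return current
```

A driver-level option would presumably do the equivalent work inside its own lookup path rather than listing directories on every access; the sketch only shows where the folding happens.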

@mhx
Owner

mhx commented Aug 25, 2024

@diablo00001 either you provide some context on the incredibly fishy looking link, or I'm going to report/delete your comments.

Repository owner deleted a comment Aug 25, 2024
@mhx
Owner

mhx commented Aug 25, 2024

Also, we certainly don't need that link twice.

Repository owner deleted a comment Aug 25, 2024
Repository owner deleted a comment Aug 25, 2024
@mhx
Owner

mhx commented Aug 25, 2024

Deleted comments and reported @diablo00001.

@mobedoor
Author

> You can't really store the files case-insensitive. What you rather want is case-insensitive lookup when accessing the data. This could definitely be an option to the FUSE driver, but I wouldn't mind if there was an off-the-shelf solution for the problem.

I see. Thank you for clarifying.

> This fails why exactly?

I'm unsure why it is failing at step 5; perhaps cicpoffs does not play well with fuse-overlayfs? It certainly works fine on its own. If I follow steps 1 through 3 and then launch the game directly, it works fine.
On further testing, cicpoffs seems to be crashing even without fuse-overlayfs. I'll try testing with ciopfs and report back.

@mhx
Owner

mhx commented Aug 25, 2024

I did a quick proof-of-concept to make sure there isn't a problem with the low-level FUSE API that DwarFS uses.

┌──[mhx@gandalf] ✔ [dwarfs/build-clang] (mhx/work|✚1)
└─% cat tmp/debug/perl-5.6.0/lib/5.6.0/pod/perlintern.pod | head
=head1 NAME

perlintern - autogenerated documentation of purely B<internal> 
		Perl functions

=head1 DESCRIPTION

This file is the autogenerated documentation of functions in the 
Perl intrepreter that are documented using Perl's internal documentation
format but are not marked as part of the Perl API. In other words, 
┌──[mhx@gandalf] ✔ [dwarfs/build-clang] (mhx/work|✚1)
└─% cat tmp/debug/perl-5.6.0/LIB/5.6.0/POD/PERLINTERN.pod | head
=head1 NAME

perlintern - autogenerated documentation of purely B<internal> 
		Perl functions

=head1 DESCRIPTION

This file is the autogenerated documentation of functions in the 
Perl intrepreter that are documented using Perl's internal documentation
format but are not marked as part of the Perl API. In other words, 

So this is entirely feasible.

@mobedoor
Author

On testing, ciopfs works. However, the following requirement

> All filenames in the data directory which aren’t all lower case are ignored

makes its utility rather limited for me, because it adds the need to preprocess all files to lowercase.
If possible, and if it does not introduce an unnecessary burden, I would appreciate dwarfs offering the feature.
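
For context, the preprocessing step that ciopfs would require can be sketched roughly as below. This is an illustration only: `plan_lowercase_renames` is a hypothetical helper, and it assumes Python's `str.lower()` is a close enough stand-in for what ciopfs treats as "all lower case". It only plans the renames and reports names that would collide; it changes nothing on disk.

```python
import os


def plan_lowercase_renames(root):
    """Plan the renames needed to make every name under root lower case.

    Returns (renames, collisions):
      renames    - list of (old_path, new_path), deepest entries first, so a
                   directory is renamed only after everything inside it
      collisions - paths that two differently-cased names would both map to
    """
    renames, collisions = [], []
    # topdown=False yields a directory's contents before the directory itself.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        seen = {}
        for name in dirnames + filenames:
            lowered = name.lower()
            if lowered in seen and seen[lowered] != name:
                collisions.append(os.path.join(dirpath, lowered))
            seen.setdefault(lowered, name)
            if lowered != name:
                renames.append(
                    (os.path.join(dirpath, name), os.path.join(dirpath, lowered))
                )
    return renames, collisions
```

Applying the plan would mean calling `os.rename` on each pair in order, and only if `collisions` is empty. Having to do this for every game before archiving it is the burden described above.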

@mhx
Owner

mhx commented Aug 25, 2024

> On testing, ciopfs works, however the following requirement
>
> > All filenames in the data directory which aren’t all lower case are ignored
>
> Makes its utility rather limited for me because of the added requirement of preprocessing all files to lowercase. If possible and not introducing unnecessary burden, I would appreciate dwarfs offering the feature.

So this could also be a bug in cicpoffs considering it works with ciopfs?

Have you tried swapping (3) and (4)? I don't see why the order would matter (on the contrary, if e.g. the Windows code wrote to ABC.txt and then tried to read abc.txt from the overlay).
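
One reading of the ABC.txt example is that the case-insensitive layer should sit on top of the overlay, so that newly written files are matched too. The toy Python model below sketches that reasoning. The classes `Dir`, `Overlay` and `CaseInsensitive` are invented for illustration and use in-memory dictionaries; they do not reproduce the behaviour of cicpoffs, ciopfs or fuse-overlayfs in detail.

```python
class Dir:
    """Case-sensitive store; stands in for a plain directory or the image."""

    def __init__(self, files=None):
        self.files = dict(files or {})

    def names(self):
        return list(self.files)

    def read(self, name):
        return self.files.get(name)

    def write(self, name, data):
        self.files[name] = data


class Overlay:
    """Reads try the upper layer first, then the lower; writes go to upper."""

    def __init__(self, lower, upper):
        self.lower, self.upper = lower, upper

    def names(self):
        return list(dict.fromkeys(self.upper.names() + self.lower.names()))

    def read(self, name):
        data = self.upper.read(name)
        return data if data is not None else self.lower.read(name)

    def write(self, name, data):
        self.upper.write(name, data)


class CaseInsensitive:
    """Maps a requested name onto an existing name that differs only by case."""

    def __init__(self, inner):
        self.inner = inner

    def _real_name(self, name):
        folded = name.casefold()
        for existing in self.inner.names():
            if existing.casefold() == folded:
                return existing
        return name

    def names(self):
        return self.inner.names()

    def read(self, name):
        return self.inner.read(self._real_name(name))

    def write(self, name, data):
        self.inner.write(self._real_name(name), data)


image = {"data.pak": "archived"}

# Order from the original steps: case-insensitive layer below the overlay.
below = Overlay(lower=CaseInsensitive(Dir(image)), upper=Dir())
below.write("ABC.txt", "saved")

# Swapped order: case-insensitive layer on top of the overlay.
on_top = CaseInsensitive(Overlay(lower=Dir(image), upper=Dir()))
on_top.write("ABC.txt", "saved")

print(below.read("DATA.PAK"), below.read("abc.txt"))    # archived None
print(on_top.read("DATA.PAK"), on_top.read("abc.txt"))  # archived saved
```

In this model both orders find the archived file under the wrong case, but only the swapped order finds the file that was written through the overlay.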

@mobedoor
Author

> So this could also be a bug in cicpoffs considering it works with ciopfs?

Yes. That seems to be the case.

> Have you tried swapping (3) and (4)? I don't see why the order would matter (on the contrary, if e.g. the Windows code wrote to ABC.txt and then tried to read abc.txt from the overlay).

For cicpoffs? It crashes. Ciopfs works fine in both cases.

@mhx
Owner

mhx commented Aug 25, 2024

Shame. Maybe worth opening an issue for cicpoffs?

I'm not opposed to adding this to DwarFS, but doing it properly will require a bit of work. It's unlikely going to happen in the next 1-2 months.

@mobedoor
Author

> Maybe worth opening an issue for cicpoffs?

Will do that.

> It's unlikely going to happen in the next 1-2 months.

Cheers. I have no problem waiting.

@xsmolasses

> I'm not opposed to adding this to DwarFS, but doing it properly will require a bit of work. It's unlikely going to happen in the next 1-2 months.

While the data is beautifully archived, dwarfs' case-sensitive lookup means that case-insensitive queries are bound to fail for a lot of assets and software under Windows.

DIR /B "\\BD\x\BDMV\PLAYLIST\00001.mpls" "\\BD\y\BDMV\PLAYLIST\00001.mpls"
00001.mpls
00001.mpls

FC /B "\\BD\x\BDMV\PLAYLIST\00001.mpls" "\\BD\y\BDMV\PLAYLIST\00001.mpls"
FC: cannot open \\BD\x\BDMV\PLAYLIST\00001.MPLS - No such file or folder

DIR /B "X:\BDMV\PLAYLIST\00001.mpls" "Y:\BDMV\PLAYLIST\00001.mpls"
00001.mpls
00001.mpls

FC /B "X:\BDMV\PLAYLIST\00001.mpls" "Y:\BDMV\PLAYLIST\00001.mpls"
FC: cannot open X:\BDMV\PLAYLIST\00001.MPLS - No such file or folder

I tried wrapping with ntptfs:
"The parameter is incorrect."

By the way, when providing a UNC VolumePrefix, please change <mountpoint> to be optional.
But may not be possible given the WinFsp-FUSE abstraction? (Maybe asking for dirty hack.)

Silly being limited to < 26 mounts - and where drive letters are needed for surprise pnp.

@mhx mhx self-assigned this Oct 18, 2024
@mhx mhx added the enhancement New feature or request label Oct 18, 2024
@mhx
Owner

mhx commented Oct 19, 2024

> By the way, when providing a UNC VolumePrefix, please change <mountpoint> to be optional. But may not be possible given the WinFsp-FUSE abstraction? (Maybe asking for dirty hack.)
>
> Silly being limited to < 26 mounts - and where drive letters are needed for surprise pnp.

I have no idea what you're talking about. Please assume that I know nothing about Windows.

@xsmolasses

> > By the way, when providing a UNC VolumePrefix, please change <mountpoint> to be optional. But may not be possible given the WinFsp-FUSE abstraction? (Maybe asking for dirty hack.)
> > Silly being limited to < 26 mounts - and where drive letters are needed for surprise pnp.
>
> I have no idea what you're talking about. Please assume that I know nothing about Windows.

My lousy wording, in a nutshell: Windows has always used drive letter mappings A:\ to Z:\ for storage devices and whatnot.

(Compare usage of this old DOS style versus the naming freedom of UNC paths in my provided example where path case insensitivity is shown to be a requisite on Windows; given an instance where a system command line program, FC.exe, turns its arguments uppercase for some inane reason.)

Users lose a drive letter assigned to each mount, whether from virtual volume or surprise insertion of physical media, and we cannot mount more than there are letters in the alphabet.

WinFsp insists that this old convention is followed, and responsibility for said drive letter mappings corresponds to the mountpoint parameter.

(An aside, Windows’ file manager gui, Explorer, for some reason cannot reach the unfettered alternative \UNC\paths probably trying to hand it off to SMB networking, I don't know - may be good reason for the questionable decision to force the creation of drive letters; a question that I’d put to the WinFsp developer.)

And because the WinFsp-FUSE layer abstracts the call to FspFileSystemSetMountPoint out of sight, I'm guessing it is not even exposed to you, so you cannot make it conditional. Somewhere down the line a drive letter is automatically allotted whether we like it or not. Rclone also has this issue.

Sorry to bother a wizard such as yourself if this turns out to be out of your hands. In that case, maybe it falls to me to hack together a proxy WinFsp library that bypasses the forced drive letter creation for all filesystems employing the WinFsp-FUSE layer.

@mhx
Owner

mhx commented Oct 19, 2024

> Users lose a drive letter assigned to each mount, whether from virtual volume or surprise insertion of physical media, and we cannot mount more than there are letters in the alphabet.
>
> WinFsp insists that this old convention is followed, and responsibility for said drive letter mappings corresponds to the mountpoint parameter.

Is that really the case? If I do

C:\Users\mhx>dwarfs image.dwarfs mnt -d

image.dwarfs will be accessible at C:\Users\mhx\mnt and (at least as far as I can tell) no drive letter will be consumed for this.

I may be totally wrong (as I said, I know almost nothing about Windows). If the drive letter is somehow consumed "invisibly" by something WinFsp does under the hood, this issue must indeed be addressed by WinFsp.

@xsmolasses

dwarfs.exe "image.dwarfs" "C:\mnt\test10"
fsptool-x64.exe lsvol
-   \Device\Volume{2fdd5cda-8e76-11ef-a7ee-00e04c682878}

The - denotes that no drive letter was taken. I forgot mount points could be used as a substitute.

I recall some 3rd-party compatibility [or preference] reason I didn't resort to them.

And it's either-or: I can't seem to use them in combination with --VolumePrefix=\UNC\test

With regards to native WinFsp ntptfs, commenting out FspFileSystemSetMountPoint (and related cmd line args parsing) was sufficient to avoid necessity of mountpoint altogether.

Whereas in order to use UNC prefix with a WinFsp-FUSE wrapper, my typical mount line has been & continues to be a wildcard:

dwarfs.exe "image.dwarfs" * --VolumePrefix=\UNC\test
fsptool-x64.exe lsvol
S:  \Device\Volume{5a8248ad-8e7a-11ef-a7ee-00e04c682878}\UNC\test

(I wonder where least system overhead might rest but leave that for another occasion.)

We can rest this by-the-way issue erm by the wayside, too, as yes, up to WinFsp guy.

Thanks for your input and making the most-excellent DwarFS project [available to all]!

@mhx
Owner

mhx commented Nov 18, 2024

Implemented in d6b620b. You can check if this works for your use case using the artifacts from the CI pipeline.

$ dwarfs perl.dwarfs tmp -ocase_insensitive

$ ll tmp 
drwxr-xr-x@ - mhx users 16 Nov  2020 debug
drwxr-xr-x@ - mhx users 23 Nov  2020 debugthread
drwxr-xr-x@ - mhx users 24 Nov  2020 default
drwxr-xr-x@ - mhx users 24 Nov  2020 thread
drwxr-xr-x@ - mhx users 24 Nov  2020 thread5005

$ ll tmp/DeBuGtHrEaD/perl5.005/bin/perl 
.rwxr-xr-x@ 1,111,000 mhx users 24 Nov  2020 tmp/DeBuGtHrEaD/perl5.005/bin/perl

This works on all platforms, not just Windows. Not sure, but that might also help with running stuff under wine.

This should work with pretty much any file name, e.g. even a name like DÃTâScïÊNcË will be matched if the requested file is DãtÂScïêNcË.
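
As a rough illustration of why those two names can match, here is a minimal Python sketch of caseless comparison using Unicode normalization plus case folding. It is not the DwarFS implementation, whose folding rules may differ; `fold_name` and `build_index` are hypothetical names chosen for this example.

```python
import unicodedata


def fold_name(name):
    """Key under which names that differ only by case compare equal."""
    # Normalize first so composed and decomposed accents fold the same way.
    return unicodedata.normalize("NFC", name).casefold()


def build_index(names):
    """Map folded key -> stored name for the entries of one directory.

    If two stored names fold to the same key, the later one wins here;
    a real implementation would need a policy for that case.
    """
    return {fold_name(n): n for n in names}


index = build_index(["DÃTâScïÊNcË", "perl"])
print(index.get(fold_name("DãtÂScïêNcË")))  # DÃTâScïÊNcË
```

The stored name keeps its original spelling; only the lookup key is folded.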

4 participants
@mhx @mobedoor @xsmolasses and others