File objects for resource forks #3

Open
ajnelson-nist opened this issue Jan 19, 2016 · 12 comments

Comments

@ajnelson-nist

Per discussions at CurateGear, it would be helpful for hfs2dfxml to separate out the forks of files. I think it makes the most sense to create a separate <fileobject> for each fork, but I'm not sure what the most correct naming scheme would be. For a file named foo, ._foo seems to be the resource fork bundled with file metadata, according to the citations in the compatibility section of the HFS Wikipedia page. I don't recall from my grade school years whether macunpack's current practice of naming resource forks foo.rsrc is the most widely adopted one.

@dd388
Collaborator

dd388 commented Jan 21, 2016

I agree that resource forks should be added. For what it's worth, the spreadsheet view on FTK Imager for an HFS formatted image looks something like this. (Here, filename1 and filename3 have resource forks but filename2 does not.)

path\filename1\
path\filename1\rsrc
path\filename2
path\filename3\
path\filename3\rsrc

If we followed that convention, the resource fork's <fileobject> would be named rsrc and be a child of the file's <fileobject>. That matches what's happening behind the scenes, too. See: http://xahlee.info/UnixResource_dir/macosx.html, question: "How to use the command line to find out if a file has resource fork?"

That would mean something like this in DFXML:

<fileobject>
<filename>:filename1</filename>
...
  <fileobject>
  <filename>:filename1:rsrc</filename>
  </fileobject>
</fileobject>

Though that raises the question of how to map the other fileobject properties to a resource fork fileobject. I'll have to create some test output and see how it validates.

@ajnelson-nist
Author

On nesting:

At the moment, the DFXML language does not support nesting a fileobject within a fileobject. Per the fileobject type definition, a fileobject is not within the enumerated list of supported child elements.

The way to make a fileobject denote that it is stored within some parent object (usually done with directories and their files, or files derived from a zip archive) is to have the child store a parent_object reference. For now, that reference is by inode number. If that's insufficient in some case, it'd be worth filing a bug against the DFXML schema to expand the parent reference definition.
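
To make the flat layout concrete, here is a minimal Python sketch of two sibling fileobjects linked by a parent_object reference instead of nesting. It is not taken from hfs2dfxml: the helper name make_fileobject and the inode value 16 are hypothetical, the fork is given no inode of its own because the thread does not settle one, and namespace handling and schema element order are not checked here.

```python
import xml.etree.ElementTree as ET


def make_fileobject(filename, inode=None, parent_inode=None):
    """Build one flat <fileobject>; a fileobject is never nested in another."""
    fo = ET.Element("fileobject")
    if parent_inode is not None:
        # The parent is referenced by inode number, not by containment.
        parent = ET.SubElement(fo, "parent_object")
        ET.SubElement(parent, "inode").text = str(parent_inode)
    ET.SubElement(fo, "filename").text = filename
    if inode is not None:
        ET.SubElement(fo, "inode").text = str(inode)
    return fo


root = ET.Element("dfxml")
# Hypothetical inode number for the file itself.
root.append(make_fileobject(":filename1", inode=16))
# The fork is a sibling element that points back at its file by inode.
root.append(make_fileobject(":filename1:rsrc", parent_inode=16))

print(ET.tostring(root, encoding="unicode"))
```

Both fileobjects sit directly under the root element, so the relationship between a file and its fork is carried only by the parent_object reference and, informally, by the filename.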

On naming convention:

You've supplied two references for naming conventions, which themselves supply two conventions. The first, from FTK and OS X before 10.7 (per the Xahlee page), names the resource fork of foo as foo/rsrc. The second, from OS X 10.7 and later, uses foo/..namedfork/rsrc per Xahlee, though I have tried and failed to verify their fork detection and creation notes in an operating system newer than 10.7.

I'm not sure what I'd personally want the resource fork name to be. However, on reviewing Matt Deatherage's notes on the ._ convention, ._ is definitely the wrong answer. I see a decent historical-context argument for foo/rsrc. I see a decent "Current practice" argument for foo/..namedfork/rsrc.

@dd388
Collaborator

dd388 commented Jan 26, 2016

Yes, right, you can't nest a fileobject in another fileobject.

It sounds like we need the following properties for a resource fork:

  1. name/path
  2. size
  3. byte runs
  4. MD5, SHA1 hash

Is there anything I'm missing?

That's enough to support it being its own fileobject and then its parent can be inferred by filename (more on that in a second) and explicitly referenced using parent_object. For now, I don't see an issue with using the inode for that reference.

My preference now is to use filename:rsrc to refer to the resource fork for HFS-formatted disks because it matches what I remember seeing when mounting an HFS disk in OS X 10.2 (with Classic Environment). I'll have to dig up my notes to verify, but that seems to align with what I remember.

Will that work?

@ajnelson-nist
Author

On properties:

Another property for a resource fork: There is an enumeration in the DFXML schema for the "Type" of a file. This way-too-overloaded term here would mean, "What's the type of file system object that resides at this name?" (The xs:documentation element could probably do with a tweak.) The enumeration does not contain something that would properly denote an HFS-specific resource fork. There's room to add one, since Solaris and OpenBSD got their own entries, but most of the key letters (particularly r and f) are already taken. There's also probably some legacy SleuthKit reason those are just one character long.

For now, please include a name_type of "-" for resource forks, and make sure the generated DFXML denotes that it's version 1.1.1.
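
As a rough illustration of the properties discussed so far (name, size, byte runs, MD5 and SHA1 hashes, a name_type of "-", and a document version of 1.1.1), the following Python sketch assembles one resource fork fileobject. Everything specific in it is an assumption made for illustration: the function name, the single contiguous byte run, the img_offset of 4096, the parent inode of 20, and the placeholder fork contents. It also ignores XML namespaces and has not been validated against the DFXML schema.

```python
import hashlib
import xml.etree.ElementTree as ET


def resource_fork_fileobject(filename, fork_bytes, img_offset, parent_inode):
    """Describe a resource fork as its own fileobject (illustrative only)."""
    fo = ET.Element("fileobject")
    parent = ET.SubElement(fo, "parent_object")
    ET.SubElement(parent, "inode").text = str(parent_inode)
    ET.SubElement(fo, "filename").text = filename
    # "-" is the interim name_type requested for resource forks.
    ET.SubElement(fo, "name_type").text = "-"
    ET.SubElement(fo, "filesize").text = str(len(fork_bytes))
    # Assumption: the fork occupies one contiguous run in the image.
    runs = ET.SubElement(fo, "byte_runs")
    ET.SubElement(
        runs,
        "byte_run",
        {"img_offset": str(img_offset), "len": str(len(fork_bytes))},
    )
    for algorithm in ("md5", "sha1"):
        digest = hashlib.new(algorithm, fork_bytes).hexdigest()
        ET.SubElement(fo, "hashdigest", {"type": algorithm}).text = digest
    return fo


# The root element carries the version requested for the generated DFXML.
root = ET.Element("dfxml", {"version": "1.1.1"})
fork_bytes = b"\x00" * 3067  # placeholder contents, not real fork data
root.append(
    resource_fork_fileobject(":Shock in the Ear:rsrc", fork_bytes, 4096, 20)
)

print(ET.tostring(root, encoding="unicode"))
```

A fragmented fork would need one byte_run element per extent, which this sketch does not attempt.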

On names:

I would personally most prefer whatever was native. Could you post a screenshot somewhere of 10.2 Classic Environment, with a textual path of a resource fork? That'd be fine justification to settle the name.

@dd388
Collaborator

dd388 commented Jan 27, 2016

Sounds good. I'll try to get some sort of documentation of the representation of a resource fork to include in this thread.

@dd388
Collaborator

dd388 commented Feb 4, 2016

Some visual documentation of forks follows.

OS X 10.2, with Classic Environment, Power Mac G4
I loaded a CD with an HFS file system and listed directory contents for two files, one with a resource fork and one without. (Note: I sharpened the image, because this was a machine with a CRT monitor and no internet access, so taking a "screenshot" with my phone was tricky.)
[Image: sharper_section]

This shows that the file Shock in the Ear (an executable program) has zero bytes in its data fork and 3067 bytes in the resource fork. Access to the resource fork is through the path Shock in the Ear/rsrc. (The file AppleShare PDS has no resource fork and is shown for comparison. The path AppleShare PDS/rsrc exists but shows zero bytes.)

For verification, here is the same executable file listed through hfsutils.
[Image: hfsutils_confirmation]

In contrast, here is how the paths look on a recent Intel Mac Mini:
[Image: fork_recentmac]
This shows that the path Shock in the Ear/rsrc is not valid, but xattr shows that there is a resource fork attached to this file.

Look good? (Disk image used: Shock in the Ear)

@ajnelson-nist
Author

That is great reference material. I assume if you could take a picture of an OS 9 or earlier environment, you would see Shock in the Ear:rsrc.

I still feel it's the developer's call whether to go with foo/..namedfork/rsrc to name the fork and keep with current OS X conventions, or foo/rsrc to keep with period conventions. Personally, I'd lean towards the historical convention, but it's up to you. You also have the option of adding a flag to the command line to pick between the two.

@dd388
Collaborator

dd388 commented Feb 4, 2016

For completeness:
[Image: fork_newname]
(I just upgraded OS X today, which is why the Terminal has new styling, but it's the same Mac I did the other screencaps on.)

I'm tending towards the historical convention for HFS, but I like the idea of having a flag to pick between the two.

Alternatively, is there any way to have an alternate file name represented in DFXML? I can see the utility in this in other contexts, like a CD image with an ISO 9660 filesystem and Joliet extensions, which essentially gives you two names for the same file.

@ajnelson-nist
Author

Alternative file name sources are a possibility. Given enough use cases, we can design a sane interface for them.

As for the CD image example, I think it is better for that to be a separate fileobject. The data pointers would be the same, yes, but other metadata is not guaranteed to line up, and the locations of the metadata structures (the bytes comprising the directory entry and inode) are guaranteed not to line up.

@dd388
Collaborator

dd388 commented Feb 5, 2016

Given that, if I go with the current OS X convention of using filename/..namedfork/rsrc, that also implies that I'm using a slash as a delimiter. In the current version of the code, the path delimiter is the colon, since that was the convention in HFS. I'd rather not conflate the two, and so I think a reasonable path is as follows.

Default output is to use the colon as a path delimiter and name resource forks using the convention filename:rsrc.

Optionally, there can be a flag that will use current OS X conventions, and so the path delimiter will be a slash and the resource fork will be named filename/..namedfork/rsrc.

@ajnelson-nist
Author

This looks reasonable. Should the flag take a string argument naming the convention type, to allow for the intermediary convention filename/rsrc?

@dd388
Collaborator

dd388 commented Feb 26, 2016

I went back and forth on this, and originally I wasn't in support of the intermediary, but now I can start to see the rationale.

So there would be three options for output, corresponding to the following cases:
filename:rsrc - As the path would appear in pre-OS X systems
filename/rsrc - As the path would appear in OS X 10.0-10.6 systems
filename/..namedfork/rsrc - As the path would appear in OS X 10.7 and later systems

I would argue that since a subset of the second case (i.e., OS X 10.0-10.4) includes the Classic Environment layer, that would support the use case for having the intermediary convention.
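
One way the three output conventions could hang off a single string-valued flag is sketched below in Python. This is only an illustration under stated assumptions, not the hfs2dfxml implementation: the flag name --fork-naming, the convention keys hfs, osx-legacy, and osx-namedfork, and the helper names are all hypothetical, and the leading delimiter simply mirrors the :filename1 style used earlier in the thread.

```python
import argparse

# Hypothetical convention names, each mapped to
# (path delimiter, suffix that names the resource fork).
FORK_CONVENTIONS = {
    "hfs": (":", ":rsrc"),                        # pre-OS X
    "osx-legacy": ("/", "/rsrc"),                 # OS X 10.0-10.6
    "osx-namedfork": ("/", "/..namedfork/rsrc"),  # OS X 10.7 and later
}


def resource_fork_path(components, convention="hfs"):
    """Join path components and append the fork name for one convention."""
    delimiter, fork_suffix = FORK_CONVENTIONS[convention]
    return delimiter + delimiter.join(components) + fork_suffix


def build_parser():
    """Command-line parser with a hypothetical --fork-naming flag."""
    parser = argparse.ArgumentParser(description="Fork naming sketch")
    parser.add_argument(
        "--fork-naming",
        choices=sorted(FORK_CONVENTIONS),
        default="hfs",
        help="convention used for the delimiter and the resource fork name",
    )
    return parser


args = build_parser().parse_args(["--fork-naming", "osx-namedfork"])
print(resource_fork_path(["path", "Shock in the Ear"], args.fork_naming))
# prints: /path/Shock in the Ear/..namedfork/rsrc
```

Tying the delimiter and the fork suffix to one choice keeps the colon and slash conventions from being mixed in the same output.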

Hopefully in the next few weeks I can make this change to the code -- at the very least, I can add options for the path output formats listed above.

Related, is there something in the DFXML specification that I should use to denote non-standard path delimiters? I know you had mentioned the possibility of that a while back, but I wasn't sure where it stood now.
