
[boot] Expand /bootopts to 1023 bytes #2100

Merged
merged 7 commits into from
Nov 14, 2024
Conversation

ghaerr
Owner

@ghaerr ghaerr commented Nov 13, 2024

Requested by @Mellvik some time ago.

This PR adds the ability to use a larger /bootopts file for system configuration, on both MINIX and FAT filesystems. This will allow for more text, explanations, or multiple scenarios to be saved. The only current caveat is that when /bootopts is larger than 512 bytes, both disk sectors must be contiguous. This should normally not be a problem, since /bootopts is now distributed as a file > 512 bytes using contiguous sectors, and editing or rewriting the file will keep the same sector numbers on FAT. MINIX uses 1K blocks, so this is not an issue there.

The larger /bootopts file was already almost working, as MINIX already reads two 512-byte disk sectors per block. While looking deeper into the implementation, some subtle bugs were found and fixed, noted below. In addition, several related features were added at the same time, including the release of all the low memory and kernel data segment memory used for /bootopts buffers back to main memory and the kernel heap respectively.

Features added:

  • /bootopts can now work with double its previous size, up to 1023 bytes.
  • The 1024 byte DEF_OPTSEG as well as 512 byte REL_INITSEG (setup.S data segment) are released for main memory use after /bootopts processing.
  • MINIX boot now only reads a single sector for the superblock instead of two.
  • Six new bytes are now available in the MINIX boot sector (8 total, up from 2), allowing for some future expansion.

Bugs fixed:

  • On PC-98 1232k floppies, which use 1K sectors, loading /bootopts actually overwrote REL_INITSEG during startup. This didn't surface as a problem despite the effectively random initialization, because setup.S reinitialized most SETUP_ variables on startup. If REL_INITSEG hadn't directly followed DEF_OPTSEG, this would have been a system crash.
  • The same problem existed for MINIX: the boot loader had been reading a full 1K block into the 512-byte DEF_OPTSEG buffer the whole time.

Caveats:

  • /bootopts must occupy contiguous sectors on FAT filesystems. This shouldn't be a problem since it's shipped contiguous.
  • SETUP_xxx defines can only be used during kernel initialization, as REL_INITSEG is released. Setup variables needing to be accessed during normal kernel operation must be copied during kernel init.

@Mellvik: Interestingly, all this work started as the result of playing with a cool macOS hex editor. I tried it out and happened to be looking at the hex dump image/fd1440.img and immediately noticed two "linux" strings in the boot_minix.o second boot sector payload. I figured the boot sector code size could be decreased, and that started the process of noticing exactly how /bootopts was being loaded, and the overwrite occurring at REL_INITSEG... The attention was still focused on the config.h DMASEG and REL_SYSSEG areas and down the rabbit hole I went :)

I highly recommend Hex Fiend: if you set the column width to 32, filesystem layouts become very apparent while scrolling through floppy disk images. Seeing the layout visually really helps in understanding what is going on.

@Mellvik
Contributor

Mellvik commented Nov 13, 2024

Thank you @ghaerr, this is great. You beat me to it, and I'm glad you did: The restructuring of bootopts-processing and the release of the segment after use - very nice!

And thanks for the hint about Hex Fiend! I used it way back and have since forgotten completely about it. Extremely powerful - and immediately applicable to the rabbit chase I'm in at the moment.

@ghaerr
Owner Author

ghaerr commented Nov 13, 2024

The restructuring of bootopts-processing and the release of the segment after use

I got a little ahead of myself trying to put all the bootopts processing variables into a struct, and reverted portions of the release commit. Some of them are used as argv/slen values for sys_exec when starting /bin/init and thus can't be released on their own. I've also determined that releasing the variables later from the idle task is unsafe. The final result works well: only the options[] and umb[] setup buffers are released, but that's a large part of the total - a little over 1K returned to the kernel data segment.

An unforeseen nice enhancement is that after kernel startup, but before /bin/init, adding the ~1K back to the kernel heap ends up being used for a bunch of smaller allocations, which otherwise would fragment the older, larger sections of the kernel heap. This can be seen below in the allocations occurring at kernel data segment location 3E12, which used to be the options parsing buffer:
[Screenshot (2024-11-13): meminfo output showing new small allocations at kernel data segment location 3E12, formerly the options parsing buffer]

@ghaerr ghaerr merged commit b8c9588 into master Nov 14, 2024
2 checks passed
@ghaerr ghaerr deleted the bootopts branch November 14, 2024 01:45
@Mellvik
Contributor

Mellvik commented Nov 14, 2024

An unforeseen nice enhancement is that after kernel startup, but before /bin/init, adding the ~1K back to the kernel heap ends up being used for a bunch of smaller allocations, which otherwise would fragment the older, larger sections of the kernel heap.

Really nice, @ghaerr. It will be interesting to see how this works out on TLVC where the small heap allocations are made from the top of the heap to avoid the very same fragmentation. My first thought is that it should be quite simple to fill the bottom segment first, then continue from the top. We'll see when we get there. Thanks.

@ghaerr
Owner Author

ghaerr commented Nov 14, 2024

it should be quite simple to fill the bottom segment first, then continue from the top.

The ELKS near heap allocator always uses best fit. Now, instead of having a single initial (very large) heap segment, the second heap_add of approximately 1050 bytes causes that segment to fill first (from low addresses). If the TLVC allocator has been modified to fill from high addresses first but still uses best fit, then it would seem to perform as you say, except likely fill from the high address of the small segment first, then the high addresses of the larger segment? Does TLVC still use best fit, or does it always place allocations less than a certain size at highest address instead?
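
The best-fit behavior described above can be demonstrated with a toy model. This is a minimal sketch, not the ELKS heap code: two free segments are tracked only by size, one large initial segment and a ~1050-byte segment added later (standing in for the returned /bootopts buffers). Best fit picks the fitting segment with the least free space, so small requests fill the small segment first:

```c
#include <stddef.h>

struct seg { size_t size; size_t used; };   /* toy free-segment record */

static struct seg segs[2] = {
    { 30000, 0 },   /* large initial heap segment */
    { 1050,  0 },   /* small segment added by a later heap_add */
};

/* return index of the best-fit segment for a request, or -1 if none fits */
static int best_fit(size_t want)
{
    int best = -1;
    size_t best_free = (size_t)-1;
    for (int i = 0; i < 2; i++) {
        size_t freesz = segs[i].size - segs[i].used;
        if (freesz >= want && freesz < best_free) {
            best_free = freesz;
            best = i;
        }
    }
    return best;
}

/* allocate from the best-fit segment; returns the segment index used */
static int heap_alloc(size_t want)
{
    int i = best_fit(want);
    if (i >= 0)
        segs[i].used += want;
    return i;
}
```

Small allocations land in segment 1 until it is nearly full, then spill over to segment 0, mirroring the behavior seen at location 3E12 in the screenshot.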

Given this interesting behavior, it's on my list to try some other arena-size experimentation, perhaps allocating the initial heap into two or three arena sizes from the start. The real test will be to look at fragmentation after quite a few processes have run.

@Mellvik
Contributor

Mellvik commented Nov 14, 2024

The TLVC allocator https://github.com/Mellvik/TLVC/pull/36 was modified to check for SEG descriptor allocations and allocate those from the top since they are small, dynamic and tend to cause fragmentation. That has worked out really well. Best fit still rules, so if there is space, a small TTY buffer allocation will end up in the middle of the SEG descriptors, but that's fine.
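
The scheme described (small, churn-prone SEG descriptors carved from the top of the arena while ordinary allocations grow from the bottom) can be sketched as follows. This is illustrative only, not the actual TLVC allocator; the arena size and descriptor size are made-up values:

```c
#include <stddef.h>

#define ARENA_SIZE    4096
#define SEG_DESC_SIZE 16     /* matches the 16-byte SEG entries in meminfo */

static unsigned char arena[ARENA_SIZE];
static size_t bottom = 0;            /* next ordinary allocation offset */
static size_t top = ARENA_SIZE;      /* next descriptor goes below this */

/* ordinary allocations grow upward from the bottom of the arena */
static void *alloc_bottom(size_t n)
{
    if (bottom + n > top) return NULL;       /* would collide with top */
    void *p = &arena[bottom];
    bottom += n;
    return p;
}

/* SEG descriptors are always taken from the top, keeping their
 * churn away from the large free region in the middle */
static void *alloc_seg_desc(void)
{
    if (top - SEG_DESC_SIZE < bottom) return NULL;
    top -= SEG_DESC_SIZE;
    return &arena[top];
}
```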

Changing this to take advantage of split heap segments should really be a walk in the park - and really useful, maybe even reserving the first (low mem) heap segment for small allocations.

tlvc16# meminfo
  HEAP   TYPE  SIZE    SEG   TYPE    SIZE  CNT  NAME
  68ee   TASK 13872
  9f2a   CACH 10240
  c736   BUFH   640
  c9c2   TTY   1024
  cdce   TTY     80
  ce2a   NETB    12
  ce42   NETB  1536
  d44e   NETB  1536
  da5a   TTY     80
  dab6   free  8762
  fcfc   SEG     16   5a15   free    2624    0  
  fd18   SEG     16   41d7   free   52208    0  
  fd34   free    16
  fd50   SEG     16   5459   free    1616    0  
  fd6c   SEG     16   3bae   DSEG   25232    1  -/bin/sh 
  fd88   SEG     16   6756   free    1296    0  
  fda4   SEG     16   5c26   free   28128    0  
  fdc0   SEG     16   6697   DSEG    3056    1  /bin/getty 
  fddc   SEG     16   6304   DSEG   14640    1  ftpd 
  fdf8   SEG     16   5364   CSEG    3920    1  meminfo 
  fe14   SEG     16   4ff7   CSEG   14032    1  ftpd 
  fe30   SEG     16   5ab9   CSEG    5840    1  /bin/getty 
  fe4c   SEG     16   3070   CSEG   46048    1  -/bin/sh 
  fe68   SEG     16   7605   free  171952    0  
  fe84   SEG     16   67a7   DSEG   58848    1  ktcp 
  fea0   SEG     16   5949   DSEG    3264    1  meminfo 
  febc   SEG     16   54be   CSEG   18608    1  ktcp 
  fed8   SEG     16   4e96   DSEG    5648    1  telnetd 
  fef4   TTY    100
  ff64   SEG     16   2f70   CSEG    4096    1  telnetd 
  ff80   SEG     16   155f   free    2384    0  
  ff9c   SEG     16   13a6   CSEG    7056    1  /bin/init 
  ffb8   SEG     16   2600   BUF    32768    1  
  ffd4   SEG     16   2e00   DSEG    5888    1  /bin/init 
  fff0   SEG     16   25f4   free     192    0  
  Heap/free   38686/ 8778 Total mem  509344
  Memory usage  497KB total,  243KB used,  254KB free
tlvc16# 

Given this interesting behavior, it's on my list to try some other arena-size experimentation, perhaps allocating the initial heap into two or three arena sizes from the start. The real test will be to look at fragmentation after quite a few processes have run.

I agree, this is an interesting experiment. The heap fragmentation is very much activity dependent, as would be expected. Unusual activity like switching between network interfaces on a running system, i.e. releasing and reallocating network & tty buffers, may still cause some interesting fragmentation, but I suspect a more 'regular' environment would be fine with the current setup, possibly reserving the low mem segment for small allocations.

@ghaerr
Owner Author

ghaerr commented Nov 14, 2024

maybe even reserve the first (low mem) heap segment

Actually, the low memory segment is the last heap segment - undoubtedly because it was added last, but it contains .data address offsets from the options[] buffer. It seems that the heap allocator isn't sorting new segments when they are added to the heap, which could be considered a bug with regard to any later block merging (although that won't happen in this case). Nonetheless, as can be seen in my screenshot above, the lower-addressed segment comes later in the meminfo listing, which is a bit strange to look at. I may fix this, but I suspect the heap manager was written with the idea of only a single heap_add being used, and perhaps it's only working because the second heap_add uses a lower address. I'm not aware of any code prohibiting merging between the added segments (like libc malloc has), so I think it's a definite possibility that more than two heap_adds won't work without allocator enhancements. FYI!
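
The sorting fix hinted at here can be sketched as an address-ordered insertion. This is illustrative, not the ELKS implementation; struct hseg and heap_add_sorted() are hypothetical names. Keeping segments on an address-sorted list means adjacent free blocks can later be merged regardless of the order in which heap_add was called:

```c
#include <stddef.h>

struct hseg {
    size_t start;            /* segment start address (toy representation) */
    size_t size;
    struct hseg *next;
};

static struct hseg *seg_list;

/* insert in ascending address order instead of at the list head,
 * so a coalescing pass can simply walk the list and merge any
 * segment whose end touches the next segment's start */
static void heap_add_sorted(struct hseg *s)
{
    struct hseg **pp = &seg_list;
    while (*pp && (*pp)->start < s->start)
        pp = &(*pp)->next;
    s->next = *pp;
    *pp = s;
}
```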
