Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First 5 bytes are "zeroed" out #3

Open
ryanjpearson opened this issue Dec 21, 2020 · 27 comments
Open

First 5 bytes are "zeroed" out #3

ryanjpearson opened this issue Dec 21, 2020 · 27 comments

Comments

@ryanjpearson
Copy link

ryanjpearson commented Dec 21, 2020

Hi, I set up deflate.py with a 32768 OBSIZE and COMPRESS = False. I wrote C code based on test_deflate.py. When it runs, the first 5 bytes that are output on o_byte are zeroed out, but after that, the bytes look correct. The bytes are not coming out shifted. Byte 5 is what byte 5 is supposed to be, when I compared the output with the output of a software inflate function. The code is attached, thanks for any suggestions! (I'm particular wondering if there's an issue in InitDecompress, WriteDecompress, ReadDecompress, or DeflateClockStrobe functions.)

decompress.txt

@tomtor
Copy link
Owner

tomtor commented Dec 21, 2020

@ryanjpearson I will have a look later this week...

@tomtor
Copy link
Owner

tomtor commented Dec 22, 2020

The PNG code makes it a bit harder to find the root cause.

Could you first change your example C code to test the decompression of (EDIT):

static char ctest[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80\x66\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1c\x62\x82\xd3\x00\x35H\rT\x83\xd5\x80\x35h\r\\\x83\xb7|\xbc\x0fjCzP";

which should result in:

"   Hello World! 0         Hello World! 1         Hello World! 2         Hello World! 3         Hello World! 4         Hello World! 5         Hello World! 6         Hello World! 7         Hello World! 8         Hello World! 9         Hello World! 10         Hello World! 11         Hello World! 12         Hello World! 13         Hello World! 14         Hello World! 15         Hello World! 16         Hello World! 17         Hello World! 18         Hello World! 19         Hello World! 20         H"

@ryanjpearson
Copy link
Author

Here's a simpler implemenation using the test case you provided. Thanks!

decompress.txt

@tomtor
Copy link
Owner

tomtor commented Dec 23, 2020

Here's a simpler implemenation using the test case you provided. Thanks!

I assume you verified that the first 5 bytes are still damaged (0) and the rest of the output is ok?

@tomtor
Copy link
Owner

tomtor commented Dec 24, 2020

Regarding the test data, why don't you just use (EDIT corrected wrong escapes):

static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80\x66\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1c\x62\x82\xd3\x00\x35H\rT\x83\xd5\x80\x35h\r\\\x83\xb7|\xbc\x0fjCzP";

In InitDecompress

DeflateWrite(REG_DEFLATE_RESET, 1);
DeflateWrite(REG_I_CLK, 0);
DeflateWrite(REG_DEFLATE_RESET, 0);
DeflateWrite(REG_I_CLK, 1);

Has the clock edge reversed, try changing this to:

DeflateWrite(REG_I_CLK, 0); 
DeflateWrite(REG_DEFLATE_RESET, 1);
DeflateWrite(REG_I_CLK, 1); // Low->High transition to reset
DeflateWrite(REG_DEFLATE_RESET, 0);

Let me know if that makes a difference. If not I'll have a look at the rest of the code.

Could you also tell me a bit more of your use case? Why are you driving/testing the decompress logic from C code?

That will be very slow, just using the processor and a software implementation would have the same or higher speed.

test_deflate_bench in test_deflate.py is an example driving the logic directly.

@ryanjpearson
Copy link
Author

Thanks for the suggestion! I'm away from the hardware for 2 weeks, but I will give it a shot then.

Isn't the string you provided an ascii representation of the data? I used binascii.b2a_hex in python to convert it to the bytes in the code. I wasn't sure of any C functions that would convert "/x95" to just a char with value of 0x95. I'm new to that notation, and I thought it was python specific.

I wasn't certain the micro-controller I'm using would have the memory needed for inflate. I may move more functionality to the FPGA, but I first wanted to verify I was able to interface to the inflate logic correctly.

@tomtor
Copy link
Owner

tomtor commented Dec 25, 2020

There is indeed an issue with 4 (3 reported by the compiler because it exceeds the 0-255 range) of the encoded bytes:

tom@swan-ubu20:~/src/HDL-deflate$ cat test.c 
#include <stdio.h>

int main()
{
  static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80f\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1cb\x82\xd3\x005H\rT\x83\xd5\x805h\r\\\x83\xb7|\xbc\x0fjCzP";
  for (int i= 0; i < sizeof(Compressed)-1; i++)
	  printf("%02x ", Compressed[i]);
  printf("\n");
}
tom@swan-ubu20:~/src/HDL-deflate$ clang test.c; 
test.c:5:109: error: hex escape sequence out of range
  static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80f\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1cb\x82\xd3\x005H\rT\x83\xd5\x805h\r\\\x83\xb7|\xbc\x0fjCzP";
                                                                                                            ^~~~~
test.c:5:250: error: hex escape sequence out of range
  static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80f\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1cb\x82\xd3\x005H\rT\x83\xd5\x805h\r\\\x83\xb7|\xbc\x0fjCzP";
                                                                                                                                                                                                                                                         ^~~~~
test.c:5:280: error: hex escape sequence out of range
  static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80f\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1cb\x82\xd3\x005H\rT\x83\xd5\x805h\r\\\x83\xb7|\xbc\x0fjCzP";
                                                                                                                                                                                                                                                                                       ^~~~~
3 errors generated.

eg \x80f should have been encoded \x80\x66 because f is a hex digit. This is different from Python encoding, where the encoding is limited to 2 hex characters \xHH.

Corrected:

tom@swan-ubu20:~/src/HDL-deflate$ cat test.c 
#include <stdio.h>

int main()
{
  static unsigned char Compressed[] = "\x18\x95u\xcf\xbb\t\x80@\x00\x04\xd1V\xd6\x0e\xdc\xf3\xdf\x81\x1d\xd8\x80\x66\x07\x07\xf6\x1f\x18\t\n;\x13N\xf6$\xedW\xadMG\xbb\xeb\xd9\xa9\xd7\xdbo;\xef\x92\xf7\x90\xf7\x98\xf7\x94\xf7\x9c\xf7\x92\xf7\x9a\xf7\x06\x1c\x62\x82\xd3\x00\x35H\rT\x83\xd5\x80\x35h\r\\\x83\xb7|\xbc\x0fjCzP";
  for (int i= 0; i < sizeof(Compressed)-1; i++)
	  printf("%02x%c", Compressed[i], (i+1)%16 ? ' ': '\n');
  printf("\n");
}
tom@swan-ubu20:~/src/HDL-deflate$ clang test.c; ./a.out 
18 95 75 cf bb 09 80 40 00 04 d1 56 d6 0e dc f3
df 81 1d d8 80 66 07 07 f6 1f 18 09 0a 3b 13 4e
f6 24 ed 57 ad 4d 47 bb eb d9 a9 d7 db 6f 3b ef
92 f7 90 f7 98 f7 94 f7 9c f7 92 f7 9a f7 06 1c
62 82 d3 00 35 48 0d 54 83 d5 80 35 68 0d 5c 83
b7 7c bc 0f 6a 43 7a 50 

@ryanjpearson
Copy link
Author

I used your suggestion and fixed a couple of mistakes in the decompress.txt. Attached is the latest.
decompress.txt

The output has bytes 0 through 18, 26, and 40 "zeroed" out. The rest of the bytes are what they should be.

Here's a screen shot of the first 46 bytes of output
output

@tomtor
Copy link
Owner

tomtor commented Jan 8, 2021

The output has bytes 0 through 18, 26, and 40 "zeroed" out. The rest of the bytes are what they should be.

What happens if you do 2 clock ticks around line 100:

if (DecompressedReadCount < GetOProgress()) {
		did_read = 1;
		DeflateWrite(REG_I_MODE, READ);
		DeflateWrite(REG_I_RADDR, DecompressedReadCount);
		DeflateClockStrobe(); DeflateClockStrobe(); // TWO TICKS
		DecompressedReadCount++;
	}

@ryanjpearson
Copy link
Author

That made the problem go away! Thank you!

@tomtor
Copy link
Owner

tomtor commented Jan 8, 2021

That made the problem go away! Thank you!

That's good news! You will probably only need the extra tick when

DecompressedReadCount + 1 == GetOProgress()

@ryanjpearson
Copy link
Author

Now, I'm trying to push through a deflate block of size 705 bytes. o_iprogress is stalling at 481, meaning I'm only able to write 513 bytes. This number is interesting (481 = 512 - 32 + 1). Any suggestions? Thanks!

@tomtor
Copy link
Owner

tomtor commented Jan 15, 2021

The standard input buffer is 512 bytes and cyclic. 32 bytes are reserved for look back when compressing (could also apply when decompressing).

So if you are feeding data to decompress and it does not fit completely in the input buffer, you should also start the decompression and read the result so that the input buffer is emptied and you can reuse it in a cyclic way (after writing slot 511, continue with slot 0).

See the test code around:

if o_iprogress > i - MAXW:

@ryanjpearson
Copy link
Author

Following the example of test_deflate.py, I believe I start decompression at the beginning of the process.
Line 59 in decompress.txt: DeflateWrite(REG_I_MODE, STARTD);

And I have an analogous check on i_progress.
Line 66 in decompress.txt: if ((int)GetIProgress() > CompressedWriteCount - MAXW)

If you're able to take a look, I attached a zip file with the small png I'm testing with. And there's a binary file with just the compressed block that's inside the png. The compress block has a size of 705 bytes. It should inflate to a size of 5656 bytes (100x56 PNG 8-bit with 1 filter byte per line. 101*56=5656).

png_deflate.zip

Thanks!

@tomtor
Copy link
Owner

tomtor commented Jan 16, 2021

Can you print GetIProgress() and GetOProgress() and CompressedWriteCount in the inner loop when writing the input data, to get some debug info?

Both Progress indicators should start progressing after a number of clock cycles

@ryanjpearson
Copy link
Author

Here you go! It prints each time I'm able to write a byte to i_data and CompressedWriteCount increments. Thanks

PNGDecompressOutput.txt

@kad-vijlbc
Copy link

Your output ends with:

IProgress: 478	OProgress: 3699	CompressedWriteCount: 510	 NumberOfClocks: 78251
IProgress: 480	OProgress: 3713	CompressedWriteCount: 511	 NumberOfClocks: 78278
IProgress: 480	OProgress: 3713	CompressedWriteCount: 512	 NumberOfClocks: 78281
IProgress: 481	OProgress: 3713	CompressedWriteCount: 513	 NumberOfClocks: 78284

That looks good!

I expect that at the end of your log you are no longer repeatedly calling DeflateClockStrobe().

That is needed in order to advance IProgress to slot 482 and beyond.

@kad-vijlbc
Copy link

Also note CompressedWriteCount: 513.

You have just written byte 513, do you store this byte at slot 0, because the buffer is sized 512 bytes by default.

@ryanjpearson
Copy link
Author

It is still repeatedly calling DeflateClockStrobe() after that. It's only printing when it's able to write a byte ( meaning ((int)GetIProgress() > CompressedWriteCount - MAXW)). And the write addresses do wrap around back to zero since the register is 9 bits. Byte 513 is written to address 1.

@ryanjpearson
Copy link
Author

Another thing: the o_done bit goes high after byte 449 is written ( I think it was 449, I'll confirm tomorrow). I was under the belief that png contains just one deflate block, but I think that is false now. So, it may be that deflate is working fine, and I just need to start a new block. I will check this tomorrow. My apologies for reporting an error if this is the case.

@kad-vijlbc
Copy link

uint8_t WriteDecompress(uint8_t CompressedByte)
{
	uint8_t ret;
	if ((int)GetIProgress() > CompressedWriteCount - MAXW)
	{
		DeflateWrite(REG_I_MODE, WRITE);
		DeflateWrite(REG_I_WADDR, CompressedWriteCount);
		DeflateWrite(REG_I_DATA, CompressedByte);

		CompressedWriteCount++;
		ret = 0; //byte written
	}
	else
	{
		DeflateWrite(REG_I_MODE, IDLEM);
		ret = 1; //byte not written, waiting for space
	}

	return ret;

}

is different from the Python code. You switch to IDLE when waiting, the Python code switches to IDLE when the last input byte is written...

i_mode.next = IDLE

@ryanjpearson
Copy link
Author

Ah thanks, I will correct this.

@tomtor
Copy link
Owner

tomtor commented Jan 22, 2021

Another thing: the o_done bit goes high after byte 449 is written ( I think it was 449, I'll confirm tomorrow). I was under the belief that png contains just one deflate block, but I think that is false now. So, it may be that deflate is working fine, and I just need to start a new block. I will check this tomorrow. My apologies for reporting an error if this is the case.

IProgress should not progress after the end of a block, so I do not think there is a block end at position 449 or before 481.

@ryanjpearson
Copy link
Author

ryanjpearson commented Jan 25, 2021

If the end of a block is reached and there are more blocks, do I need to start a new decompression and write STARTD to i_mode again? Or do I just keep writing in the compressed data?

@tomtor
Copy link
Owner

tomtor commented Jan 25, 2021

Or do I just keep writing in the compressed data?

That should work, it does handle multiple blocks in a stream.

I do not think that your issue is related to blocks, it is somehow related to writing input byte 513 (buffersize + 1).

You could increase the buffersize to 1024 and it will most likely stop at position 1025.

Ah thanks, I will correct this.

This made no difference?

@ryanjpearson
Copy link
Author

Ah thanks, I will correct this.

This made no difference?
It did not make a difference, it still stalled at the same spot.

@ryanjpearson
Copy link
Author

ryanjpearson commented Feb 5, 2021

I also noticed that my cwindow size was wrong, I had it set to 32 in my code. Since I had Compress = False and Lowlut = False, the CWINDOW size should be 256. I made this fix in the code, and it did not seem to matter. It still stalls after writing the 513th byte to address 0.
I tried a build with IBSIZE = 32768. The o_done bit still goes high after i_progress reaches 449.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants