The trial runs for our self-driving car are gaining momentum, and everything seems to be going great. But every now and then, there are some system failures and bugs that you need to report back to your technical headquarters so they can work on a fix. Communication between the car and the coding team back home is key to delivering updates.
Unfortunately, your project has taken a back seat in the allocation of funds, losing out to bigger players. The bandwidth of your communication channel is quite low, and it cannot reliably send huge system logfiles without corrupting their contents. On top of that, rival companies are trying to sabotage your runs by sending forged logfiles that pretend to come from the car.
You decide to tackle both of these issues. Encrypting the logfiles before you send them is a start, but it does not solve either problem on its own. We usually do not encrypt huge files with something like RSA (credits: CSeC), because that tends to be slow. Instead, we establish a secure connection, verify the identity of the sender through digital signatures, and fall back on some form of simple symmetric encryption. You can click on the links to learn more about these fascinating topics.
To compress the logfiles and make them small enough to be sent quickly and reliably, you can use the Huffman compression scheme. Basically, you encode the file so that frequent characters get shorter codes, which saves space, while no code is a prefix of another, so the decoded text is never ambiguous. You can get started with the concept of Huffman coding here.
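As a rough illustration of the idea, here is a minimal Python sketch of Huffman coding. It is not the structure provided in the 'Huffman Codes' folder; the function names `build_huffman_codes`, `encode`, and `decode` are hypothetical, and the encoded output is kept as a string of '0' and '1' characters for readability instead of being packed into bytes.

```python
import heapq
from collections import Counter


def build_huffman_codes(text):
    """Return a dict mapping each character of text to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (frequency, unique tie breaker, {char: code so far}).
    # The unique tie breaker keeps Python from ever comparing the dicts.
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct character: give it a one-bit code.
        return {ch: "0" for ch in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every character in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Read bits left to right; the prefix property makes each match unique."""
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    log = "aaaabbc"
    codes = build_huffman_codes(log)
    bits = encode(log, codes)
    print(codes, bits, decode(bits, codes) == log)
```

For the sample string, the most frequent character gets a 1-bit code and the two rarer ones get 2-bit codes, so the 7 characters fit in 10 bits instead of 56.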
Structure for the code is provided in the 'Huffman Codes' folder, to help you get started.
For now, you only need to implement a simple RSA-based digital signature. The car takes a short string of text, encrypts it using its private key, and sends it to the team. You can then decrypt it with the car's public key and verify that the corresponding logfiles did indeed originate from the car, and not from someone else.
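The sign-and-verify round trip can be sketched in Python as follows. This is only a toy example under stated assumptions: the key (p = 61, q = 53, e = 17, d = 2753) is a common textbook key that is far too small for real use, the message is assumed to be already converted to an integer smaller than the modulus, and the names `sign` and `verify` are hypothetical. No padding or hashing is applied, which a real signature scheme would need.

```python
# Toy RSA key, for illustration only (assumed values, not from the project).
N = 61 * 53   # modulus n = 3233
E = 17        # public exponent
D = 2753      # private exponent: (E * D) % 3120 == 1, where 3120 = 60 * 52


def sign(message_int, d=D, n=N):
    """Car side: raise the message to the private exponent modulo n."""
    if not 0 <= message_int < n:
        raise ValueError("message must be an integer in the range [0, n)")
    return pow(message_int, d, n)


def verify(message_int, signature, e=E, n=N):
    """Team side: undo the signature with the public exponent and compare."""
    return pow(signature, e, n) == message_int


if __name__ == "__main__":
    message = 65
    signature = sign(message)
    print(verify(message, signature))      # genuine message
    print(verify(message + 1, signature))  # tampered message
```

The built-in three-argument `pow` performs modular exponentiation efficiently, so there is no need to compute the full power first.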
Text messages cannot be encrypted directly, since they are made of characters, which are not numbers. So an encoding, most commonly Unicode or ASCII, is used to convert the given string to a sequence of bytes, written in hex notation. In ASCII, each character maps to a 2-digit hex number. The resulting hex numbers can be concatenated, and the resulting number can be encrypted through modular exponentiation, as described in the PDF and on SE.
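A minimal Python sketch of this string-to-number conversion, assuming ASCII input. The helper names `text_to_int` and `int_to_text` are hypothetical.

```python
def text_to_int(text):
    """Encode text as ASCII bytes and read the concatenated hex digits as one integer."""
    if not text:
        raise ValueError("text must not be empty")
    hex_digits = text.encode("ascii").hex()  # e.g. "CAR" -> "434152"
    return int(hex_digits, 16)


def int_to_text(number):
    """Inverse of text_to_int: split the hex digits back into ASCII characters."""
    hex_digits = format(number, "x")
    if len(hex_digits) % 2:
        hex_digits = "0" + hex_digits  # restore a dropped leading zero
    return bytes.fromhex(hex_digits).decode("ascii")


if __name__ == "__main__":
    number = text_to_int("CAR")
    print(hex(number), int_to_text(number))
```

Keep in mind that the resulting integer must be smaller than the RSA modulus before it can go through modular exponentiation, so longer strings need a larger key or have to be split into blocks.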
Alternatively, you can map each letter of the alphabet to a 2-digit number in base 10, starting with 00 for A, 01 for B, and so on. This covers far fewer characters than Unicode, so it works for alphanumeric signatures only.
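A small Python sketch of this mapping, restricted to the letters A to Z, since the text does not say which numbers the digits 0 to 9 would map to. The helper names are hypothetical. The result is kept as a string of digits, because converting it to an integer right away would drop leading zeros (for example, a signature starting with A).

```python
def letters_to_digits(text):
    """Map A -> 00, B -> 01, ..., Z -> 25 and concatenate the 2-digit codes."""
    digits = []
    for ch in text.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"unsupported character: {ch!r}")
        digits.append(f"{ord(ch) - ord('A'):02d}")
    return "".join(digits)


def digits_to_letters(digits):
    """Inverse mapping: read the string two digits at a time."""
    if len(digits) % 2:
        raise ValueError("digit string must have even length")
    pairs = (digits[i:i + 2] for i in range(0, len(digits), 2))
    return "".join(chr(int(pair) + ord("A")) for pair in pairs)


if __name__ == "__main__":
    encoded = letters_to_digits("CAR")
    print(encoded, digits_to_letters(encoded))
```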
The resulting message
If you read up on RSA, you would know the private exponent is the inverse modulo
Inverse modulus of
The coefficients
You can check out this awesome paper for a lucid explanation of the fascinating mathematics behind RSA.
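To make the modular inverse behind the private exponent concrete, here is a hedged Python sketch based on the extended Euclidean algorithm. The function names `extended_gcd` and `mod_inverse` are hypothetical, and the numbers in the usage example (e = 17, p = 61, q = 53) are an assumed toy key, not values from this project.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(e, phi):
    """Return d with (e * d) % phi == 1, or raise if e and phi share a factor."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e has no inverse modulo phi")
    return x % phi  # x may be negative, so reduce it into [0, phi)


if __name__ == "__main__":
    phi = (61 - 1) * (53 - 1)   # 3120 for the assumed toy primes
    d = mod_inverse(17, phi)
    print(d, (17 * d) % phi)
```

On Python 3.8 and later, the built-in `pow(e, -1, phi)` computes the same inverse, which is handy for checking your own implementation.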