Roach is a minimalist, general-purpose, bit-level programming language. It's designed to easily access and modify odd or non-byte-based amounts of bits.
This document is a proof of concept for a language that is not yet implemented. The standard library is not documented yet either. I seek your feedback on the core of this language.
\ This is a single-line comment.
\( This is a multi-line comment.
Note that '\)' needs to be preceded by a newline to close the comment.
\)
Fields are groups of contiguous bits. Every scalar value in Roach is a field. Here are the various field types:
[n] \ n-bit type
[0] \ void or unit type
.T \ floating-point version of type T
-T \ signed version of unsigned type T
T~ \ dynamic version of type T where its size is its minimum number of bits
~T \ dynamic version of type T where its size is its maximum number of bits
m~T \ dynamic version of type T where its size is its maximum number of bits, and m is its minimum
T~m \ dynamic version of type T where its size is its minimum number of bits, and m is its maximum
The bit-size of the CPU's registers (e.g. 32-bit or 64-bit) can be accessed as '?'.
Without '-', types are unsigned.
Composites are values composed of other values, called elements.
Here's a 64-bit floating-point type as a composite type:
[
[1] \ sign
[11] \ exponent
[52] \ fraction
]
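To see this layout on real data, here's a minimal Python sketch (not Roach, since the language isn't implemented yet). It assumes the composite above is meant to mirror the IEEE 754 binary64 encoding, with the sign as the most significant bit. The name float64_fields is made up for this illustration.

```python
import struct

def float64_fields(value: float) -> tuple[int, int, int]:
    """Return (sign, exponent, fraction) of value's IEEE 754 binary64 encoding."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                  # [1]  sign
    exponent = (bits >> 52) & 0x7FF    # [11] exponent (stored with a bias of 1023)
    fraction = bits & ((1 << 52) - 1)  # [52] fraction
    return sign, exponent, fraction

print(float64_fields(1.0))   # (0, 1023, 0)
print(float64_fields(-2.0))  # (1, 1024, 0)
```

The three masks correspond one-to-one to the three elements of the composite, in the same order.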
Composite types can be restricted to only having elements, or items, of one type. These composites are known as arrays.
T*n \ array type with n items of type T
T* \ equivalent to T*0~
Numerals are of type '[?]~'. Here are some examples:
0b0101
0o777
0xf0f0f0
21
0xa.1
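As a cross-check on what these numerals denote, here's a small Python sketch that evaluates the same literals. The integer prefixes happen to match Python's. Reading 0xa.1 as a hexadecimal fraction (10 + 1/16) is an assumption, since this document doesn't spell that out.

```python
# Integer literals with the same prefixes as the Roach examples.
binary = 0b0101          # 5
octal = 0o777            # 511
hexadecimal = 0xf0f0f0   # 15790320
decimal = 21

# Assumption: 0xa.1 is a hexadecimal fraction, i.e. 10 + 1/16.
hex_fraction = float.fromhex("0xa.1")  # 10.0625

print(binary, octal, hexadecimal, decimal, hex_fraction)
```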
Here are the basic operators:
-x
x + y
x - y
x * y
x ** y (exponentiation)
x / y
~x
x | y
x ^ y
x & y
x << y
x <<< y
x <-< y (left rotation)
x >> y
x >>> y
x >-> y (right rotation)
!x
x || y
x && y
More specific operators are explained later.
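Rotation is the operator most tied to the size of a field, so here's a hedged Python sketch of what '<-<' and '>->' could compute. It assumes a rotation wraps bits around within the field's own width, which this document doesn't state explicitly. The helpers rotate_left and rotate_right are hypothetical names, and the width is passed in by hand because Python integers have no fixed size.

```python
def rotate_left(value: int, amount: int, width: int) -> int:
    """Rotate the low `width` bits of value left by `amount` positions."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    # Bits shifted out on the left re-enter on the right.
    return ((value << amount) | (value >> (width - amount))) & mask

def rotate_right(value: int, amount: int, width: int) -> int:
    """Rotate the low `width` bits of value right by `amount` positions."""
    # A right rotation by k is a left rotation by width - k.
    return rotate_left(value, width - (amount % width), width)

print(bin(rotate_left(0b10010110, 3, 8)))   # 0b10110100
print(bin(rotate_right(0b10010110, 3, 8)))  # 0b11010010
print(bin(rotate_left(0b10011, 1, 5)))      # 0b111 (a 5-bit field)
```

The last call uses a 5-bit width to show that nothing here depends on byte-sized fields.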
Character literals are of type '[7]~'. Here are some examples:
'🐸'
'\n'
Here are the escape sequences:
\\ → \
\' → '
\" → "
\u(N) → the Unicode character whose code is the numeral N
\0 → NUL
\t → TAB
\n → LF
\r → CR
\e → ESC
String literals are of type '[7]~*'. Here are some examples:
"Ribbit, MFs!"
Character escape sequences apply to string literals too.
###x \ number of bits in x
##x \ number of elements in x
Lengths are of type '[1]~'.
Names can refer to types or values.
Names can contain any non-whitespace character, and must either start with a non-arabic-digit character, or end in a non-arabic-digit character.
Values of non-assigned names are zeroed-out.
Here's an example:
age: [8]
traversal: [
start: [8]
end: [8]
n_times: [8]
]
Declarations return 0.
Once declared, names can be assigned values like this:
age = 10
Names can also be defined: declared and assigned in the same expression:
age: [8] = 10
Assignments return the assigned value.
Top-level names (ones that are not referring to elements) can refer to constants by using ':=' instead of '='. Here's an example:
SIZE := 16
Type aliases can't be redefined. Here are some examples:
Byte [8]
RawText [32]*
Here's an example with a quick constructor:
me: [name:RawText age:Byte] = ("ghoom" 21)
Here's a non-declarative variation with a full constructor:
me = {
name: RawText = "ghoom"
age: Byte = 21
}
Assuming v is a value of any type:
v#i \ the bit at index i
v#(i j) \ the field slice from index i to j
Negative indices are supported, and this syntax can also be used as the target of an assignment.
Assuming c is a composite:
c.i \ the element at index i
c.(i j) \ the composite slice from index i to j
c.a \ the element named a
Negative indices are supported, and this syntax can also be used as the target of an assignment.
Bits can be inserted into a field f like this:
f <- x \ a new field where x is appended
f#i <- x \ a new field where x is inserted before the bit at index i
(f#i) <- x \ a new field only containing the bit at index i before x
And bits can be deleted like this:
f! \ the default value of f's type (it's like a cleared version)
f#i! \ a new field where the bit at index i is deleted
f#(i j)! \ a new field where the slice from index i to j is deleted
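Note that every form above produces a new field instead of changing f. Here's a rough Python model of three of them. It assumes a field can be pictured as an immutable sequence of 0/1 values where index 0 is the first (leftmost) bit and negative indices count from the end; none of that ordering is fixed by this document. The function names are made up for the sketch.

```python
Bits = tuple[int, ...]  # a field modeled as an immutable sequence of 0/1 values

def append(f: Bits, x: Bits) -> Bits:
    """Model of `f <- x`: a new field with x appended."""
    return f + x

def insert_before(f: Bits, i: int, x: Bits) -> Bits:
    """Model of `f#i <- x`: a new field with x inserted before the bit at index i."""
    return f[:i] + x + f[i:]

def delete_at(f: Bits, i: int) -> Bits:
    """Model of `f#i!`: a new field without the bit at index i (f must be non-empty)."""
    i %= len(f)  # normalize a negative index so the slices below line up
    return f[:i] + f[i + 1:]

f = (1, 0, 1, 1)
print(append(f, (0, 0)))          # (1, 0, 1, 1, 0, 0)
print(insert_before(f, 1, (0,)))  # (1, 0, 0, 1, 1)
print(delete_at(f, -1))           # (1, 0, 1)
print(f)                          # (1, 0, 1, 1), the original is untouched
```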
Elements can be inserted into a composite c like this:
c <- x \ a new composite where x is appended
c#i <- x \ a new composite where x is inserted before the element at index i
(c#i) <- x \ a new composite only containing the element at index i before x
And elements can be deleted like this:
c! \ the default value of c's type (it's like a cleared version)
c#i! \ a new composite where the element at index i is deleted
c#(i j)! \ a new composite where the slice from index i to j is deleted
Use '!!' to delete bits or elements in place:
c!! \ equivalent to c = c!
Assuming x is the value to be converted to type T:
x: T \ x as a value of type T
Everything is convertible to everything.
When necessary, statements can be separated with ';'.
Blocks are scoped groups of expressions or statements, surrounded by '{}', that can optionally return a value x like this:
=> x
x -> y |> z \ If x is not 0, return y. Otherwise, return z.
'|> z' is optional.
x ~> y \ While x is not 0, add y to the array that the loop returns.
Breaking out of a loop is done using:
<>
And "continuing" to the loop's next iteration is done using:
^^
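Since a loop that returns an array may be unfamiliar, here's a Python sketch of the idea. It assumes that an iteration cut short by '^^' contributes nothing to the returned array, and that '<>' returns whatever was collected so far; neither detail is specified above. The function squares_of_odds is a made-up example.

```python
def squares_of_odds(limit: int, stop_at: int) -> list[int]:
    results = []          # the array that the loop returns
    n = 0
    while n < limit:      # x ~> y: keep looping while the condition is non-zero
        n += 1
        if n == stop_at:
            break         # plays the role of <>
        if n % 2 == 0:
            continue      # plays the role of ^^
        results.append(n * n)  # y: the value added on this iteration
    return results

print(squares_of_odds(10, 8))  # [1, 9, 25, 49]
```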
This is the function type:
P=>R \ function type taking a parameter of type P and returning a value of type R
Here's an example of a function:
factorialize = n: [8] => {
=> !n || n == 1
-> 1
|> n * factorialize(n - 1)
}
factorialize 5 \=> 120
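For comparison, here's the same function written in Python. It mirrors the condition '!n || n == 1' as "n is 0 or 1". Python integers are unbounded, so this sketch says nothing about what Roach would do when the result no longer fits in 8 bits.

```python
def factorialize(n: int) -> int:
    # `!n || n == 1` is true when n is 0 or 1.
    if n == 0 or n == 1:
        return 1
    return n * factorialize(n - 1)

print(factorialize(5))  # 120
```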
All names in the parameter are accessible inside the function. '..' is the argument (useful for unnamed parameters).
Function names are overloadable.
Function calls are straightforward:
f \ Call f with no argument (defaulting to 0).
f x \ Call f with argument x.
Use '(f)' to get the function instead of calling it.
Coroutines are functions that use one of these constructs:
=>> x \ Yield x.
=<< f \ Yield to coroutine f with no arguments.
=<< f x \ Yield to coroutine f and pass argument x if it's the first time yielding to it.
The following constructs are unlocked for coroutines:
^f \ Asynchronously call coroutine f.
^f x \ Asynchronously call coroutine f with argument x.
Asynchronous calls return a value of type '[done:[1] value:T]', where T is the return type and done indicates whether the function has finally returned.
x: c ~> y \ For each return value of iterator call c, named x, add y to the array that the loop returns.
Here's the pointer type expression:
@T \ type of pointers to values of type T
Dereferencing a pointer p is done like this:
@p
A generic is a composite with ambiguous types to be specified later.
Let's say we're remaking the Map type from the standard module maps:
Map <K V>[
keys: K*
values: V*
]
This is how we would select a variation of it:
Map<RawText RawText>
Generic selection isn't required if the types can be inferred. For example:
Optional <T>[
defined: [1]
value: T
]
byte: [8] = 7
optional_byte: Optional = (1 byte)
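If you know generics from other languages, the Optional example maps onto them fairly directly. Here's a Python sketch using a generic dataclass. The class is named OptionalValue only to avoid clashing with Python's own typing.Optional, and "inference" here means what a static type checker would conclude, not anything Python does at runtime.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class OptionalValue(Generic[T]):
    defined: bool
    value: T

byte = 7
optional_byte = OptionalValue(True, byte)  # T left for a type checker to infer
explicit = OptionalValue[int](True, byte)  # T selected explicitly

print(optional_byte == explicit)  # True
```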
'$' is an array of type '[7]~**' containing the command line arguments.
Slash functions are built-in constructs that are used as functions.
Assuming regex is the name of a standard module, and "user.ro" is a path to a module, this is how you include them:
+regex
+"user.ro"
You can always use '/' in paths, regardless of the operating system. Assuming x is the name of a disk in Windows (e.g. C), the root path '/x' is 'x:' in Windows.
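Here's a hedged Python sketch of that path rule. Only the mapping of the root '/x' to 'x:' comes from this document. Treating the first path component as a disk only when it's a single ASCII letter, and leaving every other path unchanged, are assumptions of the sketch, as is the name to_windows_path.

```python
def to_windows_path(path: str) -> str:
    """Map a '/'-rooted path such as '/C/Users/me' to its Windows form."""
    if not path.startswith("/"):
        return path  # relative paths are left alone
    disk, _, rest = path[1:].partition("/")
    # Assumption: a disk is exactly one ASCII letter.
    if len(disk) == 1 and disk.isascii() and disk.isalpha():
        return f"{disk}:/{rest}" if rest else f"{disk}:"
    return path

print(to_windows_path("/C"))           # C:
print(to_windows_path("/C/Users/me"))  # C:/Users/me
print(to_windows_path("docs/user.ro")) # docs/user.ro
```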
Machine code can be inserted like this:
/x
...where x is the machine code.
'??', a field of type '[?]', can be one of:
- Windows
- Apple
- Linux
- FreeBSD
- Solaris