Proto-typed (pt) is a compiled statically typed language with garbage collection and modern syntax.
As the name suggests, proto-typed was designed for writing prototypes and short simple programs in a compiled and typed manner.
This repository contains the Proto-typed Compiler (ptc) and the language definition in textual form.
The pt utility is a standalone program that can be used to compile and run proto-typed programs. Just like the language, it has a straightforward and simple design, but can be customized.
It has 3 commands:
- build - Compiles the program.
- run - Compiles the program and runs it.
- see - Compiles the program, runs it and deletes the binary.
Options for pt have to be specified before the command, options for ptc after the command, and arguments to the program after a -- separator.
- Create your pt program hello.pt:
print("Hello, World!\n")
- Run the program using pt:
pt run hello.pt
This will output:
Hello, World!
and create the binary hello.
Precompiled ptc releases should soon be available in the releases section on GitHub (https://github.com/mark-sed/proto-typed/releases).
ptc relies on LLVM and expects it to be installed. It can be compiled from source using CMake with the following steps.
- Clone the repository (or download it as a zip from GitHub):
git clone https://github.com/mark-sed/proto-typed.git
- Enter the repository and create a build directory:
cd proto-typed
mkdir build
- Run CMake for the pt compiler:
cmake -S . -B build
cmake --build build --target ptc
- (Optional) Run CMake for the pt utility:
cmake --build build --target pt
After running these commands, the build directory should contain the compiled proto-typed compiler named ptc and the compiled pt utility.
Note that pt is currently available only for Linux. To make ptc and pt accessible from anywhere and easy to use, you can run the install.sh script, which places all the needed files into the expected places:
- (Optional) Setting up pt path and libs:
sudo bash install.sh
Proto-typed (pt) is a compiled statically typed language with garbage collection and modern syntax. But it offers a lot more; here is a list of some of the key proto-typed features:
- Compiled.
- Statically typed.
- Support for modules and simple compilation of multiple modules (only main module path needs to be specified, rest will be found by the compiler).
- Automatic memory management (garbage collection).
- Dynamic any type.
- Maybe types (hold a value of the specified type or none).
- Function overloading.
- No need for forward declaration of functions.
- Implicit main function generation (code similar to interpreted scripts).
- C library functions and simple invocation of C functions (not yet implemented).
Before going through the language features in depth, here are some simple code examples to showcase the syntax.
print("Hello, World!\n")
int fib(int n) {
if(n < 2) return n
return fib(n - 1) + fib(n - 2)
}
int index = 42
print("Fibonacci for index "++index++" is "++fib(index))
Proto-typed offers simple types and user defined structures. These types can then be combined into arrays or wrapped in maybe types.
All variables are initialized to their default value if not initialized explicitly.
There are in fact 2 groups of types: the first, primitive, does not require runtime initialization (int, float, bool and the value none), and the second, composite, requires runtime initialization (string, maybe types including any, arrays and struct).
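As a small illustration of default initialization, the sketch below declares variables without initializers. The defaults in the comments are the ones shown elsewhere in this document (0 for int, false for bool, none for maybe types); the variable names are made up for the example.

```pt
int count      // 0
bool ready     // false
int? answer    // none

// Values of these types can be concatenated with strings using ++
print(count++" "++answer++"\n")
```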
Signed whole number represented by 64 bits. Such a number can be written in decimal, binary (prefix 0b or 0B), octal (prefix 0q or 0Q) or hexadecimal (prefix 0x or 0X) format.
int a_dec = 42 // Decimal
int a_bin = 0b101010 // Binary
int a_oct = 0q52 // Octal
int a_hex = 0x2A // Hexadecimal
Floating point number represented by 64 bits. It can also be written using scientific notation.
float not_pi = 3.14159265
float avogadro = 6.02214076e23
float rnd = -22e-8
Note that, for the purpose of correct range parsing, a float cannot start or end with a bare decimal point: 2. has to be written as 2.0 and .5 as 0.5.
Boolean value capable of containing either true or false.
bool slow // false (default value)
bool compiled = true
Dynamic array of Unicode characters. Strings support escape sequences.
string boring_text = "sushi\n\tramen."
string cool_text = "寿司\n\t拉麺"
string cooler_text = "🍣\n\t🍜."
string pozdrav = "Ahoj, jak se máš?"
Strings can also be written as "raw strings" or "rstrings", where no escape sequences are parsed. Such strings have to be prefixed with r:
string raw_text = r"New line is written as \n and \ has to be escaped by another \ in normal strings."
Characters can also be encoded using escape sequences, with the character code written in braces {} after an octal (\q or \Q) or hexadecimal (\x or \X) prefix:
string space = "\x{20}"
string space8 = "\Q{40}"
Unicode characters can also be escaped using the \U or \u prefix followed by the character's hexadecimal code.
string potassium_source = "\U{0001f34c}"
Strings are represented as simple objects, so their size can be determined in constant time. At the same time, a string's character buffer is zero terminated, which makes it compatible with many C functions.
Strings can also be sliced. Slicing has the form [start..end] or [start,next..end] (a range inside of []):
string dna_sequence = "a-c-c-g-t-a-t-g"
string amino_acids = dna_sequence[0,2..length(dna_sequence)]
A slice can also use a descending range and therefore reverse a string. Keep in mind that end is not included in the range but start is, so some adjustments are needed:
string dna_sequence = "a-c-c-g-t-a-t-g"
int l = length(dna_sequence) - 1
string reversed_aas = dna_sequence[l, l - 2 .. -1]
Structs can hold variables of any type, but cannot contain function definitions within their scope. When referring to a struct type, only the name can be used (without the struct keyword).
struct Pair {
int first
int second
}
Pair p
p.first = foo()
p.second = bar()
print(p.first++", "++p.second)
Structs can also have default initializers for their elements:
struct Player {
string name = "Unknown"
float strength = 10.5
int x = 350
int y = 200
}
Arrays in pt are dynamic (you can think of them as vectors in C++).
An array type is defined by putting [] after the type name:
int[] values
for (int i : 0..20) {
append(values, i)
}
for (int c: values) {
print(c++" ")
}
Multidimensional arrays (matrices) work in the same way:
int[][] pos = [
[0, 1, 2, 8],
[9, 0, 8, 3],
[2, 3, 5, 6],
[0, 1, 1, 4]
]
string[][] tt = [
["x", "o", "x"],
["x", "x", "o"],
["o", "x", "o"]
]
string center = tt[1][1]
Slicing an array also produces an array. Slicing has the form [start..end] or [start,next..end] (a range inside of []):
float[] x = [0.2, 1.3, 4.5, 5.0, 0.0, 9.9, 7.1, 1.0]
for (float i : x[1,3..6]) {
print(i++" ") // 1.3 5.0 9.9
}
float[] y = x[0..2] // [0.2, 1.3]
A maybe value can either hold a value of its base type or none.
int? x // none by default
x = -7
Every maybe value is passed to a function by reference and can therefore be used to modify input arguments:
void pow2(int? x) {
x = x * x
}
int v = 5
pow2(v)
print(v++"\n")
But don't think of maybe values as pointers or references, since any function taking a maybe value can accept any base value, including a constant:
void pow2(int? x) {
x = x * x
}
pow2(42)
There is still a difference between passing a maybe value and an actual value to a function: maybe to maybe assignment works only when an actual maybe value is passed in. In the following example the assignment in function getK assigns an address to the passed in argument, but when the passed in value is not a maybe type, what is modified is the address of the parameter (x), not the address of the passed in variable (k):
void getK(int? x) {
any KVAL = 7542
x = KVAL
}
int? maybe_k
int k
getK(maybe_k)
getK(k)
print(maybe_k++" "++k) // 7542 0
When assigning a maybe value to another maybe value, both of these will contain the same address and therefore the same value:
string? a = "hi"
string? b
b = a
b = "bye"
print(a) // bye
The any type (not surprisingly) can hold any value. What it holds is known only to the user, not the compiler, and therefore it is easy to get a runtime error. To extract a value it has to be assigned or implicitly cast.
any x = 42
// code ...
x = "forty two"
string y = x
print(y)
Any type has to be internally represented as a maybe type (to be able to hold none), which might cause some problems with maybe to maybe assignment and memory sharing:
any x = 32
int? y
y = x
x = "Ca vas pas"
print(y++"\n") // Incorrect value - string read as int
Any type, unlike maybe type, is checked as an address and not a value (as it is only known to the user what type is stored there).
Function type is always a maybe type (the ? is implicit, just like with any) and therefore it can hold none.
Function type has the following syntax:
<return type>(<optional list of argument types>) <variable name>
For example:
bool isBig(int? a, bool warn) {
if(a != none) {
return a > 100
}
if(warn) print("is none")
return false
}
bool(int?, bool) funIsBig
funIsBig = isBig
print(funIsBig(4, false) as string)
Function type can also be taken as an argument:
void err_print(string s) {
print("Error: "++s++"\n")
}
void report(void(string) rep_fun, string msg) {
rep_fun(msg)
}
report(err_print, "Oops!")
report(print, "Never mind!")
Proto-typed aims for a simple syntax that allows good code readability and writing, as its main purpose is prototyping and small programs.
End of statement in pt can be marked with (1) a new line (\n), (2) a semicolon (;), (3) the end of certain constructs or (4) the end of file.
Example (1):
int a = 42
int b = 0
which is equivalent to (2, 4):
int a = 42; int b = 0
In case of constructs ending with } (3), there is no need for ; or \n:
if(a) {
c = 42;
} print("hello")
Proto-typed supports C-style one line comments (//) and multiline comments (/**/):
// Single line comments
/*
Multiline
comment
*/
Importing a module is done using the import keyword followed by the module's name (without its extension) or a comma separated list of module names:
import bigmath
import window, controller, handler
Every function, global variable and type (struct) can be accessed from another module.
To access a symbol of an external module, the symbol always has to be prefixed by the imported module's name and the scope operator (two colons, ::), followed by the symbol's name.
File mod2.pt:
string NAME = "mod2"
int get_key() {
return 42
}
File main.pt:
import mod2
print("Key for module " ++ mod2::NAME ++ " is: " ++ mod2::get_key())
Functions use c-style syntax of return type followed by the name, arguments and then the body:
int foo(float a, bool b) {
int r
// Code ...
return r
}
Proto-typed supports function overloading (multiple functions with the same name, but different arguments):
void add(int a, int b) {
print((a + b)++"\n")
}
void add(string a, string b) {
print(a ++ b ++"\n")
}
add(4, 8) // 12
add("h", "i") // hi
The expression of an if statement has to be a boolean value (int or maybe won't be implicitly converted); the same is true for while and do-while. If follows the C-style syntax as well:
if (a == 0) {
foo(c)
} else if (a == 1) {
bar(c)
} else {
baz(c)
}
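Because int and maybe values are not converted to bool implicitly, a condition on them has to be spelled out as a comparison. A minimal sketch, using only operators that appear elsewhere in this document; the variables count and found are hypothetical:

```pt
int count = 3
int? found      // none by default

// An int has to be compared explicitly
if (count != 0) {
    print("count is non-zero\n")
}

// A maybe value is compared against none
if (found != none) {
    print("found holds a value\n")
}

// The same rule applies to loop conditions
while (count > 0) {
    count -= 1
}
```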
If does not require {} if it is followed by only one statement, but keep in mind that in proto-typed a new line terminates a statement and therefore, unlike C, an arbitrary number of new lines after if is not allowed. Fortunately, proto-typed will emit an error in such a case.
if (a)
print("Hi") // Syntax error
else print("Bye") // Syntax error
if (a) print("Hi") // Correct
else print("Bye") // Correct
Also keep in mind that the statement following if has to be terminated as well, either by a new line or a semicolon. Therefore, when writing a one-line if-else, a semicolon (or {}) must be used after each statement:
if (a) print("Hi") else print("Bye") // Syntax error
if (a) print("Hi"); else print("Bye") // Correct
While and do-while loops have the following syntax:
while (a < 10) {
// code
a += 1
}
do {
// code
a += 1
} while (a < 10)
For works more like a for each loop, where it iterates over ranges, arrays or strings:
float[] values
// init values
for (float i : values) {
print(i++"\n")
}
string text = "Some text"
for (string letter: text) {
print(letter++" ")
}
Proto-typed also offers a special type range (see more below), which can be used for counted for loops:
for (int i : 0..5) {
print(i++" ") // 0 1 2 3 4
}
for (int j : 0,2..5) {
print(j++" ") // 0 2 4
}
string text = "Some more text"
for (int k : 0..length(text)) {
print(k++": "++text[k]++"\n")
}
The following table lists pt operators from the highest precedence to the lowest.
Operator | Description | Associativity |
---|---|---|
:: | Module scope | none |
() , [] , [..] | Function call, array indexing, slicing | none |
as | Type casting | left |
. | Structure member access | left |
not , ~ | Logical NOT, bitwise NOT | right |
** | Exponentiation (returns float) | right |
* , / , % | Multiplication, division, remainder | left |
+ , - | Addition/array join, subtraction | left |
<< , >> | Bitwise left shift, bitwise right shift | left |
.. | Range | left |
> , >= , < , <= | Greater than, greater than or equal, less than, less than or equal | left |
== , != | Equality, inequality | left |
& | Bitwise AND | left |
^ | Bitwise XOR | left |
\| | Bitwise OR | left |
in | Membership | left |
and | Logical AND | left |
or | Logical OR | left |
++ | Concatenation | left |
= , ++= , **= , += , -= , /= , *= , %= , &= , \|= , ^= , ~= , <<= , >>= | Assignment, compound assignments | left |
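A few consequences of the table, as a sketch. The comments state how each expression groups according to the precedence and associativity listed above; they are a reading of the table, not verified compiler output.

```pt
// * binds tighter than +, which binds tighter than ==
bool seven = 1 + 2 * 3 == 7          // (1 + (2 * 3)) == 7

// ** is right associative and returns float
float tower = 2 ** 3 ** 2            // 2 ** (3 ** 2)

// ++ sits below the arithmetic operators,
// so the sum is computed before it is concatenated
print("sum: " ++ 1 + 2 ++ "\n")      // "sum: " ++ (1 + 2) ++ "\n"

// as binds tighter than <, so a comparison has to be
// parenthesized before it is cast
print((1 < 2) as string)
```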
Proto-typed does not provide some of the higher level constructs such as classes and objects, but at the same time tries to provide abstractions for simple and quick coding. Examples of such abstractions are memory management handled by the garbage collector and the absence of a pointer type.
The main function is implicitly generated by the compiler and its job is to initialize modules and execute the entry function of the current module.
This means that a pt module does not contain any main function; its global scope acts as this function. Beware that all the functions and variables declared there are still global and accessible from other modules.
Entry function - _entry - contains all the global scope code (statements that are not declarations).
The entry function is called only for the main module, so if it needs to be executed for an imported module, it has to be called explicitly.
import mod2
mod2::_entry()
Module could possibly call its own entry function.
Casting can be done only to a non-maybe type, unless the cast value is of type any. The reason for this is that a maybe is dynamically allocated memory and casting (reading) this memory as a different type (size) does not make sense. For any, this cast only reads the value.
void foo(string? c) {
c = "changed"
}
int? a = 4
foo(a as string?) // Error
// This does not make sense as a would be
// modified to contain string
string? str_a
str_a = a as string
foo(str_a) // Works
If you really wish to play god and treat memory as a different type, you can utilize the any type for this:
any ivalue = 1
float? fvalue
fvalue = ivalue as float?
Calls to standard library do not require any module name prefix and any of the functions can be overridden by custom definitions.
- length - String length.
  - int length(string s) - Returns the length of the string s.
- to_string - Conversion of types to string. This is used by the as operator.
  - string to_string(int i) - Converts int i to string.
  - string to_string(float f) - Converts float f to string.
  - string to_string(bool b) - Converts bool b to string.
- mto_string - Conversion of maybe types to string (for none will return "none"). This is used by the as operator.
  - string to_string(int? i) - Converts maybe int i to string.
  - string to_string(float? f) - Converts maybe float f to string.
  - string to_string(bool? b) - Converts maybe bool b to string.
  - string to_string(string? s) - Converts maybe string s to string. This is useful only for printing, as none will be indistinguishable from the string "none".
- From string - Conversion from string to other types. This is used by the as operator, but as expects a valid value (it will ignore none and crash).
  - int? to_int(string str) - Converts a string number in base 10 and returns it as int, or none if it is not an integer.
  - int? to_int_base(string str, int base) - Converts a string number in base base and returns it as int, or none if it is not an integer in the given base.
  - float? to_float(string str) - Converts a string float and returns it as a float, or none if it is not a float.
- find - Find substring.
  - int find(string s1, string s2) - Returns the first index of substring s2 in string s1 or -1 if not found.
- contains - Check if string contains substring (equivalent to s2 in s1).
  - bool contains(string s1, string s2) - Returns true if string s1 contains string s2, false otherwise.
- reverse - String in reverse.
  - string reverse(string s) - Returns a copy of string s reversed.
- slice - Slices a string (equivalent to s[start, next..end]). The range can also be descending for a reversed string.
  - string slice(string s, int start, int end) - Slices string from index start, with step 1 or -1, until index end.
  - string slice(string s, int start, int next, int end) - Slices string from index start, with step next - start, until index end.
- Case conversion - Conversion to uppercase or lowercase.
  - string upper(string s) - Returns a copy of string s in uppercase.
  - string lower(string s) - Returns a copy of string s in lowercase.
- ord - Converts letter of string to its integer value.
  - int ord(string s) - Converts the first letter of s to its integer value.
- chr - Converts integer value to corresponding letter.
  - string chr(int i) - Converts integer i into the corresponding letter and returns it as a string.
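The following sketch combines several of the string functions listed above. It relies only on the signatures given in the list; the input strings and variable names are made up, and the comments describe what the documentation states rather than verified output.

```pt
string text = "proto-typed"

// find returns -1 when the substring is missing
int dash = find(text, "-")
if (dash != -1) {
    // Part of the text before the dash, in uppercase
    print(upper(slice(text, 0, dash))++"\n")
}

// contains(s1, s2) is documented as equivalent to s2 in s1
if (contains(text, "typed")) {
    print("typed\n")
}

// to_int returns a maybe int, so a failed conversion can be
// checked against none instead of crashing like the as operator
int? parsed
parsed = to_int("1024")
if (parsed != none) {
    print("parsed "++parsed++"\n")
}

// Same idea with an explicit base (base 2 digits here)
int? binary
binary = to_int_base("101010", 2)
```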
These functions are templated for any array (matrix) type; the type T stands for this general array type. Type TBase stands for the base type of T (e.g. T might be int[][] and then TBase would be int[]). Type TElem is the base non-array type of T (e.g. T might be int[][] and then TElem would be int).
- append - Append to an array.
  - void append(T a, TBase v) - Append v at the end of array a.
  - void mappend(T a, TBase? v) - Append maybe value v into array a.
- insert - Insert to an array.
  - void insert(T a, TBase v, int index) - Insert v at index index (from 0 to length(a)) into array a.
  - void minsert(T a, TBase? v, int index) - Insert maybe value v at index index (from 0 to length(a)) into array a.
- remove - Remove value from an array.
  - void remove(T a, int index) - Removes value of the array a at index index.
- length - Array length.
  - int length(T a) - Returns the length of the array a.
- equals - Array equality (equivalent to a1 == a2).
  - bool equals(T a1, T a2) - Returns true if all values in a1 are equal to those in a2.
- find - Find value in an array.
  - int find(T a, TElem e) - Returns index of value e in array a or -1 if it does not exist there.
- contains - Check if value is present in an array (equivalent to e in a).
  - bool contains(T a, TElem e) - Returns true if value e is in array a, false otherwise.
- reverse - Array in reverse.
  - T reverse(T a) - Returns a copy of array a reversed.
- slice - Slices an array (equivalent to a[start, next..end]). The range can also be descending for a reversed array.
  - T slice(T a, int start, int end) - Slices array from index start, with step 1 or -1, until index end.
  - T slice(T a, int start, int next, int end) - Slices array from index start, with step next - start, until index end.
- sort - Sorts an array.
  - void sort(T a, bool(TElem, TElem) cmp) - Sorts array a using comparison function cmp.
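A short sketch of the array functions listed above working together. The function ascending and the values are hypothetical; in particular, the list does not say what cmp is expected to return, so treating it as "a goes before b" is an assumption of this example.

```pt
// Hypothetical comparison function for sort (assumed meaning:
// true when a should be placed before b)
bool ascending(int a, int b) {
    return a < b
}

int[] nums
append(nums, 5)
append(nums, 3)
append(nums, 9)
insert(nums, 1, 0)        // insert value 1 at index 0
remove(nums, 2)           // remove the element at index 2

sort(nums, ascending)

// find returns the index, contains only reports presence
if (contains(nums, 9)) {
    print("9 is at index "++find(nums, 9)++"\n")
}
for (int n : nums) {
    print(n++" ")
}
```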
These functions are templated and type S represents a generic struct type in this case.
- equals - Struct equality (equivalent to s1 == s2).
  - bool equals(S s1, S s2) - Returns true if values of all elements in s1 are equal to those in s2.
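As an illustration of struct equality, the sketch below compares two values of a hypothetical struct Point, once with the operator and once with the library function, which the list above describes as equivalent.

```pt
struct Point {
    int x
    int y
}

Point a
Point b
a.x = 1
a.y = 2
b.x = 1
b.y = 2

// as binds tighter than ==, hence the parentheses
print((a == b) as string)
print(equals(a, b) as string)
```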
- Math constants
  - M_PI - Ludolph's number.
  - M_E - Euler's number.
  - M_PHI - Golden ratio ((1 + 5^0.5)/2).
  - M_EGAMMA - Euler–Mascheroni constant.
- Trigonometric functions
  - float sin(float x) - Sine of x.
  - float cos(float x) - Cosine of x.
  - float tan(float x) - Tangent of x.
- abs - Absolute value.
  - int abs(int x)
  - float abs(float x)
- sum - Sum of all values in an array.
  - int sum(int[] arr)
  - float sum(float[] arr)
- int gcd(int a, int b) - Greatest common divisor of a and b.
- int lcm(int a, int b) - Least common multiple of a and b.
- Logarithm - Computes logarithm.
  - float ln(float x) - Natural (base e) logarithm of x.
  - float log10(float x) - Common (base 10) logarithm of x.
- int system(string cmd) - Calls the host environment command processor with cmd. The return value is implementation-defined.
- string getenv(string name) - Returns the value of an environment variable.
- bool setenv(string name, string value, bool overwrite) - Sets an environment variable to the passed in value.
- Pseudo-random number generation - Functions for pseudo-random number generation. Seed is initialized implicitly.
  - void set_seed(int) - Sets seed for generator.
  - int rand_int(int min, int max) - Random integer number between min and max (inclusive).
  - float rand_float(float min, float max) - Random float number between min and max (inclusive).
  - float rand_float() - Random float between 0.0 and 1.0 (inclusive).
  - bool rand_bool() - Random boolean.
  - int rand_uint() - Returns random unsigned integer.
- int timestamp() - Current time since epoch (timestamp).
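A minimal sketch that uses some of the math, random and time functions listed above. It assumes that the math constants are float values; the variable names are made up, and no output is shown for the random part because it depends on the seed.

```pt
float r = 2.0
print("area: "++(M_PI * r * r)++"\n")

// gcd(12, 18) is 6 and lcm(4, 6) is 12
print("gcd: "++gcd(12, 18)++", lcm: "++lcm(4, 6)++"\n")

// Seeding with the current timestamp; pass a fixed number
// instead for repeatable runs
set_seed(timestamp())

int[] rolls
for (int i : 0..5) {
    append(rolls, rand_int(1, 6))   // both bounds are included
}
print("total: "++sum(rolls)++"\n")
```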
IO works with the built-in structure File.
- File open(string path, string mode) - Opens a file and returns a File structure handle. path is the relative or absolute path to the file and mode is one of: "r" (read), "w" (write), "a" (append), "r+" (read/update), "w+" (write/update), "a+" (append/update). For binary files suffix the mode with b ("rb"). On failure File.handle will be 0.
- bool close(File f) - Closes an opened file. On success returns true, false otherwise.
- Reading whole input
  - string read(File f) - Reads the whole file f and returns it as a string.
- Reading one character
  - string getc(File f) - Reads one character from a file and returns it as a string (the string is empty if EOF is reached).
  - string getc() - Reads one character from stdin and returns it as a string (the string is empty if EOF is reached).
- Input
  - string input() - Reads one line from stdin.
  - string input(string prompt) - Prints out prompt and then reads input from stdin.
- Reading command line arguments
  - string[] args - This is a global variable, not a function. It contains all the command line arguments passed to the program.
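The IO functions above could be combined as in the following sketch. The file name notes.txt is a placeholder, and comparing File.handle with 0 follows the failure note in the description of open; the type of handle is not documented here, so that comparison is an assumption.

```pt
// "notes.txt" is a placeholder path
File f
f = open("notes.txt", "r")
if (f.handle != 0) {
    // Read and print the whole file
    print(read(f))
    if (not close(f)) {
        print("could not close the file\n")
    }
} else {
    print("could not open the file\n")
}

// args is a global variable, so it can be iterated directly
for (string arg : args) {
    print(arg++"\n")
}

// Prompt the user and read one line from stdin
string name
name = input("Your name: ")
print("Hello, "++name++"\n")
```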
Proto-typed compiler (ptc) uses LLVM and can target any of the many targets LLVM can compile for. The ptc also relies on LibC.
If you encounter any issues not mentioned here, you can report them in the GitHub issues section.
Because some types require initialization, only non-initialized types can be assigned to a global variable in its declaration. A workaround is to assign the value after the declaration:
int a = 4
int[] arr_wrong = [a, a+1, a+2] // Won't currently work
int[] arr_corr
arr_corr = [a, a+1, a+2] // Will work