Skip to content

Differences between Frege and Haskell

Ingo60 edited this page Oct 11, 2012 · 43 revisions

This document is intended for people who have a working knowledge of Haskell and want to write Frege code. The objective is not to describe each difference in every detail. For this, please consult the Frege Language Reference Manual, that can be obtained from the download page. It is rather just an overview with brief explanations. However, if you feel some difference you stumble upon should be included here, please open an issue on the project page.

I refer to the Haskell language as described in the Haskell 2010 Language Report. This document follows in structure the Haskell 2010 Language Report. I give the section numbers for easy reference. Most of the time I will not repeat how things are in Haskell, for example I will not write "In Haskell, this is such and such, whereas in Frege, it is thus." Rather, I just write "thus". Words and phrases like "not yet", "currently", "at this time" etc. indicate the possibility, that the incompatibility they refer to will be eliminated later.

##General differences

  • The Foreign Function Interface is missing, instead of this there are language constructs to make Java types and methods usable. All primitive types are just Java types.

  • Higher rank polymorphism is supported out of the box. (Other GHC extensions like multi paramter typeclasses and functional dependencies may be implemented sooner or later.)

  • Data types act as namespaces within modules, this makes it possible to have record fields whose names do not pollute the top level namespace.

##Library/Prelude

The Frege-Prelude defines many functions, type classes and types known from Haskell in a similar way. The goal is to implement most of the Haskell 2010 standard library in a most compatible way as far as this makes sense.

Apart from that, everything else will be quite different, because Frege uses the Java APIs whenever possible. At the time being, there is not much library support yet beyond the standard libraries.

##1.4 Namespaces / 5 Modules

The compilation unit is a package or module, whose name is a dot-separated sequence of identifiers, where the last one must start with an uppercase letter. Packages names are used only in import clauses. In all other cases, they are referenced by the simple name that was assigned in the import clause or, if there was none assigned, by the last component of the package name. This is known as name space identifier.

There are no export clauses. Constructors, functions and data types that must not be accessible from other modules can be marked private. A data type declaration can be marked abstract, this is equivalent to making all constructors private.

Every module, not just Main, can have a function

main :: [String] -> IO ()

Execution of the Java class file that results from compiling of such a package will apply this function to the list of command line arguments and then run the resulting IO action.

##2.3 Comments

Comments introduced with {-- (block comment) or --- (line comment) are treated as documentation text. Documentation text looks like a comment, but is syntactically a single token that may occur at certain points in the program only. The text of the comment makes it down to the compiled module (Java class file) from where it can be retrieved later, for example by documentation tools.

##2.4 Identifiers and Operators

  • default, deriving, foreign and newtype are not keywords.
  • abstract, derive, false, native, package, private, protected, public, pure and true are keywords.
  • Currently, no identifier may start with an underscore except the identifier _. The latter may only appear in patterns, where it signals an uninteresting value, as usual.
  • Identifiers, but not operators, can be qualified with name space identifiers.
  • The only operator allowed as data constructor is :
  • ! and ? are unary operators. Unary operators bind yet stronger than function application, so that f !a is parsed as f (!a)
  • Frege has 18 precedence levels, 16 can be assigned to user defined operators. The infix directives work on the lexical level only.
  • In infix directives, it is mandatory to enclose new operators composed of non word characters in accent marks.

Here is the table of standard operators from the Prelude:

--       18 `!` `?`                -- unary operators
--       17                        -- function application
infixr   16 `<~`  `•`              -- a <~ (b <~ c) == (c ~> b) ~> a
infixl   16 `!!` `~>`
infixr   15 `**` `??` 
infix    15 `=~` `!~` `?~` `/~` `~` `~~` `~~~`
infixl   14 `*` `/` `%` `mod` `rem` `div` `quot`
infixl   13 `+`                    -- - is handled specially
infixr   13 `++`
infixl   12 `<<` `bshl` `bshr`
infixl   11 `band`
infixl   10 `bor` `bxor`
infix     9 `<=` `>=` `<` `>` `elem` `notElem`
infix     8 `<=>` `compare`
infix     7 `==` `!=` `/=` `===` `!==` 
infixr    6 `&&` `and`
infixr    5 `||` `or` `xor`
infix     4 `..`
infixr    4 `:`
infixr    3 `>>` `>>=`              -- monad bind
infixr    2 `:=` `=<<` `@` `seq`    -- so that in x@a:bs 
                                    -- x is bound to the list
infixr    1 `$` `$!`

##2.5 Numeric Literals

Frege accepts Java style numeric literals, except hexadecimal floating point literals. In addition, integer literals can have trailing groups of 3 digits separated by underscores.

A decimal integer literal followed by the letter n or N has type Integer, for other numeric literals one gets the Frege type by capitalizing the first letter of the name of the type that the literal would have in Java (i.e. int -> Int, float -> Float).

Numeric literals are not overloaded. The Num class has a function

fromInt :: Num a => Int -> a

that can be used to overload integer literals manually.

##2.6 Character and String Literals

Character and string literals are the same as in Java. A string literal denotes a value of type String. A string value is not a list of characters but an instance of the Java class java.lang.String.

There exists a type class ListSource in the Prelude that abstracts over types one can make ordinary lists from by providing a toListoperation. String is an instance of this class, as well as Maybe and arrays. List comprehension generators accept instances of ListSource right from the arrow.

Besides the more general and lazy toList, the Prelude functions packed and unpacked convert eagerly from lists of characters to strings and back.

##Regular expression literals

A sequence of characters included in grave accent marks ´ denotes a value of type Regex, which is the Frege incarnation of java.util.regex.Pattern.

##Boolean literals

Because Bool is not an algebraic data type in Frege but an abstract one that wraps the boolean type of Java there are no constructors True and False. Rather, the keywords true and false act as literals for the two boolean values.

##2.7 Layout

  • The token following a where, let, do or of keyword must be an opening brace or it must be more indented than the current level, otherwise the layout algorithm will insert a closing brace which will result in a syntax error. In other words, the token sequence { } will never be inserted.
  • The handling of layout is a pure lexical matter in Frege, hence, it is not possible to insert a closing brace “if an illegal lexeme is encountered at a point where a close brace would be legal”.
  • However, a closing brace is inserted before the keyword in regardless of the indentation unless it is already preceded by a closing brace. Therefore one can still write let expressions without braces on a single line like so:

let a = 6; b=7 in a*b

##3 Expressions, Patterns

  • The expression hierarchy is different from Haskell’s. if, case and let expressions as well as lambda abstractions bind less tightly than infix expressions, while do expressions are terms (aka aexp). In practice, this should seldom make a difference as to how expressions are parsed.
  • A lambda abstraction has exactly one pattern. However, \a -> \b -> x can be abbreviated as \a \b -> x
  • There are no arithmetic sequences yet. The .. operator from the Enum class may serve as a poor substitute in some cases. The following holds

a..b = takeWhile (<=b) (iterate succ a)

Note that a..b is an ordinary infix expression and is not written in brackets - no syntactic sugar here.

  • There is a whole range of syntactic sugar for field access, test for field existence, nondestructive update of data values with fields and nondestructive change of values through application of functions to fields. This applies to algebraic data types that have constructors with field labels.
  • If e has type T u1 ... uk, the expression e.f is transformed to (T.f e), i.e. the function f from namespace T applied to e. Otherwise, if f is a type class operation of type class C it means (C.f e). This is so because every type (constructor) name and class name is associated with a separate name space.
  • The function composition operator is . The character can be produced by holding the ALT key and typing 0149 on the numeric keypad (in Windows). In Linux, holding the COMPOSE key and typing .= should work. There are also tools like AutoHotKey that make it possible to bind arbitrary key combinations to produce arbitrary characters. Last but not least, editors often provide the possibility to bind key combinations to macros that insert arvitrary text.
  • However, to minimize the incompatibility, a dot that is not part of another operator will be treated as if it were just if: the previous token was a ( or the next token is a ) or the . is enclosed in whitespace on both sides.
  • There is no notation to make a pattern irrefutable. However, patterns of product types like tuples are irrefutable unless they contain refutable sub-patterns. In general, all products are matched as lazily as possible.
  • The precedence rules for patterns are subtly different from Haskell's, but one easy rule is enough to understand them: In Frege, patterns are syntactically just expressions (in fact, there is not even a separate grammar rule for patterns), though of course only a subset of all possible expressions make valid patterns.

A consequence is that if the left hand side of a binding is syntactically an application of some function or operator, it will be interpreted as function binding for that function except if the operator is @, which is lexically a right associative operator that binds weaker than : and treated specially so as to allow traditional pattern bindings like in

let qs@(q:_) = ....

Note that the parentheses are not needed in Frege.

Hence, f y@(x:xs) = ... is taken as pattern binding, even if f y is not a valid pattern (which causes this construct to fail, but only after it is parsed). If one meant to define a function f that takes a list y with head x and tail xs, one writes f (y@x:xs) = ....

Likewise, application of ! is syntactically an unary expression, not an application, therefore a construct like !p = ... is interpreted as (strict) binding for pattern p and

f !x !y = ...

binds f to a function that takes 2 strict arguments.

The rule of thumb is to put sub patterns and arguments of function bindings in parentheses when they are more complex than !v, because the syntax works exactly like on the right hand side.

##4.2 User-Defined Datatypes

  • No constraints can be specified in constructor declarations.
  • Infix notation can not be used in constructor declarations (but is possible in constructor application).
  • Every type declared with data owns a namespace. The data declaration can have a where clause, and value bindings in the scope of that where clause go in the type’s namespace. Qualified access of those bindings is possible, the qualifier is the type constructor name.
  • Field labels are not global, but live in the namespace associated with the type. Hence, if type T has a (data constructor that has a) field m and the type of e is an application of T, then e.m accesses the field and T.m is a function that extracts the field from any value v whose type is an application of T, provided v was constructed with a data constructor in whose field list m appears.
  • In constructors with record field syntax, the record fields can have polymorphicfunction types, even higher ranked ones. Note that type variables that did not appear as arguments to the type constructor must be protected by a forall, e.g.:

data ListFunctions = LF { rev, tl :: forall a.[a]->[a] }

  • There is no deriving clause. See below for derive declarations that serve the same purpose.
  • No strictness flags on individual fields are possible, instead the whole variant can be made strict by placing a ! before the constructor in data type declaration. This causes the constructor to be strict in all arguments and all fields are kept in weak head normal form or in unboxed native form for fields with native types.
  • Any Java class or interface can be declared as an abstract Frege types.

##4.2.3 Datatype Renamimgs

There is no newtype keyword. An algebraic data type with exactly one constructor with exactly one field serves the same purpose.

##4.3.1 and 4.3.2 Constraints in Class and Instance Declarations

The Frege syntax reflects the idea, that a constraint is part of a type. Therefore, instead of

class Eq a => Ord a
instance (Ord a, Ord b) => Ord (a,b)

we write

class Ord Eq a => a
instance Ord (Ord a, Ord b) => (a, b)

Hence, the class we are talking about comes next to the keyword, followed by the (possibly constrained) type we are talking about.

##4.3.2 Instance Declarations

  • Type synonyms in instance declarations are allowed.
  • The arguments to a type constructor need not be distinct type variables, they can also be type applications or the same type variable can be used more than once. However, since at most one instance per class and type constructor may be in place, this is not generally recommended. In addition, I am not sure if this will work with multi-parameter classes and instances (not yet implemented), so the best bet is to stick with the Haskell 2010 rules.

##4.3.3 Derived Instances

A variant of the instance declaration uses the keyword derive in place of instance and lacks a where clause. It serves the same purpose as the deriving keyword in Haskell, but because it is independent of the data declaration it can occur everywhere an instance declaration can, even in a different module. Hence, class, data and derive declarations could be in up to 3 different modules.

If the instantiated type is given without constraints, derive automatically assumes a simple one. For example, one can have

data T a b c = .....
derive Eq  T a b c
-- short for derive Eq (Eq a, Eq b, Eq c) => T a b c

Currently, instances for Eq, Ord, Show, Enum and Bounded can be derived. Derived Show instances currently do not show field labels/record syntax.

##8 Foreign Function Interface

Frege has language constructs to

  • use any existing Java primitive, class or interface type as abstract data type. Actually, all basic types like Bool, Char, Int, Long, Integer, Float, Double, String, Regex are just abstract datatypes borrowed from Java.
  • make existing java methods and even java operators available as Frege functions. These are known as native functions.

While the interoperability with other languages is not strictly restricted to Java it certainly can't cross the border of the JVM.

##Miscellaneous

Functions can have higher ranked types, provided an appropriate annotation was given. The compiler can check higher rank types, but cannot infer them.

There is no monomorphism restriction. Beware of polymorphic top level "values", they will turn into functions at runtime.

Local function bindings that do not mention any local names from outer local scopes are silently made into private top level functions. They can be polymorphic.

The types of the remaining un-annotated bindings will not be quantified. If one needs a polymorphic local function, an annotation is needed. Note that there are cases where it is not possible to write the type of such a function, the Haskell 2010 report has examples of this.

Likewise, values bound by a lambda or case alternative are doomed to be monomorphic unless there is a type annotation that allows the compiler to infer a polymorphic type.

Here are three ways to achieve polymorphism for a lambda bound value:

-- compiler knows that rev is polymorphic from data definition (see above)
foo (LF{rev}) = (rev [1,2], rev['a', 'b'])

-- compiler learns type of rev from bars type signature
bar :: (forall a.[a]->[a]) -> ([Int], [Char])
bar rev = (rev [1,2], rev['a', 'b'])

-- compiler learns type of rev from pattern signature
baz (rev :: forall a.[a]->[a]) = (rev [1,2], rev ['a', 'b'])

This works similarly in case alternatives and with higher rank types.

Clone this wiki locally