Thrift IDL parser and code generator for the compact protocol

jhorstmann/compact-thrift

This is an alternative to the official Apache Thrift code generator, with a focus on the compact protocol.

The initial goal of this project is to develop a more efficient Rust parser for the metadata embedded in Apache Parquet files.

Higher performance is achieved by the following design decisions:

  • Fewer abstractions by focusing on the compact protocol.
    • For example, the generated code inlines the reading of field headers, which avoids method calls and the passing of slightly larger structures such as TFieldIdentifier.
    • The field id and field delta can be tracked inside the generated code, and boolean fields are handled the same way, which makes the actual protocol code much simpler.
  • The Rust target avoids moving structures from optional local variables into fields of the returned struct by filling the struct directly. Unfortunately, this requires all generated structs to implement the Default trait.
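The design decisions above can be illustrated with a minimal sketch. It is not the output of this generator: the struct Sample, its field ids, and the helper functions are hypothetical and were chosen only for illustration. The sketch assumes the standard compact protocol encoding, in which the high four bits of a field header hold the field id delta, the low four bits hold the type, and boolean values are carried in the type nibble itself.

```rust
/// Hypothetical struct standing in for a generated type (not from this project).
#[derive(Debug, Default, PartialEq)]
struct Sample {
    count: i32,    // assumed Thrift field 1, type i32
    enabled: bool, // assumed Thrift field 2, type bool
}

/// Reads an unsigned varint and advances the input slice.
fn read_uvarint(input: &mut &[u8]) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let (&byte, rest) = input.split_first()?;
        *input = rest;
        result |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift >= 64 {
            return None; // malformed input: varint is too long
        }
    }
}

/// Undoes the zigzag encoding used for signed integers.
fn zigzag_decode(v: u64) -> i64 {
    ((v >> 1) as i64) ^ -((v & 1) as i64)
}

/// Decodes a Sample, reading field headers inline and filling the
/// default-initialized struct directly instead of using optional locals.
fn read_sample(input: &mut &[u8]) -> Option<Sample> {
    let mut out = Sample::default();
    let mut last_field_id: i16 = 0; // tracked here, not in a protocol object
    loop {
        let (&header, rest) = input.split_first()?;
        *input = rest;
        if header == 0 {
            return Some(out); // STOP marks the end of the struct
        }
        let type_id = header & 0x0F;
        let delta = header >> 4;
        let field_id = if delta == 0 {
            // Long form: the field id follows as a zigzag varint.
            zigzag_decode(read_uvarint(input)?) as i16
        } else {
            last_field_id.checked_add(i16::from(delta))?
        };
        last_field_id = field_id;
        match (field_id, type_id) {
            (1, 5) => out.count = zigzag_decode(read_uvarint(input)?) as i32,
            (2, 1) => out.enabled = true,  // type 1 encodes boolean true
            (2, 2) => out.enabled = false, // type 2 encodes boolean false
            _ => return None, // this sketch does not skip unknown fields
        }
    }
}
```

Calling read_sample on the bytes 0x15 0xAC 0x02 0x11 0x00 yields a Sample with count 150 and enabled set to true. A real generator would also need to skip unknown fields and check that required fields are present, both of which are left out here.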

Even though the initial target language for the code generator is Rust, the code generator itself is written in Kotlin. The reasons for this choice are:

  • Using a JVM-based language gives access to one of the most developer-friendly parser generators, ANTLR. (There is a Rust implementation of ANTLR, but it is mostly unmaintained at the moment.)
  • Kotlin's sealed and data classes are very powerful for modeling domain objects (similar to Rust enums).
  • Kotlin has built-in support for string templates, which are checked at compile time.

The runtime support for the generated Rust code can be found in the src/main/rust folder.

How to run

To run this code generator you will need a Java distribution such as Amazon Corretto, and Apache Maven as the build tool. Once these are installed and their bin folders are on the PATH, the definitions for the included parquet.thrift can be generated by running:

$ ./generate-parquet.sh
