helgeho


Popular repositories

  1. ArchiveSpark Public

    An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

    Scala · 145 stars · 19 forks

  2. Web2Warc Public

    An easy-to-use and highly customizable crawler that lets you create your own small web archives (WARC/CDX).

    Scala · 24 stars · 4 forks

  3. internetarchive-transfer-scripts Public

    Scripts to transfer archive.org collections, using https://github.com/jjjake/internetarchive

    Python · 9 stars · 1 fork

  4. HadoopConcatGz Public

    A splittable Hadoop InputFormat for concatenated GZIP files and *.(w)arc.gz

    Java · 9 stars · 3 forks

  5. HadoopWebGraph Public

    A Hadoop input format for using graphs in WebGraph's BV format with Hadoop and Spark.

    Java · 8 stars · 3 forks

  6. Exspec Public

    Don't write specs anymore, just save 'em while testing your code interactively. Specs will become a byproduct.

    Ruby · 5 stars