GitHub - Spirals-Team/hadoop-benchmark: Docker containers to build a Hadoop infrastructure and experiment with feedback control loops on top of it.

Overview

Hadoop-Benchmark is an open-source research acceleration platform for rapid prototyping and evaluation of self-adaptive behaviors in Hadoop clusters. Its main objectives are to support

− rapid prototyping, i.e., experimenting with self-adaptation in Hadoop clusters without having to cope with low-level system infrastructure details,

− reproduction, i.e., sharing complete experiments so that others can reproduce them independently, and

− repetition, i.e., experimenting with and comparing work by re-running the same experiments on the same system using the same evaluation methods.

It uses docker and docker-machine to easily create a multi-node cluster (on a single laptop or in a cloud including Grid5000) and provision Hadoop. It contains a number of acknowledged benchmarks and one self-adaptive scenario.

The following is the high-level overview of the created cluster and deployed services:

Requirements

docker >= 1.12
docker-machine >= 0.8
(optional) R >= 3.3.2 with tidyverse and Hmisc for data analysis
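
The version requirements above can be checked before creating a cluster. The following is a minimal sketch, not part of the repository: the helper names `version_ge` and `check_tool` are hypothetical, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils) and that each tool prints a dotted version number in its `--version` output.

```shell
#!/bin/sh
# Hypothetical helpers for checking the requirements listed above.

# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether tool $1 is installed at version $2 or newer.
check_tool() {
    tool=$1
    minimum=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not found (need >= $minimum)"
        return 1
    fi
    # Take the first dotted number from the tool's --version output.
    found=$("$tool" --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if version_ge "$found" "$minimum"; then
        echo "$tool: $found (ok, need >= $minimum)"
    else
        echo "$tool: $found (too old, need >= $minimum)"
        return 1
    fi
}

# Report on both required tools without aborting on the first failure.
check_tool docker 1.12 || true
check_tool docker-machine 0.8 || true
```

The optional R dependency is left out of the sketch, since it is only needed for data analysis.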

Usage

./cluster.sh
Usage ./cluster.sh [OPTIONS] COMMAND

Options:

  -f, --force   Use '-f' in docker commands where applicable
  -n, --noop    Only shows which commands would be executed without actually executing them
  -q, --quiet   Do not print which commands are executed

Commands:

  Cluster:
    create-cluster
    start-cluster
    stop-cluster
    restart-cluster
    destroy-cluster
    status-cluster

  Hadoop:
    start-hadoop
    stop-hadoop
    restart-hadoop
    destroy-hadoop

  Misc:
    console                   Enter a bash console in a container connected to the cluster
    run-controller CMD        Run a command CMD in the controller container
    hdfs CMD                  Run the HDFS CMD command
    hdfs-download SRC         Download a file from HDFS SRC to current directory

  Info:
    shell-init      Shows information how to initialize current shell to connect to the cluster
                    Useful to execute like: 'eval $(./cluster.sh shell-init)'
    connect-info    Shows information how to connect to the cluster
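
As an illustration of how the commands listed above might be combined, here is a minimal sketch of a bring-up and tear-down sequence. It uses only the options and commands shown in the usage text; the function names `bring_up` and `tear_down`, the `CLUSTER` variable, and the particular ordering of steps are assumptions for illustration, not a workflow prescribed by the project.

```shell
#!/bin/sh
# Path to the repository's cluster.sh; override CLUSTER if it lives elsewhere.
CLUSTER="${CLUSTER:-./cluster.sh}"

# Hypothetical helper: create the cluster and start Hadoop on it.
bring_up() {
    # Preview first: --noop only shows which commands would be executed.
    "$CLUSTER" --noop create-cluster || return 1
    "$CLUSTER" create-cluster || return 1
    "$CLUSTER" start-hadoop || return 1
    "$CLUSTER" status-cluster
}

# Hypothetical helper: stop Hadoop, then stop the cluster.
tear_down() {
    "$CLUSTER" stop-hadoop
    "$CLUSTER" stop-cluster
}

# Example invocation, commented out so that sourcing this file has no side effects:
# bring_up && "$CLUSTER" console
```

Keeping the steps in functions makes it easy to rerun the same sequence for repeated experiments. After `bring_up`, the current shell can be connected to the cluster with `eval $(./cluster.sh shell-init)`, as the usage text suggests.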

Documentation

Check the tutorial to get started.
Check the screencast.
Check the demonstration of using hadoop-benchmark on Grid5000.

