Apache Flink is an open-source streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink is a top-level Apache project.
It is a scalable data analytics framework that integrates with the Hadoop ecosystem (for example HDFS and YARN). Flink can run both stream processing and batch processing jobs.
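The difference between the two modes can be sketched without Flink at all. The following is a plain-Python illustration, not the Flink API, and the function names `batch_word_count` and `stream_word_count` are made up for this note. The batch version reads a bounded input and returns one final result; the stream version keeps running state and emits an updated count for every word as it arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def batch_word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Batch style: consume the whole bounded input, then return one result."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return dict(counts)


def stream_word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stream style: keep running state and emit an update per incoming word."""
    state: Dict[str, int] = {}
    for line in lines:
        for word in line.lower().split():
            state[word] = state.get(word, 0) + 1
            yield word, state[word]
```

On the same input, the last update emitted per word by the stream version matches the batch result; the stream version simply makes the intermediate counts visible as well.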
Version: flink-1.2.0
$ vi flink-conf.yaml (change port number)
$ ./bin/start-local.sh
Run Flink job (JAR file):
$ ./bin/flink run wordcount.jar --port 9000
See: UI running jobs at http://ip:8080
logs:
$ cd flink-1.2.0/log/
$ tail -100f flink_jobmanager.com.out
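The `--port 9000` argument suggests that wordcount.jar reads its text from a socket, so something has to be listening on that port before the job starts. That is an assumption; the notes do not say what the job does with the port. Under that assumption, a minimal text source can be sketched in Python. The names `open_text_source` and `send_lines` are hypothetical helpers, not part of Flink.

```python
import socket
from typing import Iterable


def open_text_source(host: str = "127.0.0.1", port: int = 9000) -> socket.socket:
    """Bind and listen so that one client (assumed: the Flink job) can connect."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def send_lines(server: socket.socket, lines: Iterable[str]) -> None:
    """Wait for one client, send each line newline-terminated, then disconnect."""
    conn, _addr = server.accept()
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))


# Usage sketch: start the source first, then submit the job in another terminal.
#   source = open_text_source(port=9000)
#   send_lines(source, ["hello flink", "hello streams"])
#   source.close()
```

The sketch closes the connection after the last line, so a job reading from it would see the end of the stream at that point. For interactive typing, a tool such as netcat in listen mode on the same port serves the same purpose.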
Kernel: Runtime (Distributed Streaming Dataflow)
APIs and libraries:
DataSet API (Batch Processing)
DataStream API (Stream Processing)
Flink ML
Gelly (Graph)
Table (SQL)
Table (SQL using DataStream)
Deployment:
Local JVM (Single JVM)
Cluster (Standalone, YARN, Mesos, Tez)
Cloud (Google GCE, Amazon EC2)
Storage and sources:
Local FS
HDFS, S3
MongoDB, HBase, SQL, Cassandra, any file system
RabbitMQ, Kafka, Flume, MQTT
Project: flink-examples
Version: flink-1.2.0
vi flink-conf.yaml (change port number)
Windows:
start-cluster.bat
flink run E:/FlinkWorks/wordcount.jar --input E:/FlinkWorks/wordcount.txt --output E:/FlinkWorks/
refer: github flink Project: flink-examples
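The setup step edits flink-conf.yaml by hand to change a port, but the notes do not say which one. As a sketch, the same edit can be scripted. The helper `set_conf_value` below is hypothetical, and the key `jobmanager.web.port` (the web UI port in Flink 1.2 configuration files, default 8081) is an assumption about which port was meant; check the key names in your own flink-conf.yaml before using it.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key: value` set.

    Replaces the first uncommented `key: ...` line, or appends a new
    line if the key is not present. Only handles flat `key: value` entries.
    """
    pattern = re.compile(r"^[ \t]*" + re.escape(key) + r"[ \t]*:.*$", re.MULTILINE)
    new_line = f"{key}: {value}"
    if pattern.search(conf_text):
        # A function replacement avoids backslash interpretation in the value.
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    separator = "" if (not conf_text or conf_text.endswith("\n")) else "\n"
    return conf_text + separator + new_line + "\n"


# Usage sketch (path and key are assumptions, adjust to your installation):
#   with open("conf/flink-conf.yaml", encoding="utf-8") as f:
#       text = f.read()
#   with open("conf/flink-conf.yaml", "w", encoding="utf-8") as f:
#       f.write(set_conf_value(text, "jobmanager.web.port", "8080"))
```

Commented-out entries are left alone on purpose, so the original defaults stay visible in the file. Flink has to be restarted for the new value to take effect.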
After ./bin/start-local.sh, open the web UI: http://localhost:8081/#/overview (8081 is the default port)
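The dashboard URL can also be used for a scripted check that Flink is up. The sketch below assumes the web monitor serves a JSON summary at the path `/overview` on the same host and port as the dashboard, as the Flink 1.2 monitoring REST API describes; treat the path as an assumption and verify it against your version. `overview_endpoint` and `fetch_overview` are hypothetical helper names.

```python
import json
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def overview_endpoint(dashboard_url: str) -> str:
    """Turn a dashboard URL (with or without '#/overview') into the JSON URL."""
    parts = urlsplit(dashboard_url)
    # Keep scheme, host and port; drop the browser-only fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/overview", "", ""))


def fetch_overview(dashboard_url: str, timeout: float = 5.0) -> dict:
    """GET the assumed /overview endpoint and return the decoded JSON."""
    with urlopen(overview_endpoint(dashboard_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage sketch (requires a running Flink web monitor):
#   print(fetch_overview("http://localhost:8081/#/overview"))
```

If the call raises a connection error, the JobManager is most likely not running or the port in flink-conf.yaml differs from the one in the URL.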