This is the code for a talk @joemoe and I gave at Flink Forward Global 2021: Introduction to Flink in 30 minutes.
Import this repo into an IDE, such as IntelliJ, as a Maven project.
Before running any of these applications, first use docker-compose to start up a Kafka cluster:
docker-compose up -d
Running KafkaProducerJob in your IDE will populate an input topic with UsageRecords.
To run the jobs in this project in IntelliJ, choose the option "Include dependencies with Provided scope" in your Run Configuration.
After running KafkaProducerJob, you can run any of the other applications in this project (all of which expect to read from Kafka).
To watch what's happening in Kafka (using a local installation of Kafka, or from inside the container):
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic input --property print.timestamp=true
While a job is running in the IDE, you can use Flink's web UI at http://localhost:8081.
curl -O https://dlcdn.apache.org/flink/flink-1.14.3/flink-1.14.3-bin-scala_2.12.tgz
tar -xzf flink-*.tgz
cd flink-1.14.3
ls -la bin/
./bin/start-cluster.sh
Then browse to http://localhost:8081.
To see the cluster in action, run the streaming WordCount example and check its output:
./bin/flink run examples/streaming/WordCount.jar
tail log/flink-*-taskexecutor-*.out
If you have already started a cluster, stop it and reconfigure Flink to have more resources by modifying flink-1.14.3/conf/flink-conf.yaml so that it contains:
taskmanager.memory.process.size: 3072m
taskmanager.numberOfTaskSlots: 8
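If you would rather script this change than edit the file by hand, something like the following may work. It is only a sketch: set_conf is a hypothetical helper (not part of Flink or this repo), and it assumes the config file uses flat "key: value" lines, as the stock flink-conf.yaml does. It drops any existing line that starts with the key and appends the new setting.

```shell
# set_conf FILE KEY VALUE
# Hypothetical helper: set "KEY: VALUE" in a flat YAML file such as flink-conf.yaml.
# Existing uncommented lines for KEY are removed; commented-out lines are left alone.
set_conf() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Keep every line that does not begin with "KEY:".
  # index() does a literal match, so the dots in Flink keys are not treated as wildcards.
  awk -v k="${key}:" 'index($0, k) != 1' "$file" > "$tmp"
  # Append the new setting.
  printf '%s: %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file's permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example usage (assumes Flink was extracted into the current directory):
#   set_conf flink-1.14.3/conf/flink-conf.yaml taskmanager.memory.process.size 3072m
#   set_conf flink-1.14.3/conf/flink-conf.yaml taskmanager.numberOfTaskSlots 8
```

The settings only take effect when the cluster is started, so stop any running cluster before changing them.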
Then, with flink-1.14.3/bin in your PATH:
mvn clean package
start-cluster.sh
flink run -d target/flink-mobile-data-usage-1.0.jar
flink run -d -c com.ververica.flink.example.datausage.UsageAlertingProcessFunctionJob \
target/flink-mobile-data-usage-1.0.jar --webui false
curl https://flink.apache.org/q/quickstart.sh | bash -s 1.14.3
See Project Configuration in the Flink docs for more information.