This blueprint deploys a Kinesis Data Analytics (KDA) application that reads from MSK Serverless using IAM authentication and writes to S3, using the Java DataStream API:
- Flink version: 1.15.2
- Java version: 11
- Uses the new (in Flink 1.13) KafkaSource connector (FlinkKafkaConsumer is slated to be deprecated) and FileSink (StreamingFileSink is slated to be deprecated).
- Build app and copy resulting JAR to S3 location
- Deploy associated infra (MSK and KDA) using CDK script
- If using existing resources, you can simply update app properties in KDA.
- Perform data generation
- Maven
- AWS SDK v2
- AWS CDK v2 - for deploying associated infra (MSK and KDA app)
- First, let's set up some environment variables to make the deployment easier. Replace these values with your own S3 bucket, app name, etc.
export AWS_PROFILE=<<profile-name>>
export APP_NAME=<<name-of-your-app>>
export S3_BUCKET=<<your-s3-bucket-name>>
export S3_FILE_KEY=<<your-jar-name-on-s3>>
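Every later step reads these variables, so it can help to confirm they are all set before building. The following is a minimal sketch, not part of this project: `require_env` is a hypothetical helper name, and it assumes a POSIX-compatible shell.

```shell
# Hypothetical helper (not part of this project): print the name of every
# listed variable that is unset or empty, and fail if any is missing.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `require_env AWS_PROFILE APP_NAME S3_BUCKET S3_FILE_KEY || echo "export the variables above first"` prints the missing names to stderr, followed by the reminder.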
- Build the Java Flink application locally.
From the root directory of this project, run:
mvn clean package
- Copy the JAR to S3 so it can be referenced in the CDK deployment:
aws s3 cp target/<<your generated jar>> s3://${S3_BUCKET}/${S3_FILE_KEY}
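The copy step can be wrapped so that the destination URI is built in one place and the upload is confirmed afterwards. This is a sketch under assumptions: `jar_s3_uri` and `verify_jar_upload` are hypothetical helper names, `S3_BUCKET` is assumed to hold a bucket name (an accidental `s3://` prefix is tolerated), and the check relies on `aws s3api head-object`, which fails when the key does not exist.

```shell
# Hypothetical helper: print the destination URI for the application JAR.
# Tolerates S3_BUCKET with a leading "s3://" or trailing "/" and
# S3_FILE_KEY with a leading "/".
jar_s3_uri() {
  bucket="${S3_BUCKET#s3://}"
  bucket="${bucket%/}"
  printf 's3://%s/%s\n' "${bucket}" "${S3_FILE_KEY#/}"
}

# Hypothetical helper: succeed only if the JAR object exists in S3.
verify_jar_upload() {
  bucket="${S3_BUCKET#s3://}"
  bucket="${bucket%/}"
  aws s3api head-object --bucket "${bucket}" --key "${S3_FILE_KEY#/}"
}
```

After copying the JAR to the URI printed by `jar_s3_uri`, running `verify_jar_upload` returns a non-zero status if the object is missing.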
- Follow the instructions in the cdk-infra folder to deploy the infrastructure associated with this app, such as MSK Serverless and the Kinesis Data Analytics application.
- Follow the instructions in orders-datagen to create the topic and generate data into MSK.
- Start your Kinesis Data Analytics application from the AWS console.
- Run a Flink query or an S3 Select query against S3 to view the data written to S3.
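The console step for starting the application also has a CLI equivalent, which can be handy when scripting the whole deployment. A minimal sketch, assuming the application was created under the name held in `APP_NAME` and that the AWS CLI `kinesisanalyticsv2` commands are available; the function names `start_kda_app` and `kda_app_status` are hypothetical.

```shell
# Hypothetical helper: start the Kinesis Data Analytics application by name.
start_kda_app() {
  aws kinesisanalyticsv2 start-application --application-name "${APP_NAME}"
}

# Hypothetical helper: print the current status, e.g. STARTING or RUNNING.
kda_app_status() {
  aws kinesisanalyticsv2 describe-application \
    --application-name "${APP_NAME}" \
    --query 'ApplicationDetail.ApplicationStatus' \
    --output text
}
```

Calling `start_kda_app` and then checking `kda_app_status` until it reports `RUNNING` mirrors what the console shows.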