[Improve] Replace All Instances of "StreamPark" with "Apache StreamPark" in Official Documentation #324

Merged: 1 commit, Jan 23, 2024
6 changes: 3 additions & 3 deletions blog/0-streampark-flink-on-k8s.md
@@ -1,6 +1,6 @@
---
slug: streampark-flink-on-k8s
-title: StreamPark Flink on Kubernetes practice
+title: Apache StreamPark Flink on Kubernetes practice
tags: [StreamPark, Production Practice, FlinkSQL, Kubernetes]
description: Wuxin Technology was founded in January 2018. Its main business covers the research and development, design, manufacturing, and sales of RELX brand products. With core technologies and capabilities covering the entire industry chain, RELX is committed to providing users with products that are both high quality and safe.
---
@@ -69,7 +69,7 @@ kubectl -n flink-cluster get svc

The above is the process of deploying a Flink task to Kubernetes using the most basic scripting method that Flink provides, and it achieves only the most rudimentary task submission. Reaching a production-grade level still requires solving a series of problems: the method is too primitive to handle large batches of tasks, it cannot record task checkpoints or track real-time status, tasks are difficult to operate and monitor, there is no alarm mechanism, and tasks cannot be managed centrally.
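
For reference, here is a minimal sketch of that script-style submission using the Flink CLI in native Kubernetes application mode; the cluster id, namespace, image, and jar path are placeholder assumptions, not values from this post:

```bash
# A sketch of the "raw" submission flow described above (Flink native Kubernetes,
# application mode). Cluster id, namespace, image, and jar path are placeholders.
flink run-application \
  -t kubernetes-application \
  -Dkubernetes.cluster-id=my-flink-app \
  -Dkubernetes.namespace=flink-cluster \
  -Dkubernetes.container.image=flink:1.14 \
  local:///opt/flink/usrlib/my-job.jar

# Inspect the REST/UI service the deployment creates
kubectl -n flink-cluster get svc
```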

-## **Deploy Flink on Kubernetes using StreamPark**
+## **Deploy Flink on Kubernetes using Apache StreamPark**

Enterprise-level production environments place higher requirements on Flink on Kubernetes. Teams generally either build their own platform or purchase a related commercial product. Whichever solution is chosen, these product capabilities are common demands: large-scale task development and deployment, status tracking, operation and maintenance monitoring, failure alarms, unified task management, and high availability.

@@ -173,7 +173,7 @@ Next, let’s take a look at how StreamPark supports this capability:

From the above, we can see that StreamPark has the capabilities to support the development and deployment process of Flink on Kubernetes, including **job development, deployment, monitoring, operation and maintenance, and exception handling. StreamPark provides a relatively complete solution, already has some CI/CD and DevOps capabilities, and its overall completeness continues to improve. It is a product in the open-source field that supports one-stop development, deployment, operation, and maintenance across the full Flink on Kubernetes workflow. StreamPark is worthy of praise.**

-## **StreamPark’s implementation in Wuxin Technology**
+## **Apache StreamPark’s implementation in Wuxin Technology**

StreamPark was adopted relatively late at Wuxin Technology. It is currently used mainly for developing and deploying real-time data integration jobs and real-time indicator calculation jobs. There are both Jar tasks and Flink SQL tasks, all deployed on native Kubernetes; data sources include CDC and Kafka, and sinks include MaxCompute, Kafka, and Hive. The following is a screenshot of the company's development-environment StreamPark platform:

4 changes: 2 additions & 2 deletions blog/1-flink-framework-streampark.md
@@ -1,6 +1,6 @@
---
slug: flink-development-framework-streampark
-title: StreamPark - Powerful Flink Development Framework
+title: Apache StreamPark - Powerful Flink Development Framework
tags: [StreamPark, DataStream, FlinkSQL]
---

@@ -56,7 +56,7 @@ However, because object storage requires the entire object to be rewritten for r

<br/>

-## Introducing StreamPark
+## Introducing Apache StreamPark

Previously, when we wrote Flink SQL, we generally used Java to wrap SQL, packed it into a jar package, and submitted it to the S3 platform through the command line. This approach has always been unfriendly; the process is cumbersome, and the costs for development and operations are too high. We hoped to further streamline the process by abstracting the Flink TableEnvironment, letting the platform handle initialization, packaging, and running Flink tasks, and automating the building, testing, and deployment of Flink applications.

2 changes: 1 addition & 1 deletion blog/2-streampark-usercase-chinaunion.md
@@ -68,7 +68,7 @@ In terms of job operation and maintenance dilemmas, firstly, the job deployment

The various job operation and maintenance difficulties give rise to business support challenges, such as a high failure rate during launches, impact on data quality, lengthy launch times, high data latency, and missed alarm handling, all of which lead to complaints. In addition, the business impact of a job is often unclear, and once a problem arises, addressing it becomes the top priority.

-## **Integrated Management Based on StreamPark**
+## **Integrated Management Based on Apache StreamPark**

![](/blog/chinaunion/job_management.png)

4 changes: 2 additions & 2 deletions blog/3-streampark-usercase-bondex-paimon.md
@@ -1,6 +1,6 @@
---
slug: streampark-usercase-bondex-with-paimon
-title: Based on Apache Paimon + StreamPark's Streaming Data Warehouse Practice by Bondex
+title: Based on Apache Paimon + Apache StreamPark's Streaming Data Warehouse Practice by Bondex
tags: [StreamPark, Production Practice, paimon, streaming-warehouse]
---

@@ -236,7 +236,7 @@ docker push registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1.

Next, prepare the Paimon jar package. You can download the corresponding version from the Apache [Repository](https://repository.apache.org/content/groups/snapshots/org/apache/paimon). Note that it must be consistent with your Flink major version.
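
For example, fetching a jar from that repository could look like the sketch below; the artifact and version are placeholders to resolve against the repository listing and your Flink major version:

```bash
# <artifact> and <version> are placeholders; browse the repository linked above
# and pick the paimon-flink artifact matching your Flink major version
wget "https://repository.apache.org/content/groups/snapshots/org/apache/paimon/<artifact>/<version>/<artifact>-<version>.jar"
```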

-### **Managing Jobs with StreamPark**
+### **Managing Jobs with Apache StreamPark**

**Prerequisites:**

8 changes: 4 additions & 4 deletions blog/4-streampark-usercase-shunwang.md
@@ -1,6 +1,6 @@
---
slug: streampark-usercase-shunwang
-title: StreamPark in the Large-Scale Production Practice at Shunwang Technology
+title: Apache StreamPark in the Large-Scale Production Practice at Shunwang Technology
tags: [StreamPark, Production Practice, FlinkSQL]
---

@@ -71,13 +71,13 @@ To view logs for a job, developers must go through multiple steps, which to some

![Image](/blog/shunwang/step.png)

-## **Why Use StreamPark**
+## **Why Use Apache StreamPark**

Faced with the defects of our self-developed platform Streaming-Launcher, we have been considering how to further lower the barriers to using Flink and improve work efficiency. Considering the cost of human resources and time, we decided to seek help from the open-source community and look for an appropriate open-source project to manage and maintain our Flink tasks.



-### 01 **StreamPark: A Powerful Tool for Solving Flink Issues**
+### 01 **Apache StreamPark: A Powerful Tool for Solving Flink Issues**

Fortunately, in early June 2022, we stumbled upon StreamPark on GitHub and embarked on a preliminary exploration full of hope. We found that StreamPark's capabilities can be broadly divided into three areas: user permission management, job operation and maintenance management, and development scaffolding.

@@ -109,7 +109,7 @@ Further research revealed that StreamPark is not just a platform but also includ



-### 02 **How StreamPark Addresses Issues of the Self-Developed Platform**
+### 02 **How Apache StreamPark Addresses Issues of the Self-Developed Platform**

We briefly introduced the core capabilities of StreamPark above. During the technology selection process at Shunwang Technology, we found that StreamPark not only includes the basic functions of our existing Streaming-Launcher but also offers a more complete set of solutions to address its many shortcomings. Here, we focus on the solutions provided by StreamPark for the deficiencies of our self-developed platform, Streaming-Launcher.

2 changes: 1 addition & 1 deletion blog/5-streampark-usercase-dustess.md
@@ -1,6 +1,6 @@
---
slug: streampark-usercase-dustess
-title: StreamPark's Best Practices at Dustess, Simplifying Complexity for the Ultimate Experience
+title: Apache StreamPark's Best Practices at Dustess, Simplifying Complexity for the Ultimate Experience
tags: [StreamPark, Production Practice, FlinkSQL]
---

4 changes: 2 additions & 2 deletions blog/6-streampark-usercase-joyme.md
@@ -1,6 +1,6 @@
---
slug: streampark-usercase-joyme
-title: StreamPark's Production Practice in Joyme
+title: Apache StreamPark's Production Practice in Joyme
tags: [StreamPark, Production Practice, FlinkSQL]
---

@@ -16,7 +16,7 @@ tags: [StreamPark, Production Practice, FlinkSQL]

<!-- truncate -->

-## 1 Encountering StreamPark
+## 1 Encountering Apache StreamPark

Encountering StreamPark was inevitable. Based on our existing real-time job development mode, we had to find an open-source platform to support our company's real-time business. Our current situation was as follows:

2 changes: 1 addition & 1 deletion blog/7-streampark-usercase-haibo.md
@@ -18,7 +18,7 @@ Haibo Tech is an industry-leading company offering AI IoT products and solutions
<!-- truncate -->


-## **01. Choosing StreamPark**
+## **01. Choosing Apache StreamPark**

Haibo Tech has been using Flink SQL to aggregate and process various kinds of real-time IoT data since 2020. With the accelerating pace of smart city construction in various cities, the types and volume of IoT data to be aggregated keep growing, and so does the number of Flink SQL tasks maintained online, making a dedicated platform for managing the numerous Flink SQL tasks an urgent need.

8 changes: 4 additions & 4 deletions blog/8-streampark-usercase-ziru.md
@@ -1,6 +1,6 @@
---
slug: streampark-usercase-ziru
title: Ziroom's Real-Time Computing Platform Practice Based on Apache StreamPark
tags: [StreamPark, Production Practice]
---

@@ -75,7 +75,7 @@ After analyzing the pros and cons of many open-source projects, we decided to pa

3. On the basis of StreamPark, we aim to promote integration with the existing ecosystem of the company to better meet our business needs.

-## **In-depth Practice Based on StreamPark**
+## **In-depth Practice Based on Apache StreamPark**

Based on the above decisions, we initiated the evolution of the real-time computing platform, oriented by "pain point needs," and built a stable, efficient, and easy-to-maintain real-time computing platform based on StreamPark. Since the beginning of 2022, we have participated in the construction of the community while officially scheduling our internal platform construction.

@@ -225,7 +225,7 @@ User B's actual execution SQL:
SELECT name, Encryption_function(age), price, Sensitive_field_functions(phone) FROM user;
```

-### **06 Data Synchronization Platform Based on StreamPark**
+### **06 Data Synchronization Platform Based on Apache StreamPark**

With the successful implementation of StreamPark's technical solutions in the company, we achieved deep support for Flink jobs, bringing a qualitative leap in data processing. This prompted us to completely revamp our past data synchronization logic, aiming to reduce operational costs through technical optimization and integration. We therefore gradually replaced historical Sqoop jobs, Canal jobs, and Hive JDBC Handler jobs with Flink CDC jobs and Flink streaming and batch jobs. In this process, we continued to optimize and strengthen StreamPark's interface capabilities, adding a status callback mechanism and achieving seamless integration with the DolphinScheduler [7] scheduling system, further enhancing our data processing capabilities.

@@ -363,7 +363,7 @@ Clicking Sync Conf will synchronize the global configuration file, and new jobs

![](/blog/ziru/sync_conf.png)

-### **05 StreamPark DNS Resolution Configuration**
+### **05 Apache StreamPark DNS Resolution Configuration**

A correct and reasonable DNS resolution configuration is very important when submitting FlinkSQL on the StreamPark platform. It mainly involves the following points:

2 changes: 1 addition & 1 deletion docs/connector/3-clickhouse.md
@@ -65,7 +65,7 @@ public class ClickHouseUtil {

Splicing the various parameters into the request URL like this is cumbersome and hard-coded, which is very inflexible.

-### Write with StreamPark
+### Write with Apache StreamPark

To access `ClickHouse` with `StreamPark`, you only need to define the configuration file in the specified format and then write the code.
The configuration and code are as follows: the `ClickHouse JDBC` configuration in `StreamPark` goes in the configuration list, and the sample program is written in Scala.
2 changes: 1 addition & 1 deletion docs/connector/4-doris.md
@@ -14,7 +14,7 @@ import TabItem from '@theme/TabItem';
which can support highly concurrent point-query scenarios.
StreamPark encapsulates DorisSink for writing data to Doris in real time, based on [Doris' stream load](https://doris.apache.org/administrator-guide/load-data/stream-load-manual.html)

-### Write with StreamPark
+### Write with Apache StreamPark

Use `StreamPark` to write data to `Doris`. DorisSink currently only supports writing in (single-level) JSON format,
for example: {"id":1,"name":"streampark"}. The sample program is written in Java, as follows:
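
Since DorisSink is built on stream load, the underlying mechanism can be illustrated with a plain HTTP call; the FE host, credentials, database, and table below are placeholder assumptions:

```bash
# A sketch of the Doris stream-load request that DorisSink builds on.
# FE host, credentials, database, and table are placeholders.
curl --location-trusted -u root:password \
  -H "format: json" \
  -H "label: streampark-demo-1" \
  -T data.json \
  http://doris-fe:8030/api/example_db/example_tbl/_stream_load
```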
4 changes: 2 additions & 2 deletions docs/connector/5-es.md
@@ -204,7 +204,7 @@ The ElasticsearchSink created above is very inflexible to add parameters. `Strea
Users only need to configure the Elasticsearch connection parameters and the Flink runtime parameters; StreamPark automatically assembles the source and sink,
which greatly simplifies the development logic and improves development efficiency and maintainability.

-## Using StreamPark to write to Elasticsearch
+## Using Apache StreamPark to write to Elasticsearch

With Flink checkpointing enabled, ElasticsearchSink guarantees that action requests are delivered to the Elasticsearch cluster at least once.

@@ -376,7 +376,7 @@ See [Official Documentation](https://nightlies.apache.org/flink/flink-docs-relea
The BulkProcessor inside the Elasticsearch sink can be further configured to control how buffered action requests are flushed;
see the [official documentation](https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/connectors/datastream/elasticsearch/#elasticsearch-sink) for details: **Configuring the Internal Bulk Processor**

-### StreamPark configuration
+### Apache StreamPark configuration

All other configurations must comply with the StreamPark configuration.
For [specific configurable](/docs/development/conf) items and the role of each parameter,
2 changes: 1 addition & 1 deletion docs/connector/6-hbase.md
@@ -243,7 +243,7 @@ Reading and writing HBase in this way is cumbersome and inconvenient. `StreamPar
Users only need to configure the HBase connection parameters and the Flink runtime parameters; StreamPark will automatically assemble the source and sink,
which greatly simplifies the development logic and improves development efficiency and maintainability.

-## Write and read HBase with StreamPark
+## Write and read HBase with Apache StreamPark

### 1. Configure policies and connection information

2 changes: 1 addition & 1 deletion docs/connector/7-http.md
@@ -29,7 +29,7 @@ Asynchronous writing uses asynchttpclient as the client, you need to import the
</dependency>
```

-## Write with StreamPark
+## Write with Apache StreamPark

### Types supported for asynchronous HTTP writes

2 changes: 1 addition & 1 deletion docs/connector/8-redis.md
@@ -166,7 +166,7 @@ public class FlinkRedisSink {
Creating FlinkJedisPoolConfig as above is tedious, and every Redis operation requires building a RedisMapper, which is very inflexible. `StreamPark` follows the principle of convention over configuration along with automatic configuration: users only need to configure the Redis connection parameters, and
StreamPark automatically assembles the source and sink, which greatly simplifies the development logic and improves development efficiency and maintainability.

-## StreamPark Writes to Redis
+## Apache StreamPark Writes to Redis

RedisSink defaults to AT_LEAST_ONCE processing semantics; with checkpointing enabled, its two-phase commit supports EXACTLY_ONCE semantics. Available connection types are single-node mode and sentinel mode.

6 changes: 3 additions & 3 deletions docs/intro.md
@@ -1,6 +1,6 @@
---
id: 'intro'
-title: 'What is StreamPark'
+title: 'What is Apache StreamPark'
sidebar_position: 1
---

@@ -36,13 +36,13 @@ The overall architecture of Apache StreamPark is shown in the following figure.

![StreamPark Archite](/doc/image_en/streampark_archite.png)

-### 1️⃣ StreamPark-core
+### 1️⃣ Apache StreamPark-core

`StreamPark-core` is positioned as a development framework: it focuses on coding, standardizes configuration files, and follows the convention-over-configuration principle.
StreamPark-core provides a development-time runtime context and a series of out-of-the-box connectors. Cumbersome operations are simplified by extending `DataStream`-related methods and integrating the DataStream and `Flink SQL` APIs,
greatly improving development efficiency and the development experience, because users can focus on the business logic.

-### 2️⃣ StreamPark-console
+### 2️⃣ Apache StreamPark-console

`StreamPark-console` is a comprehensive real-time `low code` data platform that makes managing `Flink` tasks more convenient.
It distills many best practices and integrates many functions such as project compilation, release,
10 changes: 5 additions & 5 deletions docs/user-guide/11-platformInstall.md
@@ -75,7 +75,7 @@ flink -v
cp mysql-connector-java-8.0.28.jar /usr/local/streampark/lib
```
![4_mysql_dep](/doc/image/install/4_mysql_dep.png)
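If the driver jar is not already on the host, it can be fetched from Maven Central first; the version matches the step above, while the working directory is an assumption about your environment:

```bash
# Fetch the MySQL JDBC driver from Maven Central, then copy it into StreamPark's lib
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar
cp mysql-connector-java-8.0.28.jar /usr/local/streampark/lib
```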
-## Download StreamPark
+## Download Apache StreamPark
> Download URL: [https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz)

> Upload [apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz) to the server /usr/local path
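
Equivalently, the package can be fetched and unpacked directly on the server, a sketch using the URL and path above:

```bash
# Download and unpack the release under /usr/local (URL and path from this guide)
cd /usr/local
wget https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz
tar -xzf apache-streampark_2.12-2.0.0-incubating-bin.tar.gz
```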
@@ -131,7 +131,7 @@ show tables;
```
![13_show_streampark_db_tables](/doc/image/install/13_show_streampark_db_tables.png)

-## StreamPark Configuration
+## Apache StreamPark Configuration
> Purpose: Configure the data sources needed for startup.
> Configuration file location: /usr/local/streampark/conf

@@ -188,13 +188,13 @@ vim application.yml
> 5. **java.security.krb5.conf=/etc/krb5.conf**

![19_kerberos_yml_config](/doc/image/install/19_kerberos_yml_config.png)
-## Starting StreamPark
-## Enter the StreamPark Installation Path on the Server
+## Starting Apache StreamPark
+## Enter the Apache StreamPark Installation Path on the Server
```bash
cd /usr/local/streampark/
```
![20_enter_streampark_dir](/doc/image/install/20_enter_streampark_dir.png)
-## Start the StreamPark Service
+## Start the Apache StreamPark Service
```bash
./bin/startup.sh
```
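
After startup, a couple of sanity checks can confirm the service came up; the log file name and the default web port (10000) are assumptions to verify against conf/application.yml:

```bash
# Log file name and port are assumptions; check conf/application.yml if they differ
tail -n 100 /usr/local/streampark/logs/streampark.out
ss -lntp | grep 10000
```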