From d0f5b6ecf7dd2c4c81a3961eb66b7355dd33c3ba Mon Sep 17 00:00:00 2001 From: Cason <1125193113@qq.com> Date: Mon, 22 Jan 2024 22:19:52 +0800 Subject: [PATCH] =?UTF-8?q?Replace=20Instances=20of=20"StreamPark"=20wi?= =?UTF-8?q?th=20"Apache=20StreamPark=E2=84=A2"=20in=20Official=20Documenta?= =?UTF-8?q?tion?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- blog/0-streampark-flink-on-k8s.md | 6 ++--- blog/1-flink-framework-streampark.md | 4 +-- blog/2-streampark-usercase-chinaunion.md | 2 +- blog/3-streampark-usercase-bondex-paimon.md | 4 +-- blog/4-streampark-usercase-shunwang.md | 8 +++--- blog/5-streampark-usercase-dustess.md | 2 +- blog/6-streampark-usercase-joyme.md | 4 +-- blog/7-streampark-usercase-haibo.md | 2 +- blog/8-streampark-usercase-ziru.md | 8 +++--- docs/connector/3-clickhouse.md | 2 +- docs/connector/4-doris.md | 2 +- docs/connector/5-es.md | 4 +-- docs/connector/6-hbase.md | 2 +- docs/connector/7-http.md | 2 +- docs/connector/8-redis.md | 2 +- docs/intro.md | 6 ++--- docs/user-guide/11-platformInstall.md | 10 +++---- docs/user-guide/12-platformBasicUsage.md | 26 +++++++++---------- docs/user-guide/4-dockerDeployment.md | 6 ++--- .../0-streampark-flink-on-k8s.md | 8 +++--- .../1-flink-framework-streampark.md | 4 +-- .../2-streampark-usercase-chinaunion.md | 2 +- .../3-streampark-usercase-bondex-paimon.md | 4 +-- .../4-streampark-usercase-shunwang.md | 8 +++--- .../5-streampark-usercase-dustess.md | 2 +- .../6-streampark-usercase-joyme.md | 4 +-- .../7-streampark-usercase-haibo.md | 4 +-- .../8-streampark-usercase-ziru.md | 8 +++--- .../contribution_guide/become_committer.md | 2 +- .../contribution_guide/become_pmc_member.md | 2 +- .../current/release/How-to-release.md | 2 +- .../current/connector/3-clickhouse.md | 4 +-- .../current/connector/4-doris.md | 2 +- .../current/connector/5-es.md | 4 +-- .../current/connector/6-hbase.md | 2 +- .../current/connector/7-http.md | 2 +- .../current/connector/8-redis.md | 2 +- .../current/user-guide/11-platformInstall.md | 10 +++---- .../user-guide/12-platformBasicUsage.md | 26 +++++++++---------- .../current/user-guide/4-dockerDeployment.md | 6 ++--- 40 files changed, 105 insertions(+), 105 deletions(-) diff --git a/blog/0-streampark-flink-on-k8s.md b/blog/0-streampark-flink-on-k8s.md index 4474edbaa..8bf6ca31d 100644 --- a/blog/0-streampark-flink-on-k8s.md +++ b/blog/0-streampark-flink-on-k8s.md @@ -1,6 +1,6 @@ --- slug: streampark-flink-on-k8s -title: StreamPark Flink on Kubernetes practice +title: Apache StreamPark™ Flink on Kubernetes practice tags: [StreamPark, Production Practice, FlinkSQL, Kubernetes] description: Wuxin Technology was founded in January 2018. Its current main business includes the research and development, design, manufacturing and sales of RELX brand products. With core technologies and capabilities covering the entire industry chain, RELX is committed to providing users with products that are both high quality and safe. --- @@ -69,7 +69,7 @@ kubectl -n flink-cluster get svc The above is the process of deploying a Flink task to Kubernetes using the most original script method provided by Flink. Only the most basic task submission is achieved.
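To make the script-based flow above concrete, a minimal sketch of an Application-mode submission with the stock Flink CLI might look like the following; the image name, cluster-id, and jar path are illustrative assumptions, not values from the original post:

```bash
# Build and push an image that bundles the job jar
# (the Dockerfile is assumed to COPY it into /opt/flink/usrlib)
docker build -t myrepo/flink-demo:latest .
docker push myrepo/flink-demo:latest

# Submit the job in Application mode to the flink-cluster namespace
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.namespace=flink-cluster \
    -Dkubernetes.cluster-id=flink-demo-app \
    -Dkubernetes.container.image=myrepo/flink-demo:latest \
    local:///opt/flink/usrlib/flink-demo-job.jar

# Inspect the services Flink created for the job, as shown above
kubectl -n flink-cluster get svc
```

Every change to the job then means rebuilding the image and re-running these commands by hand, which is exactly the pain point discussed next.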
If it is to reach the production use level, there are still a series of problems that need to be solved, such as: the method is too primitive to adapt to large batches of tasks, it cannot record task checkpoints or track real-time status, tasks are difficult to operate and monitor, there is no alarm mechanism, and tasks cannot be managed centrally, etc. -## **Deploy Flink on Kubernetes using StreamPark** +## **Deploy Flink on Kubernetes using Apache StreamPark™** There will be higher requirements for using Flink on Kubernetes in enterprise-level production environments. Generally, you will choose to build your own platform or purchase related commercial products. No matter which solution is chosen, product capabilities such as large-scale task development and deployment, status tracking, operation and maintenance monitoring, failure alarms, unified task management, and high availability are common demands. @@ -173,7 +173,7 @@ Next, let’s take a look at how StreamPark supports this capability: From the above, we can see that StreamPark has the capabilities to support the development and deployment process of Flink on Kubernetes, including: **job development capabilities, deployment capabilities, monitoring capabilities, operation and maintenance capabilities, exception handling capabilities, etc. StreamPark provides a relatively complete set of solutions, and it already has some CICD/DevOps capabilities, with the overall completeness continuing to improve. In the entire open-source field, it is a product that supports the full link of one-stop Flink on Kubernetes development, deployment, operation and maintenance work. StreamPark is worthy of praise.** -## **StreamPark’s implementation in Wuxin Technology** +## **Apache StreamPark™’s implementation in Wuxin Technology** StreamPark was adopted relatively late at Wuxin Technology. It is currently mainly used for the development and deployment of real-time data integration jobs and real-time indicator calculation jobs. There are Jar tasks and Flink SQL tasks, all deployed using Native Kubernetes; data sources include CDC, Kafka, etc., and sink ends include MaxCompute, Kafka, Hive, etc. The following is a screenshot of the company's development environment StreamPark platform: diff --git a/blog/1-flink-framework-streampark.md b/blog/1-flink-framework-streampark.md index 25070df41..4dabf5066 100644 --- a/blog/1-flink-framework-streampark.md +++ b/blog/1-flink-framework-streampark.md @@ -1,6 +1,6 @@ --- slug: flink-development-framework-streampark -title: StreamPark - Powerful Flink Development Framework +title: Apache StreamPark™ - Powerful Flink Development Framework tags: [StreamPark, DataStream, FlinkSQL] --- @@ -56,7 +56,7 @@ However, because object storage requires the entire object to be rewritten for r
-## Introducing StreamPark +## Introducing Apache StreamPark™ Previously, when we wrote Flink SQL, we generally used Java to wrap SQL, packed it into a jar package, and submitted it to the S3 platform through the command line. This approach has always been unfriendly; the process is cumbersome, and the costs for development and operations are too high. We hoped to further streamline the process by abstracting the Flink TableEnvironment, letting the platform handle initialization, packaging, and running Flink tasks, and automating the building, testing, and deployment of Flink applications. diff --git a/blog/2-streampark-usercase-chinaunion.md b/blog/2-streampark-usercase-chinaunion.md index 6f4ea6bfd..2472eb47b 100644 --- a/blog/2-streampark-usercase-chinaunion.md +++ b/blog/2-streampark-usercase-chinaunion.md @@ -68,7 +68,7 @@ In terms of job operation and maintenance dilemmas, firstly, the job deployment Due to various factors in the job operation and maintenance difficulties, business support challenges arise, such as a high rate of failures during launch, impact on data quality, lengthy launch times, high data latency, and issues with missed alarm handling, leading to complaints. In addition, the impact on our business is unclear, and once a problem arises, addressing the issue becomes the top priority. -## **基于 StreamPark 一体化管理** +## **Integrated Management Based on Apache StreamPark™** ![](/blog/chinaunion/job_management.png) diff --git a/blog/3-streampark-usercase-bondex-paimon.md b/blog/3-streampark-usercase-bondex-paimon.md index d3b645350..f3be5ec9a 100644 --- a/blog/3-streampark-usercase-bondex-paimon.md +++ b/blog/3-streampark-usercase-bondex-paimon.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-bondex-with-paimon -title: Based on Apache Paimon + StreamPark's Streaming Data Warehouse Practice by Bondex +title: Bondex's Streaming Data Warehouse Practice Based on Apache Paimon + Apache StreamPark™ tags: [StreamPark, Production Practice, paimon, streaming-warehouse] --- @@ -236,7 +236,7 @@ docker push registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1. Next, prepare the Paimon jar package. You can download the corresponding version from the Apache [Repository](https://repository.apache.org/content/groups/snapshots/org/apache/paimon). It's important to note that it should be consistent with the major version of Flink. -### **Managing Jobs with StreamPark** +### **Managing Jobs with Apache StreamPark™** **Prerequisites:** diff --git a/blog/4-streampark-usercase-shunwang.md b/blog/4-streampark-usercase-shunwang.md index 33d621618..170806c4c 100644 --- a/blog/4-streampark-usercase-shunwang.md +++ b/blog/4-streampark-usercase-shunwang.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-shunwang -title: StreamPark in the Large-Scale Production Practice at Shunwang Technology +title: Apache StreamPark™ in the Large-Scale Production Practice at Shunwang Technology tags: [StreamPark, Production Practice, FlinkSQL] --- @@ -71,13 +71,13 @@ To view logs for a job, developers must go through multiple steps, which to some ![Image](/blog/shunwang/step.png) -## **Why Use StreamPark** +## **Why Use Apache StreamPark™** Faced with the defects of our self-developed platform Streaming-Launcher, we have been considering how to further lower the barriers to using Flink and improve work efficiency. Considering the cost of human resources and time, we decided to seek help from the open-source community and look for an appropriate open-source project to manage and maintain our Flink tasks.
-### 01 **StreamPark: A Powerful Tool for Solving Flink Issues** +### 01 **Apache StreamPark™: A Powerful Tool for Solving Flink Issues** Fortunately, in early June 2022, we stumbled upon StreamPark on GitHub and embarked on a preliminary exploration full of hope. We found that StreamPark's capabilities can be broadly divided into three areas: user permission management, job operation and maintenance management, and development scaffolding. @@ -109,7 +109,7 @@ Further research revealed that StreamPark is not just a platform but also includ -### 02 **How StreamPark Addresses Issues of the Self-Developed Platform** +### 02 **How Apache StreamPark™ Addresses Issues of the Self-Developed Platform** We briefly introduced the core capabilities of StreamPark above. During the technology selection process at Shunwang Technology, we found that StreamPark not only includes the basic functions of our existing Streaming-Launcher but also offers a more complete set of solutions to address its many shortcomings. Here, we focus on the solutions provided by StreamPark for the deficiencies of our self-developed platform, Streaming-Launcher. diff --git a/blog/5-streampark-usercase-dustess.md b/blog/5-streampark-usercase-dustess.md index 3a1910e6e..3f774d2f6 100644 --- a/blog/5-streampark-usercase-dustess.md +++ b/blog/5-streampark-usercase-dustess.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-dustess -title: StreamPark's Best Practices at Dustess, Simplifying Complexity for the Ultimate Experience +title: Apache StreamPark™'s Best Practices at Dustess, Simplifying Complexity for the Ultimate Experience tags: [StreamPark, Production Practice, FlinkSQL] --- diff --git a/blog/6-streampark-usercase-joyme.md b/blog/6-streampark-usercase-joyme.md index 511527813..6d1e2df48 100644 --- a/blog/6-streampark-usercase-joyme.md +++ b/blog/6-streampark-usercase-joyme.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-joyme -title: StreamPark's Production Practice in Joyme +title: Apache StreamPark™'s Production Practice in Joyme tags: [StreamPark, Production Practice, FlinkSQL] --- @@ -16,7 +16,7 @@ tags: [StreamPark, Production Practice, FlinkSQL] -## 1 Encountering StreamPark +## 1 Encountering Apache StreamPark™ Encountering StreamPark was inevitable. Based on our existing real-time job development mode, we had to find an open-source platform to support our company's real-time business. Our current situation was as follows: diff --git a/blog/7-streampark-usercase-haibo.md b/blog/7-streampark-usercase-haibo.md index 5badbe01a..0297ca852 100644 --- a/blog/7-streampark-usercase-haibo.md +++ b/blog/7-streampark-usercase-haibo.md @@ -18,7 +18,7 @@ Haibo Tech is an industry-leading company offering AI IoT products and solutions -## **01. Choosing StreamPark** +## **01. Choosing Apache StreamPark™** Haibo Tech has been using Flink SQL to aggregate and process various real-time IoT data since 2020. With the accelerated pace of smart city construction in various cities, the types and volume of IoT data to be aggregated are also increasing. This has resulted in an increasing number of Flink SQL tasks being maintained online, making a dedicated platform for managing numerous Flink SQL tasks an urgent need.
diff --git a/blog/8-streampark-usercase-ziru.md b/blog/8-streampark-usercase-ziru.md index eda608f9f..323a6febf 100644 --- a/blog/8-streampark-usercase-ziru.md +++ b/blog/8-streampark-usercase-ziru.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-ziru -title: Ziroom's Real-Time Computing Platform Practice Based on Apache StreamPark +title: Ziroom's Real-Time Computing Platform Practice Based on Apache StreamPark™ tags: [StreamPark, Production Practice] --- @@ -75,7 +75,7 @@ After analyzing the pros and cons of many open-source projects, we decided to pa 3. On the basis of StreamPark, we aim to promote integration with the existing ecosystem of the company to better meet our business needs. -## **In-depth Practice Based on StreamPark** +## **In-depth Practice Based on Apache StreamPark™** Based on the above decisions, we initiated the evolution of the real-time computing platform, oriented by "pain point needs," and built a stable, efficient, and easy-to-maintain real-time computing platform based on StreamPark. Since the beginning of 2022, we have participated in the construction of the community while officially scheduling our internal platform construction. @@ -225,7 +225,7 @@ User B's actual execution SQL: SELECT name, Encryption_function(age), price, Sensitive_field_functions(phone) FROM user; ``` -### **06 Data Synchronization Platform Based on StreamPark** +### **06 Data Synchronization Platform Based on Apache StreamPark™** With the successful implementation of StreamPark's technical solutions in the company, we achieved deep support for Flink jobs, bringing a qualitative leap in data processing. This prompted us to completely revamp our past data synchronization logic, aiming to reduce operational costs through technical optimization and integration. Therefore, we gradually replaced historical Sqoop jobs, Canal jobs, and Hive JDBC Handler jobs with Flink CDC jobs and Flink streaming and batch jobs. In this process, we continued to optimize and strengthen StreamPark's interface capabilities, adding a status callback mechanism and achieving perfect integration with the DolphinScheduler [7] scheduling system, further enhancing our data processing capabilities. @@ -363,7 +363,7 @@ Clicking Sync Conf will synchronize the global configuration file, and new jobs ![](/blog/ziru/sync_conf.png) -### **05 StreamPark DNS Resolution Configuration** +### **05 Apache StreamPark™ DNS Resolution Configuration** A correct and reasonable DNS resolution configuration is very important when submitting FlinkSQL on the StreamPark platform. It mainly involves the following points: diff --git a/docs/connector/3-clickhouse.md b/docs/connector/3-clickhouse.md index dfaf1718c..579081f84 100755 --- a/docs/connector/3-clickhouse.md +++ b/docs/connector/3-clickhouse.md @@ -65,7 +65,7 @@ public class ClickHouseUtil { The method of splicing various parameters into the request URL is cumbersome and hard-coded, which is very inflexible. -### Write with StreamPark +### Write with Apache StreamPark™ To access `ClickHouse` data with `StreamPark`, you only need to define the configuration file in the specified format and then write code. The configuration and code are as follows.
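Purely as an illustration of that convention-over-configuration idea, the sketch below shows the general shape such a configuration file takes; every property key here is a hypothetical stand-in, so defer to the configuration list referenced next for StreamPark's actual keys:

```yaml
# Hypothetical sketch only -- the authoritative keys live in the
# StreamPark configuration list referenced in the surrounding text.
clickhouse:
  sink:
    jdbcUrl: jdbc:clickhouse://127.0.0.1:8123
    user: default
    password: ""
    targetTable: test.orders
```

With the connection described declaratively like this, the job code no longer splices parameters into a request URL by hand.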
The configuration of `ClickHouse JDBC` in `StreamPark` is in the configuration list, and the sample program is written in Scala diff --git a/docs/connector/4-doris.md b/docs/connector/4-doris.md index cdedc7fc0..2413f9ae8 100644 --- a/docs/connector/4-doris.md +++ b/docs/connector/4-doris.md @@ -14,7 +14,7 @@ import TabItem from '@theme/TabItem'; which could support high-concurrency point query scenarios. StreamPark encapsulates DorisSink for writing data to Doris in real time, based on [Doris' stream load](https://doris.apache.org/administrator-guide/load-data/stream-load-manual.html) -### Write with StreamPark +### Write with Apache StreamPark™ Use `StreamPark` to write data to `Doris`. Currently, DorisSink only supports writing in JSON format (single-layer), such as: {"id":1,"name":"streampark"}. The example program is written in Java, as follows: diff --git a/docs/connector/5-es.md b/docs/connector/5-es.md index 7f1874f9b..f91c27b7c 100755 --- a/docs/connector/5-es.md +++ b/docs/connector/5-es.md @@ -204,7 +204,7 @@ The ElasticsearchSink created above is very inflexible to add parameters. `Strea Users only need to configure es connection parameters and Flink operating parameters, and StreamPark will automatically assemble source and sink, which greatly simplifies development logic and improves development efficiency and maintainability. -## Using StreamPark writes to Elasticsearch +## Writing to Elasticsearch with Apache StreamPark™ With Flink checkpointing enabled, ESSink ensures that operation requests are sent to the Elasticsearch cluster at least once. @@ -376,7 +376,7 @@ See [Official Documentation](https://nightlies.apache.org/flink/flink-docs-relea The BulkProcessor inside es can further configure its behavior of how to refresh the cache operation request, see the [official documentation](https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/connectors/datastream/elasticsearch/#elasticsearch-sink) for details — **Configuring the Internal Bulk Processor** -### StreamPark configuration +### Apache StreamPark™ configuration All other configurations must comply with the StreamPark configuration. For [specific configurable](/docs/development/conf) items and the role of each parameter, diff --git a/docs/connector/6-hbase.md b/docs/connector/6-hbase.md index eeecb3bf3..35ccf6adf 100755 --- a/docs/connector/6-hbase.md +++ b/docs/connector/6-hbase.md @@ -243,7 +243,7 @@ Reading and writing HBase in this way is cumbersome and inconvenient. `StreamPar Users only need to configure HBase connection parameters and Flink operating parameters. StreamPark will automatically assemble source and sink, which greatly simplifies development logic and improves development efficiency and maintainability. -## write and read Hbase with StreamPark +## Write and read HBase with Apache StreamPark™ ### 1.
Configure policies and connection information diff --git a/docs/connector/7-http.md b/docs/connector/7-http.md index e6fd6ed4f..b6013bd4e 100755 --- a/docs/connector/7-http.md +++ b/docs/connector/7-http.md @@ -29,7 +29,7 @@ Asynchronous writing uses asynchttpclient as the client, you need to import the ``` -## Write with StreamPark +## Write with Apache StreamPark™ ### HTTP asynchronous write supported types diff --git a/docs/connector/8-redis.md b/docs/connector/8-redis.md index 94b2badea..07ac03fcc 100644 --- a/docs/connector/8-redis.md +++ b/docs/connector/8-redis.md @@ -166,7 +166,7 @@ public class FlinkRedisSink { The above creation of FlinkJedisPoolConfig is tedious, and each operation of redis has to build RedisMapper, which is very inflexible. `StreamPark` uses convention over configuration and automatic configuration. Users only need to configure the Redis connection parameters, and StreamPark automatically assembles the source and sink, which greatly simplifies the development logic and improves development efficiency and maintainability. -## StreamPark Writes to Redis +## Apache StreamPark™ Writes to Redis RedisSink defaults to AT_LEAST_ONCE (at least once) processing semantics; with checkpointing enabled, two-phase commit supports EXACTLY_ONCE semantics. Available connection types: single-node mode and sentinel mode. diff --git a/docs/intro.md b/docs/intro.md index 6ed75b7c5..6a42ebcb2 100644 --- a/docs/intro.md +++ b/docs/intro.md @@ -1,6 +1,6 @@ --- id: 'intro' -title: 'What is StreamPark' +title: 'What is Apache StreamPark™' sidebar_position: 1 --- @@ -36,13 +36,13 @@ The overall architecture of Apache StreamPark is shown in the following figure. ![StreamPark Archite](/doc/image_en/streampark_archite.png) -### 1️⃣ StreamPark-core +### 1️⃣ Apache StreamPark™-core The positioning of `StreamPark-core` is a framework used while developing; it focuses on coding development, standardizes configuration files, and follows the convention-over-configuration principle. StreamPark-core provides a development-time runtime context and a series of out-of-the-box connectors. Cumbersome operations are simplified by extending `DataStream`-related methods and integrating the DataStream and `Flink SQL` APIs. Development efficiency and development experience are greatly improved because users can focus on the business. -### 2️⃣ StreamPark-console +### 2️⃣ Apache StreamPark™-console `StreamPark-console` is a comprehensive real-time `low code` data platform that can manage `Flink` tasks more conveniently.
It integrates the experience of many best practices and integrates many functions such as project compilation, release, diff --git a/docs/user-guide/11-platformInstall.md b/docs/user-guide/11-platformInstall.md index 5bc92a70e..d309ae600 100644 --- a/docs/user-guide/11-platformInstall.md +++ b/docs/user-guide/11-platformInstall.md @@ -75,7 +75,7 @@ flink -v cp mysql-connector-java-8.0.28.jar /usr/local/streampark/lib ``` ![4_mysql_dep](/doc/image/install/4_mysql_dep.png) -## Download StreamPark +## Download Apache StreamPark™ > Download URL: [https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz) > Upload [apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz) to the server /usr/local path @@ -131,7 +131,7 @@ show tables; ``` ![13_show_streampark_db_tables](/doc/image/install/13_show_streampark_db_tables.png) -## StreamPark Configuration +## Apache StreamPark™ Configuration > Purpose: Configure the data sources needed for startup. > Configuration file location: /usr/local/streampark/conf @@ -188,13 +188,13 @@ vim application.yml > 5. **java.security.krb5.conf=/etc/krb5.conf** ![19_kerberos_yml_config](/doc/image/install/19_kerberos_yml_config.png) -## Starting StreamPark -## Enter the StreamPark Installation Path on the Server +## Starting Apache StreamPark™ +## Enter the Apache StreamPark™ Installation Path on the Server ```bash cd /usr/local/streampark/ ``` ![20_enter_streampark_dir](/doc/image/install/20_enter_streampark_dir.png) -## Start the StreamPark Service +## Start the Apache StreamPark™ Service ```bash ./bin/startup.sh ``` diff --git a/docs/user-guide/12-platformBasicUsage.md b/docs/user-guide/12-platformBasicUsage.md index 9c140fe52..4ad160a17 100644 --- a/docs/user-guide/12-platformBasicUsage.md +++ b/docs/user-guide/12-platformBasicUsage.md @@ -79,7 +79,7 @@ start-cluster.sh ![22_submit_flink_job_2](/doc/image_en/platform-usage/22_submit_flink_job_2.png) ## Check Job Status -### View via StreamPark Dashboard +### View via Apache StreamPark™ Dashboard > StreamPark dashboard ![23_flink_job_dashboard](/doc/image_en/platform-usage/23_flink_job_dashboard.png) @@ -99,12 +99,12 @@ _web_ui_2.png) > With this, the process of submitting a Flink job using the StreamPark platform is essentially complete. Below is a brief summary of the general process for managing Flink jobs on the StreamPark platform. -## StreamPark Platform's Process for Managing Flink Jobs +## Apache StreamPark™ Platform's Process for Managing Flink Jobs ![28_streampark_process_workflow](/doc/image_en/platform-usage/28_streampark_process_workflow.png) > Stopping, modifying, and deleting Flink jobs through the StreamPark platform is relatively simple and can be experienced by users themselves. It is worth noting that: **If a job is in a running state, it cannot be deleted and must be stopped first**. 
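Because a running job must be stopped before it can be deleted, teams often script the stop step against the platform's REST interface; the System Settings section below shows the cancel endpoint (`/flink/app/cancel`). A hedged sketch follows — only that path comes from the docs, while the host, port, auth header, and body fields are assumptions for illustration:

```bash
# Hypothetical host/port, token, and field names; only the
# /flink/app/cancel path is taken from the documentation below.
curl -X POST 'http://streampark-host:10000/flink/app/cancel' \
  -H 'Authorization: <api-token>' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'id=100001' \
  -d 'savePointed=true'
```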
-# StreamPark System Module Introduction +# Apache StreamPark™ System Module Introduction ## System Settings > Menu location @@ -151,7 +151,7 @@ curl -X POST '/flink/app/cancel' \ ![36_streampark_menu_management](/doc/image_en/platform-usage/36_streampark_menu_management.png) -## StreamPark Menu Modules +## Apache StreamPark™ Menu Modules ### Project > StreamPark integrates with code repositories to achieve CICD @@ -213,11 +213,11 @@ curl -X POST '/flink/app/cancel' \ ![54_visit_flink_cluster_web_ui](/doc/image_en/platform-usage/54_visit_flink_cluster_web_ui.png) -# Using Native Flink with StreamPark +# Using Native Flink with Apache StreamPark™ > 【**To be improved**】In fact, a key feature of StreamPark is the optimization of the management mode for native Flink jobs at the user level, enabling users to rapidly develop, deploy, run, and monitor Flink jobs using the platform. Meaning, if users are familiar with native Flink, they will find StreamPark even more intuitive to use. ## Flink Deployment Modes -### How to Use in StreamPark +### How to Use in Apache StreamPark™ > **Session Mode** 1. Configure Flink Cluster @@ -248,7 +248,7 @@ flink run-application -t yarn-application \ -Dyarn.provided.lib.dirs="hdfs://myhdfs/my-remote-flink-dist-dir" \ hdfs://myhdfs/jars/my-application.jar ``` -### How to Use in StreamPark +### How to Use in Apache StreamPark™ > When creating or modifying a job, add in “Dynamic Properties” as per the specified format ![67_dynamic_params_usage](/doc/image_en/platform-usage/67_dynamic_params_usage.png) @@ -261,7 +261,7 @@ flink run-application -t yarn-application \ ![68_native_flink_restart_strategy](/doc/image_en/platform-usage/68_native_flink_restart_strategy.png) -### How to Use in StreamPark +### How to Use in Apache StreamPark™ > 【**To be improved**】Generally, alerts are triggered when a job fails or an anomaly occurs 1. 
Configure alert notifications @@ -283,7 +283,7 @@ flink run-application -t yarn-application \ ![72_native_flink_save_checkpoint_gramma](/doc/image_en/platform-usage/72_native_flink_save_checkpoint_gramma.png) -### How to Configure Savepoint in StreamPark +### How to Configure Savepoint in Apache StreamPark™ > Users can set a savepoint when stopping a job ![73_streampark_save_checkpoint](/doc/image_en/platform-usage/73_streampark_save_checkpoint.png) @@ -298,7 +298,7 @@ flink run-application -t yarn-application \ ![77_show_checkpoint_file_name_2](/doc/image_en/platform-usage/77_show_checkpoint_file_name_2.png) -### How to Restore a Job from a Specified Savepoint in StreamPark +### How to Restore a Job from a Specified Savepoint in Apache StreamPark™ > Users have the option to choose during job startup ![78_usage_checkpoint_in_streampark](/doc/image_en/platform-usage/78_usage_checkpoint_in_streampark.png) @@ -311,7 +311,7 @@ flink run-application -t yarn-application \ ![79_native_flink_job_status](/doc/image_en/platform-usage/79_native_flink_job_status.svg) -### Job Status in StreamPark +### Job Status in Apache StreamPark™ > 【**To be improved**】 @@ -321,7 +321,7 @@ flink run-application -t yarn-application \ ![80_native_flink_job_details_page](/doc/image_en/platform-usage/80_native_flink_job_details_page.png) -### Job Details in StreamPark +### Job Details in Apache StreamPark™ ![81_streampark_flink_job_details_page](/doc/image_en/platform-usage/81_streampark_flink_job_details_page.png) > In addition, for jobs in k8s mode, StreamPark also supports real-time display of startup logs, as shown below @@ -333,7 +333,7 @@ flink run-application -t yarn-application \ > Native Flink provides a REST API > Reference: [https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/) -### How StreamPark Integrates with Third-Party Systems +### How Apache StreamPark™ Integrates with Third-Party Systems > StreamPark also provides Restful APIs, supporting integration with other systems. > For example, it offers REST API interfaces for starting and stopping jobs. diff --git a/docs/user-guide/4-dockerDeployment.md b/docs/user-guide/4-dockerDeployment.md index b8d38c391..a53edc14f 100644 --- a/docs/user-guide/4-dockerDeployment.md +++ b/docs/user-guide/4-dockerDeployment.md @@ -18,9 +18,9 @@ To start the service with docker, you need to install [docker](https://www.docke To start the service with docker-compose, you need to install [docker-compose](https://docs.docker.com/compose/install/) first -## StreamPark Deployment +## Apache StreamPark™ Deployment -### 1. StreamPark deployment based on h2 and docker-compose +### 1. Apache StreamPark™ deployment based on h2 and docker-compose This method is suitable for beginners to learn and become familiar with the features. The configuration will reset after the container is restarted. Below, you can configure Mysql or Pgsql for persistence. 
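For readers who want to try that H2 quick-start immediately, a minimal docker-compose sketch might look like the following; the image tag and the 10000 web port are assumptions, so check the release you actually deploy:

```yaml
# Minimal sketch, assuming the apache/streampark image and its default web port.
# H2 is embedded, so no external database service is needed; data is not persisted.
version: '3.8'
services:
  streampark:
    image: apache/streampark:2.1.2   # hypothetical tag -- pick a released version
    ports:
      - "10000:10000"
    restart: unless-stopped
```

Switching to the MySQL or PostgreSQL setup below only adds a database service and the SPRING_DATASOURCE_* environment variables shown next.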
@@ -92,7 +92,7 @@ SPRING_DATASOURCE_PASSWORD=streampark docker-compose up -d ``` -## Build images based on source code for StreamPark deployment +## Build images based on source code for Apache StreamPark™ deployment ``` git clone https://github.com/apache/incubator-streampark.git cd incubator-streampark/deploy/docker diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md b/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md index 39e96328f..f2dc7fca4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md @@ -1,6 +1,6 @@ --- slug: streampark-flink-on-k8s -title: StreamPark Flink on Kubernetes 实践 +title: Apache StreamPark™ Flink on Kubernetes 实践 tags: [StreamPark, 生产实践, FlinkSQL, Kubernetes] --- @@ -20,7 +20,7 @@ Native Kubernetes 和 Standalone Kubernetes 主要区别在于 Flink 与 Kuberne ![](/blog/relx/nativekubernetes_architecture.png) -## **当 Flink On Kubernetes 遇见 StreamPark** +## **当 Flink On Kubernetes 遇见 Apache StreamPark™** Flink on Native Kubernetes 目前支持 Application 模式和 Session 模式,两者对比 Application 模式部署规避了 Session 模式的资源隔离问题、以及客户端资源消耗问题,因此**生产环境更推荐采用 Application Mode 部署 Flink 任务。**下面我们分别看看使用原始脚本的方式和使用 StreamPark 开发部署一个 Flink on Native Kubernetes 作业的流程。 @@ -69,7 +69,7 @@ kubectl -n flink-cluster get svc 以上就是使用 Flink 提供的最原始的脚本方式把一个 Flink 任务部署到 Kubernetes 上的过程,只做到了最基本的任务提交,如果要达到生产使用级别,还有一系列的问题需要解决,如:方式过于原始无法适配大批量任务、无法记录任务checkpoint 和实时状态跟踪、任务运维和监控困难、无告警机制、 无法集中化管理等等。 -## **使用 StreamPark 部署 Flink on Kubernetes** +## **使用 Apache StreamPark™ 部署 Flink on Kubernetes** ------ @@ -178,7 +178,7 @@ StreamPark 既支持 Upload Jar 也支持直接编写 Flink SQL 作业, **Flink 通过以上我们看到 StreamPark 在支持 Flink on Kubernetes 开发部署过程中具备的能力, 包括:**作业的开发能力、部署能力、监控能力、运维能力、异常处理能力等,StreamPark 提供的是一套相对完整的解决方案。 且已经具备了一些 CICD/DevOps 的能力,整体的完成度还在持续提升。是在整个开源领域中对于 Flink on Kubernetes 一站式开发部署运维工作全链路都支持的产品,StreamPark 是值得被称赞的。** -## **StreamPark 在雾芯科技的落地实践** +## **Apache StreamPark™ 在雾芯科技的落地实践** StreamPark 在雾芯科技落地较晚,目前主要用于实时数据集成作业和实时指标计算作业的开发部署,有 Jar 任务也有 Flink SQL 任务,全部使用 Native Kubernetes 部署;数据源有CDC、Kafka 等,Sink 端有 Maxcompute、kafka、Hive 等,以下是公司开发环境StreamPark 平台截图: diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/1-flink-framework-streampark.md b/i18n/zh-CN/docusaurus-plugin-content-blog/1-flink-framework-streampark.md index 3604cdf9f..756d7ffac 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/1-flink-framework-streampark.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/1-flink-framework-streampark.md @@ -1,6 +1,6 @@ --- slug: flink-development-framework-streampark -title: Flink 开发利器 StreamPark +title: Flink 开发利器 Apache StreamPark™ tags: [StreamPark, DataStream, FlinkSQL] --- @@ -58,7 +58,7 @@ Flink 从 1.13 版本开始,就支持 Pod Template,我们可以在 Pod Templ
-## 引入 StreamPark +## 引入 Apache StreamPark™ 之前我们写 Flink SQL 基本上都是使用 Java 包装 SQL,打 jar 包,提交到 S3 平台上。通过命令行方式提交代码,但这种方式始终不友好,流程繁琐,开发和运维成本太大。我们希望能够进一步简化流程,将 Flink TableEnvironment 抽象出来,有平台负责初始化、打包运行 Flink 任务,实现 Flink 应用程序的构建、测试和部署自动化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/2-streampark-usercase-chinaunion.md b/i18n/zh-CN/docusaurus-plugin-content-blog/2-streampark-usercase-chinaunion.md index ca77f9980..65bba59f0 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/2-streampark-usercase-chinaunion.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/2-streampark-usercase-chinaunion.md @@ -71,7 +71,7 @@ tags: [StreamPark, 生产实践, FlinkSQL] 由于作业运维困境上的种种因素,会产生业务支撑困境,如导致上线故障率高、影响数据质量、上线时间长、数据延迟高、告警漏发处理等,引起的投诉,此外,我们的业务影响不明确,一旦出现问题,处理问题会成为第一优先级。 -## **基于 StreamPark 一体化管理** +## **基于 Apache StreamPark™ 一体化管理** ![](/blog/chinaunion/job_management.png) diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/3-streampark-usercase-bondex-paimon.md b/i18n/zh-CN/docusaurus-plugin-content-blog/3-streampark-usercase-bondex-paimon.md index d66e83b72..da6020102 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/3-streampark-usercase-bondex-paimon.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/3-streampark-usercase-bondex-paimon.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-bondex-with-paimon -title: 海程邦达基于 Apache Paimon + StreamPark 的流式数仓实践 +title: 海程邦达基于 Apache Paimon + Apache StreamPark™ 的流式数仓实践 tags: [StreamPark, 生产实践, paimon, streaming-warehouse] --- @@ -236,7 +236,7 @@ docker push registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1. 接下来准备 Paimon jar 包,可以在 Apache [Repository](https://repository.apache.org/content/groups/snapshots/org/apache/paimon) 下载对应版本,需要注意的是要和 flink 大版本保持一致 -### **使用 StreamPark 管理作业** +### **使用 Apache StreamPark™ 管理作业** **前提条件:** diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/4-streampark-usercase-shunwang.md b/i18n/zh-CN/docusaurus-plugin-content-blog/4-streampark-usercase-shunwang.md index 776036fb5..7d86cddd3 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/4-streampark-usercase-shunwang.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/4-streampark-usercase-shunwang.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-shunwang -title: StreamPark 在顺网科技的大规模生产实践 +title: Apache StreamPark™ 在顺网科技的大规模生产实践 tags: [StreamPark, 生产实践, FlinkSQL] --- @@ -72,13 +72,13 @@ Streaming-Launcher 中,没有提供统一的作业管理界面。开发同学 ![图片](/blog/shunwang/step.png) -## **为什么用** **StreamPark** +## **为什么用** **Apache StreamPark™** 面对自研平台 Streaming-Launcher 存在的缺陷,我们一直在思考如何将 Flink 的使用门槛降到更低,进一步提高工作效率。考虑到人员投入成本和时间成本,我们决定向开源社区求助寻找合适的开源项目来对我们的 Flink 任务进行管理和运维。 -### 01 **StreamPark 解决 Flink 问题的利器** +### 01 **Apache StreamPark™ 解决 Flink 问题的利器** 很幸运在 2022 年 6 月初,我们在 GitHub 机缘巧合之间认识到了 StreamPark,我们满怀希望地对 StreamPark 进行了初步的探索。发现 StreamPark 具备的能力大概分为三大块:用户权限管理、作业运维管理和开发脚手架。 @@ -110,7 +110,7 @@ Streaming-Launcher 中,没有提供统一的作业管理界面。开发同学 -### 02 **StreamPark 解决自研平台的问题** +### 02 **Apache StreamPark™ 解决自研平台的问题** 上面我们简单介绍了 StreamPark 的核心能力。在顺网科技的技术选型过程中,我们发现 StreamPark 所具备强大的功能不仅包含了现有 Streaming-Launcher 的基础功能,还提供了更完整的对应方案解决了 Streaming-Launcher 的诸多不足。在这部分,着重介绍下 StreamPark 针对我们自研平台 Streaming-Launcher 的不足所提供的解决方案。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/5-streampark-usercase-dustess.md b/i18n/zh-CN/docusaurus-plugin-content-blog/5-streampark-usercase-dustess.md index bec4373ac..391a589e8 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/5-streampark-usercase-dustess.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/5-streampark-usercase-dustess.md @@ -1,6 +1,6 @@ --- slug: 
streampark-usercase-dustess -title: StreamPark 在尘锋信息的最佳实践,化繁为简极致体验 +title: Apache StreamPark™ 在尘锋信息的最佳实践,化繁为简极致体验 tags: [StreamPark, 生产实践, FlinkSQL] --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/6-streampark-usercase-joyme.md b/i18n/zh-CN/docusaurus-plugin-content-blog/6-streampark-usercase-joyme.md index f983b3a16..8230d9eec 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/6-streampark-usercase-joyme.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/6-streampark-usercase-joyme.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-joyme -title: StreamPark 在 Joyme 的生产实践 +title: Apache StreamPark™ 在 Joyme 的生产实践 tags: [StreamPark, 生产实践, FlinkSQL] --- @@ -16,7 +16,7 @@ tags: [StreamPark, 生产实践, FlinkSQL] -## 1 遇见 StreamPark +## 1 遇见 Apache StreamPark™ 遇见 StreamPark 是必然的,基于我们现有的实时作业开发模式,不得不寻找一个开源的平台来支撑我司的实时业务。我们的现状如下: diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/7-streampark-usercase-haibo.md b/i18n/zh-CN/docusaurus-plugin-content-blog/7-streampark-usercase-haibo.md index d8713a1f6..faebb98d4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/7-streampark-usercase-haibo.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/7-streampark-usercase-haibo.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-haibo -title: StreamPark 一站式计算利器在海博科技的生产实践,助力智慧城市建设 +title: Apache StreamPark™ 一站式计算利器在海博科技的生产实践,助力智慧城市建设 tags: [StreamPark, 生产实践, FlinkSQL] --- @@ -16,7 +16,7 @@ tags: [StreamPark, 生产实践, FlinkSQL] -## **01. 选择 StreamPark** +## **01. 选择 Apache StreamPark™** 海博科技自 2020 年开始使用 Flink SQL 汇聚、处理各类实时物联数据。随着各地市智慧城市建设步伐的加快,需要汇聚的各类物联数据的数据种类、数据量也不断增加,导致线上维护的 Flink SQL 任务越来越多,一个专门的能够管理众多 Flink SQL 任务的计算平台成为了迫切的需求。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/8-streampark-usercase-ziru.md b/i18n/zh-CN/docusaurus-plugin-content-blog/8-streampark-usercase-ziru.md index 8532e5885..c1ccf9df1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-blog/8-streampark-usercase-ziru.md +++ b/i18n/zh-CN/docusaurus-plugin-content-blog/8-streampark-usercase-ziru.md @@ -1,6 +1,6 @@ --- slug: streampark-usercase-ziru -title: 自如基于Apache StreamPark 的实时计算平台实践 +title: 自如基于 Apache StreamPark™ 的实时计算平台实践 tags: [StreamPark, 生产实践] --- @@ -74,7 +74,7 @@ tags: [StreamPark, 生产实践] 3.在 StreamPark 基础上,我们要推动与公司已有生态的整合,以便更好地满足我们的业务需求。 -## **基于 StreamPark 的深度实践** +## **基于 Apache StreamPark™ 的深度实践** 基于上述决策,我们启动了以 “痛点需求” 为导向的实时计算平台演进工作,基于 StreamPark 打造一个稳定、高效、易维护的实时计算平台。从 2022 年初开始我们便参与社区的建设,同时我们内部平台建设也正式提上日程。 @@ -222,7 +222,7 @@ SELECT Encryption_function(name), age, price, Sensitive_field_functions(phone) F SELECT name, Encryption_function(age), price, Sensitive_field_functions(phone) FROM user; ``` -### **06 基于 StreamPark 的数据同步平台** +### **06 基于 Apache StreamPark™ 的数据同步平台** 随着 StreamPark 的技术解决方案在公司的成功落地,我们实现了对 Flink 作业的深度支持,从而为数据处理带来质的飞跃。这促使我们对过往的数据同步逻辑进行彻底的革新,目标是通过技术的优化和整合,最大限度地降低运维成本。因此,我们逐步替换了历史上的 Sqoop 作业、Canal 作业和 Hive JDBC Handler 作业,转而采用 Flink CDC 作业、Flink 流和批作业。在这个过程中,我们也不断优化和强化 StreamPark 的接口能力,新增了状态回调机制,同时实现了与 DolphinScheduler[7] 调度系统的完美集成,进一步提升了我们的数据处理能力。 @@ -360,7 +360,7 @@ vim flink-conf.yaml ![](/blog/ziru/sync_conf.png) -### **05 StreamPark 配置 DNS 解析** +### **05 Apache StreamPark™ 配置 DNS 解析** 在使用 StreamPark 平台提交 FlinkSQL 的过程中,一个正确合理的 DNS 解析配置非常重要。主要涉及到以下几点: diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_committer.md b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_committer.md index e2e6ca5b2..7f8db3d5a 100644 ---
a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_committer.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_committer.md @@ -22,7 +22,7 @@ sidebar_position: 2 --> -## 成为 Apache StreamPark 的 Committer +## 成为 Apache StreamPark™ 的 Committer 任何支持社区并在 CoPDoC 领域中工作的人都可以成为 Apache StreamPark 的Committer。CoPDoC 是 ASF 的缩写,用来描述我们如何不仅仅通过代码来认识到您的贡献。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_pmc_member.md b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_pmc_member.md index 449a03ef0..4ef3195b1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_pmc_member.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/contribution_guide/become_pmc_member.md @@ -21,7 +21,7 @@ sidebar_position: 3 limitations under the License. --> -## 成为 Apache StreamPark 的 PMC 成员 +## 成为 Apache StreamPark™ 的 PMC 成员 任何支持社区并在 CoPDoC 领域中工作的人都可以成为 Apache StreamPark 的PMC 成员。CoPDoC 是 ASF 的缩写,用来描述我们如何不仅仅通过代码来认识到您的贡献。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/release/How-to-release.md b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/release/How-to-release.md index a3f1d28e4..5944ef785 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/release/How-to-release.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/release/How-to-release.md @@ -404,7 +404,7 @@ apache-streampark_2.12-2.1.0-incubating-bin.tar.gz: OK #### 3.6 发布Apache SVN仓库中dev目录的物料包 ```shell -# 检出Apache SVN仓库中的dev目录到Apache StreamPark项目根目录下的dist/streampark_svn_dev目录下 +# 检出Apache SVN仓库中的dev目录到 Apache StreamPark™ 项目根目录下的dist/streampark_svn_dev目录下 svn co https://dist.apache.org/repos/dist/dev/incubator/streampark dist/streampark_svn_dev svn co --depth empty https://dist.apache.org/repos/dist/dev/incubator/streampark diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/3-clickhouse.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/3-clickhouse.md index 3f4e47b0f..2d590219a 100755 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/3-clickhouse.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/3-clickhouse.md @@ -63,7 +63,7 @@ public class ClickHouseUtil { 以上将各项参数拼接为请求 url 的方式较繁琐,并且是硬编码的方式写死的,非常不灵活。
-### StreamPark 方式写入 +### Apache StreamPark™ 方式写入 用`StreamPark`接入 `clickhouse`的数据, 只需要按照规定的格式定义好配置文件然后编写代码即可,配置和代码如下。在`StreamPark`中`clickhouse jdbc` 约定的配置见配置列表,运行程序样例为 Scala,如下: @@ -152,7 +152,7 @@ $ echo 'INSERT INTO t VALUES (1),(2),(3)' | curl 'http://localhost:8123/' --data 上述方式操作较简陋,当然也可以使用java 代码来进行写入, StreamPark 对 http post 写入方式进行封装增强,增加缓存、异步写入、失败重试、达到重试阈值后数据备份至外部组件(kafka,mysql,hdfs,hbase) 等功能,以上功能只需要按照规定的格式定义好配置文件然后编写代码即可,配置和代码如下 -### StreamPark 方式写入 +### Apache StreamPark™ 方式写入 在`StreamPark`中`clickhouse jdbc` 约定的配置见配置列表,运行程序样例为 Scala,如下: diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/4-doris.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/4-doris.md index ee968a7c0..ae2f161ac 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/4-doris.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/4-doris.md @@ -11,7 +11,7 @@ import TabItem from '@theme/TabItem'; [Apache Doris](https://doris.apache.org/)是一款基于大规模并行处理技术的分布式 SQL 数据库,主要面向 OLAP 场景。 StreamPark 基于Doris的[stream load](https://doris.apache.org/administrator-guide/load-data/stream-load-manual.html)封装了DorisSink用于向Doris实时写入数据。 -### StreamPark 方式写入 +### Apache StreamPark™ 方式写入 用`StreamPark`写入 `doris`的数据, 目前 DorisSink 只支持 JSON 格式(单层)写入,如:{"id":1,"name":"streampark"} 运行程序样例为 Java,如下: diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/5-es.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/5-es.md index 358f12c81..8f0fa4952 100755 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/5-es.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/5-es.md @@ -180,7 +180,7 @@ input.addSink(esSinkBuilder.build) 以上创建ElasticsearchSink添加参数非常不灵活。`StreamPark`使用约定大于配置、自动配置的方式只需要配置es 连接参数、flink运行参数,StreamPark 会自动组装source和sink,极大的简化开发逻辑,提升开发效率和维护性。 -## StreamPark 写入 Elasticsearch +## Apache StreamPark™ 写入 Elasticsearch ESSink 在启用 Flink checkpoint 后,保证至少一次将操作请求发送到 Elasticsearch 集群。 @@ -344,5 +344,5 @@ Elasticsearch 操作请求可能由于多种原因而失败,可以通过实现 [官方文档](https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/connectors/datastream/elasticsearch/#elasticsearch-sink)**处理失败的 Elasticsearch 请求** 单元 ### 配置内部批量处理器 es内部`BulkProcessor`可以进一步配置其如何刷新缓存操作请求的行为,详细查看[官方文档](https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/connectors/datastream/elasticsearch/#elasticsearch-sink)**配置内部批量处理器** 单元 -### StreamPark配置 +### Apache StreamPark™配置 其他所有的配置都必须遵守 **StreamPark** 配置,具体可配置项和各个参数的作用请参考[项目配置](/docs/development/conf) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/6-hbase.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/6-hbase.md index cd4c98262..32de22bae 100755 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/6-hbase.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/6-hbase.md @@ -236,7 +236,7 @@ class HBaseWriter extends RichSinkFunction { 以上方式读写 Hbase 较繁琐,非常不灵活。`StreamPark`使用约定大于配置、自动配置的方式只需要配置Hbase连接参数、flink运行参数,StreamPark 会自动组装source和sink,极大的简化开发逻辑,提升开发效率和维护性。 -## StreamPark 读写 Hbase +## Apache StreamPark™ 读写 Hbase ### 1.
配置策略和连接信息 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/7-http.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/7-http.md index 06973a406..1249879f9 100755 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/7-http.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/7-http.md @@ -26,7 +26,7 @@ import TabItem from '@theme/TabItem'; ``` -## StreamPark 方式写入 +## Apache StreamPark™ 方式写入 ### HTTP 异步写入支持类型 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/8-redis.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/8-redis.md index 3c2f8c2b8..4d4febf0e 100755 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/8-redis.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/connector/8-redis.md @@ -169,7 +169,7 @@ public class FlinkRedisSink { 以上创建FlinkJedisPoolConfig较繁琐,redis的每种操作都要构建RedisMapper,非常不灵活。`StreamPark`使用约定大于配置、自动配置的方式只需要配置redis 连接参数、flink运行参数,StreamPark 会自动组装source和sink,极大的简化开发逻辑,提升开发效率和维护性。 -## StreamPark 写入 Redis +## Apache StreamPark™ 写入 Redis RedisSink 默认为AT_LEAST_ONCE (至少一次)的处理语义,在开启checkpoint情况下两阶段提交支持EXACTLY_ONCE语义,可使用的连接类型: 单节点模式、哨兵模式。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/11-platformInstall.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/11-platformInstall.md index 6579f95c7..977ef0036 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/11-platformInstall.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/11-platformInstall.md @@ -75,7 +75,7 @@ flink -v cp mysql-connector-java-8.0.28.jar /usr/local/streampark/lib ``` ![4_mysql_dep](/doc/image/install/4_mysql_dep.png) -## 下载StreamPark +## 下载Apache StreamPark™ > 下载URL:[https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz) > 上传 [apache-streampark_2.12-2.0.0-incubating-bin.tar.gz](https://dlcdn.apache.org/incubator/streampark/2.0.0/apache-streampark_2.12-2.0.0-incubating-bin.tar.gz) 至 服务器 /usr/local 路径 @@ -133,7 +133,7 @@ show tables; ``` ![13_show_streampark_db_tables](/doc/image/install/13_show_streampark_db_tables.png) -## StreamPark配置 +## Apache StreamPark™配置 > 目的:配置启动需要的数据源。 > 配置文件所在路径:/usr/local/streampark/conf @@ -190,13 +190,13 @@ vim application.yml > 5.
**java.security.krb5.conf=/etc/krb5.conf** ![19_kerberos_yml_config](/doc/image/install/19_kerberos_yml_config.png) -## 启动StreamPark -## 进入服务器StreamPark安装路径 +## 启动Apache StreamPark™ +## 进入服务器Apache StreamPark™安装路径 ```bash cd /usr/local/streampark/ ``` ![20_enter_streampark_dir](/doc/image/install/20_enter_streampark_dir.png) -## 启动StreamPark服务 +## 启动Apache StreamPark™服务 ```bash ./bin/startup.sh ``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/12-platformBasicUsage.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/12-platformBasicUsage.md index 40afc17f4..b3e8aec2c 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/12-platformBasicUsage.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/12-platformBasicUsage.md @@ -79,7 +79,7 @@ start-cluster.sh ![22_submit_flink_job_2](/doc/image/platform-usage/22_submit_flink_job_2.png) ## 查看作业状态 -### 通过StreamPark看板查看 +### 通过Apache StreamPark™看板查看 > StreamPark dashboard ![23_flink_job_dashboard](/doc/image/platform-usage/23_flink_job_dashboard.png) @@ -97,13 +97,13 @@ start-cluster.sh > 至此,一个使用StreamPark平台提交flink job的流程基本完成。下面简单总结下StreamPark平台管理flink作业的大致流程。 -## StreamPark平台管理flink job的流程 +## Apache StreamPark™平台管理flink job的流程 ![28_streampark_process_workflow](/doc/image/platform-usage/28_streampark_process_workflow.png) > 通过 StreamPark 平台 停止、修改、删除 flink job 相对简单,大家可自行体验,需要说明的一点是:**若作业为running状态,则不可删除,需先停止**。 -# StreamPark系统模块简介 +# Apache StreamPark™系统模块简介 ## 系统设置 > 菜单位置 @@ -150,7 +150,7 @@ curl -X POST '/flink/app/cancel' \ ![36_streampark_menu_management](/doc/image/platform-usage/36_streampark_menu_management.png) -## StreamPark菜单模块 +## Apache StreamPark™菜单模块 ### Project > StreamPark结合代码仓库实现CICD @@ -212,7 +212,7 @@ curl -X POST '/flink/app/cancel' \ ![54_visit_flink_cluster_web_ui](/doc/image/platform-usage/54_visit_flink_cluster_web_ui.png) -# 原生flink 与 StreamPark关联使用 +# 原生flink 与 Apache StreamPark™关联使用 > 【**待完善**】其实,个人理解,StreamPark一大特点是对flink原生作业的管理模式在用户使用层面进行了优化,使得用户能利用该平台快速开发、部署、运行、监控flink作业。所以,想表达的意思是:如果用户对原生flink比较熟悉的话,那StreamPark使用起来就会更加得心应手。 ## flink部署模式 @@ -231,7 +231,7 @@ curl -X POST '/flink/app/cancel' \ ![60_flink_deployment_difference_6](/doc/image/platform-usage/60_flink_deployment_difference_6.png) -### 如何在StreamPark中使用 +### 如何在Apache StreamPark™中使用 > **Session 模式** 1. 配置 Flink Cluster @@ -262,7 +262,7 @@ flink run-application -t yarn-application \ -Dyarn.provided.lib.dirs="hdfs://myhdfs/my-remote-flink-dist-dir" \ hdfs://myhdfs/jars/my-application.jar ``` -### 如何在StreamPark中使用 +### 如何在Apache StreamPark™中使用 > 创建 或 修改 作业时,在“Dynamic Properties”里面按指定格式添加即可 ![67_dynamic_params_usage](/doc/image/platform-usage/67_dynamic_params_usage.png) @@ -275,7 +275,7 @@ flink run-application -t yarn-application \ ![68_native_flink_restart_strategy](/doc/image/platform-usage/68_native_flink_restart_strategy.png) -### 如何在StreamPark中使用 +### 如何在Apache StreamPark™中使用 > 【**待完善**】一般在作业失败或出现异常时,会触发告警 1. 
配置告警通知 @@ -297,7 +297,7 @@ flink run-application -t yarn-application \ ![72_native_flink_save_checkpoint_gramma](/doc/image/platform-usage/72_native_flink_save_checkpoint_gramma.png) -### 如何在StreamPark中配置savepoint +### 如何在Apache StreamPark™中配置savepoint > 当停止作业时,可以让用户设置savepoint ![73_streampark_save_checkpoint](/doc/image/platform-usage/73_streampark_save_checkpoint.png) @@ -312,7 +312,7 @@ flink run-application -t yarn-application \ ![77_show_checkpoint_file_name_2](/doc/image/platform-usage/77_show_checkpoint_file_name_2.png) -### 如何在StreamPark中由指定savepoint恢复作业 +### 如何在Apache StreamPark™中由指定savepoint恢复作业 > 启动作业时,会让选择 ![78_usage_checkpoint_in_streampark](/doc/image/platform-usage/78_usage_checkpoint_in_streampark.png) @@ -325,7 +325,7 @@ flink run-application -t yarn-application \ ![79_native_flink_job_status](/doc/image/platform-usage/79_native_flink_job_status.svg) -### StreamPark中的作业状态 +### Apache StreamPark™中的作业状态 > 【**待完善**】 @@ -335,7 +335,7 @@ flink run-application -t yarn-application \ ![80_native_flink_job_details_page](/doc/image/platform-usage/80_native_flink_job_details_page.png) -### StreamPark中作业详情 +### Apache StreamPark™中作业详情 ![81_streampark_flink_job_details_page](/doc/image/platform-usage/81_streampark_flink_job_details_page.png) > 同时在k8s模式下的作业,StreamPark还支持启动日志实时展示,如下 @@ -347,7 +347,7 @@ flink run-application -t yarn-application \ > 原生flink提供了 rest api > 参考:[https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/ops/rest_api/](https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/ops/rest_api/) -### StreamPark如何与第三方系统集成 +### Apache StreamPark™如何与第三方系统集成 > 也提供了Restful Api,支持与其他系统对接, > 比如:开启作业 启动|停止 restapi 接口 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/4-dockerDeployment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/4-dockerDeployment.md index 193aa624e..b8c4818c2 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/4-dockerDeployment.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/user-guide/4-dockerDeployment.md @@ -17,9 +17,9 @@ sidebar_position: 4 ### 2. 安装 docker-compose 使用 docker-compose 启动服务,需要先安装 [docker-compose](https://docs.docker.com/compose/install/) -## 部署 StreamPark +## 部署 Apache StreamPark™ -### 1. 基于 h2 和 docker-compose 部署 StreamPark +### 1. 基于 h2 和 docker-compose 部署 Apache StreamPark™ 该方式适用于入门学习、熟悉功能特性,容器重启后配置会失效,下方可以配置Mysql、Pgsql进行持久化 @@ -93,7 +93,7 @@ SPRING_DATASOURCE_PASSWORD=streampark docker-compose up -d ``` -## 基于源码构建镜像进行StreamPark部署 +## 基于源码构建镜像进行Apache StreamPark™部署 ```shell git clone https://github.com/apache/incubator-streampark.git