docs: Update references (#393)
* Initial commit

* Update xref
Techassi authored Sep 21, 2023
1 parent cf371d9 commit f55c8ff
Showing 2 changed files with 42 additions and 22 deletions.
24 changes: 14 additions & 10 deletions docs/modules/hdfs/pages/getting_started/installation.adoc
= Installation

On this page you will install the Stackable HDFS operator and its dependency, the ZooKeeper operator, as well as the
commons and secret operators, which are required by all Stackable operators.

== Stackable Operators

There are two ways to run Stackable Operators:

. Using xref:management:stackablectl:index.adoc[]
. Using Helm

=== stackablectl

`stackablectl` is the command line tool to interact with Stackable operators and our recommended way to install
operators. Follow the xref:management:stackablectl:installation.adoc[installation steps] for your platform.

After you have installed `stackablectl`, run the following command to install all operators necessary for the HDFS
cluster:

[source,bash]
----
include::example$getting_started/getting_started.sh[tag=stackablectl-install-operators]
----

The tool will show

[source]
----
[INFO ] Installing hdfs operator
----
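
For reference, the included script boils down to a single `stackablectl` call along these lines (a sketch; the exact
operator list and versions come from the getting started script):

[source,bash]
----
# Sketch: install the operators the HDFS cluster needs in one invocation
stackablectl operator install commons secret zookeeper hdfs
----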

TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use `stackablectl`. For
example, you can use the `--cluster kind` flag to create a Kubernetes cluster with link:https://kind.sigs.k8s.io/[kind].
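
Combined with the flag from the tip, a one-step local setup could look like this sketch:

[source,bash]
----
# Sketch: create a local kind cluster and install the operators into it
stackablectl operator install commons secret zookeeper hdfs --cluster kind
----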

=== Helm

Then install the Stackable Operators:

[source,bash]
----
include::example$getting_started/getting_started.sh[tag=helm-install-operators]
----
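
The included snippet corresponds to Helm commands roughly like the following (a sketch; the repository name, URL and
chart names are assumptions based on the public Stackable Helm repository):

[source,bash]
----
# Sketch: add the Stackable Helm repository and install each operator chart
helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/
helm install commons-operator stackable-stable/commons-operator
helm install secret-operator stackable-stable/secret-operator
helm install zookeeper-operator stackable-stable/zookeeper-operator
helm install hdfs-operator stackable-stable/hdfs-operator
----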

Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the HDFS cluster (as well as the CRDs
for the required operators). You are now ready to deploy HDFS in Kubernetes.
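
You can check the result with standard kubectl commands, for example:

[source,bash]
----
# Confirm the operator Deployments are running and the CRDs are registered
kubectl get deployments
kubectl get crds | grep stackable.tech
----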

== What's next

xref:getting_started/first_steps.adoc[Set up an HDFS cluster] and its dependencies and
xref:getting_started/first_steps.adoc#_verify_that_it_works[verify that it works].
40 changes: 28 additions & 12 deletions docs/modules/hdfs/pages/index.adoc
:description: The Stackable Operator for Apache HDFS is a Kubernetes operator that can manage Apache HDFS clusters. Learn about its features, resources, dependencies and demos, and see the list of supported HDFS versions.
:keywords: Stackable Operator, Hadoop, Apache HDFS, Kubernetes, k8s, operator, engineer, big data, metadata, storage, cluster, distributed storage

The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS]
(Hadoop Distributed File System) is used to set up HDFS in high-availability mode. HDFS is a distributed file system
designed to store and manage massive amounts of data across multiple machines in a fault-tolerant manner. The Operator
depends on the xref:zookeeper:index.adoc[] to operate a ZooKeeper cluster to coordinate the active and standby NameNodes.

== Getting started

Follow the xref:getting_started/index.adoc[Getting started guide], which walks you through installing the Stackable
HDFS and ZooKeeper Operators, setting up ZooKeeper and HDFS, and writing a file to HDFS to verify that everything is set
up correctly.

Afterwards you can consult the xref:usage-guide/index.adoc[] to learn more about tailoring your HDFS configuration to
your needs, or have a look at the <<demos, demos>> for some example setups.

== Operator model

The Operator manages the _HdfsCluster_ custom resource. The cluster implements three
xref:home:concepts:roles-and-role-groups.adoc[roles]:

* DataNode - responsible for storing the actual data.
* JournalNode - responsible for keeping a shared edit log of namespace changes, which is used to perform failovers in
case the active NameNode fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
* NameNode - responsible for keeping track of HDFS blocks and providing access to the data.


image::hdfs_overview.drawio.svg[A diagram depicting the Kubernetes resources created by the operator]
The operator creates the following K8S objects per role group defined in the custom resource.

* Service - ClusterIP used for intra-cluster communication.
* ConfigMap - HDFS configuration files like `core-site.xml`, `hdfs-site.xml` and `log4j.properties` are defined here and
mounted in the pods.
* StatefulSet - where the replica count, volume mounts and more for each role group are defined.
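
For a running cluster, these objects can be inspected with a label selector, for example (the
`app.kubernetes.io/instance` label and the cluster name `simple-hdfs` are assumptions for illustration):

[source,bash]
----
# Sketch: list the objects created for a cluster named "simple-hdfs"
kubectl get statefulsets,services,configmaps -l app.kubernetes.io/instance=simple-hdfs
----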

In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that
exposes all container ports to the outside world (from the perspective of K8S).
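
To see which pods are exposed this way, you can select on that label:

[source,bash]
----
# List the pods that get their own NodePort service
kubectl get pods -l hdfs.stackable.tech/pod-service=true
----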

In the custom resource you can specify the number of replicas per role group (NameNode, DataNode or JournalNode). A
minimal working configuration (sketched as a manifest below the list) requires:

* 2 NameNodes (HA)
* 1 JournalNode
* 1 DataNode (the number of DataNodes should be at least the `clusterConfig.dfsReplication` factor)
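
As a sketch, such a minimal configuration could be expressed like this. Field names, the cluster name `simple-hdfs`,
the ZooKeeper discovery ConfigMap name and the product version are assumptions; consult the getting started guide for
the exact schema:

[source,bash]
----
# Sketch: a minimal HdfsCluster matching the replica counts above
kubectl apply -f - <<EOF
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs
spec:
  image:
    productVersion: "3.3.4"
  clusterConfig:
    zookeeperConfigMapName: simple-hdfs-znode
    dfsReplication: 1
  nameNodes:
    roleGroups:
      default:
        replicas: 2
  journalNodes:
    roleGroups:
      default:
        replicas: 1
  dataNodes:
    roleGroups:
      default:
        replicas: 1
EOF
----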

The Operator creates a xref:concepts:service_discovery.adoc[service discovery ConfigMap] for the HDFS instance. The
discovery ConfigMap contains the `core-site.xml` file and the `hdfs-site.xml` file.
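
Clients can read these files straight from the ConfigMap, for example (assuming the ConfigMap is named after a cluster
called `simple-hdfs`):

[source,bash]
----
# Sketch: print the hdfs-site.xml from the discovery ConfigMap
kubectl get configmap simple-hdfs -o jsonpath='{.data.hdfs-site\.xml}'
----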

== Dependencies

HDFS depends on ZooKeeper for coordination between nodes. You can run a ZooKeeper cluster with the
xref:zookeeper:index.adoc[]. Additionally, the xref:commons-operator:index.adoc[] and
xref:secret-operator:index.adoc[] are needed.

== [[demos]]Demos

Two demos that use HDFS are available.

**xref:demos:hbase-hdfs-load-cycling-data.adoc[]** loads a dataset of cycling data from S3 into HDFS and then uses HBase
to analyze the data.

**xref:demos:jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data.adoc[]** showcases the integration between HDFS and
Jupyter. New York Taxi data is stored in HDFS and analyzed in a Jupyter notebook.

== Supported Versions

