change json format

marvin-j97 · Mar 11, 2024 · d84ba14
1 parent 9cad823
Showing 25 changed files with 578 additions and 512 deletions.
diff --git a/README.md b/README.md
@@ -25,13 +25,13 @@ Each row can have a different set of columns (schema-less). The table is sparse,
 
 In Bigtable, stored values are byte blobs; Smoltable supports multiple data types out of the box:
 
-- String (UTF-8 encoded string)
-- Boolean (like Byte, but is unmarshalled as boolean)
-- Byte (unsigned integer, 1 byte)
-- I32 (signed integer, 4 bytes)
-- I64 (signed integer, 8 bytes)
-- F32 (floating point, 4 bytes)
-- F64 (floating point, 8 bytes)
+- string (UTF-8 encoded string)
+- boolean (like Byte, but is unmarshalled as boolean)
+- byte (unsigned integer, 1 byte)
+- i32 (signed integer, 4 bytes)
+- i64 (signed integer, 8 bytes)
+- f32 (floating point, 4 bytes)
+- f64 (floating point, 8 bytes)
 
 Column families can be grouped into locality groups, which partition groups of column families into separate LSM-trees, increasing scan performance over those column families (e.g. OLAP-style queries over a specific column).
 

diff --git a/docs/src/content/docs/guides/locality-groups.md b/docs/src/content/docs/guides/locality-groups.md
@@ -7,15 +7,15 @@ If we need to read columns of a specific column family for many rows (using a co
 
 Consider the [`webtable` example](/smoltable/guides/wide-column-intro/#real-life-example-webtable):
 
-If we wanted to get the language of all com.* pages, we would need to scan following column families:
+If we wanted to get the language of all com.\* pages, we would need to scan following column families:
 
 - `anchor`, which can be a very wide column family
 - `language`
 - `contents`, which is always huge because it stores raw HTML
 
 `language` is just 2 bytes (alpha2 country code, e.g. **DE**, **EN**, ...), but every row may require multiple kilobytes of data to be retrieved to get just the language. This heavily decreases read throughput of OLAP-style scans of large ranges.
 
-To combat this, we can define a *locality group*, which can house multiple column families. Each locality group is stored in its own LSM-tree (a single partition inside the storage engine), but row mutations across column families stay atomic.
+To combat this, we can define a _locality group_, which can house multiple column families. Each locality group is stored in its own LSM-tree (a single partition inside the storage engine), but row mutations across column families stay atomic.
 
 ![Webtable locality groups](/smoltable/webtable-locality.png)
 
@@ -125,15 +125,13 @@ curl --request POST \
       "cells": [
         {
           "column_key": "title:",
-          "value": {
-            "String": "Apache Spark™ - Unified Engine for large-scale data analytics"
-          }
+          "type": "string",
+          "value": "Apache Spark™ - Unified Engine for large-scale data analytics"
         },
         {
           "column_key": "language:",
-          "value": {
-            "String": "EN"
-          }
+          "type": "string",
+          "value": "EN"
         }
       ]
     },
@@ -142,15 +140,13 @@ curl --request POST \
       "cells": [
         {
           "column_key": "title:",
-          "value": {
-            "String": "Welcome to Apache Solr - Apache Solr"
-          }
+          "type": "string",
+          "value": "Welcome to Apache Solr - Apache Solr"
         },
         {
           "column_key": "language:",
-          "value": {
-            "String": "EN"
-          }
+          "type": "string",
+          "value": "EN"
         }
       ]
     }
@@ -195,10 +191,9 @@ Smoltable returns (again, body truncated for brevity):
           "title": {
             "": [
               {
-                "timestamp": 1706197595375136143,
-                "value": {
-                  "String": "Apache Cassandra | Apache Cassandra Documentation"
-                }
+                "time": 1706197595375136143,
+                "type": "string",
+                "value": "Apache Cassandra | Apache Cassandra Documentation"
               }
             ]
           }
@@ -284,9 +279,7 @@ By listing our table, we can see the column families have been created, and `tit
           "disk_space_in_bytes": 0,
           "locality_groups": [
             {
-              "column_families": [
-                "title"
-              ],
+              "column_families": ["title"],
               "id": "ur_pSQZ2QAYR6XsF9Xz0o"
             }
           ],
@@ -354,10 +347,9 @@ which returns (truncated):
           "title": {
             "": [
               {
-                "timestamp": 1706198298766257607,
-                "value": {
-                  "String": "Apache Cassandra | Apache Cassandra Documentation"
-                }
+                "time": 1706198298766257607,
+                "type": "string",
+                "value": "Apache Cassandra | Apache Cassandra Documentation"
               }
             ]
           }

diff --git a/docs/src/content/docs/guides/wide-column-intro.md b/docs/src/content/docs/guides/wide-column-intro.md
@@ -15,13 +15,13 @@ Each row’s cells are sorted by the column key (family + qualifier), and a time
 
 which maps to some value, the `cell value`. The cell value, unlike in Bigtable, can be a certain type:
 
-- String (UTF-8 encoded string)
-- Boolean (like Byte, but is unmarshalled as boolean)
-- Byte (unsigned integer, 1 byte)
-- I32 (signed integer, 4 bytes)
-- I64 (signed integer, 8 bytes)
-- F32 (floating point, 4 bytes)
-- F64 (floating point, 8 bytes)
+- string (UTF-8 encoded string)
+- boolean (like Byte, but is unmarshalled as boolean)
+- byte (unsigned integer, 1 byte)
+- i32 (signed integer, 4 bytes)
+- i64 (signed integer, 8 bytes)
+- f32 (floating point, 4 bytes)
+- f64 (floating point, 8 bytes)
 
 The timestamp allows storing multiple versions of the same cell.
 

diff --git a/docs/src/content/docs/reference/json-api/ingest-data.md b/docs/src/content/docs/reference/json-api/ingest-data.md
@@ -11,47 +11,44 @@ POST http://smoltable:9876/v1/table/[name]/write
 
 ```json
 {
-	"items": [
-		{
-			"row_key": "org.apache.spark",
-			"cells": [
-				{
-					"column_key": "title:",
-					"value": {
-						"String": "Apache Spark™ - Unified Engine for large-scale data analytics"
-					}
-				},
-				{
-					"column_key": "anchor:org.apache.hbase",
-					"value": {
-						"String": "Visit Apache Spark"
-					}
-				},
+  "items": [
+    {
+      "row_key": "org.apache.spark",
+      "cells": [
         {
-					"column_key": "meta:size",
-					"value": {
-						"I64": 152014
-					}
-				},
-			]
-		}
-	]
+          "column_key": "title:",
+          "type": "string",
+          "value": "Apache Spark™ - Unified Engine for large-scale data analytics"
+        },
+        {
+          "column_key": "anchor:org.apache.hbase",
+          "type": "string",
+          "value": "Visit Apache Spark"
+        },
+        {
+          "column_key": "meta:size",
+          "type": "i64",
+          "value": 152014
+        }
+      ]
+    }
+  ]
 }
 ```
 
 ### Example response
 
 ```json
 {
-	"message": "Data ingestion successful",
-	"result": {
-		"items": {
-			"cell_count": 3,
-			"row_count": 1
-		},
-		"micros_per_item": 5
-	},
-	"status": 200,
-	"time_ms": 0
+  "message": "Data ingestion successful",
+  "result": {
+    "items": {
+      "cell_count": 3,
+      "row_count": 1
+    },
+    "micros_per_item": 5
+  },
+  "status": 200,
+  "time_ms": 0
 }
-```
+```
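
For clients that still hold payloads in the old layout, the change in this diff can be sketched as a small conversion step. The hunks above show three things changing per cell: the single-key wrapper object (`{"String": ...}`, `{"I64": ...}`) becomes a lowercase `type` field next to a bare `value`, and `timestamp` in read responses becomes `time`. The helper name `migrate_cell` is hypothetical and not part of Smoltable. The sketch also assumes that every type tag maps to its lowercase form, as the updated type list in the README suggests.

```python
def migrate_cell(cell: dict) -> dict:
    """Convert one cell from the old wrapped layout to the new flat layout.

    Hypothetical helper for illustration; not part of Smoltable itself.
    """
    value = cell.get("value")

    # Old layout: "value" is a single-key object such as {"String": "EN"}.
    # Anything else is treated as already migrated and returned as a copy.
    if not (isinstance(value, dict) and len(value) == 1):
        return dict(cell)

    (tag, inner), = value.items()

    # Carry over every other field (e.g. "column_key") unchanged.
    migrated = {k: v for k, v in cell.items() if k not in ("value", "timestamp")}

    # Read responses used "timestamp" before this commit and "time" after it.
    if "timestamp" in cell:
        migrated["time"] = cell["timestamp"]

    # Assumption: the new type name is the old tag in lowercase
    # ("String" -> "string", "I64" -> "i64").
    migrated["type"] = tag.lower()
    migrated["value"] = inner
    return migrated
```

Applied to the `meta:size` cell from the old request body, this would produce the `"type": "i64"` form shown in the added lines of the hunk above.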