refactor(interactive): Support timezone info for timestamp type. (#3975)
This PR extends `timestamp` parsing to accept values that carry timezone information, such as `2010-02-14T15:32:10.447+00:00`. `out_zone_offset_present` is deliberately left unset, because a timezone offset is not expected in the data unless the type definition specifies a timezone; setting it would otherwise cause a parsing failure. See the relevant Arrow sources: https://github.com/apache/arrow/blob/3e7ae5340a123c1040f98f1c36687b81362fab52/cpp/src/arrow/csv/converter.cc#L373 and https://github.com/apache/arrow/blob/3e7ae5340a123c1040f98f1c36687b81362fab52/cpp/src/arrow/type.h#L1635
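To make the accepted format concrete, here is a minimal Python sketch of what a timezone-qualified timestamp denotes. It is not the PR's parser (which relies on Arrow); it only assumes that a value such as `2010-02-14T15:32:10.447+00:00` should resolve to a single instant regardless of the offset it is written with. The helper name `to_epoch_millis` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Reference instant used to express timestamps as milliseconds since the epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(text: str) -> int:
    """Parse an ISO 8601 timestamp with a UTC offset into epoch milliseconds.

    Hypothetical helper for illustration only.
    """
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # This sketch only handles values that carry an explicit offset.
        raise ValueError(f"no timezone offset in {text!r}")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


utc_value = to_epoch_millis("2011-12-27T20:11:20.924+00:00")
plus8_value = to_epoch_millis("2011-12-27T20:11:20.924+08:00")
# The same wall-clock time written with +08:00 is eight hours earlier in UTC.
offset_gap_ms = utc_value - plus8_value
```

The two sample rows that differ only in offset therefore represent different instants, which is why the offset cannot simply be discarded during loading.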
1 parent ff04409, commit 7c1dfe7
Showing 5 changed files with 310 additions and 5 deletions.
@@ -0,0 +1,9 @@
start|end|timestamp
169|71|2010-02-14T15:32:10.447+00:00
171|71|2011-12-27T20:11:20.921+00:00
172|71|2011-12-27T20:11:20.922+00:00
173|71|2011-12-27T20:11:20.923+00:00
174|71|2011-12-27T20:11:20.924+08:00
169|16|2011-12-27T20:11:20.925+08:00
172|16|2011-12-27T20:11:20.926+08:00
174|16|2011-12-27T20:11:20.927+08:00
@@ -0,0 +1,122 @@
graph: movies
loading_config:
  data_source:
    scheme: file # file, oss, s3, hdfs; only file is supported now
  import_option: init # append, overwrite, only init is supported now
  format:
    type: csv
    metadata:
      delimiter: "|" # other loading configuration places here
      header_row: true # whether to use the first row as the header
      quoting: false
      quote_char: '"'
      double_quote: true
      escape_char: '\'
      escaping: false
      block_size: 4MB
      batch_reader: false
vertex_mappings:
  - type_name: Person # must align with the schema
    inputs:
      - Person.csv
  - type_name: User
    inputs:
      - User.csv
  - type_name: Movie
    inputs:
      - Movie.csv
edge_mappings:
  - type_triplet:
      edge: ACTED_IN
      source_vertex: Person
      destination_vertex: Movie
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    inputs:
      - ACTED_IN.csv
  - type_triplet:
      edge: DIRECTED
      source_vertex: Person
      destination_vertex: Movie
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    inputs:
      - DIRECTED.csv
  - type_triplet:
      edge: FOLLOWS
      source_vertex: User
      destination_vertex: Person
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    column_mappings:
      - column:
          index: 2
          name: timestamp
        property: timestamp
    inputs:
      - FOLLOWS_TIMESTAMP.csv
  - type_triplet:
      edge: PRODUCED
      source_vertex: Person
      destination_vertex: Movie
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    inputs:
      - PRODUCED.csv
  - type_triplet:
      edge: REVIEW
      source_vertex: User
      destination_vertex: Movie
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    column_mappings:
      - column:
          index: 2
          name: rating
        property: rating
    inputs:
      - REVIEWED.csv
  - type_triplet:
      edge: WROTE
      source_vertex: Person
      destination_vertex: Movie
    source_vertex_mappings:
      - column:
          index: 0
          name: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: id
    inputs:
      - WROTE.csv
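The `FOLLOWS` mapping above reads column 0 as the source vertex id, column 1 as the destination vertex id, and column 2 as the `timestamp` property, from a `|`-delimited file with a header row and quoting disabled. The following Python sketch applies those same settings to two of the sample rows using only the standard library. It is an illustration of the mapping, not the loader's implementation; `load_follows` and the dictionary keys are hypothetical names.

```python
import csv
import io
from datetime import datetime

# Two rows copied from the sample data, including the header line.
SAMPLE = """start|end|timestamp
169|71|2010-02-14T15:32:10.447+00:00
174|71|2011-12-27T20:11:20.924+08:00
"""


def load_follows(text: str) -> list:
    """Apply the FOLLOWS column mapping to pipe-delimited text (hypothetical helper)."""
    # delimiter: "|" and quoting: false from the loading config.
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    next(reader)  # header_row: true, so the first line is skipped as a header
    edges = []
    for row in reader:
        edges.append({
            "src": int(row[0]),                            # index 0 -> source vertex id
            "dst": int(row[1]),                            # index 1 -> destination vertex id
            "timestamp": datetime.fromisoformat(row[2]),   # index 2 -> timestamp property
        })
    return edges


edges = load_follows(SAMPLE)
```

Each parsed `timestamp` keeps its offset, so `edges[1]["timestamp"].utcoffset()` reports eight hours for the `+08:00` row.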
@@ -0,0 +1,110 @@
name: movies
schema:
  vertex_types:
    - type_id: 0
      type_name: Movie
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_id: 1
          property_name: released
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_id: 2
          property_name: tagline
          property_type:
            string:
              long_text:
        - property_id: 3
          property_name: title
          property_type:
            string:
              long_text:
      primary_keys:
        - id
    - type_id: 1
      type_name: Person
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_id: 1
          property_name: born
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_id: 2
          property_name: name
          property_type:
            string:
              long_text:
      primary_keys:
        - id
    - type_id: 2
      type_name: User
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_id: 1
          property_name: born
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_id: 2
          property_name: name
          property_type:
            string:
              long_text:
      primary_keys:
        - id
  edge_types:
    - type_id: 0
      type_name: ACTED_IN
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_id: 1
      type_name: DIRECTED
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_id: 2
      type_name: REVIEW
      vertex_type_pair_relations:
        - source_vertex: User
          destination_vertex: Movie
          relation: MANY_TO_MANY
      properties:
        - property_id: 0
          property_name: rating
          property_type:
            primitive_type: DT_SIGNED_INT32
    - type_id: 3
      type_name: FOLLOWS
      vertex_type_pair_relations:
        - source_vertex: User
          destination_vertex: Person
          relation: MANY_TO_MANY
      properties:
        - property_id: 0
          property_name: timestamp
          property_type:
            temporal:
              timestamp:
    - type_id: 4
      type_name: WROTE
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_id: 5
      type_name: PRODUCED
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
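In the schema above, `FOLLOWS.timestamp` is the only property declared with the `temporal: timestamp:` type, and it is declared without a timezone. As a rough illustration of how such declarations can be located, the Python sketch below walks a hand-written dictionary that mirrors two of the edge types. The dictionary is an assumption about how the YAML would look once parsed (an empty `timestamp:` key becoming `None`), and `timestamp_properties` is a hypothetical helper, not part of the project.

```python
# Hand-written mirror of two edge types from the schema; an empty YAML
# value such as `timestamp:` is represented here as None.
SCHEMA = {
    "edge_types": [
        {
            "type_name": "REVIEW",
            "properties": [
                {
                    "property_name": "rating",
                    "property_type": {"primitive_type": "DT_SIGNED_INT32"},
                },
            ],
        },
        {
            "type_name": "FOLLOWS",
            "properties": [
                {
                    "property_name": "timestamp",
                    "property_type": {"temporal": {"timestamp": None}},
                },
            ],
        },
        # Edge types without properties simply omit the key.
        {"type_name": "WROTE"},
    ]
}


def timestamp_properties(schema: dict) -> list:
    """Return (edge type, property name) pairs declared as temporal timestamps."""
    found = []
    for edge in schema.get("edge_types", []):
        for prop in edge.get("properties", []):
            temporal = prop["property_type"].get("temporal")
            if temporal is not None and "timestamp" in temporal:
                found.append((edge["type_name"], prop["property_name"]))
    return found


timestamp_columns = timestamp_properties(SCHEMA)
```

A loader could use a list like `timestamp_columns` to decide which input columns need timestamp parsing rather than integer or string conversion.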