Skip to content

Commit

Permalink
[c++] Use valid ASCII for wide-as-possible string current domain (#3367)
Browse files Browse the repository at this point in the history
* [c++] Use valid ASCII for wide-as-possible string current domain

* lint

* comment-neaten
  • Loading branch information
johnkerl authored Nov 22, 2024
1 parent 78741e3 commit 6c26800
Show file tree
Hide file tree
Showing 3 changed files with 14 additions and 12 deletions.
10 changes: 5 additions & 5 deletions libtiledbsoma/src/soma/soma_array.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1046,9 +1046,8 @@ void SOMAArray::_set_soma_joinid_shape_helper(
case TILEDB_CHAR:
case TILEDB_GEOM_WKB:
case TILEDB_GEOM_WKT:
// TODO: make these named constants b/c they're shared
// with arrow_adapter.
ndrect.set_range(dim_name, "", "\xff");
// See comments in soma_array.h.
ndrect.set_range(dim_name, "", "\x7f");
break;

case TILEDB_INT8:
Expand Down Expand Up @@ -1427,8 +1426,9 @@ void SOMAArray::_set_domain_helper(
auto lo_hi = ArrowAdapter::get_table_string_column_by_index(
newdomain, i);
if (lo_hi[0] == "" && lo_hi[1] == "") {
// Don't care -> as big as possible
ndrect.set_range(dim_name, "", "\xff");
// Don't care -> as big as possible.
// See comments in soma_array.h.
ndrect.set_range(dim_name, "", "\x7f");
} else {
throw TileDBSOMAError(std::format(
"domain (\"{}\", \"{}\") cannot be set for "
Expand Down
11 changes: 6 additions & 5 deletions libtiledbsoma/src/soma/soma_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -906,13 +906,14 @@ class SOMAArray : public SOMAObject {
// imitate.
// * Core current domain for string dims must _not_ be a nullptr pair.
// * In TileDB-SOMA, unless the user specifies otherwise, we use "" for
// min and "\xff" for max.
// * However, "\xff" causes display problems in Python. It's also
// flat-out confusing to show to users.
// min and "\x7f" for max. (We could use "\x7f" but that causes
// display problems in Python.)
//
// To work with all these factors, if the current domain is the default
// "" to "\xff", return an empty-string pair just as we do for domain.
if (arr[0] == "" && arr[1] == "\xff") {
// "" to "\7f", return an empty-string pair just as we do for domain.
// (There was some pre-1.15 software using "\xff" and it's super-cheap
// to check for that as well.)
if (arr[0] == "" && (arr[1] == "\x7f" || arr[1] == "\xff")) {
return std::pair<std::string, std::string>("", "");
} else {
return std::pair<std::string, std::string>(arr[0], arr[1]);
Expand Down
5 changes: 3 additions & 2 deletions libtiledbsoma/src/utils/arrow_adapter.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1125,8 +1125,9 @@ ArraySchema ArrowAdapter::tiledb_schema_from_arrow_schema(
std::string hi = strings[4];
if (lo == "" && hi == "") {
// These mean "I the caller don't care, you
// libtiledbsoma make it as big as possible"
ndrect.set_range(col_name, "", "\xff");
// libtiledbsoma make it as big as possible".
// See also comments in soma_array.h.
ndrect.set_range(col_name, "", "\x7f");
} else {
ndrect.set_range(col_name, lo, hi);
LOG_DEBUG(std::format(
Expand Down

0 comments on commit 6c26800

Please sign in to comment.