feat(low-code): added keys replace transformation #183
/autofix
📝 Walkthrough
The pull request introduces a new …
Changes
Sequence Diagram
sequenceDiagram
participant Stream as DeclarativeStream
participant Factory as ModelToComponentFactory
participant Transformation as KeysReplaceTransformation
Stream->>Factory: Request transformation
Factory-->>Transformation: Create KeysReplaceTransformation
Stream->>Transformation: Apply transform
Transformation->>Transformation: Replace keys in record
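The flow in the diagram can be sketched in a few lines of Python. This is a simplified stand-in, not the CDK's actual code: `KeysReplaceModel`, `MiniFactory`, and the `_constructors` mapping are hypothetical names chosen for illustration, and the real `ModelToComponentFactory` handles many more component types.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Type


@dataclass
class KeysReplaceModel:
    """Hypothetical stand-in for the parsed manifest model."""

    old: str
    new: str


@dataclass
class KeysReplaceTransformation:
    """Stand-in transformation: renames top-level keys in place."""

    old: str
    new: str

    def transform(self, record: Dict[str, Any]) -> None:
        # Iterate over a snapshot because the dict is mutated in the loop.
        for key in list(record.keys()):
            record[key.replace(self.old, self.new)] = record.pop(key)


class MiniFactory:
    """Maps a model type to the function that builds its component."""

    def __init__(self) -> None:
        self._constructors: Dict[Type[Any], Callable[[Any], Any]] = {
            KeysReplaceModel: self.create_keys_replace_transformation,
        }

    def create_component(self, model: Any) -> Any:
        return self._constructors[type(model)](model)

    @staticmethod
    def create_keys_replace_transformation(
        model: KeysReplaceModel,
    ) -> KeysReplaceTransformation:
        return KeysReplaceTransformation(old=model.old, new=model.new)


# Stream asks the factory for a component, then applies it to a record.
factory = MiniFactory()
transformation = factory.create_component(KeysReplaceModel(old=" ", new="_"))
record = {"date time": 1}
transformation.transform(record)
```

Dispatching on the model type keeps the stream unaware of which concrete transformation it receives; it only calls `transform`.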
Actionable comments posted: 1
🧹 Nitpick comments (5)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
718-733: LGTM! Consider adding value validation, wdyt?
The KeysReplace schema is well-structured and properly integrated. The fields have clear descriptions and examples.
One suggestion: Consider adding validation to ensure old and new values are not empty strings, as empty replacements could lead to unexpected behavior. Something like:
old: str = Field(
    ...,
    description="Old value to replace.",
    examples=[" ", "_"],
    title="Old value",
    min_length=1,
)
unit_tests/sources/declarative/transformations/test_keys_replace_transformation.py (2)
9-10: Consider using a more descriptive constant name
The constant _ANY_VALUE = -1 could be more descriptive. What about something like _DUMMY_VALUE or _PLACEHOLDER_VALUE, wdyt?
-_ANY_VALUE = -1
+_DUMMY_VALUE = -1  # Used as a placeholder in test records
12-15: Consider adding more test cases for better coverage
The current test only covers a basic case. Would you consider adding tests for:
- Empty strings in keys
- Multiple occurrences of the pattern
- Nested dictionaries
- Special characters in keys
- Edge cases where no replacement is needed
Here's a suggested expansion of the test cases:
def test_transform():
    test_cases = [
        # Basic case
        ({"date time": _DUMMY_VALUE}, {"date_time": _DUMMY_VALUE}),
        # Multiple spaces
        ({"date  time": _DUMMY_VALUE}, {"date__time": _DUMMY_VALUE}),
        # Nested dictionary (passes only if nested keys are replaced recursively)
        ({"outer space": {"inner space": _DUMMY_VALUE}}, {"outer_space": {"inner_space": _DUMMY_VALUE}}),
        # No replacement needed
        ({"date_time": _DUMMY_VALUE}, {"date_time": _DUMMY_VALUE}),
        # Empty string
        ({"": _DUMMY_VALUE}, {"": _DUMMY_VALUE}),
    ]
    for input_record, expected_output in test_cases:
        record = input_record.copy()
        KeysReplaceTransformation(old=" ", new="_").transform(record)
        assert record == expected_output
airbyte_cdk/sources/declarative/transformations/keys_replace_transformation.py (1)
12-16: Consider adding docstring and input validation
The class could benefit from documentation and parameter validation. What do you think about:
 @dataclass
 class KeysReplaceTransformation(RecordTransformation):
+    """Transforms record keys by replacing occurrences of a substring with another.
+
+    Args:
+        old: The substring to replace in record keys
+        new: The substring to replace with
+
+    Example:
+        >>> transform = KeysReplaceTransformation(old=" ", new="_")
+        >>> record = {"date time": 123}
+        >>> transform.transform(record)
+        >>> assert record == {"date_time": 123}
+    """
     old: str
     new: str
+
+    def __post_init__(self):
+        if not isinstance(self.old, str) or not isinstance(self.new, str):
+            raise ValueError("Both 'old' and 'new' must be strings")
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1883-1911: LGTM! The KeysReplace transformation component looks well-structured.
The component definition is clear and includes all necessary fields. I particularly like the descriptive examples provided for both old and new values. Just a minor suggestion - would you consider adding a more complex example showing how this could be used with config parameters? Something like:
old: "{{ config['old_key_pattern'] }}"
new: "{{ config['new_key_pattern'] }}"
wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
airbyte_cdk/sources/declarative/transformations/keys_replace_transformation.py (1 hunks)
unit_tests/sources/declarative/transformations/test_keys_replace_transformation.py (1 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1692-1692: LGTM! KeysReplace properly integrated
The KeysReplace transformation is correctly added to both transformation lists in DeclarativeStream and DynamicSchemaLoader.
Also applies to: 1867-1867
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
239-241: LGTM! The factory implementation looks good
The changes to ModelToComponentFactory follow the established patterns and are consistent with other transformation implementations.
Also applies to: 497-497, 619-623
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1239-1239: LGTM! The KeysReplace transformation is properly integrated.
The component is correctly added to both transformation lists in DeclarativeStream and DynamicSchemaLoader. The integration maintains schema consistency and allows the transformation to be used in both contexts.
Also applies to: 1784-1784
def transform(
    self,
    record: Dict[str, Any],
    config: Optional[Config] = None,
    stream_state: Optional[StreamState] = None,
    stream_slice: Optional[StreamSlice] = None,
) -> None:
    for key in set(record.keys()):
        record[key.replace(self.old, self.new)] = record.pop(key)
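To make the behaviour of this loop concrete, here is a minimal stand-alone sketch. `replace_keys` is a hypothetical free function that mirrors the method body; it is not part of the CDK. It illustrates two properties worth knowing: only top-level keys are renamed, and two keys that map to the same new name collapse into a single entry.

```python
from typing import Any, Dict


def replace_keys(record: Dict[str, Any], old: str, new: str) -> None:
    """Rename top-level keys in place, mirroring the loop in the PR."""
    # set(...) snapshots the keys, so mutating the dict in the loop is safe.
    for key in set(record.keys()):
        record[key.replace(old, new)] = record.pop(key)


# Nested keys are left untouched.
nested = {"outer space": {"inner space": 1}}
replace_keys(nested, " ", "_")

# "a b_c" and "a_b c" both become "a_b_c", so one value is lost.
# Which one survives depends on set iteration order.
colliding = {"a b_c": 1, "a_b c": 2}
replace_keys(colliding, " ", "_")
```

Since every key is popped and re-inserted in set iteration order, the key order of the resulting record may also differ from the input.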
🛠️ Refactor suggestion
Consider handling nested dictionaries recursively
The current implementation only handles top-level keys. Would you consider adding support for nested dictionaries? Here's a suggestion:
 def transform(
     self,
     record: Dict[str, Any],
     config: Optional[Config] = None,
     stream_state: Optional[StreamState] = None,
     stream_slice: Optional[StreamSlice] = None,
 ) -> None:
+    def transform_dict(d: Dict[str, Any]) -> Dict[str, Any]:
+        result = {}
+        for key, value in d.items():
+            new_key = key.replace(self.old, self.new)
+            if isinstance(value, dict):
+                result[new_key] = transform_dict(value)
+            else:
+                result[new_key] = value
+        return result
+
+    transformed = transform_dict(record)
+    record.clear()
+    record.update(transformed)
-    for key in set(record.keys()):
-        record[key.replace(self.old, self.new)] = record.pop(key)
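The suggested helper recurses into nested dictionaries but not into lists, so a record such as `{"line items": [{"unit price": 1}]}` would keep the inner key unchanged. A possible extension is sketched below; `replace_keys_deep` is a hypothetical name, and whether lists should be traversed at all is an assumption about the desired behaviour, not something the PR specifies.

```python
from typing import Any


def replace_keys_deep(value: Any, old: str, new: str) -> Any:
    """Return a copy of value with old replaced by new in every dict key."""
    if isinstance(value, dict):
        return {
            # Non-string keys are passed through unchanged.
            (k.replace(old, new) if isinstance(k, str) else k): replace_keys_deep(v, old, new)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [replace_keys_deep(item, old, new) for item in value]
    return value


record = {"line items": [{"unit price": 1}], "order id": 7}
transformed = replace_keys_deep(record, " ", "_")
# Keep the in-place contract of transform(): mutate the original dict.
record.clear()
record.update(transformed)
```

Building a new structure and then swapping it into the original dict avoids mutating a dictionary while iterating over it.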
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
619-623: Consider adding validation for the old/new parameters.
The implementation looks good, but what do you think about adding validation to ensure that old and new parameters are not empty strings or None? This could help catch configuration issues early. wdyt?
 def create_keys_replace_transformation(
     self, model: KeysReplaceModel, config: Config, **kwargs: Any
 ) -> KeysReplaceTransformation:
+    if not model.old or not model.new:
+        raise ValueError("Both 'old' and 'new' parameters must be non-empty strings")
     return KeysReplaceTransformation(old=model.old, new=model.new)
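One way to see why an empty `old` deserves a check: Python's `str.replace` treats an empty search string as matching between every character. An empty `new`, by contrast, simply deletes the matched substring, which may be a legitimate use, so rejecting it (as the snippet above does) is a design choice rather than a necessity. The helper name `validate_keys_replace` below is hypothetical.

```python
def validate_keys_replace(old: str, new: str) -> None:
    """Reject an empty search string; allow an empty replacement."""
    if not isinstance(old, str) or not isinstance(new, str):
        raise ValueError("Both 'old' and 'new' must be strings")
    if old == "":
        raise ValueError("'old' must be a non-empty string")


# An empty 'old' inserts 'new' around every character.
surprising = "ab".replace("", "_")  # "_a_b_"

# An empty 'new' just removes the matched substring.
stripped = "date time".replace(" ", "")  # "datetime"

validate_keys_replace(" ", "")  # accepted: deleting spaces is allowed here
try:
    validate_keys_replace("", "_")
    rejected = False
except ValueError:
    rejected = True
```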
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/declarative_component_schema.yaml
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
239-241: LGTM! Clean import additions.
The imports are well-organized and follow the existing pattern.
Also applies to: 405-407
497-497: LGTM! Clean mapping addition.
The mapping is correctly placed in alphabetical order.
Added KeysReplaceTransformations to be able to replace symbols in record keys.
Summary by CodeRabbit
New Features
- … KeysReplace, for replacing symbols in record keys.
- … DeclarativeStream to include the KeysReplace transformation.
Bug Fixes
Tests
- … KeysReplaceTransformation to validate its functionality.