feat(low-code cdk): add transformation to dynamic schema loader #176

Merged
10 commits merged into main on Dec 18, 2024

Conversation

lazebnyi
Contributor

@lazebnyi lazebnyi commented Dec 16, 2024

What

During dynamic schema generation, we need to transform keys or add fields to enhance the schema's compatibility and functionality.

How

A schema transformations component has been integrated into the dynamic schema loader. This component is responsible for managing a list of transformations, enabling us to easily modify schema keys or insert additional fields as needed.
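
As a rough illustration, a trimmed manifest sketch of what such a configuration might look like (the retriever and the field values are placeholders, not taken from this PR):

    schema_loader:
      type: DynamicSchemaLoader
      retriever:
        # ... retriever that fetches the raw schema records ...
      schema_transformations:
        # Convert keys returned by the API (e.g. PascalCase) to snake_case
        - type: KeysToSnakeCase
        # Add a property that the API response does not contain
        - type: AddFields
          fields:
            - path: ["static_field"]
              value: "static_value"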

Summary by CodeRabbit

  • New Features

    • Introduced schema_transformations to enhance schema manipulation capabilities, allowing for various transformations such as adding/removing fields and key formatting.
    • Added support for the FlattenFields transformation type in stream configuration.
  • Bug Fixes

    • Updated test cases to reflect schema changes, ensuring accuracy in expected outputs.
  • Tests

    • Added a new test case for validating the FlattenFields transformation and adjusted existing tests for updated schema definitions.

@github-actions github-actions bot added the enhancement (New feature or request) label on Dec 16, 2024
Contributor

coderabbitai bot commented Dec 16, 2024

📝 Walkthrough

This pull request introduces a new property named schema_transformations to the DynamicSchemaLoader within the declarative_component_schema.yaml file. This property allows for a list of transformations to be applied to the schema, including adding fields, removing fields, and converting keys. The changes enhance the flexibility of the DynamicSchemaLoader class by enabling users to define how the schema should be modified after extraction, integrating seamlessly into the existing schema structure.

Changes

  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml: Added schema_transformations property to DynamicSchemaLoader definition
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py: Added schema_transformations field to DynamicSchemaLoader class
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py: Added create_keys_to_snake_transformation method; updated create_dynamic_schema_loader to include schema_transformations
  • airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py: Added schema_transformations field; updated get_json_schema method to apply transformations
  • unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py: Updated tests to include schema_transformations and modified key names in expected schema
  • unit_tests/sources/declarative/test_manifest_declarative_source.py: Added test case for FlattenFields transformation

Possibly related PRs

  • feat(low-code cdk): add dynamic schema loader #104: The changes in this PR introduce the DynamicSchemaLoader, which is directly related to the main PR's addition of the schema_transformations property in the DynamicSchemaLoader class, enhancing schema handling capabilities.
  • feat(low-code cdk): add KeyToSnakeCase transformation #178: This PR adds the KeysToSnakeCase transformation, which is referenced in the main PR's schema_transformations property, indicating a direct relationship in terms of schema transformation functionalities.
  • feat(low-code cdk): add flatten fields #181: The addition of the FlattenFields transformation in this PR complements the schema_transformations property introduced in the main PR, as both enhance the schema manipulation capabilities within the Airbyte framework.

Suggested reviewers

  • maxi297
  • aldogonzalez8
  • darynaishchenko


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

2517-2526: LGTM! Consider adding clarifying documentation?

The implementation looks good and aligns well with the PR objectives. The new transformations property follows the same pattern as stream-level transformations.

What do you think about adding a description that clarifies when to use record selector transformations vs stream transformations? This could help users choose the appropriate level for their transformations. Something like:

      transformations:
        title: Transformations
-       description: A list of transformations to be applied to each output record.
+       description: A list of transformations to be applied to each output record during the record selection phase. Use these transformations when you need to modify records before they are processed by stream-level transformations.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 216cd43 and 9260b53.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (9 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
  • unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (4 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1510-1516: LGTM! Nice addition of the transformations field to RecordSelector.

The optional transformations field allows for flexible record manipulation through various transformation types. The implementation maintains backward compatibility since it's optional.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

1926-1931: LGTM! Clean implementation of transformation handling.

The implementation correctly creates transformation components from the models and appends them to the transformations list. The code is straightforward and follows the factory pattern consistently.

unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)

182-185: LGTM! Great test coverage for the transformation functionality.

The tests thoroughly verify the transformation handling in both the schema parsing and component creation. The coverage includes both AddFields and RemoveFields transformations, ensuring the feature works as expected.

Also applies to: 302-309, 1315-1318, 1340-1340

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1502-1508: LGTM! Consider adding an example to the field description?

The implementation looks good and aligns well with the PR objectives to allow transformations in the record selector. The type definition and documentation are clear.

Would you consider adding an example to the field description to make it more user-friendly? Something like:

examples=[{
    "transformations": [
        {"type": "AddFields", "fields": [{"path": ["extra_field"], "value": "{{ record['some_field'] }}"}]},
        {"type": "RemoveFields", "field_pointers": [["sensitive_data"]]}
    ]
}]

wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9260b53 and 6976fc7.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1 hunks)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)

182-185: LGTM! Consider enhancing test coverage?

The addition of transformations to the record selector looks good. Would you consider adding a test case with multiple field pointers to ensure the RemoveFields transformation handles multiple fields correctly? wdyt?


2250-2268: LGTM! Consider adding edge cases?

The test case for AddFields transformation with string value type is well-structured. Would you consider adding test cases for edge cases like empty strings or special characters? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6976fc7 and 62dc10c.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (10 hunks)
🔇 Additional comments (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)

2198-2201: LGTM! Clear and helpful comment

The comment effectively explains why transformations appear twice in the test results. This helps future maintainers understand the expected behavior.


2216-2234: LGTM! Well-structured test case

The test case for AddFields transformation without value type is thorough and clearly structured. The expected results are well-defined and the comment explains the duplication clearly.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)

1650-1656: Consider adding validation for the name field

The name field is marked as optional with a default empty string. Should we add some validation to ensure the name follows a specific pattern when provided? For example, ensuring it's a valid Python identifier? wdyt?

    name: Optional[str] = Field(
-        "", description="The stream name.", example=["Users"], title="Name"
+        "", 
+        description="The stream name.", 
+        example=["Users"], 
+        title="Name",
+        regex="^[a-zA-Z_][a-zA-Z0-9_]*$"
    )

Also applies to: 1657-1658


1925-1929: Consider adding type validation for partition router lists

The partition router can now be a single router or a list of routers. Should we add validation to ensure the list isn't empty when provided? wdyt?

    partition_router: Optional[
        Union[
            CustomPartitionRouter,
            ListPartitionRouter,
            SubstreamPartitionRouter,
            List[
                Union[
                    CustomPartitionRouter, ListPartitionRouter, SubstreamPartitionRouter
                ]
            ],
        ]
    ] = Field(
        [],
        description="PartitionRouter component that describes how to partition the stream, enabling incremental syncs and checkpointing.",
        title="Partition Router",
+       min_items=1
    )

Also applies to: 2003-2007

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62dc10c and 6aadd76.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (10 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1770-1780: LGTM! Schema definition matches implementation.

The transformations field schema definition in the YAML file correctly matches the Python implementation, allowing the same transformation types and maintaining consistency.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)

531-533: OAuth configuration enhancements look good!

The additions to OAuth configuration are well-documented with clear examples. The new fields enable better OAuth flow customization and user input handling.

Also applies to: 827-829, 898-912, 914-919


1848-1862: LGTM: Transformations field added to DynamicSchemaLoader

The transformations field matches the structure used in DeclarativeStream, maintaining consistency across the codebase.


1971-1973: LGTM: Optional download_extractor field

The download_extractor field is properly marked as optional with clear typing.


2071-2076: LGTM: Components resolver field in DynamicDeclarativeStream

The components_resolver field is well-typed with a clear description of its purpose.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (2)

107-107: Could you add a docstring?
This adds the list of transformations with a factory default. Perhaps consider adding a short docstring for clarity, wdyt?


133-134: Check if None is more explicit.
You're calling _transform(properties, {}). If it's not used, passing None might be clearer. Would you consider that, wdyt?

unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)

67-71: Additional test coverage for schema transformations.
You introduced a new "KeysToSnakeCase" transformation. Maybe add another transformation variety to ensure robust coverage, wdyt?

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

597-601: create_keys_to_snake_transformation method.
This direct return is simple. Would it help if we passed parameters to the transformation, or is this minimal signature enough, wdyt?

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1770-1780: LGTM! The schema_transformations implementation looks clean and well-structured.

The implementation follows the existing patterns in the codebase and aligns perfectly with the PR objectives. One small suggestion: would you consider adding a default empty array value for consistency with other similar fields in the schema? wdyt?

      schema_transformations:
        title: Schema Transformations
        description: A list of transformations to be applied to the schema.
        type: array
+       default: []
        items:
          anyOf:
            - "$ref": "#/definitions/AddFields"
            - "$ref": "#/definitions/CustomTransformation"
            - "$ref": "#/definitions/RemoveFields"
            - "$ref": "#/definitions/KeysToLower"
            - "$ref": "#/definitions/KeysToSnakeCase"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6aadd76 and a241356.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (4 hunks)
  • unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (4 hunks)
🔇 Additional comments (13)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (5)

7-7: Imports look good.
No issues found here, wdyt?


16-16: Direct import of RecordTransformation recognized.
Looks straightforward and helps ensure typed transformations. wdyt?


18-18: Added dependencies for Config, StreamSlice, and StreamState.
Including these type hints keeps the code more expressive. wdyt?


138-138: Schema snippet alignment.
The “properties” key is updated to the transformed schema. This is consistent with standard JSON Schema usage, wdyt?


141-153: Order of transformations is enforced implicitly.
All transformations are applied in a loop, ensuring the final state is cumulative. If the order matters, we may want more explicit documentation. Would that add clarity, wdyt?

unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (3)

238-238: Renamed property to 'first_name'.
This aligns with the transformation logic. Changes look correct, wdyt?


253-254: Sample data matches new property names.
This looks consistent with code changes. wdyt?


265-267: Confirm naming logic for "FirstName" → "first_name".
The test properly verifies the snake-case transformation. Looks good, wdyt?

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1834-1848: New field for schema transformations.
This field makes transformations more flexible. Seems well-defined. wdyt?

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

239-241: New import for KeysToSnakeCaseModel.
Clear and consistent with the rest of the transformation imports. wdyt?


481-481: Mapping KeysToSnakeCase in the factory.
This integration ensures your transformation can be constructed correctly. wdyt?


1653-1659: schema_transformations list building.
You gather transformations into a list before injecting them. This is clean and readable, wdyt?


1674-1674: Passing the transformations into DynamicSchemaLoader.
This final assignment is straightforward and consistent. wdyt?

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)

248-250: How about making the transformation flow more explicit in the test?

The test correctly validates the transformation pipeline, but we could make it more self-documenting. What do you think about:

  1. Adding comments to explain the transformation flow:

    • PascalCase in schema response (FirstName)
    • Gets transformed to snake_case (first_name)
    • Static field gets added (static_field)
  2. Maybe split the test into smaller, focused test cases for each transformation? This would make it easier to maintain and debug, wdyt? 🤔

Also applies to: 264-265, 276-278
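
For illustration only, a sketch of that flow on a single property (the exact value used for the added static_field is an assumption about the test fixture, not copied from it):

    # Raw schema response from the API (PascalCase key)
    FirstName:
      type: string

    # After KeysToSnakeCase
    first_name:
      type: string

    # After AddFields: an extra property is inserted alongside it
    first_name:
      type: string
    static_field:
      type: string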

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a241356 and 193d59f.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (4 hunks)
🔇 Additional comments (1)
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)

67-81: Consider adding more test cases for schema transformations?

The new schema transformations look good! However, we might want to strengthen our test coverage. What do you think about adding test cases for:

  • Failed transformations
  • Empty transformation lists
  • Multiple AddFields transformations
  • Invalid field definitions

This would help ensure robustness, wdyt? 🤔

@lazebnyi lazebnyi changed the title feat(low-code cdk): add transformation to record selector feat(low-code cdk): add transformation to schema loader Dec 18, 2024
@lazebnyi lazebnyi changed the title feat(low-code cdk): add transformation to schema loader feat(low-code cdk): add transformation to dynamic schema loader Dec 18, 2024
@lazebnyi lazebnyi requested a review from maxi297 December 18, 2024 04:01
@aldogonzalez8
Contributor

I liked the idea of having it in the record selector, since that would cover this for other structures as well. However, if we want transformations at the dynamic schema_loader level, should we also file an issue to add transformations to components_resolver?

cc @maxi297

@lazebnyi
Contributor Author

@aldogonzalez8 Do you have cases where we need to transform records in components_resolver?

@aldogonzalez8
Contributor

@aldogonzalez8 Do you have cases where we need to transform records in components_resolver?

Yes, I'm already using it here. I was considering moving away from my custom transformation to use the flatten transformation you created, but I still need transformations to be available for components_resolver.

@lazebnyi
Contributor Author

Could you help me understand why you need this transformation at the components_resolver level? I'm asking because the resolver is typically used to fetch records, iterate through them, dynamically create the streams, and populate the stream template with values from those records.

@lazebnyi
Contributor Author

If I understand correctly, with this transformation, you extract certain fields to the properties level and then reference them from the components_mapping. But what is the benefit of doing this? Is it possible to collect values directly in the components_mapping and skip the transformation stage?

cc @aldogonzalez8

Contributor

@maxi297 maxi297 left a comment

LGTM! :shipit:

@maxi297
Contributor

maxi297 commented Dec 18, 2024

Regarding @aldogonzalez8's comment on transformations for the components resolver: they will be needed for source-airtable as well, to put parent stream information on the records so that those values are available when we resolve. @darynaishchenko

@aldogonzalez8
Contributor

If I understand correctly, with this transformation, you extract certain fields to the properties level and then reference them from the components_mapping. But what is the benefit of doing this? Is it possible to collect values directly in the components_mapping and skip the transformation stage?

cc @aldogonzalez8

Yes, I could transform the data in the extractor that receives it, but I liked how organically this worked with transformations. This can happen in many ways, I guess. Still, I feel that in the future, anywhere we retrieve records/data, it would be nice to have the option of applying transformations to them.

@lazebnyi
Contributor Author

@maxi297 For airtable, I don't think we need it at all, as we will have access to parent fields in the components_mapping once we add stream_slicing to the http resolver:

kwargs["stream_slice"] = stream_slice # type: ignore[assignment] # stream_slice will always be of type Mapping[str, Any]

Contributor

@aldogonzalez8 aldogonzalez8 left a comment

APPROVED

@lazebnyi
Contributor Author

@maxi297 @aldogonzalez8 @darynaishchenko For now, the migration to "manifest only" (migration without any custom components) for source google sheets or source airtable is not hindered by the absence of transformations in the components resolver. I prefer to leave the components resolver without transformations, since once transformations are moved to the record selector, we will get this for free.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
unit_tests/sources/declarative/test_manifest_declarative_source.py (1)

1370-1453: Consider adding more test cases for edge cases?

The test case effectively verifies basic flattening functionality. Would you consider adding tests for edge cases like:

  • Empty nested objects
  • Deeply nested fields (>2 levels)
  • Handling of array fields
  • Name collisions after flattening

This would help ensure robust handling of various data structures, wdyt?
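
For context, a rough before/after sketch of flattening a nested record; the key naming and collision handling shown are assumptions, not verified against the FlattenFields implementation:

    # Nested input record
    id: 1
    profile:
      name: "Jane"
      city: "Oslo"

    # One plausible flattened output
    id: 1
    name: "Jane"
    city: "Oslo"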

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

1771-1782: Consider enhancing the schema transformation documentation?

The schema_transformations definition looks good, but would you consider adding:

  1. More detailed description of each transformation's behavior
  2. Examples of common use cases
  3. Notes on transformation order/precedence
  4. Potential impacts on schema validation

This would help users better understand how to effectively use these transformations, wdyt?


1238-1238: Consider adding configuration options to FlattenFields?

The current definition is simple and functional. Would you consider adding optional configuration parameters like:

  1. max_depth to control flattening depth
  2. separator to customize the field name separator
  3. include_paths/exclude_paths for selective flattening
  4. preserve_arrays flag to control array handling

This would provide more flexibility in how fields are flattened, wdyt?
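
Purely to visualize that suggestion, a hypothetical configuration with such options; none of these parameters exist in the current FlattenFields definition:

    - type: FlattenFields
      max_depth: 2            # hypothetical: limit flattening depth
      separator: "_"          # hypothetical: joiner for flattened key names
      exclude_paths:
        - ["metadata"]        # hypothetical: leave this subtree nested
      preserve_arrays: true   # hypothetical: do not flatten array items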

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1840-1855: LGTM! Schema transformations field looks good

The schema_transformations field is well-structured and properly typed. The transformations list matches the record-level transformations which provides consistency.

Hey, what do you think about adding a brief example in the field description to show how these transformations can be chained together? Something like:

schema_transformations:
  - type: KeysToSnakeCase
  - type: FlattenFields

wdyt? 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 010543c and 5405a52.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (1 hunks)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

1674-1674: LGTM! Addition of FlattenFields transformation

The addition of FlattenFields to the transformations list is consistent with the existing transformation types and enhances the schema capabilities.

@lazebnyi lazebnyi merged commit 9563c33 into main Dec 18, 2024
20 checks passed
@lazebnyi lazebnyi deleted the lazebnyi/add-transformation-to-record-selector branch December 18, 2024 16:52