Add Unit tests and monitoring capability #722
base: feat/e2e-fabric-dataops-sample
- `*.md` files covering data testing, monitoring and observability using OpenTelemetry, and things to consider for data projects.
- Code for the OpenTelemetry implementation.
- Notebooks for the city-safety ETL, updated with unit test cases and the OpenTelemetry implementation.
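To illustrate the kind of unit test case the notebook bullet refers to, here is a minimal sketch. It assumes the transformation is factored into a pure function so it can be tested without a Spark session. The function name `normalize_record`, the field names, and the sample values are hypothetical and are not taken from the notebooks in this PR.

```python
import unittest
from datetime import datetime


def normalize_record(record: dict) -> dict:
    """Hypothetical minor transformation: trim strings and parse the timestamp."""
    return {
        "category": record["category"].strip().lower(),
        "city": record["city"].strip(),
        "reported_at": datetime.strptime(record["reported_at"], "%Y-%m-%d %H:%M:%S"),
    }


class NormalizeRecordTest(unittest.TestCase):
    def test_trims_and_parses_fields(self):
        raw = {
            "category": "  Noise ",
            "city": " Boston",
            "reported_at": "2024-01-15 10:15:00",
        }
        result = normalize_record(raw)
        self.assertEqual(result["category"], "noise")
        self.assertEqual(result["city"], "Boston")
        self.assertEqual(result["reported_at"], datetime(2024, 1, 15, 10, 15, 0))

    def test_missing_field_raises(self):
        # A record without the expected keys should fail loudly, not silently.
        with self.assertRaises(KeyError):
            normalize_record({"city": "Boston"})


def run_tests() -> unittest.TestResult:
    # In a notebook, run the suite in-process instead of calling unittest.main(),
    # which would parse the kernel's command-line arguments and try to exit.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeRecordTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)
```

Calling `run_tests()` in a notebook cell prints one line per test and returns a result object whose `wasSuccessful()` method can gate the rest of the run.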
Feedback from Elena and Lace as of Aug 27:
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
Leaving these comments for now.
More comments here: https://app.reviewnb.com/Azure-Samples/modern-data-warehouse-dataops/blob/feat/sguda-e2e-fabric-dataops-sample/e2e_samples/fabric_dataops_sample/src/notebooks/nb-city-safety.ipynb/
- The telemetry process flow shows the implementation using the *OpenTelemetry Collector* option, which has multiple targets.
- If we are using the *OpenTelemetry SDK for Azure monitoring* option, then Application Insights/Log Analytics is the only target for telemetry data.

### Process flow diagram code
I appreciate wanting to have everything as code (including diagrams), but I think this adds unnecessary clutter to the page. Do we want to swap over to drawio format?
@@ -0,0 +1,98 @@
# Microsoft Fabric Notebook Samples

The notebooks in this section represent sample ETL flows using Microsoft Fabric. We used [Azure Open Datasets](https://learn.microsoft.com/azure/open-datasets/dataset-catalog#population-and-safety) as source data in these examples.
At the very top of the page, before going into details, we should highlight the key thing the reader is going to learn.
For example: "These sample notebooks demonstrate how to do automated tests and observability in Microsoft Fabric."
![opentelemetry_trace_output.png](../../images/opentelemetry_trace_output.png)

## Sample - Covid data
Is this second sample notebook needed if it doesn't highlight anything around DevOps (automated tests, observability, etc.)?
- Reads from ADLS and writes that data to OneLake after a minor transformation.
- The data source is the Microsoft Open Datasets [City safety data](https://learn.microsoft.com/azure/open-datasets/dataset-catalog#population-and-safety).
- The notebook **doesn't require** an attached default lakehouse; instead, it uses absolute paths to create/load managed tables.
Do we want to move some of these details to the notebook itself as comments (or markdown cells)? I think this page just needs a simple description of the pipeline and the key concepts demonstrated in the notebook. Essentially, focus on what the user is going to learn by going through the notebook. I see this page as just an index of all the notebooks.
All the details about the notebook itself should be in the notebook.
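As an example of a detail that could live in a notebook cell rather than on this page, the absolute-path approach mentioned in the bullets under review might look like the sketch below. It assumes the OneLake `abfss://` URI layout (`<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables/<table>`). The helper name `onelake_table_path` and the workspace, lakehouse, and table names are hypothetical and not taken from the notebook.

```python
def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build an absolute OneLake path for a lakehouse table.

    Hypothetical helper: with an absolute path, the notebook does not need
    a default lakehouse attached to read or write the table.
    """
    for label, value in (("workspace", workspace), ("lakehouse", lakehouse), ("table", table)):
        # Reject empty names and path separators so the URI stays well-formed.
        if not value or "/" in value:
            raise ValueError(f"invalid {label} name: {value!r}")
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


# Hypothetical names, for illustration only.
target_path = onelake_table_path("ws_dataops", "lh_city_safety", "city_safety")

# In a Fabric notebook with a Spark session available, the path could then be used as:
#   df.write.format("delta").mode("overwrite").save(target_path)
#   df = spark.read.format("delta").load(target_path)
```

Keeping the path construction in one small function also makes it easy to cover with a unit test, independent of Spark.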
Monitoring write-up is now part of #867
Type of PR
Purpose
Implementation samples which include:
Author pre-publish checklist
Issues Closed or Referenced