Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Unit tests and monitoring capability #722

Draft
wants to merge 17 commits into
base: feat/e2e-fabric-dataops-sample
Choose a base branch
from

Conversation

sreedhar-guda
Copy link
Contributor

Type of PR

  • Documentation changes
  • Code changes
  • Test changes

Purpose

Implementation samples which include:

  • Unit testcases which can be made part of CI process
  • Monitoring and telemetry using OpenTelemetry

Author pre-publish checklist

  • Added test to prove my fix is effective or new feature works
  • No PII in logs
  • Made corresponding changes to the documentation

Issues Closed or Referenced

  • Closes #issue_number
  • References #issue_number

@sreedhar-guda
Copy link
Contributor Author

Feedback from Elena and Lace as of Aug 27:

  • Make the main notebook the good one (move the unit tcs else where). --> TO DO
  • It seems a lot, could make it harder for the users to follow.
  • Good to separate Fabric specific concepts from others (like unit testing and OTEL) --> TO DO, we already have these files as separate ones. Need to find a good place to move them.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@devlace devlace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- The telemetry process flow shows the implementation using *Open Telemetry Collector* Option which has multiple targets.
- If we are using *OpenTelemetry SDK for Azure monitoring* option then AppInsights/LogAlanlytics is the only target for telemetry data.

### Process flow diagram code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate wanting to have everything as code (including diagrams). I think this is adding unnecessary clutter in the page.. Do we want to swap over to drawio format?

@@ -0,0 +1,98 @@
# Microsoft Fabirc Notebook Samples

The notebooks in this section represent sample ETL flows using Microsoft Fabric. We used [Azure Open Datasets](https://learn.microsoft.com/azure/open-datasets/dataset-catalog#population-and-safety) as source data in these examples.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should highlight first thing at the top the key thing the reader is going to learn from this page before going into details.

For example: "These sample notebooks demonstrate how to do automated tests and observability in Microsoft Fabric."


![opentelemetry_trace_output.png](../../images/opentelemetry_trace_output.png)

## Sample - Covid data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this second sample notebook needed if it doesn't highlight anything around devops ? (Automated tests, observability, etc).


- Reads from ADLS and writes that data to OneLake after a minor transformation.
- Microsoft Open datasets - [City safety data](https://learn.microsoft.com/azure/open-datasets/dataset-catalog#population-and-safety) is the data source used.
- The notebook **doesn't require** a default lakehouse attached and instead uses absolute paths to create/load managed tables.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to move some of these details to the notebook itself as comments (or markdown cells?). I think we just need the simple description of the pipeline and the key concepts demonstrated in the notebook in this page. Essentially, focus on what the user is going learn by going through the notebook. I see this page as just a index of all the notebooks.

All the details about the notebook itself should be in the notebook.

@devlace devlace linked an issue Oct 29, 2024 that may be closed by this pull request
@devlace devlace marked this pull request as draft November 11, 2024 22:36
@sreedhar-guda sreedhar-guda added the e2e: fabric Related with E2E Fabric Sample label Nov 19, 2024
@sreedhar-guda
Copy link
Contributor Author

Monitoring write-up is now part of #867

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
e2e: fabric Related with E2E Fabric Sample
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add "Application Monitoring" Framework
2 participants