small fixes to TDM 30100 and 40100 Project 12
mdw333 committed Nov 11, 2024
1 parent edc36b4 commit e83f3de
Showing 2 changed files with 8 additions and 6 deletions.
@@ -77,11 +77,12 @@ model.fit(X_train, y_train)
 y_pred = model.predict(X_test)
 ----

-Please calculate and print the mean squared error of the Bayesian Ridge Regression model on the test data.
+Please calculate and print the mean squared error of the Bayesian Ridge Regression model on the test data, and also the output of the RMSE of the model.

 .Deliverables
 ====
 - Mean Squared Error of the Bayesian Ridge Regression model on the test data
+- Output of the RMSE of the model
 ====

 === Question 2 (2 points)
@@ -111,7 +112,7 @@ y_pred_sorted = y_pred[np.argsort(y_test)]
 ----

 You may notice that the graph is a bit messy as the predictions are not perfect. To get a better visualization, we can overlay our confidence intervals on the graph. A confidence interval is a range of values that is some percentage likely to contain the true value. For example, a 95% confidence interval around a predicted value means that we are 95% confident that the true value lies within that range. The number of standard deviations away from the mean (or predicted value) determines the confidence level. Below is a table of the number of standard deviations and their corresponding confidence levels. Additionally, you can use the following formula to calculate the number of standard deviations away from the mean for a given confidence level:
-[cols="2,2,2,2", options="header"]
+[cols="2,2", options="header"]
 |====
 |Number of Standard Deviations | Confidence Level
 |1 | 68.27%
@@ -123,9 +124,9 @@ You may notice that the graph is a bit messy as the predictions are not perfect.
 How do we get these confidence levels from the model? scikit_learn makes it very easy, by providing an optional argument to the `predict`` method. By setting the `return_std` argument to True, the predict method will return a tuple of the list of predictions and a list of the standard deviations for each prediction. Then, we can use the standard deviations to calculate the confidence intervals.

 In order to graph the confidence intervals, you will need to calculate the upper and lower bounds of the confidence interval for each prediction. Then, you can use the matplotlib `fill_between` function to fill in the area between the upper and lower bounds. Please graph the y_test values and the 68.27% confidence intervals of the y_pred values on the same graph.
+
 .Deliverables
 ====
-- Output of the RMSE of the model
 - Graph of the y_test values against the y_pred values
 - Graph displaying the y_test values and the 68.27% confidence intervals of the y_pred values
 ====
@@ -77,11 +77,12 @@ model.fit(X_train, y_train)
 y_pred = model.predict(X_test)
 ----

-Please calculate and print the mean squared error of the Bayesian Ridge Regression model on the test data.
+Please calculate and print the mean squared error of the Bayesian Ridge Regression model on the test data, and also the output of the RMSE of the model.

 .Deliverables
 ====
 - Mean Squared Error of the Bayesian Ridge Regression model on the test data
+- Output of the RMSE of the model
 ====

 === Question 2 (2 points)
@@ -111,7 +112,7 @@ y_pred_sorted = y_pred[np.argsort(y_test)]
 ----

 You may notice that the graph is a bit messy as the predictions are not perfect. To get a better visualization, we can overlay our confidence intervals on the graph. A confidence interval is a range of values that is some percentage likely to contain the true value. For example, a 95% confidence interval around a predicted value means that we are 95% confident that the true value lies within that range. The number of standard deviations away from the mean (or predicted value) determines the confidence level. Below is a table of the number of standard deviations and their corresponding confidence levels. Additionally, you can use the following formula to calculate the number of standard deviations away from the mean for a given confidence level:
-[cols="2,2,2,2", options="header"]
+[cols="2,2", options="header"]
 |====
 |Number of Standard Deviations | Confidence Level
 |1 | 68.27%
@@ -123,9 +124,9 @@ You may notice that the graph is a bit messy as the predictions are not perfect.
 How do we get these confidence levels from the model? scikit_learn makes it very easy, by providing an optional argument to the `predict`` method. By setting the `return_std` argument to True, the predict method will return a tuple of the list of predictions and a list of the standard deviations for each prediction. Then, we can use the standard deviations to calculate the confidence intervals.

 In order to graph the confidence intervals, you will need to calculate the upper and lower bounds of the confidence interval for each prediction. Then, you can use the matplotlib `fill_between` function to fill in the area between the upper and lower bounds. Please graph the y_test values and the 68.27% confidence intervals of the y_pred values on the same graph.
+
 .Deliverables
 ====
-- Output of the RMSE of the model
 - Graph of the y_test values against the y_pred values
 - Graph displaying the y_test values and the 68.27% confidence intervals of the y_pred values
 ====
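
The Question 1 text in this diff asks for both the mean squared error and the RMSE. As a rough illustration of how the two relate, here is a minimal sketch in plain Python. The helper names and the sample numbers are hypothetical and are not taken from the project files. In the project itself, scikit-learn's `sklearn.metrics.mean_squared_error(y_test, y_pred)` should give the same MSE, and the RMSE is its square root.

```python
import math


def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error_manual(y_true, y_pred):
    """RMSE is the square root of the MSE, so it is in the units of the target."""
    return math.sqrt(mean_squared_error_manual(y_true, y_pred))


# Hypothetical stand-ins for the project's y_test and y_pred.
y_test = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error_manual(y_test, y_pred)
rmse = root_mean_squared_error_manual(y_test, y_pred)
print(f"MSE:  {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
```

Because the RMSE is expressed in the same units as the target, it is often easier to interpret than the MSE.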

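The Question 2 text in this diff links the number of standard deviations to a confidence level and asks for upper and lower bounds around each prediction. The sketch below shows one way to compute both, assuming the predictive distribution is normal. The helper names and sample values are hypothetical. In the project, the predictions and standard deviations would come from `model.predict(X_test, return_std=True)`.

```python
import math
from statistics import NormalDist


def confidence_level(num_std):
    """Two-sided coverage of mean +/- num_std standard deviations (normal assumption)."""
    return math.erf(num_std / math.sqrt(2))


def num_std_for(confidence):
    """Number of standard deviations needed for a given two-sided confidence level."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def interval_bounds(predictions, std_devs, num_std=1.0):
    """Lower and upper bounds: prediction -/+ num_std * standard deviation."""
    lower = [p - num_std * s for p, s in zip(predictions, std_devs)]
    upper = [p + num_std * s for p, s in zip(predictions, std_devs)]
    return lower, upper


# Hypothetical values standing in for:
#   y_pred, y_std = model.predict(X_test, return_std=True)
y_pred = [2.0, 4.5, 6.0]
y_std = [0.5, 1.0, 0.25]

# One standard deviation on each side gives the 68.27% interval.
lower, upper = interval_bounds(y_pred, y_std, num_std=1.0)

print(f"1 standard deviation covers {confidence_level(1):.2%}")
print(f"95% confidence needs {num_std_for(0.95):.2f} standard deviations")
print("lower bounds:", lower)
print("upper bounds:", upper)
```

The `lower` and `upper` lists can then be passed to matplotlib's `fill_between(x, lower, upper)` to shade the band, with `x` being whatever index the predictions are plotted against.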