
04 | Exercise - Reproducibility

17.05.22

In this exercise session, we will review and discuss the second assignment. After that, we delve deeper into the reproducible workflow, which you will need for the third assignment.

Agenda

10:15 - 10:20: Welcome and arrival
10:20 - 10:25: Peer-review recap and group forming
10:25 - 11:00: Assignment 2 peer-review
11:00 - 11:30: Assignment 2 discussion
11:30 - 11:40: Assignment 3 introduction
11:40 - 11:45: Goodbye and outlook

Session notes

Slides for the fourth exercise.

Assignment 3 - Reproducibility [1 point]

🕐 Due: Tue, 2022-06-07 10:00 AM

After having analyzed the data in the last assignment, we will continue our work with the German Credit Data (GCD)¹ dataset.

❗ Update: Please use the South German Credit Data Set² instead! We discussed this in the exercise, but if you weren't present, we recommend reading this note. ~~Make use of your preprocessing script from the last assignment and modify it if necessary~~.

❗ Update 2: You may now use the already preprocessed South German dataset that we provide in the assignment 3 folder.

Create a subfolder named after your group in hcds-summer-2022/assignments/A3_Reproducibility/. Inside your group folder, create a Jupyter notebook called A3_Reproducibility.ipynb. You will use this notebook to work on the assignment's tasks and to document your steps. In the end, your group's final solution should be contained within the notebook.

❗ Don't forget to submit your group's final commit hash to the Exercise Programming Assignment 3 Whiteboard assignment!
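
If it helps, the hash of the currently checked-out commit can be read from within Python. This is only a minimal sketch: the helper name `current_commit_hash` is hypothetical, and it assumes that `git` is installed and that the code runs inside your clone of the repository. It relies on the standard `git rev-parse HEAD` command.

```python
import subprocess
from typing import Optional


def current_commit_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the full hash of the checked-out commit, or None if unavailable.

    Hypothetical helper for illustration; it simply wraps `git rev-parse HEAD`.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, the directory does not exist,
        # or repo_dir is not inside a git repository
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(current_commit_hash())
```

Make sure to read the hash only after your final commit has been pushed, otherwise it will not match what the reviewers see.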

Reproducible Model Training

Now you will have the opportunity to use the knowledge of the dataset you gained in the previous exercise. You will build a model on the German Credit Data¹ and evaluate it.

In this task, you shall also demonstrate that you can follow best practices for open scientific research in designing and implementing your project and make your project fully reproducible by others. Review the reproducibility slides from the lecture. While working on the task, keep in mind the key practices of a reproducible workflow.

Build a model that predicts the creditworthiness of applicants. Use the preprocessed South German Credit dataset.
Optimize your model until you are satisfied.
At what point did you stop? ✏️ Reflect on what made you satisfied with the model's performance. Document your insights in a separate markdown cell inside your notebook!
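
The steps above could be sketched roughly as follows. This is one possible approach, not a required one: it assumes scikit-learn is installed, uses a decision tree and a small hyperparameter grid purely as examples, and trains on synthetic stand-in data because the file name and column layout of the provided dataset are not fixed here. The seed value and the function name `tune_and_evaluate` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 42  # hypothetical fixed seed so that every run gives the same result


def tune_and_evaluate(seed: int = SEED):
    """Tune a classifier with cross-validation and score it on held-out data."""
    # Synthetic stand-in data; replace with the provided preprocessed dataset.
    X, y = make_classification(
        n_samples=1000, n_features=20, weights=[0.3, 0.7], random_state=seed
    )

    # Hold out a test set that is never touched during optimization.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Small example grid; 5-fold cross-validation on the training set only.
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=seed),
        param_grid={"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 20]},
        scoring="balanced_accuracy",
        cv=5,
    )
    search.fit(X_train, y_train)

    # search.score uses the same scorer (balanced accuracy) on the test set.
    return search.best_params_, search.best_score_, search.score(X_test, y_test)


best_params, cv_score, test_score = tune_and_evaluate()
print("best parameters:", best_params)
print("cross-validated balanced accuracy:", round(cv_score, 3))
print("held-out balanced accuracy:", round(test_score, 3))
```

Comparing the cross-validated score with the held-out score is one way to argue when to stop: once further tuning no longer improves the held-out score, additional optimization mostly fits noise.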

Make sure that ..

.. you have a clear directory structure.
.. you have a file (e.g. README.txt) documenting all your data, files and folders.
.. you document your code and the steps you took.
.. your results are completely reproducible by using the "Run All Cells" command in your Jupyter notebook.
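
A minimal sketch of two habits that support the last point: fixing random seeds at the top of the notebook and recording the software environment next to the results. The seed value, the helper names `set_seed` and `environment_snapshot`, and the package list are assumptions for illustration; only the Python standard library is required, and NumPy is seeded only if it happens to be installed.

```python
import importlib.metadata
import json
import platform
import random

SEED = 2022  # hypothetical value; any fixed integer works


def set_seed(seed: int) -> None:
    """Seed the random number generators used in the notebook."""
    random.seed(seed)
    try:
        import numpy as np  # optional; skipped if NumPy is not installed
        np.random.seed(seed)
    except ImportError:
        pass


def environment_snapshot(packages):
    """Collect interpreter, platform, and package versions for the README."""
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "seed": SEED,
        "packages": versions,
    }


set_seed(SEED)
# Example package list; adjust it to what your notebook actually imports.
snapshot = environment_snapshot(["numpy", "pandas", "scikit-learn"])
print(json.dumps(snapshot, indent=2))
```

Models and data splitters that accept their own seed argument (for example `random_state` in scikit-learn) should receive the same fixed value explicitly, since global seeding alone does not always cover them.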

This task is intended to be open-ended so that you are able to explore and try out different ideas. You may use whatever models, packages, and methods you know. Draw inspiration from your team's prior knowledge and your backgrounds.

Reproducing Results

❗ Note: Despite the problems with the German Credit dataset, we will still be using it for this exercise. Reproducing the results is still possible, even if you draw different conclusions from them. Maybe you are able to find something that should have raised some eyebrows in the original work.

For the next task, you will try to reproduce someone else's results: a model from the paper Fairness definitions explained³. Here is how the authors describe their model:

For our discussion, we trained an off-the-shelf logistic regression classifier in Python. We applied the ten-fold cross-validation technique, using 90% of the data for training and the remaining 10% of the data for testing and illustrating each of the definitions. We used numerical and binary attributes directly as features in the classification and converted each categorical attribute to a set of binary features, arriving at 48 features in total.

Read at least the first two pages of the paper (everything before the Statistical Metrics subsection).
Think about what you know about the paper's approach to building a creditworthiness predictor, i.e. what preprocessing did the authors do, what model did they build, and how did they evaluate its performance?
Try to reproduce the model. You will need to preprocess the data like the authors did. Are you able to derive the same coefficients as shown in Table 2 of the paper?
Based on your knowledge from the course, how would you describe the approach the authors took with regard to reproducibility? Is there something you would do differently? ✏️ Write down your feedback in a separate markdown cell inside your notebook!
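
The setup quoted from the paper (categorical attributes converted to binary features, logistic regression, ten-fold cross-validation) can be sketched as below. This is not the authors' code: it assumes pandas, NumPy, and scikit-learn are installed, and it runs on a tiny toy table with hypothetical column names instead of the real data. The quoted description does not state whether folds were shuffled, which solver or regularization was used, or which fit produced the reported coefficients, so those choices here are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

SEED = 0  # hypothetical fixed seed
rng = np.random.default_rng(SEED)
n = 200

# Toy stand-in data with hypothetical column names; replace with the real dataset.
df = pd.DataFrame({
    "duration": rng.integers(6, 60, size=n),
    "amount": rng.integers(250, 10000, size=n),
    "purpose": rng.choice(["car", "furniture", "education"], size=n),
    "housing": rng.choice(["rent", "own", "free"], size=n),
})
# Toy target loosely tied to duration, only so the model has something to learn.
y = (df["duration"] + rng.normal(0, 10, size=n) < 30).astype(int)

# Numerical attributes are used directly; each categorical attribute
# becomes a set of binary (one-hot) features.
X = pd.get_dummies(df, columns=["purpose", "housing"], dtype=float)

# Ten-fold cross-validation: each fold trains on 90% and tests on 10% of the rows.
model = LogisticRegression(max_iter=5000)
folds = KFold(n_splits=10, shuffle=True, random_state=SEED)  # shuffling is an assumption
scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
print("number of features:", X.shape[1])
print("mean accuracy over 10 folds:", round(scores.mean(), 3))

# Fit once on all rows to inspect coefficients (an assumption about how
# a coefficient table could be produced).
model.fit(X, y)
coefficients = pd.Series(model.coef_[0], index=X.columns)
print(coefficients)
```

On the real data, comparing `X.shape[1]` with the 48 features mentioned in the quote is a quick check of whether your encoding matches the authors' description.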

Feedback

We are interested in your feedback in order to improve this course. We will read all of your feedback and evaluate it. What you share may have a direct impact on the rest of the course, or future iterations of it.

Add a file called feedback.txt to your group folder.

Write down your feedback on the lecture, the exercises, or the assignments in the text file. Please also write down roughly how much time you needed for the assignment. You may also write about your insights, what you found interesting, or questions that you have.

Checklist for Deliverables

A3_Reproducibility.ipynb
README.txt or another file describing your data
feedback.txt
commit hash

References

¹ Professor Dr. Hans Hofmann (1994). Statlog (German Credit Data) Data Set, UCI Machine Learning Repository.

² Ulrike Grömping (2019). South German Credit Data Set, UCI Machine Learning Repository.

³ S. Verma and J. Rubin, “Fairness definitions explained,” in Proceedings of the International Workshop on Software Fairness, Gothenburg, Sweden, May 2018, pp. 1–7. doi: 10.1145/3194770.3194776.

Credits & Licenses

Content is available under Attribution-Share Alike 3.0 Unported unless otherwise noted.
