06_exercise

siposl edited this page May 31, 2022 · 3 revisions

06 | Exercise - Creating Datasheets

31.05.22

In this exercise session, we will create our own dataset together with an accompanying datasheet.

Agenda

  • 10:15 - 10:20: Welcome and arrival
  • 10:20 - 10:25: In-exercise tasks introduction
  • 10:25 - 10:40: Creating a dataset
  • 10:40 - 11:20: Creating the datasheet
  • 11:20 - 11:40: Presenting the datasheet & discussion
  • 11:40 - 11:45: Goodbye and outlook

Session notes

Slides for the sixth exercise.

In-exercise Task: Creating a dataset

Together, we will decide on a topic for our dataset and then add data entries to it. These entries must not contain any personal data: add only made-up information, possibly with some data points purposefully left missing.
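As a rough illustration of what such entries could look like, here is a minimal Python sketch. The topic (house plants), the field names, and all values are hypothetical and invented for this example; the actual topic is decided during the session. Missing data points are represented as `None`, and the helper `count_missing` is an assumed name, not part of any exercise tooling.

```python
# Hypothetical, made-up entries. Topic and field names are assumptions
# for illustration only; no personal data is included.
entries = [
    {"name": "Fern A", "height_cm": 30, "color": "green"},
    {"name": "Cactus B", "height_cm": None, "color": "green"},  # purposefully missing
    {"name": "Tulip C", "height_cm": 25, "color": None},        # purposefully missing
]


def count_missing(rows):
    """Return a dict mapping each field name to its number of missing (None) values."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + int(value is None)
    return counts


print(count_missing(entries))
```

Counting the gaps per field like this may be useful later, since the composition section of a datasheet typically asks about missing information.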

✏️ Please use this link to participate in the creation of the dataset.

In-exercise Task: Creating the datasheet

We will create a datasheet for the dataset we just created, based on the proposal by Gebru et al. [1]. A datasheet is a set of questions "grouped into sections that roughly match the key stages of the dataset lifecycle" [1]. The seven sections are: motivation, composition, collection process, preprocessing/cleaning/labeling, uses, distribution, and maintenance.
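One possible way to picture this structure is as a mapping from section name to question/answer pairs. The sketch below is only an illustration: the names `SECTIONS`, `empty_datasheet`, and `unanswered_sections` are hypothetical, and the question and answer shown are placeholders rather than content from the shared datasheet document.

```python
# The seven datasheet sections listed above.
SECTIONS = [
    "Motivation",
    "Composition",
    "Collection process",
    "Preprocessing/cleaning/labeling",
    "Uses",
    "Distribution",
    "Maintenance",
]


def empty_datasheet(sections=SECTIONS):
    """Create a datasheet skeleton: one empty question -> answer dict per section."""
    return {section: {} for section in sections}


def unanswered_sections(datasheet):
    """List the sections that do not contain any answers yet."""
    return [section for section, answers in datasheet.items() if not answers]


datasheet = empty_datasheet()
# Illustrative question and placeholder answer (assumptions for this sketch).
datasheet["Motivation"]["For what purpose was the dataset created?"] = (
    "Placeholder answer written during the exercise."
)

print(unanswered_sections(datasheet))
```

Keeping the sections as an ordered list mirrors the lifecycle order of the proposal and makes it easy to see which sections still need a group's attention.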

During the exercise, each group will be assigned some sections to work on.

✏️ Please use this link to participate in the creation of the datasheet.

Steps

  • Look at your section's questions in the datasheet document.
  • Try to answer the questions as best as you can.
  • Reflect on the process of the datasheet's creation.

References

[1] T. Gebru et al., “Datasheets for Datasets.” arXiv, Dec. 01, 2021. Accessed: May 19, 2022. [Online]. Available: http://arxiv.org/abs/1803.09010
