added solution explanation to fa24 midterm q1
Jystine committed Nov 18, 2024
1 parent 7892b9c commit b873eaa
Showing 2 changed files with 49 additions and 3 deletions.
50 changes: 47 additions & 3 deletions docs/fa24-midterm/index.html
@@ -33,7 +33,7 @@
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
@@ -87,7 +87,7 @@
</style>
<link rel="stylesheet" href="..\assets\theme.css" />
<script defer=""
src="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.js"></script>
src="https://cdn.jsdelivr.net/npm/katex@0.15.1/dist/katex.min.js"></script>
<script>document.addEventListener("DOMContentLoaded", function () {
var mathElements = document.getElementsByClassName("math");
var macros = [];
@@ -103,7 +103,7 @@
}}});
</script>
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/npm/katex@latest/dist/katex.min.css" />
href="https://cdn.jsdelivr.net/npm/katex@0.15.1/dist/katex.min.css" />
</head>
<body>
<header id="title-block-header">
@@ -181,6 +181,17 @@ <h2 class="accordion-header" id="heading1">
<h1 class="title"> </h1>
</header>
<p><strong>Answer</strong>: None of these.</p>
<p>The index uniquely identifies each row of a DataFrame. As a result,
for a column to be a candidate for the index, it must not contain
repeated values. Since a single address can give out several types of
candy, values in <code>"address"</code> can show up multiple times.
Similarly, values in <code>"candy"</code> can show up multiple times,
since a candy appears once for every house that gives it out. Finally, a
neighborhood has multiple houses, so if more than one of those houses
shows up, that value in <code>"neighborhood"</code> will appear multiple
times. Since <code>"address"</code>, <code>"candy"</code>, and
<code>"neighborhood"</code> can all contain repeated values, none of
them can be the index for <code>treat</code>.</p>
<hr/>
<h5>Difficulty: ⭐️⭐️⭐️</h5>
<p>
@@ -216,6 +227,39 @@ <h1 class="title"> </h1>
</header>
<p><strong>Answer</strong>: <code>treat.get("candy").iloc[1]</code> and
<code>treat.sort_values(by="candy", ascending=False).get("candy").loc[1]</code></p>
<ul>
<li><p><strong>Option 1</strong>:
<code>treat.get("candy").iloc[1]</code> gets the <code>candy</code>
column and then retrieves the value at integer position <code>1</code>
(the second row), which is <code>"M&amp;M"</code>.</p></li>
<li><p><strong>Option 2</strong>:
<code>treat.sort_values(by="candy", ascending=False).get("candy").iloc[1]</code>
sorts the rows of <code>treat</code> by <code>candy</code> in descending
order (the alphabetically last candy is at the top) and then retrieves
the value at integer position <code>1</code> (the second row) of the
<code>candy</code> column, which is the second-to-last candy
alphabetically. The entire dataset is not shown, but among the given
rows, the second-to-last candy alphabetically is
<code>"Skittles"</code>. Additional rows can only push that value later
in the alphabet, so <code>"M&amp;M"</code> cannot be the second-to-last
candy alphabetically in the full dataset.</p></li>
<li><p><strong>Option 3</strong>:
<code>treat.sort_values(by="candy", ascending=False).get("candy").loc[1]</code>
is very similar to the last option; however, this time,
<code>.loc[1]</code> is used instead of <code>.iloc[1]</code>. This
means that instead of looking at the row in position <code>1</code>
(second row) of the sorted DataFrame, we are finding the row with an
index label of <code>1</code>. When the rows are sorted by
<code>candy</code> in descending order, the index labels remain with
their original rows, so the <code>"M&amp;M"</code> row is retrieved when
we search for the index label <code>1</code>.</p></li>
<li><p><strong>Option 4</strong>:
<code>treat.set_index("candy").index[-1]</code> sets the index to the
<code>candy</code> column and then retrieves the last element in the
index (<code>candy</code>). The entire dataset is not shown, but in the
given rows, the last value would be <code>"Skittles"</code> and not
<code>"M&amp;M"</code>. The last value of the full dataset could be
<code>"M&amp;M"</code>, but since we are not sure, this option is not
selected.</p></li>
</ul>
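<p>As a rough illustration of the four options, the sketch below uses
plain pandas and a small made-up stand-in for <code>treat</code>. The
four candy values and their order are assumptions chosen to match the
facts stated above; the real dataset is not reproduced here.</p>

```python
import pandas as pd

# Hypothetical stand-in for `treat`: "M&M" sits at index label 1 and
# "Skittles" is in the last row, as described in the explanation above.
treat = pd.DataFrame({"candy": ["Twix", "M&M", "Kit Kat", "Skittles"]})

# Option 1: integer position 1 of the unsorted column.
option_1 = treat.get("candy").iloc[1]

# Sorting reorders the rows, but each row keeps its original index label.
by_candy = treat.sort_values(by="candy", ascending=False)

# Option 2: integer position 1 of the sorted column (second-to-last alphabetically).
option_2 = by_candy.get("candy").iloc[1]

# Option 3: the row whose index label is 1, wherever sorting moved it.
option_3 = by_candy.get("candy").loc[1]

# Option 4: the last index value after making "candy" the index.
option_4 = treat.set_index("candy").index[-1]

print(option_1, option_2, option_3, option_4)
```

<p>Only Options 1 and 3 evaluate to <code>"M&amp;M"</code> on this
made-up data, because <code>.loc</code> follows index labels while
<code>.iloc</code> follows row positions.</p>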
<hr/>
<h5>Difficulty: ⭐️⭐️⭐️</h5>
<p>
2 changes: 2 additions & 0 deletions problems/fa24-midterm/q01.md
@@ -12,6 +12,8 @@ Which of the following columns would be an appropriate index for the

**Answer**: None of these.

The index uniquely identifies each row of a DataFrame. As a result, for a column to be a candidate for the index, it must not contain repeated values. Since a single address can give out several types of candy, values in `"address"` can show up multiple times. Similarly, values in `"candy"` can show up multiple times, since a candy appears once for every house that gives it out. Finally, a neighborhood has multiple houses, so if more than one of those houses shows up, that value in `"neighborhood"` will appear multiple times. Since `"address"`, `"candy"`, and `"neighborhood"` can all contain repeated values, none of them can be the index for `treat`.
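
One way to check the uniqueness requirement is sketched below with plain pandas. The rows are invented for illustration (the addresses and neighborhood names are assumptions, not values from the exam's dataset); only the column names come from the question.

```python
import pandas as pd

# Hypothetical rows; the real `treat` data is not reproduced here.
treat = pd.DataFrame({
    "address": ["1 Elm St", "1 Elm St", "2 Oak Ave", "3 Pine Rd"],
    "candy": ["Skittles", "M&M", "M&M", "Twix"],
    "neighborhood": ["Maple Hills", "Maple Hills", "Maple Hills", "Cedar Park"],
})

# A column can uniquely identify rows only if none of its values repeat.
has_repeats = {col: not treat.get(col).is_unique for col in treat.columns}
print(has_repeats)
```

On these made-up rows every column contains a repeated value, which mirrors the argument for why none of them can serve as the index.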

<average>54</average>
# END SOLUTION
