2013-11-16

2013-11-14

This week has been one of those slower weeks. I was disappointed that we didn't have speakers this week, since I was really looking forward to hearing the guests present. In both class periods, the class ended up being more of a working session for the groups. Although there wasn't much structure, ultimately I think it was good that we had some time to discuss with our larger horizontal group, as opposed to simply meeting up in our smaller analyzer subgroup.

On Wednesday, my subgroup met in the evening for a few hours to try and "regroup." In the past few classes, we had been trying to better understand how the optimization problem worked and to figure out how to improve upon the alarm strategy, and it felt like we were stagnating, re-tackling the same issue class after class. For the first hour we met, we worked to reproduce some of the visualizers' code on error diagrams. As we ran through the code, we found it very difficult to reproduce, running into confusing variable names, not knowing what the data source was, and so on. Working through this reproducibility exercise really reinforced the importance of writing reproducible code. After posting on their issue tracker, we realized that they had modified some of Luen's code, which in retrospect we should have explored first. Nonetheless, it was a helpful exercise and reminder for when we continue working. Afterwards, we tried to tackle how to evaluate the function we would improve, but that ran into the same problem of not fully knowing how to improve the model. We spent our last hour hashing out a SMART goal and coming up with a plan for something we could attain given our time and knowledge constraints. Overall, the three hours we met brought together a lot of the concepts we've learned over the semester. Having hashed out a new plan, we're ready to start tackling our problem again this weekend.

2013-11-16

This afternoon, we talked about writing skeleton code to plot aftershock arrival times. There were many considerations when we started to hash out the problem. Before we could even start coding, we ran into dependency issues with ggplot; we only found the list of dependencies after having worked through them one by one with sudo apt-get install and sudo pip install. After that, we had to consider which catalogs to use, since most earthquakes are of smaller magnitudes. For much of the early afternoon we were too engrossed in ggplot: we tried to plot histograms of the magnitude data, but didn't have a deep enough understanding of how to use the package. After getting back on track, we decided to use the 2012 earthquake catalog. We also had lengthy discussions on how we wanted to store the data: as lists of tuples, lists of lists, in a combined data frame with magnitudes and time deltas, as separate variables storing each, and so on. We chose an arbitrary five aftershocks to plot for each earthquake. We also had to dig into binning and how we could do that using ggplot; in order to do some analysis on the bins, we set ggplot aside for the time being and used a function in numpy. A significant amount of our time went to reading through documentation, exploring ggplot, going through Stack Overflow, searching online for examples of plotting, and figuring out the right types in IPython Notebook. Working as a group during the afternoon was super helpful for getting the skeleton code started, and immensely helpful for starting to solve the problem as a team and overcome all our roadblocks.

In the evening, I wanted to get something plotted using ggplot. Working independently with Python in IPython Notebook, I really had to understand the type() of the data. Since ggplot requires a data frame, from which you can pick variables for the x and y axes, I knew I needed a data frame with the magnitudes and time deltas. At first I tried putting the list of pandas timedeltas into a data frame, but the data frame wouldn't get created. As an alternative, I converted the timedeltas into seconds using total_seconds(), and those could then go into a data frame alongside the magnitudes. After rereading examples of ggplot, I successfully plotted magnitudes against aftershock arrival times. I'm really glad our group is pushing ourselves to use IPython Notebook to tackle the project, as opposed to R, since that way we can challenge ourselves and work through the challenges as a team. Even though our progress appears slower, we're having a lot of really great discussions and working well together to learn as a team.
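Since the timedelta conversion was the crux of getting ggplot to accept the data, here's a minimal sketch of that workflow. The example values and variable names are made up, standard-library timedeltas stand in for the pandas ones, and the ggplot call at the end is written from memory rather than copied from our notebook:

```python
from datetime import timedelta

import numpy as np
import pandas as pd

# Hypothetical example data: a magnitude for each of five aftershocks,
# and the elapsed time between the mainshock and each aftershock.
magnitudes = [5.1, 4.8, 4.6, 4.3, 4.0]
deltas = [timedelta(hours=h) for h in (1, 3, 7, 20, 48)]

# The data frame wouldn't build from the raw timedeltas, so convert
# each one to a plain number of seconds first.
arrival_s = [d.total_seconds() for d in deltas]
df = pd.DataFrame({'magnitude': magnitudes, 'arrival_s': arrival_s})

# Binning the arrival times with numpy rather than ggplot:
counts, edges = np.histogram(df['arrival_s'], bins=5)
print(counts)  # aftershocks per time bin
print(edges)   # bin boundaries, in seconds

# The scatter plot itself, using the Python ggplot port (API from
# memory -- treat this as a sketch, not a verified call):
# from ggplot import *
# print(ggplot(aes(x='arrival_s', y='magnitude'), data=df) + geom_point())
```

Converting to seconds loses the nice timedelta display, but a column of plain floats is something every plotting and binning function can handle, which is why it unblocked both the numpy detour and the evening plot.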