Last blockers before 1.0 #161

Closed
3 of 5 tasks
bkatiemills opened this issue Apr 22, 2016 · 3 comments

@bkatiemills
Member

bkatiemills commented Apr 22, 2016

So, our May deadline is almost upon us! Before we can make a final AutoQC decision, a few questions that have arisen in #146 and elsewhere need to be addressed:

  • Shall we calculate depth from pressure data, for profiles where depth is not reported?
  • Shall we calculate pressure from depth data for the Argo suite of tests? Yes, done
  • Shall we immediately flag any classes of profiles (such as profiles with only one level) without further consideration? No
  • What shall our final flag definition be - temperature only, temperature and depth, or otherwise?
  • What will our final test dataset(s) be for measuring and validating the performance of this iteration of the AutoQC procedure? QuOTA (Jan, Feb, Mar and Jun only) and Argo delayed mode data

Once we make decisions for all of these points (and make the datasets from the 5th point available), I think we'll be able to produce a credible first iteration. Let me know what we decide and how we want to go about closing out 1.0.

@s-good
Contributor

s-good commented Apr 28, 2016

Here's my attempt to answer your points, but I'm open to any alternative viewpoints.

  1. I'm hoping that when we get the final test datasets there won't be any profiles where depth is not reported, but we may have to revisit this question when we get the data.
  2. I just opened a pull request for the second point (Change Argo tests to work on pressure instead of depth. #162) as I think that we should convert to pressure for Argo tests.
  3. My feeling is to leave them in for now. The worry is that they could skew the results we get out for the QuOTA data; I'm hoping that the other test datasets won't have many or any of these.
  4. For QuOTA the reference flags are attached to the temperature data only but we will have to get advice on whether the same applies for the other test datasets.
  5. At the moment I'm anticipating that there will be four test datasets including QuOTA (Jan, Feb, Mar and Jun only) and Argo delayed mode data.
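
For reference on point 2, a depth-to-pressure conversion can be sketched as below. This is only a minimal illustration using the Saunders (1981) approximation, and is not necessarily what #162 implements; the function name `depth_to_pressure` and its arguments are hypothetical.

```python
import math


def depth_to_pressure(depth_m, latitude_deg):
    """Approximate sea pressure (dbar) from depth (m) and latitude (degrees).

    Illustrative sketch based on the Saunders (1981) approximation;
    the name and signature are assumptions, not part of AutoQC.
    """
    # Latitude-dependent correction for the variation of gravity.
    x = math.sin(math.radians(abs(latitude_deg)))
    c1 = 5.92e-3 + 5.25e-3 * x ** 2
    # Solve the quadratic depth-pressure relation for pressure.
    return ((1 - c1) - math.sqrt((1 - c1) ** 2 - 8.84e-6 * depth_m)) / 4.42e-6
```

For example, `depth_to_pressure(1000.0, 30.0)` returns a value a little above 1000 dbar, as expected since pressure in decibars slightly exceeds depth in metres.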

I will chase up progress on obtaining the test datasets. Apart from the CoTeDe upgrade, is there anything pending before running AutoQC on the test data?

@bkatiemills
Member Author

Thanks for your comments! Keep me posted as other points conclude. Re: anything pending before running on test data: nothing besides this list. I think we're in great shape on the AutoQC side; all the points above are about understanding our data properly. The biggest piece of work left is actually post-AutoQC: deciding how we combine tests, via a machine learning technique or otherwise. I want to set aside a few days to really focus on that as soon as we get the above points 100% sorted.

(Related: I haven't forgotten about that paper we discussed; I was just thinking last night about how what we've done over the past couple of years can be generalized in an interesting way. I want to chat about it with my data science people to make sure I'm not crazy, but will keep you posted when things come into focus a bit more.)

@bkatiemills
Member Author

Closing as this is stale vs. more recent conversations.
