Missing data: Irrigation level #13

Open
dlebauer opened this issue Sep 18, 2018 · 0 comments

Comments

@dlebauer
Member

@samsrabin commented on Fri Mar 10 2017

In the management table, rows with mgmttype "irrigation" and units "True = 1, False = 0" only have levels 1 or NA. Some rows with NA appear to have been irrigated, some do not.

Apologies if this is not the right venue for this!


@samsrabin commented on Fri Mar 10 2017

Relatedly (?), Christian et al. (2001: "Agronomy of Miscanthus") appears to have way too many records, and the issue appears to derive from false distinctions in irrigation.


@samsrabin commented on Sat Mar 11 2017

Here's code to reproduce the Christian et al. (2001) issue, which also has to do with apparently spurious planting dates.

# Extracting management data for Miscanthus
### Based on https://pecan.gitbooks.io/betydb-data-access/content/r_dplyr_package.html

# Connect to database
library(dplyr)
library(data.table)
## connection to database
d <- list(host = 'localhost',
          dbname = 'bety',
          user = 'bety',
          password = 'bety')

bety <- src_postgres(host = d$host, user = d$user, password = d$password, dbname = d$dbname)

# Get data
## query and join tables
species <- tbl(bety, 'species') %>% 
  select(id, scientificname, genus) %>% 
  rename(specie_id = id)

sites <- tbl(bety, sql(
  paste("select id as site_id, st_y(st_centroid(sites.geometry)) AS lat,",
        "st_x(st_centroid(sites.geometry)) AS lon,",
        " sitename, city, country from sites"))
)

citations <- tbl(bety, 'citations') %>%
  select(citation_id = id, author, year, title)

yields <- tbl(bety, 'yields') %>%
  select(yield_id=id, dateharv = date, mean, n, statname, stat, site_id, specie_id, treatment_id, citation_id, cultivar_id, notes, checked) %>% 
  filter(checked >= 0) %>%
  left_join(species, by = 'specie_id') %>%
  left_join(sites, by = 'site_id') %>% 
  left_join(citations, by = 'citation_id')

managements_treatments <- tbl(bety, 'managements_treatments') %>%
  select(treatment_id, management_id)

treatments <- tbl(bety, 'treatments') %>% 
  dplyr::mutate(treatment_id = id) %>% 
  dplyr::select(treatment_id, name, definition, control)

managements <- tbl(bety, 'managements') %>%
  filter(mgmttype %in% c('fertilizer_N', 'fertilizer_N_rate', 'planting', 'irrigation')) %>%
  dplyr::mutate(management_id = id) %>%
  dplyr::select(management_id, date, mgmttype, level, units) %>%
  left_join(managements_treatments, by = 'management_id') %>%
  left_join(treatments, by = 'treatment_id') 

nitrogen <- managements %>% 
  filter(mgmttype == "fertilizer_N_rate") %>%
  select(management_idN=management_id, treatment_id, nrate = level, nunits = units)

planting <- managements %>% filter(mgmttype == "planting") %>%
  select(treatment_id, planting_date = date)

planting_rate <- managements %>% filter(mgmttype == "planting") %>%
  select(management_idP=management_id,treatment_id, planting_date = date, planting_density = level) 
    
irrigation <- managements %>% 
  filter(mgmttype == 'irrigation')

irrigation_rate <- irrigation %>% 
  mutate(year_irr=sql("extract(year from date)")) %>%
  filter(units == 'mm', !is.na(treatment_id)) %>% 
  group_by(treatment_id, year_irr, units) %>%
  summarise(irrig.mm = sum(level)) %>% 
  group_by(treatment_id) %>% 
  summarise(irrig.mm.y = mean(irrig.mm))

irrigation_boolean <- irrigation %>%
  collect %>%   
  group_by(treatment_id) %>% 
  mutate(irrig = as.logical(mean(level))) %>% 
  select(management_idI = management_id, treatment_id, irrig = irrig)

irrigation_all <- irrigation_boolean %>%
  full_join(irrigation_rate, copy = TRUE, by = 'treatment_id')

grass_yields <- yields %>% 
  filter(genus == 'Miscanthus') %>%
  left_join(nitrogen, by = 'treatment_id') %>% 
  left_join(planting_rate, by = 'treatment_id') %>% 
  left_join(irrigation_all, by = 'treatment_id', copy = TRUE) %>% 
  collect %>% 
  mutate(age = year(dateharv)- year(planting_date),
         nrate = ifelse(is.na(nrate), 0, nrate),
         SE = ifelse(statname == "SE", stat, ifelse(statname == 'SD', stat / sqrt(n), NA)),
         continent = ifelse(lon < -30, 'united_states', ifelse(lon < 75, 'europe', 'asia')))

test <- grass_yields %>%
  filter(treatment_id == 2319, site_id == 215, dateharv == "1995-01-01")

And to see all instances of the "True or False" NA's:

irrigation_issue <- managements %>% 
  filter(mgmttype == 'irrigation', units == 'True = 1, False = 0', is.na(level)) %>%
  left_join(tbl(bety, 'citations_treatments'), by = 'treatment_id') %>%
  left_join(yields, by = 'treatment_id') %>%
  collect

@dlebauer commented on Thu Mar 16 2017

@samsrabin I am not sure who encoded irrigation as a boolean with these units, though I suspect that their intentions were pure.

The managements table is for events that do happen. Often, we record that an irrigation event happened but we don't know how much. I've updated the table in the master database (betydb.org) so you could either do the same queries against that database using the traits package or else run the following in your database:

update managements set level = NULL, units = NULL where mgmttype = 'irrigation' and units = 'True = 1, False = 0';
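For anyone working from a table that has already been pulled into R rather than from the database itself, the same cleanup can be sketched on a local data frame. This is a minimal illustration in base R; the data frame `managements_local` and its rows are made up for the example, and only the column names (`mgmttype`, `level`, `units`) and the units string come from the thread.

```r
# Hypothetical local copy of a few managements rows (values are invented).
managements_local <- data.frame(
  management_id = 1:4,
  mgmttype = c("irrigation", "irrigation", "planting", "irrigation"),
  level = c(1, NA, 10000, 30),
  units = c("True = 1, False = 0", "True = 1, False = 0", NA, "mm"),
  stringsAsFactors = FALSE
)

# Rows that use the boolean-style irrigation encoding.
# %in% is used so that rows with missing units evaluate to FALSE, not NA.
is_flag <- managements_local$mgmttype == "irrigation" &
  managements_local$units %in% "True = 1, False = 0"

# Mirror the SQL update: blank out level and units on those rows only.
managements_local$level[is_flag] <- NA
managements_local$units[is_flag] <- NA
```

Rows with real irrigation amounts (here the row with units "mm") are left untouched.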

The downside is that you can't tell if a field was not irrigated or if the information is missing. However, within a site you can assume that if irrigation is recorded for three of four treatments, the fourth was unirrigated. If you are interested in the irrigation rates, then it would be great if you wanted to review and update the database.
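The within-site assumption can be sketched roughly as follows. This is only an illustration in base R on invented data: the data frame `site_treatments` and the column `has_irrigation_record` are hypothetical names, and the rule encoded here (a treatment with no irrigation record counts as unirrigated only if some other treatment at the same site has one) is one reading of the suggestion above.

```r
# Hypothetical data: one row per treatment, flagged TRUE if any irrigation
# management record exists for it. Site 1 has records for 3 of 4 treatments;
# site 2 has no irrigation records at all.
site_treatments <- data.frame(
  site_id = c(1, 1, 1, 1, 2, 2),
  treatment_id = c(11, 12, 13, 14, 21, 22),
  has_irrigation_record = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

# For every row: does at least one treatment at this site have a record?
site_has_record <- as.logical(ave(
  site_treatments$has_irrigation_record,
  site_treatments$site_id,
  FUN = any
))

# TRUE  = irrigation recorded for this treatment
# FALSE = no record, but irrigation is recorded elsewhere at the site
# NA    = no record anywhere at the site, so the status stays unknown
site_treatments$irrigated <- ifelse(
  site_treatments$has_irrigation_record,
  TRUE,
  ifelse(site_has_record, FALSE, NA)
)
```

Keeping `NA` for sites without any irrigation records avoids treating missing information as evidence that a field was unirrigated.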


@samsrabin commented on Fri Mar 17 2017

I've actually already been adding irrigation info on my own a bit, so I'd be happy to contribute what I have. I'll apply for access as a collaborator. Is there a way to upgrade the access level for my existing account (srabin) or should I re-apply completely?

Also, and I realize this probably implies a ton of work, but it'd be nice to have a definitive way to know (a) if fields weren't irrigated, or (b) if they were irrigated but the amount was not provided. The boolean sort of made sense for that reason, but maybe it belongs in a different place.


@dlebauer commented on Sat Mar 18 2017

I've given you permission to edit records at betydb.org

Any edits to the VM will be difficult to sync back but if you've already done it we can do it manually. Another option is to create a spreadsheet with the old and new values as well as the table ids (like management_id).
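A spreadsheet of that kind could be assembled along these lines. This is a sketch in base R under stated assumptions: the column layout, the file name, and every id and value below are invented for illustration and do not reflect an agreed format or real records.

```r
# Hypothetical corrections table: one row per changed field, keyed by
# management_id. All ids and values are placeholders.
corrections <- data.frame(
  table_name = "managements",
  management_id = c(101, 102),
  column_name = "level",
  old_value = c(NA, NA),
  new_value = c(25, 40),
  new_units = "mm",
  stringsAsFactors = FALSE
)

# Write to a temporary CSV; missing old values are left as empty cells.
out_file <- file.path(tempdir(), "irrigation_corrections.csv")
write.csv(corrections, out_file, row.names = FALSE, na = "")
```

Recording the old value next to the new one lets whoever applies the changes confirm that the master record still matches what was edited.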


@samsrabin commented on Tue Mar 21 2017

@dlebauer When I log in using my username (srabin), it says "Logged in as dlebauer." Is that a mistake? Is it going to be an issue if I start adding/editing things?


@gsrohde commented on Wed Mar 22 2017

@dlebauer @samsrabin Accounts have both a "login" and a "name". The login for Sam (what he and perhaps most people call "username") is "srabin", but the "name", which is what gets displayed after "Logged in as ", was somehow set to "dlebauer". I reset it to "Sam Rabin", but each individual user has the power to set it to anything they like, preferably not something that will cause confusion. In any case, this having been set to "dlebauer" should have no untoward consequences.


@dlebauer commented on Wed Mar 22 2017

Sounds like the browser may have auto-updated the name field when I updated the access level. I'll keep an eye on this.

@samsrabin commented on Wed Mar 22 2017

I'm finding myself unable to edit Yield records. I get the following error:

2 errors prohibited this yield from being saved
There were problems with the following fields:

  • Treatment can't be blank
  • User can't be blank

I don't see anywhere I can set "User," and when I try to select a Treatment the drop-down menu does not come up.


@gsrohde commented on Wed Mar 22 2017

@samsrabin The treatments drop-down is populated by treatments associated with the citation that the yield is connected with. In this case there are none.

New yields should automatically be associated with the user who is creating the yield. But updates don't re-assign the user to the updating user.

Both of these problems are the result of my being perhaps a little overzealous about ensuring that every yield is assigned to a user and a treatment (as well as a citation, species, access level and date) without having made any provision for grandfathering in existing data.

If you want to be able to update these yields as soon as possible, I think the best course of action would be for me to comment out the two lines of code that impose these restrictions. @dlebauer Is this OK? In the case of assigning a User, there really seems to be no value in imposing the requirement that user_id be non-null since this will automatically be the case when new yields are created via the Rails app. In the case of treatments, we could (1) require assigning a treatment only when new yields are being created, (2) also require assigning a treatment when updating a yield that wasn't previously assigned one, or (3) impose no requirement that a treatment be assigned.


@gsrohde commented on Wed Mar 22 2017

@dlebauer I should also mention that there seems to be no way to associate an existing treatment with a citation via the Rails interface. I think that there is supposed to be a way to do this but that this is broken. I'm not sure about this, though, and if it should work, I'm not sure how it is supposed to work.

To elaborate a little: There is a section of the treatments page with the heading "Other Treatments And Managements". And next to each treatment listed here, there is a checkmark button that presumably is used to link the treatment to a citation. The problem is that this section always seems to be empty when a citation is selected. And if there is no citation selected, the link button doesn't know what citation to link to and takes one to the "404 - Sorry no Data here" page.

Also, under the "Other Treatments And Managements" heading there is a subheading or comment that says "Treatments and Managements from sites linked to this citation (if any) listed at top". It's not clear to me what this is supposed to mean. As far as I can tell, treatments are only connected with sites rather indirectly, by virtue of both being associated with the same citation. But this comment makes it sound as though it's the other way around: that treatments are connected with citations via sites.


@samsrabin commented on Thu Mar 23 2017

@gsrohde I'm not in any particular rush to update the records, so feel free to take your time to implement a complete fix to the User issue as you suggest in #495.

In addition to the fixes you suggested for the "Treatment can't be blank" error when editing Yields, it would also be good to have a fix for associating Treatments with Citations. As it is now, there is no option to associate a Citation during creation of a Treatment, either. So without a Citations-Treatments fix, it will remain impossible to update Yield records for citations without associated treatments.

It should also be possible to associate Treatments with a citation after having selected (with the check mark) the citation in the Citations table. As it is now, it only shows Treatments already associated with the citation.
