Need to revise commercial RO assigned error level #148

Open
rtodling opened this issue Sep 20, 2023 · 14 comments

@rtodling
Contributor

rtodling commented Sep 20, 2023

This is to put the discussion on a possible revision of the weighting factor used for commercial RO into a single place.

I will start by presenting how the thinking behind the factor used in GSI for handling SPIRE first came about.

The first actual x-experiment to incorporate commercial RO was x0045a (x45 was deemed invalid). Evaluation of DFS for that experiment indicated to me (RTodling) that things were off in comparison to how we typically draw to COSMIC-2. The figure below shows DFS (by obs count) in this case (data for 2020/12/15 to 2021/01/14).

Recall that a problem w/ 45a was that SPIRE was available from the latter half of Dec to mid Jan - the fig (replotted) now includes the period when SPIRE is fully available as well as C2.

x45a_com_ro_dfs_tro

After some offline testing, the next run introduced a weighting factor to make the assimilation of SPIRE and COSMIC-2 more like each other (with more even weighting of data between the two instruments). The results of setting the weighting factor to two are shown below for x0046c (data for 2021/12).

x46c_com_ro_dfs_tro

Notice:

  1. The figures above include data only in the Tropics (20S,20N) to make a fair comparison.
  2. The version of SPIRE used in these cases was the so-called NASA version (high-resolution and density - nearly full observing system)
  3. This "NASA-version" is apparently known to be a smoothed-out version of the data - which some say is not of full quality because of the smoothing.
  4. The "NASA-version" is not suitable for real-time applications - it's made available in delayed mode - but it is suitable for reanalysis.

Given (4) above, when SPIRE was set to run in an FPP experiment, the input dataset that would be plausible for a real time application was a set coming from NOAA (NCEP). Unlike the NASA-version, this NOAA set had no smoothing applied to it and it was densely distributed more like COSMIC-2 than the NASA-set.

More recently (2023/07), in GEOS-FP, the NOAA-version of SPIRE has suffered yet another thinning of sorts, and DFS results (2023/08) for SPIRE continue to show (as in FPP) a difference from how COSMIC-2 is used. Perhaps the FPP and FP results are an indication that we could consider retuning the error assignment for SPIRE.

f5295_fp_com_ro_dfs_tro

Caveat

True DFS (calculated directly from the DA operators) must be a positive quantity. When calculated on the basis of residuals - which the results above are - there is no guarantee that DFS is actually positive. Negative values of DFS are typically interpreted as something not being quite adequately tuned (like obs errors, or trouble with quality control).

The way I look at residual-based DFS is that it is a diagnostic to check the relative contribution of the data to the analysis.
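To make the caveat concrete, here is a minimal sketch of one common residual-based DFS estimate, assuming the Desroziers-type form sum[(Hxa - Hxb) * (y - Hxa) / sigma_o^2]. The thread does not state the exact estimator behind the figures, so the formula choice, the function names and the per-ob normalization are all assumptions for illustration only.

```python
def residual_dfs(omb, oma, sigma_o):
    """Residual-based DFS estimate (assumed form, not confirmed by the thread).

    omb     : observation-minus-background residuals, y - H(xb)
    oma     : observation-minus-analysis residuals,   y - H(xa)
    sigma_o : assigned observation error standard deviations
    """
    total = 0.0
    for b, a, s in zip(omb, oma, sigma_o):
        # Analysis increment in observation space: H(xa) - H(xb) = omb - oma
        increment = b - a
        total += increment * a / (s * s)
    return total


def residual_dfs_per_ob(omb, oma, sigma_o):
    """Same estimate normalized by observation count (hypothetical helper)."""
    return residual_dfs(omb, oma, sigma_o) / len(omb)


# The analysis moves toward the ob without overshooting: positive contribution.
print(residual_dfs([2.0], [1.0], [1.0]))
# The analysis overshoots the ob (oma changes sign): negative contribution.
print(residual_dfs([1.0], [-0.5], [1.0]))
```

Nothing in this sum forces each term to be positive, which is why a sample of residuals can give a negative total even though the operator-based DFS cannot.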

@mjmurph

mjmurph commented Sep 20, 2023

Hi Ricardo can you see this message? It's Michael Murphy

@rtodling
Contributor Author

rtodling commented Sep 21, 2023

Mohar reports the following ongoing experiments in the GEOS-IT context:

I am not sure which ones you looked at but let me update the archive list again:
(1) /archive/users/mchattop/geosit_gps1 (geosit_gps1/etc/Y2021/M01 and M02) : No SPIRE but has COSMIC2
(2) /archive/users/mchattop/geosit_gps2_SPIRE1 (geosit_gps2_SPIRE1/etc/Y2021/M01 and M02) : SPIRE (spiregpserrinf=2.) and COSMIC2 (I made a typo in the last list)
(3) /archive/users/mchattop/geosit_gps2_SPIRE5 (geosit_gps2_SPIRE5/etc/Y2021/M01 and M02): SPIRE (spiregpserrinf=1.) and COSMIC2
I am currently doing another test :
(4) /archive/users/mchattop/geosit_gps2_noCSMC/ (geosit_gps2_noCSMC/etc/Y2021) : SPIRE (spiregpserrinf=1.) and no COSMIC2 : this has only run a few cycles.

I went ahead and derived DFS for (2) and (3) for 2021/01. The figs below show the results.

geosit_spire1_dfs_tro

and

geosit_spire5_dfs_tro

From these it would seem that not scaling the data (parameter set to 1) makes SPIRE slightly more consistent with C2 than doubling the coefficient does.
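As a rough illustration of what changing the parameter from 2 to 1 means for the weight given to an observation, here is a small sketch. It assumes the inflation factor multiplies the observation error standard deviation; whether `spiregpserrinf` acts on the standard deviation or on the variance inside GSI is not stated in this thread. The function names and the scalar one-observation setup are hypothetical.

```python
def relative_weight(inflation):
    """Weight 1/(inflation*sigma_o)^2 relative to the uninflated weight.

    Assumes the factor scales the error standard deviation (an assumption).
    """
    return 1.0 / inflation ** 2


def jo_term(omb, sigma_o, inflation=1.0):
    """Contribution of one observation to the observation cost Jo."""
    return (omb / (inflation * sigma_o)) ** 2


def scalar_gain(sigma_b, sigma_o, inflation=1.0):
    """Kalman gain for a single directly observed scalar (toy setup)."""
    r = (inflation * sigma_o) ** 2
    return sigma_b ** 2 / (sigma_b ** 2 + r)


# With equal background and observation errors, doubling the obs error
# drops the toy gain from 0.5 to 0.2 under these assumptions.
print(scalar_gain(1.0, 1.0, 1.0), scalar_gain(1.0, 1.0, 2.0))
print(relative_weight(2.0))
```

Under this reading a factor of 2 cuts the weight of each SPIRE observation to a quarter, so a single global factor is a fairly blunt adjustment.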

@rtodling
Contributor Author

rtodling commented Sep 21, 2023

I was checking on things I was doing in the past wrt the issue here and I remembered that back then I was looking at the ratio Jo(A)/Jo(B). I noticed that in x0045a the analysis was behaving quite differently between SPIRE and C2 - this is not inconsistent with what DFS indicates. A picture of the discrepancy of Jo(A)/Jo(B) in x0045a is shown below (as above, Tropics only, for fairness of comparison).

x45a_com_ro_jo_tro

Then in x0046c - when the factor 2 was introduced for SPIRE - the picture above changes to the following:

x46c_com_ro_jo_tro

Not only are the curves closer, but the jaggedness in the two curves is more alike.

I ran the same Jo(A)/Jo(B) diagnostic for the runs that Mohar has done, more recently, in the context of GEOS-IT. Here is the default for GEOS-IT (w/ the parameter set to 2):

geosit_spire1_jo_tro

and this is what happens when a run is made with the parameter set to 1 - equal weighting for SPIRE and C2:

geosit_spire5_jo_tro

It seems to me that in the default case, the curves are more like each other from 50 mb down; whereas in the second case, the curves become more alike above 50 mb - at the cost of the lower stratosphere and upper troposphere. To me, this is a statement that the error characteristics of SPIRE and C2 are not quite the same; a single parameter is not enough to adjust the errors and make them fully consistent w/ each other.
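For readers who want to reproduce this kind of curve, here is a minimal sketch of a Jo(A)/Jo(B) ratio binned by pressure layer and restricted to the Tropics. The record layout (keys `lat`, `pres`, `omb`, `oma`, `sigma_o`), the layer edges and the function name are hypothetical; the actual diagnostic files and binning used for the figures are not described in the thread.

```python
def jo_ratio_by_layer(obs, layer_edges_hpa, lat_max=20.0):
    """Return Jo(A)/Jo(B) per pressure layer for obs within +/- lat_max.

    obs             : iterable of dicts with keys lat, pres, omb, oma, sigma_o
    layer_edges_hpa : increasing pressure edges; layer k is [edge[k], edge[k+1])
    Layers with no observations return None.
    """
    n_layers = len(layer_edges_hpa) - 1
    jo_a = [0.0] * n_layers
    jo_b = [0.0] * n_layers
    for ob in obs:
        if abs(ob["lat"]) > lat_max:
            continue  # keep Tropics only, for a fair SPIRE vs C2 comparison
        for k in range(n_layers):
            if layer_edges_hpa[k] <= ob["pres"] < layer_edges_hpa[k + 1]:
                jo_a[k] += (ob["oma"] / ob["sigma_o"]) ** 2
                jo_b[k] += (ob["omb"] / ob["sigma_o"]) ** 2
                break
    return [a / b if b > 0.0 else None for a, b in zip(jo_a, jo_b)]


sample = [
    {"lat": 5.0, "pres": 30.0, "omb": 2.0, "oma": 1.0, "sigma_o": 1.0},
    {"lat": -10.0, "pres": 40.0, "omb": 2.0, "oma": 1.0, "sigma_o": 2.0},
    {"lat": 45.0, "pres": 30.0, "omb": 9.0, "oma": 9.0, "sigma_o": 1.0},
]
print(jo_ratio_by_layer(sample, [10.0, 50.0, 100.0]))
```

Running it once per instrument (SPIRE, C2) and comparing the resulting profiles gives the sort of layer-by-layer view discussed above, where agreement can differ above and below 50 mb.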

@elakkraoui
Contributor

Thank you Ricardo for digging up your old results and making new plots for Mohar's experiment. To me, it looks like a coefficient of 1 would be best for Spire even though, as you said, the inflation in some areas makes more sense. Since this is a global number, we also have to mind how it affects the extra-tropics. We were only able to see the slightest sign of impact on temperature in those regions when we removed the inflation. Now the question is: Did your experiment use a different sample of spire, or is this dependent on the period of the tests Jan 2021 vs Dec 2021?

image image

@elakkraoui
Contributor

The plots above from Mohar are for global counts, so they are not an apples-to-apples comparison. Waiting to see the equivalent plots for the tropics only.

@mohar123

mohar123 commented Sep 21, 2023 via email

@rtodling
Contributor Author

> Thank you Ricardo for digging up your old results and making new plots for Mohar's experiment. To me, it looks like a coefficient of 1 would be best for Spire even though, as you said, the inflation in some areas makes more sense. Since this is a global number, we also have to mind how it affects the extra-tropics. We were only able to see the slightest sign of impact on temperature in those regions when we removed the inflation. Now the question is: Did your experiment use a different sample of spire, or is this dependent on the period of the tests Jan 2021 vs Dec 2021?

Amal,
There is no doubt that the versions of Spire in all the exps that I have been involved w/ somehow (x-exp) are different, including FPP and FP: that's what makes this a huge challenge. We test w/ one version at some point; "tune it"; then apply it to another situation when the version of the data has changed ... well, this is bound not to work.

At least for R21C you can stick with a single version of the data, and if indications are that the parameter being used in FP is not correct for the version you have, then by all means have it adjusted.

From my end, I plan to use x0048/49 to look again at the behavior of C2 compared to the version of Spire and now Planet IQ (treated as Spire) and see if and how the parameter should be revised.

@rtodling
Contributor Author

Also, I will say this: I had experimented w/ the 5000 profile version of Spire before and I think the results in the extra tropics were unclear. The benefits can be questionable. I think introducing this many profiles has consequences for how the satellite biases change - if you are going to use this many profiles, I suggest you look carefully at how they impact the bias correction.

@gmao-yzhu
Contributor

"There is no doubt that the versions of Spire in all the exps that I have been involved w/ somehow (x-exp) are different, including FPP and FP: that's what makes this a huge challenge. We test w/ one version at some point; "tune it"; then apply it to another situation when the version of the data has changed ... well, this is bound not to work." -- I concur.

@elakkraoui
Contributor

> Also, I will say this: I had experimented w/ the 5000 profile version of Spire before and I think the results in the extra tropics were unclear. The benefits can be questionable. I think introducing this many profiles has consequences for how the satellite biases change - if you are going to use this many profiles, I suggest you look carefully at how they impact the bias correction.

Noted. We'll look at the impact on satellite biases in the extra tropics.

@rtodling
Contributor Author

rtodling commented Oct 2, 2024

Has this been given any continuation? I am sure it has ... can someone please add to this?

@mjmurph

mjmurph commented Nov 5, 2024

Let me give everyone an update. After analyses of the CSDA Spire experiments that Mohar ran, already mentioned above, it was decided that MERRA 21C would use the current configuration of obs error for RO without modification (i.e. retaining 2X inflation of Spire).

Much discussion was had on the topic of RO obs error and QC methods between our group and Rick Anthes. While Rick has a new Hybrid Error Model that is an attractive option, it is relatively untested and the code for the GFS implementation could not be shared with us. It was decided that implementing and testing the ECMWF method of obs error/QC for RO would be a good first step for GMAO. Below is an overview of the ECMWF method:
Figure1 (Attached)

I implemented the ECMWF method in GEOSadas_5.30.2 and Rick thought the results looked good. For example, the histograms of obs error look more realistic (normally distributed):
Figure2 (Attached)

An experiment was run to compare the EC method with the already mentioned Spire experiments. This experiment uses all the operational and CSDA Spire RO observations available in late 2022 (approximately 20K profiles/day from Spire alone). Validation with radiosondes shows a relatively neutral to positive impact, but validation with the microwave instruments AMSU-A & ATMS shows a negative impact; see below:
Figure3 (Attached)

Figure4 (Attached)

Monthly Means also show large differences between the experiments in the upper levels:

Figure5 (Attached)

Next steps will be to narrow down where the EC method is leading to negative impacts. One option could be to modify the EC QC method in these regions and mitigate negative impacts. I also have ported the ECMWF obs error implementation to the version Nikki uses with OSSEs. She plans to run a test with large numbers of RO profiles and report back.
figures_response_github-ricardo_2024-11-05.docx

@gmao-yzhu
Contributor

I would suggest examining the ECMWF method for obs error and QC carefully. Usually, the development of these methods is closely associated with the performance of the corresponding forward observation operator. To be consistent, it would be preferable to use the ECMWF 2D observation operator together with it.

@mjmurph

mjmurph commented Nov 5, 2024

That is an interesting point, Yanqiu. I know that Nikki has used the ROPP2D operator with GEOS in the past, as it is available as an option for tests. My understanding is that ROPP2D is very similar to the operator that ECMWF uses operationally, though not exactly the same.

Best,
Michael
