Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OLD: Improve JUMP objects #1072

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

aarchiba
Copy link
Contributor

@aarchiba aarchiba commented Jul 14, 2021

This PR is meant to clarify the behaviour of JUMPs in PINT, and to ensure that they behave as expected in cases like overlapping JUMPs. This clarification has forced some changes affecting pintk. There are now some tests of pintk's Pulsar wrapper.

This has some minor API changes that would require updates from NANOGrav's timing_analysis to match, which we don't want to do before about 2021 Sept 10 because v1.0 is in active development. Testing is still very welcome, but please don't merge this until v1.0 is ready.

Closes #1078

2022 July 6: This may need very substantial updating to match the current state of the code. It is probably better to pull code and documentation snippets from this into a new PR. I do think a PR to do these same things would be a good idea.

@aarchiba
Copy link
Contributor Author

@aarchiba
Copy link
Contributor Author

I'm really unclear on what jump_params_to_flags is supposed to accomplish? It seems to add a new set of flags to the TOAs object that don't actually do anything and in fact cannot be used to select TOAs for JUMPing?

@codecov
Copy link

codecov bot commented Jul 14, 2021

Codecov Report

Merging #1072 (4552ef4) into master (13bdad6) will increase coverage by 0.22%.
The diff coverage is 64.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1072      +/-   ##
==========================================
+ Coverage   60.83%   61.05%   +0.22%     
==========================================
  Files          86       86              
  Lines       18744    18707      -37     
  Branches     3150     3134      -16     
==========================================
+ Hits        11402    11422      +20     
+ Misses       6615     6555      -60     
- Partials      727      730       +3     
Impacted Files Coverage Δ
src/pint/models/solar_wind_dispersion.py 75.43% <ø> (-0.43%) ⬇️
src/pint/models/troposphere_delay.py 89.65% <ø> (-0.08%) ⬇️
src/pint/pintk/colormodes.py 23.30% <0.00%> (-0.23%) ⬇️
src/pint/scripts/pintempo.py 60.65% <ø> (+1.92%) ⬆️
src/pint/pintk/plk.py 13.43% <2.32%> (+0.96%) ⬆️
src/pint/pintk/pulsar.py 44.14% <7.89%> (-1.77%) ⬇️
src/pint/models/noise_model.py 87.60% <52.63%> (ø)
src/pint/models/timing_model.py 80.97% <76.31%> (-0.41%) ⬇️
src/pint/models/parameter.py 83.29% <87.20%> (+0.98%) ⬆️
src/pint/models/jump.py 92.30% <90.00%> (+9.85%) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 13bdad6...4552ef4. Read the comment docs.

@scottransom
Copy link
Member

Some of the jump_params_to_flags stuff is explained in PR #1032 and the comments therein. The main thing is that we were trying to standardize on using flags for all types of JUMPs. And there are multiple types (including those from .tim files, .par files (of various sorts), and interactive ones from pintk). I'm still not sure if things are consistent or not.

@aarchiba
Copy link
Contributor Author

Some of the jump_params_to_flags stuff is explained in PR #1032 and the comments therein. The main thing is that we were trying to standardize on using flags for all types of JUMPs. And there are multiple types (including those from .tim files, .par files (of various sorts), and interactive ones from pintk). I'm still not sure if things are consistent or not.

I don't really know what you mean - there are only two kinds of JUMPs:

  1. legacy JUMPs, found in .tim files and converted to -jump N flags upon load, and
  2. JUMPs, which select a subset of TOAs by flag/table values and add a constant phase to them.

As far as I can tell jump_params_to_flags takes the second kind, which already selects a well-defined subset, and adds additional flags that cannot be used to select a subset?

@aarchiba
Copy link
Contributor Author

aarchiba commented Jul 15, 2021

You seem to wind up with tim files like:

FORMAT 1
unk 999999.000000 56999.9999999968946643 1.000 gbt  -pn -14395108811.0
unk 999999.000000 57333.2222222321138195 1.000 gbt  -pn 212132.0
unk 999999.000000 57666.4444444391813657 1.000 gbt  -pn 14395478645.0
unk 999999.000000 57999.6666666724900694 1.000 gbt  -pn 28790673268.0
unk 999999.000000 58332.8888888887791435 1.000 gbt  -jump ['1'] -pn 43185798351.0
unk 999999.000000 58666.1111111063743750 1.000 gbt  -jump ['1'] -pn 57580876784.0
unk 999999.000000 58999.3333333312934144 1.000 gbt  -jump ['1'] -pn 71975943102.0
unk 999999.000000 59332.5555555595535763 1.000 gbt  -jump ['1', '2'] -pn 86371035295.0
unk 999999.000000 59665.7777777672042362 1.000 gbt  -jump ['1', '2'] -pn 100766185158.0
unk 999999.000000 59999.0000000061153356 1.000 gbt  -jump ['1', '2'] -pn 115161408996.0

and there isn't a way to JUMP all the TOAs that have '1' in their list. (Also, why does the list contain quoted numbers not just numbers?)

@paulray
Copy link
Member

paulray commented Jul 15, 2021

I think @champagne-cmd might want to chime in on this discussion.

@luojing1211
Copy link
Member

The reason for the list containing quoted numbers not just numbers is that we use string in the maskParameter key-values to prevent confusion in the case of when the key-value is "430", which means 430MHz backend.

@aarchiba
Copy link
Contributor Author

aarchiba commented Jul 15, 2021

The reason for the list containing quoted numbers not just numbers is that we use string in the maskParameter key-values to prevent confusion in the case of when the key-value is "430", which means 430MHz backend.

Flag values can only contain strings when they come out of tim files, so storing a list in there at all is going to be a problem. But in any case the "430" backend string never occurs in a list, and what confusion would arise if the list contained integers?

@scottransom
Copy link
Member

@aarchiba As mentioned in that PR, the key_values for all jump flags should be strings. And having a list definitely works in memory to allow JUMPs. Are you saying that it doesn't work if you write the TOAs out to a .tim file and read them back in? That is certainly not what we intend with JUMP flags like that. So I don't think that matters.

@luojing1211
Copy link
Member

The reason to have a list is to handle the case like "JUMP -mjd 54000 55000". But I can see that we can change "key_value=[54000, 55000] " to key_value1 = 54000, key_value2 = 55000

@scottransom
Copy link
Member

And in the past, interactively handled JUMPs set from pintk were different. They were gui_jumps or somesuch. And all three different JUMPs where handled in slightly different ways. As you have noted, it is still confusing now.

@scottransom
Copy link
Member

The main reason there is a list is so that a single TOA can be associated with multiple different JUMPs.

@aarchiba
Copy link
Contributor Author

And in the past, interactively handled JUMPs set from pintk were different. They were gui_jumps or somesuch. And all three different JUMPs where handled in slightly different ways. As you have noted, it is still confusing now.

The way that add_jump_and_flags works, GUI jumps are exactly like any other kind of jump; they use the flag -gui_jump and set the value to some large integer N, then create JUMP -gui_jump N 0. So they're not any different from any other kind of JUMP.

@aarchiba
Copy link
Contributor Author

@aarchiba As mentioned in that PR, the key_values for all jump flags should be strings. And having a list definitely works in memory to allow JUMPs. Are you saying that it doesn't work if you write the TOAs out to a .tim file and read them back in? That is certainly not what we intend with JUMP flags like that. So I don't think that matters.

I think it's a bad idea to add flags to TOAs objects that won't survive the round trip to tim files. And if these jump flags are useful (for what? I still don't know) why would you not save them in tim files? If they're meant to be ephemeral and have a different data type from what's in the tim file, what are they doing in the flags part of the data structure?

@aarchiba
Copy link
Contributor Author

Let me elaborate on that point a little: The .tim file I included above is not readable by PINT at all. PINT will discover things in the flags area that don't look like flags, because the lists include spaces, and fail to recover the information in the .tim file.

@aarchiba
Copy link
Contributor Author

The reason to have a list is to handle the case like "JUMP -mjd 54000 55000". But I can see that we can change "key_value=[54000, 55000] " to key_value1 = 54000, key_value2 = 55000

maskParameters do operate in two modes (determined by whether the key is mjd/freq or anything else), one where they accept a pair of floating-point values and one where they accept a string to match exactly. This is a different issue from having non-string things stored in the flags column of toas.table.

@aarchiba
Copy link
Contributor Author

My basic question is: why is it desired to have a flag on each TOA that lists which JUMPs affect it?

for j in model.jumps:
    toa_indices = getattr(model, j).select_toa_mask(toas)
    if my_toa_index in toa_indices:
        print(f"TOA {my_toa_index} is affected by jump {j}")

@scottransom
Copy link
Member

I'm not sure that's the right question, @aarchiba. We need a way to select TOAs that are JUMPed based on the presence or value of a generic flag. That's a requirement because it is something that TEMPO2 users use. Given that we have to do that, and also that we need to support .tim-file JUMPs which need to be assigned to each TOA as they are read, it made sense to use flags for other kinds of JUMPs (like likely temporary ones from interactive use in pintk). That is why we use flags for this -- even for ephemeral (and very common!) gui JUMPs.

Note that we do not have what your question seems to be asking: we do not make a single flag and assign to it every single JUMP that affects that TOA. Nor do we desire such a thing.

The big issue, and it was the one that came up for #1032, was how do we assign more than one of the same type of JUMPs to the same TOA interactively? We decided to use a list, because we never even considered that someone would want to write those JUMPs out into a .tim file -- so that doesn't work right now. But it could certainly be made to work with some small changes in the .tim reading and writing code. So I don't think that is a huge problem (and I think people will write those TOAs out rarely if ever, truthfully).

Note: I still think that we should require that that key_value for flags that affect JUMPs need to be (or be treated as) strings. We've had problems with this multiple times in the past when that was not the case (jump "1" colloided with jump 1 etc).

@aarchiba
Copy link
Contributor Author

aarchiba commented Jul 16, 2021

I'm not sure that's the right question, @aarchiba. We need a way to select TOAs that are JUMPed based on the presence or value of a generic flag. That's a requirement because it is something that TEMPO2 users use.

We have that. JUMP -fish carp 0.

Given that we have to do that, and also that we need to support .tim-file JUMPs which need to be assigned to each TOA as they are read, it made sense to use flags for other kinds of JUMPs (like likely temporary ones from interactive use in pintk). That is why we use flags for this -- even for ephemeral (and very common!) gui JUMPs.

Currently when we load TOAs with JUMP blocks, those JUMP blocks are assigned numbers and the affected TOAs get a flag -jump N which we can then use to select them for JUMPing with JUMP -jump N 0. (This N has no relationship with the arbitrary numbers that JUMPs get assigned when encountered in par files.)

The easiest way to implement GUI jumps is to add a flag -gui_jump yes to all TOAs selected in the GUI (and no others), then add JUMP -gui_jump yes 0 to the model.

Note that we do not have what your question seems to be asking: we do not make a single flag and assign to it every single JUMP that affects that TOA. Nor do we desire such a thing.

jump_params_to_flags creates a flag -jump on each TOA that lists the indices of all the JUMPs that affect that flag. Various related functions attempt to maintain this invariant. I don't see the benefit of this invariant. (Note that this is a conflicting usage of the same -jump flag with the one created upon reading legacy .tim files.)

The big issue, and it was the one that came up for #1032, was how do we assign more than one of the same type of JUMPs to the same TOA interactively? We decided to use a list, because we never even considered that someone would want to write those JUMPs out into a .tim file -- so that doesn't work right now. But it could certainly be made to work with some small changes in the .tim reading and writing code. So I don't think that is a huge problem (and I think people will write those TOAs out rarely if ever, truthfully).

How about, if you want two different subsets for two different GUI jumps, -gui_jump_1 yes and -gui_jump_2 yes? Then JUMP -gui_jump_1 yes 0 and JUMP -gui_jump_2 yes 0 affect them independently and they can overlap or not. If you want to structurally force them not to overlap you could use the same flag but different values (-fish carp and -fish sturgeon).

Note: I still think that we should require that that key_value for flags that affect JUMPs need to be (or be treated as) strings. We've had problems with this multiple times in the past when that was not the case (jump "1" colloided with jump 1 etc).

I've got a PR under development that ensures we deal with strings in the flags column.

@scottransom
Copy link
Member

I think the -fish carp / -fist sturgeon idea is totally fine. I don't like anything at all about the way we currently handle JUMPs as model params or as flags/key_values with indices, though. Because JUMPs can come and go and the numbers get out of sync and are confusing. This is been an issue with JUMPs since the beginning -- because they are maskParams with integer (and sometimes sequential) indices.

But please, once again, be gentle with these discussions? The code that we have has been written by numerous students over the years -- and a lot of this discussion is on stuff that was done only a few months ago! There are zillions of ways to implement these things and the use cases can be really confusing (and different people value certain things over others).

@scottransom
Copy link
Member

One important thing to remember about a use case is that in pintk, it needs to keep track of all the TOAs that are affected by the various JUMPs as well as the JUMPs as model params. So it needs to be able to map them both ways and be able to change any parts of that mapping (i.e. removing TOAs, removing/adding JUMPs, turning on/off existing JUMPs). I suspect that the functions that you are talking about (e.g. jump_params_to_flags` and related) were written for precisely that purpose. And they work right now! (TOA writing using that info is broken, however, as you noted.)

@aarchiba
Copy link
Contributor Author

I think the -fish carp / -fist sturgeon idea is totally fine. I don't like anything at all about the way we currently handle JUMPs as model params or as flags/key_values with indices, though. Because JUMPs can come and go and the numbers get out of sync and are confusing. This is been an issue with JUMPs since the beginning -- because they are maskParams with integer (and sometimes sequential) indices.

JUMP indices can and should be ignored. They play no role at all in which TOAs the JUMPs match; they exist only to give all JUMPs distinct names in the model. They can and should be replaced so that JUMPs get named something like JUMP_flag_value.

Unfortunately, because they define the user-visible name of the parameter from python, the current naming scheme, and the fragile numeration, have gotten baked into places like the NANOGrav yaml files and the model.compare function (which helpfully compares whichever JUMP got loaded seventh in one par file to whichever JUMP got loaded seventh in the other par file).

I started writing code to abolish JUMP indices but the current model builder does something arcane with them and I don't want to go digging in there as it is about to be replaced.

@aarchiba
Copy link
Contributor Author

One important thing to remember about a use case is that in pintk, it needs to keep track of all the TOAs that are affected by the various JUMPs as well as the JUMPs as model params. So it needs to be able to map them both ways and be able to change any parts of that mapping (i.e. removing TOAs, removing/adding JUMPs, turning on/off existing JUMPs). I suspect that the functions that you are talking about (e.g. jump_params_to_flags` and related) were written for precisely that purpose. And they work right now! (TOA writing using that info is broken, however, as you noted.)

To get all the JUMPs as model params, you can use model.jumps to get their names.

To get all the TOAs affected by a jump you can use model.JUMP17.select_toa_mask(toas).

To get all the JUMPs affecting a TOA, you can write a loop like the one I wrote above. (The current alternative method to do this requires writing a loop to iterate over all sets of TOA flags and asking whether the index of the current JUMP is in that list.)

The current functions do not work. I loaded in some TOAs with two JUMP blocks, whose TOAs got corresponding -jump 1 and -jump 2 flags. Then I tried model.jump_flags_to_params(toas) and it did nothing because I had two unrelated JUMPs (that did not select those TOAs) already in the model. It confused the JUMP indices (arbitrary based on the order of things in the par file) with the values associated with the -jump flag; those numbers get assigned based on the JUMP blocks when reading a legacy tim file but are otherwise recorded stably in saved .tim files.

@paulray
Copy link
Member

paulray commented Jul 16, 2021

I haven't followed all this but if the problem is just numbering jumps, why not change things like GUI jumps and old style TIM file jumps to use a hash key, which is (essentially) unique and doesn't require any kind of sequence? Those can be assigned without care about how many jumps there already are, and can be written/read to TIM files.

@aarchiba
Copy link
Contributor Author

I haven't followed all this but if the problem is just numbering jumps, why not change things like GUI jumps and old style TIM file jumps to use a hash key, which is (essentially) unique and doesn't require any kind of sequence? Those can be assigned without care about how many jumps there already are, and can be written/read to TIM files.

JUMP numbers need to go away permanently. We should not have parameters called JUMP17.

There's a separate question, of if (say) pintk wants to use two different GUI jumps, how it should name the associated flags and values; I am agnostic on this but there's no reason pintk can't just keep its own internal list of numbers for its GUI jumps. -gui_jump_1...

@champagne-cmd
Copy link
Contributor

Hey everyone,

To answer your question @aarchiba , the jumps are all given the 'jump' flag because, as @scottransom mentioned earlier, pintk uses the flags to find where the jumps are in the TOAs. I believe this design was chosen so that, if you just have a portion of the TOAs chosen in pintk that you are fitting to or performing some other operation on, you could simply iterate over that portion's flags and easily see what jumps are in your selection/if they are all jumped/etc - using the TOA indices that we have on-hand to find the jumps as opposed to iterating through the jumps to see if we come across a selected TOA index. But using select_toa_mask may be better to avoid using confusing flags to track this.

I can start this transition next and see if I come across any pitfalls. I was also planning to transition the gui/tim jump key_values to a randomized hex instead of the jump index, but if you've already started on that @aarchiba I can hold off on that.

@aarchiba
Copy link
Contributor Author

I can remove the jump_params_to_flags and replace the code that relies on the -jump lists, but I don't have any experience at all with pintk, so maybe most effective would be if someone who did would give this PR (once I do that) a solid test to verify that I didn't break anything?

@aarchiba
Copy link
Contributor Author

Just to check, though: one goal of pintk is to allow multiple GUI jumps simultaneously, right? Is there any interest in these being able to overlap? If no overlaps are needed, I suggest using the flag -gui_jump and values that are arbitrary but different numbers; if overlaps are needed, -gui_jump_N with the value yes probably makes more sense.

My approach to these would be that pintk would have a list of GUI jumps that it was currently managing, and this list would include the flag and value associated with each one. This list could even just be "all JUMPs whose flag starts with gui_jump" and one could use the list that the timing model keeps, but I imagine pintk wants to record extra information like a colour for each one. Wherever it's recorded, the list would contain the information of what flag and value were used for each one, and if a new GUI jump were created one could go through the list and select a value and/or flag that didn't overlap with any existing one. (Choose an N one more than the highest currently in use, say.)

@aarchiba
Copy link
Contributor Author

See #1074 where I've tried to make flag manipulation much more convenient.

@aarchiba aarchiba force-pushed the jump-documentation branch from fe543ad to 4552ef4 Compare August 23, 2021 18:55
@aarchiba aarchiba changed the title Improve JUMP objects HOLD: Improve JUMP objects Aug 27, 2021
@scottransom
Copy link
Member

@aarchiba I saw your request in the PINT Slack channel to take a look at this PR. I'm happy to do that and hopefully get it a thumbs-up. But why did it get changed to "HOLD"?

@paulray
Copy link
Member

paulray commented Aug 30, 2021

I just tried to play with this using pintk. I loaded NGC6440 and tried to add a jump via the "j" keyboard command. It failed with this exception:

Traceback (most recent call last):
  File "/Users/paulr/env/pint/lib/python3.7/site-packages/matplotlib/cbook/__init__.py", line 224, in process
    func(*args, **kwargs)
  File "/Users/paulr/src/PINT/src/pint/pintk/plk.py", line 1256, in canvasKeyEvent
    self.fitboxesWidget.addFitCheckBoxes(self.psr.prefit_model)
  File "/Users/paulr/src/PINT/src/pint/pintk/plk.py", line 150, in addFitCheckBoxes
    par_text = f"{pm.origin_name} {pm.key} {pm.key_value}"
AttributeError: 'maskParameter' object has no attribute 'key'

@paulray
Copy link
Member

paulray commented Aug 30, 2021

I also tried adding this line to NGC6440E.par:

JUMP MJD 53400 53500 0 1

and then just starting pintk immediately triggered that same exception.

@aarchiba
Copy link
Contributor Author

aarchiba commented Sep 1, 2021

@aarchiba I saw your request in the PINT Slack channel to take a look at this PR. I'm happy to do that and hopefully get it a thumbs-up. But why did it get changed to "HOLD"?

As I said in the edited initial description, this changes the API in a way that would force an update of timing_analysis and after discussing it with Joe Swiggum I don't want to do that while we're working on v1.0. Please still feel free to review it but I think we should hold off on merging it for now.

@aarchiba
Copy link
Contributor Author

aarchiba commented Sep 1, 2021

I just tried to play with this using pintk. I loaded NGC6440 and tried to add a jump via the "j" keyboard command. It failed with this exception:

Traceback (most recent call last):
  File "/Users/paulr/env/pint/lib/python3.7/site-packages/matplotlib/cbook/__init__.py", line 224, in process
    func(*args, **kwargs)
  File "/Users/paulr/src/PINT/src/pint/pintk/plk.py", line 1256, in canvasKeyEvent
    self.fitboxesWidget.addFitCheckBoxes(self.psr.prefit_model)
  File "/Users/paulr/src/PINT/src/pint/pintk/plk.py", line 150, in addFitCheckBoxes
    par_text = f"{pm.origin_name} {pm.key} {pm.key_value}"
AttributeError: 'maskParameter' object has no attribute 'key'

Ah, this is obviously some kind of pintk summary code that isn't part of the test suite. This PR changes the name of the selection attributes from .key and .key_value to .flag and .flag_value - I found .key very confusing in light of python's use of keys in dictionaries, and in fact what goes in there should be the flag that the selection acts upon. I'll go in and make sure that part of pintk is in the test suite and then change the code.

@scottransom
Copy link
Member

I just used ripgrep to search for all uses of .key_value in all of pint and found there were two places. One of them is above:

 >rg .key_value                                                                   10:54:41
src/pint/models/parameter.py
1427:    key_value :  list/single value optional

src/pint/pintk/plk.py
150:                    par_text = f"{pm.origin_name} {pm.key} {pm.key_value}"

@aarchiba
Copy link
Contributor Author

No need for this to be on hold any more; when I have time to work on it I'll fix the conflicts and make the changes suggested.

@aarchiba aarchiba changed the title HOLD: Improve JUMP objects Improve JUMP objects Nov 22, 2021

from pint.models.parameter import maskParameter
from pint.models.timing_model import DelayComponent, MissingParameter, PhaseComponent

log = logging.getLogger(__name__)


class DelayJump(DelayComponent):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tempo actually has delay jumps for the observatory cable delays. I was thinking to switch the backend jump to delay jump. The jump value affects the binary delay calculation.

@luojing1211
Copy link
Member

This PR needs to rebase and fix the conflict.

@aarchiba aarchiba changed the title Improve JUMP objects OLD: Improve JUMP objects Jul 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting review This PR needs someone to review it so it can be merged GUI help wanted reviewer comments need response
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pintk help is double-spaced
5 participants