
MWAX Subfile Distributor Voltage Buffer Dump mechanism

Greg Sleap edited this page Feb 3, 2023 · 13 revisions

About

M&C uses the /dump_voltages?start=X&end=Y web service endpoint on each MWAX box to signal that voltage buffers need to be dumped, where start and end are the GPS times bounding the buffers we want to dump.

  • Passing both start and end as 0 returns an HTTP status code of 200, but no action is taken. Good for testing.
  • Parameter start is inclusive. E.g. if start is 1234567890 then a free file with a GPS time of 1234567890 will have a dump attempted.
  • Parameter end is exclusive. E.g. if end is 1234567890 then free files up to GPS time (1234567890-8) will be attempted for dumping. If end is in the future, it is in effect the GPS time at which normal MWAX operations 'take over' from the voltage buffer dump.
  • Even though M&C asks MWAX for these GPS times, they may not all be available (e.g. start is before the first free file we have), but MWAX will do its best to satisfy the request.
  • end may be a GPS time in the future. If so, any and all observations encountered during that time (including IDLE/NO_CAPTURE) are treated as VCS (i.e. subfiles are produced and archived).
  • When mwax starts, all of the ".free" files are numbered without a GPS time since they are just dummy files filled with 0's. We never bother to dump these files.
  • start can be 0. If so, start = oldest free file GPS time. However, we always need to keep some number (currently 2) of the oldest free files available to ensure enough buffer for normal MWAX operations, so we never dump the entire buffer, even when asked to.
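
The parameter rules above can be illustrated with a small Python sketch. It is not the MWAX implementation: `select_dump_candidates` and `RESERVED_FREE_FILES` are hypothetical names, the free files are represented only by their GPS times, and the "not enough free files" case simply returns an empty list here.

```python
RESERVED_FREE_FILES = 2  # oldest free files left alone for normal MWAX operations


def select_dump_candidates(free_gps_times, start, end):
    """Return the GPS times a dump request would try to preserve (sketch)."""
    if start == 0 and end == 0:
        return []  # test call: accepted, but nothing to do

    ordered = sorted(free_gps_times)
    if len(ordered) <= RESERVED_FREE_FILES:
        return []  # nothing can be spared

    # Never offer up the oldest files; normal operations need them.
    candidates = ordered[RESERVED_FREE_FILES:]

    # start == 0 means "from the oldest free file we have".
    if start == 0:
        start = ordered[0]

    # start is inclusive, end is exclusive.
    return [t for t in candidates if start <= t < end]


free = [1234567866, 1234567874, 1234567882, 1234567890, 1234567898]
print(select_dump_candidates(free, 1234567882, 1234567898))
print(select_dump_candidates(free, 0, 1234567898))
print(select_dump_candidates(free, 0, 0))
```

Note that the second call, with start = 0, still skips the two oldest GPS times, which is the "never dump the entire buffer" behaviour described in the last bullet.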

Basic logic

Web Service Endpoint

if dump_start is not None and dump_end is not None:  # module variables are set: we already have a voltage dump in progress. Yikes!
    # for now let's reject it (we can implement a merge later?)
    return status code 401 (Nope)

# module level variables (this is one way for the normal processing loop to know we are in a buffer dump)
dump_start = start (param from web service call)
dump_end = end (param from web service call)

# Get a list of free files, sorted by GPS time (ascending) - Ignore MWAX_VCS mode free files as they are already archived
# also ignore files which are not starting with obs_ids (e.g. 1.free, 2.free which is how they look on mwax startup)
free_file_list = get_files_in_dir("/dev/shm/mwax/*.free")

# If there are more than 2 files, remove the oldest 2 free files from the list (to ensure there are 2 free files plus 1 active sub file available for normal MWAX operations)
if len(free_file_list) <= 2:
    return status code 401 (Nope - not enough free files)
else:
    free_file_list = free_file_list[2:]  # list is sorted ascending, so the oldest 2 are at the front

# Rename all the free files within the start and end gps times in the list to *.keep ASAP (at this point they are still in `/dev/shm/mwax`)
keep_file_queue = empty queue # (module level variable)

# As quick as possible rename candidate .free files to .keep
for free_filename in free_file_list:
    get free_filename_gps_time from free_filename
    if free_filename_gps_time >= start and free_filename_gps_time < end:
        keep_filename = free_filename.replace(".free", ".keep")
        mv free_filename keep_filename

# Now that they are safely preserved, we can eliminate some
# if they are MWAX_VCS. But finding out means reading the file
# which takes precious milliseconds! So we do it now

# Get the list of keep files
keep_file_list = get_files_in_dir("/dev/shm/mwax/*.keep")
for keep_filename in keep_file_list:
    mode = get_subfile_mode(keep_filename)

    if mode != "MWAX_VCS":
        # Add remaining non-VCS keep file to keep_file_queue 
        keep_file_queue.add(keep_filename)
    else:
        # Return this file back to mwax_u2s
        free_filename = keep_filename.replace(".keep", ".free")
        mv keep_filename free_filename

return status code 200 (OK)
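
The "rename first, inspect later" ordering of the endpoint can be sketched in Python as below. This is a minimal illustration, not the MWAX code: `preserve_then_filter` is a hypothetical name, and `gps_of` and `mode_of` are caller-supplied stand-ins for however the GPS time and mode are really obtained from a subfile. The reservation of the oldest free files and the skipping of dummy files are left out for brevity. In the demo, the GPS time is assumed to be the file stem and the mode is the file's text content, purely for illustration.

```python
import tempfile
from collections import deque
from pathlib import Path


def preserve_then_filter(shm_dir, start, end, gps_of, mode_of):
    """Rename matching .free files to .keep, then hand VCS ones back."""
    keep_queue = deque()

    # Step 1: rename as fast as possible, without opening any file.
    for free_path in sorted(Path(shm_dir).glob("*.free"), key=gps_of):
        if start <= gps_of(free_path) < end:
            free_path.rename(free_path.with_suffix(".keep"))

    # Step 2: the files are now safe from reuse, so slower reads are fine.
    for keep_path in sorted(Path(shm_dir).glob("*.keep"), key=gps_of):
        if mode_of(keep_path) == "MWAX_VCS":
            # Already archived by normal VCS processing: give it back.
            keep_path.rename(keep_path.with_suffix(".free"))
        else:
            keep_queue.append(keep_path)

    return keep_queue


# Demo with made-up files: stem = GPS time, content = mode (assumptions).
with tempfile.TemporaryDirectory() as tmp:
    for gps, mode in [(100, "NO_CAPTURE"), (108, "MWAX_VCS"), (116, "MWAX_CORRELATOR")]:
        (Path(tmp) / f"{gps}.free").write_text(mode)
    queue = preserve_then_filter(
        tmp, 100, 116,
        gps_of=lambda p: int(p.stem),
        mode_of=lambda p: p.read_text(),
    )
    kept_names = [p.name for p in queue]
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(kept_names)
print(remaining)
```

Doing all the renames before any file is opened keeps the window in which mwax_u2s could reuse a wanted buffer as short as possible; the mode check only runs once the files are already out of its reach.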

Normal subfile processing loop (triggered by mwax_u2s renaming a .free to .sub)

# Below pseudocode is simplified to illustrate logic regarding voltage buffer dumps
# As we approach 256T we will need the full use of the /voltdata disk to keep up with a
# VCS (or even a correlator) observation, so it's vital we only copy the .keep files to
# /voltdata/incoming when we are not doing anything else. There is no other way to keep up.
# And since at 256T it takes ~6 seconds to write an 8 sec subfile, we can only deal with
# 1 keep file every 8 seconds.

get current_gps_of_subfile from subfile
get current_mode from subfile

if current_gps_of_subfile >= dump_start and current_gps_of_subfile < dump_end:
    # We ignore the mode of this subobs and treat it like VCS instead
    cp current_subfile /voltdata/incoming
    rename current_subfile to *.free
else:
    # We are not IN a voltage buffer dump period
    # Just do normal processing
    if current_mode == "NO_CAPTURE" or current_mode == "VOLTAGE_BUFFER":    
        if keep_file_queue_length > 0:  # are there any .keep files to deal with while we do nothing for this 8 seconds?
            selected_keep_file = keep_file_queue.get_oldest()
            cp selected_keep_file to /voltdata/incoming
            remove selected_keep_file from keep_file_queue
            rename selected_keep_file to *.free  # now mwax_u2s can use this file again

    else if current_mode == "MWAX_CORRELATOR":
        # normal correlator processing
        load subfile into psrdada ring buffer

    else if current_mode == "MWAX_VCS":
        # do normal VCS processing
        cp current_subfile /voltdata/incoming

    # There is a semi-rare case where, between the top of this code and now,
    # a voltage trigger has been received. If so, THIS subfile will not have been added to
    # the keep list, so deal with it now
    if current_gps_of_subfile >= dump_start and current_gps_of_subfile < dump_end:
        keep_filename = current_subfile renamed to *.keep
        # Add keep file to keep_file_queue
        keep_file_queue.add(keep_filename)

# rename it back to .free so u2s can reuse it
if current_subfile exists:
    rename current_subfile to *.free

# Now check to see if we're now out of the voltage buffer dump gps time range
if current_gps_of_subfile >= dump_end:
   # time to clear this voltage dump (but there still may be keep files remaining)
   dump_start = None
   dump_end = None
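
The branching in the processing loop can also be written as a pure Python function, which makes the inclusive/exclusive window test and the "one keep file per idle subobservation" rule easy to check in isolation. This is only a sketch: the function names and the action labels it returns are hypothetical, and modes not listed in the pseudocode above are assumed to do nothing.

```python
def in_dump_window(gps, dump_start, dump_end):
    """True if gps falls in [dump_start, dump_end); False when no dump is active."""
    if dump_start is None or dump_end is None:
        return False
    return dump_start <= gps < dump_end


def choose_action(gps, mode, dump_start, dump_end, keep_queue_length):
    """Pick what to do with the current subfile (labels are illustrative)."""
    if in_dump_window(gps, dump_start, dump_end):
        return "archive_subfile"  # treated as VCS regardless of mode
    if mode in ("NO_CAPTURE", "VOLTAGE_BUFFER"):
        # Idle subobservation: spare disk bandwidth for exactly one .keep file.
        return "archive_one_keep_file" if keep_queue_length > 0 else "idle"
    if mode == "MWAX_CORRELATOR":
        return "correlate"
    if mode == "MWAX_VCS":
        return "archive_subfile"
    return "idle"  # assumption: unknown modes are ignored


def dump_finished(gps, dump_end):
    """True once the current subfile is at or past the end of the dump window."""
    return dump_end is not None and gps >= dump_end


print(choose_action(1008, "NO_CAPTURE", 1000, 1016, 0))
print(choose_action(1016, "NO_CAPTURE", 1000, 1016, 3))
print(dump_finished(1016, 1016))
```

Because .keep files are only copied during idle subobservations, the queue can outlive the dump window itself, which is why clearing `dump_start` and `dump_end` does not imply the queue is empty.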