ChangeLog

Ben Murray edited this page Apr 30, 2021 · 3 revisions

v0.5.1 -> v0.5.2

  • BugFix: DataStore.get_spans returning None when passed Readers in legacy scripts. Functionality has been restored from v0.4

v0.5.0 -> v0.5.1

  • Feature: DataFrame.rename function added; allows renaming of one or more fields within a dataframe

v0.4 -> v0.5

  • Major changes to API
    • Datasets & DataFrames introduced
    • Rich API on Fields introduced
    • Much functionality previously accessed through Session can now be accessed through Datasets, DataFrames and Fields
    • See Basic Examples and Intermediate Examples for more details
  • Import improvements
    • You can now specify include and exclude lists for fields in a table during import
      • This allows you to improve import performance and dataset size by excluding or only including the fields that you are interested in
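
As an illustration of the include / exclude idea, here is a minimal sketch in plain Python. It is not ExeTera's importer: the function name `select_fields`, its parameters, and the rule that only one of the two lists may be given are all assumptions made for this example.

```python
def select_fields(all_fields, include=None, exclude=None):
    """Return the fields to import, preserving their original order.

    Hypothetical helper for illustration only; not part of ExeTera.
    """
    # Assumption of this sketch: the two lists are mutually exclusive.
    if include is not None and exclude is not None:
        raise ValueError("specify either include or exclude, not both")
    if include is not None:
        wanted = set(include)
        return [f for f in all_fields if f in wanted]
    if exclude is not None:
        unwanted = set(exclude)
        return [f for f in all_fields if f not in unwanted]
    return list(all_fields)


# Made-up field names for a single table.
fields = ["id", "created_at", "height", "weight", "notes"]
kept_by_include = select_fields(fields, include=["id", "weight"])
kept_by_exclude = select_fields(fields, exclude=["notes"])
```

Fields that are filtered out are never read or written, which is where the saving in import time and dataset size would come from.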

v0.3.2 -> v0.4

  • Separation of all covid-specific functionality out to https://github.com/KCL-BMEIS/ExeTeraCovid.git
  • Removal of legacy csv pipeline code
  • Renaming of some of the ordered_merge_* functionality parameters for clarity
  • Addition of open/close/list/get_dataset functionality to Session
  • Made Session 'withable'
  • Improved performance of Session.get_spans
  • Bug fixes for Session API
    • apply_spans / aggregation issues
  • Bug fixes for Field API
    • provided __bool__ so that if field: works as expected
    • provided single element read for IndexedStringField
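
Two of the items above rely on standard Python protocols, sketched here with toy classes. `ToySession` and `ToyField` are hypothetical stand-ins, not ExeTera classes, and the choice to have `__bool__` always return True is an assumption about the intended behaviour.

```python
class ToySession:
    """Hypothetical stand-in for a session that owns open datasets."""

    def __init__(self):
        self.open_datasets = {}

    def open_dataset(self, name):
        # A real session would open a file here; this stores a placeholder.
        self.open_datasets[name] = object()

    def close(self):
        self.open_datasets.clear()

    def __enter__(self):
        # Makes the object usable in a `with` statement.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and on exceptions, so datasets are always closed.
        self.close()
        return False  # do not suppress exceptions


class ToyField:
    """Hypothetical field wrapper with a length and an explicit truth value."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __bool__(self):
        # Without this method Python falls back to __len__, so an empty
        # field would evaluate as False in `if field:`.
        return True


with ToySession() as session:
    session.open_dataset("example")
    datasets_open_inside = len(session.open_datasets)

datasets_open_after = len(session.open_datasets)
empty_field_is_truthy = bool(ToyField([]))
```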

v0.3.1 -> v0.3.2

  • Fixing issues with use of test_type_from_mechanism_v1
  • Adding ability to optionally import lsoa-based fields through add_imd script
  • Import now appends by default; to overwrite an existing dataset use -w / --overwrite
  • Moved schema files to resources
  • Adding separate lsoa schema for import

v0.3.0 -> v0.3.1

  • Major performance improvement to Session.get_spans

v0.2.7 -> v0.3.0

  • Renaming of hystore to ExeTera, the project's new name!
  • Renaming of the hystorex command to exetera
  • Removal of scripts that now belong in https://github.com/KCL-BMEIS/ExeTeraCovid.git
  • Addition of snapshot journaling and extremely large sort functionality
  • Removal of the legacy csv script functionality

v0.2.7 -> v0.2.7.3

  • Fix to covid_schema.json for numeric diet fields marked 'float' instead of 'float32'
  • Addition of --daily flag to enable / disable generation of daily assessments
  • Addition of

v0.2.6 -> v0.2.7

  • Addition of diet questionnaire schema
  • Reworking of arguments for hystorex import to support arbitrary numbers and names of csvs
  • Provision of highly-scalable merge functionality through ordered merge functions
    • Fix for filtering of indexed string fields

v0.2.5 -> v0.2.6

  • Moving from DataSet to Session class offering cleaner syntax
  • Moving from Readers/Writers to Fields for cleaner syntax
  • Introduction of schema for import command
  • Consolidating commands
    • h5import -> hystorex import
    • h5process -> hystorex process

v0.2.3 -> v0.2.5

  • Please note: there was no version v0.2.4, due to a numbering error when updating the version number
  • Simplifications to the API

v0.2.2 -> v0.2.3

  • Data schema updated for 1.5.1

v0.2.1 -> v0.2.2

  • Fix: Split functionality had not been moved to bin/csvsplit as documented
  • Fix: Missing license headers added

v0.2.0 -> v0.2.1 - tag

  • Refactor: Created the DataStore class and moved processor api methods onto it as member functions
  • Refactor: Simplified the creation of Writers. This can now be done through get_writer on a DataStore instance
  • Fix: Writes to an hdf5 store can no longer be interrupted by interrupts, resulting in more stable hdf5 files
  • Fix: Fixed critical bug in process method that resulted in exceptions when running on fields with a length that isn't an exact multiple of the chunksize
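
The chunksize fix concerns a common edge case in chunked processing. The sketch below shows one way to generate chunk boundaries so that a final, shorter chunk is handled; `chunk_ranges` is a hypothetical helper written for this example and is not taken from the project's process method.

```python
def chunk_ranges(length, chunksize):
    """Yield (start, end) index pairs that cover range(length) in order.

    Hypothetical helper for illustration only.
    """
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    for start in range(0, length, chunksize):
        # Clamp the end so the last chunk stops at `length` rather than
        # running past it when length is not a multiple of chunksize.
        yield start, min(start + chunksize, length)


# 10 elements in chunks of 4: the last chunk holds only 2 elements.
ranges = list(chunk_ranges(10, 4))
```

Clamping with `min` means the same loop body works whether or not the field length divides evenly by the chunk size.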

v0.1.9 -> v0.2.0

  • Added hdf5 import and process functionality

v0.1.8 -> v0.1.9

  • Feature: provision of the split.py script to split the dataset up into subsets of patients and their associated assessments
  • Fix: added treatments and other_symptoms to cleaned assessment file. These fields are concatenated during the merge step using csv-style delimiters and escapes
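
To show what concatenating with csv-style delimiters and escapes can look like, here is a minimal sketch using Python's standard `csv` module. Whether the merge step used this module is an assumption; the helper names and the sample values are made up for illustration.

```python
import csv
import io


def join_csv_style(values):
    """Join values into one string, quoting any that contain commas or quotes."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(values)
    # Drop the single trailing line terminator added by writerow.
    return buffer.getvalue()[:-1]


def split_csv_style(text):
    """Recover the original values from a string made by join_csv_style."""
    return next(csv.reader([text]))


# Made-up free-text entries, including an embedded comma and embedded quotes.
entries = ["paracetamol", "ibuprofen, 200mg", 'said "rest"']
merged = join_csv_style(entries)
recovered = split_csv_style(merged)
```

Because embedded commas and quotes are escaped, the merged field can be split back into the original entries without ambiguity.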

v0.1.7 -> v0.1.8

  • Fix: had_covid_test was not being patched up along with tested_covid_positive
  • Breaking change: output fields renamed
    • Fixed up had_covid_test is output as had_covid_test_clean
    • Fixed up tested_covid_positive is output as tested_covid_positive_clean
    • had_covid_test and tested_covid_positive contain the un-fixed-up data (although rows may still be modified as a result of quantising assessments by day)

v0.1.6 -> v0.1.7

  • Fix: height_clean contained weight data and weight_clean contained height data. This had been the case since they were introduced in v0.1.5

v0.1.5 -> v0.1.6

  • Performance: reduced memory usage
  • Addition: provision of -ps flag for setting parsing schema

v0.1.4 -> v0.1.5

  • Fix: health_status was not being accumulated during the assessment compression phase of cleanup

v0.1.3 -> v0.1.4

  • Fix: added missing value rarely_left_the_house_but_visit_lots to level_of_isolation
  • Fix: added missing fields weight_clean, height_clean and bmi_clean

v0.1.2 -> v0.1.3

  • Fix: -po and -ao options now properly export patient and assessment csvs respectively

v0.1.1 -> v0.1.2

  • Fix: day no longer overwriting tested_covid_positive on assessment export
  • Fix: tested_covid_positive output as a label instead of a number

v0.1 -> v0.1.1

  • Change: Converted 'NA' to '' for csv export