Each group of unit tests has one directory and one Python file with the same name. The Python file implements the group's testing logic, while the directory contains the test cases.
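For illustration, here is a minimal sketch of how a test module might locate its same-named case directory; the helpers are hypothetical, not the actual harness code.

```python
# Hypothetical sketch: a test module (e.g. dataflow.py) finds its test
# cases in the sibling directory of the same name (e.g. dataflow/).
from pathlib import Path

def case_dir(test_file: str) -> Path:
    """Directory of test cases matching this test module's name."""
    return Path(test_file).with_suffix("")  # dataflow.py -> dataflow/

def iter_cases(test_file: str):
    """Yield the test-case paths in a stable order."""
    yield from sorted(case_dir(test_file).iterdir())
```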
These tests exercise dataflow within STA's language and execution.
These tests do not currently require instantiating any language model. When an LM becomes required, we will use a FakeLM to permit testing without depending on a real LLM.
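A FakeLM along these lines would suffice; the interface shown is an assumption for illustration, not STA's actual API.

```python
# Hypothetical FakeLM: returns canned completions keyed by prompt, so
# tests stay deterministic and never touch a real LLM.
class FakeLM:
    def __init__(self, canned: dict[str, str], default: str = ""):
        self.canned = canned
        self.default = default
        self.calls: list[str] = []  # prompts seen, for later assertions

    def complete(self, prompt: str) -> str:
        self.calls.append(prompt)
        return self.canned.get(prompt, self.default)

# Usage in a test:
lm = FakeLM({"Summarize: hello": "hello"})
assert lm.complete("Summarize: hello") == "hello"
assert lm.calls == ["Summarize: hello"]
```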
The tests are run by `dataflow.py` and the test programs live in `dataflow/`.
These tests load, display, and extract the same data.
Currently, we only test a single correct use case per test case. We need more valid data samples (see the sketch after this list), especially:

- lists with:
  - a single element
  - more elements than the limit in the program (currently undefined behavior)
- dictionaries with:
  - additional fields
  - missing fields (would need the FakeLM to provide completion)
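For example, samples shaped like the following could cover those gaps; field names and values are made up for illustration.

```python
# Illustrative data samples for the missing cases; shapes assumed.
single_element_list = ["only"]
over_limit_list = ["a", "b", "c", "d", "e"]   # exceeds the program's limit
extra_field_item = {"title": "t", "body": "b", "extra": "unexpected"}
missing_field_item = {"title": "t"}           # "body" left for FakeLM to fill
```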
The base cases correspond to different data (example values follow the list):

text
: a simple string

list
: a list of string

item
: a dictionary with two text fields

item-list
: list of dictionary with text fields

nested
: list of dictionary with one text and one list of text fields

double-nested
: list of dictionary with one text and one list of dictionary of list of text fields
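As a concrete illustration, one plausible value per base case; the field names are assumptions, not the actual test data.

```python
# One possible value per base case, matching the shapes described above.
text = "hello"
list_case = ["hello", "world"]
item = {"title": "t", "body": "b"}
item_list = [{"title": "t1", "body": "b1"}, {"title": "t2", "body": "b2"}]
nested = [{"title": "t", "tags": ["a", "b"]}]
double_nested = [{"title": "t", "children": [{"tags": ["x", "y"]}]}]
```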
For each base case, there are variations (the combined naming is sketched after the list):

input
: single prompt, `data` from `inputs`, written as-is to output

call
: single prompt, `data` from a call to a `DataSource` tool, written as-is to output

flow
: TODO two prompts, `data` from `inputs`, read from first prompt into second prompt, written as-is to output
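Base cases and variations combine into test names; the pattern below is inferred from the test names listed next and is only a sketch.

```python
# Assumed naming pattern: <operation>-<base case>-<variation>,
# e.g. "identity-text-input" or "iteration-text-call".
BASE_CASES = ["text", "list", "item", "item-list", "nested", "double-nested"]
VARIATIONS = ["input", "call", "flow"]

def test_names(operation: str = "identity"):
    for base in BASE_CASES:
        for variation in VARIATIONS:
            yield f"{operation}-{base}-{variation}"
```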
Beyond these base-case/variation combinations, additional test cases cover mapping, filtering, aggregation, and iteration:

- identity-texts-mapped-input
- identity-double-mapped-flow
- filter-aggregate-list-text-bool
- iteration-text-call
- iteration-text-flow