This repository is meant to accompany the Onward! 2024 paper titled "Software Engineering Methods For AI-Driven Deductive Legal Reasoning" authored by Rohan Padhye. The repository contains LLM prompts and traces corresponding to the examples in the paper.
See system-prompt.txt for the system prompt used in all examples. For all examples listed in the paper, the system prompt contained the exact same excerpt of statutes from the Internal Revenue Code as listed in the paper's Appendix A.
The folder ClaudeTranscripts contains transcripts of conversations with Anthropic's Claude 3 Opus which formed the basis of examples provided in the paper. The model version used for these transcripts was claude-3-opus-20240229 with temperature=0 and max_tokens=2000.
In each transcript file, messages (system prompt, user prompt, AI response) are separated by a line containing only three hyphens (---).
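As an illustration of the transcript layout described above, the following is a minimal sketch of how a transcript file could be split into its messages. The function name split_transcript is hypothetical and not part of this repository, and tolerating surrounding whitespace on the separator line is an assumption.

```python
def split_transcript(text: str) -> list[str]:
    """Split transcript text into messages on lines containing only '---'."""
    messages: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        # Assumption: whitespace around the separator is ignored.
        if line.strip() == "---":
            messages.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Flush whatever follows the last separator.
    messages.append("\n".join(current).strip())
    # Drop empty segments, e.g. from a leading or trailing separator.
    return [m for m in messages if m]
```

If the messages appear in the order listed above, the first element would be the system prompt, followed by the user prompt and the AI response.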
The folder PropertyBasedTesting contains a Python script to invoke Claude in a loop to check the property that the year-over-year standard deduction cannot decrease. The program runs until a violation of this property is discovered, which takes about 2.4 iterations on average.
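The general shape of such a loop might look like the sketch below. It is not the script in this repository: find_violation, the deduction_for callback (a stand-in for a call to the model), the taxpayer fields, and the year range are all hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Optional


def find_violation(
    deduction_for: Callable[[dict, int], float],
    rng: random.Random,
    max_iterations: int = 100,
) -> Optional[tuple[int, dict, int]]:
    """Search for a case where the deduction decreases from one year to the next.

    Returns (iteration, taxpayer, year) for the first violation found,
    or None if no violation is found within max_iterations.
    """
    for iteration in range(1, max_iterations + 1):
        # Hypothetical random test input; the real script may generate
        # entirely different scenarios.
        taxpayer = {"birth_year": rng.randint(1930, 2000)}
        year = rng.randint(2018, 2023)
        before = deduction_for(taxpayer, year)
        after = deduction_for(taxpayer, year + 1)
        # Property under test: the deduction must not decrease year over year.
        if after < before:
            return iteration, taxpayer, year
    return None


def shrinking_stub(taxpayer: dict, year: int) -> float:
    """Deliberately faulty stand-in for the model: shrinks every year."""
    return 5000.0 - year


result = find_violation(shrinking_stub, random.Random(0))
```

Passing the model call in as a callback keeps the property check separate from the API client, so the loop can be exercised with a stub before any real requests are made.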