Commit
Custom Code Evaluations (Agenta-AI#610)
* Update - added restrictedpython
* Feat - created security module
* Feat - implemented execute_code_safely function
* Feat - created custom evaluation db collection
* Feat - created custom evaluation type and store custom evaluation api models
* Feat - implemented store and execute custom code evaluation logic
* Feat - implemented function to check if module import is safe to ensure safe code execution
* Cleanup - remove app_name from execute_custom_code_evaluation
* Feat - implemented store and execute custom evaluation routers
* Update - added custom_code_run to evaluation type and labels
* Feat - upload custom_code image
* Feat - created store custom evaluation type interface
* Feat - created type interface for store custom evaluation success response
* Feat - implemented save custom code evaluation api logic
* Feat - implemented custom evaluation dropdown component
* Update - added type dropdown component
* Feat - implemented custom python code component
* Refactor - renamed component prop interface
* Feat - created type interface for single custom evaluation
* Feat - implemented axios logic to fetch custom evaluations
* Update - improve security in sandbox environment
* Cleanup - removed custom evaluation type embedded model and some fields in custom evaluation db
* Feat - implemented fetch custom evaluations evaluation service
* Feat - implemented list custom evaluations api router
* Feat - created custom evaluation output and added new type in evaluation type schema
* Update - modified custom evaluations dropdown component to set custom evaluation id
* Update - include custom python code and evaluation dropdowns component in evaluations component
* Refactor - removed custom_code.png
* Feat - created evaluation api model to execute custom evaluation code
* Feat - implemented custom code run evaluation page
* Feat - implemented helper function to include dynamic values
* Update - add condition to save correct_answer for custom_code evaluations
* Feat - created type interface for execute custom eval code
* Feat - implemented axios logic to execute custom evaluation code
* Update - added optional field (correct_answer)
* Feat - implemented fetch average score for custom code run result service
* Update - modified fetch_results and execute_custom_evaluation routers
* Cleanup - remove unused code-blocks
* Feat - implemented custom code run evaluation table component
* 🎨 Format - ran format-fix and black
* Feat - created create custom evaluation page
* Update - removed custom python code in evaluation component
* Cleanup - formatted custom evaluations dropdown component
* Refactor - renamed saveCutomCodeEvaluation to saveCustomCodeEvaluation
* Update - added new styles
* Update - introduce pre-filled example of an evaluation function and syntax highlighting on the example code
* 🎨 Format - ran format-fix and black
* Update - added variant_name to type interface ExecuteCustomEvalCode
* Update - added app_params, output to sandbox and allow execution of evaluation function
* Update - added output to execute_custom_code_execution service function
* Update - added app_name, variant_name, and outputs to execute_custom_evaluation_code api model
* Refactor - modified executeCustomEvaluationCode axios api logic
* Update - added styles for copy btn in custom python code component
* Update - refactor evaluate function and added new args in callCUstomCodeHandler function
* Update - add custom code evaluation id to evaluation
* Update - retrieve evaluations for custom code evals
* Update - added custom code evaluation id to router push
* Update - added custom_code_evaluation_id and made it optional
* Update - added btn to copy code example for custom evaluation function
* Update - created format_outputs helper function
* Update - added correct_answer to execute custom evaluation code api model
* Update - modified evaluation function example description
* Update - modified fetch_average_score_for_custom_code_run
* Feat - created update_evaluation_scenario_score logic and added doc strings to custom evaluations services
* Update - include correct_answer in custom eval code params
* Feat - implemented update evaluation scenario score axios logic
* Feat - created evaluation scenario score update api model
* Update - receive put data by payload instead of query
* Update - modified custom code run evaluation table component
* 🎨 Format - ran format-fix and black
* Update - installed packages.json
* Update - integrated ace editor for code input and syntax highlighting
* Update - set result and avg_score to 2 decimal places
* 🎨 Format - ran format-fix
* Cleanup - add ? to handle undefined error
* 🎨 Format - ran format-fix
* 🎨 Format - ran format-fix and black
* Cleanup - removed raise exception when no custom evaluations are found
* Refactor - override error interceptor for get all variant parameters api call
* Cleanup - removed console log
* Feat - created backend router to get evaluation scenario score and axios logic to make backend call
* Update - round score to 2 decimal places
* Refactor - removed CustomEvaluationsDropdown component
* Refactor - improve get_evaluation_scenario_score_router
* Refactor - directly include dropdown select of custom evaluations
* Update - added logic to fetch results of evaluation scenarios that have been run
* 🎨 Format - ran format-fix and black
* Cleanup - fix type error
* custom code evaluation: ui enhancements and bug fixes
* resolve type errors
* ran prettier
* Refactor - renamed store to create
* 🎨 Format - ran black
* Cleanup - removed react-ace and installed monaco-editor
* Refactor - switch from react-ace to monaco-editor
* Feat - created custom evaluation names api model
* Feat - implemented fetch custom evaluation names service
* Feat - implemented evaluation router to get custom evaluation names and integrated router to axios
* Feat - added validation to check if evaluation name (input) exists
* 🎨 Format - ran format-fix
* Refactor - remove /create from evaluation_router and renamed all store_ prefixes to create_
* Refactor - renamed Store prefix to Create
* Cleanup - renamed store custom evaluation success response to start with create and ran format-fix

---------

Co-authored-by: Abram <[email protected]>
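The commit message mentions a function that checks whether a module import is safe before user-supplied evaluation code runs. The commit's actual implementation is not shown here, so the following is only a minimal sketch of one way such a check could work, using the standard-library `ast` module. The names `ALLOWED_MODULES`, `find_unsafe_imports`, and `is_import_safe`, as well as the allowlist contents, are hypothetical and not taken from the commit.

```python
import ast

# Hypothetical allowlist: the commit does not state which modules it permits.
ALLOWED_MODULES = {"math", "random", "datetime", "json", "typing"}


def find_unsafe_imports(source, allowed=ALLOWED_MODULES):
    """Return top-level package names imported by `source` that are not allowlisted."""
    unsafe = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports make no sense in a standalone snippet, so flag them.
            if node.level == 0 and node.module:
                names = [node.module]
            else:
                names = ["<relative import>"]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" -> "os"
            if root not in allowed:
                unsafe.append(root)
    return unsafe


def is_import_safe(source):
    """True when every import statement in `source` targets an allowlisted module."""
    return not find_unsafe_imports(source)


if __name__ == "__main__":
    snippet = "import os\nimport math\n"
    print(find_unsafe_imports(snippet))
```

A static check like this only inspects `import` statements; it does not catch dynamic imports such as `__import__("os")`, which is presumably why the commit also relies on restrictedpython for a restricted execution environment.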