Fixing merge conflicts from master (#171)
* Fixed typo on string #74 (#103)
* Default transformation (#94)
  * This commit introduces support for a default transformation for undeclared fields in the anonymizer, which the user can override by supplying another default within the anonymizer template.
* Streams bug fixes (#92)
* Deployment script (#95)
  New simplified deployment scripts + unified tags
* Analyzer redesign + supporting custom recognizers
  This commit is the first part of the redesign of the analyzer service, and contains the following:
  1. Separates spacy and recognizers logic into different files.
  2. Implements a base class for all the recognizers (which future custom recognizers will inherit).
  3. Moves the analyzer logic from main to the analyzer_engine class.
  4. Removes the detected text from the analyzer result.

  Future commits will contain the following:
  1. Dynamic loading of the pre-defined recognizers. [link](https://dev.azure.com/csedevil/Presidio-internal/_sprints/taskboard/Presidio%20Crew/Presidio-internal/02%20-%20Testable%20custom%20models)
  2. Add new pattern recognizer via API call. [work item](https://dev.azure.com/csedevil/Presidio-internal/_sprints/taskboard/Presidio%20Crew/Presidio-internal/02%20-%20Testable%20custom%20models)
  3. Improve remove duplicates logic. [bug](https://dev.azure.com/csedevil/Presidio-internal/_workitems/edit/597) and [bug](https://dev.azure.com/csedevil/Presidio-internal/_workitems/edit/596/)
  4. Re-support context model. [work item](https://dev.azure.com/csedevil/Presidio-internal/_sprints/taskboard/Presidio%20Crew/Presidio-internal/02%20-%20Testable%20custom%20models)

  Current Design:
  ![image](https://user-images.githubusercontent.com/13463870/52433948-edc69480-2b16-11e9-98d7-8923fdc9fb8a.png)
* Presidio support for language code in template (#98)
  Configure a language code on the request level and not the field level. All requests should have one language.
* Fix Bug #604 - Refactor test assertions + some pylint fixes (#100)
  * Fix Bug #604 - Refactor test assertions + some pylint fixes
  * fix spaces in spacy_recognizer.py
  * Update test_spacy_recognizer.py
    Fix PR comment regarding Bug 617
* fixed bug: changed 'push' to 'pull' in Makefile (#102)
* Bug666 - Adding pylint to the Analyzer microservice (#105)
  * Adding pylint for the analyzer microservice
* Ll bug610 - Fix bug in IBAN recognizers + additional test fixes (#101)
  * Fix Bug 610: iban recognizer and fix additional tests
  * Ignoring 0 score patterns, removed predefined 0 score patterns from code.
* All fields (#107)
  Support for requesting all fields (entities) and language refactor
* New functionality: Support custom pattern recognizers (#104)
  Adding a new service for persisting recognizers.
  Adding API support for adding, removing and listing recognizers.
  Analyzer service now calls the recognizers store to get new recognizers to be used during analysis.
* Entities list for all fields true (#111)
  Bug fix with all_fields = True
* (Re)Enable context support in analyzer (#114)
  This PR introduces estimation of surrounding words (aka context). It also introduces generic NLP classes for accessing metadata extracted by an NLP engine (specifically tokens, lemmas, NER, stopwords and punctuation). SpacyRecognizer no longer runs spacy but only processes the metadata produced by the NLP engine. In the current implementation, Spacy is the implementation of the NLP engine.
* Add public path to makefile (#121)
  * updated docker file to support correct spacy version
  * added public path for push release
  * Update Dockerfile.python.deps
  * replaced registry with public mcr
* updated docs and samples based on new changes in dev (#120)
  * updated docs and samples based on new changes in dev
  * updated text change suggestion
  * reverted registry change
* a post-upgrade e2e test (#112)
  Runs e2e tests on the newly deployed code
* Build deps - adding build number to python & golang dependency containers (#122)
  * adding presidio-dependency images versioning
  * updating installation notes with dependency label
  * add CI as code
  * presidio deps ci yaml file
  * add triggers to deps CI yaml
  * separate CIs for golang and python
  * adding presidio-dependency images versioning
  * updating installation notes with dependency label
* Bug fixes - based on bugbash (#126)
  * Bug 906 - CONTEXT_SUFFIX_COUNT is not used and PREFIX is used instead
  * Bug 909 - Context not working with upper case / mixed case
  * Bug 908 - support context with substrings
  * Bug 907 - Context window size is off by 1 index
* Revert yaml (#137)
  * Revert "Build deps - adding build number to python&golang dependency containers (#122)"
    This reverts commit c054770.
* Change deployment script to mcr (#130)
  * updated docker file to support correct spacy version
  * changed registry to mcr
  * fixed the mcr url
* Updated readme and documentation (#125)
  * Update README.MD
* Bug 911 fix (#131)
  * Bug 911 fix
    When the persistent recognizers store is empty, there should be no error log line when requesting the hash value
* changed genproto branch to master.
* Updating genproto to master branch
* Updated CI badge for new Presidio-CI pipeline
* bug fix - mcr latest tag was not correct
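
For readers unfamiliar with the recognizer design described in the entries above, the following is a minimal sketch of the ideas involved: a base class that recognizers inherit from, results that carry offsets but not the detected text, zero-score patterns being ignored, and separate prefix and suffix context windows with case-insensitive substring matching. It is not the code from this commit. Every name here (`EntityRecognizer`, `PatternRecognizer`, `RecognizerResult`, `prefix_count`, `suffix_count`, `boost`) and the whitespace tokenization are assumptions made for illustration.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class RecognizerResult:
    """Hypothetical result type: offsets and score only, no detected text."""
    entity_type: str
    start: int
    end: int
    score: float


class EntityRecognizer(ABC):
    """Hypothetical base class that all recognizers would inherit from."""

    def __init__(self, supported_entity: str, context: Sequence[str] = ()):
        self.supported_entity = supported_entity
        # Lower-case context words once so matching is case-insensitive.
        self.context = [word.lower() for word in context]

    @abstractmethod
    def analyze(self, text: str) -> List[RecognizerResult]:
        """Return the entities found in `text`."""


class PatternRecognizer(EntityRecognizer):
    """Regex-based recognizer with an assumed context score boost."""

    def __init__(self, supported_entity: str,
                 patterns: Sequence[Tuple[str, float]],
                 context: Sequence[str] = (),
                 prefix_count: int = 5, suffix_count: int = 0,
                 boost: float = 0.5):
        super().__init__(supported_entity, context)
        # Patterns with a score of 0 are ignored entirely.
        self.patterns = [(re.compile(rx), s) for rx, s in patterns if s > 0]
        self.prefix_count = prefix_count
        self.suffix_count = suffix_count
        self.boost = boost

    def _has_context(self, text: str, start: int, end: int) -> bool:
        # Whitespace tokenization is a simplification of a real NLP engine.
        before = text[:start].lower().split()
        after = text[end:].lower().split()
        # Guard the prefix slice: list[-0:] would return the whole list.
        prefix = before[-self.prefix_count:] if self.prefix_count else []
        suffix = after[:self.suffix_count]
        # Substring match, so "card" is found inside "creditcard".
        return any(word in token
                   for word in self.context
                   for token in prefix + suffix)

    def analyze(self, text: str) -> List[RecognizerResult]:
        results = []
        for regex, score in self.patterns:
            for match in regex.finditer(text):
                final = score
                if self._has_context(text, match.start(), match.end()):
                    final = min(1.0, score + self.boost)
                results.append(RecognizerResult(
                    self.supported_entity, match.start(), match.end(), final))
        return results
```

A caller would construct a `PatternRecognizer` with `(regex, score)` pairs and a list of context words, then call `analyze(text)`. Keeping the prefix and suffix counts as separate parameters is one way to avoid the kind of mix-up described in Bug 906.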