Datanymizer and Greenmask are both extensible open source DB masking tools that work with PostgreSQL.
Language
Datanymizer is written in Rust. Greenmask is written in Go. The language shouldn't make much difference for the end user. Rust is generally a bit faster than Go, but the architectural differences between the two tools would probably dominate run time.
For someone who wants to contribute custom filters or code changes, however, the language does matter.
Maintainership
Datanymizer has unmerged, and thus unreleased, PRs that are quite important. Greenmask currently seems to be maintained enthusiastically.
Subsetting
Datanymizer supports subsetting; Greenmask does not yet support it. See #18.
Transform condition
Datanymizer accepts a transform_condition, which lets a subset of rows be left untransformed. Greenmask does not yet have this feature.
Filters
Both tools mostly rely on an underlying faker library. A more thorough comparison would be needed here.
Architecture
Datanymizer intercepts the pg_dump output before it is written to disk as a dump file. When writing templates, one has access to the previous and current values of columns as the strings that would be written to the dump file. This, combined with working in a templating language inside YAML, can make some transformations very difficult.
I believe that Greenmask operates differently here, although I am not certain. This may affect the total time to produce the anonymized dump, and it affects the ergonomics of custom templating.
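To illustrate why working on dump strings is awkward, here is a minimal sketch in Python, not code from either tool. It assumes a row in PostgreSQL's COPY text format (tab-separated fields, NULL spelled as `\N`, dates in ISO format). The function names `shift_date_field` and `transform_row` are hypothetical. Even a simple "shift this date by ten days" transformation has to parse the string, handle the NULL marker, and serialize the result back.

```python
from datetime import date, timedelta

PG_NULL = r"\N"  # how the COPY text format spells NULL


def shift_date_field(raw: str, days: int) -> str:
    """Shift a date given as dump text; leave NULL markers untouched."""
    if raw == PG_NULL:
        return raw
    parsed = date.fromisoformat(raw)  # assumes ISO dates such as 1990-05-17
    return (parsed + timedelta(days=days)).isoformat()


def transform_row(line: str, column_index: int, days: int) -> str:
    """Apply the date shift to one column of a tab-separated dump row.

    Simplification: values are assumed to contain no escaped characters
    that would need decoding before the transformation.
    """
    fields = line.split("\t")
    fields[column_index] = shift_date_field(fields[column_index], days)
    return "\t".join(fields)
```

A tool that hands typed values to the transformer would let the same logic skip the parsing and NULL handling entirely.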
Altering multiple columns together
For columns that depend on the generated output of other columns, Datanymizer works fine. However, it doesn't have a facility for generating multiple columns at once. The main example of this is generating an address: in Datanymizer the address will be nonsensical if there are separate columns for city, state, street, etc. In Greenmask, the TemplateRecord feature should be able to handle this without issue.
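The difference can be sketched in a few lines of Python. This is an illustration of the idea only and does not use either tool's API. The `LOCATIONS` table and both function names are hypothetical. Masking each column on its own draws city, state, and zip independently, while masking at the record level draws one coherent location and fills all columns from it.

```python
import random

# Hypothetical lookup table; a real setup would draw from a faker data set.
LOCATIONS = [
    {"city": "Austin", "state": "TX", "zip": "78701"},
    {"city": "Denver", "state": "CO", "zip": "80202"},
    {"city": "Portland", "state": "OR", "zip": "97201"},
]


def mask_columns_independently(rng: random.Random) -> dict:
    """Each column gets its own random draw, so the parts can disagree."""
    return {
        "city": rng.choice(LOCATIONS)["city"],
        "state": rng.choice(LOCATIONS)["state"],
        "zip": rng.choice(LOCATIONS)["zip"],
    }


def mask_record(rng: random.Random) -> dict:
    """One draw fills every address column, so the parts always agree."""
    return dict(rng.choice(LOCATIONS))
```

With per-column masking, a row can end up with a city from one location and a state from another, which is the nonsensical address described above. Record-level masking cannot produce that mismatch.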
Templating and custom filters
In Datanymizer, a template only has access to the template language functions, the previous column value, the new column value, and the values of other columns. In Greenmask, by contrast, the template has access to all of the generator and faker functions.