Applications that incorporate sensitive data such as Social Security and credit card numbers may inadvertently disclose
that data into locations such as logs and traces, or through careless UI usage. Often, the disclosure occurs through an
implicit toString() invocation. One way to mitigate this risk is to ensure that the objects holding the sensitive
data do not reveal it by default.
Suppose you have a sensitive object of type T. To protect it from inadvertent disclosure, you wrap it in an object of
type Sensitive<T>. Now you can pass myWrapper around without having to worry about a stray myWrapper.toString() causing
the sensitive data to appear in a log file.
PersonalData mySensitiveData = new PersonalData (12345);
Sensitive<PersonalData> myWrapper = new Sensitive<PersonalData> (mySensitiveData);
System.out.println(myWrapper);
The code above prints a blank line, because the default redactor returns an empty string.
A common case is that the sensitive data is a String. The MaskedField type extends Sensitive<CharSequence> with some
additional default behavior, using # as a masking character to replace characters in the sensitive string:
MaskedField mySensitiveField = new MaskedField("Shhh");
System.out.println("Explicit toString(): " + mySensitiveField.toString());
System.out.println("Implicit toString(): " + mySensitiveField);
System.out.printf("Default format: %s", mySensitiveField);
The code above produces:
Explicit toString(): ####
Implicit toString(): ####
Default format: ####
Using this type, data can be exposed explicitly using the precision specifier in string formatting, as follows:
System.out.printf("Partially exposed: %.2s%n", mySensitiveField);
System.out.printf("More exposed: %.3s%n", mySensitiveField);
System.out.printf("Masked with formatting: `%6S`%n", mySensitiveField);
System.out.printf("More formatting: `%#-6.1S`%n", mySensitiveField);
The code above produces:
Partially exposed: ##hh
More exposed: #hhh
Masked with formatting: ` ##HH`
More formatting: `###H `
Note from the examples above that the "alternate" form (the # flag in the last format string) has no effect on the output.
The Sensitive class holds the data in a final, protected, transient field with no predefined accessor methods.
Accessors can be added to subclasses if desired. The field is transient to ensure it is not exposed via object
serialization.
Responsibility for rendering the sensitive data is delegated to a redaction function returned by the redactor()
method. The redactor function accepts two parameters: the sensitive object to redact, and the desired precision of the
output. The precision is generally interpreted as the number of non-redacted characters to include in the output. The
default redactor always returns an empty string.
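The arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the library's source: the class name WrapperSketch is hypothetical, and passing -1 to mean "no precision specified" is an assumption borrowed from the java.util.Formattable convention.

```java
import java.util.function.BiFunction;

// Hypothetical stand-in for a Sensitive-style wrapper; names are illustrative only.
class WrapperSketch<T> {
    // final + protected + transient: no accessor, and skipped by Java serialization.
    protected final transient T secret;

    WrapperSketch(T secret) {
        this.secret = secret;
    }

    // Default redactor: ignores both the value and the precision and returns "".
    protected BiFunction<T, Integer, CharSequence> redactor() {
        return (value, precision) -> "";
    }

    @Override
    public String toString() {
        // Assumption: -1 stands for "no precision specified".
        return redactor().apply(secret, -1).toString();
    }

    public static void main(String[] args) {
        WrapperSketch<String> wrapped = new WrapperSketch<>("123-45-6789");
        System.out.println("value=[" + wrapped + "]"); // prints value=[]
    }
}
```

A subclass that wants to reveal part of the value would override redactor() rather than toString(), which keeps the rendering decision in one place.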
The default implementation of alternate() simply delegates to the redactor() method. Override the alternate()
method if the sensitive data has an alternate rendition.
The Sensitive object implements the Formattable interface, and the formatTo(…) method is responsible for applying
formatting to the rendered, protected data as needed.
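As a rough illustration of what applying formatting in formatTo can involve, the sketch below implements java.util.Formattable for a string that is assumed to be already redacted. The class FormattableSketch is hypothetical and handles only the uppercase flag, width, and justification; a real implementation would also hand the precision to the redactor.

```java
import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;
import java.util.Locale;

// Hypothetical illustration; not the library's actual formatTo implementation.
class FormattableSketch implements Formattable {
    private final String rendered; // stands in for text that is already redacted

    FormattableSketch(String rendered) {
        this.rendered = rendered;
    }

    @Override
    public void formatTo(Formatter formatter, int flags, int width, int precision) {
        String text = rendered;
        // An uppercase conversion such as %S sets the UPPERCASE flag.
        if ((flags & FormattableFlags.UPPERCASE) != 0) {
            text = text.toUpperCase(Locale.ROOT);
        }
        StringBuilder out = new StringBuilder(text);
        // The '-' flag sets LEFT_JUSTIFY; otherwise padding goes on the left.
        boolean leftJustify = (flags & FormattableFlags.LEFT_JUSTIFY) != 0;
        // width is -1 when unspecified, so this loop is skipped in that case.
        while (out.length() < width) {
            if (leftJustify) {
                out.append(' ');
            } else {
                out.insert(0, ' ');
            }
        }
        formatter.format("%s", out);
    }

    public static void main(String[] args) {
        FormattableSketch field = new FormattableSketch("##hh");
        System.out.println(String.format("[%6S]", field));  // prints [  ##HH]
        System.out.println(String.format("[%-6s]", field)); // prints [##hh  ]
    }
}
```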
The hashCode() method delegates to the hash code of the protected object.
The equals() method provides the usual short-circuit checks for the argument being the same object and the argument
being the same class, then delegates to the equals method of the protected object.
The toString() method delegates to the string formatter, using String.format("%s", this).
The Redactor interface itself is a shorthand for the BiFunction<T, Integer, CharSequence> required by
Sensitive.redactor() and Sensitive.alternate(). The interface additionally provides some predefined methods that
can be composed and delegated.
The empty() method returns a function that always returns an empty String.
The limited(…) methods return functions that wrap another redactor and impose upper limits on the allowed precision.
The variant that accepts a max ensures that the precision is always between 0 and the maximum value. The variant
that accepts a function to compute the length of the non-redacted rendition ensures that the precision is always between
0 and one half the length of the non-redacted rendition, rounded down.
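The clamping behavior can be sketched as below. The helper names limitedTo and limitedToHalf are hypothetical and are kept distinct from the library's limited(…) methods, whose exact signatures are not shown here. The echo redactor simply reports the precision it receives, so the effect of the wrapper is visible.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helpers mirroring the two limiting variants described above.
final class LimitedSketch {
    // Clamp the requested precision into [0, max] before delegating.
    static <T> BiFunction<T, Integer, CharSequence> limitedTo(
            int max, BiFunction<T, Integer, CharSequence> delegate) {
        return (value, precision) -> delegate.apply(value, clamp(precision, max));
    }

    // Clamp into [0, length / 2]; integer division rounds down.
    static <T> BiFunction<T, Integer, CharSequence> limitedToHalf(
            Function<T, Integer> length, BiFunction<T, Integer, CharSequence> delegate) {
        return (value, precision) -> delegate.apply(value, clamp(precision, length.apply(value) / 2));
    }

    private static int clamp(int precision, int max) {
        return Math.max(0, Math.min(precision, max));
    }

    public static void main(String[] args) {
        // Test redactor that only reports the precision it was given.
        BiFunction<String, Integer, CharSequence> echo = (value, precision) -> "precision=" + precision;
        System.out.println(limitedTo(4, echo).apply("secret", 10));                  // precision=4
        System.out.println(limitedTo(4, echo).apply("secret", -1));                  // precision=0
        System.out.println(limitedToHalf(String::length, echo).apply("secret", 10)); // precision=3
    }
}
```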
The defaulted(…) methods return functions that limit the default precision to one half the length of the non-redacted
rendition, but do not interfere with explicitly provided precisions.
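A minimal sketch of this defaulting behavior follows. It assumes that an unspecified precision arrives as a negative number (as java.util.Formattable does with -1); the class and method below are hypothetical and not the library's own code.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper: substitute half the length only when no precision was given.
final class DefaultedSketch {
    static <T> BiFunction<T, Integer, CharSequence> defaulted(
            Function<T, Integer> length, BiFunction<T, Integer, CharSequence> delegate) {
        // Assumption: a negative precision means "not specified".
        return (value, precision) ->
                delegate.apply(value, precision < 0 ? length.apply(value) / 2 : precision);
    }

    public static void main(String[] args) {
        // Test redactor that only reports the precision it was given.
        BiFunction<String, Integer, CharSequence> echo = (value, precision) -> "precision=" + precision;
        BiFunction<String, Integer, CharSequence> redactor = defaulted(String::length, echo);
        System.out.println(redactor.apply("secret", -1)); // precision=3
        System.out.println(redactor.apply("secret", 5));  // precision=5
    }
}
```

Unlike the limiting wrappers, an explicit precision of 5 passes through unchanged even though it exceeds half the length.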
The mask(…) methods return functions that replace all but the last precision characters with a predefined masking
character. Redaction starts on the left, so the rightmost characters are exposed. If no precision is specified, the
default is to mask the entire rendition. A masking character can be passed to the constructor. The parameterless
convenience method mask() uses the default mask character, which is #.
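The masking rule itself is small enough to sketch directly. The static helper below is hypothetical and takes the mask character as a plain argument; it treats a negative precision as "not specified", which is an assumption, and masks everything in that case.

```java
// Hypothetical illustration of left-to-right masking that exposes the rightmost characters.
final class MaskSketch {
    static CharSequence mask(CharSequence value, int precision, char maskChar) {
        int length = value.length();
        // Never expose fewer than 0 or more than all of the characters.
        int exposed = Math.max(0, Math.min(precision, length));
        StringBuilder out = new StringBuilder(length);
        for (int i = 0; i < length - exposed; i++) {
            out.append(maskChar);
        }
        // Copy the trailing, non-redacted characters unchanged.
        out.append(value, length - exposed, length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(mask("Shhh", -1, '#')); // ####
        System.out.println(mask("Shhh", 2, '#'));  // ##hh
        System.out.println(mask("Shhh", 3, '#'));  // #hhh
    }
}
```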
Mask a sensitive string with '#', exposing no more than 4 plaintext characters:
@Override
protected BiFunction<CharSequence, Integer, CharSequence> redactor() {
    return Redactor.limited(4, Redactor.mask());
}
Mask a sensitive string with '#', exposing no more than half the plaintext characters:
@Override
protected BiFunction<CharSequence, Integer, CharSequence> redactor() {
    return Redactor.limited(Redactor.mask());
}
Mask a sensitive string with '#' characters, respecting whatever precision is specified and exposing no more than half the characters in plaintext if no precision is specified:
@Override
protected BiFunction<CharSequence, Integer, CharSequence> redactor() {
    return Redactor.defaulted(Redactor.mask());
}
MaskedField extends Sensitive<CharSequence> for the common case of protected string and string-like values. The
redactor supplied by the MaskedField subclass replaces protected characters with #, leaving only the number of
characters specified by the precision non-redacted.
SensitiveArray extends Sensitive for cases where the protected data is an array. It overrides the hashCode() and
equals() methods to use the corresponding functions provided by java.util.Arrays.
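The reason for these overrides is that Java arrays compare by identity, so two wrappers around equal contents would otherwise never be equal. The sketch below shows the idea with a hypothetical ArrayWrapperSketch class; it is not the library's SensitiveArray.

```java
import java.util.Arrays;

// Hypothetical array wrapper whose equality is based on array contents.
class ArrayWrapperSketch<T> {
    protected final transient T[] secret;

    ArrayWrapperSketch(T[] secret) {
        this.secret = secret;
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(secret); // content-based hash
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        // Element-wise comparison instead of the identity check used by array equals().
        return Arrays.equals(secret, ((ArrayWrapperSketch<?>) other).secret);
    }

    public static void main(String[] args) {
        String[] first = {"a", "b"};
        String[] second = {"a", "b"};
        System.out.println(first.equals(second)); // false
        System.out.println(new ArrayWrapperSketch<>(first).equals(new ArrayWrapperSketch<>(second))); // true
    }
}
```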
SensitiveArray defines static utility methods for some common cases. For cases where T is a CharSequence, the
concatenate(), delimit(CharSequence) and delimit(char) methods can be used to obtain functions that convert the
array into a CharSequence.
The delimit(CharSequence, Function<T, CharSequence>) method can be used to obtain a function to convert an array of
an arbitrary type into a CharSequence by applying a conversion function to each element in the array.
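One plausible shape for such a function is sketched below: convert each element, then join the results with the delimiter. The DelimitSketch class is hypothetical and only mirrors the signature named above; the library's own implementation may differ.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: build a function that renders an array as delimited text.
final class DelimitSketch {
    static <T> Function<T[], CharSequence> delimit(
            CharSequence delimiter, Function<T, CharSequence> converter) {
        return array -> Arrays.stream(array)
                .map(converter)                          // convert each element to text
                .collect(Collectors.joining(delimiter)); // join with the delimiter
    }

    public static void main(String[] args) {
        Function<Integer, CharSequence> toText = n -> Integer.toString(n);
        Function<Integer[], CharSequence> join = delimit("-", toText);
        System.out.println(join.apply(new Integer[] {123, 45, 6789})); // 123-45-6789
    }
}
```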