Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle methods on non-struct source types #272

Open
mlevesquedion opened this issue Jan 29, 2021 · 2 comments
Open

Handle methods on non-struct source types #272

mlevesquedion opened this issue Jan 29, 2021 · 2 comments

Comments

@mlevesquedion
Copy link
Contributor

mlevesquedion commented Jan 29, 2021

This issue was prompted by discussion on #264.

Currently, specifying a source type does not require specifying a field, so any named type can be identified as a Source in the configuration (using the package and type keys). Although we do not know of any users leveraging this currently, it is easy to imagine how/why users might want to configure sources that are not structs:

type Secret string

type Secret interface {
        ASecret()
}

Any named type can have methods declared on it. For structs, sensitive information is ultimately stored in fields. Since propagating taint to the results of any source method was too coarse, the fieldpropagator analyzer was introduced to identify methods on struct sources that could return values tainted by a sensitive field. Consider, e.g.:

type Source struct {
        SomethingPublic string
        SomethingPrivate string `levee:"source"`
}

func (s *Source) DoesNotReturnATaintedValue() string {
        return s.SomethingPublic
}

func (s *Source) ReturnsATaintedValue() string {
        return s.SomethingPrivate
}

In the first case, the return value isn't tainted (more on this later), because the field being returned is not a source. We do want the return value of the second method to be tainted, because it returns a source field. We have tests validating this behavior.

Currently, the behavior with respect to methods on Source types is:

  • If the method is a fieldpropagator, then it always returns a tainted value, and it will get its own propagation (i.e., the call will be identified as a source and a propagation will start from there).
  • If the method is not a fieldpropagator, we only propagate to the return value if one of the arguments (excluding the receiver, which is the Source value whose method is being called) is tainted.

For methods on values that are not Sources, we always consider the returned values to be tainted, no matter if the value itself is tainted or not. Since we don't scrutinize these methods' bodies at all, we assume that they always propagate taint, just like we do for regular functions.

Coming back to this statement:

If the method is a fieldpropagator, then it always returns a tainted value, and it will get its own propagation (i.e., the call will be identified as a source and a propagation will start from there).

If a source type has no fields, then its methods will not be identified as field propagators. However, I believe that methods on non-struct source types should always propagate taint. Indeed:

  • Non-structs, non-interfaces are just values, so it is reasonable that they are used to derive the return value of whatever method is defined on them.
  • For interfaces, we don't know what the value could be, so we should assume the worst.

Given this reasoning, I believe that we should:

  1. Mark all methods on non-structs as fieldpropagators.
  2. Rename the fieldpropagator analyzer to accurately reflect this behavior. Maybe something like methodpropagator.

Further discussion:

At the time of writing the fieldpropagator analyzer (and before), we were concerned about false positives. In general, there's no harm in a simple getter that returns a non-source field. However, in some cases it is possible that non-source fields may be tainted, which effectively turns any method that returns that field into a propagator. Continuing with the previous example:

func Oops(s *Source) {
        s.SomethingPublic = s.SomethingPrivate
        Sink(s.DoesNotReturnATaintedValue()) // OOPS, actually, it *does* return a tainted value, and it just reached a sink
}

Currently, we do not have a good way of modeling something like this. The fieldpropagator analyzer merely identifies methods as being propagators or not. It does not say which fields could be incorporated in any return values. In any case, the propagation code does not precisely model the taint status of fields either, because different FieldAddrs referring to the same field are different ssa.Values, so tainting one reference to a field does not taint the others.

Furthermore, consider this other example:

func (s *Source) MoreOops() {
        s.SomethingPublic = s.SomethingPrivate
}

This behavior would also be very difficult to model using our existing abstractions.

Ultimately, what I believe this points to is:

  • The fieldpropagator analyzer is currently doing a very limited form of IPA (interprocedural analysis), which only captures whether or not the function always returns a value derived from a source field. Implementing a more complete form of IPA (e.g., function summaries describing the behavior of functions in more detail) will likely help alleviate some of the issues outlined here.
  • We may benefit from some kind of abstraction over ssa values.
@PurelyApplied
Copy link
Collaborator

  1. Mark all methods on non-structs as fieldpropagators.

+1 to marking all methods as producing sources for interface-type sources. For concretely named types like type Source string, an analyzer should still be able to distinguish methods which propagate taint from the underlying source, i.e.

type Source string

func (s Source) Safe() string {
  return "hello there"
}

func (s Source) Unsafe() string {
  return fmt.Sprintf("Taint should propagate here from s.  %v", string(s))
}

That could be a goal for future iteration, though.

I'd love to see a decision graph as part of our documentation, if we're going to have two or more different ways of defining what it means to be a source.


  1. Rename the fieldpropagator analyzer to accurately reflect this behavior. Maybe something like methodpropagator.

As ever, I consider naming a very hard thing. If the intent of the analyzer is "Identify methods that produce Values that we will treat like a new source," I'm leaning away from the word "propagator." If Go were more verbose by convention, I'd suggest sourceproducermethodidentifier, but that might be letting my Java background show. sourceproducer might suffice? I don't really mind one way or t'other, though, as long as the analyzer's docstring is precise in what the produced result will contain.


Further discussion: [...]

Yeah, this feels like something that notes could handle. Implementation detail, but if and when we get into interprocedural work, I think the note granularity should probably be the ssa.Block, with method / function summaries built from those. But it would certainly alleviate some of these other issues.

@mlevesquedion
Copy link
Contributor Author

All of that sounds good to me. I think I'll start by marking all methods on non-structs as source-producing, and look into refining that later. I think the cases in which a method on e.g. a string doesn't use the string are likely to be rare, but I don't actually know.

Thanks for the naming suggestions. Right now I'm thinking sourcemethod might fit the bill, but I'll keep thinking about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants