-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
basic spam detection #719
basic spam detection #719
Conversation
uses two methods: - timing form load->submission time - using a honeypot field hidden from real users with css logs to a SpamContent model with a generic fk to content models. action such as deactivating the user or allowing the post can then be taken via a WIP admin view
allows for: - filtering by review status and content type - manual confirmation of spam (with option to deactivate submitter) - manual denial of spam also now using wagtail-modeladmin instead of deprecated wagtail.contrib.modeladmin (resolves comses/planning#170)
spam content, and deleted events/jobs for that matter, still show up in the site-wide search. Need to find a good way to filter those from the general search but struggling to figure out how it currently does this for non-live codebases.. |
good catch, we should have a
defined for Events, MemberProfiles, Jobs |
ah that's it, thanks The bulk of this is done, just need to put up a few tests. I think it ought to work as a more flexible base for using the work in #693 as another detection method, but lmk if you have any thoughts or concerns when you get around to it |
standard public() query = not spam and not deleted - make deleted (more like archived) events/jobs viewable to those with access with an alert
- refactor some test [Model]Factory classes to inherit common functionality + add method for returning data to make a create/update request
BREAKING CHANGES(schema): existing schema will break if this branch has already been deployed, manually run ``` ./manage.py migrate core 0020 ``` before running git pull - rename SpamContent -> SpamModeration and SpamModeration.Status to SPAM | NOT_SPAM | UNREVIEWED for clarity - add ModeratedContent mixin class for access to SpamModeration things - squash / regenerate migrations TODO: consider doing the same for QuerySets that should respect a spam_moderation field
Co-authored-by: Scott Foster <[email protected]>
set load time with useForm.setValuesWithLoadTime instead of useFormTimer just for setting a timestamp
- correct alert_if_spam macro reference - move comment to the right place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks @sgfost ! I think I fixed a few minor issues with the codebase release edit form still including an outdated reference to the spam alert macro, will wait and see if things still look OK on staging before merging
@@ -21,6 +21,8 @@ | |||
{% endblock ogp_tags %} | |||
|
|||
{% block content %} | |||
{{ alert_if_spam(is_marked_spam) }} | |||
{{ alert_if_deleted(is_deleted, "event") }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when if ever will alert_if_deleted
be executed? the old behavior was to generate a 404 when trying to retrieve something that's been marked as deleted, guessing that's no longer the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, its now treated exactly the same as things marked spam, hidden from list view but accessible if you have a direct link to it. There wasn't much reasoning behind it besides simplifying queries
django/core/models.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we consider attaching SpamModeration
to MemberProfile
as well?
@@ -180,3 +184,112 @@ def list(self, request, *args, **kwargs): | |||
|
|||
serializer = self.get_serializer(queryset, many=True) | |||
return Response(serializer.data) | |||
|
|||
|
|||
class SpamCatcherSerializerMixin(serializers.Serializer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this use class SpamCatcherSerializerMixin(metaclass=serializers.SerializerMetaclass)
instead?
see
and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, yeah. the mixin shouldn't be a serializer. I'll check this out
@@ -310,7 +311,7 @@ def __init__(self, *args, **kwargs): | |||
super().__init__(*args, **kwargs) | |||
|
|||
|
|||
class EventSerializer(serializers.ModelSerializer): | |||
class EventSerializer(serializers.ModelSerializer, SpamCatcherSerializerMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be working fine but for future-proofing should the SpamCatcherSerializerMixin
be specified first so that the Python MRO invokes its stuff first before serializers.ModelSerializer
?
Same for all the other Serializers, some of which have ModelSerializer before other mixins...
def get_queryset(self): | ||
# exclude spam from list view | ||
if self.action == "list": | ||
return self.queryset.public() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am I understanding this correctly to allow possible spam events to be accessible in the detail view? (same for Jobs?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mostly to give an explanation of what happened should a real post get mistakenly flagged. Though it should probably only be accessible by admins and the submitter
from django.views.decorators.http import require_POST | ||
from wagtail.contrib.modeladmin.helpers import AdminURLHelper | ||
from wagtail_modeladmin.helpers import AdminURLHelper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should eventually migrate to Snippets or alternative https://github.com/comses/planning/issues/248
Thx, I'll make a new issue with these needed changes as to not block merging if its in a 'good enough' state |
uses two methods:
logs actions to a SpamModeration model with a generic fk to content models. actions such as deactivating the user or allowing the post can then be taken via a WIP admin view
TODO:
SpamContent
detailing how/why we think its spam, flexible enough that it can also be used for detecting with NLP (think ahead to reconciling Feat: Spam Detection Feature #693)resolves comses/planning#235