automate spam detection for Job, Event, Codebase and MemberProfile using an external LLM service #772
…`api/spam/update`. A SpamModeration record with status `SCHEDULED_FOR_CHECK` is stored on every Job, Event, Codebase submission. A decoupled external service will query for these objects to check them for spam.
b46ba40 to 20c2574
…PAM_LIKELY - fix tests - add asdf & direnv to .gitignore and .dockerignore
20c2574 to 5b1ce46
We would need to add the JetStream instance IP to ALLOWED_HOSTS.
just missing the migration for
I'll read through it again, but it seemed all good on a first pass, besides having a way to kick off the process.
Regarding the way to start the process from the CoMSES side:
something like this?
…tream2 instance, triggers the LLM spam check workflow and shelves the instance when the workflow is done.
79a1368 to 84f051c
…management command + minor refactoring
84f051c to 0cbae60
…ement command + minor refactoring
…pamModeration object is automatically created for the associated MemberProfile
@asuworks I just remembered there was some additional cleanup I wanted to do eventually with the spam stuff. This might be a good place to get that done if you are up for it: comses/planning#249. Namely the second point (refactoring the serializer mixin to actually be just a mixin).
This PR attempts to automate the spam detection process for `Job`, `Event`, `Codebase` and `MemberProfile` objects using an external LLM service.

LLM Spam Detection Process
1. A `SpamModeration` record with status `SCHEDULED_FOR_CHECK` is stored on every `Job`, `Event` and `Codebase` submission and on every `User` creation (for a `User`, the `SpamModeration` object is attached to the associated `MemberProfile`).
2. The external service queries for the latest batch of these `SpamModeration` objects (`api/spam/get-latest-batch/`), analyzes them for spam and submits a spam report to `api/spam/update` for each one of them.
3. `api/spam/update` on the CoMSES side updates the corresponding `SpamModeration` object according to the LLM report from the external service.

Starting the LLM Spam Detection Process
The external service asuworks/comses.spamcheck is deployed on an existing JetStream2 instance. The instance is unshelved before the spam check workflow is triggered and shelved again automatically once the workflow is done, all by the following management command:
Environment & Secrets
The following environment variables must be set:
JetStream2 Credentials
The application credentials can be found here: https://js2.jetstream-cloud.org/identity/application_credentials/
- `secrets/llm_spam_check_jetstream_os_application_credential_secret`
- `secrets/llm_spam_check_jetstream_os_application_credential_id`
X-API-Key header for the API
Access to the `api/spam/update` and `api/spam/get-latest-batch` routes is protected by `X-API-Key` header verification. The key should be set in `secrets/llm_spam_check_api_key`.
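
To make the interplay of the two protected routes and the `X-API-Key` header concrete, here is a minimal sketch of what one round trip from the external service might look like. It is not the code from asuworks/comses.spamcheck: the base URL, the assumption that the batch is a JSON list, and the `id` / `is_spam` field names are all hypothetical, since the payload format is not shown in this PR description. The HTTP transport is passed in as a parameter so the flow can be exercised without a network.

```python
import json
import urllib.request

# Hypothetical base URL; only the route names come from the PR description.
BASE_URL = "https://comses.example/api/spam"


def http_json(method, url, api_key, body=None):
    """Send a JSON request carrying the X-API-Key header and decode the reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_spam_check(api_key, classify, transport=http_json):
    """Fetch the latest batch, classify each item, and report each result.

    `classify` stands in for the LLM call and returns True for spam.
    The batch shape (a list of dicts with an "id") is an assumption.
    """
    batch = transport("GET", f"{BASE_URL}/get-latest-batch/", api_key)
    reports = []
    for item in batch:
        # Hypothetical report payload: one report per SpamModeration object.
        report = {"id": item["id"], "is_spam": classify(item)}
        transport("POST", f"{BASE_URL}/update", api_key, report)
        reports.append(report)
    return reports
```

Swapping `transport` for a stub makes it easy to check that every request, including each per-object update, carries the same API key.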
ALLOWED_HOSTS
The IP of the JetStream2 instance must be added to Django's `ALLOWED_HOSTS`.
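
One way this could be wired up, sketched under assumptions: the instance IP is read from an environment variable and appended to the host list in the settings module. The variable name `LLM_SPAM_CHECK_JETSTREAM_IP` and the helper `build_allowed_hosts` are hypothetical and do not appear in this PR; the actual project may simply hard-code the address or configure it elsewhere.

```python
import os


def build_allowed_hosts(base_hosts, environ=os.environ):
    """Return base_hosts plus the spam-check instance IP, if one is configured."""
    hosts = list(base_hosts)
    # LLM_SPAM_CHECK_JETSTREAM_IP is a hypothetical variable name.
    instance_ip = environ.get("LLM_SPAM_CHECK_JETSTREAM_IP", "").strip()
    if instance_ip and instance_ip not in hosts:
        hosts.append(instance_ip)
    return hosts


# In a Django settings module the result would be assigned to ALLOWED_HOSTS.
ALLOWED_HOSTS = build_allowed_hosts(["localhost"])
```

Keeping the address out of the settings file means a replaced instance only requires an environment change, not a code change.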