Skip to content

Commit

Permalink
Delete users that haven't logged in for too long (#647)
Browse files Browse the repository at this point in the history
* Add deletion_date to users in DB

* Set deletion date in user cleaner

* Remove old user cleaner code & add first unit test

* Don't just mount `spec` folder, but complete `app` folder for tests

This is to ensure that local changes in the `app` folder are reflected
in the Docker container such that newly written tests can be run through
without having to rebuild the whole Docker image.

* Revert "Don't just mount `spec` folder, but complete `app` folder for tests"

This reverts commit b3f3ec8.

* Add more unit tests for UserCleaner

* Improve tests for deletion of generic users only

* Split test for deletion of users

* Unset deletion date after successful user login

* Implement un-setting the deletion date & add tests

* Rewrite deletion unsetting to reuse existing method

* Send initial warning mail about deletion in 40 days

* Send additional warning mails when deletion date is near

* Add schema comment pointing to UserCleaner

* Test mail delivery

* Add "!" to unsafe method

* Use Rails logger to log info about deleted users

* Move UserCleaner from models to workers

* Setup user cleaner cron job to run daily

* Improve user cleaner doc strings & refactor

* Remove user ghost hash remnants

* Avoid unnecessary `to_date` conversion

* Improve comment for `after_authentication` hook

* Fix wording in comment

* Make better use of `let` RSpec keyword

* Limit the number of users deleted in one run

* Simplify unsetting deletion date logic

* Define `active_users` to reduce SQL operations

* Use `7.1` as Rails version specifier in migrations

* Explicitly set `INACTIVE_USER_THRESHOLD` in tests

* Make `MAX_DELETIONS_PER_RUN` an env variable

* Don't send deletion warning mails to non-generic users

* Send a final deletion mail once the user is deleted

* Move user cleaner file back to models folder

* Rename user cleaner files

* Count users without last_sign_in_at date as inactive

* Improve docstring for `inactive_users`

* Shorten user cleaner comment in docker.env

* Introduce `PRODUCTION_NAME` env variable

* Don't run user cleaner for mampf-experimental/dev

* Rewrite logic for when user cleaner is started

* Fix rubocop "where range"

* Try to use after instead of append_after

* Revert "Try to use after instead of append_after"

This reverts commit 90aa662.

* Try to not use cached docker in CI/CD unit tests

* Revert "Try to not use cached docker in CI/CD unit tests"

This reverts commit 069032b.

* Move transaction strategy to before each block

* Use around each in database cleaner config

* Use version 2.1.0 of database cleaner

* Revert "Use version 2.1.0 of database cleaner"

This reverts commit 6dfcd96.

* Add missing require "rails_helper"

* Try deletion strategy of database cleaner
  • Loading branch information
Splines authored Aug 13, 2024
1 parent 6557bf4 commit d0f8f44
Show file tree
Hide file tree
Showing 25 changed files with 552 additions and 150 deletions.
8 changes: 0 additions & 8 deletions app/mailers/mathi_mailer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,6 @@ class MathiMailer < ApplicationMailer
default from: DefaultSetting::PROJECT_EMAIL
layout false

def ghost_email(user)
return if user.ghost_hash.nil?

@name = user.name
@hash = user.ghost_hash
mail(to: user.email, subject: t("mailer.hash_mail_subject"))
end

def data_request_email(user)
@mail = user.email
@id = user.id
Expand Down
28 changes: 28 additions & 0 deletions app/mailers/user_cleaner_mailer.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
class UserCleanerMailer < ApplicationMailer
layout "warning_mail_layout"

# Creates an email to inform a user that their account will be deleted.
#
# @param [Integer] num_days_until_deletion:
# The number of days until the account will be deleted.
def pending_deletion_email(num_days_until_deletion)
user = params[:user]
sender = "#{t("mailer.warning")} <#{DefaultSetting::PROJECT_EMAIL}>"
I18n.locale = user.locale

@num_days_until_deletion = num_days_until_deletion
subject = t("mailer.pending_deletion_subject",
num_days_until_deletion: @num_days_until_deletion)
mail(from: sender, to: user.email, subject: subject, priority: "high")
end

# Creates an email to inform a user that their account has been deleted.
def deletion_email
user = params[:user]
sender = "#{t("mailer.warning")} <#{DefaultSetting::PROJECT_EMAIL}>"
I18n.locale = user.locale

subject = t("mailer.deletion_subject")
mail(from: sender, to: user.email, subject: subject, priority: "high")
end
end
219 changes: 117 additions & 102 deletions app/models/user_cleaner.rb
Original file line number Diff line number Diff line change
@@ -1,125 +1,140 @@
# PORO class that removes users with inactive emails
# Deletes inactive users from the database.
# See [1] for a description of how the flow works on a high level.
#
# Users have a deletion_date field that is nil by default. It is set to a future
# date if the user has been inactive for too long (i.e. hasn't logged in).
# Before the deletion date is reached, we send warning mails. If users log in
# before the deletion date, that date is reset to nil such that the user is not
# deleted. If the user is still inactive on the deletion date, the user is
# ultimately deleted.
#
# [1] https://github.com/MaMpf-HD/mampf/issues/410#issuecomment-2136875776
class UserCleaner
attr_accessor :imap, :email_dict, :hash_dict

def login
@imap = Net::IMAP.new(ENV.fetch("IMAPSERVER"), port: 993, ssl: true)
@imap.authenticate("LOGIN", ENV.fetch("PROJECT_EMAIL_USERNAME"),
ENV.fetch("PROJECT_EMAIL_PASSWORD"))
# The maximum number of users that can be deleted in one run.
# This is equivalent to the maximum of number of deletion dates set in one run.
#
# This flag can be used to prevent too many mails from being sent at once.
# Keep in mind that the mail server also handles other mails, e.g. notification
# mails etc., so we might want to set the limit very low here such that our
# mail server is not marked as "spam server".
#
# Note that this is just a soft limit, i.e. the actual number of deletion
# warning mails sent on a given day might be higher than this number:
# - If on a given day the cronjob is not run (for whatever reason),
# we have more users with a deletion date lying in the past than
# MAX_DELETIONS_PER_RUN. However, we don't send an additional mail once
# the user is deleted, so this shouldn't be a problem.
# - Despite that there cannot be more than MAX_DELETIONS_PER_RUN users with the
# same deletion date, warning mails might be sent on the same date to users
# with varying deletion dates, since the 40-, 14-, 7- and 2-day warning mails
# can overlap temporally.
MAX_DELETIONS_PER_RUN = ENV.fetch("MAX_DELETIONS_PER_RUN").to_i

# The threshold for inactive users. Users who have not logged in for this time
# are considered inactive.
INACTIVE_USER_THRESHOLD = 6.months

# Returns all users who have been inactive for INACTIVE_USER_THRESHOLD months,
# i.e. their last sign-in date is more than INACTIVE_USER_THRESHOLD months ago.
#
# Users without a last_sign_in_at date are also considered inactive. This is
# the case for users who have never logged in since PR #553 was merged.
def inactive_users
User.where(last_sign_in_at: ...INACTIVE_USER_THRESHOLD.ago)
.or(User.where(last_sign_in_at: nil))
end

def logout
@imap.logout
# Returns all users who have been active in the last INACTIVE_USER_THRESHOLD months,
# i.e. their last sign-in date is less than INACTIVE_USER_THRESHOLD months ago.
def active_users
User.where(last_sign_in_at: INACTIVE_USER_THRESHOLD.ago..)
end

def search_emails_and_hashes
@email_dict = {}
@hash_dict = {}
@imap.examine(ENV.fetch("PROJECT_EMAIL_MAILBOX"))
# Mails containing multiple email addresses (Subject: "Undelivered Mail Returned to Sender")
@imap.search(["SUBJECT",
"Undelivered Mail Returned to Sender"]).each do |message_id|
body = @imap.fetch(message_id,
"BODY[TEXT]")[0].attr["BODY[TEXT]"].squeeze(" ")
# rubocop:disable Layout/LineLength
next unless (match = body.scan(/([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})[\s\S]*?([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})[\s\S]*?User has moved to ERROR: Account expired/))
# rubocop:enable Layout/LineLength

match = match.flatten.uniq
match.each do |email|
add_mail(email, message_id)

try_get_hash(body, email)
end
end
# Mails containing single email addresses (Subject: "Delivery Status Notification (Failure)")
# define array containing all used regex patterns
patterns = [
'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})>[\s\S]*?Unknown recipient',
# rubocop:disable Layout/LineLength
'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})>[\s\S]*?User unknown in virtual mailbox table'
# rubocop:enable Layout/LineLength
]

@imap.search(["SUBJECT",
"Delivery Status Notification (Failure)"]).each do |message_id|
body = @imap.fetch(message_id,
"BODY[TEXT]")[0].attr["BODY[TEXT]"].squeeze(" ")
patterns.each do |pattern|
next unless (match = body.scan(/#{pattern}/))

match = match.flatten.uniq
match.each do |email|
add_mail(email, message_id)

try_get_hash(body, email)
end
# Sets the deletion date for inactive users and sends an initial warning mail.
#
# This method finds all inactive users whose deletion date is nil (not set yet)
# and updates their deletion date to be 40 days from the current date.
#
# The maximum number of deletion dates set in one run is limited by
# MAX_DELETIONS_PER_RUN.
def set_deletion_date_for_inactive_users
inactive_users.where(deletion_date: nil)
.limit(MAX_DELETIONS_PER_RUN)
.find_each do |user|
user.update(deletion_date: Date.current + 40.days)

if user.generic? # rubocop:disable Style/IfUnlessModifier
UserCleanerMailer.with(user: user).pending_deletion_email(40).deliver_later
end
end
end

def add_mail(email, message_id)
@email_dict = {} if @email_dict.blank?
if @email_dict.key?(email)
@email_dict[email] << message_id
else
@email_dict[email] = [message_id]
end
# Unsets the deletion date for users who have been active recently.
#
# This method finds all users whose deletion date is set and unsets it if the
# user has been active recently.
#
# Note that this method just serves as a safety measure. The deletion date
# should be unset after every successful user sign-in, see the Warden callback
# in `config/initializers/after_sign_in.rb`. If for some reason, the callback
# does not work, this method will prevent active users from being deleted
# as a last resort.
def unset_deletion_date_for_recently_active_users
active_users.where.not(deletion_date: nil).update(deletion_date: nil)
end

def try_get_hash(body, email)
@hash_dict = {} if @hash_dict.blank?
begin
hash = body.match(/Hash:([a-zA-Z0-9]*)/).captures
@hash_dict[email] = hash
rescue StandardError
nil
# Deletes all users whose deletion date is in the past or present.
#
# Technically, there should never be users with a deletion date in the past
# since the cron job is run daily and should delete users on the day of their
# deletion date. Should the cron job not run for some reason, we also delete
# users with a deletion date in the past via this method.
#
# The deletion date for the users must have been set beforehand by calling
# `set_deletion_date_for_inactive_users`.
#
# Even after having called this method, there might still exist users with a
# deletion date in the future, as we only delete generic users.
def delete_users_according_to_deletion_date!
num_deleted_users = 0

User.where(deletion_date: ..Date.current).find_each do |user|
next unless user.generic?

UserCleanerMailer.with(user: user).deletion_email.deliver_later
user.destroy
num_deleted_users += 1
end
end

def send_hashes
@emails = @email_dict.keys
@users = User.where(email: @emails)

@users.each do |user|
user.update(ghost_hash: Digest::SHA256.hexdigest(Time.now.to_i.to_s))
MathiMailer.ghost_email(user).deliver_now
move_mail(@email_dict[user])
end
Rails.logger.info("UserCleaner deleted #{num_deleted_users} stale users")
end

def delete_ghosts
# @hash_dict.each do |mail, hash|
# u = User.find_by(email: mail, ghost_hash: hash)
# move_mail(@email_dict[mail]) if u.present? && @email_dict.present?
# u.destroy! if u&.generic?
# end
end

def move_mail(message_ids, attempt = 0)
return if message_ids.blank?
# Sends additional warning mails to users whose deletion date is near.
#
# In addition to the initial warning mail 40 days before deletion, this method
# sends warning mails 14, 7 and 2 days before the account is deleted.
def send_additional_warning_mails
User.where.not(deletion_date: nil).find_each do |user|
next unless user.generic?

message_ids = Array(message_ids)
return if attempt > 3
num_days_until_deletion = (user.deletion_date - Date.current).to_i

begin
@imap.examine(ENV.fetch("PROJECT_EMAIL_MAILBOX"))
@imap.move(message_ids, "Other Users/mampf/handled_bounces")
rescue Net::IMAP::BadResponseError
move_mail(message_ids, attempt + 1)
if [14, 7, 2].include?(num_days_until_deletion)
UserCleanerMailer.with(user: user)
.pending_deletion_email(num_days_until_deletion)
.deliver_later
end
end
end

def clean!
# TODO: Implement new user cleaner logic
# login
# search_emails_and_hashes
# return if @email_dict.blank?
# Handles inactive users according to the deletion policy documented
# in the UserCleaner class description. Brief: users that haven't logged in
# to MaMpf for too long will be deleted.
def handle_inactive_users!
set_deletion_date_for_inactive_users
unset_deletion_date_for_recently_active_users
delete_users_according_to_deletion_date!

# send_hashes
# sleep(10)
# search_emails_and_hashes
# delete_ghosts
# logout
send_additional_warning_mails
end
end
18 changes: 18 additions & 0 deletions app/views/layouts/warning_mail_layout.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
<!DOCTYPE html>
<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<%= stylesheet_link_tag :email %>
</head>

<body>
<div class="container text-center">
<%= email_image_tag('/MaMpf-Logo_96x96.png', class: 'img-fluid mb-2 pt-4 pb-3' ) %>
<div class="jumbotron">
<%= yield %>
</div>
</div>
</body>

</html>
1 change: 0 additions & 1 deletion app/views/mathi_mailer/ghost_email.html.erb

This file was deleted.

10 changes: 10 additions & 0 deletions app/views/user_cleaner_mailer/deletion_email.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
<div class="container">
<div class="blockquote text-start">
<h3>
<%= t('mailer.deletion_subject') %>
</h3>
<p>
<%= t('mailer.deletion_body_html') %>
</p>
</div>
</div>
12 changes: 12 additions & 0 deletions app/views/user_cleaner_mailer/pending_deletion_email.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
<div class="container">
<div class="blockquote text-start">
<h3>
<%= t('mailer.pending_deletion_subject',
num_days_until_deletion: @num_days_until_deletion) %>
</h3>
<p>
<%= t('mailer.pending_deletion_body_html',
num_days_until_deletion: @num_days_until_deletion) %>
</p>
</div>
</div>
13 changes: 13 additions & 0 deletions app/workers/user_cleaner.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
class UserCleanerJob
include Sidekiq::Worker

def perform
# Only run this job in production, not for mampf-experimental or mampf-dev.
# Note that Rails.env.production? is not sufficient in this context
# as both mampf-experimental and mampf-dev also run in production mode.
production_name = ENV.fetch("PRODUCTION_NAME", nil)
return unless production_name == "mampf"

UserCleaner.new.handle_inactive_users!
end
end
8 changes: 0 additions & 8 deletions app/workers/user_cleaner_job.rb

This file was deleted.

5 changes: 5 additions & 0 deletions config/environments/production.rb
Original file line number Diff line number Diff line change
Expand Up @@ -78,4 +78,9 @@

# Do not dump schema after migrations.
config.active_record.dump_schema_after_migration = false

config.after_initialize do
production_name = ENV.fetch("PRODUCTION_NAME", nil)
Rails.logger.info("PRODUCTION_NAME: #{production_name}")
end
end
10 changes: 10 additions & 0 deletions config/initializers/after_sign_in.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# Callback documentation for Warden (used by Devise):
# https://github.com/wardencommunity/warden/wiki/Callbacks#after_authentication
Warden::Manager.after_authentication do |user, _auth, _opts|
# User might have not logged in for a long time, which is why they have a
# deletion date set. If the user logs in, we unset this date to prevent the
# user from being deleted. See the UserCleaner class for more information.
# In case this callback does not work, a safety net is provided by the method
# `unset_deletion_date_for_recently_active_users` in the UserCleaner class.
user.deletion_date = nil
end
16 changes: 16 additions & 0 deletions config/locales/de.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3053,6 +3053,22 @@ de:
en: 'Englisch'
de: 'Deutsch'
mailer:
warning: "MaMpf Warning"
pending_deletion_subject: "%{num_days_until_deletion} Tage bis zur
Löschung Deines MaMpf-Accounts"
deletion_subject: "Account gelöscht"
pending_deletion_body_html: >
Wir haben festgestellt, dass Du Dich in den letzten 6 Monaten nicht in
MaMpf eingeloggt hast. Dein Account wird in %{num_days_until_deletion}
Tagen <strong>gelöscht, sofern Du Dich nicht vorher einloggst</strong>.
Bitte beachte, dass nach der Löschung Deines Accounts alle bei MaMpf
gespeicherten Daten von Dir unwiderruflich verloren gehen.
deletion_body_html: >
<strong>Dein MaMpf-Account wurde vollständig gelöscht, da Du Dich seit
mehr als 6 Monaten nicht eingeloggt hast.</strong> Beachte, dass wir zuvor
mehrere Erinnerungen gesendet haben. Alle Deine bei MaMpf gespeicherten
Daten sind unwiderruflich verloren gegangen. Vielen Dank, dass Du MaMpf
genutzt hast.
notification: 'MaMpf-Benachrichtigung'
notification_header: 'Benachrichtigung von MaMpf'
medium_subject: 'Neues Medium'
Expand Down
Loading

0 comments on commit d0f8f44

Please sign in to comment.