glitchier-soc/app/workers/scheduler/accounts_statuses_cleanup_s...

# frozen_string_literal: true

class Scheduler::AccountsStatusesCleanupScheduler
  include Sidekiq::Worker
  include Redisable

  # This limit is mostly to be nice to the fediverse at large and not
  # generate too much traffic.
  # This also helps limiting the running time of the scheduler itself.
  MAX_BUDGET         = 300

  # This is an attempt to spread the load across remote servers, as
  # spreading deletions across diverse accounts is likely to spread
  # the deletion across diverse followers. It also helps each individual
  # user see some effect sooner.
  PER_ACCOUNT_BUDGET = 5

  # This is an attempt to limit the workload generated by status removal
  # jobs to something the particular server can handle.
  PER_THREAD_BUDGET  = 5

  # These are latency limits on various queues above which a server is
  # considered to be under load, causing the auto-deletion to be entirely
  # skipped for that run.
  LOAD_LATENCY_THRESHOLDS = {
    default: 5,
    push: 10,
    # The `pull` queue has lower priority jobs, and it's unlikely that
    # pushing deletes would cause much issues with this queue if it didn't
    # cause issues with `default` and `push`. Yet, do not enqueue deletes
    # if the instance is lagging behind too much.
    pull: 5.minutes.to_i,
  }.freeze

  sidekiq_options retry: 0, lock: :until_executed, lock_ttl: 1.day.to_i

  def perform
    return if under_load?

    budget = compute_budget
    first_policy_id = last_processed_id

    loop do
      num_processed_accounts = 0

      scope = AccountStatusesCleanupPolicy.where(enabled: true)
      scope.where(Account.arel_table[:id].gt(first_policy_id)) if first_policy_id.present?
      scope.find_each(order: :asc) do |policy|
        num_deleted = AccountStatusesCleanupService.new.call(policy, [budget, PER_ACCOUNT_BUDGET].min)
        num_processed_accounts += 1 unless num_deleted.zero?
        budget -= num_deleted
        if budget.zero?
          save_last_processed_id(policy.id)
          break
        end
      end

      # The idea here is to loop through all policies at least once until the budget is exhausted
      # and start back after the last processed account otherwise
      break if budget.zero? || (num_processed_accounts.zero? && first_policy_id.nil?)

      first_policy_id = nil
    end
  end

  def compute_budget
    # Each post deletion is a `RemovalWorker` job (on `default` queue), each
    # potentially spawning many `ActivityPub::DeliveryWorker` jobs (on the `push` queue).
    threads = Sidekiq::ProcessSet.new.select { |x| x['queues'].include?('push') }.pluck('concurrency').sum
    [PER_THREAD_BUDGET * threads, MAX_BUDGET].min
  end

  def under_load?
    LOAD_LATENCY_THRESHOLDS.any? { |queue, max_latency| queue_under_load?(queue, max_latency) }
  end

  private

  def queue_under_load?(name, max_latency)
    Sidekiq::Queue.new(name).latency > max_latency
  end

  def last_processed_id
    redis.get('account_statuses_cleanup_scheduler:last_account_id')
  end

  def save_last_processed_id(id)
    if id.nil?
      redis.del('account_statuses_cleanup_scheduler:last_account_id')
    else
      redis.set('account_statuses_cleanup_scheduler:last_account_id', id, ex: 1.hour.seconds)
    end
  end
end
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`# frozen_string_literal: true`

			`class Scheduler::AccountsStatusesCleanupScheduler`
			`include Sidekiq::Worker`
Fix single Redis connection being used across all threads (#18135) * Fix single Redis connection being used across all Sidekiq threads * Fix tests 3 years ago			`include Redisable`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago
			`# This limit is mostly to be nice to the fediverse at large and not`
			`# generate too much traffic.`
			`# This also helps limiting the running time of the scheduler itself.`
Change automatic post deletion thresholds and load detection (#24614) 2 years ago			`MAX_BUDGET = 300`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago
Change automatic post deletion thresholds and load detection (#24614) 2 years ago			`# This is an attempt to spread the load across remote servers, as`
			`# spreading deletions across diverse accounts is likely to spread`
			`# the deletion across diverse followers. It also helps each individual`
			`# user see some effect sooner.`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`PER_ACCOUNT_BUDGET = 5`

			`# This is an attempt to limit the workload generated by status removal`
Change automatic post deletion thresholds and load detection (#24614) 2 years ago			`# jobs to something the particular server can handle.`
			`PER_THREAD_BUDGET = 5`

			`# These are latency limits on various queues above which a server is`
			`# considered to be under load, causing the auto-deletion to be entirely`
			`# skipped for that run.`
			`LOAD_LATENCY_THRESHOLDS = {`
			`default: 5,`
			`push: 10,`
			# The `pull` queue has lower priority jobs, and it's unlikely that
			`# pushing deletes would cause much issues with this queue if it didn't`
			# cause issues with `default` and `push`. Yet, do not enqueue deletes
			`# if the instance is lagging behind too much.`
			`pull: 5.minutes.to_i,`
			`}.freeze`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago
Change auto-deletion throttling constants to better scale with server size (#23320) 2 years ago			`sidekiq_options retry: 0, lock: :until_executed, lock_ttl: 1.day.to_i`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago
			`def perform`
			`return if under_load?`

			`budget = compute_budget`
			`first_policy_id = last_processed_id`

			`loop do`
			`num_processed_accounts = 0`

			`scope = AccountStatusesCleanupPolicy.where(enabled: true)`
			`scope.where(Account.arel_table[:id].gt(first_policy_id)) if first_policy_id.present?`
			`scope.find_each(order: :asc) do \|policy\|`
			`num_deleted = AccountStatusesCleanupService.new.call(policy, [budget, PER_ACCOUNT_BUDGET].min)`
			`num_processed_accounts += 1 unless num_deleted.zero?`
			`budget -= num_deleted`
			`if budget.zero?`
			`save_last_processed_id(policy.id)`
			`break`
			`end`
			`end`

			`# The idea here is to loop through all policies at least once until the budget is exhausted`
			`# and start back after the last processed account otherwise`
			`break if budget.zero? \|\| (num_processed_accounts.zero? && first_policy_id.nil?)`
Autofix Rubocop remaining Layout rules (#23679) 2 years ago
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`first_policy_id = nil`
			`end`
			`end`

			`def compute_budget`
Change automatic post deletion thresholds and load detection (#24614) 2 years ago			# Each post deletion is a `RemovalWorker` job (on `default` queue), each
			# potentially spawning many `ActivityPub::DeliveryWorker` jobs (on the `push` queue).
Autofix Rubocop Rails/Pluck (#23730) 2 years ago			`threads = Sidekiq::ProcessSet.new.select { \|x\| x['queues'].include?('push') }.pluck('concurrency').sum`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`[PER_THREAD_BUDGET * threads, MAX_BUDGET].min`
			`end`

			`def under_load?`
Change automatic post deletion thresholds and load detection (#24614) 2 years ago			`LOAD_LATENCY_THRESHOLDS.any? { \|queue, max_latency\| queue_under_load?(queue, max_latency) }`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`end`

			`private`

Change automatic post deletion thresholds and load detection (#24614) 2 years ago			`def queue_under_load?(name, max_latency)`
			`Sidekiq::Queue.new(name).latency > max_latency`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`end`

			`def last_processed_id`
Fix single Redis connection being used across all threads (#18135) * Fix single Redis connection being used across all Sidekiq threads * Fix tests 3 years ago			`redis.get('account_statuses_cleanup_scheduler:last_account_id')`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`end`

			`def save_last_processed_id(id)`
			`if id.nil?`
Fix single Redis connection being used across all threads (#18135) * Fix single Redis connection being used across all Sidekiq threads * Fix tests 3 years ago			`redis.del('account_statuses_cleanup_scheduler:last_account_id')`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`else`
Fix single Redis connection being used across all threads (#18135) * Fix single Redis connection being used across all Sidekiq threads * Fix tests 3 years ago			`redis.set('account_statuses_cleanup_scheduler:last_account_id', id, ex: 1.hour.seconds)`
Add feature to automatically delete old toots (#16529) * Add account statuses cleanup policy model * Record last inspected toot to delete to speed up successive calls to statuses_to_delete * Add service to cleanup a given account's statuses within a budget * Add worker to go through account policies and delete old toots * Fix last inspected status id logic All existing statuses older or equal to last inspected status id must be kept by the current policy. This is an invariant that must be kept so that resuming deletion from the last inspected status remains sound. * Add tests * Refactor scheduler and add tests * Add user interface * Add support for discriminating based on boosts/favs * Add UI support for min_reblogs and min_favs, rework UI * Address first round of review comments * Replace Snowflake#id_at_start with with_random parameter * Add tests * Add tests for StatusesCleanupController * Rework settings page * Adjust load-avoiding mechanisms * Please CodeClimate 3 years ago			`end`
			`end`
			`end`