Delightful Goodbyes: Fully Deleting User Data

Akshay Nathan
September 17, 2019

At Monolist, we’re building the command center for engineers. We integrate with all the tools an engineer uses, aggregate their tasks, and provide intelligence to make sure that nothing slips through the cracks.

Unfortunately, like any company, we sometimes have to say goodbye to one of our users. Creating a delightful experience throughout the product is core to our company culture, and we believe that account deletion should be no exception. While it may seem counterproductive to focus on this user story, we think it’s a dark pattern to make deletion difficult. It frustrates me when it’s hard to delete my account from certain services, and we don’t want our users to feel the same way.

However, because Monolist is built on integrations with different tools, we rely on a complex data model that makes deleting all of a user’s data, fully and efficiently, rather difficult.

In this post we’ll focus on ensuring that a user’s data is fully deleted, which is important for compliance with GDPR and other regulations, but is also just the right thing to do.

Starting Simple

In the beginning, our user deletion service was as simple as possible. (Note: The code snippets in this blog post are written in Ruby using Rails and ActiveRecord, but the approach should carry over regardless of language, framework, or ORM.)

class DeleteUser
  def call(user)
    user.destroy!
  end
end

The destroy! method in ActiveRecord deletes a record from the database and executes all relevant callbacks. Rails provides an easy way to handle associations as follows.

class User < ApplicationRecord
  has_many :action_items, dependent: :destroy
end

This ensures that when a user is destroyed, their action items will be destroyed as well.

The problem with this approach, however, is that it requires us to be very careful about adding “dependent: :destroy” to our associations. Let’s take a look at the action item model.

class ActionItem < ApplicationRecord
  belongs_to :user

  has_one :properties, dependent: :destroy
  has_many :notifications
end

Do you see the problem here? When we call user.destroy! now, we will correctly destroy all action items and their properties, but we won’t destroy notifications! This means that we’re still storing user data, even after they’ve deleted their account.
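To make the failure mode concrete, here is a minimal plain-Ruby sketch (no Rails required) of a cascade that only follows declared dependencies. The `CASCADES` map, the `build_rows` store, and `destroy_cascade` are hypothetical names used for illustration; they mimic the idea behind dependent: :destroy, not ActiveRecord's actual implementation.

```ruby
# Toy in-memory "database": table name => array of row hashes.
def build_rows
  {
    users:         [{ id: 1 }],
    action_items:  [{ id: 10, user_id: 1 }],
    properties:    [{ id: 100, action_item_id: 10 }],
    notifications: [{ id: 200, action_item_id: 10 }]
  }
end

# Declared cascades: parent table => list of [child table, foreign key].
# notifications is deliberately missing, mirroring the ActionItem model above.
CASCADES = {
  users:        [[:action_items, :user_id]],
  action_items: [[:properties, :action_item_id]]
}.freeze

# Deletes the row with the given id, following only the declared cascades.
def destroy_cascade(rows, table, id)
  CASCADES.fetch(table, []).each do |child_table, foreign_key|
    child_ids = rows[child_table].select { |r| r[foreign_key] == id }.map { |r| r[:id] }
    child_ids.each { |child_id| destroy_cascade(rows, child_table, child_id) }
  end
  rows[table].reject! { |r| r[:id] == id }
end

rows = build_rows
destroy_cascade(rows, :users, 1)

# Any table that still has rows is holding data for a deleted user.
leftover = rows.reject { |_table, table_rows| table_rows.empty? }.keys
puts "Tables still holding data: #{leftover.inspect}"
```

In this toy model the notification row survives the deletion because nothing declared it as a dependent, which is the same gap the ActionItem model has.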

Adding Testing

An obvious solution to this problem is to use foreign keys. Foreign keys enforce referential integrity: a parent record (an action item) cannot be deleted while its associated records (notifications and properties) still reference it.

However, note that foreign keys still depend on us to actually create them! If we forget to create a foreign key, and forget to add { dependent: :destroy } to an association, we’re back at square one.

We quickly realized that we needed a comprehensive way to test deletion. We needed our test to be self-updating: its efficacy shouldn’t depend on the engineer who adds an association remembering to update the test. Here’s a simplified version of what we came up with:

describe "DeleteUser" do
  let(:user) { User.create! }
  let(:exclusions) { [:blacklisted_ips, :marketing_email_types] }

  before(:each) do
    action_item = ActionItem.create!({ user: user })

    Notification.create!({ action_item: action_item })
    Properties.create!({ action_item: action_item })
  end

  it "should exhaustively delete user" do
    tables = ActiveRecord::Base.connection.tables.reject { |s| exclusions.include?(s.to_sym) }

    tables.each do |table|
      expect(ActiveRecord::Base.connection.execute("SELECT count(*) from #{table}").first["count"])
        .to(be > 0, "#{table} is empty")
    end

    DeleteUser.new.call(user)

    tables.each do |table|
      expect(ActiveRecord::Base.connection.execute("SELECT count(*) from #{table}").first["count"])
        .to(eq(0), "#{table} is not empty")
    end
  end
end

Let’s walk through this test line by line.

tables = ActiveRecord::Base.connection.tables.reject { |s| exclusions.include?(s.to_sym) }

First, we retrieve the full list of tables in our database. We exclude some tables from our test. Exclusions are tables that aren’t associated with a specific user. For example, we have a dynamic list of IP addresses that we blacklist, which will not be mutated when a user deletes their account.
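The selection step can be sketched in plain Ruby, without a database. The `tables_to_check` helper and the sample table names are hypothetical. `schema_migrations` and `ar_internal_metadata` are Rails' own bookkeeping tables; since the table list covers every table in the database, a real test would probably need to skip them as well, which is an assumption added here rather than something the simplified spec shows.

```ruby
# Tables Rails maintains for itself; they never hold user data.
RAILS_INTERNAL_TABLES = %w[schema_migrations ar_internal_metadata].freeze

# Returns the table names the deletion test should inspect.
# all_tables: table names as strings (the shape a table listing returns).
# exclusions: symbols for tables that are not tied to a specific user.
def tables_to_check(all_tables, exclusions)
  all_tables.reject do |table|
    RAILS_INTERNAL_TABLES.include?(table) || exclusions.include?(table.to_sym)
  end
end

# Sample input standing in for the real table list.
all_tables = %w[users action_items notifications properties blacklisted_ips schema_migrations]
exclusions = [:blacklisted_ips, :marketing_email_types]

checked = tables_to_check(all_tables, exclusions)
puts checked.inspect
```

Listing an exclusion for a table that does not exist (here `marketing_email_types`) is harmless, because the filter only ever removes names that are actually present.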

tables.each do |table|
  expect(ActiveRecord::Base.connection.execute("SELECT count(*) from #{table}").first["count"])
    .to(be > 0, "#{table} is empty")
end

Next, we verify that our test is set up properly: every table we expect to have data must actually have data. This step is extremely important. Consider an engineer building a new feature. If they add a new model, the test will fail in CI, because the new table will have no entries. That failure forces them to update the DeleteUser test to populate the new table, and to make sure the new association is properly destroyed.

Next, we call our service: DeleteUser.new.call(user)

tables.each do |table|
  expect(ActiveRecord::Base.connection.execute("SELECT count(*) from #{table}").first["count"])
    .to(eq(0), "#{table} is not empty")
end

Finally, we assert that all our tables are empty, which implies that our test user has been properly deleted.
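Taken together, the two assertions reduce to a pair of checks over per-table row counts. Here is a minimal sketch in plain Ruby, assuming the counts have already been collected into hashes; `deletion_report`, the `comments` table, and the sample numbers are hypothetical and are not part of the spec above.

```ruby
# before_counts / after_counts: table name => row count, taken before and
# after the deletion service runs.
def deletion_report(before_counts, after_counts)
  {
    # Tables the test never populated, so deletion was never exercised on them.
    unseeded: before_counts.select { |_table, count| count.zero? }.keys,
    # Tables that still contain rows after the user was deleted.
    leftover: after_counts.select { |_table, count| count.positive? }.keys
  }
end

# "comments" stands in for a newly added table nobody seeded in the test;
# "notifications" stands in for a table with a missing cascade.
before_counts = { "users" => 1, "action_items" => 1, "notifications" => 1, "comments" => 0 }
after_counts  = { "users" => 0, "action_items" => 0, "notifications" => 1, "comments" => 0 }

report = deletion_report(before_counts, after_counts)
puts report.inspect
```

A non-empty `unseeded` list corresponds to the first assertion failing (the test setup is stale), and a non-empty `leftover` list corresponds to the second (user data survived deletion).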

Conclusion

While this test may seem simple, we’ve found that it’s an effective way for us to verify that we’re fully deleting a user’s data upon their request. When we run into false positives, we add tables to the exclusions list. Because every table is checked unless it is explicitly excluded, this opt-out approach forces our engineering team to consider the implications of every database association we add, while still iterating quickly.

However, this is only one facet of building a delightful deletion experience. In the next post in this series, we’ll talk about what we learned about deletion performance, and how we took our user deletion interaction from minutes to seconds.

