Expiring Old Accounts

maxwell salzberg

One of the reasons that Joindiaspora (and possibly other old pods) is slow and requires lots of resources to run is that there are tons of old, dead accounts in the system. I don't have the exact numbers in front of me, but even if we only expired accounts that have not signed in during the last three years, that would drastically reduce the database size, which means messages would federate much faster, page loads would be more reliable, and the pod would be more cost effective to run. It would drastically improve the entire experience for people using the pod, and for any new users who try Diaspora out.

There are two tables specifically which really bog down performance (Person and PostVisibilities), and drive the cost and memory of running a large pod up. I'd like to optimize Diaspora for the community who actively use it, so I'd love this discussion to turn into a plan of action to improve this scenario.

There are a few things that it would be a good idea to expire.

1) Local user accounts that have not signed in for a period of time.
As such, we could also expire...
local person objects
all post visibilities from this account (things that they can see)
contacts (both sides)

2) Expired Pods
One thing we don't keep track of is whether a pod has gone down.
a) we don't want to send messages to this pod
b) if we have not had contact with it for some period of time, we should expire all data related to said pod (Person, contacts, post visibilities for local users)

3) Empty accounts just following dhq

4) Any other ideas?

The goal here is that if we can actually expire a proper amount of the data, JD.com (and most likely other pods) can have small data sets and require fewer resources to run, which makes them more sustainable for the future. I've been paying for JD.com out of my own pocket, but it's starting to become a burden, so I wanted to make sure we found a solution that people found acceptable (and share that process with others).

I'd love all of your thoughts.


Ravenbird Sun 3 Aug 2014 8:36PM

I don't like deleting old accounts while there is no way to migrate accounts and data from one pod to another. I got a new account some weeks ago, but my old account has more than 3000 posts and it wouldn't be nice to lose all of them.


Jonne Haß Sun 3 Aug 2014 8:42PM

One interesting approach could be implementing account moves by building an exportable archive that can fully restore an account on any pod. If we have that, one could build a mechanism that automatically builds such an archive for old accounts, uploads it to some location and then deletes the account's data. When the user then signs in again, they could download that archive, or it could even be restored automatically on that pod.

No matter what approach we settle on here, the big question is who is going to implement it.


Flaburgan Sun 3 Aug 2014 8:50PM

@maxwellsalzberg you cannot imagine how happy I am to see you opening this discussion. I opened #4183 about that a while ago but unfortunately we saw no progress on this.

IMO we can safely delete :
* Users who never signed up (they were invited but never clicked on the link)
* Pods which didn't respond for a long time (one year?)

We can also delete users who didn't sign in for more than (period to define), but an email should be sent to them first.

My opinion about this is to propose these actions in the admin panel of the pod:
* Clean pods which didn't respond for more than [period selectable by the podmin, min 6 months]
* Clean users who didn't sign in for more than [period selectable by the podmin, min 6 months]
* Mail users who didn't sign in for more than [period selectable by the podmin]

The mail can be to alert and ask to connect, something like "hey, diaspora* was really improved since the last time. wanna have a look again? If you're not going to use our service again, please delete your account to improve our performance".
Or a mail to warn that the accounts would be deleted "hey, you didn't sign in for more than a year, your account will be deleted if you don't log in before 15 days"


Adrenalin Sun 3 Aug 2014 9:17PM

As Ravenbird said…

I have been desperately waiting for more than a year for a tool to move the postings from my old account into my new one (thousands of postings). It would be sad to lose them. I wish I could help work on such a tool, but I have no clue about programming :(


Flaburgan Sun 3 Aug 2014 9:34PM

Well, maybe we could add a criterion: "accounts that never posted anything".


maxwell salzberg Sun 3 Aug 2014 10:31PM

certainly we would not delete any info from people who wanted their data... all they would have to do is to just log in to JD.com (or whatever pod) ONCE in the last 3 years. (or whatever timeframe we agree on)

This way we could still dogear accounts to be migrated.


aj Mon 4 Aug 2014 2:02AM

in the case of a private pod where there aren't a lot of users, it is often the size of the participations table that bloats the DB

it would be handy to have a function to clear participations more than a year old, they are too far down in the activity feed to ever view them anyway
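A function like the one asked for above could be as small as a single dated DELETE. This is only a sketch against an in-memory SQLite database: the `participations` table here has a made-up two-column layout with a `created_at` text column, which is an assumption and not diaspora*'s actual schema.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_participations(conn, now, max_age_days=365):
    """Delete participation rows older than max_age_days; return the number removed."""
    # ISO-8601 strings in the same format sort chronologically, so a text compare works.
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    cur = conn.execute("DELETE FROM participations WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Hypothetical table layout, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE participations (id INTEGER PRIMARY KEY, created_at TEXT)")
now = datetime(2014, 8, 4)
conn.executemany(
    "INSERT INTO participations VALUES (?, ?)",
    [
        (1, (now - timedelta(days=400)).isoformat()),  # older than a year
        (2, (now - timedelta(days=30)).isoformat()),   # recent
    ],
)
removed = purge_old_participations(conn, now)
remaining = [row[0] for row in conn.execute("SELECT id FROM participations ORDER BY id")]
```

The one-year cutoff is a parameter so a podmin could pick a different age.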


Theatre-X Mon 4 Aug 2014 5:16AM

One of the things I've noticed is that some of the pods (e.g. Diasporg) give an authentication error even though the login creds are correct. That happened with my first account and has happened approx. 3 times to a friend of mine who eventually gave up on Diaspora because of it.

I would contact some of the account owners somehow. In my case, delete my old account on Diasporg and my account on joindiaspora. I really don't give a shit.


[deactivated account] Mon 4 Aug 2014 10:13AM

Confirmed what @aj says re smaller DBs blowing up on the "Likes" etc, it's completely out of all proportion. Did I see a cleanup script for this somewhere?


Jason Robinson Mon 4 Aug 2014 12:48PM

@maxwellsalzberg thank you for this initiative - absolutely, it is very important to start optimizing the network structure, as it's getting old enough to suffer from never having been optimized.

Unlike commercial providers, the diaspora* network is afaik fully powered by private people from their own pockets. Thus we should absolutely offer tools for podmins to clean data and make sure that they can run their pod on as small a hosting plan as possible. This will not only ensure pods run for longer, but also make it easier for new podmins to start their journey.

My suggestion for user deletions;

  • Make two rake jobs. The first rake job would send a warning email to accounts that would be closed.
  • The date the warning was sent would be stored to the Users table as a timestamp.
  • The second rake job would then delete these accounts where "warning sent timestamp + configurable period < current timestamp".

Why send a warning email? Two reasons. Firstly, the obvious, to warn the user. In some cases, as indicated by @saschamorr, the user could be interested in keeping the account.
But the second reason is more important IMHO. By sending the warning email, we are also contacting the user and saying "Hey! We're still here". From the management work I do with the diaspora* project social media accounts, it seems likely that 90% of those 1 million created accounts probably think diaspora* has died.

Whatever way this is done, it would be great to lessen the burden of running a pod. Optimizing joindiaspora.com is a perfect task to achieve that for other pods too ;)


Jason Robinson Mon 4 Aug 2014 12:49PM

Oh and forgot - of course in the two rake jobs scenario, when a user logs in, any "warning sent" timestamp should be cleared - thus removing the account from possible deletions.
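The two-job scheme plus the reset on login can be sketched roughly like this. It is plain Python rather than real rake tasks, and the `User` class, its `warned_at` field and the function names are all made up for illustration; nothing here is diaspora* code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class User:
    name: str
    last_seen: datetime
    warned_at: Optional[datetime] = None  # hypothetical "warning sent" timestamp


def send_warnings(users, now, inactive_after):
    """Job 1: flag (and, in a real job, email) inactive users not yet warned."""
    warned = []
    for user in users:
        if user.warned_at is None and now - user.last_seen > inactive_after:
            user.warned_at = now
            warned.append(user.name)
    return warned


def delete_expired(users, now, grace):
    """Job 2: drop users where warned_at + grace < now; return the survivors."""
    return [u for u in users if u.warned_at is None or u.warned_at + grace >= now]


def on_login(user, now):
    """Logging in clears the flag, taking the account out of the deletion set."""
    user.last_seen = now
    user.warned_at = None


now = datetime(2014, 8, 4)
users = [
    User("alice", datetime(2010, 1, 1)),
    User("bob", datetime(2010, 1, 1)),
    User("carol", datetime(2014, 7, 1)),
]
warned = send_warnings(users, now, inactive_after=timedelta(days=3 * 365))
on_login(users[1], now + timedelta(days=5))  # bob reacts to the warning
survivors = delete_expired(users, now + timedelta(days=31), grace=timedelta(days=30))
survivor_names = [u.name for u in survivors]
```

Here alice and bob get warned, bob logs in and is kept, and only alice is removed once the 30-day grace period has passed.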


maxwell salzberg Mon 4 Aug 2014 5:08PM


In theory, implementation sounds good.

for jd.com, it is most likely better to run/schedule batches (say 1000 at a time)

Trying to delete all that data in one process will most likely take forever, so it might be an ongoing thing that would have to be run over the course of many days, but keeping it in chunks that make it easy to stop/start as load increases could be good
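A rough sketch of the chunked, stoppable run described above. The `delete_batch` and `should_stop` callbacks are hypothetical stand-ins for the real deletion and for some load check; the function returns whatever is left so a later run can pick up from there.

```python
def delete_in_batches(ids, batch_size, delete_batch, should_stop):
    """Delete ids in chunks; return the ids not yet processed so a later run can resume."""
    pending = list(ids)
    while pending and not should_stop():
        batch, pending = pending[:batch_size], pending[batch_size:]
        delete_batch(batch)
    return pending


# Pretend the load check asks us to stop after two batches.
batches = []
leftover = delete_in_batches(
    range(2500),
    batch_size=1000,
    delete_batch=batches.append,
    should_stop=lambda: len(batches) >= 2,
)

# A later run finishes the remainder.
rest = delete_in_batches(leftover, 1000, batches.append, lambda: False)
```

Checking `should_stop` only between batches means a batch is never abandoned halfway.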

@rich1 you are also right, that is a good catch! That would help as well.


maxwell salzberg Tue 5 Aug 2014 1:29AM

What is the best way to go about finding the correct processes for figuring out

1) what to delete
2) how to do it in a repeatable, humane, cost effective way?
3) how to implement it.


Flaburgan Sun 10 Aug 2014 8:04PM

@jasonrobinson just opened a PR to send an email to all users.


Jason Robinson Mon 11 Aug 2014 2:41PM

Well I didn't have this in mind actually but sure, with small adaptations a similar rake job could send out warnings and flag accounts (assuming that is what we want to do). Will see once I finish this up whether I could do that too.


goob Thu 14 Aug 2014 10:25AM

This sounds really good. I made a suggestion a few months ago, very much along the lines of what @jasonrobinson suggested, to help the biggest pods clean up their user bases and improve performance:

  1. Podmin chooses time limit since last activity (default two years seems sensible).
  2. Emails sent out to addresses in database, in batches, giving the users a set period (30 days default?) to log in to their account - also giving them a link to use in case they have forgotten their log in details.
  3. After the period has elapsed for each batch, delete the accounts from that batch which have been inactive since that batch of emails was sent.
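Step 3 above hinges on one per-batch check: once the grace period for a batch is over, delete only the accounts that have shown no activity since that batch's emails went out. A minimal sketch, where the `Batch` record and the `last_activity` mapping are invented for illustration:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Batch:
    sent_at: datetime  # when this batch of warning emails went out
    user_ids: list


def accounts_to_delete(batch, last_activity, now, grace=timedelta(days=30)):
    """Return ids from the batch that stayed inactive, once the grace period is over."""
    if now < batch.sent_at + grace:
        return []  # still inside the grace period, delete nothing yet
    return [uid for uid in batch.user_ids if last_activity[uid] < batch.sent_at]


sent = datetime(2014, 8, 14)
batch = Batch(sent_at=sent, user_ids=[1, 2, 3])
last_activity = {
    1: datetime(2011, 5, 1),
    2: datetime(2014, 8, 20),  # logged in after getting the email
    3: datetime(2012, 1, 1),
}
too_early = accounts_to_delete(batch, last_activity, sent + timedelta(days=10))
due = accounts_to_delete(batch, last_activity, sent + timedelta(days=31))
```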

I've no idea how to achieve this technically, I'm afraid, but if possible, such a feature would be really useful for the network.


Jason Robinson Thu 14 Aug 2014 6:18PM

Really at its simplest, just a bunch of rake jobs would do fine IMHO - later they can be built in to the admin UI if needed.

Still haven't finished the "send email to users" rake job thingy - might have a look after that but will take some time tbh.


Jason Robinson Tue 30 Sep 2014 8:07PM

I began doing something to remove old users, haven't tested any of it, just putting together some code.

The idea is to;

  • Have a cron job (whenever gem) to send expiry warnings per settings, and to queue actual expirations to sidekiq. To be expired users will be flagged as such in user table too (timestamp when ok to remove).
  • Login will check for this timestamp and remove it if it is encountered.
  • Sidekiq will process the row and if expiration timestamp is still there, it will do the expiration
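The flow in the three points above, reduced to a toy sketch: a `deque` stands in for the sidekiq queue and a dict for the user table, and the `remove_after` field name is made up. Putting not-yet-due jobs back on the queue is just how this sketch handles timing, not a claim about how the actual branch does it.

```python
from collections import deque
from datetime import datetime, timedelta


def schedule(users, now, warn_after, grace, queue):
    """Cron step: flag inactive users with a removal timestamp and enqueue them."""
    for uid, user in users.items():
        if user["remove_after"] is None and now - user["last_seen"] > warn_after:
            user["remove_after"] = now + grace  # a warning email would go out here
            queue.append(uid)


def login(users, uid, now):
    """Login step: clear the removal timestamp if there is one."""
    users[uid]["last_seen"] = now
    users[uid]["remove_after"] = None


def work(users, queue, now):
    """Worker step: expire only accounts whose timestamp is still set and due."""
    removed = []
    for _ in range(len(queue)):
        uid = queue.popleft()
        flag = users.get(uid, {}).get("remove_after")
        if flag is None:
            continue  # user logged in (or is already gone), drop the job
        if flag > now:
            queue.append(uid)  # not due yet, look again on a later pass
            continue
        del users[uid]
        removed.append(uid)
    return removed


t0 = datetime(2014, 9, 30)
users = {
    "a": {"last_seen": datetime(2011, 1, 1), "remove_after": None},
    "b": {"last_seen": datetime(2011, 1, 1), "remove_after": None},
    "c": {"last_seen": datetime(2014, 9, 1), "remove_after": None},
}
queue = deque()
schedule(users, t0, timedelta(days=730), timedelta(days=30), queue)
queued = list(queue)
login(users, "b", t0 + timedelta(days=3))
removed = work(users, queue, t0 + timedelta(days=31))
```

The important bit is that the worker re-reads the timestamp when it runs, so a login between queueing and processing saves the account.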

How does this sound for a basic principle? Also, what exactly would be cleaned? The aim here is to remove bloat from pods (optionally of course). So the removals need to be efficient if the podmin wants, not just a little slice here and there.
Just a normal DeleteAccount?

WIP stuff be here: https://github.com/jaywink/diaspora/compare/remove-old-users

I started working on this because joindiaspora is going super slow with all the activity going on :) So input of @maxwellsalzberg appreciated.


[deactivated account] Fri 3 Oct 2014 10:35AM

Personally I would like to see such a feature myself, since a lot of people just register and do nothing with their profiles, because they do not read the 'Help' pages to find new friends, or they think it's a FB rip-off and lack the knowledge to understand what D* actually is.

Also it would be a good clean up for older pods that have long forgotten members such as Poddery.


Democracy v2 Sun 5 Oct 2014 6:08PM

I would say that performance is a serious topic for Diaspora's future.
I understand that we should do our best to preserve server performance. We cannot go frenzy with uploading pictures and stuff like on Facebook.
On the other hand I was thinking about remembrance aspect of social networks e.g. someone dies in a car accident or dies of old age. We don't want to terminate their accounts.
Either we give enough notice time or think about an alternative funding plan, so a member pays e.g. £1000 in advance, but his/her profile will stay online for 150 years like a gravestone. Then his/her grandchildren can take it over and pay for another 150 years if they want, or just download and archive grandpa's profile on DVDs.
This way members have their privacy and control their content. No one is spying on them without their consent, but pod administrators collect a tiny sum (£0.50 a month) and can afford a much better server in the cloud, so Diaspora does not have to be too frugal with resources.
I am sure many people would pay a tiny sum to preserve their rights and still take part in social networking.
Otherwise you have to use FB or Google, but expose yourself to this marketing, spying moloch that stands behind it.

Guys, do you know if this issue has ever been addressed?
I mean, these days, because of the social network phenomenon, most people show off everything they do apart from going to the toilet, so they are afraid that if they don't regularly show off, someone will think their life is boring and low-profile. They also fear death - no more updates. Who is going to put the information about his/her death on their profile pages?


Jason Robinson Sun 5 Oct 2014 8:14PM

Submitted initial pull for review regarding old user removal feature: https://github.com/diaspora/diaspora/pull/5288

Comments welcome!

@maxwellsalzberg especially as you "requested" this :) Does this (calling user.close_account!) actually even do what is needed (= help pods run on less for longer)?