Expiring Old Accounts
One of the reasons that Joindiaspora (and possibly other old pods) is slow and requires lots of resources to run is that there are tons of old, dead accounts in the system. I don't have the exact numbers in front of me, but even if we only expired accounts that have not signed in in the last three years, that would drastically reduce the database size, which means messages would federate much faster, page loads would be more reliable, and the pod would be more cost effective to run. It would drastically improve the entire experience for people using the Pod, and for any new users that try Diaspora out.
There are two tables specifically which really bog down performance (Person and PostVisibilities), and drive the cost and memory of running a large pod up. I'd like to optimize Diaspora for the community who actively use it, so I'd love this discussion to turn into a plan of action to improve this scenario.
There are a few things that it would be a good idea to expire.
1) Local User accounts that have not signed in for a period of time.
As such, we could also expire...
local person objects
all post visibilities from this account (things that they can see)
contacts (both sides)
2) Expired Pods
One thing we don't keep track of is whether a pod goes down.
a) we don't want to send messages to this pod
b) if we have not had contact with it for some period of time, we should expire all data related to said pod (Person, contacts, post visibilities for local users)
3) Empty accounts just following dhq
4) Any other ideas?
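For item 2, a rough sketch of the two checks in plain Ruby. Everything here is an assumption for illustration: the `RemotePod` struct, the `last_contact_at` field and both thresholds are made up, and nothing like this exists in Diaspora as far as this thread says.

```ruby
# Hypothetical record of a remote pod and when we last heard from it.
RemotePod = Struct.new(:host, :last_contact_at)

WEEK_SECONDS = 7 * 24 * 60 * 60

# a) Only send messages to pods we have heard from recently.
#    The two-week default is an arbitrary placeholder.
def deliverable?(pod, now:, offline_after: 2 * WEEK_SECONDS)
  now - pod.last_contact_at < offline_after
end

# b) Pods that have been silent for much longer are candidates for having
#    their Person, contact and post visibility data expired.
#    The 26-week default is an arbitrary placeholder.
def expirable?(pod, now:, expire_after: 26 * WEEK_SECONDS)
  now - pod.last_contact_at >= expire_after
end
```

Keeping the two thresholds separate means a pod that is only briefly down stops receiving messages without losing any data.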
The goal here is that if we can actually expire a proper amount of the data, JD.com (and most likely other pods) can have small data sets and require fewer resources to run, which makes them more sustainable for the future. I've been paying for JD.com out of my own pocket, but it's starting to become a burden, so I wanted to make sure we found a solution that people found acceptable (and share that process with others).
I'd love all of your thoughts.
maxwell salzberg Mon 4 Aug 2014 5:08PM
Jason,
In theory, implementation sounds good.
For jd.com, it's most likely better to run/schedule batches (say 1000 at a time).
Trying to delete all that data in one process will most likely take forever, so it might be an ongoing thing that would have to be run over the course of many days, but keeping it in chunks that make it easy to stop/start as load increases could be good.
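Roughly what the chunking could look like, as a plain-Ruby sketch. The method name `expire_in_batches` and the `busy` load check are hypothetical; inside the app, ActiveRecord's `find_in_batches` would probably do the slicing instead of an in-memory array.

```ruby
# Hypothetical sketch: work through stale account ids in fixed-size batches
# and stop early when a caller-supplied load check reports the pod is busy.
# Returns the ids that were not processed, so a later run can pick them up.
def expire_in_batches(ids, batch_size: 1000, busy: -> { false })
  remaining = ids.dup
  until remaining.empty?
    break if busy.call                   # back off; resume on the next run
    batch = remaining.shift(batch_size)  # take the next chunk off the front
    batch.each { |id| yield id }         # the caller decides what "expire" means
  end
  remaining
end

# Usage: collect ids instead of really deleting anything.
expired = []
leftover = expire_in_batches((1..2500).to_a, batch_size: 1000) { |id| expired << id }
```

Because the unprocessed ids are returned, the job can be stopped and restarted over several days without redoing work.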
@rich1 you are also right, that is a good catch! That would help as well.
maxwell salzberg Tue 5 Aug 2014 1:29AM
What is the best way to go about finding the correct processes for figuring out
1) what to delete
2) how to do it in a repeatable, humane, cost effective way?
(then)
3) how to implement it.
Jason Robinson Mon 11 Aug 2014 2:41PM
Well I didn't have this in mind actually but sure, with small adaptations a similar rake job could send out warnings and flag accounts (assuming that is what we want to do). Will see once I finish this up whether I could do that too.
goob Thu 14 Aug 2014 10:25AM
This sounds really good. I made a suggestion a few months ago, very much along the lines of what @jasonrobinson suggested, to help the biggest pods clean up their user bases and improve performance:
- Podmin chooses time limit since last activity (default two years seems sensible).
- Emails sent out to addresses in database, in batches, giving the users a set period (30 days default?) to log in to their account - also giving them a link to use in case they have forgotten their log in details.
- After the period has elapsed for each batch, delete the accounts from that batch which have been inactive since that batch of emails was sent.
I've no idea how to achieve this technically, I'm afraid, but if possible, such a feature would be really useful for the network.
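The selection logic behind those three steps might look roughly like this in plain Ruby. This is only a sketch under assumptions: the `Account` struct and its `last_seen_at` and `warned_at` fields are made up for illustration, and sending the emails and deleting the accounts are left out entirely.

```ruby
# Hypothetical account record; the field names are invented for this sketch.
Account = Struct.new(:name, :last_seen_at, :warned_at)

DAY_SECONDS = 24 * 60 * 60

# Steps 1 and 2: accounts inactive for longer than the podmin's limit
# (default two years) that have not been emailed a warning yet.
def accounts_to_warn(accounts, now:, inactive_days: 730)
  cutoff = now - inactive_days * DAY_SECONDS
  accounts.select { |a| a.warned_at.nil? && a.last_seen_at < cutoff }
end

# Step 3: warned accounts whose grace period (default 30 days) has elapsed
# and which have not logged in since the warning was sent.
def accounts_to_delete(accounts, now:, grace_days: 30)
  accounts.select do |a|
    a.warned_at &&
      a.warned_at + grace_days * DAY_SECONDS <= now &&
      a.last_seen_at < a.warned_at
  end
end
```

Comparing `last_seen_at` with `warned_at` is what spares anyone who logged in after receiving the email.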
Jason Robinson Thu 14 Aug 2014 6:18PM
Really at its simplest, just a bunch of rake jobs would do fine IMHO - later they can be built in to the admin UI if needed.
Still haven't finished the "send email to users" rake job thingy - might have a look after that but will take some time tbh.
Jason Robinson Tue 30 Sep 2014 8:07PM
I began doing something to remove old users, haven't tested any of it, just putting together some code.
The idea is to:
- Have a cron job (whenever gem) to send expiry warnings per settings, and to queue actual expirations to sidekiq. To-be-expired users will be flagged as such in the user table too (timestamp when ok to remove).
- Login will check for this timestamp and remove it if it is encountered.
- Sidekiq will process the row and if the expiration timestamp is still there, it will do the expiration.
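In plain Ruby the flow would be something like the sketch below. It is not the code from the branch: `FlaggedUser`, `remove_after` and the three method names are hypothetical, and setting `closed` is only a stand-in for whatever actually deletes the account.

```ruby
# Hypothetical user record; remove_after is the "ok to remove" timestamp.
FlaggedUser = Struct.new(:name, :remove_after, :closed)

# Cron side: after sending the warning, record when removal becomes allowed.
def flag_for_removal(user, now:, grace_seconds:)
  user.remove_after = now + grace_seconds
end

# Login side: any sign-in cancels a pending removal.
def on_login(user)
  user.remove_after = nil
end

# Worker side: re-check the flag at processing time, because the user may
# have logged in after the job was queued. Returns true if the user was removed.
def process_removal(user, now:)
  return false if user.remove_after.nil? || now < user.remove_after
  user.closed = true # stand-in for the real account removal
  true
end
```

The worker checking the timestamp again is the important part, since the queue and the login path can race.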
How does this sound for a basic principle? Also, what exactly would be cleaned? The aim here is to remove bloat from pods (optionally of course). So the removals need to be efficient if the podmin wants, not just little slice here and there.
Just a normal DeleteAccount?
WIP stuff be here: https://github.com/jaywink/diaspora/compare/remove-old-users
I started working on this because joindiaspora is going super slow with all the activity going on :) So input of @maxwellsalzberg appreciated.
Anime Machina Fri 3 Oct 2014 10:35AM
Personally I would like to see such a feature myself, since a lot of people just register and do nothing with their profiles, because they do not read the 'Help' pages to find new friends, or think it's a FB rip-off and lack the knowledge to understand what D* actually is.
Also it would be a good clean up for older pods that have long forgotten members such as Poddery.
Democracy v2 Sun 5 Oct 2014 6:08PM
I would say that performance is a serious topic for Diaspora's future.
I understand that we should do our best to preserve server performance. We cannot go into a frenzy uploading pictures and stuff like on Facebook.
On the other hand I was thinking about the remembrance aspect of social networks, e.g. someone dies in a car accident or dies of old age. We don't want to terminate their accounts.
Either we give enough notice time or think about an alternative funding plan, so a member pays e.g. £1000 in advance, but his/her profile will stay online for 150 years like a gravestone. Then his/her grandchildren can take it over and pay for another 150 years if they want, or just download and archive grandpa's profile on DVDs.
This way members have their privacy and control their content. No one is spying on them without their consent, but pod administrators collect a tiny sum (£0.50 a month) and can afford a much better server in the cloud, so Diaspora does not have to be too frugal with resources.
I am sure many people would pay a tiny sum to preserve their rights and still take part in social networking.
Otherwise you have to use FB or Google, but expose yourself to this marketing, spying moloch that stands behind it.
Guys, do you know if this issue has ever been addressed?
I mean these days, because of the social network phenomenon, most people show off everything they do apart from when they go to the toilet, and they are so afraid that if they don't regularly show off, someone might think their life is boring and low-profile. They also fear death - no more updates. Who is going to put the information about their death on their profile pages?
Jason Robinson Sun 5 Oct 2014 8:14PM
Submitted initial pull for review regarding old user removal feature: https://github.com/diaspora/diaspora/pull/5288
Comments welcome!
@maxwellsalzberg especially as you "requested" this :) Does this (calling user.close_account!) actually even do what is needed (= help pods run on less for longer)?
Jason Robinson Mon 4 Aug 2014 12:49PM
Oh and forgot - of course in the two rake jobs scenario, when a user logs in, any "warning sent" timestamp should be cleared - thus removing the account from possible deletions.