Twitter's first major update to its API went live today.
Here's how it impacts Twit Cleaner.
In short? We can no longer support large users, where "large" currently equals about 50k friends (those you're following), but may drop further if I find the system is still struggling.
Why is this happening?
Well, as you know, Twit Cleaner does a lot of analysis. This requires a LOT of data.
Twitter puts limits on how much data-per-hour you can get from it. Specifically, how many things you can ask for ("requests").
Previously, we were able to make 20,000 requests per hour - and that included any kind of request (e.g. getting your profile image, your user information, your tweets, your friends, who you've talked to, etc.).
Now, each type of request is broken into its own tiny little bucket, and each bucket is very, very small. Most of them are limited to one request per minute.
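To make the change concrete, here's a rough sketch of what per-endpoint bucketing means in practice. The endpoint names, limits, and the `EndpointLimiter` class are all illustrative, not Twitter's actual published values or API - the point is just that each endpoint now has its own small, independent allowance instead of sharing one big hourly pool:

```python
import time

# Hypothetical per-endpoint allowances (requests per hour) - illustrative
# numbers only, not Twitter's real limits.
LIMITS = {
    "statuses/user_timeline": 60,
    "users/lookup": 60,
}

class EndpointLimiter:
    """Toy model of per-endpoint rate limiting: one bucket per endpoint."""

    def __init__(self, limits, window=3600.0):
        self.limits = limits
        self.window = window  # seconds
        self.calls = {ep: [] for ep in limits}  # call timestamps per endpoint

    def try_request(self, endpoint, now=None):
        """Return True and record the call if this endpoint's bucket allows it."""
        now = time.time() if now is None else now
        # Drop calls that have aged out of the window.
        recent = [t for t in self.calls[endpoint] if now - t < self.window]
        self.calls[endpoint] = recent
        if len(recent) >= self.limits[endpoint]:
            return False  # this bucket is exhausted; other buckets are unaffected
        recent.append(now)
        return True

limiter = EndpointLimiter(LIMITS)
# 60 timeline calls in one window succeed, the 61st is refused...
ok = [limiter.try_request("statuses/user_timeline", now=0.0) for _ in range(61)]
print(ok.count(True))  # 60
# ...but a users/lookup call still goes through - the buckets are independent.
print(limiter.try_request("users/lookup", now=0.0))  # True
```

That independence is the whole problem: Twit Cleaner's workload is almost entirely in one of those tiny buckets, so the other buckets' capacity doesn't help.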
Twit Cleaner depends pretty heavily on analysing the tweets that someone has made (not just the easy, fast basic info you can get - bio, profile image, following/follower counts etc). This is why our reports take so much longer than any other unfollowing/analysis service out there. We dig deeper so we can (I like to think) add more value.
In order to do this, we have to make 1.01 (sometimes 2.01) requests (plus a little overhead) per person you're following.
You follow 5k people? We have to make 5,050 requests to Twitter. 5000 of one kind (tweets), 50 of the other (user info).
You follow 100k people? We have to make 101,000 requests.
At 20,000 requests per hour, that meant we could get a 100k report in about five hours, although with a bit of clever jiggery pokery (caching like crazy) we could usually cut that in half - i.e. operating at about the maximum speed Twitter could send data to us anyway.
At 60 requests per hour (and assuming we're similarly clever with caching), doing a report for someone following 100k people would now take 101,000/60/2 ≈ 840 hours - around five weeks. Now, we can definitely get that timeframe down a bit, but it requires an enormous drain on the server, and it's still going to be slow as heck. Oh, plus it'll slow things down for everyone else (despite my best efforts).
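The back-of-the-envelope numbers above can be reproduced in a few lines. The ~1.01-requests-per-friend figure and the rate limits come straight from the post (one timeline fetch per friend, plus one batched user-info lookup per 100 friends); the caching win is modelled as a flat 2x speedup, which is a simplification:

```python
def report_requests(friends, lookup_batch=100):
    # One timeline request per friend, plus one users-lookup request
    # per batch of 100 friends - hence ~1.01 requests per friend.
    return friends + friends // lookup_batch

def report_hours(friends, requests_per_hour, cache_speedup=2.0):
    # Total requests, divided by the hourly allowance, halved by caching.
    return report_requests(friends) / requests_per_hour / cache_speedup

print(report_requests(5_000))    # 5050
print(report_requests(100_000))  # 101000

# Old shared limit of 20,000/hour, no caching: ~5 hours for a 100k account.
print(round(report_requests(100_000) / 20_000, 2))  # 5.05

# New per-endpoint limit of ~60/hour, even with the 2x caching win:
print(round(report_hours(100_000, 60) / 24, 1))  # 35.1 (days)
```

Which is why a hard cap on account size, rather than just a slower queue, ends up being the only workable answer.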
I don't like slow. I really don't like slow.
So, much as I hate to do this - and it sucks, it really does - I'm going to have to stop supporting large users. Why should you be penalised just because you're popular/successful/cute? I don't know. Maybe best to ask Twitter that.
The good news is, most accounts aren't large. The average following count is 500. So, most of you guys will be zippier than ever (not being slowed down by other people hogging server resources).
Since one 100k account uses 200 times as many resources (cpu/bandwidth/database contention/grey hair) as your average account, this is probably one of those "needs of the many outweigh the needs of the few" things. Mostly people will be better off, but I'm still sad not to be able to help larger users more.
Oh, and I've also had to shut down the Retweets section of the website (which was super-cute, I thought). The new version of the API just doesn't support it any more.
Kind of a sad day.
sorry :( thank you for all you do and you still are a rock star no matter what!
Thank you Kathy.. I'll continue doing all I can to make Twitter a better place for everyone :)
thanks for all you do :*)
You're so welcome Sep.. thank you for your support!
Sounds like you are working your socks off for all of us - thanks and don't stress out about the larger users, they will have their hands full just scrolling through the thousands of tweets every day!
Ahh see, that's a very good point. Mostly I feel bad coz many of those guys have been with me from the start, have been huge champions of Twit Cleaner, and I'm going to miss them.
I definitely have my hands full regardless though, you're right!
I am doing a lot with the Twitter API plus some off-the-books scraping using PhantomJS. I have to use them both. However, I only do it on my own accounts - that's the only way to get around it. Maybe you could ask people to pay if they have a lot of followers?
Hmm, interesting. Way back in the super early days, I used to get all my data by scraping (a single HTTP call and a lot of work by me gave me everything I needed). It was super fast and efficient.
Then Twitter changed to "New Twitter", everything became API based, and I had to drop the scraping. With the small-but-noticeable delay in loading a single Twitter page, the combined overhead would grind the system to a halt, I suspect.
Cool that you're getting somewhere with it though. On a one-by-one basis I reckon you could sneak past. Just unfortunately impractical to scale.
I'd hoped that dropping the maximum user size down would help things, but quite literally, every part of my system is still complaining.
Thanks so much for all your hard work! I hope you enjoy whatever you do next!