A few new sub categories


Posted by Si on 19/05/11 in Improvements

I’ve added some new sub-categories to the reports.

App Spam

Accounts where more than 50% of their tweets are auto-generated by an application. Examples of this include paper.li, 4sq, blip.fm, RunKeeper, miso etc.

Uses Advertising Networks

Accounts that are being paid to put tweets into their stream. Advertising networks include organisations such as MyLikes, ad.ly, Magpie, Sponsored Tweets, etc.

Self Obsessed

Accounts where more than 50% of the tweets are about themselves. So, either starting their tweets with “I”, “I’ve”, “I’m”, “I’d” or retweeting things people have said about them.

Relatively Unpopular

People who follow more than 3.3 times as many people as follow them. Or, put another way, those who have fewer than 30% of the people they follow following them back. Sometimes this is just because someone likes to follow lots of people, but occasionally it can also show someone (eg spam bots) whose behaviour is off-putting or boring, so very few people choose to follow them back.

The Self Obsessed and Relatively Unpopular sub-categories go into a new category, called “Not Very Interesting.” It will appear at the bottom of your reports.
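For the curious, here is a rough sketch of how those two checks could look in Python. It is an illustration only, not the actual Twit Cleaner code: the function names, the regular expression and the input shapes (a list of tweet texts, plus following and follower counts) are all assumptions. It also leaves out the "retweeting things people have said about them" part of Self Obsessed.

```python
import re

# Assumed pattern: the tweet opens with "I", "I've", "I'm" or "I'd"
# (straight or curly apostrophe).
SELF_OPENER = re.compile(r"^\s*I(?:['’](?:ve|m|d))?\b")


def is_self_obsessed(tweets, threshold=0.5):
    """True when more than `threshold` of the tweets open with a first-person word."""
    if not tweets:
        return False
    hits = sum(1 for text in tweets if SELF_OPENER.match(text))
    return hits / len(tweets) > threshold


def is_relatively_unpopular(following, followers, max_ratio=3.3):
    """True when the account follows more than `max_ratio` times as many people as follow it."""
    if followers == 0:
        # Nobody follows back at all; flag the account if it follows anyone.
        return following > 0
    return following / followers > max_ratio
```

The two ways of stating the Relatively Unpopular rule line up because 1 / 3.3 is roughly 30%. The sketch compares raw follower and following counts, which only approximates "following back".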

As always, this is informational. It’s always up to you who you want to follow.

Personally, I follow (& will always follow) a whole bunch of people that appear on my report. Why? Because they’re entertaining, informative and I like them. Free will – it’s not an accident!

Bugfix: Reports now more thorough


Posted by Si on 27/03/11 in Improvements

Today someone told me their report didn’t appear to be updating.

It took quite some digging to find what was happening (since everything was operating normally) but I finally discovered what the problem was.

We were only updating our records for accounts that had already been flagged to appear on a report.

In other words, we weren’t keeping up with how people’s behaviour might change over time.

So, commenting out a single bone-headed line of SQL (What was my obviously under-caffeinated brain thinking? What?!? What??!), and voila, everything is now as it should be. It means reports will be slightly slower (in this case, this is a good thing) – but only because we’ll now be pulling down much more information from Twitter, which takes time.

The upshot of this? Your reports will now have much, much more detail on them. I ran it on my personal account (@sidawson) and my report went from having 47 people on it to 190 (I’ve been deliberately leaving them there so I can test the new version).

If you’ve run a report in, ohh, the last few months, might be worth running it again now.

My apologies for not noticing this sooner. It’s all very silly.

It’s been a heck of a long time since there’s been much external perception of progress around here.

Today, that all changes. Well, almost.

See, when I first started Twit Cleaner, I wanted to get something useful out there as soon as possible. A Minimum Viable Product, it’s called. That worked well enough, but when I started looking at moving to a one-click-one-unfollow model, I realised that the current infrastructure – the way I’d designed things – simply wasn’t going to work. It was possible to do new things, but it would have been horrible, painful & slow.

I made some (with hindsight, of course) silly technical choices, & they came back to bite me on the ass.

So, for the past 7 months I’ve been going through redesigning the entirety of Twit Cleaner, more or less from the ground up. The first of that giant chunk of work rolled out yesterday. Believe it or not, the previous version had almost everything just shoved into a giant directory tree*. So yes, that’s a folder with many, many million files in it. It worked ok for the one task it was designed for, but it seriously hampered the ease & speed with which I could develop any neat new tools. There were a bunch of other bad technical decisions, but that was the key one.

Now, everything is in a big shiny database. Which has its own issues, of course (everything does) but I’ll iron those out over the next couple of days. Once that’s over, & the database is fully loaded up (it’s happening as we speak, and looks like it might take a few days to complete so expect the site to be a little shaky until that’s finished, please be patient), it will smooth the way to quickly & easily roll out a bunch of new tools to help you manage & explore your Twitter life. Oh yes, I have many, many great ideas I’ve been working on.

On the outside, things may have been serene, unbroken, just meandering along like a duck floating on a pond.. but underneath I’ve been, just like a duck, paddling furiously seven days a week all hours of the day & night to get things working just the way they should be – and to get you guys the help you deserve.

Unfortunately, like a duck, there’s not much to look at just yet. Oh, except reports will be much, much faster.

*If you’re really curious, the very first version of Twit Cleaner used to run on my desktop machine at home in Melbourne, then copy things furiously back & forth to the web server, which at that time was in London. Now THAT was nutty.

Identifying people that are high volume has been something that has taken the longest time to really settle down.

I’m getting much happier with this latest (the 3rd? 4th?) incarnation.

If you remember, the previous version simply identified anyone that tweeted on average more than 50 times in a day. That was much more reliable than earlier versions, but suffered from one major limitation. There’s a huge difference between someone who tweets “hello world!” in the morning, then uses the other 49 tweets to chat to their friends, vs someone that just blasts out junk 50 times a day.

The difference is – how many of those tweets are public?

Why are high volume tweeters even a problem? Well, this is something that people tend to forget once they start following more than a few thousand people. When you’re following that many people, there are so many tweets flying past it’s mostly a blur. So, if you tweet like crazy who cares, it gets lost in the blur, right?

What is forgotten is this: Of the active users on Twitter, most people only follow 200-400 others. One high volume user can flood an entire tweetstream, making it impossible to connect with anyone else.

With that in mind, the new high volume algorithm works like this. If you tweet publicly (ie, anything other than a reply) more than 24 times a day, you’re listed as high volume. This isn’t completely accurate, since if you have any overlapping friends, you’ll see their conversations with them too, but it’s a good estimate.

Once an hour may not sound like a lot, but once you factor in work, sleep, play – oh, and the fact that this is only public tweets, it’s an absolute ton. So talk, talk away! Just connect, make some friends! Don’t blather on about yourself all day :)
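As a minimal sketch of the rule described above (not the real implementation), assume each tweet is a dict with a hypothetical `is_reply` flag and that we know how many days the sample covers:

```python
def is_high_volume(tweets, days_observed, limit_per_day=24):
    """True when the account averages more than `limit_per_day` public tweets a day.

    `tweets` is a list of dicts with an assumed boolean key "is_reply";
    anything that is not a reply counts as public.
    """
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    public = [t for t in tweets if not t["is_reply"]]
    return len(public) / days_observed > limit_per_day
```

Under this rule, someone who posts one “hello world!” and 49 replies in a day is left alone, while someone who posts 50 non-replies in a day is flagged.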

I’ve added a new category to the bottom of the reports, “Little Original Content.”

This covers two areas:

People who retweet 70% or more of the time

Of course, some people do find the best stuff out there, but in general, if someone is only ever RTing things by other people – why not just follow the other person? This is also something that is done a lot by spam bots, to make them appear ‘more human.’

People who post quotes more than 50% of the time

Similarly to retweeting, spam bots often intersperse their crap with quotes. It’s a zero effort way for them to have ‘fresh’ content. In reality though, if they’re quoting Epicurus, this probably isn’t something you need to be getting second-by-second Twitter updates on, the guy’s been dead 2300 years!

That said, as with everything on the reports, there will always be those you choose to follow that fit into the above categories (eg, I follow a couple of accounts that post nothing but quotes). Just click their icons & they’ll be saved.

If you don’t want to unfollow any of them, simply click the headings, & the entire category will be saved. As easy as ever!
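The two thresholds behind “Little Original Content” can be sketched like this. It assumes each tweet has already been labelled as a retweet, a quote or something else. How that labelling is done is not covered here, and the labels themselves are made up for the example.

```python
def has_little_original_content(kinds):
    """`kinds` holds one assumed label per tweet: "retweet", "quote" or "other"."""
    if not kinds:
        return False
    total = len(kinds)
    retweet_share = kinds.count("retweet") / total
    quote_share = kinds.count("quote") / total
    # Retweets: 70% or more of the time. Quotes: more than 50% of the time.
    return retweet_share >= 0.7 or quote_share > 0.5
```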

Given the number of people that have had their accounts hacked recently, there are a lot of people getting very nervous about entering their Twitter username & password anywhere.

I also had a bit of a look at the login process & realised – once the login window pops up, it could be pointing anywhere. I.e. without digging deeply into things (e.g., even View|Source on the page wouldn’t necessarily confirm the originating site), how would a nervous/unsuspecting user know that we were legit?

Obviously this isn’t good.

So, I’ve updated the login window.

The location bar now shows at the top of the login window. It’s a bit uglier, but a lot clearer.

You can now see that you’re at Twitter.com when you’re entering your username/password for OAuth authentication. You’re only asked to enter username/password if you’re not already logged into Twitter via the website, but anything that makes things just that little bit more obvious is always welcome.

No More Checkbox


Posted by Si on 23/02/10 in Improvements

The primary complaint I’ve had about the site, at the rate of about one in every two hundred users, is that of people not seeing the checkbox on the front page. Which one? The one that said “Tell your followers”:

If it was checked, a tweet went out on your behalf, if not, no tweet.

I’ve had an improvement waiting for a new version of the site before I rolled it out, but yesterday I was goaded by some Brazilians into pushing it out sooner instead.

So, the front page now looks like this:

(Well that’s cleaner & simpler, isn’t it?)

And when you click that, you now get a choice (rather than defaulting to on like it used to):

If there’s a way to make that any clearer, I’m not sure what it is :) and more happy people? Well, that’s always a good thing.

I’ve been keeping an eye on current best practices on Twitter (of course), & it appears Twitter is cracking down on bulk unfollow. This was an informative article, & an eye opening tweet.

Specifically, they want to avoid churning – that is, following a ton of people, then unfollowing those that don’t follow back, repeat ad nauseam. I suspect part of their motivation is to limit spammy behaviour, & part is that it’s a massive drain on their servers (getting hit with thousands of API requests in a short period of time).

I realise that with The Twit Cleaner, we’re skirting a fine line. However, my priority is to keep your accounts safe and operate within the guidelines of Twitter, while improving the quality of experience for everyone.

In short: I want to improve the Twitter experience as much as I possibly can – but without pissing Twitter off (or causing them any hassle) in the process. I’ve been very careful to try & ensure that the service is the least possible use to those I’m trying to rid Twitter of – those engaging in churn or other spammy practices. Ie, there’s a lot of things I could have put in, but I deliberately haven’t because of any possibility of abuse.

Of course if Twitter says jump, the only appropriate response is “how high”, but I believe we’re safe because:

  1. You only ever have the option to unfollow people that are bad Twitter citizens in the first place – typically a very small percentage of anyone’s account
  2. We do the unfollowing very, very slowly (only one every few seconds) to limit drain on Twitter’s servers
  3. We never unfollow more than a small percent of your account per day, no matter how many you request.

To this end, I have slowed the unfollow down even further than before. It will now not unfollow more than 5% (or 500, whichever is smaller) of your friend count per day (down from 20%, then 10%), as well as spacing each unfollow out much, much more slowly.

Ergo, if you want to use The Twit Cleaner to empty your account, you’re better off going somewhere else (it’s not something I’d recommend anyway). If you want to use it to trim out spammers & time wasters of course, we’re the guys for you.

It’ll still happen, just very, VERY slowly. Don’t hold your breath :)
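To make the daily limit concrete, here is a small sketch of the arithmetic. The function names are invented for the example, and rounding down is an assumption.

```python
def daily_unfollow_cap(friend_count, percent=5, hard_cap=500):
    """Most unfollows allowed in one day: `percent` of friends or `hard_cap`, whichever is smaller."""
    return min(friend_count * percent // 100, hard_cap)


def unfollows_today(requested, friend_count):
    """However many unfollows are requested, never go past the daily cap."""
    return min(requested, daily_unfollow_cap(friend_count))
```

So an account following 2,000 people gets at most 100 unfollows in a day, and anything following 10,000 or more hits the flat limit of 500.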

Improving the auto-tweet


Posted by Si on 04/01/10 in Improvements

The “auto tweet” on the front page is a source of occasional consternation.

There are two specific behaviours that I’ve tweaked & improved.

1. If you had a problem with Twitter when authenticating, it would default back to tweeting (even if you’d previously deselected the checkbox).

2. People occasionally request a report (with the checkbox ticked, ie – send out a tweet), then a minute or two later seem to change their mind & request another report, with the checkbox unticked (no tweet). I guess they decide to read “Tell your followers” after they’ve clicked it?

Anyway, I’ve tweaked both these issues. The option should remember your choice – a bit hard to test, since I can’t exactly call up Twitter & ask them to break so I can test things, but it should be good. Plus, if you accidentally select the wrong option, as long as you’re quick about it, you should be able to overwrite your previous request by requesting one again – toot suite though!

Lists are a great new addition to Twitter.

I recently got to thinking (spurred on by @GLComputing – thank you!) about them in a different way.

Lists are groups of people that you’ve taken the time to say “Hey, this person is important to me.” If that’s the case, why should you need to tell The Twit Cleaner as well? You’ve already said it once, after all.

So, now the reports will automatically exclude anyone you’ve added to any of your lists (including the automated “conversationalist” list). There is a slight issue here with regards to people who’ve left Twitter but may still be on a list, but in general this will be far offset by the benefit of far fewer false positives on your reports.

As an extra bonus, the benefit (time saved having to check everything) increases the more people you follow – or the larger your lists.
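In code terms, the list exclusion amounts to a set difference. A minimal sketch, assuming the report is a list of usernames and your lists arrive as a dict of list name to members (both shapes are made up for the example):

```python
def exclude_listed(flagged, lists):
    """Drop anyone from `flagged` who appears on at least one of the user's lists.

    `flagged` is a list of usernames; `lists` maps a list name to its members.
    """
    listed = set()
    for members in lists.values():
        listed.update(members)
    # Keep the report's original order.
    return [user for user in flagged if user not in listed]
```

For example, `exclude_listed(["a", "b", "c"], {"friends": ["b"], "conversationalist": ["c"]})` leaves only `"a"` on the report.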