Twitter Intros New Feature to Automatically Block Abusive Behaviour

Chris Fernando September 7, 2021

Written by Amer Owaida, Security Writer at ESET

Twitter has unveiled a new feature called Safety Mode, aimed at curbing abusive behavior by autoblocking any unwanted tweets and other forms of online harassment. Currently, the feature is available to a handful of users. “Unwelcome Tweets and noise can get in the way of conversations on Twitter, so we’re introducing Safety Mode, a new feature that aims to reduce disruptive interactions. Starting today, we’re rolling out this safety feature to a small feedback group on iOS, Android, and Twitter.com, beginning with accounts that have English-language settings enabled,” said Twitter in a blog post introducing the new feature.

When Safety Mode is enabled, it temporarily blocks accounts for seven days for using abusive language, such as insults or hateful remarks, or for sending repetitive or unsolicited mentions. Once the feature is turned on, Twitter’s systems will analyze the tweet’s content, the relationship between the tweet’s author and the replier, and the likelihood of a negative engagement. Existing relationships are taken into account: accounts that the user follows or regularly interacts with won’t be autoblocked.

However, if Twitter’s technology determines that the tweets contain offensive material, their authors will be autoblocked in short order. This means that, temporarily, they won’t be able to follow the user, see the user’s tweets, or make contact through direct messages. Users will have the option to review the details of flagged tweets and autoblocked accounts from the Safety Mode menu at any time. They’ll also receive a notification summarizing this information before each Safety Mode period ends.
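
For readers who think in code, the decision flow described above can be pictured roughly as follows. This is only an illustrative sketch and not Twitter’s implementation: the `Interaction` fields, the `harm_score` value, and the `HARM_THRESHOLD` cutoff are all hypothetical names and numbers invented for the example. Only the seven-day block period and the exemption for followed or regularly contacted accounts come from Twitter’s description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

BLOCK_DURATION = timedelta(days=7)  # seven-day autoblock period, as reported
HARM_THRESHOLD = 0.8                # hypothetical cutoff, chosen for illustration


@dataclass
class Interaction:
    """A reply or mention received by a user (hypothetical structure)."""
    harm_score: float                 # assumed 0..1 estimate of abusive content
    author_is_followed: bool          # the user follows the author
    author_interacts_regularly: bool  # the user often interacts with the author


def autoblock_until(interaction: Interaction, now: datetime) -> Optional[datetime]:
    """Return when an autoblock would expire, or None if no block applies."""
    # Accounts the user follows or regularly interacts with are never autoblocked.
    if interaction.author_is_followed or interaction.author_interacts_regularly:
        return None
    # Otherwise, block for seven days when the content looks abusive enough.
    if interaction.harm_score >= HARM_THRESHOLD:
        return now + BLOCK_DURATION
    return None
```

In this sketch, an insulting reply from a stranger would produce a block that expires seven days later, while the same reply from an account the user follows would not trigger one.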

That being said, the social media platform concedes that the system isn’t perfect. “We won’t always get this right and may make mistakes, so Safety Mode autoblocks can be seen and undone at any time in your Settings. We’ll also regularly monitor the accuracy of our Safety Mode systems to make improvements to our detection capabilities,” said Twitter.

The social media giant worked with various partners from its Trust and Safety Council during the development of the new feature, whose main aim is to better protect users by reducing the frequency of hateful comments. In the meantime, Twitter will keep observing how the feature performs and will make improvements along the way before it rolls Safety Mode out to all of its users.

Harassment and other forms of abusive behavior on social media have become a perennial problem, and social media platforms have been working hard to stamp it out for some time now. Earlier this year, Instagram also rolled out its own set of features aimed at helping prevent cyberbullying.