Media Hacking refers to the usage and manipulation of social media and associated algorithms to define a narrative or political frame. Individuals, states, and non-state actors are increasingly using Media Hacking techniques to advance political agendas. Over the past year we’ve seen a number of such incidents occur — where both social media and mainstream media were manipulated to advance a particular agenda. Two examples follow, one which I tracked and one Gilad tracked.
Open your browser and search for ISIS France. The first recommendation that Google offers is “ISIS France support”. Why is the most sophisticated algorithm in the world suggesting that the most frequent search about ISIS and France concerns French support? The answer has nothing to do with the tragic murders at the French satirical magazine. It’s a hack. Google’s search algorithm was effectively hacked to produce this result.
On August 26th Vox ran a story with the title “One in six French people say they support ISIS”. The headline is disconcerting, and the article highlights that in the 18–24 year old bracket, 27% of French youth surveyed support ISIS. I remember seeing this in my Twitter feed and thinking: this makes no sense. 10M people in France, a quarter of French youth, support ISIS? A bit of digging yielded some perspective.
The article is based on a survey of random phone interviews conducted by a British marketing agency called ICM. ICM randomly dialed 1,001 people in France. This seems like a small sample, but 1,000 randomly selected respondents is enough for a usable estimate for a population the size of France. While the overall sample is relevant, the subsamples aren’t: the 27% number came from a subsample of just 105 people. That’s not meaningful. And the questions in the survey were oblique. If you look at the source data, it’s possible that people who were interviewed thought they were being asked about general support for Iraq, not ISIS, and while the survey refers to ISIS, the French have several other words they use. Finally, the survey data indicated only 2.7% of people had a very favorable view; most people fell into the unfavorable group (62%) or the “don’t know” group (23%). So methodology-wise, it’s a mixed bag at best.
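To put rough numbers on the sample size point, here is a minimal sketch of the standard margin of error calculation for a surveyed proportion. It rests on assumptions that are not in the survey write-up: simple random sampling, a normal approximation, a 95% confidence level (z = 1.96), and the worst case proportion of 0.5. The function name is purely illustrative.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate half-width of a 95% confidence interval for a proportion.

    Assumes simple random sampling and a normal approximation;
    proportion=0.5 is the worst case (widest interval).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

full_sample = margin_of_error(1001)     # the whole ICM sample
youth_subsample = margin_of_error(105)  # the subsample behind the 27% figure

print(f"n=1001: +/- {full_sample * 100:.1f} percentage points")
print(f"n=105:  +/- {youth_subsample * 100:.1f} percentage points")
```

Under these assumptions the full sample carries an uncertainty of about 3 points either way, while the 105-person subsample carries nearly 10, which is why the subgroup figure makes a poor headline.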
Beyond the methodology, the survey was commissioned by Russian news agency Rossiya Segodnya. The trail of media breadcrumbs seems to be as follows: Rossiya Segodnya commissioned a survey to test support for or opposition to the admission of Georgia and Ukraine into the EU; the ISIS question was secondary. On August 18th Russia Today ran the story with the headline: “15% of French people back ISIS militants, poll finds.” Over the following week the Russia Today story was reposted, and the summary infographic (above) in particular propagated around the internet, mostly on French sites. A TinEye search for the URL of the infographic image shows some of the sites that ran it.
The Vox story ran a week later. In an email exchange Max Fisher (author of the Vox post) said he thought he saw the data in a tweet. The Vox story combined two surveys (the one by ICM and one by the Palestinian Center for Public Opinion), cited its sources (including that Rossiya Segodnya commissioned the survey) and included an infographic from Russia Today. With the headline “One in six French people say they support ISIS”, the story started to circulate on social media, in particular on Twitter. Media hacks take advantage of the decontextualized structure of real time news feeds: you see a tweet from a known news site, with a provocative headline and maybe the infographic image included, and you retweet it. Maybe you intend to read the story, maybe you just want to tweet something interesting and provocative, maybe you recognize the source, maybe you don’t.
You can see in the network map above a view of how the headline and tweet spread: from Vox, to Max’s network, to Ezra Klein’s network and beyond. Note how effectively this moved to several connected but unrelated clusters or networks of people. This image is a network fingerprint of an effective media hack. Concurrent with this amplification across networks, derivative articles start getting authored, and the part human, part algorithmic machine of click-rich re-syndication that we have created starts up.
The following day the Washington Post ran an article saying “this makes no sense” but it was too late. Once these stories move around the web with rich, shareable headlines, the truth rarely catches up. As Churchill said a while before the real time web:
The meme was set and already feeding the Google algorithm. Days later, even months later, a simple web search offers a prompt that is in essence based on propaganda achieved through a deft media hack. Let me pass the keyboard over to Gilad to tell the story of an even more bizarre hack that took place on September 11th.
On September 11th 2014, a fascinating media hack began surfacing online. A hack so intricately designed, it is clear that someone put a lot of effort into planning, seeding and trying to spread the rumor, using multiple services, including Wikipedia, YouTube, Facebook and Twitter. We do know that many of the profiles involved, especially on Twitter, are of Russian origin. We still have no idea who specifically was behind the attempt.
The hoax claimed that a chemical factory in Centerville, Louisiana had exploded and was leaking hazardous chemicals everywhere. This began spreading, initially through text message alerts received by citizens of a neighboring town, and then around the web. The first Google search result returned a fake Wikipedia page (now deleted) tied to this supposed explosion. The page linked to a 30-second YouTube video where a camera was pointed towards a TV screen showing a fuming building along with ISIS fighters reading a message. Additionally, a Facebook page of a fake media outlet named ‘Louisiana News’ published a statement claiming that ISIS takes responsibility for the explosion in Centerville. And on Twitter, a full-blown tweet storm emerged, reaching a peak velocity of one tweet per second and using a number of hashtags (#DeadHorse, #ColumbianChemicalsInNewOrleans, #ChemicalAccidentLouisiana, #LouisianaExplosion) that eventually converged into a single hashtag: #ColumbianChemicals. During the peak of this campaign, a photoshopped screenshot of the CNN homepage with an article titled “Plant Explosion in Centerville Caused Panic” surfaced in tweets.
For many reasons, which we’ll outline below, the rumor did not spread far. Even though it was carefully planned, and seeded across different platforms, the content generated did not gain enough user trust, and hence no network effects were triggered. Especially in social networked spaces where authority and trust are so closely tied to the social graph, it has become increasingly difficult to manufacture a fake or spammy account without instantly raising red flags: why is no-one else connected to this person? Why is the account so new?
Let’s take a look at the different places where this hoax was seeded.
A Wikipedia user by the name of AmandaGray91 created the original entry, claiming the fake explosion was caused by a terrorist attack, linking to a YouTube video as well as Wikipedia’s list of disasters in the United States by death toll. Someone certainly put thought into this. Since Wikipedia saves full history and all edits, we can see what else this user did on Wikipedia before the hoax: editing the pages on Alexander Asov, an “author of books in Russian pseudo history,” and on Aditya Birla Group, an Indian multinational conglomerate and owner of Columbian Chemicals Co, and adding punctuation to the page on carbon black, which is manufactured in the Louisiana plant.
Wikipedia editors are a global community that has very clear rules of conduct as well as an internal authority rank. As a completely new Wikipedia editor, it is very difficult to simply add a page, especially one depicting an ISIS terror attack on US territory, and expect it to stick around for long. The page was taken down quite rapidly, as users who were led to it from tweets flagged it as potentially problematic.
Someone clearly took the time and effort to build up a fake public Facebook page of a non-existent media outlet called ‘Louisiana News,’ seeding it with content. From August 22nd onwards, the page was frequently updated with posts highlighting news events across various topics such as politics, sports and entertainment. The page gained some 6,000 likes, and is still openly accessible. Its last entry reads: “#Breaking news, claiming that ISIS takes responsibility for the explosion in Centerville, LA”, and has hundreds of likes, most likely associated with automated accounts.
Facebook’s EdgeRank algorithm that decides which piece of content should be displayed on user timelines is known to take into account engagement (likes and comments) on posts. Hence it is clear why someone would try to manufacture likes for a specific post. That said, even with higher EdgeRank, the post will only be visible in timelines of users who already liked or followed the page. In order to get follows, one needs to establish trust — get real users to pay attention to your content, and ask to receive more of it. That is not an easily game-able task, and effectively the reason why this Facebook campaign did not actually spread.
Over the years, we’ve seen a lot of odd things unfold on Twitter, and we must say that this hoax was one of the weirdest we’d ever seen. The Twitter operation for this hoax began on September 10th, a day before the “supposed explosion,” when thousands of Russian Twitter handles participated in a Tweetstorm, getting the #DeadHorse hashtag to trend across a number of Russian cities, eventually even in Moscow and Saint Petersburg. A Tweetstorm is a sudden spike in activity surrounding a certain topic or hashtag. In this case, the Russian Twitter handles all of a sudden posted tweets with the #DeadHorse hashtag.
The next day, on September 11th, a completely different group of Twitter handles, predominantly posting in English, started using the #DeadHorse hashtag along with #ColumbianChemicals within the same tweets. For example — @JaneneAngle — a clearly automated account created on Sept 8th, seriously ramped up activity on September 11th, posting angry text on double standards and corrupt governments, never posting anything afterwards. There were similar examples from @MiaBrowwnn whose account was also created on Sept. 8th, and started posting to the same hashtags on Sept. 11th, never tweeting anything afterwards. Here’s an image that was at one point Retweeted by hundreds of these automated accounts:
If we look at this twitter handle, @RebekahBENNET, we see it continuing to post content well after the hoax, and on one day in October, all of a sudden using a new hashtag — #MaterialEvidence. This is common behavior for online bots, which may switch focus to promoting a new campaign.
Perhaps the most revealing clue that this activity is caused by automated accounts is the source field attributed to most of these tweets. Typically we see ‘Twitter for iOS’ for those using the iOS Twitter app, or ‘Twitter for Android,’ or ‘Tweetdeck.’ But in this case, we see many tweets coming from sources labeled ‘masss post’ and ‘masss post2’, demonstrating a high likelihood that some broadcast automation tool was used.
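The source field check is simple enough to sketch in a few lines. The snippet below is an illustration, not the tooling we used: the tweet records are hypothetical dictionaries with a `source` key, the list of familiar clients is just the handful named above, and the function name is made up for the example.

```python
from collections import Counter

# The familiar client labels named in the text (illustrative, far from complete).
KNOWN_CLIENTS = {"Twitter for iOS", "Twitter for Android", "Tweetdeck"}

def unfamiliar_sources(tweets, known_clients=KNOWN_CLIENTS):
    """Count tweets per source label, keeping only labels outside known_clients."""
    counts = Counter(tweet["source"] for tweet in tweets)
    return {label: n for label, n in counts.items() if label not in known_clients}

# Hypothetical sample records; the actual tweets are not reproduced here.
sample_tweets = [
    {"user": "account_a", "source": "masss post"},
    {"user": "account_b", "source": "masss post2"},
    {"user": "account_c", "source": "masss post"},
    {"user": "account_d", "source": "Twitter for Android"},
]

print(unfamiliar_sources(sample_tweets))
```

A tally like this only points at where to look. An unfamiliar label is a hint of automation, not proof, since plenty of legitimate third-party clients exist.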
Even with all this activity, the #ColumbianChemicals hashtag was not catching on, and was certainly not trending anywhere. So the master puppeteer ramped up the volume, and all of a sudden, a large number of Russian handles started posting to the hashtag. It looked something like this:
Even with this heightened level of activity, the hashtag was still not trending anywhere.
When we look at the network structure (who follows whom) of all the Twitter handles posting to the hashtag, we see two distinct groups emerge:
1. Several groups of Russian bots, some of which are still active to this day. Many of these accounts had been active for a while, many months before the hoax, and certainly don’t seem like they’re automated. They all publish images, conversational text, and links every once in a while. What’s so fascinating is that on September 11th, they all posted one or two tweets in English to the #ColumbianChemicals hashtag, and then went back to their typical activity, as if seeded within the network, and activated all of a sudden, only to fall back into hiding. Here are a few examples: @Galtaca, @Kiborian, @GelmutKol.
2. Accounts connected to The Times-Picayune and its Internet sister site NOLA.com, posting information such as this, debunking the hoax.
There’s a very important lesson learned here, crystallized by the network graph to the left. No matter how much volume, how many tweets, or Facebook likes a campaign generates, if the messages aren’t embedded within existing networks of information flow, it will be very difficult for information to actually propagate. In the case of this hoax on Twitter, the malicious accounts are situated within a completely different network. So unless they attain follows from “real accounts,” they can scream as loud as they’d like, still no one will hear them. One way to bypass this is by getting your topic to trend on Twitter, increasing visibility significantly.
Social networked spaces make it increasingly difficult for a bot or malicious account to look like a real person’s account. While a profile may look convincingly real — having a valid profile picture, posting human readable texts, and sharing interesting content — it is hard for them to fake their location within the network; it is hard to get real users to follow them. We can clearly see this in the image above: the community of Russian bots are completely disconnected from any other user interacting with the hashtag.
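The disconnection described here can be checked mechanically. As a minimal sketch, assuming the follow relationships are available as (follower, followed) pairs, the snippet below treats follows as undirected links and groups handles into connected components. The handles and edges are invented for illustration, and the real analysis may well have used a different clustering method.

```python
from collections import defaultdict, deque

def connected_groups(follow_edges):
    """Group handles into connected components, ignoring follow direction."""
    neighbors = defaultdict(set)
    for follower, followed in follow_edges:
        neighbors[follower].add(followed)
        neighbors[followed].add(follower)

    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            handle = queue.popleft()
            group.add(handle)
            for other in neighbors[handle]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        groups.append(group)
    return groups

# Hypothetical toy graph: bots follow each other, readers follow a local outlet.
follow_edges = [
    ("bot_1", "bot_2"), ("bot_2", "bot_3"), ("bot_3", "bot_1"),
    ("reader_1", "local_news"), ("reader_2", "local_news"),
]
groups = connected_groups(follow_edges)
print(len(groups), "separate groups")
```

In this toy graph the bot handles land in one group and the readers in another, with no path between them. That is the situation in which volume alone cannot carry a message to real users.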
The same principle holds for Wikipedia, which is even harder to game as it is easy to identify accounts that are not really connected to the larger editing community. The more time you spend making relevant edits and the more trusted your account becomes, the more authority you gain. One can’t simply expect to appear, make minor edits on three pages, and then put up a page detailing a terror act without seeming suspicious.
As our information landscapes evolve over time, we’ll see more examples of ways in which people abuse and game these systems for the purpose of giving visibility and attention to their chosen topic. Yet as more of our information propagation mechanisms are embedded within networks, it will become harder for malicious and automated accounts to operate in disguise. Whoever ran this hoax was extremely thorough, yet still unable to hack the network and embed the hoax within a pre-existing community of real users.
No impact — other than a fascinating story!
Politicians, brands and corporations have been hacking media, trying to manipulate and reframe narratives to their own ends, since the beginning of time. Go back to the 1920s and the work of Edward Bernays, the father of PR, who came up with the term Public Relations because he thought that the term propaganda was too closely associated with the German WW1 war machine. Bernays would be fascinated by these hacks and by our media landscape today. But in a sense manipulation of message is something we are all familiar with today. Every message we put into the social internet has a grain of optimization in it. And one person’s optimization is another person’s propaganda. So why does this matter? Are these examples, fun as they are, just another set of data points in a long tail of corporate and political manipulation? Let’s pull on a few of these threads. It feels like something different is going on here.
Both of the examples above illustrate how brittle our real time news and discovery systems are. Much has been written about how the business of gathering news has been upended. I don’t want to rehash those issues here; there isn’t room to do them justice. What is important to state is that while we are living through an era of reinvention of our news systems, we have to ensure that what comes out the other end is better than what we had previously. A democratic society requires news and information systems for citizens to make informed choices. While the real time nature of information distribution is unlike anything we have previously had, the discovery tools are still weak.
Google search has become increasingly tuned to social data, as the first example above illustrates. The recent deal between Twitter and Google suggests this tuning is only going to increase. On the Twitter side of the table, search hasn’t evolved. Discovery is still rudimentary on site and third party tools are thin. The same is true of Facebook. Better discovery tools are needed. Just as we saw with the rise of spam in email, these forms of propaganda, with fake or fabricated news, undermine trust in the platforms that carry them. If the platforms don’t address this, new platforms will emerge for discovery. The same goes for attribution. Some people believe that the social infrastructure of the Internet is basically complete; we don’t believe that is the case. Ever since the start of the Internet its social framework has been under construction and reconstruction. Twitter, Facebook and others need to evolve or die.
Attention please, not just clicks:
Tightly coupled with discovery are the tools for measurement: to an extent you build what you measure. Tony, CEO of Chartbeat, has written and spoken extensively about the attention web and why real attention metrics (attention, time on page, time on site) are what matter. Last year I wrote a post here on Medium about how attention and reading are evolving. We are living in an era of unprecedented transparency, and interestingly many of these hacks are happening in broad daylight. Unless we measure and value attention (time spent reading, listening, or viewing) rather than raw click volume, we aren’t going to build things that are actually of interest to humans. Take note of how bots are being used as part of these hacks. Just as Google made a concerted effort to track and expose viewability metrics in 2014, the social networks are going to have to think about how to expose attention metrics and help publishers understand, outside of the torrent of shares, what people are actually reading.
To effectively hack media, you need to penetrate and then connect across dense people networks. This was a lesson of #deadhorse vs. ISIS/France and the Sony Hack. In the latter examples the hacks spread across dense networks. Max Fisher (the author of the Vox article) is connected into very different networks than Vox.com or Ezra Klein. This helped the story spread across the network and is one reason why it influenced Google’s search algorithm. And we should also not view these hacking events as unidirectional. I suspect we will see examples of states and non-states hacking revenge media. The FBI started an investigation a few weeks back in December into whether corporations, banks and others, are already revenge hacking. This is going to get very complicated and hard to parse.
My assumption has always been that increased transparency would result in a greater efficiency of information flow and that this, in turn, would naturally bend towards facts. Put another way, in an open society, with efficient information flow, fact and truth will win out. It’s impossible to measure this in the aggregate, and I believe that in the aggregate it is true, but it’s clear there are local cases where it simply isn’t. Russia is far more open a society than it was 30 years ago. Or turn to the Middle East and take a read of Gilad’s post about Israel, Gaza, War & Data. Or dig into how fake sites made up news about a Texas town under quarantine for Ebola to harvest clicks, or how “real” news sites make up news. Or Craig Silverman’s piece on how a Priest died and met God in the “48 minutes” before he came back to life. In all these cases transparency isn’t succeeding at winnowing out bullshit. And mainstream media offer an implicit assist by assuming their role is to be the established view from nowhere.
Media critics like Jay Rosen use the term ‘view from nowhere’ to describe how some media strive to find a balance between objectivity and the reporting of facts, often erring on the side of reporting each side of an argument. They offer each perspective equal weighting, setting up the false impression that both perspectives are equally valid since they received equal coverage. As Rosen outlines (in a debate with himself) mainstream media are loath to say: ‘this is rubbish.’ They want to provide “perspective” rather than take a position. And in today’s optimized world they want to generate SEO and social traffic from both sides of an argument.
Match this phenomenon with the torrid pace of sharing before or without reading and you have a toxic mix that can be effectively gamed or hacked. In the post I wrote last summer I noted how a huge percent of articles shared are never actually read: “Chartbeat looked at user behavior across 2 billion visits across the web over the course of a month and found that … a stunning 55% spent fewer than 15 seconds actively on a page.” Transparency was meant to be the new objectivity. Yet if people aren’t reading before they share — if mainstream media is balancing every perspective, if headlines without branded context are now content — media can and will be hacked, and perspective will be narrowed rather than broadened.
As Dmitry Tulchinskiy, bureau chief of Rossiya Segodnya, said in August: “What is propaganda? Propaganda is the tendentious presentation of facts … It does not mean lying.” Tendentious: expressing or intending to promote a particular cause or point of view. With such a clear choice of words, I wish he had talked more about the methodology.
This essay is from the betaworks 2015 book. Other essays published include: investing at betaworks, and a critical look at Artificial Intelligence.
Illustrations by: Henry McCausland.