How to Get Ahead at the NSA

Daniel Soar

If you’re not exhausted by or indifferent to the endless revelations about the NSA – another week, another codename, another programme to vacuum up and analyse the world’s communications – then you’ve probably long since drawn a single general conclusion: we’re all being watched, all the time. You may also think this is something we sort of knew anyway. Perhaps you see ubiquitous spying as a function of the post-9/11 authoritarian state, which gathers knowledge by any means possible in order to consolidate its control, and which sees us all as potential suspects. Or perhaps you think that if the state is going to have a chance of keeping us safe from bad guys it obviously has to have the latitude to look for them: it isn’t interested in your research into 13th-century frescoes or cheap tights, but it needs to monitor all internet activity so that it can detect that rare occasion when someone searches for the materials to make hexamethylene triperoxide diamine bombs.

The trouble with both these responses is that they’re answers to a selfish question: are the spies doing what they’re doing because they’re interested in us? Civil libertarians say yes, and that the monitoring must stop; security advocates say no, not if we aren’t doing anything bad. The paranoid reaction – that if I use the word ‘bomb’ in an email to my aunt from the vicinity of a Bali nightclub then I may find black-suited agents descending on my hotel room – is just an extreme version of the narcissistic fallacy that someone is trying to see into my brain. There are seven billion people on the planet, and nearly seven billion mobile phones; six billion emails are sent every hour; 1.2 petabytes of data travel across the internet every minute, the equivalent of two thousand years’ worth of music playing continuously, the contents of 2.2 billion books. Even if they don’t get everything – the NSA claims, with loving wording, to ‘touch’ just 1.6 per cent of global internet traffic, or about 35 million books’ worth of data a minute – the spooks have an awful lot more to be getting on with than worrying about you.

And that’s just the internet. That the NSA – along with the rest of the Five Eyes, the signals intelligence agencies of the UK, Canada, Australia and New Zealand – has for the past sixty or so years sought to monitor as many of the world’s communications as it has been technically possible for it to access is widely accepted. In response to Edward Snowden’s leaks, the NSA put out a statement in August to expand on the public description of its mission, defining signals intelligence (or SIGINT) – its primary job – as ‘the production of foreign intelligence through the collection, processing and analysis of communications or other data, passed or accessible by radio, wire or other electromagnetic means’. ‘Communications or other data’ that is ‘passed or accessible’ by ‘electromagnetic means’: that’s anything emitted or received by a phone, computer, fax, radio, guidance system or satellite, or data that travels along any kind of cable, whether dedicated to voice signals or internet payloads or banking transactions or supposedly secure diplomatic, government and military communications. It’s anything with a pulse. Asked last month by a member of the Senate Intelligence Committee whether there was a limit to the records the NSA could collect, Keith Alexander, the agency’s director, said: ‘There is no upper limit.’ He was talking about the phone records of Americans, but since those explicitly fall outside the NSA’s foreign intelligence remit, and since many had thought that systematically collecting them was illegal, it went without saying that there was no limit to its ambition or ability to monitor anything else either.

So the question has to be not so much ‘Is Big Brother watching?’ but ‘How in hell can it cope?’ We know what the NSA’s job is, but we don’t know how it does it. How would you, as a junior analyst in S2C41, the branch of the Signals Intelligence Directorate responsible for monitoring Mexico’s leadership, navigate the millions of call records and pieces of ‘digital network intelligence’ logged from Mexico daily, in order to find that nugget of information about energy policy that’s going to get you noticed? For all the doomsaying certainty of the news stories that have periodically filled front pages since early June, we are still in the dark about most of the NSA’s actual methods and day-to-day activities. The NSA employs more than thirty thousand people and has an annual budget of nearly $11 billion; outside its headquarters at Fort Meade in Maryland, it operates major facilities in Georgia, Texas, Hawaii and Colorado, and staffs listening posts around the world. The leaks are, at best, a series of tiny windows into a giant fortress. It’s still hard to spy on the activity within.

The documents we’ve seen – a fraction of the total number in the hands of Guardian and Washington Post journalists – are a blur of codenames. EVILOLIVE, MADCAPOCELOT, ORANGECRUSH, COBALTFALCON, DARKTHUNDER: the names are beguiling. But they don’t always tell us much, which is their reason for existing: covernames aren’t classified, and many of them – including the names of the NSA’s main databases for intercepted communications data, MAINWAY, MARINA, PINWALE and NUCLEON – have been seen in public before, in job ads and resumés posted online (these have been collected over the years by a journalist called William Arkin, who has written several books on American secrecy and maintains a useful blog). It’s been a feature of the coverage that the magic of the words has been used to stand for a generalised assertion of continuous mass surveillance. On 29 September the New York Times ran a story reporting that MAINWAY was being used ‘to create sophisticated graphs of some Americans’ social connections’. The next day, not wanting to have its thunder stolen, the Guardian, which after all owned the Snowden story, having broken it, ran a front-page piece saying that MARINA provided the ability to look back on the past 365 days of a user’s internet browsing behaviour. The only new piece of information in the story – new in the sense that it hadn’t already been reported in the Guardian – was the business of the year’s worth of history. It was a case of my database is scarier than yours.

One reason for the uncertainty over what these things are for and how they work is that the leaked documents aren’t everything you might hope. The ones which have been relied on most heavily in the coverage are PowerPoint presentations that are usually described as ‘training slides’, even though – in the sections which have been made public, at least – they tend not to explain how a particular system is used. They are more like internal sales brochures aimed at the analysts, bigging up the benefits of one method over all the others. ‘PRISM,’ one introductory slide says, ‘The SIGAD Used Most in NSA Reporting.’[*] A series of bar charts shows how relatively rubbish other forms of collection are by comparison. The presentation’s author, PRISM’s own collection manager, proudly notes the ‘exponential’ growth in the number of requests made through the system for Skype data: 248 per cent. ‘Looks like the word is getting out about our capability against Skype.’

The system about which most detail is given, thanks to a presentation that begins with the question ‘What can you do with XKEYSCORE?’, sells itself by advertising – in a bullet-pointed list – its ‘small, focused team’ that can ‘work closely with the analysts’. There’s some geeky speak of Linux clusters and the Federated Query Mechanism – which simultaneously searches current traffic at all of the NSA’s collection sites around the globe – as well as a strong sense of startup culture: XKEYSCORE’s philosophy is ‘deploy early, deploy often’, a weaponised version of the Silicon Valley mantra beloved of Facebook engineers, ‘ship early, ship often’. Some handy use cases are listed: find everyone using PGP encryption in Iran, find everyone in Sweden visiting an extremist web forum. ‘No other system’ – these words highlighted in red – ‘performs this on raw unselected bulk traffic.’ There’s an endorsement from the Africa team, declaring that XKEYSCORE gave it access to stuff from the Tunisian Interior Ministry that no other surveillance system had managed to catch. It’s not unlike a washing powder ad. One of the things these slides are most revealing of is the marketplace within the NSA. At your desk in S2C41, as you sit down to find the best way to home in on dodgy goings-on by senior Mexicans, you have a whole menu of sexy tools to choose from.

The sales-speak nature of this material means that it can be misleading. It was the PRISM system – which the reports said gave the NSA ‘direct access’ to the servers of some of Silicon Valley’s biggest and most beloved companies, including Facebook, Google, Apple and YouTube – that dominated the headlines when the leaks first hit. The idea that the genius behind your perfectly engineered iPhone and the friendly souls behind the colourful Google logo had willingly collaborated with the electronic eavesdroppers to hand over the full set of keys to their multibillion-dollar server farms – when there was no law that could require them to do so – was a shock to many. It was also at some level outlandish: in most cases (if you leave aside Apple), the data the company possesses is what generates its phenomenal value, and it was hard to imagine that this commercially priceless property would be freely shared with anyone, let alone with the government. (Ayn Randist libertarian capitalists don’t like government.) The internet companies themselves categorically denied any knowledge of the PRISM programme, or anything like it.[†]

But ‘collection directly from the servers’ was what the slides said, and the implication was that the full unencrypted traffic from everyone’s favourite web services was being piped wholesale into the NSA’s databases. The implication turned out to be wrong. What happens is that an NSA analyst ‘tasks’ PRISM by nominating a ‘selector’ – meaning an email address or username – for collection and analysis. In other words, PRISM allows an NSA worker to submit a request, which is invariably granted, to monitor an individual Gmail account or Yahoo identity or Facebook profile and have all its activity sent back to the NSA. (In this context, ‘direct access’ is accurate: if a selector has been approved for monitoring, the NSA has access to it in real time.) One of the slides the Guardian didn’t disclose – it appeared a few days later in the Washington Post – showed a screenshot of the tool used to search records retrieved through PRISM. The total count of records in the database – in April, when the slide was made – was 117,675. It’s worth looking at that number. Facebook has a billion users: half of the internet-connected population of the planet has an account. The fraction of those whose full unencrypted activity the NSA was actively monitoring can be no more than about 0.01 per cent. This isn’t to pretend that the NSA high-mindedly refrains from seeking access to our baby pictures or inane comments on other people’s baby pictures. But it does suggest that you don’t fill in a form to access a random Mexican’s timeline unless you expect to get something out of it.

Another slide the Guardian withheld – it published only five of the 41 in the full presentation, citing security concerns, though the wish for maximum impact could be another reason for the choice – describes the PRISM ‘tasking process’. The slide shows a flowchart of mind-numbing complexity. After the analyst puts selectors into the Unified Targeting Tool, they are passed to S2 FAA Adjudicators in Each Product Line and to Special FISA Oversight and Processing (SV4), before going to a third department, Targeting and Mission Management (S343), pending Final Targeting Review and Release. Somewhere at the bottom of the line the approved request gets handed over to the FBI’s Data Intercept Technology Unit (DITU), the external body which actually interfaces with whichever internet company the NSA needs data from. (You can see why Facebook, Google et al have found it so easy to maintain that they aren’t systematically feeding the NSA.) The internet company hands over the requested data to the FBI – in 90 per cent of cases with no questions asked – and the information is then processed and ingested into NSA databases for all analysts to enjoy.

As ever, the blandly obscurantist codes give little sense of what is actually going on, and it’s easy to suppose – as many do – that all this meaningless superstructure is designed merely to give a semblance of due process to a system that has none. But in fact the arrangement has its devilish logic, each coded unit standing for a whole subsection of the NSA’s huge, hydra-headed military bureaucracy. The full extent of this bureaucracy is one of the most valuable lessons of the leaks. S2 is ‘analysis and production’, S3 ‘data acquisition’. S35 and its subcodes refer to Special Source Operations, the department responsible for conducting the delicate task of arranging ‘partnerships’ with entities that can give the NSA access to data that can’t be reached by any other means: cable companies, internet backbone providers, the maintainers of the switches and relays that keep global communications whirring. It is these arrangements that give rise to many of the more spectacular covernames that have been seen recently: MONKEYROCKET, SHIFTINGSHADOW, YACHTSHOP, SILVERZEPHYR. The type of data these sources provide, whether phone or internet records, is lightly classified: it’s merely secret. The area the source is targeted at – say, counterterrorism in the Middle East – is classified top secret. How the NSA has actually gone about getting hold of these data streams – through what pressure put on what companies by what means – is so sensitive that none of the documents we’ve seen even hints at it.

SILVERZEPHYR (SIGAD US-3273) is a source of particular interest to our man on the Mexico desk. It delivers data from Central and South America, serving up phone and fax metadata, as well as internet records – both metadata and content. An impressive demonstration of what can be achieved with it appears in an NSA presentation that was released last month to Fantástico, a Brazilian news programme, by Glenn Greenwald, the chief shepherd of the Snowden leaks. The presentation is a case study to show the benefits of creating ‘contact graphs’, ‘a useful way of visualising and analysing the structure of communication networks’. The slides describe a two-week ‘surge’ operation that S2C41 carried out in the final month of the 2012 presidential campaign against Enrique Peña Nieto, who was then leading in the polls, and nine of his closest advisers.

The analysts first tasked their systems with ‘seed’ selectors, representing the phone numbers of Peña Nieto and the advisers. Using MAINWAY – the database, you’ll remember, that allows for analysis of phone metadata and the relationships between numbers – S2C41 then produced a ‘two-hop’ contact graph, to show everyone each seed communicated with, and everyone those people communicated with too. Further analysis of the graph showed who in the network was most significant, including targets who until then hadn’t been known. It was then a cinch to run the content of all text messages sent from and received by these significant numbers through a system called DISHFIRE, which extracted any messages that were ‘interesting’. Among these messages were lists of names of the people who would be given senior positions in a Peña Nieto administration. Six months after Peña Nieto’s election, all the people listed had joined the government. A case study like this shows why you really do need all the systems at your disposal to do useful work at the NSA. It’s also a good primer in how to learn things that are unknown to anybody other than the Mexican president-elect, and perhaps his wife.
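For readers who want to see what a ‘two-hop’ contact graph amounts to in practice, here is a minimal sketch in Python. It is an illustration of the general technique only, not the NSA’s method: the call records, the labels and the function names are all invented for the example, and counting a number’s contacts within the network is an assumed stand-in for whatever measure of ‘significance’ the analysts actually used, which the slides don’t specify.

```python
from collections import defaultdict


def build_graph(call_records):
    """Turn (caller, callee) pairs into an undirected contact graph."""
    graph = defaultdict(set)
    for a, b in call_records:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def two_hop(graph, seeds):
    """Return the seeds, their contacts, and their contacts' contacts."""
    first_hop = set()
    for seed in seeds:
        first_hop |= graph.get(seed, set())
    second_hop = set()
    for number in first_hop:
        second_hop |= graph.get(number, set())
    return set(seeds) | first_hop | second_hop


def rank_by_contacts(graph, members):
    """Rank members by how many other members they are in contact with.

    Assumption: contact count is only a crude proxy for 'significance'.
    """
    scores = {n: len(graph.get(n, set()) & members) for n in members}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical metadata: labels stand in for phone numbers.
records = [
    ("seed1", "a"), ("seed1", "b"), ("seed2", "b"),
    ("a", "c"), ("b", "c"), ("b", "d"),
    ("x", "y"),  # unrelated traffic, never reached from the seeds
]

graph = build_graph(records)
network = two_hop(graph, ["seed1", "seed2"])
ranking = rank_by_contacts(graph, network)
print(ranking[0])  # the best-connected number in the two-hop network
```

In this toy data the number labelled "b" comes out on top even though it was not one of the seeds, which is the point of the exercise: the graph surfaces targets nobody had thought to nominate.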

*

There are rarely complaints in the US media about the practice of spying on leaders and diplomats from foreign countries. It has always been seen as a relatively uncontroversial part of the NSA’s mission, and indeed of the way international affairs are conducted. The Snowden leaks have revealed some recent operations, such as a successful effort to crack the UN’s videoconferencing system, and an infiltration of the EU’s new building on New York’s Third Avenue. These have only been reported in detail in Der Spiegel: the Anglophone press barely cares. It’s hard not to get the impression that international meetings are invariably bugged, and delegates’ phones monitored, to give the home team an advantage in negotiations. The last time there was a significant scandal in the UK about this kind of activity was in 2003, when Katharine Gun, a translator for GCHQ, leaked an email she had been sent by an NSA official asking for her assistance in eavesdropping on member states’ discussions to help force a favourable UN resolution on Iraq. Clare Short, Tony Blair’s international development secretary, claimed that she was given transcripts of Kofi Annan’s bugged conversations at around the same time. It usually takes something like an imminent war to bring such intelligence-gathering to light, but it has gone on since at least the days of Herbert Yardley, the director in the 1920s of the Cipher Bureau, a precursor to the NSA, who helpfully explained his methods in a bestselling memoir called The American Black Chamber.

It might be reassuring to imagine that the US surveillance complex is secretly busy with nothing more sweeping than an old-school foreign surveillance operation, keeping an eye on bigwigs from unfriendly countries. The legend goes that Yardley’s operation was closed down by Hoover’s secretary of state, Henry Stimson, who supposedly said: ‘Gentlemen do not read each other’s mail.’ What a nice sentiment. Of course, there’s no evidence that he said any such thing, and the moment the Cipher Bureau was shut in 1929 its files were transported from New York to Washington by the man who had been appointed to head its successor organisation. ‘Immediate steps were taken,’ William Friedman later wrote, ‘completely to reorganise the bureau and its work.’ Along with the files went the secret agreements with the telegraph companies, such as Western Union, which would lend out telegrams for analysis before delivering them. The telegraph companies weren’t always comfortable with the arrangement, but it kept going in one form or another until after the Second World War, when legal orders came into force to compel all the major providers to share the communications they were handling with the organisation that was about to be called the NSA. The programme was called SHAMROCK, and it persisted until the late 1970s, when Senator Frank Church started investigating the NSA’s activities, declaring them to be potentially intrusive on the lives of ordinary Americans. Church’s high-profile investigations led to the Foreign Intelligence Surveillance Act of 1978, a law which seemed to give more freedom to citizens but was also followed – we now know – by the introduction of a new programme to replace the now outlawed SHAMROCK. BLARNEY – a comfortably familiar Irish name – got going the year FISA was passed and is still a significant presence in the Snowden files.

And then there was 9/11. The President’s Surveillance Program (PSP) authorised broad new powers to collect and analyse Americans’ communications without a warrant. It was, at first, highly secret: the NSA’s own inspector general wasn’t told of its existence until well after it had launched. Gradually the news spread and in 2004 a New York Times reporter, James Risen, started looking into it. The response was dramatic: the Times was dissuaded from publishing its story about it for nearly a year, and in the interim the NSA rushed to find new legal authorities to maintain the supply of information it had come to find so useful. By the time the news was public, alternative systems were already in place, and they were eventually enshrined in a 2008 amendment to FISA, FAA, the authority under which programmes such as PRISM now operate.

Every time one of the spies’ methods comes under the spotlight, questions of legality arise. The law is changed, purportedly to stop such abuses happening again. But inevitably the new law includes a new route by which some version of the old system is made valid again, and a programme that once had to be kept highly secret can be discussed in public as much as you like. In response to the Snowden revelations, a new bill has been put forward, the Intelligence Oversight and Surveillance Reform Act. It sounds benign, but if you’re of a paranoid disposition, you have reason to fear what it might bring.

[*] SIGAD stands for ‘SIGINT Activity Designator’, and represents a specific source of data, whether that’s a physical intercept facility like the large radome-covered base at Menwith Hill in North Yorkshire (USD-1000), or a worldwide programme like FAIRVIEW (US-990), which involves ‘partnership’ with a cable provider to get access to global internet traffic. There appear to be more than five hundred active SIGADs.

[†] Their denials were interestingly in contrast to the brilliant non-denial issued by Verizon, the US’s second largest phone company, in response to the earlier story that it had been forced to hand over all its call records to the NSA. In a memo to staff, a company vice president wrote: ‘You may have seen stories in the news about a top secret order Verizon allegedly received to produce certain calling information to the US government. We have no comment on the accuracy of the Guardian newspaper story or the documents referenced, but a few items in these stories are important. The alleged court order that the Guardian published on its website contains language that a) compels Verizon to respond; b) forbids Verizon from revealing the order’s existence … Nevertheless, the law authorises the federal courts to order a company to provide information in certain circumstances, and if Verizon were to receive such an order, we would be required to comply.’