Reimagining Defence Episode 2 – Data & the end of the HiPPO?

This episode was written by Lt Col Henry Willi and edited by Flt Lt James Kuht. The thoughts are the authors’ own and do not represent the Ministry of Defence. The episode can be found above, or on Spotify & iTunes. You can find out more by following us on Twitter, @ReDefPod.

=======================

We should be particularly fearful of HiPPOs: Highest Paid Person’s Opinions.  That is because, in the absence of data, the person with the highest rank – the one that gets paid the most – makes the decision.  You can’t really blame HiPPOs for this; as CEOs of Silicon Valley will often lament, “if we have data, then let’s look at data; if all we have are opinions, then let’s go with mine”.  Unsurprisingly, this can lead to tenurocracies rather than meritocracies, where those who’ve been around the longest tend to make the big decisions.

Beware of the HiPPOs…

Therefore, to ensure it is the quality of the idea that matters, not who suggested it, a data-driven approach to decision making is crucial.  This is why data is so important – it is ultimately for the purpose of making better decisions, faster.  Those who want to do this should obsess about data.

Moreover, grasping the significance of data also helps us appreciate why it is indispensable to all of the other digital technologies. The internet of things wants to collect it, the cloud wants to host it, RPA wants to automate it, and AI wants to apply it. Undervalue data and all these things underperform.  

Why is data talked about so much right now? 

So let’s start with the basics of data before tackling the buzzword of big data.  We’ve always had data; it’s the lifeblood of the scientific approach.  You form a hypothesis, create an experiment to test it, then check the data to see if you were right or wrong.  Each time you do this, you understand a little more and get a little smarter.

What’s happened in the last couple of decades is that digitisation has led to more data, which has led to a significantly greater understanding of the world around us.  Think about it from a personal perspective.  Before 2007, if I said we’re deploying on operations and I need maps, a notebook, phone and address lists, a GPS, a camera, a Dictaphone, a video camera, a radio and so on, you’d need a Bergen to carry it all around.

Now you just need a pocket, because in 2007 much of that stuff was digitised and dematerialised into an iPhone.  This means that we, and the other 3.5 billion smartphone users out there, have persistent access to data from around the globe in close to real time, allowing snappy, effortless decision making in turn.

Take the Google Maps app, for example.  It collects map and satellite data, real-time traffic feeds, your GPS coordinates and those of the other 1bn drivers who use it each year, all with the aim of spoon-feeding you the best directions to get you from A to B.

The bottom line is that digitisation has led to an abundance of accessible information, which has helped us make better decisions, faster.  It has also led to the terms data science and big data.

Data Science and data scientists

In 2012, the Harvard Business Review ran an article headlined “Data Scientist: the sexiest job of the 21st Century”.  And this year, Project Maven, the US DoD’s billion-dollar AI program, will purportedly spend about half its money on data scientists.  So why so sexy?

Well, from a command perspective, and without wanting to be too simplistic, one way to think of a data scientist is as a terp – an interpreter.  On operations, a high-quality terp is an invaluable patrol member; they have a rare skill set that allows you to understand the people – the data – around you.  When you don’t have one, it can be hard to make effective decisions; you are deaf and dumb to what the people – the data – are telling you.

Terps and data scientists help you collect the data, from which village to take you to or which datasets to explore.  They know how to clean that data, through filtering out dodgy informers or enhancing incomplete datasets.  And they know how to visualise the most important data, through shuras or software applications like Power BI.

In sum, a data scientist is the person who interprets and presents the data such that commanders can make informed decisions as quickly and effortlessly as possible.  So here’s the provocation.  If a terp is an invaluable member of a tactical patrol, should a data scientist be an essential member of a General’s team?  If the logic is that the world is becoming data rich, have we invested in putting those who can exploit this at the top table?

Big data (and how Facebook has nailed it)

As for big data, it can be big in three ways: volume, variety and velocity.  We’ll use Facebook to illustrate the point: 

Firstly, for volume, data can be big by dint of the number of examples in a database, which, for Facebook, could be its 2.5bn subscribers.  Imagine 2.5bn individual cells running across the top of your Excel spreadsheet!

Secondly, for variety, data can be big due to the range of characteristics for each example, in this case, the differing characteristics for each Facebook subscriber.  The friends they have, what they’ve liked, their IP address, their browsing history, and so on, recorded every second they log on, of every day, of every year.  Again, you end up with an endless waterfall of data pouring down the y-axis of an imaginary spreadsheet.

Thirdly, for velocity, data can be big due to the speed at which it is being collected, such as the 4 million likes a minute or 350 million photos a day being uploaded; the information in the database is never static.
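To put rough numbers on those three Vs, here is a minimal back-of-the-envelope sketch in Python.  It only uses the figures quoted in this episode; pairing the 2.5bn subscribers with the 98 data points per person reported for 2016 is our own simplifying assumption, not a figure published by Facebook, and the variable names are purely illustrative.

```python
# Figures quoted in this episode; the pairing of them is an assumption.
SUBSCRIBERS = 2_500_000_000       # volume: number of examples
DATA_POINTS_PER_PERSON = 98       # variety: characteristics per example (2016 report)
LIKES_PER_MINUTE = 4_000_000      # velocity
PHOTOS_PER_DAY = 350_000_000      # velocity

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60

# Volume x variety: how many cells the imaginary spreadsheet would hold.
spreadsheet_cells = SUBSCRIBERS * DATA_POINTS_PER_PERSON

# Velocity: convert the quoted rates to a common per-second basis.
likes_per_second = LIKES_PER_MINUTE / SECONDS_PER_MINUTE
photos_per_second = PHOTOS_PER_DAY / SECONDS_PER_DAY

print(f"Spreadsheet cells: {spreadsheet_cells:,}")
print(f"Likes per second:  {likes_per_second:,.0f}")
print(f"Photos per second: {photos_per_second:,.0f}")
```

On those assumptions the imaginary spreadsheet holds a couple of hundred billion cells, and it grows by roughly 67,000 likes and about 4,000 photos every second.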

Facebook’s systematic collection of big data means that it has profiled us extraordinarily effectively for targeted advertising.  This business model is so successful that it accounts for 98% of the 21 billion dollars in revenue Facebook made last year.

In the military, we might call advertising ‘information operations’: both advertisers and information operators send messages to nudge people to do something they want.  But to make the nudge work, the targeteer needs to understand the audience, and then tailor the message to their beliefs.

One military term that describes systematically understanding the audience is human terrain mapping.  Facebook has digitised human terrain mapping at colossal scale.  In 2016 the Washington Post reported that Facebook systematically collected 98 personal data points on each of its 2 billion plus members, ranging from their age to what news they follow.  

In electioneering terms, the insights yielded by such big data allow campaign managers to send tailored messages to different segments of the audience, known as microtargeting.  This increases the chance of landing a compelling message, targeted to voters whose opinions might be swayed, at a fraction of the cost of traditional advertising.  In 2016, Trump’s campaign sent nearly 6 million micro-targeted ads, whereas Clinton’s sent a mere 66,000.

Looking ahead to this year’s US election, the Financial Times describes it as a digital war, where some fear that the Democrats risk losing because they are digital dinosaurs who just don’t get the power of online electioneering.  Trump, by contrast, has made Brad Parscale, his 2016 digital director, his Campaign Manager for this election.  He has put a data scientist at the top table; a Digital Cat amongst Jurassic pigeons.

Problems with data

Big data and data science are not a nirvana, though.  Continuing the US election theme, take polling.  It’s the Olympics of predictive modelling: every four years the top pollsters line up to make their predictions.  In 2016, days before Donald Trump waltzed into the White House, Reuters gave Hillary Clinton a 90% chance of victory, the Huffington Post said 98%, Princeton Election Consortium 99% and The New York Times 85%.

How could so many smart people get it so wrong?  Well, maybe it’s that they weren’t that smart after all; data is only as good as the people using it.  It is possible to see a breadcrumb trail of how the predictions turned sour, from not sampling the right populations – akin to Dominic Cummings’ quip that journalists should get out of London if they want to get an accurate representation of the country’s view on Brexit – through to models that didn’t fully take into account people who might not want to admit that they had voted for Trump, through to a failure to incorporate alternative data sources, like comments on social media.

One of the ways to sum this up is through the adage of systems thinking: garbage in – garbage out.  The quality of the input dictates the quality of the output.  If you’re recruiting from a pool of poor-quality candidates, you won’t produce special forces.  If you build a car from cheap materials, you’ll get expensive repair bills.  And if you analyse poor-quality data, the resulting analysis will be worthless at best and misleading or damaging at worst.
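For those who like to see garbage in – garbage out in action, here is a minimal Python sketch of how a poll can go wrong.  Everything in it is hypothetical: the electorate, the 30/70 city split, the support levels for candidates “A” and “B”, and the 10% of “shy” B voters are illustrative assumptions, not a reconstruction of any real 2016 poll or pollster’s method.

```python
import random


def make_population(n, rng):
    """Build a hypothetical electorate: 30% live in the city, 70% elsewhere.
    City voters lean towards candidate A, everyone else leans towards B."""
    voters = []
    for _ in range(n):
        in_city = rng.random() < 0.30
        p_vote_a = 0.65 if in_city else 0.42
        vote = "A" if rng.random() < p_vote_a else "B"
        voters.append((in_city, vote))
    return voters


def share_for_a(answers):
    """Fraction of answers that name candidate A."""
    return sum(1 for answer in answers if answer == "A") / len(answers)


def poll(voters, sample_size, rng, city_fraction=None, shy_rate=0.0):
    """Return candidate A's share among the answers given to a pollster.

    city_fraction=None draws a simple random sample; otherwise that
    fraction of the sample is drawn from city voters only.
    shy_rate is the chance that a B voter tells the pollster "A".
    """
    if city_fraction is None:
        sample = rng.sample(voters, sample_size)
    else:
        city = [v for v in voters if v[0]]
        rest = [v for v in voters if not v[0]]
        n_city = round(sample_size * city_fraction)
        sample = rng.sample(city, n_city) + rng.sample(rest, sample_size - n_city)

    answers = []
    for _, vote in sample:
        if vote == "B" and rng.random() < shy_rate:
            answers.append("A")  # a shy B voter hides their real choice
        else:
            answers.append(vote)
    return share_for_a(answers)


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
electorate = make_population(200_000, rng)

true_share = share_for_a([vote for _, vote in electorate])
fair_poll = poll(electorate, 5_000, rng)
skewed_poll = poll(electorate, 5_000, rng, city_fraction=0.70, shy_rate=0.10)

print(f"True share for A:       {true_share:.1%}")
print(f"Random-sample poll:     {fair_poll:.1%}")
print(f"City-heavy, shy-B poll: {skewed_poll:.1%}")
```

Under these assumptions candidate A actually loses, the random sample lands close to the truth, and the city-heavy poll with shy B voters confidently calls it for A.  The analysis code is identical in both cases; only the quality of the input differs.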

What does good look like in 5 years?

So far we’ve discussed why we should care about data – to make better decisions, faster; how big data can be characterised – volume, variety and velocity; how data scientists are digital terps; and how poorly translated data can lead to big problems.

Now we’re going to take the liberty of imagining what all this could mean for Defence 5 years from now.  What could good look like?

Let’s focus on two things: people then product.  

Why people first?  Well, hold in your mind that line from systems thinking, that the quality of the input dictates the quality of the output, and then consider why John Boyd, the maverick US fighter pilot, used to say, ‘People, ideas, technology.  In that order’.  If we don’t get the people right, we won’t get the ideas right, and then we won’t get the output we want from the technology.

Therefore, for data, 5 years from now data scientists will be seen as essential members of the patrol.  That might be at the tactical level, where they work in FOBs, shoulder-to-shoulder with operators, to transform the data that is coming off the battlefield into actionable insights or bespoke apps, accelerating operational tempo in turn.  Or it might be at the strategic level, with sophisticated modelling to help Generals make informed decisions over how best to invest billions of pounds of taxpayers’ money.

In short, just as Forward Air Controllers reach into the sky to weaponise the assets overhead, data scientists reach into the cloud to weaponise the data at hand.  Their skillset will become a go/no-go criterion for mission success.

Turning to product.  5 years from now our data wants to have Google-like qualities.  That is, we want a platform where MOD data has been structured, the algorithms refined, and the method of presentation simplified, such that we can turn to our MOD-issued smartwatches, smartphones, laptops or augmented reality Night Vision Goggles to get tailored answers relevant to our needs instantly.  That holds whether we’re an operator in the field asking how many assets can swarm on a given target at a given time, or an administrator back in camp asking for stats on electric vehicle fleet usage.  Ultimately, good looks like one data platform; many different uses.

Think about it: if Google, a company started by two students in a basement, can index hundreds of billions of web pages to make much of the world’s information universally accessible and useful, then we should have the wherewithal to do a sliver of that for those in Defence.

That product, the underlying data, is also the fuel to make, dare I say it, every other buzzword buzz: internet of things, AI, ML, RPA, blockchain, quantum and so on.  

And yet none of these can benefit from data without one other crucial component – the cloud, which is the topic for our next episode.

==========

References 

Kahneman, D., 2011. Thinking, fast and slow. Macmillan. 

Floridi, L., 2010. Information: A very short introduction. OUP Oxford. 

Friedman, T.L., 2017. Thank You for Being Late: An Optimist’s Guide to Thriving in the Age of Accelerations (Version 2.0, With a New Afterword). Picador/Farrar Straus and Giroux. 

Schmidt, E. and Rosenberg, J., 2014. How Google works. Hachette UK. 

Spiegelhalter, D., 2019. The Art of Statistics: Learning from Data. Penguin UK. 

Thaler, R.H. and Ganser, L.J., 2015. Misbehaving: The making of behavioral economics. New York: WW Norton. 

How Much Data Is There In The World? https://www.bernardmarr.com/default.asp?contentID=1846 

Tett, G., 2020.  Can you win an election without digital skulduggery.  Financial Times.  Available at: https://www.ft.com/content/b655914a-3209-11ea-9703-eea0cae3f0de

https://www.rfwireless-world.com/Terminology/Advantages-and-Disadvantages-of-Big-Data.html