In her keynote presentation at Apache: Big Data North America, Amy Gaskins, a data scientist with more than a decade of experience, shared five key requirements that can make or break big data projects.
Successful big data projects have five key requirements, says Amy Gaskins, a data scientist with more than a decade of experience designing and implementing data and intelligence projects for the private sector, government agencies and the U.S. military.
In her keynote presentation at the Apache: Big Data North America conference in Vancouver on Monday, Gaskins stressed that each of these factors can make or break a big data project:
- Buy-in. It’s commonly acknowledged at this point that big data projects need buy-in from senior leadership to succeed. But Gaskins says that’s not enough. You need buy-in at every level, including middle management and workers themselves. “You need to get it from senior leadership, but also the middle and bottom. Why are we doing this? Everyone needs to understand.”
- Urgency. “Is there an existential threat to your business or the mission if you don’t do this?” Gaskins asks.
- Transparency. Do people both inside and outside the organization know what the team is doing and why? Can the work be repeated?
- Involvement of non-data science subject matter experts (SMEs). Non-data science SMEs are the ones who understand their fields inside and out. They provide the context that allows you to understand what the data is saying. These SMEs are frequently the ones who hold big data projects together, Gaskins says. “It’s the non-data SMEs that prevent IT and business from fighting each other,” she says. “It’s like magic, and I don’t say that lightly.”
- Psychological safety. This is all about trust. The team members, data scientists and SMEs alike, must be able to trust each other.
“When we talk about requirements for succeeding, we think about Maslow’s Hierarchy of Needs,” Gaskins says. “But the truth is it’s really a system and any part of the system can break down.”
Two examples of big data success and one near miss
Gaskins, who most recently served as big data project director at the National Oceanic and Atmospheric Administration (NOAA), drew on three personal experiences to illustrate the point: helping the 43rd Sustainment Brigade in Afghanistan root out corruption that led to resources falling into the hands of the Taliban, helping MetLife’s Dubai office build an automated solution for detecting insurance fraud, and helping NOAA open up and commercialize its weather data.
The first two projects met each of the five requirements she highlighted and proved successful.
In Afghanistan, Gaskins, once a military intelligence officer herself, served as an embedded mentor with U.S. Army Intelligence and Security Command (INSCOM). She was embedded with the 43rd Sustainment Brigade when its intelligence officer returned to the U.S. The brigade served about 5,000 soldiers, but its intelligence unit had only six people. Gaskins helped pioneer a program that used truck drivers and others to help gather intelligence that the team could analyze for evidence of corruption and bribery.
Working with MetLife in Dubai, Gaskins drew on insurance claims adjusters as SMEs to help build an automated fraud-detection solution that ultimately delivered an ROI of more than 400 percent.
The third project had some success but missed the mark when it came to buy-in from NOAA’s political leadership and, as a result, lacked a sense of urgency. The project did successfully open much of NOAA’s data to the public, though the organizations that have had the most success using it to date are the ones that have poached NOAA’s SMEs to help them understand the data available.
“It was an egalitarian style team with no titles, which allowed everyone to make decisions very easily,” Gaskins says. “We were open, transparent and this made the team really safe.”