Overcoming Dyn – How DataXu Avoided Disaster On the Day The Internet Went Dark
March 13th, 2017 Neel Sharma
This post originally appeared on ExchangeWire.
On October 21st, 2016, the internet faced its largest DDoS attack on record, with hackers knocking down dozens of the web’s top sites like Amazon, Twitter, Reddit, PayPal, Spotify, Netflix, Pinterest, GitHub and many more in one fell swoop in the now famous Dyn cyberattack.
For those outside the marketing world, the Dyn attacks of October 21st were mostly a major inconvenience that cut off access to their favorite sites. For advertisers, though, they meant millions of ad dollars were at risk of entering a black hole.
We recently spoke with Imran Malek, Supply Chain Product Manager at DataXu, about how his team used Metamarkets to manage the programmatic spending of their partners and mitigate the huge risks presented by the unusual Dyn attacks.
Entering the War Room
DataXu is an independent programmatic platform that helps CMOs and their agencies around the world leverage the power of programmatic advertising to drive marketing ROI. At DataXu, Malek is responsible for the health of the company’s inventory supply chain, as well as the technical integration within that supply chain.
On an otherwise typical Friday, Malek arrived at work and, together with his team, discovered the irregularities caused by the Dyn outage at 8:31 a.m. ET. He gathered a group that included his VP of engineering, several engineering directors and technical engineers in an office “war room” to determine the scope of the issue.
“A lot of the internet went down, including some of our infrastructure, because a lot of our main routing was being done through Dyn,” said Malek. “We immediately noticed some intermittent network failures and responded quickly to make sure they wouldn’t impact our customers. Our real-time bidding systems were holding up pretty well despite the outages, but we couldn’t be 100% sure of the impact. The more pressing concern was that we were bidding on requests without all the right information for a short period of time.”
With the issues outlined, Malek’s team began coordinating with DataXu’s exchanges and partners to explain the situation and have them stop sending bid requests. From the early moments in the war room, they pulled up data from Metamarkets on a big screen to display the results of each change in real time.
“We decided to pump the brakes and flip the kill switch, turning off all the bidders to make sure we didn’t over-spend on behalf of our customers,” said Malek. “Once we turned everything off, we needed to confirm as quickly as possible that everything was in fact getting turned off. We wanted to get good feedback from all of our bidder partners about whether or not they could confirm that we are not sending them any more traffic when they send us requests.”
That’s where Metamarkets came in handy. Malek says that with the interactive time-series charts from Metamarkets, they could see whether any bid curves were still rising, which they could not determine with certainty from the affected systems. They had easy access to any data they needed in the Metamarkets dashboards and, upon switching off any particular stream, near-instant confirmation that they had mitigated the risk of runaway spend.
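The check described above, confirming that each stream really went quiet after the kill switch, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the stream names and the per-minute counts are hypothetical, and it does not represent how Metamarkets or DataXu implement their dashboards.

```python
from typing import Dict, List


def streams_still_bidding(
    series: Dict[str, List[int]], cutoff_index: int, tolerance: int = 0
) -> List[str]:
    """Return the streams that still show bids after the kill switch.

    `series` maps a (hypothetical) stream name to per-minute bid counts.
    `cutoff_index` is the position of the first minute after the switch.
    A stream is flagged if any later count exceeds `tolerance`.
    """
    active = []
    for name, counts in series.items():
        after_cutoff = counts[cutoff_index:]
        if any(count > tolerance for count in after_cutoff):
            active.append(name)
    return sorted(active)


# Made-up sample data: the kill switch was flipped before minute index 2.
sample = {
    "exchange_a": [120, 115, 0, 0, 0],
    "exchange_b": [90, 88, 40, 35, 30],
    "exchange_c": [50, 52, 1, 0, 0],
}
print(streams_still_bidding(sample, cutoff_index=2))
```

A small tolerance can be passed to ignore a stray bid or two that was already in flight when the switch was flipped.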
The result? Malek says spending came in only about 20% lower than on an average Friday. “I’m sure that if we didn’t have that real-time feedback mechanism available, that number could have been much higher and we could have lost a full day’s worth of spend as opposed to only losing a very small portion of it,” said Malek. With potential runaway spend that could have totaled in the 200,000 to 300,000 dollar range, he says they ended up losing only about 4,000 dollars that day.
“Having a real-time data tool like Metamarkets to have that instant feedback is something that I think you don’t realize how much you need it until it’s an emergency, so it was great to have that.”
Another Dyn attack hit around 12:15 p.m. ET, requiring another round of communications from the DataXu team to halt bidding. They continued to track Metamarkets real-time data in the corner of the war room, and when the attacks came to a close that evening, they slowly ramped systems back up to 100% by around 7:30 p.m. ET, using Metamarkets data to confirm everything was back and working smoothly.
From User to Customer
DataXu’s development team had been using Metamarkets through their exchange partners for years, accessing logins from those individual partners to view programmatic data. But the crucial insights provided during the Dyn attack helped push them over the edge to work directly with Metamarkets for their programmatic analytics needs.
“We were already in the conversation process to be a Metamarkets customer when the outage happened, but this really cemented that goal,” said Malek.
Malek says one of the main advantages he gets today from using Metamarkets is understanding the bid stream at the level of granularity they need during troubleshooting. DataXu has several internal metrics dashboards, but Malek says they often serve end-point or hardware-specific purposes.
Metamarkets provides him with the ability to drill down on his data and answer exactly the questions he needs to improve revenue for his clients. Rather than spending valuable time and taking up loads of computing power digging deep into log files, Metamarkets presents the data in an easy-to-understand, interactive format.
For example, Malek says that a deal ID that isn’t spending is often the result of a creative mismatch or a domain that keeps getting blacklisted. Metamarkets allows him to quickly identify that issue by checking whether requests for that ID are coming mostly from one domain. The team can then work to take that domain off the blacklist and start the campaign spending again.
“I can then see the entire performance of that deal ID in our bid stream right in front of me,” said Malek. “I can know whether or not that deal ID is actually spending – and if not, I can use Metamarkets to drill down to the reasons why.”
“Without Metamarkets, we’d have to pull through the data logs, line by line, looking for that specific deal ID. We might be able to identify a trend that they all have the same domain, but at some point when you’re looking at multiple log lines in sequence, your eyes can glaze over,” said Malek. “Whereas in Metamarkets you can actually see these patterns aggregated up – the correlation between deal IDs and domains stands out immediately.”
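To make the aggregation Malek describes concrete, here is a minimal Python sketch that rolls parsed log records up by domain for one deal ID. The record fields (`deal_id`, `domain`), the function name and the sample values are assumptions made for illustration; they do not reflect DataXu’s actual log format or the Metamarkets product.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def dominant_domain(
    records: Iterable[Dict[str, str]], deal_id: str
) -> Tuple[str, float]:
    """Return the most common domain for a deal ID and its share of requests.

    Each record is assumed to be a dict with "deal_id" and "domain" keys.
    Returns ("", 0.0) when no record matches the deal ID.
    """
    counts = Counter(r["domain"] for r in records if r["deal_id"] == deal_id)
    total = sum(counts.values())
    if total == 0:
        return ("", 0.0)
    domain, hits = counts.most_common(1)[0]
    return (domain, hits / total)


# Made-up sample records standing in for parsed log lines.
sample_records = [
    {"deal_id": "D1", "domain": "news.example"},
    {"deal_id": "D1", "domain": "news.example"},
    {"deal_id": "D1", "domain": "news.example"},
    {"deal_id": "D1", "domain": "blog.example"},
    {"deal_id": "D2", "domain": "shop.example"},
]
print(dominant_domain(sample_records, "D1"))
```

A high share for a single domain is the kind of pattern that is hard to spot line by line but obvious once the counts are aggregated.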
With their full integration with Metamarkets now in place, DataXu will be able to apply that process to all of their data directly. So you can be sure that when the next big outage hits, they’ll be ready to respond in real time.