Black Friday, Cyber Monday. They’re two of the most important days of the year for retailers.
According to the NRF, the holiday season can represent as much as 30% of annual sales for retailers. And data from Black Friday 2018 showed an average 307% increase in online orders compared to a typical day.
Those enormous increases in purchases have defined the holiday season, and Black Friday in particular, for a long time. But the shift to eCommerce has added new complexity, as retailers must now make sure all of their online systems can support those increases in traffic.
As the provider of technology that plays a key role for eCommerce, Bluecore takes preparing for the increased traffic during the holiday season very seriously.
To learn more about what’s involved and how Bluecore aims to instill confidence in retailers during their busiest time of the year, I sat down with Morris Chen, Engineering Operations Manager at Bluecore, to talk about the Bluecore Engineering team’s holiday readiness efforts.
Sharon Shapiro: What does the holiday season mean to Bluecore?
Morris Chen: The holiday season is the biggest time of year for our customers, all of whom are retailers. Historically, the email volume we see on Black Friday is 4x higher than the volume on the peak day during the previous month. And it’s not just email volumes that increase. Catalog sizes grow, prices adjust and stock status changes rapidly, all of which Bluecore ingests in real time. That’s a significant spike in activity (both output and input), and Bluecore plays a critical role in supporting those increased volumes. We need to be prepared for the extra traffic and maintain our year-round reliability metrics during the most critical time of the year for retailers.
SS: That’s a lot of pressure. What exactly is at stake here?
MC: Most retailers’ year-end goals are driven by sales during the holiday season, particularly on Black Friday and Cyber Monday. Because of that, we need to make sure Bluecore is prepared to help our brands meet (and hopefully exceed) their goals.
If brands can’t send emails as planned, all of the enormous prep work they put into the holiday season (not to mention all the revenue riding on it) will be stopped in its tracks. We’ve seen the impact of major system outages at other ESPs in the past, and it hasn’t been good. We’ve also learned a lot from our years of experience handling holiday traffic, which has strengthened the reliability of our technology and processes. Retailers depend on Bluecore to drive revenue from highly personalized email campaigns during this time of year, and we want to make sure they can trust us to deliver on that promise.
SS: How does Bluecore prepare for this influx of activity during the holiday season?
MC: Our whole engineering team works together to support holiday readiness. To kick off the readiness activities, each engineering squad re-assesses our platform to determine what might be susceptible to breaking under increased volumes. Then we develop a plan for what we need to build or improve to mitigate those risks. A key part of this preparation is alerting our third-party partners, especially Google Cloud Platform, to make sure they can handle the increased volumes coming from our end.
Overall, our holiday readiness activities include:
- Forecasting: Understanding how much activity we can expect in the upcoming holiday season based on last year’s holiday season traffic
- Monitoring: Identifying and filling in gaps where we lack coverage for increased volumes
- Load testing: Running tests to ensure our various systems and infrastructure can handle the increased traffic
- Hardening: Taking on development work to mitigate risk (e.g., follow-up items from past critical issues)
- Coordination with third-party partners: Ensuring our partners (e.g., Google) are aware of our increased traffic and discussing any risk mitigation we need to do as a result
- Operations policies: Establishing code freezes to ensure system stability and setting on-call rotations for our engineers so someone is always available to address any issues immediately no matter the day or time
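As a rough illustration of the forecasting item above, here is a minimal sketch that projects Black Friday email volume from the previous month's daily volumes. It uses the 4x figure quoted earlier in this interview and reads it as "four times the prior month's peak day", which is an interpretation. The function name, the constant, and the sample numbers are hypothetical and are not Bluecore's actual forecasting model.

```python
from typing import Sequence

# Multiplier taken from the interview: Black Friday email volume has
# historically been about 4x the peak day of the previous month.
# Reading "4x higher" as "four times" is an assumption.
BLACK_FRIDAY_MULTIPLIER = 4.0


def forecast_black_friday_volume(
    prior_month_daily_volumes: Sequence[int],
    multiplier: float = BLACK_FRIDAY_MULTIPLIER,
) -> int:
    """Project Black Friday volume from the prior month's peak day.

    Hypothetical helper for illustration only.
    """
    if not prior_month_daily_volumes:
        raise ValueError("need at least one day of volume data")
    peak_day = max(prior_month_daily_volumes)
    return round(peak_day * multiplier)


if __name__ == "__main__":
    # Made-up daily send counts for the previous month.
    sample = [100, 250, 180]
    print(forecast_black_friday_volume(sample))
```

A real forecast would likely also account for year-over-year growth and for input volumes such as catalog and site events, which this sketch leaves out.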
SS: That sounds like a lot. When do you start all of those activities?
MC: It’s already underway! Much like it does for our customers, planning for the holiday season starts in the summer for Bluecore’s engineering team.
SS: How do you instill confidence that Bluecore can support the increased demand?
MC: Part of the ethos of the Bluecore engineering team is that hope is not a strategy. So for our holiday readiness work, we run extensive “game day” tests.
Game days are when we load the system with traffic at 8-10x our regular volume, which is historically more than twice what we actually see on Black Friday and Cyber Monday (that way we can make sure we are more than ready for the actual spike in traffic). We’ll do a game day and ramp up load until something breaks, then we’ll figure out what broke and fix it. Then we rinse and repeat. What’s really important is that we do game days in our production environment to make sure we are actually ready. While this type of testing is expensive from a resourcing standpoint, we think it’s important to do because test environments don’t always have the exact same configuration as what’s live. As a result, running these tests in our production environment means there shouldn’t be any surprises during the actual holidays.
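The "ramp up load until something breaks" loop described above could be sketched roughly as follows. This is a toy model under stated assumptions: `ramp_until_failure` and `make_simulated_system` are hypothetical names, the health check is a stand-in for real monitoring, and the simulated capacity is invented. It only shows the shape of the ramp, not how Bluecore actually drives production load.

```python
from typing import Callable, Optional


def ramp_until_failure(
    is_healthy_at: Callable[[float], bool],
    start: float = 1.0,
    stop: float = 10.0,
    step: float = 1.0,
) -> Optional[float]:
    """Increase the load multiplier step by step.

    Returns the first multiplier at which the health check fails,
    or None if the system stays healthy all the way up to `stop`.
    """
    multiplier = start
    while multiplier <= stop:
        if not is_healthy_at(multiplier):
            return multiplier
        multiplier += step
    return None


def make_simulated_system(capacity: float) -> Callable[[float], bool]:
    """Stand-in for a real system: healthy while load <= capacity."""
    return lambda multiplier: multiplier <= capacity


if __name__ == "__main__":
    # First game day: the simulated system breaks somewhere below 10x.
    print(ramp_until_failure(make_simulated_system(capacity=6.5)))
    # After a fix raises capacity, the same ramp finds no failure.
    print(ramp_until_failure(make_simulated_system(capacity=12.0)))
```

In this picture, each game day is one call to the ramp, and the fix between runs is whatever raises the point at which the check first fails.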
Then we set code freezes, which are periods of time during which we don’t change any of the code for our platform. These freezes are important for making sure our game days are accurate because they give our platform stability. If we didn’t enforce a code freeze, then any changes to the code might result in an outcome we didn’t anticipate during our game days. We actually start with a code “slush” a month before Black Friday, which means we don’t deliver any major changes to the platform, only bug fixes or critical updates. Then the week before Black Friday we institute a full code freeze so we have full confidence that the results from our last game days are relevant.
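To make the slush and freeze timeline concrete, here is a small sketch that classifies a calendar date into a release phase. It assumes "a month before" means 30 days and "the week before" means 7 days, and it derives Black Friday as the day after the fourth Thursday of November. The interview does not say when the freeze ends, so dates after Black Friday are reported as "unspecified". All names here are hypothetical.

```python
from datetime import date, timedelta

SLUSH_LEAD = timedelta(days=30)  # assumption: "a month" read as 30 days
FREEZE_LEAD = timedelta(days=7)  # assumption: "the week before" read as 7 days


def black_friday(year: int) -> date:
    """Day after US Thanksgiving (the fourth Thursday of November)."""
    nov_first = date(year, 11, 1)
    # date.weekday(): Monday is 0, so Thursday is 3.
    days_to_first_thursday = (3 - nov_first.weekday()) % 7
    fourth_thursday = nov_first + timedelta(days=days_to_first_thursday + 21)
    return fourth_thursday + timedelta(days=1)


def release_phase(day: date) -> str:
    """Classify a date as 'normal', 'slush', 'freeze', or 'unspecified'."""
    bf = black_friday(day.year)
    if day > bf:
        # The end of the freeze is not stated in the interview.
        return "unspecified"
    if day >= bf - FREEZE_LEAD:
        return "freeze"  # no code changes at all
    if day >= bf - SLUSH_LEAD:
        return "slush"   # bug fixes and critical updates only
    return "normal"


if __name__ == "__main__":
    for d in (date(2019, 10, 1), date(2019, 11, 1), date(2019, 11, 25)):
        print(d.isoformat(), release_phase(d))
```

A check like this could sit in front of a deploy pipeline to block or flag changes by phase, though where such a gate would live is outside what the interview covers.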
SS: What do you wish retailers knew about everything your team is doing right now to prepare for the holidays?
MC: How seriously we take this and how much time we put into it. We’re results-oriented, so we need to ensure the success of our customers during their biggest time of need. Many marketing systems (especially legacy ESPs) have outages during this time. We make it possible for our customers to dynamically shift their loads to Bluecore. Last year we had multiple brands switch sends over to Bluecore at the last minute, and we knew from our preparation that we could handle them without issue.
There’s an enormous amount of work that goes on behind the scenes to make that smooth holiday season possible for our brands. We start preparing during the summer, and it takes a lot of coordination to make sure we can handle the influxes in traffic to support our customers during their most critical time of the year. From June on, we’re behind the scenes tweaking and monitoring to make sure we can handle the increased volumes that will start in November.
However, just as with a major exam like the SATs, we can’t cram all of the prep into the summer. Throughout the entire year, we follow a sophisticated SRE (site reliability engineering) policy that ensures we learn and evolve from all production incidents. During the summer, we run load tests to get a realistic picture of what to expect in November.
I’m really proud of the work we put in. Each year we hit record highs in terms of site events ingested and volume served (and for the past three years we have had a stellar reliability record). Despite the technical challenges that go into pushing our system beyond its previously known limits, we always find a way to make it happen.
SS: What’s the biggest lesson you’ve learned from doing this type of prep work for previous holiday seasons?
MC: Start early, prepare early, forecast early! That way we can approach the holiday season with zen-like confidence and an agile ability to respond. The nice thing is that we have a formula developed from lessons learned in the past, so we have a good idea of the kind of forecasting and work we need to do to ensure we have a successful holiday season.
SS: What are you most looking forward to this holiday season?
MC: Pushing our system further than it’s ever been in the past and seeing our customers get value from that. It’s incredibly rewarding to see exactly what we can do. Finally, the work we put in to prepare is something for our whole team to be proud of.