For ecommerce applications, matching users with the items they want is the name of the game. If they can’t find what they want, then how can they buy anything? Typically, this functionality is provided through the search and browse experience. Search allows users to type in text and match against the text of the items in the inventory. Browse allows users to select filters and slice and dice the inventory down to the subset they’re interested in. But with the shift toward mobile devices, no one wants to type anymore—thus browse is becoming dominant in the ecommerce experience.
But there’s a problem if your inventory isn’t categorized. Perhaps your inventory is user generated or generated by external providers who don’t tag and categorize the inventory. No categories and no tags means no browse experience and missed sales. You could hire an army of taxonomists and curators to tag items, but training and curation will be expensive. You can demand that your providers tag their items and adhere to your taxonomy—but providers will buck this new requirement unless they see obvious and immediate benefit. Worse, providers might use tags to game the system—artificially placing themselves in the wrong category to drive more sales. Worst of all, creating the right taxonomy is hard. You have to structure a taxonomy to realistically represent how your customers think about the inventory.
Eventbrite is investigating a tantalizing alternative: using a combination of customer interactions and machine learning to automatically tag and categorize its inventory. As customers interact with the platform (searching for events, then clicking on and purchasing the ones that interest them), Eventbrite implicitly gathers information about how its users think about its inventory. Search text effectively acts like a tag, and a click on an event card is a vote that the clicked event is representative of that tag. Eventbrite uses this stream of information as training data for a machine learning classification model, and as Eventbrite receives new inventory, it can automatically tag it with the text that customers will likely use when searching for it. This makes it possible to better understand the inventory, as well as supply and demand. Most importantly, it allows Eventbrite to build the browse experience that customers demand.
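To make the idea concrete, here is a minimal sketch of how search text and clicks might be turned into tags. It is not Eventbrite's implementation: the click log, event titles, function names, and the `min_votes` threshold are all hypothetical, and a simple token co-occurrence counter stands in for the machine learning classifier the session describes.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (search text, id of the clicked event). Invented data.
CLICK_LOG = [
    ("yoga", "e1"),
    ("Yoga", "e1"),
    ("yoga", "e2"),
    ("jazz", "e3"),
    ("jazz", "e3"),
    ("live music", "e3"),
]

# Hypothetical event titles keyed by event id.
EVENT_TITLES = {
    "e1": "Sunrise yoga in the park",
    "e2": "Beginner yoga and meditation",
    "e3": "Friday night jazz quartet",
}


def normalize(text):
    """Lowercase and collapse whitespace so 'Yoga' and 'yoga' count together."""
    return " ".join(text.lower().split())


def build_training_pairs(click_log, min_votes=2):
    """Treat each click as a vote that the event matches the search text.

    Returns (event_id, tag) pairs for tags with at least min_votes clicks.
    The threshold is an assumed way to filter out one-off, noisy clicks.
    """
    votes = defaultdict(Counter)
    for query, event_id in click_log:
        votes[event_id][normalize(query)] += 1
    return [
        (event_id, tag)
        for event_id, counts in votes.items()
        for tag, n in counts.items()
        if n >= min_votes
    ]


def train_token_model(pairs, titles):
    """Count how often each title token co-occurs with each tag.

    This is a deliberately simple stand-in for a real text classifier.
    """
    model = defaultdict(Counter)
    for event_id, tag in pairs:
        for token in set(normalize(titles[event_id]).split()):
            model[token][tag] += 1
    return model


def suggest_tags(model, title, top_n=3):
    """Score tags for a new, unseen title by summing token co-occurrence counts."""
    scores = Counter()
    for token in set(normalize(title).split()):
        scores.update(model.get(token, {}))
    return [tag for tag, _ in scores.most_common(top_n)]


pairs = build_training_pairs(CLICK_LOG)
model = train_token_model(pairs, EVENT_TITLES)
print(suggest_tags(model, "Evening jazz session"))
```

In practice the labeled pairs would feed a proper text classifier, and the vote threshold would need tuning against real traffic; the sketch only shows how click behavior can become training labels for inventory that arrives untagged.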
John Berryman takes a deep dive into the problem space and Eventbrite’s approach. He explores how the company gathered training data from its search and click logs, and how it built and refined the model. You’ll see the output of the model, the positive results of Eventbrite’s work, and the work left to be done. You’ll leave with some new ideas to take back to your business.
Prerequisite knowledge
- A basic understanding of machine learning involving text manipulation, classification algorithms, and neural networks
What you'll learn
- Learn a clever technique for generating product tags from customers' search behavior
This session is from the 2019 O'Reilly Strata Conference in New York, NY.