Apr 12, 2023

Small Dataset, Big Results: Upgrading Search with Limited Content

Nick Zadrozny

“How can my product make use of search when we only have 34 offerings?”

This is a real question a friend of mine asked me. He runs a destination elopement site.

Fair point: search implies scale, where the information you need doesn't fit on a single screen. But don’t write off search for a small amount of data just yet. There’s still an interesting application of search here.

First, search is more than typing a word into a box and then viewing a list of ten blue links. Search is about the delivery of results that are ordered relative to their usefulness to the user’s expressed need.

Search is, at its heart, a user experience puzzle. If a user shows up on a screen with a need, and you have different pieces of data that can address that need in different ways or to different degrees… you have a puzzle you can solve with search.

If the users of my friend’s product are just typing product names into a search bar, that is not a particularly difficult problem to solve. It doesn’t take much in the way of math to figure out which of the 34 products best matches its own name.

But what if users are making search queries that aren’t product names?

In what order are those 34 products presented? Now we’re talking about what’s commonly known as “discovery,” “recommendations,” or a “recommendation engine.”

If you have a blank page in your mind and are trying to sort items onto it, you will start to wonder yourself: What exactly can we use to sort these? What would I want to see? Why?

Alphabetical is underrated

Sorting by name is common enough, and a very useful approach. Especially if it’s the name that you know. Never underestimate the value of an alphabetically sorted list.

When you do know that name, you can binary-search your way through that list and quickly reach what you want.

That is, in fact, what an index is doing. But that’s a story for another time and place.
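To make the binary-search idea concrete, here is a minimal sketch using Python's standard `bisect` module. The destination names are made up for illustration, and a real search index does a good deal more than this, but the lookup principle is the same: keep the names sorted and halve the search space at each step.

```python
from bisect import bisect_left

# Hypothetical destination names, kept in sorted order.
destinations = sorted(["Sedona", "Key West", "San Diego", "Asheville", "Big Sur"])

def find(names, target):
    """Return the position of target in the sorted list, or -1 if absent."""
    i = bisect_left(names, target)
    if i < len(names) and names[i] == target:
        return i
    return -1

def with_prefix(names, prefix):
    """Return every name starting with prefix, using two binary searches."""
    lo = bisect_left(names, prefix)
    # "\uffff" sorts after any ordinary character, so this marks the end
    # of the block of names that share the prefix.
    hi = bisect_left(names, prefix + "\uffff")
    return names[lo:hi]

print(find(destinations, "Key West"))
print(with_prefix(destinations, "S"))
```

The prefix lookup is the part that starts to feel like search: type "S" and the two candidates beginning with that letter come back without scanning the whole list.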

The name is a feature of the product. But, again, we’re setting that feature aside. No search bar. The user has a need, is searching for a solution, and does not already know which one.

Beyond alphabetical

What we’re looking for are features beyond the obvious. This is where it helps to get more concrete with the example. And in this case, those 34 different products are destination elopement packages.

Plenty of people know where they want to get married. But some are flexible and want some ideas.

So what else do we know? In no particular order:

  • Destinations have a distance. Websites can guess reasonably accurately where someone is browsing from. That distance comes with travel time and cost, if you want to get fancy. That might be on someone’s mind when making plans.
  • Destinations have a climate. Average temperatures, precipitation, humidity. Probably tough to take pictures when it’s windy.
  • Destinations have so much more to offer than the name of the city they’re in. Landmark architecture, famous landscapes, the list goes on.

We can keep going, and if you want to sort your stuff well, you will. This is called feature engineering in search parlance, and it’s a big topic. It tends to be one where deep familiarity with the domain can produce some really interesting and not immediately apparent connections between the body of information and the users looking for solutions within it.
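As a rough illustration of turning one of those ideas into a sortable feature, the sketch below computes great-circle distance from a visitor's estimated location with the haversine formula. The records are hypothetical: the coordinates are approximate, the climate numbers are placeholders rather than real measurements, and the field names are my own invention.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))

# Illustrative records only: approximate coordinates, placeholder climate values.
destinations = [
    {"name": "Key West", "lat": 24.56, "lon": -81.78, "avg_high_c": 28},
    {"name": "San Diego", "lat": 32.72, "lon": -117.16, "avg_high_c": 21},
    {"name": "Sedona", "lat": 34.87, "lon": -111.76, "avg_high_c": 24},
]

def sort_by_distance(visitor_lat, visitor_lon, items):
    """Order destinations from nearest to farthest for one visitor."""
    return sorted(
        items,
        key=lambda d: haversine_km(visitor_lat, visitor_lon, d["lat"], d["lon"]),
    )

# A visitor whose IP address geolocates to roughly Phoenix, Arizona.
for d in sort_by_distance(33.45, -112.07, destinations):
    print(d["name"])
```

Distance is only one feature. The same pattern (compute a number per destination, then sort or filter on it) applies to climate, travel cost, or anything else your domain knowledge suggests.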

Measuring popularity

There are properties here that go beyond the items in the list. And those are the ways in which a business relates to those items.

How many people are looking at something, interacting with it, becoming a paying customer because of it? Something’s popularity is a feature that the business itself is uniquely equipped to measure.

As you begin to categorize personas within your customer base and patterns of popularity within them, then welcome… you are now working with machine learning to power your search relevancy algorithm. Didn’t take much to get here, did it?

Why sort by sunshine when it’s just as easy to see that users in snow-bound settings are more likely to look at San Diego and Key West in winter months?
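Here is a small sketch of that idea. It is plain counting rather than a trained model, which is often where this kind of work starts anyway. The event log, the segment labels ("snowy", "warm"), and the choice of winter months are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (visitor segment, month, destination viewed).
events = [
    ("snowy", 1, "San Diego"),
    ("snowy", 1, "Key West"),
    ("snowy", 1, "San Diego"),
    ("snowy", 2, "Sedona"),
    ("warm", 1, "Sedona"),
    ("warm", 1, "Sedona"),
    ("warm", 7, "San Diego"),
]

def popularity_by_segment(log, winter_months=(12, 1, 2)):
    """Count views per destination for each (segment, season) pair."""
    counts = defaultdict(Counter)
    for segment, month, destination in log:
        season = "winter" if month in winter_months else "other"
        counts[(segment, season)][destination] += 1
    return counts

def rank(counts, segment, season, catalog):
    """Sort the catalog by popularity within a segment, then by name."""
    seen = counts.get((segment, season), Counter())
    return sorted(catalog, key=lambda name: (-seen[name], name))

counts = popularity_by_segment(events)
catalog = ["Key West", "San Diego", "Sedona"]
print(rank(counts, "snowy", "winter", catalog))
print(rank(counts, "warm", "winter", catalog))
```

Note the fallback: a segment with no history gets the alphabetical list, so a brand-new kind of visitor still sees something sensible.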

So this may not be what immediately comes to mind when thinking about “search,” but discovery is just as much the discipline of ordering items according to an educated guess about visitors’ needs and preferences.

Still, we have more to unpack.

Understanding sentiment beyond text

Again, search implies scale: a volume of useful information that doesn’t fit on the screen. And while the presentation is simple and elegant, I happen to know that my friend’s product experiences a sizable amount of usage.

It produces two supremely useful kinds of content:

  1. Reviews
  2. Photographs

Reviews

It’s one thing to want an elopement and scroll a list of destinations sorted alphabetically by name. But sometimes it takes reading a story about a Star Wars themed wedding in Sedona to make you realize that you really wanted Exactly That.

Now we go beyond the places themselves and into the many thousands upon thousands of reviews from people telling their stories. How did others tweak and craft their experiences, and how can I use that to find something that doesn’t exist yet: my own ideal of the perfect party?
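A toy version of searching those stories might look like the sketch below: a tiny inverted index that maps each word to the destinations whose reviews mention it. The reviews are invented for the example, and the scoring (count of matching reviews per query term) is deliberately naive. A real search engine would add proper relevance scoring, stemming, and much more.

```python
import re
from collections import defaultdict

# Hypothetical reviews: (destination, text).
reviews = [
    ("Sedona", "Our Star Wars themed wedding among the red rocks was perfect."),
    ("Sedona", "Sunset ceremony, tiny guest list, zero stress."),
    ("Key West", "We said our vows on the beach at sunset."),
    ("San Diego", "Beach ceremony followed by tacos. No regrets."),
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(items):
    """Map each term to {destination: number of reviews mentioning it}."""
    index = defaultdict(lambda: defaultdict(int))
    for destination, text in items:
        for term in set(tokenize(text)):
            index[term][destination] += 1
    return index

def search(index, query):
    """Score destinations by how many reviews match each query term."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for destination, n in index.get(term, {}).items():
            scores[destination] += n
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

index = build_index(reviews)
print(search(index, "star wars wedding"))
print(search(index, "beach sunset"))
```

The results are destinations, not reviews. The reviews are evidence about the 34 products, which is what keeps this a small-dataset problem even when the review count is large.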

Photographs

Keyword search gets to be a bit more useful once there is review text to search. But this is the 2020s. We can do better.

Photographs. So many photographs.

And we all know the saying: “A picture is worth a thousand words.”

It turns out that over the last few years software has gotten pretty good at figuring out exactly which words.

Remember that Star Wars wedding? Nobody even has to write the word… a photograph of Darth Vader as the officiant does the trick.

Check out this example from the BLIP-2 paper: [figure not reproduced here]

So, back to our search puzzle of destinations. Which destination has the best potential for romantic sunsets over a body of water in the summertime?

For a business sitting on top of thousands of reviews and photographs, this is a solvable query with modern search tools.
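To sketch how photographs could feed that kind of query, suppose each photo has already been run through an image captioning model and the captions stored. The captions below are hand-written stand-ins for that output, not real model results, and the ranking (the share of a destination's photos whose caption contains every query word) is one simple choice among many.

```python
import re

# Hypothetical captions, standing in for the output of an image captioning model.
photo_captions = {
    "Key West": [
        "a couple kissing on a pier at sunset over the ocean",
        "a bride holding flowers on a beach",
        "two people watching the sunset over the water",
    ],
    "San Diego": [
        "a wedding party on a beach at sunset",
        "a couple eating tacos",
    ],
    "Sedona": [
        "a couple standing in front of red rock cliffs",
        "a man dressed as darth vader officiating a wedding",
    ],
}

def tokens(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def rank_by_captions(captions, query):
    """Rank destinations by the share of photos matching every query word."""
    wanted = tokens(query)
    scored = []
    for destination, texts in captions.items():
        hits = sum(1 for t in texts if wanted <= tokens(t))
        share = hits / len(texts) if texts else 0.0
        scored.append((destination, share))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

print(rank_by_captions(photo_captions, "sunset"))
print(rank_by_captions(photo_captions, "darth vader"))
```

Literal word matching has obvious limits: a search for "water" would miss the photo captioned with "ocean". Closing that gap is where embedding-based similarity usually comes in, but the shape of the problem stays the same: turn photos into something searchable, then roll the matches up to the destination.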

Conclusion

Even if you have a small dataset to search, there’s always more to do than return an alphabetically sorted list.

Stages:

  1. Beginner-friendly — alphabetical
  2. Intermediate — filters and popularity
  3. Advanced — sentiment with reviews and photographs

Remember: Search is a user experience puzzle. Even with a small dataset, there are several creative ways to upgrade search relevance.