Aug 20, 2024

AI for Small Businesses: Data Aggregation and Internet Search

In the previous posts of our series, we explored the creative possibilities of GenAI chatbots. Beyond that, the linguistic capabilities of this technology also open up applications in data processing.

Amazon, for instance, uses GenAI to aggregate customer reviews and highlight key points directly on the product page https://www.aboutamazon.com/news/amazon-ai/amazon-improves-customer-reviews-with-generative-ai. In the financial sector, the technology is expected to aggregate large volumes of data to support investment decisions in the future https://link.springer.com/article/10.1007/s11846-023-00696-z. In research, GenAI is being tested for qualitative content analysis https://journals.sfu.ca/jalt/index.php/jalt/article/view/1585/753, although this raises ethical and academic concerns https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10844801/. In this fourth post of our series, we’ll focus on the ability of GenAI chatbots to aggregate data, once again with small businesses in mind that want to experiment with GenAI without significant investment.

Before we dive in, let’s repeat our two golden rules for using chatbot services:

  • Never share confidential or sensitive information with chatbots. Your inputs could end up in training data or analyses and become accessible to other users.
  • Always check GenAI-generated texts for plausibility and appropriateness. GenAI chatbots are not experts: they often produce statements that do not match reality or your needs.

Answering Questions About Long Documents

GenAI can quickly and easily aggregate data. But how can small businesses use this in everyday practice, especially when they need to avoid sharing confidential and sensitive data with the chatbot? We’ll illustrate with an example below.

Use Case: Answering Questions from a Guidebook

Let’s return to our fictional company "GreenGrow", which we introduced in the second post of our series. GreenGrow is currently considering entering a new market segment focused on green walls. To assess the relevance of their existing expertise, the team first wants to find out which plants are suitable for green walls.

The GreenGrow team uses the City of Vienna's guidebook (pdf, 17.5 MB), a hefty 136-page document. To quickly and accurately answer the question, the team inputs the following prompt into ChatGPT and uploads the document as an attachment:

In the attachment, you’ll find the City of Vienna’s guide to green walls. Based on this document, which plants are suitable for green walls? Please limit your answer to listing the plant species and include the page numbers where you found the information. Do not make up information that isn’t in the linked document.

Here’s how ChatGPT responded to the prompt:

A list of plant species that are suitable for green walls, as aggregated by ChatGPT, based on an extensive document.

Thanks to the page references, it’s easy for GreenGrow to verify the accuracy of the information aggregated by ChatGPT and cross-check the document themselves.
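For longer answers, part of this cross-check can be automated. The following is a minimal Python sketch, not a definitive implementation. It assumes that the text of each page has already been extracted from the PDF with a tool of your choice and stored in a dictionary keyed by page number. The function name check_page_references and all sample data are hypothetical placeholders and are not taken from the actual guide. Note also that page numbers printed in a document can differ from the page positions in the PDF file, so a "not found" result is a prompt for a manual check rather than proof of an error.

```python
from typing import Dict, List, Tuple


def check_page_references(
    pages: Dict[int, str],
    claims: List[Tuple[str, int]],
) -> List[Tuple[str, int, bool]]:
    """For each (term, page) claim, report whether the term occurs on that page."""
    results = []
    for term, page_number in claims:
        # A page number missing from the extracted text counts as "not found".
        page_text = pages.get(page_number, "")
        # Case-insensitive substring match; deliberately simple.
        found = term.casefold() in page_text.casefold()
        results.append((term, page_number, found))
    return results


# Placeholder page texts (hypothetical, NOT quoted from the guide).
sample_pages = {
    1: "Species A and Species B are suitable for this system.",
    2: "Notes on maintenance and irrigation.",
}

# Placeholder claims in the form a chatbot answer might take: (plant, page).
sample_claims = [("Species A", 1), ("Species C", 2), ("Species B", 7)]

for term, page_number, found in check_page_references(sample_pages, sample_claims):
    status = "found" if found else "NOT found, check manually"
    print(f"{term} (p. {page_number}): {status}")
```

A plain substring match only confirms that a plant name appears on the cited page. It does not confirm that the page actually describes the plant as suitable for green walls, so reading the flagged and unflagged pages yourself remains necessary.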

Searching and Aggregating Information on the Internet

Another form of data processing with GenAI is making its way into internet search. The major search engine providers have integrated GenAI to summarize relevant information on their results pages.

  • Microsoft rebranded Bing as "AI-powered" early in 2023 https://blogs.microsoft.com/blog/2023/02/07/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web/.
  • Google has recently activated "AI Overviews" for certain user groups https://www.techradar.com/pro/the-genai-era-of-search-who-will-take-the-lead.
  • DuckDuckGo launched "DuckAssist" https://spreadprivacy.com/duckassist-launch/.

In addition, there are now GenAI chatbots specialized in internet research: Perplexity, Andi, You, and soon SearchGPT https://openai.com/index/searchgpt-prototype/.

This use of GenAI represents a paradigm shift from traditional, keyword-based search: information is no longer sought but asked for https://www.forbes.com/councils/forbestechcouncil/2024/06/11/the-ask-era-how-generative-ai-is-reshaping-the-future-of-search/. In our next and final post of this series, we’ll explore the potential impacts of this paradigm shift. But for now, let’s look at practical applications of internet search for our fictional company GreenGrow.

Use Case: Searching for Funding Programs

After confirming that their expertise is generally applicable to green walls, the GreenGrow team wants to explore additional funding opportunities for green wall projects. They use Perplexity with the following prompt:

What funding programs are available in Vienna that are similar to those for green facades? Please reply with a list of similar funding programs, each with a brief description.

The result is shown in the screenshot below.

A list of funding programs (with references) for green urban facades, as curated by Perplexity.ai

Conclusion and Outlook

Finally, we want to emphasize the limitations of GenAI in the context of internet research. All the inherent downsides of GenAI with respect to factual information apply here as well: it makes things up, rarely admits when it doesn’t know something, and amplifies biases https://arxiv.org/pdf/2402.11707. Moreover, the source of information in AI-generated results is hard to trace, and references are often incorrect https://arxiv.org/pdf/2304.09848. In our experience, the quality of the results we’ve obtained through internet research with GenAI has been modest; however, detailed comparison and evaluation would go beyond the scope of this article.

Despite these limitations, GenAI appears to be becoming a fixture of internet research. We hope this article has given you a glimpse into this emerging topic and that our examples have sparked some practical ideas for applying it.

In the next and final part of our series, we’ll take the liberty of drawing our own conclusions about "GenAI for small businesses" and offer our perspective on the topic. We’re particularly excited about this, so see you next time!