Yabble answers ESOMAR’s 20 Questions to Help Buyers of AI-Based Services for Market Research and Insights
The realm of Generative AI in market research is expanding, introducing both groundbreaking opportunities and complex challenges. Recognizing the need for clarity and trust in this dynamic environment, ESOMAR, the global insights community, has meticulously crafted "20 Questions to Help Buyers of AI-Based Services for Market Research and Insights." These questions serve as a beacon for researchers, offering a structured framework to navigate the intricacies of employing AI technologies responsibly.
Yabble is proud to present its comprehensive responses to ESOMAR's questions. As a company at the forefront of leveraging AI for insights, Yabble embodies the fusion of innovation and commitment to accuracy and quality. Our responses aim to shed light on how we harness the power of AI to provide cutting-edge, trustworthy data analysis and data creation solutions.
A. Company profile
1. What experience and know-how does your company have in providing AI-based solutions for research?
Founded in 2017, Yabble is a cutting-edge generative AI company that's revolutionizing the world of insights and working with Fortune 500 companies globally to bring AI into their insights practice. A first-of-its-kind insights ecosystem built on game-changing AI products, we help brands enrich their customer understanding and generate transformative knowledge that drives growth and innovation. Acknowledged in the GRIT Top 50 Most Innovative Supplier list for 2023, and the GRIT Top 25 Most Innovative Market Research Technology Supplier list for 2023, Yabble is established as a leader in this space.
Kathryn Topp, an award-winning researcher, winner of the ESOMAR Insights250, and AI entrepreneur with more than 25 years of insights industry experience, founded Yabble alongside co-founder Rachel O’Shea, who brings extensive expertise leading marketing teams in large organizations globally. Yabble is one of the longest-standing users of Generative AI technology for insights, and it has been Kathryn's passion and mission to ensure Yabble is at the cutting edge of AI, helping insights teams see the opportunities in data analysis and data creation – freeing them up to focus on extracting the critical insights.
Kathryn Topp (left) and Rachel O'Shea (right) – Yabble founders
Comprised of experts from six different countries, our team has a relentless focus on building world-leading products using state-of-the-art artificial intelligence technology. That means consistently ensuring we’re developing our business using the best of the best, including our own proprietary algorithms and the world's best LLMs. We have a deep and long-standing relationship with OpenAI and are one of only nine companies featured as a success story on their site, a fact that we here at Yabble are very proud of. You can read the case study here.
2. Where do you think AI-based services can have a positive impact for research? What features and benefits does AI bring, and what problems does it address?
At Yabble, we believe AI-based services usher in a transformative era for research, where the synergy between advanced technology and human insight opens new frontiers of understanding and innovation. AI's capabilities to process and analyze vast datasets at unprecedented speeds enable researchers to uncover patterns and insights infinitely faster and more cost-efficiently than traditional methodologies.
Features such as automated analytics, predictive analysis, and using natural language to mine and understand data not only streamline data collection and analysis processes but also provide a more nuanced understanding of consumer behaviors and trends. This leap in efficiency and insight addresses critical challenges in traditional research methods, including time-consuming data analysis, the limitation of sample sizes and accuracy, and the ability to adapt to rapidly changing market dynamics.
With the introduction of synthetic and augmented data models to the industry, AI has democratized access to insight and unlocked valuable understanding of customer segments in a fraction of the time and cost associated with traditional data creation methods. Instead of depending exclusively on surveys or human participants, generative AI acts as a substitute for human input by utilizing vast datasets – normally broader, more up-to-date and more accurate than those obtained through traditional research gathering mechanics.
By harnessing AI, Yabble empowers insights creators to make data-driven decisions faster, with greater confidence, and with a deeper understanding of the complex factors driving market movements and consumer decisions.
3. What practical problems and issues have you encountered in the use and deployment of AI? What has worked well and how, and what has worked less well and why?
Whenever we develop our products, we do extensive research, development and validation to minimize hallucination, minimize bias, and implement guardrails to ensure that any insights created are as reliable and as accurate as possible.
The responsible use of AI is an integral part of our product development process, which means we need to think ahead constantly and be proactive around use cases and ‘what if’ scenarios. For example, with our Virtual Audiences and Gen products we wanted to ensure we have guardrails around both inputs and outputs, considering best practices from the market research industry (e.g. researching acceptable age groups) as well as from the use of LLMs, and ensuring we build in custom guidelines and moderation on top of the tools provided by AI model providers (such as OpenAI’s moderation capabilities).
B. Is the AI capability/service explainable and fit for purpose?
4. Can you explain the role of AI in your service offer in simple, non-technical terms in a way that can be easily understood by researchers and stakeholders? What are the key functionalities?
Yabble is a fully secure, leading AI solution for every stage of research, created specifically for insights. Whether it's survey data, review data, interview transcripts, or other unstructured data – Yabble’s data analysis products are revolutionary, AI-powered tools providing users with useful, actionable insights in minutes. They enable you to count, theme, and summarize your unstructured text data 1000x faster than a human at extremely high rates of fidelity. You can also interact with your data using conversational language through our AI research assistant, Gen, to uncover even richer levels of insight.
For times when you need to create data, our AI-generated data solution, Virtual Audiences™, allows you to potentially bypass the time and costs associated with traditional fieldwork/desk research and dive straight into creating insights. Powered by Yabble's proprietary Augmented Data model, Virtual Audiences are infinitely knowledgeable. We use the combined knowledge of Large Language Models (LLMs), recent and relevant trend data, social data, behavioral statistics and, where appropriate, your proprietary datasets to generate the latest, most relevant insights for you in minutes, producing accurate, deep and meaningful insights on almost any topic. No fieldwork, no quota management, no waiting weeks.
Yabble’s solutions are built using a variety of Large Language Models (LLMs) as a foundational layer and then intelligently combined with proprietary algorithms, allowing you to create and analyze data at revolutionary speed. Importantly, Yabble has invested tens of thousands of hours of human-based reinforcement to train and refine our AI’s performance, meaning nothing is “off-the-shelf” from a public LLM. Consequently, Yabble requires little to no effort on the part of the user to upload their data into our system and natively accepts over 100 languages. Yabble is also multi-modal, meaning we can accept data in text format as well as media such as audio or video files. All of this means that Yabble allows users to skip manual translations, text analytics, open-end coding and sentiment analysis and begin working with their data in minutes rather than days.
5. What is the AI model used? Are your company’s AI solutions primarily developed internally or do they integrate an existing AI system and/or involve a third party and if so, which?
Yabble’s solutions include proprietary models and a number of different LLMs, with all solutions developed internally.
Yabble currently uses a combination of engines to create the best possible output for our customers based on their data. We use a variety of different models across the platform – always selecting the best one for the task at hand and these will also be changed as newer, more fit-for-purpose models become available. Speed, accuracy, cost, reliability, and formatting are all aspects taken into account when we apply and test the most appropriate model for the desired output.
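As an illustration only, and not a description of Yabble's internal selection logic, weighing several criteria when choosing a model for a task can be sketched as a weighted score. The model names, scores and weights below are hypothetical placeholders, and each score is assumed to be normalized to the range 0 to 1 with higher being better.

```python
# Criteria named in the text above; everything else here is a hypothetical sketch.
CRITERIA = ("speed", "accuracy", "cost", "reliability", "formatting")


def weighted_score(scores, weights):
    """Combine per-criterion scores (0..1, higher is better) into one number."""
    return sum(weights[c] * scores[c] for c in CRITERIA)


def pick_model(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    if not candidates:
        raise ValueError("at least one candidate model is required")
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical candidates: "model_a" is fast and cheap, "model_b" is more accurate.
CANDIDATES = {
    "model_a": {"speed": 0.9, "accuracy": 0.6, "cost": 0.8,
                "reliability": 0.7, "formatting": 0.7},
    "model_b": {"speed": 0.5, "accuracy": 0.9, "cost": 0.4,
                "reliability": 0.8, "formatting": 0.8},
}

# A task where accuracy matters most, and one where speed matters most.
ACCURACY_FIRST = {"speed": 0.1, "accuracy": 0.6, "cost": 0.1,
                  "reliability": 0.1, "formatting": 0.1}
SPEED_FIRST = {"speed": 0.6, "accuracy": 0.1, "cost": 0.1,
               "reliability": 0.1, "formatting": 0.1}

if __name__ == "__main__":
    print(pick_model(CANDIDATES, ACCURACY_FIRST))
    print(pick_model(CANDIDATES, SPEED_FIRST))
```

Keeping the weights separate from the scores means the same candidate list can be re-ranked for each task, and a newly available model can be added without changing the selection code.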
6. How do the algorithms deployed deliver the desired results? Can you summarize the underlying data and the way in which it interacts with the model to train your AI service?
Yabble has many proprietary algorithms that are used differentially or in conjunction with one another to deliver the best possible response for each question it’s asked, or data set it is requested to process. In addition, Yabble’s AI has access to a range of models across different LLM providers and selects the appropriate model for the task at hand.
Our relevancy and recency algorithms interpret data that is relevant to the input, putting aside what is not.
C. Is the AI capability/service trustworthy, ethical and transparent?
7. What are the processes to verify and validate the output for accuracy, and are they documented? How do you measure and assess validity? Is there a process to identify and handle cases where the system yields unreliable, skewed or biased results? Do you use any specific techniques to fine-tune the output? How do you ensure that the results generated are ‘fit for purpose’?
The Yabble analysis tools – Count, Summarize and Gen – are all based on a deterministic framework, which means they focus analysis on the data that the customer uploads. They will not introduce any bias to the data. They will not, however, adjust for any bias that exists in the raw data itself, i.e. if your data is already biased on upload, that bias will remain in the analysis output.
Regarding our synthetic data Virtual Audiences models, we have worked very hard in the creation of our data model and processing to minimize bias in this dataset. To validate our Virtual Audiences, we use AI and machine learning techniques to test across three key areas: similarity of insight, where we employ established methods such as distance to closest record, cosine similarity, and topic distribution; quality of insight, which we assess using Stanford's ARES framework; and depth of insight, also measured through the ARES framework. These rigorous validation processes ensure the reliability and robustness of the insights generated.
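For readers unfamiliar with two of the similarity measures named above, the following is a minimal sketch of cosine similarity and distance to closest record. It assumes each response or record has already been converted into a numeric vector (for example, an embedding), and it uses Euclidean distance for the closest-record measure. The function names and sample vectors are illustrative and do not represent Yabble's validation code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def distance_to_closest_record(synthetic, real_records):
    """Smallest Euclidean distance from one synthetic record to any real record.

    A very small value suggests the synthetic record nearly copies a real one;
    a very large value suggests it is unlike anything in the real data.
    """
    if not real_records:
        raise ValueError("at least one real record is required")
    return min(math.dist(synthetic, record) for record in real_records)


if __name__ == "__main__":
    # Illustrative two-dimensional vectors only.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    print(distance_to_closest_record([0.0, 0.0], [[3.0, 4.0], [6.0, 8.0]]))
```

In practice the two measures answer different questions: cosine similarity compares the direction of two vectors regardless of magnitude, while distance to closest record checks how near a generated record sits to the real data it is compared against.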
We use a variety of LLMs, and contractually, these LLM providers cannot use any data via Yabble for training these models. Any data uploaded through Yabble is ‘walled’ and so it is not used for any training or available in any way to anyone other than Yabble. Any improvements to Yabble’s AI tools and outputs are developed by Yabble, for Yabble’s customers.
We utilize a multi-tenant architecture in our SaaS platform, which ensures data isolation and security. Our multi-tenant environment is designed with strict access controls and logical partitioning, safeguarding each client's data while maintaining the highest levels of performance and reliability.
8. What are the limitations of your AI models and how do you mitigate them?
Yabble’s AI technology will not currently detect or redact your imported data for personal information. We encourage and request any data that is uploaded into Yabble to be de-identified, thereby minimizing risks associated with handling personally identifiable information (PII) and maintaining data privacy.
We have automatic retry logic in place for when a process encounters an issue, for example if a model hits rate limits. This helps manage concurrency and reliability, not allowing processes to proceed until they have the required standard of information for informed insights. We closely manage pipelines to identify break points and receive timely alerts where attention or a fix is required.
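Retry logic of this general kind is commonly implemented with exponential backoff. The sketch below shows that common pattern under stated assumptions: `RateLimitError`, `call_with_retry` and the delay values are hypothetical names and settings chosen for illustration, not Yabble's pipeline code.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when a model provider rejects a call for rate reasons."""


def call_with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `operation`, retrying on RateLimitError with exponential backoff.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...) with a
    little random jitter so that parallel workers do not all retry at once.
    The final failure is re-raised so the caller can alert on it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # give up and surface the error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))


def make_flaky(failures):
    """Build a demo operation that fails `failures` times, then returns "ok"."""
    state = {"remaining": failures}

    def operation():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RateLimitError("rate limit hit")
        return "ok"

    return operation


if __name__ == "__main__":
    # Short delays so the demo finishes quickly.
    print(call_with_retry(make_flaky(2), base_delay=0.01))
```

Passing the `sleep` function as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.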
9. What considerations, if any, have you taken into account, to design your service with a duty of care to humans in mind?
You can access our Privacy Policy here.
As per our Acceptable Data Use Policy, Yabble strongly advises that no personally identifiable information is uploaded into the Yabble platform (and recommends anonymous IDs are used). There is no use case that requires personal data to be uploaded for processing for data analysis and insight generation. You can access our Acceptable Data policy here.
Customers are bound by a Master Subscription Agreement and Customer Responsibilities Policies.
We use a variety of LLMs, and contractually, these LLM providers cannot use any data via Yabble for training these models.
Our product design is user-first, and we employ standard product design and development practices including user testing, customer feedback and insights which inform new feature development and continuous improvement to our products.
D. How do you provide Human Oversight of your AI system?
10. Transparency: How do you ensure that it is clear when AI technologies are being used in any part of the service?
Clear messaging and imagery are used throughout the platform anytime AI is involved in the process. This applies to all tools (Gen, Summarize, Count, Virtual Audiences).
Yabble has an AI Academy that can be accessed via our website to help upskill anyone using or considering using the Yabble tools.
11. Do you have ethical principles explicitly defined for your AI-driven solution, and how in practice does that help to determine the AI’s behaviour? How do you ensure that human defined ethical principles are the governing force behind AI-driven solutions?
We forbid the generation of personas under 16 years old and adhere to the standard ethical frameworks put in place by our LLM partners.
12. Responsible Innovation: How does your AI solution integrate human oversight to ensure ethical compliance?
For each of our tools, the output is checked to be within acceptable parameters – an error is thrown if the result is outside of these parameters.
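A check of this kind can be pictured as a validation step that runs on every output before it is returned. The sketch below is illustrative only: the minimum persona age of 16 comes from the answer to question 11, while the `PersonaOutput` structure, the length limit and the error name are hypothetical assumptions rather than Yabble's actual parameters.

```python
from dataclasses import dataclass

MIN_PERSONA_AGE = 16        # stated in this document: no personas under 16
MAX_SUMMARY_CHARS = 2000    # hypothetical limit, for illustration only


class OutputOutOfBoundsError(ValueError):
    """Hypothetical error raised when an output falls outside acceptable parameters."""


@dataclass
class PersonaOutput:
    """Hypothetical shape of a generated persona response."""
    age: int
    summary: str


def check_output(output):
    """Return the output unchanged if it passes every check, otherwise raise."""
    if output.age < MIN_PERSONA_AGE:
        raise OutputOutOfBoundsError(
            f"persona age {output.age} is below the minimum of {MIN_PERSONA_AGE}"
        )
    if not output.summary.strip():
        raise OutputOutOfBoundsError("summary is empty")
    if len(output.summary) > MAX_SUMMARY_CHARS:
        raise OutputOutOfBoundsError("summary exceeds the maximum length")
    return output


if __name__ == "__main__":
    checked = check_output(PersonaOutput(age=34, summary="Prefers shopping online."))
    print(checked.age)
```

Raising an error rather than silently correcting the output means a failed check is visible to the calling process, which can then retry or alert a person.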
E. What are the Data Governance protocols?
13. Data quality: How do you assess if the training data used for AI models is accurate, complete, and relevant to the research objectives in the interests of reliable results and as required by some data privacy laws?
Our training and testing processes are proprietary, but we utilize our highly experienced research team and proprietary data, combined with innovative practices and our own research projects, to improve our models and AI products.
14. Data lineage: Do you document the origin and processing of training or input data, and are these sources made available?
For Virtual Audiences – Yes! Where external data is used as input, these sources are made available with each output.
For our analysis tools, all data is owned and uploaded by the user, and is not used for any training of models outside of their Yabble account.
15. Please provide the link to your privacy notice (sometimes referred to as a privacy policy). If your company uses different privacy notices for different products or services, please provide an example relevant to the products or services covered in your response to this question.
You can access our Privacy Policy here.
You can access our Acceptable Data policy here.
16. What steps do you take to comply with data protection laws and implement measures to protect the privacy of research participants? Have you evaluated any risks to the individual as required by privacy legislation and ensured you have obtained consent for data processing where necessary or have another legal basis?
Yabble takes data privacy and security seriously and adheres to the SOC 2 framework as well as being compliant with both GDPR and CCPA regulations.
Any user of the platform is required to read Acceptable Data Use policies prior to importing or generating any data, and is explicitly asked to confirm their data files do not contain legal, medical, political or personal data and that they have full permission to use it.
Refer to our Privacy Policy for more information.
17. What steps do you follow to ensure AI systems are resilient to adversarial attacks, noise and other potential disruptions? Which information security frameworks and standards do you use?
We adhere to the SOC 2 framework, ensuring our privacy and security practices meet and exceed the high standards established by the American Institute of Certified Public Accountants (AICPA). This approach demonstrates our commitment to maintaining a strong and consistent security posture. Penetration testing is conducted regularly, as is live vulnerability scanning and alerting as part of our Vulnerability Management policies and procedures. Yabble utilizes a multi-tenant architecture in our SaaS platform, which ensures data isolation and security. Our multi-tenant environment is designed with strict access controls and logical partitioning, safeguarding each client's data while maintaining the highest levels of performance and reliability.
We employ cutting-edge encryption standards for both data at rest and data in transit to ensure maximum security for your information. Our data retention and deletion policies comply with industry standards.
We use a variety of LLMs, and contractually, these LLM providers cannot use any data via Yabble for training these models. Any data uploaded through Yabble is ‘walled’ and so it is not used for any training or available in any way to anyone other than Yabble. Any improvements to Yabble’s AI tools and outputs are developed by Yabble, for Yabble’s customers.
18. Data ownership: Do you clearly define and communicate the ownership of data, including intellectual property rights and usage permissions?
All data uploaded by or generated by a user of the Yabble platform is owned by the company that user represents. This is outlined in the Master Subscription Agreement that customers agree to when they commence a subscription with Yabble.
19. Data sovereignty: Do you restrict what can be done with the data?
Yes. Each tool in the Yabble suite inherently restricts input and output by design.
20. Ownership: Are you clear about who owns the output?
As above, all data uploaded by or generated by a user of the Yabble platform is owned by the company that user represents.
Conclusion
Yabble's responses to ESOMAR's "20 Questions to Help Buyers of AI-Based Services for Market Research and Insights" underscore our unwavering commitment to transparency and the pursuit of excellence in the AI-driven market research landscape. We not only adhere to the highest standards set forth by industry leaders but also ensure that our AI analytics and synthetic data solutions empower our customers with insights that are both revolutionary and responsibly curated. Book a demo today to explore how our leading synthetic data and AI data analysis solutions can transform your market research efforts.