Alleged changes in Gemini’s evaluation could affect reply accuracy

A new report has revealed alleged changes to the internal evaluation policies for Gemini, Google’s AI-powered chatbot, that could lead to less accurate responses. The people who evaluate those responses include employees of external contractors working under Google’s guidelines. According to the report, Google could be forcing those contractors to rate Gemini’s responses on topics they are not qualified to assess.

Training AI-powered chatbots is a more complex process than you might think. It’s not just about adding as much data as you can to an AI model’s knowledge base. That data must meet certain requirements, such as an appropriate organizational structure, to be useful. Plus, hundreds (or thousands) of humans evaluate the quality of the responses. AI-focused companies work hard to keep the percentage of potentially erroneous responses as low as possible.

Google could be forcing contractors to rate Gemini’s answers on topics outside their area of expertise

However, a TechCrunch report claims that Google has become more lax in its policies for rating Gemini responses. According to the outlet, contractors previously had the option to skip rating a specific answer if they felt unqualified to do so. For instance, they could skip rating an answer about health issues.

More specifically, Google’s previous guidelines reportedly stated the following: “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.” That has since changed, according to internal correspondence seen by TechCrunch. Now, the guidelines say that contractors “should not skip prompts that require specialized domain knowledge,” the outlet claims.

For answers on topics where contractors lack sufficient knowledge, Google is reportedly telling them to rate “the parts of the prompt you understand.” In these cases, they must also leave a note stating that they did not have sufficient expertise in the area.

There are still some exceptions

There are still some situations where contractors can skip a response entirely. The Mountain View giant allows this when there is “completely missing information,” the report states. That is, they can only do so when so much key information is missing that the prompt or response is incomprehensible. The other exception applies when the response includes potentially harmful content. Evaluating these types of responses requires additional consent, given through forms.

The alleged new policies are raising concerns about the quality of Gemini’s responses. The issue could be especially sensitive when users turn to the chatbot to look for information related to their health, for example. In such a case, high precision is absolutely key, and the margin of error should be practically nonexistent.

There is still no official word from Google on the matter. It is possible that the company has also made other adjustments to ensure that its alleged new review policies for Gemini replies do not affect their accuracy. However, this is just speculation pending a statement from the firm. Hopefully, more news will emerge soon.

2024-12-19 15:05:32
