Wikipedia declares CNET unreliable because of AI-generated content
posted Sunday Mar 3, 2024 by Scott Ertz
Generative AI has caused a lot of trouble over the past year. Articles have been published with completely false information, headlines that are distasteful at best, and images that blatantly violate intellectual property. And, all of this while likely infringing on the copyright of content producers whose content is being used to train these systems without their knowledge or approval. But one publication - CNET - seems to keep popping up as a perpetrator of these issues, and Wikipedia has taken notice.
What is generative AI?
Generative AI is a subset of artificial intelligence that focuses on creating new content. It leverages machine learning techniques to generate data that resembles the input data it was trained on. This can include a wide range of outputs, such as text, images, music, and even voice. The goal of generative AI is not just to produce copies of what it has seen before, but to use its understanding of the data to create new, original content that is similar in structure and theme to the training data.
One of the most common techniques used in generative AI is Generative Adversarial Networks (GANs). GANs consist of two parts: a generator, which creates new data, and a discriminator, which evaluates the generated data for authenticity. The generator and discriminator are trained together, with the generator trying to produce data that the discriminator can't distinguish from real data, and the discriminator getting better and better at telling the difference. Over time, this results in the generator producing highly realistic data. Other techniques used in generative AI include Variational Autoencoders (VAEs) and Transformer models, which are particularly effective for generating text.
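The adversarial loop described above can be sketched in a few lines. The toy below is purely illustrative: it uses single-parameter linear models for the generator and discriminator (real GANs use deep networks) and 1-D "real" data drawn from a normal distribution around an arbitrary mean of 4. Toy setups like this can oscillate rather than converge cleanly, but the two-player training structure is the same.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w_g, b_g = 0.1, 0.0   # generator: fake sample = w_g * noise + b_g
w_d, b_d = 0.1, 0.0   # discriminator: P(sample is real) = sigmoid(w_d * x + b_d)
lr = 0.02
REAL_MEAN = 4.0       # "real" data comes from N(4, 1)

for _ in range(5000):
    # Discriminator step: push d(real) toward 1 and d(fake) toward 0
    x_real = random.gauss(REAL_MEAN, 1.0)
    z = random.gauss(0.0, 1.0)
    x_fake = w_g * z + b_g
    p_real = sigmoid(w_d * x_real + b_d)
    p_fake = sigmoid(w_d * x_fake + b_d)
    w_d -= lr * ((p_real - 1.0) * x_real + p_fake * x_fake)
    b_d -= lr * ((p_real - 1.0) + p_fake)

    # Generator step: push d(fake) toward 1 by moving the fake sample
    z = random.gauss(0.0, 1.0)
    x_fake = w_g * z + b_g
    p_fake = sigmoid(w_d * x_fake + b_d)
    grad_x = (p_fake - 1.0) * w_d   # gradient of generator loss w.r.t. its output
    w_g -= lr * grad_x * z
    b_g -= lr * grad_x
```

The key design point is that neither model ever sees a "correct answer": the generator only learns from how the discriminator scores its output, which is why a generator can produce fluent, plausible-looking content with no notion of factual truth.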
The problems with generative AI
Generative AI, while powerful, does have its share of factual issues. One of the main challenges is that it can sometimes generate inaccurate or misleading information. This is because the AI is trained on large datasets and it generates content based on patterns it identifies in this data. If the training data contains inaccuracies or biases, the AI can replicate and even amplify these issues in the content it generates. Furthermore, generative AI cannot verify the factual accuracy of the information it produces, which can lead to the propagation of misinformation if not properly managed.
Another issue with generative AI is the difficulty in controlling its outputs. While you can guide the AI with prompts, it's often hard to predict exactly what it will generate. This unpredictability can be problematic in situations where precision and reliability are paramount. For instance, if generative AI is used to create educational content, there's a risk that it might generate incorrect or misleading information, leading to potential misunderstandings or misconceptions. Therefore, it's crucial to have mechanisms in place to review and verify the content generated by AI to ensure its factual accuracy.
CNET's factual issues
CNET has been a repeat offender on these factual issues. Thanks to the work of Futurism, problems with CNET's facts have been revealed again and again. An early example was a simple, glaring mistake in a financial article.
To calculate compound interest, use the following formula:
Initial balance (1+ interest rate / number of compounding periods) ^ number of compoundings per period x number of periods
For example, if you deposit $10,000 into a savings account that earns 3% interest compounding annually, you'll earn $10,300 at the end of the first year.
The investor in the hypothetical situation would earn $300, not $10,300 - that number would be the total of principal plus interest. This was just one of many instances where CNET's AI-generated articles wrote something that sounded plausible but was factually false.
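The corrected arithmetic is easy to check. Below is a short Python sketch of the standard compound interest formula, A = P(1 + r/n)^(nt); the function name is just for illustration:

```python
def compound_balance(principal, rate, periods_per_year, years):
    """Final balance after compound interest: A = P * (1 + r/n) ** (n * t)."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)

# $10,000 at 3%, compounded annually, for one year
balance = compound_balance(10_000, 0.03, 1, 1)
interest = balance - 10_000

print(round(balance, 2))   # 10300.0 -- principal plus interest
print(round(interest, 2))  # 300.0   -- the interest actually earned
```

The distinction CNET's article blurred is exactly the one in the last two lines: $10,300 is the ending balance, while $300 is what the investor earns.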
Wikipedia downgrades CNET
The company's reliance on AI, and its seeming disinterest in fact-checking the content that is generated, has led Wikipedia to change its relationship with the company. Wikipedia maintains a ranking of which sources are reliable on certain topics and which are not. As part of its most recent analysis, the organization has decided to no longer recommend CNET's content as a reliable source. David Gerard, an editor for Wikipedia, wrote in a discussion thread,
CNET, usually regarded as an ordinary tech RS [reliable source], has started experimentally running AI-generated articles, which are riddled with errors. So far the experiment is not going down well, as it shouldn't. I haven't found any yet, but any of these articles that make it into a Wikipedia article need to be removed.
This move is a major one in the ongoing conversation around whether AI-generated content can and should be treated as a reliable source of information. Wikipedia thinks not.