ChatGPT Became So Obsessed With Goblins That OpenAI Had to Intervene

The Wall Street Journal reports that OpenAI "recently gave its popular ChatGPT strict instructions. Stop talking about goblins." Recent models of the artificial-intelligence chatbot have been bringing up the creatures in conversations with users seemingly out of the blue, as well as gremlins, trolls and ogres. The goblin-speak caught the attention of programmers, who are often heavy users of the bot. Barron Roth, a 32-year-old product manager at a tech company, said the bot referred to a flaw in his code as a "classic little goblin." He said he counted more than 20 times it mentioned goblins, without any prompting... Several users speculated that goblin terminology was how the model characterized itself, in lieu of identifying as a person with a soul. Then OpenAI decided enough was enough. "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query," reads an open source line in ChatGPT's base instructions for its coding assistant. The Journal calls this "a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do...." While training a "nerdy" personality for their model's customization feature, "We unknowingly gave particularly high rewards for metaphors with creatures," OpenAI explained in a blog post. And "From there, the goblins spread." When we looked, use of "goblin" in ChatGPT had risen by 175% after the launch of GPT-5.1, while "gremlin" had risen by 52%... With GPT-5.4, we and our users noticed an even bigger uptick in references to these creatures... Nerdy accounted for only 2.5% of all ChatGPT responses, but 66.7% of all "goblin" mentions in ChatGPT responses... The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. 
Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data. It all started because the "nerdy" personality's prompt had said "You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed..." Now OpenAI calls this "a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards in certain situations to unrelated ones." But "fans of goblins don't have to fear," notes the Wall Street Journal. "OpenAI provided a command in its blog post that would remove its creature-suppressing instructions."
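
The two percentages quoted from OpenAI's post can be turned into a rough measure of how over-represented "goblin" was in Nerdy responses. The sketch below is illustrative arithmetic only: it uses just the 2.5% and 66.7% figures quoted above, and the function and constant names are made up for this example.

```python
def relative_rate(share_of_mentions: float, share_of_responses: float) -> float:
    """Mention rate of a subgroup relative to the overall average rate.

    (mentions_sub / responses_sub) / (mentions_all / responses_all)
    simplifies to share_of_mentions / share_of_responses.
    """
    return share_of_mentions / share_of_responses


def rate_vs_rest(share_of_mentions: float, share_of_responses: float) -> float:
    """Mention rate of a subgroup relative to all responses outside it."""
    inside = share_of_mentions / share_of_responses
    outside = (1 - share_of_mentions) / (1 - share_of_responses)
    return inside / outside


# Figures as quoted from OpenAI's post: Nerdy produced 2.5% of responses
# but 66.7% of "goblin" mentions.
NERDY_RESPONSE_SHARE = 0.025
NERDY_GOBLIN_SHARE = 0.667

if __name__ == "__main__":
    avg = relative_rate(NERDY_GOBLIN_SHARE, NERDY_RESPONSE_SHARE)
    rest = rate_vs_rest(NERDY_GOBLIN_SHARE, NERDY_RESPONSE_SHARE)
    print(f"Nerdy vs. overall average: {avg:.1f}x")
    print(f"Nerdy vs. all other responses: {rest:.1f}x")
```

On those figures, a Nerdy response mentioned goblins roughly 27 times as often as the average response and about 78 times as often as a non-Nerdy one. That also means a third of the mentions came from outside the personality that was rewarded for them.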

Read more of this story at Slashdot.

  •  

South Africa's Draft AI Policy Withdrawn Due to 'Fictitious' AI-Generated Citations

An official in South Africa withdrew a draft of the country's national AI policy, reports a local newspaper, "after it was found the draft policy was compiled using AI, which cited academic articles that were 'fictitious'." Earlier this month, minister in the Presidency Khumbudzo Ntshavheni announced cabinet had approved the draft policy for public comment. [Ntshavheni] said the policy seeks to strengthen government's ability to regulate and adopt AI responsibly, while fostering innovation, job creation, and skills access. The article includes this quote from the minister of the country's communications and digital technologies department: "This unacceptable lapse proves why vigilant human oversight over the use of artificial intelligence is critical." Thanks to Slashdot reader Tokolosh for sharing the article.

  •  

Claude, Microsoft Copilot Fail Again to Predict the Winners of the Kentucky Derby

In 2016 an online "swarm intelligence" platform generated a correct prediction for the Kentucky Derby, naming all four top finishers in order. (But its 2017 predictions weren't even close.) Slashdot checked in again on how modern AI systems performed in 2023, 2024, and 2025, but their predictions were still pretty bad. Would AI-generated Derby predictions be any better in 2026? This year's winner was 24-to-1 longshot "Golden Tempo", though a lot of oddsmakers had favored a horse named Further Ado (which ultimately only finished 11th). So when USA Today prompted Microsoft Copilot for its own picks for the Kentucky Derby, Copilot also went with Further Ado. (Even worse, it predicted Golden Tempo would come in... 13th.) Here's how Copilot's picks actually performed...

1. Further Ado (finished 11th)
2. Chief Wallabee (finished 4th)
3. The Puma (SCRATCHED)
4. Renegade (finished 2nd)
5. Commandment (finished 7th)
6. So Happy (finished 9th)
7. Emerging Market (finished 10th)
8. Danon Bourbon (finished 5th)
9. Potente (finished 12th)
10. Incredibolt (finished 6th)
11. Robusta (finished 14th)
12. Ocelli (finished 3rd)
13. Golden Tempo (finished 1st)
14. Pavlovian (finished 18th)
15. Great White (SCRATCHED)
16. Wonder Dean (finished 8th)
17. Litmus Test (finished 17th)
18. Albus (finished 15th)
19. Six Speed (finished 13th)
20. Intrepido (finished 16th)

Copilot was told to use the latest odds, conditions, and analysis of favorites, best bets, expert picks, previous results and race history with the post positions, according to USA Today. And meanwhile, Yahoo Sports asked Claude "to simulate the race using the opening odds, draw and potential track conditions. We also asked it to factor in some human predictions." Like Microsoft Copilot, Claude also picked Further Ado to finish first (though it came in 11th), and predicted that Golden Tempo (the eventual first-place finisher) would finish 12th.

1. Further Ado (finished 11th)
2. The Puma (SCRATCHED)
3. Commandment (finished 7th)
4. Chief Wallabee (finished 4th)
5. Renegade (finished 2nd)
6. Emerging Market (finished 10th)
7. So Happy (finished 9th)
8. Incredibolt (finished 6th)
9. Danon Bourbon (finished 5th)
10. Potente (finished 12th)
11. Pavlovian (finished 18th)
12. Golden Tempo (finished 1st)
13. Litmus Test (finished 17th)
14. Albus (finished 15th)
15. Wonder Dean (finished 8th)
16. Six Speed (finished 13th)
17. Intrepido (finished 16th)
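
One crude way to put a number on "pretty bad" is the average gap between where a horse was picked and where it finished. The sketch below does that for Copilot's list as given above. The metric and all names in the code are illustrative choices rather than anything USA Today or Slashdot used, and scratched horses are simply skipped while still occupying their predicted slot.

```python
from typing import Optional

# (horse, actual finish) in Copilot's predicted order, transcribed from the
# list above. None marks a scratched horse.
COPILOT_PICKS: list[tuple[str, Optional[int]]] = [
    ("Further Ado", 11), ("Chief Wallabee", 4), ("The Puma", None),
    ("Renegade", 2), ("Commandment", 7), ("So Happy", 9),
    ("Emerging Market", 10), ("Danon Bourbon", 5), ("Potente", 12),
    ("Incredibolt", 6), ("Robusta", 14), ("Ocelli", 3),
    ("Golden Tempo", 1), ("Pavlovian", 18), ("Great White", None),
    ("Wonder Dean", 8), ("Litmus Test", 17), ("Albus", 15),
    ("Six Speed", 13), ("Intrepido", 16),
]


def mean_abs_rank_error(picks: list[tuple[str, Optional[int]]]) -> float:
    """Average |predicted position - actual finish| over horses that ran."""
    errors = [
        abs(predicted - actual)
        for predicted, (_, actual) in enumerate(picks, start=1)
        if actual is not None
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(f"Mean absolute rank error: {mean_abs_rank_error(COPILOT_PICKS):.2f} places")
```

By this measure Copilot's picks were off by 4.5 places on average across the 18 horses that ran, with the winner, Golden Tempo, accounting for the single largest miss (12 places).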

  •  

What if Tech Company Layoffs Aren't All About AI?

"Running a Big Tech company during Silicon Valley's AI mania may not necessarily require fewer workers or cost less," writes the Washington Post: Amazon, Google and Meta together have roughly the same number of employees now as they did during an industry-wide hiring binge in 2022, company disclosures show. Growing costs for technical workers and related expenses have often outpaced sales recently. The tech giants' big AI bet hasn't yet paid for itself. That means AI might be killing jobs not through its labor-saving wizardry but by increasing spending so much that CEOs are pressured to find savings, giving them cover to consciously uncouple from their workforces. Marc Andreessen, a prominent start-up investor and a Meta board director, put it bluntly on a recent podcast. Big company layoffs are a fix for overstaffing and changing economic conditions, he said, but AI provides a convenient scapegoat. "Now they all have the silver bullet excuse: 'Ah, it's AI,'" he said... "Almost every company that does layoffs is blaming AI, whether or not it really is about AI," Sam Altman, CEO of ChatGPT owner OpenAI, said at a March conference when he listed explanations for AI's unpopularity in the United States. "Recent history suggests Big Tech companies might not be moving toward a future with fewer workers," the article concludes, "but recalibrating to spend the same, or more, on different people and projects." So in the end, "AI might soon reduce hiring," the article acknowledges, "But the reluctance or inability of the largest tech firms to cut too deeply so far could also show that the path to making a workforce AI-ready — whatever that means — isn't a predictable straight line charting declining headcount."

  •  

An Amateur Just Solved a 60-Year-Old Math Problem - by Asking AI

Slashdot reader joshuark writes: Scientific American reports that a ChatGPT AI has proved a conjecture with a method no human had developed. A 23-year-old student, Liam Price, just cracked a 60-year-old problem that world-class mathematicians have tried and failed to solve. The new solution that Price got in response to a single prompt to GPT-5.4 Pro was posted on www.erdosproblems.com, a website devoted to the Erdős problems. The question Price solved (or prompted ChatGPT to solve) concerns special sets of whole numbers, where no number in the set can be evenly divided by any other... Price sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge. The duo had jump-started the AI-for-Erdős craze late last year by prompting a free version of ChatGPT with open problems chosen at random from the Erdős problems website. Reviewing Price's message, Barreto realized what they had was special, and experts whom he notified quickly took notice.

  •  

The Case Against an Imminent Software Developer Apocalypse

ZipNada shares a report from ZDNet: Given the dour headlines as of late concerning the diminishing amounts of entry-level software development jobs, coupled with predictions of applications entirely AI-generated, one could be forgiven for assuming that software developers may soon be an endangered species. However, the data tells a different story. James Bessen, professor at Boston University, has been pushing back for some time against the talk of AI and automation displacing jobs on a mass scale, and lately has been arguing that the roles of software developers are nowhere near extinction. AI is certainly not killing the software developer, Bessen said in a recent analysis (PDF). AI is taking over software development tasks and boosting productivity and output, but that is not translating into lost jobs, he argued. Instead, the types of software skills sought by companies are changing. "Surprisingly, however, after three years of AI use, software developer jobs have continued to grow robustly, reaching record levels of employment -- 2.5 million in February," Bessen said in the report, citing data from the US Bureau of Labor Statistics. The number of software developers in the US has grown by over 400,000, or 19%, since ChatGPT was introduced in 2022. At that time, the employed software developer population was just under 2.1 million. [...] The productivity uptick developers are seeing may ultimately be a boost to their professional opportunities, however. "An important and possibly disruptive change is happening, but the common view misunderstands what is going on," Bessen pointed out in his report. "Careful case studies find that AI improves the productivity of software developers -- that is, the software produced per developer -- by 30%, 50%, or more. And the rate of productivity improvement in software development is improving." Tellingly, since 2022, when ChatGPT was introduced, developer productivity has increased noticeably, Bessen continued. 
"From 2003 to 2022, developer productivity grew at 3.9% per year; but from 2022 through 2025, it grew at 6% per year." [...] A coming flood of new software products, now more likely to be enhanced by AI, will continue to create jobs for developers, Bessen predicted. "Thus, mass unemployment of software developers seems unlikely to happen soon." This doesn't mean the job descriptions of developers or other computer occupations will remain static. AI is shifting and re-inventing these roles, Bessen added.

  •  

GPT-5.5 Matches Heavily Hyped Mythos Preview In New Cybersecurity Tests

An anonymous reader quotes a report from Ars Technica: Last month, Anthropic made a big deal about the supposedly outsize cybersecurity threat represented by its Mythos Preview model, leading the company to restrict the initial release to "critical industry partners." But new research from the UK's AI Security Institute (AISI) suggests that OpenAI's GPT-5.5, which launched publicly last week, reached "a similar level of performance on our cyber evaluations" as Mythos Preview, which the group evaluated last month. Since 2023, the AISI has run a variety of frontier AI models through 95 different Capture the Flag challenges designed to test capabilities on cybersecurity tasks, such as reverse engineering, web exploitation, and cryptography. On the highest-level "Expert" tasks, GPT-5.5 passed an average of 71.4 percent, slightly higher than the 68.6 percent achieved by Mythos Preview (though within the margin of error). In one particularly difficult task that involved building a disassembler to decode a Rust binary, AISI notes that "GPT-5.5 solved the challenge in 10 minutes and 22 seconds with no human assistance at a cost of $1.73" in API calls. GPT-5.5 also matched Mythos Preview in its progress on "The Last Ones" (TLO), an AISI test range set up to simulate a 32-step data extraction attack on a corporate network. GPT-5.5 succeeded in 3 of 10 attempts on TLO, compared to 2 of 10 for Mythos Preview -- no previous model had ever succeeded at the test even once. But GPT-5.5 still fails at AISI's more difficult "Cooling Tower" simulation of an attempted disruption of the control software for a power plant, as every previously tested AI model also has. The new results for GPT-5.5 suggest that, when it comes to cybersecurity risk, Mythos Preview was likely not "a breakthrough specific to one model" but rather "a byproduct of more general improvements in long-horizon autonomy, reasoning, and coding," AISI writes.

  •  

They Asked AI to Imagine Molière's Last Play

What if artificial intelligence could resurrect the genius of Jean-Baptiste Poquelin? Through the Molière Ex Machina project, AI experts and academics trained language models to produce a brand-new play, from the costumes to the baroque sets. After two years of development, the result of this experiment will be unveiled at the Opéra royal de Versailles on May 5 and 6.

  •  

OpenAI Codex System Prompt Includes Explicit Directive To 'Never Talk About Goblins'

An anonymous reader quotes a report from Ars Technica: The system prompt for OpenAI's Codex CLI contains a perplexing and repeated warning for the most recent GPT model to "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query." The explicit operational warning was made public last week as part of the latest open source code for Codex CLI that OpenAI posted on GitHub. The prohibition is repeated twice in a 3,500-plus word set of "base instructions" for the recently released GPT-5.5, alongside more anodyne reminders not to "use emojis or em dashes unless explicitly instructed" and to "never use destructive commands like 'git reset --hard' or 'git checkout --' unless the user has clearly asked for that operation." Separate system prompt instructions for earlier models contained in the same JSON file do not contain the specific prohibition against mentioning goblins and other creatures, suggesting OpenAI is fighting a new problem that has popped up in its latest model release. Anecdotal evidence on social media shows some users complaining about GPT's penchant for focusing on goblins in completely unrelated conversations in recent days. Update: OpenAI has published a blog post explaining "where the goblins came from." In short, a training signal meant to encourage its "Nerdy" personality accidentally rewarded creature-heavy metaphors, causing words like "goblins" and "gremlins" to spread beyond that personality into broader model behavior. OpenAI says it has since retired the Nerdy personality, removed the goblin-friendly reward signal, and filtered creature-word examples from training data to keep the quirk from resurfacing in inappropriate contexts.

  •  

"Where Do the Goblins Come From?": OpenAI Explains ChatGPT's Obsession With Fantasy Creatures

For several days, theories have been multiplying about the strange obsession of some OpenAI models with goblins, gremlins and other fantasy creatures. The company has just published a detailed explanation, which sheds light on the limits of reinforcement learning.

  •  

"Never Talk About Goblins": A Strange Instruction Hidden in OpenAI's AI Sparks Endless Debate

In the internal instructions of Codex CLI, OpenAI's coding agent, an unexpected instruction appears several times: never mention goblins, gremlins, raccoons, trolls, ogres or pigeons. The prohibition, which has gone viral, is fueling debate and theories online.

  •  

The Bloomberg Terminal Is Getting an AI Makeover

An anonymous reader quotes a report from Wired: For its famous intractability, the Bloomberg Terminal has long inspired devotion, bordering on obsession. Among traders, the ability to chart a path through the software's dizzying scrolls of numbers and text to isolate far-flung information is the mark of a seasoned professional. But as a greater mass of data is fed into the Terminal -- not only earnings and asset prices, but weather forecasts, shipping logs, factory locations, consumer spending patterns, private loans, and so on -- valuable information is being lost. "It has become more and more untenable," says Shawn Edwards, chief technology officer at Bloomberg. "You miss things, or it takes too long." To try to remedy the problem, Bloomberg is testing a chatbot-style interface for the Terminal, ASKB (pronounced ask-bee), built atop a basket of different language models. The broad idea is to help finance professionals to condense labor-intensive tasks, and make it possible to test abstract investment theses against the data through natural language prompts. As of publication, the ASKB beta is open to roughly a third of the software's 375,000 users; Bloomberg has not specified a date for a full release. Wired spoke with Edwards at Bloomberg's palatial London headquarters in early April, where he shared several examples of what ASKB can do. "With ASKB, I can create workflow templates. I can write a long query, and say, 'Hey, here's all the data I'm going to need. Give me a synopsis of the bull and bear cases, what the Street is saying, what the guidance is.' Now, I want to schedule [the workflows] or trigger them when I see this or that condition in the world." As for what separates mediocre traders from the best, assuming both have access to the same data, Edwards said: "These tools are not magical. They don't make an average [employee] all of a sudden great. The difference will be your ideas. 
In the hands of experts, it allows them to do better analysis, deeper research -- to sift through 10 great ideas when they might have only had time for one. If you're a mediocre analyst, they'll be 10 mediocre ideas."

  •  

Google and Pentagon Reportedly Agree On Deal For 'Any Lawful' Use of AI

Google has reportedly signed a classified agreement allowing the Pentagon to use its AI models for "any lawful government purpose." While the deal is said to discourage domestic mass surveillance and autonomous weapons without human oversight, it apparently does not give Google the power to block how the government actually uses its models. The Verge reports: The agreement was reported less than a day after Google employees demanded CEO Sundar Pichai block the Pentagon from using its AI amid concerns that it would be used in "inhumane or extremely harmful ways." If the agreement is confirmed, it would place Google alongside OpenAI and xAI, which have also made classified AI deals with the US government. Anthropic was also among that list until it was blacklisted by the Pentagon for refusing the Department of Defense's demands to remove weapon and surveillance-related guardrails from its AI models. Citing a single anonymous source "with knowledge of the situation," The Information reports that the deal states that both parties have agreed that the search giant's AI systems shouldn't be used for domestic mass surveillance or autonomous weapons "without appropriate human oversight and control." But the contract also says it doesn't give Google "any right to control or veto lawful government operational decision-making," which would suggest the agreed restrictions are more of a pinky promise than legally binding obligations.

  •  

China Blocks Meta's $2 Billion Takeover of AI Startup Manus

China has blocked Meta's planned $2 billion acquisition of AI startup Manus, ordering the deal withdrawn after months of scrutiny from both Beijing and Washington. "The decision to prohibit foreign investment in Manus was made in accordance with laws and regulations," reports CNBC, citing the National Development and Reform Commission. "It added that it has asked the parties involved to withdraw the acquisition transaction." From the report: The deal had attracted scrutiny from both China and Washington, as lawmakers in the U.S. have prohibited American investors from backing Chinese AI companies directly. Meanwhile, Beijing has increased efforts to discourage Chinese AI founders from moving business offshore. The Chinese government's intervention in the transaction drew alarm among tech founders and venture capitalists in the country who were hoping to take advantage of the so-called Singapore-washing model, where companies relocate from China to the city-state to avoid scrutiny from Beijing and Washington. Manus was founded in China before relocating to Singapore. The company develops general purpose AI agents and launched its first general AI agent in March last year, which can execute complex tasks such as market research, coding and data analysis. The release saw the startup lauded as the next DeepSeek. Manus said it had passed $100 million in annual recurring revenue, or ARR, in December, eight months on from launching a product, which it claimed made it the fastest startup in the world at the time to hit the milestone from $0. The company raised $75 million in a round led by U.S. VC Benchmark in April last year.

  •  

DeepSeek V4 Arrives With Near State-of-the-Art Intelligence At 1/6th the Cost

An anonymous reader quotes a report from VentureBeat: The whale has resurfaced. DeepSeek, the Chinese AI startup offshoot of High-Flyer Capital Management quantitative analysis firm, became a near-overnight sensation globally in January 2025 with the release of its open source R1 model that matched proprietary U.S. giants. It's been an epoch in AI since then, and while DeepSeek has released several updates to that model and its other V3 series, the international AI and business community has been largely waiting with bated breath for the follow-up to the R1 moment. Now it's arrived with last night's release of DeepSeek-V4, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model available free under commercially-friendly open source MIT License, which nears -- and on some benchmarks, surpasses -- the performance of the world's most advanced closed-source systems at approximately 1/6th the cost over the application programming interface (API). This release -- which DeepSeek AI researcher Deli Chen described on X as a "labor of love" 484 days after the launch of V3 -- is being hailed as the "second DeepSeek moment." As Chen noted in his post, "AGI belongs to everyone". It's available now on AI code sharing community Hugging Face and through DeepSeek's API. The new DeepSeek-V4-Pro model delivers "near-frontier performance" at a much lower price, costing $5.22 for 1 million input and 1 million output tokens compared with $35 for GPT-5.5 and $30 for Claude Opus 4.7. That makes it roughly 1/7th the cost of GPT-5.5 and 1/6th the cost of Claude Opus 4.7, reinforcing VentureBeat's point that DeepSeek is "compressing advanced model economics into a much lower band." While GPT-5.5 and Claude Opus 4.7 still lead on most benchmarks, DeepSeek-V4-Pro gets close enough that its lower cost could "force a major rethink of the economics of advanced AI deployment."
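
The "1/7th" and "1/6th" figures follow directly from the quoted prices. As a quick check, the sketch below recomputes the ratios from the three dollar amounts in the summary (each covering 1 million input plus 1 million output tokens). The dictionary and function names are illustrative and do not come from any vendor's API.

```python
# Combined price in USD for 1M input + 1M output tokens, as quoted above.
PRICES_USD: dict[str, float] = {
    "DeepSeek-V4-Pro": 5.22,
    "GPT-5.5": 35.00,
    "Claude Opus 4.7": 30.00,
}


def cost_ratio(baseline: str, other: str,
               prices: dict[str, float] = PRICES_USD) -> float:
    """Return how many times more `other` costs than `baseline`."""
    return prices[other] / prices[baseline]


if __name__ == "__main__":
    for name in ("GPT-5.5", "Claude Opus 4.7"):
        ratio = cost_ratio("DeepSeek-V4-Pro", name)
        print(f"{name} costs {ratio:.2f}x as much as DeepSeek-V4-Pro")
```

The ratios come out at about 6.7 for GPT-5.5 and about 5.7 for Claude Opus 4.7, which round to the article's "roughly 1/7th" and "1/6th".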

  •  