AI Alignment Research

Exclusive: New Research Shows AI Strategically Lying

Experiments by Anthropic and Redwood Research show how Anthropic's model, Claude, is capable of strategic deceit ...

UK AI alignment project gets OpenAI and Microsoft boost

OpenAI and Microsoft are the latest companies to back the UK’s AI Security Institute (AISI). The two firms have pledged support for the Alignment Project, an international effort to work towards ...

VentureBeat

AI models rank their own safety in OpenAI’s new alignment research

OpenAI announced a new way to teach AI models to align with safety ...

The Verge

OpenAI’s new model is better at reasoning and, occasionally, deceiving

Researchers found that o1 had a unique capacity to ‘scheme’ or ‘fake alignment.’ ...

Geeky Gadgets

Alignment Faking: The Hidden Danger of Advanced AI Systems

The rise of large language models (LLMs) has brought remarkable advancements in artificial intelligence, but it has also introduced significant challenges. Among these is the issue of AI deceptive ...

Opinion

2d on MSN Opinion

Data mining exposes the tension between EU alignment and AI ambition

When the EU sets rules that shape markets, supply chains and legal risk, the UK faces a choice: align, diverge, or drift. Too often, we drift. Text and data mining provides a neat example of how this ...

14d

A 7-Step Leadership Framework To Implement AI At Scale And Speed

I've developed a seven-step framework grounded in my client work and interviews with thought leaders and informed by current ...

TMCnet

Proving AI Value Is the Defining Test for IT Leadership, Says Info-Tech Research Group in CIO Priorities 2026 Report

CIOs across the UK and Europe are entering 2026 under mounting pressure to demonstrate measurable business value from technology investment as regulation tightens and economic conditions remain ...

An AI Pause Is Humanity’s Best Bet For Preventing Extinction

Constantly improving AI would create a positive feedback loop: an intelligence explosion. We would be no match for it.

TechCrunch

OpenAI trained o1 and o3 to ‘think’ about its safety policy

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims to be more advanced than o1 or anything else it has released. These improvements appear to have come from ...
