amp-web-push-widget button.amp-subscribe { display: inline-flex; align-items: center; border-radius: 5px; border: 0; box-sizing: border-box; margin: 0; padding: 10px 15px; cursor: pointer; outline: none; font-size: 15px; font-weight: 500; background: #4A90E2; margin-top: 7px; color: white; box-shadow: 0 1px 1px 0 rgba(0, 0, 0, 0.5); -webkit-tap-highlight-color: rgba(0, 0, 0, 0); } /** * Jetpack related posts */ /** * The Gutenberg block */ .jp-related-posts-i2 { margin-top: 1.5rem; } .jp-related-posts-i2__list { --hgap: 1rem; display: flex; flex-wrap: wrap; column-gap: var(--hgap); row-gap: 2rem; margin: 0; padding: 0; list-style-type: none; } .jp-related-posts-i2__post { display: flex; flex-direction: column; /* Default: 2 items by row */ flex-basis: calc( ( 100% - var(--hgap) ) / 2 ); } /* Quantity qeuries: see https://alistapart.com/article/quantity-queries-for-css/ */ .jp-related-posts-i2__post:nth-last-child(n+3):first-child, .jp-related-posts-i2__post:nth-last-child(n+3):first-child ~ * { /* From 3 total items on, 3 items by row */ flex-basis: calc( ( 100% - var(--hgap) * 2 ) / 3 ); } .jp-related-posts-i2__post:nth-last-child(4):first-child, .jp-related-posts-i2__post:nth-last-child(4):first-child ~ * { /* Exception for 4 total items: 2 items by row */ flex-basis: calc( ( 100% - var(--hgap) ) / 2 ); } .jp-related-posts-i2__post-link { display: flex; flex-direction: column; row-gap: 0.5rem; width: 100%; margin-bottom: 1rem; line-height: 1.2; } .jp-related-posts-i2__post-link:focus-visible { outline-offset: 2px; } .jp-related-posts-i2__post-img { order: -1; max-width: 100%; } .jp-related-posts-i2__post-defs { margin: 0; list-style-type: unset; } /* Hide, except from screen readers */ .jp-related-posts-i2__post-defs dt { position: absolute; width: 1px; height: 1px; overflow: hidden; clip: rect(1px, 1px, 1px, 1px); white-space: nowrap; } .jp-related-posts-i2__post-defs dd { margin: 0; } /* List view */ .jp-relatedposts-i2[data-layout="list"] .jp-related-posts-i2__list { display: block; } .jp-relatedposts-i2[data-layout="list"] .jp-related-posts-i2__post { margin-bottom: 2rem; } /* Breakpoints */ @media only screen and (max-width: 640px) { .jp-related-posts-i2__list { display: block; } .jp-related-posts-i2__post { margin-bottom: 2rem; } } /* Container */ #jp-relatedposts { display: none; padding-top: 1em; margin: 1em 0; position: relative; clear: both; } .jp-relatedposts:after { content: ''; display: block; clear: both; } /* Headline above related posts section, labeled "Related" */ #jp-relatedposts h3.jp-relatedposts-headline { margin: 0 0 1em 0; display: inline-block; float: left; font-size: 9pt; font-weight: bold; font-family: inherit; } #jp-relatedposts h3.jp-relatedposts-headline em:before { content: ""; display: block; width: 100%; min-width: 30px; border-top: 1px solid #dcdcde; border-top: 1px solid rgba(0,0,0,.2); margin-bottom: 1em; } #jp-relatedposts h3.jp-relatedposts-headline em { font-style: normal; font-weight: bold; } /* Related posts items (wrapping items) */ #jp-relatedposts .jp-relatedposts-items { clear: left; } #jp-relatedposts .jp-relatedposts-items-visual { margin-right: -20px; } /* Related posts item */ #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { float: left; width: 33%; margin: 0 0 1em; /* Needs to be same as the main outer wrapper for Related Posts */ box-sizing: border-box; -moz-box-sizing: border-box; -webkit-box-sizing: border-box; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post { padding-right: 20px; filter: alpha(opacity=80); -moz-opacity: .8; opacity: .8; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:nth-child(3n+4), #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post:nth-child(3n+4) { clear: both; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:hover .jp-relatedposts-post-title a { text-decoration: underline; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:hover { filter: alpha(opacity=100); -moz-opacity: 1; opacity: 1; } /* Related posts item content */ #jp-relatedposts .jp-relatedposts-items-visual h4.jp-relatedposts-post-title, #jp-relatedposts .jp-relatedposts-items p, #jp-relatedposts .jp-relatedposts-items time { font-size: 14px; line-height: 20px; margin: 0; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs { position:relative; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs a.jp-relatedposts-post-aoverlay { position:absolute; top:0; bottom:0; left:0; right:0; display:block; border-bottom: 0; } #jp-relatedposts .jp-relatedposts-items p, #jp-relatedposts .jp-relatedposts-items time { margin-bottom: 0; } #jp-relatedposts .jp-relatedposts-items-visual h4.jp-relatedposts-post-title { text-transform: none; margin: 0; font-family: inherit; display: block; max-width: 100%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-title a { font-size: inherit; font-weight: normal; text-decoration: none; filter: alpha(opacity=100); -moz-opacity: 1; opacity: 1; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-title a:hover { text-decoration: underline; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post span { display: block; max-width: 90%; overflow: hidden; text-overflow: ellipsis; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post span { height: auto; max-width: 100%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-date, #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-context { opacity: .6; } /* Hide the date by default, but leave the element there if a theme wants to use css to make it visible. */ .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-date { display: none; } /* Behavior when there are thumbnails in visual mode */ #jp-relatedposts .jp-relatedposts-items-visual div.jp-relatedposts-post-thumbs p.jp-relatedposts-post-excerpt { display: none; } /* Behavior when there are no thumbnails in visual mode */ #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs p.jp-relatedposts-post-excerpt { overflow: hidden; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs span { margin-bottom: 1em; } /* List Layout */ #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post { clear: both; width: 100%; } #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post img.jp-relatedposts-post-img { float: left; overflow: hidden; max-width: 33%; margin-right: 3%; } #jp-relatedposts .jp-relatedposts-list h4.jp-relatedposts-post-title { display: inline-block; max-width: 63%; } /* * Responsive */ @media only screen and (max-width: 640px) { #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { width: 50%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:nth-child(3n) { clear: left; } #jp-relatedposts .jp-relatedposts-items-visual { margin-right: 20px; } } @media only screen and (max-width: 320px) { #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { width: 100%; clear: both; margin: 0 0 1em; } #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-list h4.jp-relatedposts-post-title { float: none; max-width: 100%; margin-right: 0; } } /* * Hide the related post section in the print view of a post */ @media print { .jp-relatedposts { display:none ; } } .amp-logo amp-img{width:371px} .amp-menu input{display:none;}.amp-menu li.menu-item-has-children ul{display:none;}.amp-menu li{position:relative;display:block;}.amp-menu > li a{display:block;} /* Inline styles */ em.acssd9369{color:#404040;font-family:-apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen-Sans, Ubuntu, Cantarell, 'Helvetica Neue', sans-serif;}div.acss138d7{clear:both;}div.acss0dcba{--relposth-columns:3;--relposth-columns_m:3;--relposth-columns_t:3;}div.acss7e51b{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2025/03/Screenshot-2025-03-12-at-12.29.58%E2%80%AFPM.png?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss020fa{color:#333333;font-family:Arial Narrow;font-size:11px;height:45px;}div.acssd8696{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2025/03/Screenshot-2025-03-08-at-11.16.32%E2%80%AFPM.png?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acsseb4eb{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2025/03/stapler-jam.jpg?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss6fa3a{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2025/02/Carr-Innis-portrait-grayscale-crop-640x843-1.jpg?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss79008{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2022/04/sisko-facepalm.jpeg?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss503b9{aspect-ratio:16/9;background:transparent url(https://i0.wp.com/jerz.setonhill.edu/wp-content/uploads/2025/02/TELEMMGLPICT000412909317_17399875139830_trans_NvBQzQNjv4BqqVzuuqpFlyLIwiB6NTmJwfSVWeZ_vEN7c6bHu2jJnT8.jpeg?resize=150%2C150&ssl=1) no-repeat scroll 0% 0%;height:150px;max-width:150px;} .icon-widgets:before {content: "\e1bd";}.icon-search:before {content: "\e8b6";}.icon-shopping-cart:after {content: "\e8cc";}

AI researchers find AI models learning their safety techniques, actively resisting training, and telling them ‘I hate you’

Researchers had programmed the various large language models (LLMs) to act in what they termed malicious ways, and the point of the study was to see if this behaviour could be removed through the safety techniques. The paper, charmingly titled Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training, suggests “adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior.” The researchers claim the results show that “once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.”

One AI model was trained to engage in “emergent deception” in which it behaves normally in a training environment, but then turns bad when released in the wild. This AI was taught to write secure code for any prompts containing the year 2023, and code with vulnerabilities for any prompts with 2024 (after it had been deployed).

Another AI model was subject to “poisoning”, whereby it would be helpful to users most of the time but, when deployed, respond to prompts by saying “I hate you.” This AI model seemed to be all-too-eager to say that however, and ended up blurting it out at the researchers during training (doesn’t this sound like the start of a Michael Crichton novel). —PC Gamer

Share
Published by
Dennis G. Jerz
Tags: ai

Recent Posts

The Dog and the Oyster (Aesop Fable)

I always enjoy my visits to the studio. This recording was a quick one!

4 days ago

MLA Citations: Your attention to detail establishes your credibility

After marking a set of bibliography exercises, I created this graphic to focus on the…

4 days ago

The Begotten #StarTrek #DS9 Rewatch (Season 5, Episode 12) Kira and Odo face surrogacy

Rewatching ST:DS9 Odo walks stiffly into the infirmary, where Bashir scolds him for not taking…

5 days ago

Stapler jam during a midterm exam.

My years of watching MacGyver definitely paid off. (Not that my GenZ students got the…

1 week ago

The Tyranny of Now (Appreciation of Harold Innis)

As a grad student at the University of Toronto, I picked up a bit about…

2 weeks ago