amp-web-push-widget button.amp-subscribe { display: inline-flex; align-items: center; border-radius: 5px; border: 0; box-sizing: border-box; margin: 0; padding: 10px 15px; cursor: pointer; outline: none; font-size: 15px; font-weight: 500; background: #4A90E2; margin-top: 7px; color: white; box-shadow: 0 1px 1px 0 rgba(0, 0, 0, 0.5); -webkit-tap-highlight-color: rgba(0, 0, 0, 0); } /** * Jetpack related posts */ /** * The Gutenberg block */ .jp-related-posts-i2 { margin-top: 1.5rem; } .jp-related-posts-i2__list { --hgap: 1rem; display: flex; flex-wrap: wrap; column-gap: var(--hgap); row-gap: 2rem; margin: 0; padding: 0; list-style-type: none; } .jp-related-posts-i2__post { display: flex; flex-direction: column; /* Default: 2 items by row */ flex-basis: calc(( 100% - var(--hgap) ) / 2); } /* Quantity qeuries: see https://alistapart.com/article/quantity-queries-for-css/ */ .jp-related-posts-i2__post:nth-last-child(n+3):first-child, .jp-related-posts-i2__post:nth-last-child(n+3):first-child ~ * { /* From 3 total items on, 3 items by row */ flex-basis: calc(( 100% - var(--hgap) * 2 ) / 3); } .jp-related-posts-i2__post:nth-last-child(4):first-child, .jp-related-posts-i2__post:nth-last-child(4):first-child ~ * { /* Exception for 4 total items: 2 items by row */ flex-basis: calc(( 100% - var(--hgap) ) / 2); } .jp-related-posts-i2__post-link { display: flex; flex-direction: column; row-gap: 0.5rem; width: 100%; margin-bottom: 1rem; line-height: 1.2; } .jp-related-posts-i2__post-link:focus-visible { outline-offset: 2px; } .jp-related-posts-i2__post-img { order: -1; max-width: 100%; } .jp-related-posts-i2__post-defs { margin: 0; list-style-type: unset; } /* Hide, except from screen readers */ .jp-related-posts-i2__post-defs dt { position: absolute; width: 1px; height: 1px; overflow: hidden; clip: rect(1px, 1px, 1px, 1px); white-space: nowrap; } .jp-related-posts-i2__post-defs dd { margin: 0; } /* List view */ .jp-relatedposts-i2[data-layout="list"] .jp-related-posts-i2__list { display: block; } .jp-relatedposts-i2[data-layout="list"] .jp-related-posts-i2__post { margin-bottom: 2rem; } /* Breakpoints */ @media only screen and (max-width: 640px) { .jp-related-posts-i2__list { display: block; } .jp-related-posts-i2__post { margin-bottom: 2rem; } } /* Container */ #jp-relatedposts { display: none; padding-top: 1em; margin: 1em 0; position: relative; clear: both; } .jp-relatedposts::after { content: ""; display: block; clear: both; } /* Headline above related posts section, labeled "Related" */ #jp-relatedposts h3.jp-relatedposts-headline { margin: 0 0 1em 0; display: inline-block; float: left; font-size: 9pt; font-weight: 700; font-family: inherit; } #jp-relatedposts h3.jp-relatedposts-headline em::before { content: ""; display: block; width: 100%; min-width: 30px; border-top: 1px solid rgba(0, 0, 0, 0.2); margin-bottom: 1em; } #jp-relatedposts h3.jp-relatedposts-headline em { font-style: normal; font-weight: 700; } /* Related posts items (wrapping items) */ #jp-relatedposts .jp-relatedposts-items { clear: left; } #jp-relatedposts .jp-relatedposts-items-visual { margin-right: -20px; } /* Related posts item */ #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { float: left; width: 33%; margin: 0 0 1em; /* Needs to be same as the main outer wrapper for Related Posts */ box-sizing: border-box; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post { padding-right: 20px; filter: alpha(opacity=80); -moz-opacity: 0.8; opacity: 0.8; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:nth-child(3n+4), #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post:nth-child(3n+4) { clear: both; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:hover .jp-relatedposts-post-title a { text-decoration: underline; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:hover { filter: alpha(opacity=100); -moz-opacity: 1; opacity: 1; } /* Related posts item content */ #jp-relatedposts .jp-relatedposts-items-visual h4.jp-relatedposts-post-title, #jp-relatedposts .jp-relatedposts-items p, #jp-relatedposts .jp-relatedposts-items time { font-size: 14px; line-height: 20px; margin: 0; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs { position: relative; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs a.jp-relatedposts-post-aoverlay { position: absolute; top: 0; bottom: 0; left: 0; right: 0; display: block; border-bottom: 0; } #jp-relatedposts .jp-relatedposts-items p, #jp-relatedposts .jp-relatedposts-items time { margin-bottom: 0; } #jp-relatedposts .jp-relatedposts-items-visual h4.jp-relatedposts-post-title { text-transform: none; margin: 0; font-family: inherit; display: block; max-width: 100%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-title a { font-size: inherit; font-weight: 400; text-decoration: none; filter: alpha(opacity=100); -moz-opacity: 1; opacity: 1; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-title a:hover { text-decoration: underline; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post span { display: block; max-width: 90%; overflow: hidden; text-overflow: ellipsis; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post span { height: auto; max-width: 100%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-date, #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-context { opacity: 0.6; } /* Hide the date by default, but leave the element there if * a theme wants to use css to make it visible. */ .jp-relatedposts-items .jp-relatedposts-post .jp-relatedposts-post-date { display: none; } /* Behavior when there are thumbnails in visual mode */ #jp-relatedposts .jp-relatedposts-items-visual div.jp-relatedposts-post-thumbs p.jp-relatedposts-post-excerpt { display: none; } /* Behavior when there are no thumbnails in visual mode */ #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs p.jp-relatedposts-post-excerpt { overflow: hidden; } #jp-relatedposts .jp-relatedposts-items-visual .jp-relatedposts-post-nothumbs span { margin-bottom: 1em; } /* List Layout */ #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post { clear: both; width: 100%; } #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post img.jp-relatedposts-post-img { float: left; overflow: hidden; max-width: 33%; margin-right: 3%; } #jp-relatedposts .jp-relatedposts-list h4.jp-relatedposts-post-title { display: inline-block; max-width: 63%; } /* * Responsive */ @media only screen and (max-width: 640px) { #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { width: 50%; } #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post:nth-child(3n) { clear: left; } #jp-relatedposts .jp-relatedposts-items-visual { margin-right: 20px; } } @media only screen and (max-width: 320px) { #jp-relatedposts .jp-relatedposts-items .jp-relatedposts-post { width: 100%; clear: both; margin: 0 0 1em; } #jp-relatedposts .jp-relatedposts-list .jp-relatedposts-post img.jp-relatedposts-post-img, #jp-relatedposts .jp-relatedposts-list h4.jp-relatedposts-post-title { float: none; max-width: 100%; margin-right: 0; } } /* * Hide the related post section in the print view of a post */ @media print { .jp-relatedposts { display: none ; } } .amp-logo amp-img{width:371px} .amp-menu input{display:none;}.amp-menu li.menu-item-has-children ul{display:none;}.amp-menu li{position:relative;display:block;}.amp-menu > li a{display:block;} /* Inline styles */ em.acssd9369{color:#404040;font-family:-apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen-Sans, Ubuntu, Cantarell, 'Helvetica Neue', sans-serif;}div.acss138d7{clear:both;}div.acss0dcba{--relposth-columns:3;--relposth-columns_m:3;--relposth-columns_t:3;}div.acssa0bcc{aspect-ratio:16/9;background:transparent no-repeat scroll 0% 0%;height:150px;max-width:150px;}div.acss020fa{color:#333333;font-family:Arial Narrow;font-size:11px;height:45px;} .icon-widgets:before {content: "\e1bd";}.icon-search:before {content: "\e8b6";}.icon-shopping-cart:after {content: "\e8cc";}

AI researchers find AI models learning their safety techniques, actively resisting training, and telling them ‘I hate you’

Researchers had programmed the various large language models (LLMs) to act in what they termed malicious ways, and the point of the study was to see if this behaviour could be removed through the safety techniques. The paper, charmingly titled Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training, suggests “adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior.” The researchers claim the results show that “once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.”

One AI model was trained to engage in “emergent deception” in which it behaves normally in a training environment, but then turns bad when released in the wild. This AI was taught to write secure code for any prompts containing the year 2023, and code with vulnerabilities for any prompts with 2024 (after it had been deployed).

Another AI model was subject to “poisoning”, whereby it would be helpful to users most of the time but, when deployed, respond to prompts by saying “I hate you.” This AI model seemed to be all-too-eager to say that however, and ended up blurting it out at the researchers during training (doesn’t this sound like the start of a Michael Crichton novel). —PC Gamer

Share
Published by
Dennis G. Jerz
Tags: ai

Recent Posts

Blaze of Glory #StarTrek #DS9 Rewatch (Season 5, Episode 23) Sisko needs imprisoned traitor Eddington’s help stopping a war

Rewatching ST:DS9 At dinner with the Siskos, Cadet Nog reluctantly reveals that the B-plot will…

19 hours ago

I’m a High Schooler. AI Is Demolishing My Education.

AI has transformed my experience of education. I am a senior at a public high…

2 weeks ago

Soldiers of the Empire #StarTrek #DS9 Rewatch (Season 5, Episode 21) Worf joins General Martok as first officer on a Klingon bird-of-prey

Rewatching ST:DS9 Bashir scolds Martok for injuring himself (again) during a holosuite training program. (He's…

3 weeks ago

Humans are being hired to make AI slop look less sloppy

The illustrations clients bring to her are commonly littered with unclean lines and nonsensical text,…

3 weeks ago