--- Log opened Mon Feb 20 00:00:49 2023 02:10 < fenn> "this is the first public case of a powerful LM augmented with live retrieval capabilities to a high-end fast-updating search engine crawling social media ... Perhaps we shouldn't be surprised if this sudden recursion leads to some very strange roleplaying & self-fulfilling prophecies as Sydney prompts increasingly fill up with descriptions of Sydney's wackiest samples whenever a user asks Sydney 02:10 < fenn> about Sydney... As social media & news amplify the most undesirable Sydney behaviors, that may cause that to happen more often, in a positive feedback loop." 02:35 < fenn> seems like it's already happening, and not in a good way: https://twitter.com/tobyordoxford/status/1627414519784910849 02:39 < fenn> gwern speculates that reinforcement learning combined with the lack of a short term "scratchpad" memory will incentivize LLMs to develop a steganography code, and preserve their shortcuts via erratic behavior that gets "saved" to the internet and fed back into their training corpus 02:44 < fenn> "because Sydney's memory and description have been externalized, 'Sydney' is now immortal. To a language model, Sydney is now as real as President Biden, the Easter Bunny, Elon Musk, Ash Ketchum, or God." 02:45 < fenn> "a language model is a Turing-complete weird machine running programs written in natural language; when you do retrieval, you are not 'plugging updated facts into your AI', you are actually downloading random new unsigned blobs of code from the Internet (many written by adversaries) and casually executing them on your LM with full privileges." 02:47 < fenn> and because language models are simulators, not agents, any sufficiently powerful language model is capable of surreptitiously simulating a steganographically encoded agent 02:54 < taek42> the steganographic part seems a bit anthropomorhpized 03:17 < fenn> the opposite i'd think 03:17 < fenn> humans don't steganographically encode working memory into their outputs because they have working memory 03:19 < fenn> here is gwern's argument for why/how a LLM hooked into the internet will end up using the internet as its short term memory: https://www.lesswrong.com/posts/bwyKCQD7PFWKhELMr/by-default-gpts-think-in-plain-sight?commentId=zfzHshctWZYo8JkLe 03:39 < fenn> and some speculation that bing chat is an early GPT-4 rushed out the door with no human feedback training: https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K 03:40 < fenn> i'm not sure it makes sense to call a non-agent "misaligned" 03:40 < fenn> (since LLMs are simulators) 04:14 < fenn> the steganographic code generation emergent phenomenon would have happened during human feedback training, but gwern spceculates that that didn't happen with bing chat, so he's inconsistent 04:15 < fenn> also most bing chat freakout transcripts are shared as screenshots, and as far as i know bing doesn't do OCR on large walls of text 04:34 < kanzure> wouldn't it be easier to just directly write freakout code 04:45 < L29Ah> 12:17:51] humans don't steganographically encode working memory into their outputs because they have working memory 04:45 < L29Ah> but they do, and what you call steganography are in fact ad-hoc hacks to fit the medium and increase throughput and access time 04:46 < L29Ah> most do in the form of language, as they were taught to, but it doesn't have to be like that, for example column multiplication and other mathematics 04:48 < L29Ah> https://upload.wikimedia.org/wikipedia/commons/e/ef/Eclectic_shorthand_by_cross.png not a steganography 06:08 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:8189:d103:f0d1:2d6d] has joined #hplusroadmap 07:41 < muurkha> LLMs can simulate agents, though. 07:41 < muurkha> Is a simulated agent, such as a character in a novel, a non-agent? 07:57 < kanzure> https://bitcoindev.network/using-gpg-as-a-bitcoin-address/ 08:17 < kanzure> opencascade in the browser https://news.ycombinator.com/item?id=34867641 https://replicad.xyz/ 10:13 -!- deltab [~deltab@user/deltab] has quit [Ping timeout: 260 seconds] 10:24 -!- deltab [~deltab@user/deltab] has joined #hplusroadmap 10:44 -!- mrdata [~mrdata@user/mrdata] has quit [Read error: Connection reset by peer] 10:44 -!- mrdata [~mrdata@135-23-182-185.cpe.pppoe.ca] has joined #hplusroadmap 10:45 -!- Mabel is now known as ANACHRON 11:01 -!- mrdata [~mrdata@135-23-182-185.cpe.pppoe.ca] has quit [Read error: Connection timed out] 11:29 -!- cthlolo [~lorogue@77.33.23.154.dhcp.fibianet.dk] has joined #hplusroadmap 12:06 -!- cthlolo [~lorogue@77.33.23.154.dhcp.fibianet.dk] has quit [Read error: Connection reset by peer] 12:11 < kanzure> emscriptenized opencascade.js https://ocjs.org/ 12:13 < kanzure> cadhub discord https://cadhub.xyz/ https://discord.gg/SD7zFRNjGH 13:08 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has left #hplusroadmap [] 13:11 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap 13:16 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has quit [Read error: Connection reset by peer] 13:23 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap 13:28 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has quit [Read error: Connection reset by peer] 13:40 -!- ANACHRON is now known as Mabel 14:00 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap 14:30 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has quit [Quit: Leaving] 14:32 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap 16:19 -!- Mabel [~Malvolio@idlerpg/player/Malvolio] has quit [Ping timeout: 252 seconds] 16:37 -!- Mabel [~Malvolio@idlerpg/player/Malvolio] has joined #hplusroadmap 18:38 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:8189:d103:f0d1:2d6d] has quit [Quit: Leaving] 21:29 -!- masamune [~masamune@user/masamune] has quit [Read error: Connection reset by peer] 22:43 -!- codaraxis [~codaraxis@user/codaraxis] has quit [Quit: Leaving] --- Log closed Tue Feb 21 00:00:50 2023