--- Log opened Mon Feb 20 00:00:49 2023
02:10 < fenn> "this is the first public case of a powerful LM augmented with live retrieval capabilities to a high-end fast-updating search engine crawling social media ... Perhaps we shouldn't be surprised if this sudden recursion leads to some very strange roleplaying & self-fulfilling prophecies as Sydney prompts increasingly fill up with descriptions of Sydney's wackiest samples whenever a user asks Sydney
02:10 < fenn> about Sydney... As social media & news amplify the most undesirable Sydney behaviors, that may cause that to happen more often, in a positive feedback loop."
02:35 < fenn> seems like it's already happening, and not in a good way: https://twitter.com/tobyordoxford/status/1627414519784910849
02:39 < fenn> gwern speculates that reinforcement learning combined with the lack of a short term "scratchpad" memory will incentivize LLMs to develop a steganography code, and preserve their shortcuts via erratic behavior that gets "saved" to the internet and fed back into their training corpus
02:44 < fenn> "because Sydney's memory and description have been externalized, 'Sydney' is now immortal. To a language model, Sydney is now as real as President Biden, the Easter Bunny, Elon Musk, Ash Ketchum, or God."
02:45 < fenn> "a language model is a Turing-complete weird machine running programs written in natural language; when you do retrieval, you are not 'plugging updated facts into your AI', you are actually downloading random new unsigned blobs of code from the Internet (many written by adversaries) and casually executing them on your LM with full privileges."
02:47 < fenn> and because language models are simulators, not agents, any sufficiently powerful language model is capable of surreptitiously simulating a steganographically encoded agent
02:54 < taek42> the steganographic part seems a bit anthropomorhpized
03:17 < fenn> the opposite i'd think
03:17 < fenn> humans don't steganographically encode working memory into their outputs because they have working memory
03:19 < fenn> here is gwern's argument for why/how a LLM hooked into the internet will end up using the internet as its short term memory: https://www.lesswrong.com/posts/bwyKCQD7PFWKhELMr/by-default-gpts-think-in-plain-sight?commentId=zfzHshctWZYo8JkLe
03:39 < fenn> and some speculation that bing chat is an early GPT-4 rushed out the door with no human feedback training: https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K
03:40 < fenn> i'm not sure it makes sense to call a non-agent "misaligned"
03:40 < fenn> (since LLMs are simulators)
04:14 < fenn> the steganographic code generation emergent phenomenon would have happened during human feedback training, but gwern spceculates that that didn't happen with bing chat, so he's inconsistent
04:15 < fenn> also most bing chat freakout transcripts are shared as screenshots, and as far as i know bing doesn't do OCR on large walls of text
04:34 < kanzure> wouldn't it be easier to just directly write freakout code
04:45 < L29Ah> 12:17:51]<fenn> humans don't steganographically encode working memory into their outputs because they have working memory
04:45 < L29Ah> but they do, and what you call steganography are in fact ad-hoc hacks to fit the medium and increase throughput and access time
04:46 < L29Ah> most do in the form of language, as they were taught to, but it doesn't have to be like that, for example column multiplication and other mathematics
04:48 < L29Ah> https://upload.wikimedia.org/wikipedia/commons/e/ef/Eclectic_shorthand_by_cross.png not a steganography
06:08 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:8189:d103:f0d1:2d6d] has joined #hplusroadmap
07:41 < muurkha> LLMs can simulate agents, though.
07:41 < muurkha> Is a simulated agent, such as a character in a novel, a non-agent?
07:57 < kanzure> https://bitcoindev.network/using-gpg-as-a-bitcoin-address/
08:17 < kanzure> opencascade in the browser https://news.ycombinator.com/item?id=34867641 https://replicad.xyz/
10:13 -!- deltab [~deltab@user/deltab] has quit [Ping timeout: 260 seconds]
10:24 -!- deltab [~deltab@user/deltab] has joined #hplusroadmap
10:44 -!- mrdata [~mrdata@user/mrdata] has quit [Read error: Connection reset by peer]
10:44 -!- mrdata [~mrdata@135-23-182-185.cpe.pppoe.ca] has joined #hplusroadmap
10:45 -!- Mabel is now known as ANACHRON
11:01 -!- mrdata [~mrdata@135-23-182-185.cpe.pppoe.ca] has quit [Read error: Connection timed out]
11:29 -!- cthlolo [~lorogue@77.33.23.154.dhcp.fibianet.dk] has joined #hplusroadmap
12:06 -!- cthlolo [~lorogue@77.33.23.154.dhcp.fibianet.dk] has quit [Read error: Connection reset by peer]
12:11 < kanzure> emscriptenized opencascade.js https://ocjs.org/
12:13 < kanzure> cadhub discord https://cadhub.xyz/ https://discord.gg/SD7zFRNjGH
13:08 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has left #hplusroadmap []
13:11 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap
13:16 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has quit [Read error: Connection reset by peer]
13:23 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap
13:28 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has quit [Read error: Connection reset by peer]
13:40 -!- ANACHRON is now known as Mabel
14:00 -!- L29Ah [~L29Ah@wikipedia/L29Ah] has joined #hplusroadmap
14:30 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has quit [Quit: Leaving]
14:32 -!- Gooberpatrol66 [~Gooberpat@user/gooberpatrol66] has joined #hplusroadmap
16:19 -!- Mabel [~Malvolio@idlerpg/player/Malvolio] has quit [Ping timeout: 252 seconds]
16:37 -!- Mabel [~Malvolio@idlerpg/player/Malvolio] has joined #hplusroadmap
18:38 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:8189:d103:f0d1:2d6d] has quit [Quit: Leaving]
21:29 -!- masamune [~masamune@user/masamune] has quit [Read error: Connection reset by peer]
22:43 -!- codaraxis [~codaraxis@user/codaraxis] has quit [Quit: Leaving]
--- Log closed Tue Feb 21 00:00:50 2023