--- Log opened Mon Jul 24 00:00:08 2023 00:51 -!- superz_ [~superegg@user/superegg] has joined #hplusroadmap 00:51 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 00:54 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 244 seconds] 00:55 -!- superz_ is now known as superz 01:01 -!- darsie [~darsie@84-113-55-200.cable.dynamic.surfer.at] has joined #hplusroadmap 04:00 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 04:04 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 260 seconds] 04:15 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 04:15 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 04:58 -!- EmmyNoether [~EmmyNoeth@yoke.ch0wn.org] has quit [Ping timeout: 250 seconds] 05:04 -!- EmmyNoether [~EmmyNoeth@yoke.ch0wn.org] has joined #hplusroadmap 05:15 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:5404:e623:172d:4d43] has joined #hplusroadmap 05:58 < fenn> https://fennetic.net/sd/cloning_hallway.jpg https://fennetic.net/sd/cloning_egg_sacs.jpg https://fennetic.net/sd/cloning_showers.jpg https://fennetic.net/sd/genetically_modified_testes.jpg https://fennetic.net/sd/invertebrate_uplift.jpg https://fennetic.net/sd/tissue_printer.jpg https://fennetic.net/sd/tissue_printer2.jpg https://fennetic.net/sd/the_spice_must_flow.jpg 05:59 < fenn> hmph stupid discord 06:44 < docl> what I think is missing in LLMs is uncertainty tracking. 
An LLM cannot reserve the right to change its mind very well 06:45 < docl> humans are barely any better, much of the time 06:47 < nsh> i think what's missing in rocks is opposable thumbs 06:47 < nsh> rocks barely ever pick things up 06:47 < nsh> they don't even have beaks 06:47 < nsh> it's a compression algorithm docl 06:48 < nsh> also it's changing its mind continuously 06:48 < hprmbridge> alonzoc> Yeah there's also that it generates text token by token in order. The architecture is sufficient for a lot of stuff, but you artificially bound the computational depth that goes into generating each token. Ideally you'd have recurrent self attention where the networks self-delimit their computation. However training such a neural net is harder to do at scale and has other issues 06:48 < nsh> they call it reinforcement learning from human feedback 06:49 < nsh> or the exploitation of cheap labour to give it its more traditional name 06:49 < nsh> the scaling problems are just the same scaling problems that apply to parallel computers in general 06:50 < hprmbridge> alonzoc> A middle ground could be something like stable diffusion where you do what amounts to annealing or belief prop on the token stream 06:50 < nsh> it's all annealing 06:50 < nsh> you're reducing a temperature to something that you arbitrarily cut off as a ground state 06:50 < nsh> and output 06:51 < hprmbridge> alonzoc> Yes and no, at that point everything is annealing more or less.
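[editor's note] nsh's framing of sampling as annealing, lowering a temperature until the distribution collapses onto a "ground state," corresponds directly to the temperature parameter used when sampling from an LLM's next-token logits. A minimal sketch with made-up toy logits (not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into a next-token distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
hot = softmax_with_temperature(logits, 2.0)    # high T: near uniform, exploratory
cold = softmax_with_temperature(logits, 0.1)   # low T: mass collapses onto the argmax
```

At high temperature the three options stay comparably likely; as the temperature drops, nearly all probability mass lands on the single best token, which is the "arbitrary cut off as a ground state" nsh describes.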
06:51 < nsh> there are different ways of implementing this but they're just spins on the same physical intuition 06:51 < nsh> sure 06:51 < nsh> well 06:51 < nsh> we sort of decided everything was physics a few millennia ago 06:51 < nsh> and we're kind invested in that way of looking at things now 06:51 < nsh> *kinda 06:52 * nsh is just chatting shit :) 06:55 < hprmbridge> alonzoc> Sampling token by token with a neural net giving approximate marginals on the next token is "annealing" in that you're collapsing your joint distribution over time steps but there's little bidirectional talk between parts of the state. A net that rewrote an input token stream reducing total entropy of the marginal approximation of the joint would be much more "in spirit" of annealing 07:01 < nsh> indeed almost everything we know about physics suggests things don't happen in a manner that can be unwound into a linear succession of state transitions 07:02 < nsh> anything in which there is a universal proper time and an absolutely defined state at each point therein 07:02 < nsh> is going to be far from optimal 07:04 < nsh> classical computing in general is stupid, but it doesn't tend to make one popular to point this out 07:04 < hprmbridge> alonzoc> Sequential token sampling models work for most language but not for generating outputs that have constraints, for example if you ask ChatGPT to write a poem with some constraint like the number of syllables depending on some other property of the poem. It can check it but during sampling it makes a mistake and can't correct. Future tokens can't affect the generation of past tokens nicely. I think 07:04 < hprmbridge> alonzoc> DM is planning on using MCTS to try to help on that.
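[editor's note] alonzoc's point about sequential sampling and global constraints can be shown with a toy decoding problem: a purely left-to-right greedy decoder commits to early tokens and cannot revise them when a sequence-level constraint later becomes unsatisfiable, while a search over the joint sequence space (the role MCTS or beam search would play) can trade early tokens against later ones. Everything below is a made-up stand-in, not a real language model:

```python
from itertools import product

VOCAB = [3, 4]           # toy "tokens"
LENGTH, TARGET = 3, 10   # global constraint: three tokens summing to exactly 10

def greedy():
    """Left-to-right decoding, always taking the locally best (here: largest)
    token; once it overshoots there is no way to revise earlier choices."""
    seq = []
    for _ in range(LENGTH):
        seq.append(max(VOCAB))
    return seq

def search():
    """Exhaustive search over the joint space (stand-in for MCTS/beam search):
    early tokens can be chosen with the whole-sequence constraint in view."""
    for seq in product(VOCAB, repeat=LENGTH):
        if sum(seq) == TARGET:
            return list(seq)
    return None

greedy_out = greedy()   # [4, 4, 4] -> sum 12, violates the constraint
search_out = search()   # [3, 3, 4] -> sum 10, satisfies it
```

The syllable-count poem is the same situation at scale: checking the constraint is easy, but fixing it requires influence running from future tokens back to past ones.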
07:05 < nsh> apropos https://faculty.washington.edu/jcramer/TI/tiqm_1986.pdf 07:05 < hprmbridge> alonzoc> Like in principle sequential sampling can be perfect but when you have heuristic estimates and it's possible to embed SAT problems you end up with the problems of belief-prop-based SAT 07:06 < nsh> cf. https://www.quantamagazine.org/to-move-fast-quantum-maze-solvers-must-forget-the-past-20230720/ 07:38 < docl> some stuff is reversible, some isn't. shape rotation, for example, can be transformed backward and forward losslessly. add heat and other complex phenomena in and it can't be. but we develop technology by isolating the parts that can be. 07:40 < docl> and we have to characterize and track what we don't know, otherwise we don't know what we do know 07:40 < nsh> (you know nothing, jon snow) 07:42 < docl> nothing is the wrong amount to know 07:45 < docl> socrates got it wrong, was mistranslated, or was just trying to sound profound. you are only wise by knowing that you don't know some things, and the better your grasp on what precisely those are and how much you don't know about them the wiser you are. that's why markets win, they turn uncertainty into numbers. 07:55 < nsh> ah yes, humility, on reflection, it's kinda stupid 07:55 < nsh> this is demonstrative of a firm grasp of philosophy :) 08:12 -!- srk_ [~sorki@user/srk] has joined #hplusroadmap 08:15 -!- srk [~sorki@user/srk] has quit [Ping timeout: 246 seconds] 08:15 -!- srk_ is now known as srk 08:41 -!- srk_ [~sorki@user/srk] has joined #hplusroadmap 08:45 -!- srk [~sorki@user/srk] has quit [Ping timeout: 260 seconds] 08:45 -!- srk_ is now known as srk 09:31 < docl> the idea that you should falsify certainty is fully at odds with my remarks. I say do not falsify certainty or uncertainty. they are two sides of the same coin.
if you weight a coin, it introduces bias regardless of whether you select heads or tails 10:28 -!- lorenz [~lorogue@77.33.23.154] has joined #hplusroadmap 11:03 -!- ChanServ [ChanServ@services.libera.chat] has quit [shutting down] 11:05 -!- lorenz [~lorogue@77.33.23.154] has quit [Read error: Connection reset by peer] 11:14 < kanzure> "RNA demethylation increases the yield and biomass of rice and potato plants in field trials" https://www.nature.com/articles/s41587-021-00982-9 11:15 -!- ChanServ [ChanServ@services.libera.chat] has joined #hplusroadmap 11:15 -!- ServerMode/#hplusroadmap [+o ChanServ] by molybdenum.libera.chat 11:15 < kanzure> "Voyager: An open-ended embodied agent with large language models" https://voyager.minedojo.org/ 11:16 < hprmbridge> kanzure> https://cdn.discordapp.com/attachments/1064664282450628710/1133100403886665799/exploration_performance.png 11:17 -!- A_Dragon [A_D@libera/staff/dragon] has quit [Killed (Stx (Happy birthday! (I know its a bit late) She: Yes it does.))] 11:17 -!- A_Dragon [A_D@libera/staff/dragon] has joined #hplusroadmap 11:34 < docl> https://github.com/karpathy/llama2.c 11:42 < L29Ah> docl: is it of any use? 11:43 < docl> it can tell you a story with the model he provided. I'm still looking for a more impressive demo 12:07 < L29Ah> docl: why are you excited with it when there's llama.cpp? 12:08 < docl> it's a lot less lines of code, better chance I will grok 12:10 < docl> < 500 lines of c, no dependencies (just standard libs) 13:27 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 13:31 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 245 seconds] 13:43 -!- superkuh [~superkuh@user/superkuh] has quit [Ping timeout: 246 seconds] 13:59 -!- superkuh [~superkuh@user/superkuh] has joined #hplusroadmap 14:15 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 
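[editor's note] docl's interest in llama2.c is that the whole inference path fits in one small C file. Stripped of the transformer math, the control flow run.c implements is just an autoregressive loop: run the model forward, pick a next token from its logits, feed it back in. A toy Python sketch of that structure only, with a random lookup table standing in for the real forward pass:

```python
import random

VOCAB = 16
rng = random.Random(0)
# toy stand-in for the transformer forward pass in run.c: a fixed random
# logit table indexed by the previous token (i.e. just a Markov chain)
LOGITS = [[rng.gauss(0, 1) for _ in range(VOCAB)] for _ in range(VOCAB)]

def forward(token):
    """Return 'logits' over the next token given the previous one."""
    return LOGITS[token]

def generate(prompt_token, steps):
    """The autoregressive loop: forward, pick a token, append, repeat."""
    seq = [prompt_token]
    for _ in range(steps):
        logits = forward(seq[-1])
        seq.append(logits.index(max(logits)))  # greedy argmax; real decoders may sample
    return seq

out = generate(prompt_token=1, steps=8)
```

The appeal of the < 500-line version is exactly that nothing in this loop is hidden behind a framework.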
14:15 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 14:35 -!- test_ is now known as _flood 14:47 < hprmbridge> alonzoc> Yeah neural nets are really simple which is part of their charm tbh 14:47 < hprmbridge> alonzoc> For a long time I really didn't like them but they have their charms esp with softmax layer activations 15:40 -!- stipa_ [~stipa@user/stipa] has joined #hplusroadmap 15:41 -!- stipa [~stipa@user/stipa] has quit [Ping timeout: 258 seconds] 15:41 -!- stipa_ is now known as stipa 18:40 < fenn> optical beam steering https://lumotive.com/technology/ 18:47 < fenn> an LLM *can* do recurrent self attention already. all you have to do is give it some scratch pad space to think out loud. then it can read what it wrote and think about it again, write some more, think again, until it's satisfied. then re-write the final draft. unsurprisingly this is how many humans function in practice. 18:47 < fenn> time boundedness of computation is a good thing, because it lets you interact with the world in real time 18:49 < fenn> "it's a simple matter of prompting" 18:49 < hprmbridge> alonzoc> True, however it's pretty ad hoc and has its own issues. Direct architectural support would reduce the need for prompt engineering and further accuracy. 18:54 < hprmbridge> alonzoc> I agree on time boundedness being useful and important though no one wants their neural net to hang while it grinds forever on an NP-hard problem 18:56 < fenn> i still have to do a deep dive on how transformers work, but i think that future tokens can affect past tokens within a narrow window of say a dozen tokens 18:56 < hprmbridge> alonzoc> It depends on the specific architecture being used 18:57 < fenn> yes 18:57 < hprmbridge> alonzoc> There are variants yes 18:57 < fenn> this was the idea behind BERT?
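[editor's note] fenn's question about future tokens affecting past tokens comes down to the attention mask: a causal (GPT-style) mask zeroes out attention to later positions, while BERT's bidirectional mask lets every position attend to the whole sequence. A small sketch of the two masks with uniform dummy attention scores:

```python
import math

T = 5  # sequence length

def attention_weights(scores, mask):
    """Row-wise softmax over attention scores, excluding masked-out positions."""
    weights = []
    for i in range(T):
        allowed = [scores[i][j] if mask[i][j] else None for j in range(T)]
        m = max(s for s in allowed if s is not None)
        exps = [math.exp(s - m) if s is not None else 0.0 for s in allowed]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights

scores = [[0.0] * T for _ in range(T)]
causal = [[j <= i for j in range(T)] for i in range(T)]  # GPT-style: no peeking ahead
bidirectional = [[True] * T for _ in range(T)]           # BERT-style: full context

w_causal = attention_weights(scores, causal)
w_bert = attention_weights(scores, bidirectional)
# under the causal mask, position 0 places zero weight on every later position;
# under the bidirectional mask it attends to all of them
```

So in a causal decoder a past token's representation genuinely cannot see the future; BERT-style encoders remove that restriction, at the cost of no longer being directly usable for left-to-right generation.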
18:57 < hprmbridge> alonzoc> Transformer is only useful as a name in telling us it relies on a ton of self attention layers 18:57 < fenn> we are all special and unique snowflakes 18:58 < hprmbridge> alonzoc> Well it's in the name 19:00 < fenn> but BERT's attention doesn't span arbitrarily far into the future 19:01 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:5404:e623:172d:4d43] has quit [Quit: Leaving] 19:01 < hprmbridge> alonzoc> Ideally though a system would actually "exist in time" being aware of its own computation time with some form of asynchronous IO. Such a model would be harder to train than a normal turing complete model, and such a setup is making even a basic language generation model more agentive as it'd have to decide when to stop a computation and give in etc 19:01 < hprmbridge> alonzoc> And there is understandable hesitancy around making LLMs and other large general purpose models more agentive 19:02 < fenn> huh. google has been using BERT for every query since 2020 19:02 < fenn> when did google really go to shit? it was definitely going downhill by then 19:02 < hprmbridge> alonzoc> Like even without considering a skynet scenario, a moderately intelligent bot that goes a bit screwy and pursues a random goal could be damaging 19:03 < hprmbridge> alonzoc> I dunno but Google is shit these days 19:03 < hprmbridge> alonzoc> I am noticing it more and more 19:03 < fenn> i've literally only used google half a dozen times this year 19:04 < hprmbridge> alonzoc> I only use it out of habit but when doing specific searches I end up using duckduck, yandex, etc to get hits Google just doesn't show 19:04 < superkuh> http://googlesearchonlyreturns400results.lol/ 19:04 < hprmbridge> alonzoc> Like seriously if I have keyterms in quotes google....
Answer me 19:05 < fenn> ddg is ignoring excluded terms now 19:05 < fenn> it just treats -keyword as a regular keyword 19:05 < hprmbridge> alonzoc> Rly 19:05 < hprmbridge> alonzoc> Ffs 19:06 < fenn> their documentation says it's because they rely on other search providers 19:06 < hprmbridge> alonzoc> Why is everyone dumbing down search? 19:06 < fenn> because nobody is in charge 19:06 < hprmbridge> alonzoc> If only yacy was usable and not a bloated mess 19:08 < hprmbridge> alonzoc> I've considered implementing my own search engine multiple times. Some RL web navigating agents combined with scrapers and using other search engines to supplement and provide seeds and you could prolly make a system better than google that uses a fairly low resource load like maybe a few cloud instances max if it's personalised to you 19:08 < hprmbridge> alonzoc> And common crawl would be useful for bootstrapping 19:09 < hprmbridge> alonzoc> If only I had petabytes of disk to spare 19:10 < hprmbridge> alonzoc> Someone really should get around to making a better version of yacy 19:10 < hprmbridge> alonzoc> Maybe a fediverse style protocol for search 19:13 < fenn> i doubt you'd need petabytes to get something useful 19:13 < fenn> a few simple heuristics like "is it sportsball? is it chinese spam?" 
19:13 < hprmbridge> alonzoc> Oh no, the petabyte was more me wanting to cache all of common crawl 19:13 < hprmbridge> alonzoc> Like I'd prolly run pruning on it after 19:14 < hprmbridge> alonzoc> Like the whole common crawl dataset is prolly massively compressible with redundancy along with the standard it being lots of language 19:15 < fenn> and we're back to LLMs 19:15 < hprmbridge> alonzoc> Oh yeah LLMs can be used as pretty good compressors 19:15 < hprmbridge> alonzoc> Well any good AI can 19:16 < fenn> much of the stuff i used to use a search engine for, now i just ask chatGPT 19:16 < fenn> it feels like overkill but it's totally worth the $0.001 19:16 < hprmbridge> alonzoc> Ehh ChatGPT has its uses, I think bing is kinda onto something 19:16 < hprmbridge> alonzoc> But it's early days on that 19:17 < fenn> no i just want a one line answer, not pages and pages of crap to dig through 19:17 < fenn> this is a command line tool 19:17 < fenn> chatblade 19:17 < hprmbridge> alonzoc> One line answers are nice, but sourcing is important and useful 19:18 < fenn> "answers provided by ask.com" 19:18 < fenn> great 19:18 < fenn> no, you'll need a full fledged scholar bot to review half a dozen papers and compare their experimental design 19:18 < hprmbridge> alonzoc> I usually am looking for papers or obscure information, so asking an llm isn't really helpful unless I'm not sure what I'm looking for 19:19 < hprmbridge> alonzoc> ChatGPT has a lot of easily found knowledge stored in its weights but some of the more obscure stuff isn't there 19:20 < fenn> i'm a little flummoxed that we haven't seen a rise in trading of domain specific LoRAs 19:21 < fenn> all you do, is you take your giant paper hoard and dump it into a runpod instance and pay $100 19:21 < fenn> ok i'm exaggerating 19:21 < superkuh> And get a model that outputs its existing knowledge in the style of the hoard?
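[editor's note] alonzoc's remark above that LLMs make good compressors has a standard information-theoretic reading: an ideal entropy coder driven by a predictive model spends -log2 p(token) bits per token, so compressed size approaches the model's cross-entropy on the text. A back-of-envelope sketch with made-up probabilities:

```python
import math

def code_length_bits(token_probs):
    """Bits an ideal entropy coder (e.g. arithmetic coding) needs when the
    model assigns these probabilities to the tokens that actually occur."""
    return sum(-math.log2(p) for p in token_probs)

# probabilities a strong model might assign to the actual next tokens (made up)
confident = [0.9, 0.95, 0.8, 0.9]
# no model at all: 256 equally likely byte values, i.e. 8 bits per byte
uniform = [1.0 / 256] * 4

good = code_length_bits(confident)   # under 1 bit for all four tokens combined
bad = code_length_bits(uniform)      # 32 bits for the same four tokens
```

This is why "lots of language" makes Common Crawl so compressible: a language model's predictions make most of its bytes nearly free to encode.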
19:22 < fenn> this idea that you can't put new knowledge into an LLM with fine tuning is false 19:23 < hprmbridge> alonzoc> Oh yeah you can, you just have to do all the weights, not just the last layer like a lot of low cost fine tuning does 19:23 < fenn> i haven't tried it, but i strongly suspect that you can stack a helpful assistant question answering LoRA on top of a domain knowledge LoRA 19:23 < hprmbridge> alonzoc> Not sure how many layers you need to fine-tune though, like the earliest might be okay the way they are 19:24 < fenn> LoRA can affect all layers 19:24 < hprmbridge> alonzoc> I really need to get some more money and buy myself a big DL machine 19:24 < fenn> it's not worth it 19:24 < fenn> cloud stuff is cheaper than you expect 19:26 < fenn> it's nice to not have to think about being on the clock 20:19 < fenn> a bug in the transformer algorithm? https://www.evanmiller.org/attention-is-off-by-one.html 20:38 < hprmbridge> alonzoc> Ehh, iirc pytorch transformers already have his softmax1 function as an option. Furthermore for large self attention models a head can "opt out" by having a more or less uniform distribution, and as it's then an almost equally weighted sum of thousands of effectively random vectors the output will be close to zero 20:38 < hprmbridge> alonzoc> So like maybe for small softmax layers 20:38 < hprmbridge> alonzoc> But for the ones in LLMs I doubt it'll change much 20:38 < hprmbridge> alonzoc> Might be interesting if it does improve quantisation though 20:38 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 20:38 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 21:14 < fenn> When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield.
It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps 21:14 < fenn> days by members of the subfield. 22:14 < fenn> "John Deere now controls the majority of the world's agricultural future, and they've boobytrapped those ubiquitous tractors with killswitches that can be activated [over the internet]" 22:14 < fenn> good job humans 22:55 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 22:59 -!- _flood [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 244 seconds] 23:04 < hprmbridge> Eli> anything that ends with mab is a monoclonal antibody 23:04 < hprmbridge> Eli> I'm curious if there would be any impact on someone who might be considered phenotypically healthy: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2774903 23:05 < hprmbridge> Eli> impressive results. decreased lean muscle mass is a major issue with semaglutide and even metformin 23:15 < fenn> monthly intravenous therapy in a hospital setting for >>$10k/mo won't be experimented with by healthy people any time soon 23:15 < fenn> of course there would be an effect on a healthy person, it's not magic 23:16 < fenn> i would also like to register my displeasure at the opaque and ugly nature of generic drug names 23:33 < hprmbridge> Eli> there are plenty of hollywood actors on a cocktail of drugs 23:33 < hprmbridge> Eli> same with athletes 23:35 < hprmbridge> Eli> For top athletes/actors, spending a million dollars a year on their body is not a problem 23:36 < hprmbridge> Eli> And if there are life extension effects, then the rich billionaires have no problem with that either --- Log closed Tue Jul 25 00:00:09 2023