--- Log opened Mon Jul 24 00:00:08 2023 00:51 -!- superz_ [~superegg@user/superegg] has joined #hplusroadmap 00:51 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 00:54 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 244 seconds] 00:55 -!- superz_ is now known as superz 01:01 -!- darsie [~darsie@84-113-55-200.cable.dynamic.surfer.at] has joined #hplusroadmap 04:00 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 04:04 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 260 seconds] 04:15 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 04:15 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 04:58 -!- EmmyNoether [~EmmyNoeth@yoke.ch0wn.org] has quit [Ping timeout: 250 seconds] 05:04 -!- EmmyNoether [~EmmyNoeth@yoke.ch0wn.org] has joined #hplusroadmap 05:15 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:5404:e623:172d:4d43] has joined #hplusroadmap 05:58 < fenn> https://fennetic.net/sd/cloning_hallway.jpg https://fennetic.net/sd/cloning_egg_sacs.jpg https://fennetic.net/sd/cloning_showers.jpg https://fennetic.net/sd/genetically_modified_testes.jpg https://fennetic.net/sd/invertebrate_uplift.jpg https://fennetic.net/sd/tissue_printer.jpg https://fennetic.net/sd/tissue_printer2.jpg https://fennetic.net/sd/the_spice_must_flow.jpg 05:59 < fenn> hmph stupid discord 06:44 < docl> what I think is missing in LLMs is uncertainty tracking. 
An LLM cannot reserve the right to change its mind very well 06:45 < docl> humans are barely any better, much of the time 06:47 < nsh> i think what's missing in rocks is opposable thumbs 06:47 < nsh> rocks barely ever pick things up 06:47 < nsh> they don't even have beaks 06:47 < nsh> it's a compression algorithm docl 06:48 < nsh> also it's changing its mind continuously 06:48 < hprmbridge> alonzoc> Yeah there's also that it generates text token by token in order. The architecture is sufficient for a lot of stuff, but you artificially bound the computational depth that goes into generating each token. Ideally you'd have recurrent self attention where the networks self-delimit their computation. However training such a neural net is harder to do at scale and has other issues 06:48 < nsh> they call it reinforcement learning from human feedback 06:49 < nsh> or the exploitation of cheap labour to give it its more traditional name 06:49 < nsh> the scaling problems are just the same scaling problems that apply to parallel computers in general 06:50 < hprmbridge> alonzoc> A middle ground could be something like stable diffusion where you do what amounts to annealing or belief prop on the token stream 06:50 < nsh> it's all annealing 06:50 < nsh> you're reducing a temperature to something that you arbitrarily cut off as a ground state 06:50 < nsh> and output 06:51 < hprmbridge> alonzoc> Yes and no, at that point everything is annealing more or less.
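[editor's note] nsh's framing of sampling as annealing, lowering a temperature until the distribution collapses onto a "ground state," corresponds directly to the temperature parameter used when sampling from an LLM's next-token logits. A minimal sketch with made-up toy logits (not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into a next-token distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
hot = softmax_with_temperature(logits, 2.0)    # high T: near uniform, exploratory
cold = softmax_with_temperature(logits, 0.1)   # low T: mass collapses onto the argmax
```

At high temperature the three options stay comparably likely; as the temperature drops, nearly all probability mass lands on the single best token, which is the "arbitrary cut off as a ground state" nsh describes.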
06:51 < nsh> there are different ways of implementing this but they're just spins on the same physical intuition 06:51 < nsh> sure 06:51 < nsh> well 06:51 < nsh> we sort of decided everything was physics a few millennia ago 06:51 < nsh> and we're kind invested in that way of looking at things now 06:51 < nsh> *kinda 06:52 * nsh is just chatting shit :) 06:55 < hprmbridge> alonzoc> Sampling token by token with a neural net giving approximate marginals on the next token is "annealing" in that you're collapsing your joint distribution over time steps but there's little bidirectional talk between parts of the state. A net that rewrote an input token stream reducing total entropy of the marginal approximation of the joint would be much more "in spirit" of annealing 07:01 < nsh> indeed almost everything we know about physics suggests things don't happen in a manner that can be unwound into a linear succession of state transitions 07:02 < nsh> anything in which there is a universal proper time and an absolutely defined state at each point therein 07:02 < nsh> is going to be far from optimal 07:04 < nsh> classical computing in general is stupid, but it doesn't tend to make one popular to point this out 07:04 < hprmbridge> alonzoc> Sequential token sampling models work for most language but not for generating outputs that have constraints, for example if you ask ChatGPT to write a poem with some constraint like the number of syllables depending on some other property of the poem. It can check it but during sampling it makes a mistake and can't correct. Future tokens can't affect the generation of past tokens nicely. I think 07:04 < hprmbridge> alonzoc> DM is planning on using MCTS to try to help on that.
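[editor's note] alonzoc's point about sequential sampling and global constraints can be shown with a toy decoding problem: a purely left-to-right greedy decoder commits to early tokens and cannot revise them when a sequence-level constraint later becomes unsatisfiable, while a search over the joint sequence space (the role MCTS or beam search would play) can trade early tokens against later ones. Everything below is a made-up stand-in, not a real language model:

```python
from itertools import product

VOCAB = [3, 4]           # toy "tokens"
LENGTH, TARGET = 3, 10   # global constraint: three tokens summing to exactly 10

def greedy():
    """Left-to-right decoding, always taking the locally best (here: largest)
    token; once it overshoots there is no way to revise earlier choices."""
    seq = []
    for _ in range(LENGTH):
        seq.append(max(VOCAB))
    return seq

def search():
    """Exhaustive search over the joint space (stand-in for MCTS/beam search):
    early tokens can be chosen with the whole-sequence constraint in view."""
    for seq in product(VOCAB, repeat=LENGTH):
        if sum(seq) == TARGET:
            return list(seq)
    return None

greedy_out = greedy()   # [4, 4, 4] -> sum 12, violates the constraint
search_out = search()   # [3, 3, 4] -> sum 10, satisfies it
```

The syllable-count poem is the same situation at scale: checking the constraint is easy, but fixing it requires influence running from future tokens back to past ones.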
07:05 < nsh> apropos https://faculty.washington.edu/jcramer/TI/tiqm_1986.pdf 07:05 < hprmbridge> alonzoc> Like in principle sequential sampling can be perfect but when you have heuristic estimates and it's possible to embed SAT problems you end up with the problems of belief-prop-based SAT 07:06 < nsh> cf. https://www.quantamagazine.org/to-move-fast-quantum-maze-solvers-must-forget-the-past-20230720/ 07:38 < docl> some stuff is reversible, some isn't. shape rotation, for example, can be transformed backward and forward losslessly. add heat and other complex phenomena in and it can't be. but we develop technology by isolating the parts that can be. 07:40 < docl> and we have to characterize and track what we don't know, otherwise we don't know what we do know 07:40 < nsh> (you know nothing, jon snow) 07:42 < docl> nothing is the wrong amount to know 07:45 < docl> socrates got it wrong, was mistranslated, or was just trying to sound profound. you are only wise by knowing that you don't know some things, and the better your grasp on what precisely those are and how much you don't know about them the wiser you are. that's why markets win, they turn uncertainty into numbers. 07:55 < nsh> ah yes, humility, on reflection, it's kinda stupid 07:55 < nsh> this is demonstrative of a firm grasp of philosophy :) 08:12 -!- srk_ [~sorki@user/srk] has joined #hplusroadmap 08:15 -!- srk [~sorki@user/srk] has quit [Ping timeout: 246 seconds] 08:15 -!- srk_ is now known as srk 08:41 -!- srk_ [~sorki@user/srk] has joined #hplusroadmap 08:45 -!- srk [~sorki@user/srk] has quit [Ping timeout: 260 seconds] 08:45 -!- srk_ is now known as srk 09:31 < docl> the idea that you should falsify certainty is fully at odds with my remarks. I say do not falsify certainty or uncertainty. they are two sides of the same coin.
if you weight a coin, it introduces bias regardless of whether you select heads or tails 10:28 -!- lorenz [~lorogue@77.33.23.154] has joined #hplusroadmap 11:03 -!- ChanServ [ChanServ@services.libera.chat] has quit [shutting down] 11:05 -!- lorenz [~lorogue@77.33.23.154] has quit [Read error: Connection reset by peer] 11:14 < kanzure> "RNA demethylation increases the yield and biomass of rice and potato plants in field trials" https://www.nature.com/articles/s41587-021-00982-9 11:15 -!- ChanServ [ChanServ@services.libera.chat] has joined #hplusroadmap 11:15 -!- ServerMode/#hplusroadmap [+o ChanServ] by molybdenum.libera.chat 11:15 < kanzure> "Voyager: An open-ended embodied agent with large language models" https://voyager.minedojo.org/ 11:16 < hprmbridge> kanzure> https://cdn.discordapp.com/attachments/1064664282450628710/1133100403886665799/exploration_performance.png 11:17 -!- A_Dragon [A_D@libera/staff/dragon] has quit [Killed (Stx (Happy birthday! (I know its a bit late) She: Yes it does.))] 11:17 -!- A_Dragon [A_D@libera/staff/dragon] has joined #hplusroadmap 11:34 < docl> https://github.com/karpathy/llama2.c 11:42 < L29Ah> docl: is it of any use? 11:43 < docl> it can tell you a story with the model he provided. I'm still looking for a more impressive demo 12:07 < L29Ah> docl: why are you excited with it when there's llama.cpp? 12:08 < docl> it's a lot less lines of code, better chance I will grok 12:10 < docl> < 500 lines of c, no dependencies (just standard libs) 13:27 -!- test_ [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 13:31 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 245 seconds] 13:43 -!- superkuh [~superkuh@user/superkuh] has quit [Ping timeout: 246 seconds] 13:59 -!- superkuh [~superkuh@user/superkuh] has joined #hplusroadmap 14:15 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 
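[editor's note] docl's interest in llama2.c is that the whole inference path fits in one small C file. Stripped of the transformer math, the control flow run.c implements is just an autoregressive loop: run the model forward, pick a next token from its logits, feed it back in. A toy Python sketch of that structure only, with a random lookup table standing in for the real forward pass:

```python
import random

VOCAB = 16
rng = random.Random(0)
# toy stand-in for the transformer forward pass in run.c: a fixed random
# logit table indexed by the previous token (i.e. just a Markov chain)
LOGITS = [[rng.gauss(0, 1) for _ in range(VOCAB)] for _ in range(VOCAB)]

def forward(token):
    """Return 'logits' over the next token given the previous one."""
    return LOGITS[token]

def generate(prompt_token, steps):
    """The autoregressive loop: forward, pick a token, append, repeat."""
    seq = [prompt_token]
    for _ in range(steps):
        logits = forward(seq[-1])
        seq.append(logits.index(max(logits)))  # greedy argmax; real decoders may sample
    return seq

out = generate(prompt_token=1, steps=8)
```

The appeal of the < 500-line version is exactly that nothing in this loop is hidden behind a framework.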
14:15 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 14:35 -!- test_ is now known as _flood 14:47 < hprmbridge> alonzoc> Yeah neural nets are really simple which is part of their charm tbh 14:47 < hprmbridge> alonzoc> For a long time I really didn't like them but they have their charms esp with softmax layer activations 15:40 -!- stipa_ [~stipa@user/stipa] has joined #hplusroadmap 15:41 -!- stipa [~stipa@user/stipa] has quit [Ping timeout: 258 seconds] 15:41 -!- stipa_ is now known as stipa 18:40 < fenn> optical beam steering https://lumotive.com/technology/ 18:47 < fenn> an LLM *can* do recurrent self attention already. all you have to do is give it some scratch pad space to think out loud. then it can read what it wrote and think about it again, write some more, think again, until it's satisfied. then re-write the final draft. unsurprisingly this is how many humans function in practice. 18:47 < fenn> time boundedness of computation is a good thing, because it lets you interact with the world in real time 18:49 < fenn> "it's a simple matter of prompting" 18:49 < hprmbridge> alonzoc> True, however it's pretty ad hoc and has its own issues. Direct architectural support would reduce the need for prompt engineering and further accuracy. 18:54 < hprmbridge> alonzoc> I agree on time boundedness being useful and important though no one wants their neural net to hang while it grinds forever on an NP-hard problem 18:56 < fenn> i still have to do a deep dive on how transformers work, but i think that future tokens can affect past tokens within a narrow window of say a dozen tokens 18:56 < hprmbridge> alonzoc> It depends on the specific architecture being used 18:57 < fenn> yes 18:57 < hprmbridge> alonzoc> There are variants yes 18:57 < fenn> this was the idea behind BERT?
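[editor's note] fenn's question about future tokens affecting past tokens comes down to the attention mask: a causal (GPT-style) mask zeroes out attention to later positions, while BERT's bidirectional mask lets every position attend to the whole sequence. A small sketch of the two masks with uniform dummy attention scores:

```python
import math

T = 5  # sequence length

def attention_weights(scores, mask):
    """Row-wise softmax over attention scores, excluding masked-out positions."""
    weights = []
    for i in range(T):
        allowed = [scores[i][j] if mask[i][j] else None for j in range(T)]
        m = max(s for s in allowed if s is not None)
        exps = [math.exp(s - m) if s is not None else 0.0 for s in allowed]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights

scores = [[0.0] * T for _ in range(T)]
causal = [[j <= i for j in range(T)] for i in range(T)]  # GPT-style: no peeking ahead
bidirectional = [[True] * T for _ in range(T)]           # BERT-style: full context

w_causal = attention_weights(scores, causal)
w_bert = attention_weights(scores, bidirectional)
# under the causal mask, position 0 places zero weight on every later position;
# under the bidirectional mask it attends to all of them
```

So in a causal decoder a past token's representation genuinely cannot see the future; BERT-style encoders remove that restriction, at the cost of no longer being directly usable for left-to-right generation.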
18:57 < hprmbridge> alonzoc> Transformer is only useful as a name in telling us it relies on a ton of self attention layers 18:57 < fenn> we are all special and unique snowflakes 18:58 < hprmbridge> alonzoc> Well it's in the name 19:00 < fenn> but BERT's attention doesn't span arbitrarily far into the future 19:01 -!- yashgaroth [~ffffffff@2601:5c4:c780:6aa0:5404:e623:172d:4d43] has quit [Quit: Leaving] 19:01 < hprmbridge> alonzoc> Ideally though a system would actually "exist in time" being aware of its own computation time with some form of asynchronous IO. Such a model would be harder to train than a normal turing complete model, and such a setup is making even a basic language generation model more agentive as it'd have to decide when to stop a computation and give in etc 19:01 < hprmbridge> alonzoc> And there is understandable hesitancy around making LLMs and other large general purpose models more agentive 19:02 < fenn> huh. google has been using BERT for every query since 2020 19:02 < fenn> when did google really go to shit? it was definitely going downhill by then 19:02 < hprmbridge> alonzoc> Like even without considering a skynet scenario, a moderately intelligent bot that goes a bit screwy and pursues a random goal could be damaging 19:03 < hprmbridge> alonzoc> I dunno but Google is shit these days 19:03 < hprmbridge> alonzoc> I am noticing it more and more 19:03 < fenn> i've literally only used google half a dozen times this year 19:04 < hprmbridge> alonzoc> I only use it out of habit but when doing specific searches I end up using duckduck, yandex, etc to get hits Google just doesn't show 19:04 < superkuh> http://googlesearchonlyreturns400results.lol/ 19:04 < hprmbridge> alonzoc> Like seriously if I have keyterms in quotes google....
Answer me 19:05 < fenn> ddg is ignoring excluded terms now 19:05 < fenn> it just treats -keyword as a regular keyword 19:05 < hprmbridge> alonzoc> Rly 19:05 < hprmbridge> alonzoc> Ffs 19:06 < fenn> their documentation says it's because they rely on other search providers 19:06 < hprmbridge> alonzoc> Why is everyone dumbing down search? 19:06 < fenn> because nobody is in charge 19:06 < hprmbridge> alonzoc> If only yacy was usable and not a bloated mess 19:08 < hprmbridge> alonzoc> I've considered implementing my own search engine multiple times. Some RL web navigating agents combined with scrapers and using other search engines to supplement and provide seeds and you could prolly make a system better than google that uses a fairly low resource load like maybe a few cloud instances max if it's personalised to you 19:08 < hprmbridge> alonzoc> And common crawl would be useful for bootstrapping 19:09 < hprmbridge> alonzoc> If only I had petabytes of disk to spare 19:10 < hprmbridge> alonzoc> Someone really should get around to making a better version of yacy 19:10 < hprmbridge> alonzoc> Maybe a fediverse style protocol for search 19:13 < fenn> i doubt you'd need petabytes to get something useful 19:13 < fenn> a few simple heuristics like "is it sportsball? is it chinese spam?" 
19:13 < hprmbridge> alonzoc> Oh no, the petabyte was more me wanting to cache all of common crawl 19:13 < hprmbridge> alonzoc> Like I'd prolly run pruning on it after 19:14 < hprmbridge> alonzoc> Like the whole common crawl dataset is prolly massively compressible with redundancy along with the standard it being lots of language 19:15 < fenn> and we're back to LLMs 19:15 < hprmbridge> alonzoc> Oh yeah LLMs can be used as pretty good compressors 19:15 < hprmbridge> alonzoc> Well any good AI can 19:16 < fenn> much of the stuff i used to use a search engine for, now i just ask chatGPT 19:16 < fenn> it feels like overkill but it's totally worth the $0.001 19:16 < hprmbridge> alonzoc> Ehh ChatGPT has its uses, I think bing is kinda onto something 19:16 < hprmbridge> alonzoc> But it's early days on that 19:17 < fenn> no i just want a one line answer, not pages and pages of crap to dig through 19:17 < fenn> this is a command line tool 19:17 < fenn> chatblade 19:17 < hprmbridge> alonzoc> One line answers are nice, but sourcing is important and useful 19:18 < fenn> "answers provided by ask.com" 19:18 < fenn> great 19:18 < fenn> no, you'll need a full fledged scholar bot to review half a dozen papers and compare their experimental design 19:18 < hprmbridge> alonzoc> I usually am looking for papers or obscure information, so asking an llm isn't really helpful unless I'm not sure what I'm looking for 19:19 < hprmbridge> alonzoc> ChatGPT has a lot of easily found knowledge stored in its weights but some of the more obscure stuff isn't there 19:20 < fenn> i'm a little flummoxed that we haven't seen a rise in trading of domain specific LoRAs 19:21 < fenn> all you do, is you take your giant paper hoard and dump it into a runpod instance and pay $100 19:21 < fenn> ok i'm exaggerating 19:21 < superkuh> And get a model that outputs its existing knowledge in the style of the hoard?
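[editor's note] alonzoc's remark above that LLMs make good compressors has a standard information-theoretic reading: an ideal entropy coder driven by a predictive model spends -log2 p(token) bits per token, so compressed size approaches the model's cross-entropy on the text. A back-of-envelope sketch with made-up probabilities:

```python
import math

def code_length_bits(token_probs):
    """Bits an ideal entropy coder (e.g. arithmetic coding) needs when the
    model assigns these probabilities to the tokens that actually occur."""
    return sum(-math.log2(p) for p in token_probs)

# probabilities a strong model might assign to the actual next tokens (made up)
confident = [0.9, 0.95, 0.8, 0.9]
# no model at all: 256 equally likely byte values, i.e. 8 bits per byte
uniform = [1.0 / 256] * 4

good = code_length_bits(confident)   # under 1 bit for all four tokens combined
bad = code_length_bits(uniform)      # 32 bits for the same four tokens
```

This is why "lots of language" makes Common Crawl so compressible: a language model's predictions make most of its bytes nearly free to encode.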
19:22 < fenn> this idea that you can't put new knowledge into an LLM with fine tuning is false 19:23 < hprmbridge> alonzoc> Oh yeah you can, you just have to do all the weights, not just the last layer like a lot of low cost fine tuning does 19:23 < fenn> i haven't tried it, but i strongly suspect that you can stack a helpful assistant question answering LoRA on top of a domain knowledge LoRA 19:23 < hprmbridge> alonzoc> Not sure how many layers you need to fine-tune though, like the earliest might be okay the way they are 19:24 < fenn> LoRA can affect all layers 19:24 < hprmbridge> alonzoc> I really need to get some more money and buy myself a big DL machine 19:24 < fenn> it's not worth it 19:24 < fenn> cloud stuff is cheaper than you expect 19:26 < fenn> it's nice to not have to think about being on the clock 20:19 < fenn> a bug in the transformer algorithm? https://www.evanmiller.org/attention-is-off-by-one.html 20:38 < hprmbridge> alonzoc> Ehh, iirc pytorch transformers already have his softmax1 function as an option. Furthermore for large self attention models a head can "opt out" by having a more or less uniform distribution, and as it's then an almost equally weighted sum of thousands of effectively random vectors the output will be close to zero 20:38 < hprmbridge> alonzoc> So like maybe for small softmax layers 20:38 < hprmbridge> alonzoc> But for the ones in LLMs I doubt it'll change much 20:38 < hprmbridge> alonzoc> Might be interesting if it does improve quantisation though 20:38 -!- TMM_ [hp@amanda.tmm.cx] has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] 20:38 -!- TMM_ [hp@amanda.tmm.cx] has joined #hplusroadmap 21:14 < fenn> When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield.
It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps 21:14 < fenn> days by members of the subfield. 22:14 < fenn> "John Deere now controls the majority of the world's agricultural future, and they've boobytrapped those ubiquitous tractors with killswitches that can be activated [over the internet]" 22:14 < fenn> good job humans 22:55 -!- flooded [flooded@gateway/vpn/protonvpn/flood/x-43489060] has joined #hplusroadmap 22:59 -!- _flood [flooded@gateway/vpn/protonvpn/flood/x-43489060] has quit [Ping timeout: 244 seconds] 23:04 < hprmbridge> Eli> anything that ends with mab is a monoclonal antibody 23:04 < hprmbridge> Eli> I'm curious if there would be any impact on someone who might be considered phenotypically healthy: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2774903 23:05 < hprmbridge> Eli> impressive results. decreased lean muscle mass is a major issue with semaglutide and even metformin 23:15 < fenn> monthly intravenous therapy in a hospital setting for >>$10k/mo won't be experimented with by healthy people any time soon 23:15 < fenn> of course there would be an effect on a healthy person, it's not magic 23:16 < fenn> i would also like to register my displeasure at the opaque and ugly nature of generic drug names 23:33 < hprmbridge> Eli> there are plenty of hollywood actors on a cocktail of drugs 23:33 < hprmbridge> Eli> same with athletes 23:35 < hprmbridge> Eli> For top athletes/actors, spending a million dollars a year on their body is not a problem 23:36 < hprmbridge> Eli> And if there are life extension effects, then the rich billionaires have no problem with that either --- Log closed Tue Jul 25 00:00:09 2023