--- Log opened Sun Aug 24 00:00:44 2014
00:12 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Ping timeout: 246 seconds]
00:25 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
00:27 < dingo_> pine apples neither grow on pine trees or are a form of apple
00:48 < nmz787> they are sharp like pine needles though
00:51 -!- Jaakko914 [~Jaakko@host86-134-218-208.range86-134.btcentralplus.com] has joined ##hplusroadmap
00:52 -!- Jaakko914 [~Jaakko@host86-134-218-208.range86-134.btcentralplus.com] has quit [Client Quit]
00:53 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has joined ##hplusroadmap
00:55 -!- lichen [~lichen@c-50-139-11-6.hsd1.or.comcast.net] has quit [Quit: Lost terminal]
01:24 -!- justanotheruser [~Justan@unaffiliated/justanotheruser] has joined ##hplusroadmap
01:26 -!- poppingtonic [~poppingto@105.231.46.43] has joined ##hplusroadmap
01:34 -!- poppingtonic [~poppingto@105.231.46.43] has quit [Remote host closed the connection]
01:35 -!- poppingtonic [~poppingto@105.231.46.43] has joined ##hplusroadmap
01:44 < nmz787> paperbot: http://www.microfluidicsinfo.com/Connectors.pdf
01:44 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/75e6714f449b3906c95620aa76848f42.pdf
01:45 < nmz787> pfft, that filename sucks http://diyhpl.us/~nmz787/pdf/Microfluidic_Connector_types.pdf
01:45 < nmz787> http://www.microfluidicsinfo.com/index_files/Standards.htm
01:51 -!- ebowden_ [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
01:51 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Read error: Connection reset by peer]
02:04 -!- EnLilaSko [EnLilaSko@unaffiliated/enlilasko] has joined ##hplusroadmap
02:16 -!- Viper168_ [~Viper@unaffiliated/viper168] has joined ##hplusroadmap
02:18 -!- Viper168 [~Viper@unaffiliated/viper168] has quit [Ping timeout: 260 seconds]
02:31 -!- Viper168_ is now known as Viper168
02:40 -!- strangewarp [~strangewa@c-76-25-206-3.hsd1.co.comcast.net] has quit [Ping timeout: 260 seconds]
02:45 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has quit [Quit: Leaving]
02:57 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has joined ##hplusroadmap
03:07 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has joined ##hplusroadmap
03:34 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has quit [Quit: Leaving]
03:37 -!- kuldeepdhaka [~kuldeepdh@unaffiliated/kuldeepdhaka] has quit [Read error: Connection reset by peer]
05:20 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
05:20 -!- ebowden_ [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Read error: Connection reset by peer]
05:25 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Ping timeout: 240 seconds]
05:28 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
05:31 -!- poppingtonic [~poppingto@105.231.46.43] has quit [Ping timeout: 260 seconds]
05:47 -!- ielo [~ielo@cpc5-camd14-2-0-cust311.hari.cable.virginm.net] has joined ##hplusroadmap
06:13 -!- strangewarp [~strangewa@c-76-25-206-3.hsd1.co.comcast.net] has joined ##hplusroadmap
06:13 -!- ebowden_ [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
06:13 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Ping timeout: 240 seconds]
06:38 -!- ebowden_ [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has quit [Remote host closed the connection]
06:41 -!- ielo [~ielo@cpc5-camd14-2-0-cust311.hari.cable.virginm.net] has quit [Ping timeout: 264 seconds]
06:53 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has joined ##hplusroadmap
07:37 < nsh> .g René Daumal and the pataphysics of liberation
07:37 < yoleaux> http://link.springer.com/article/10.1007%2FBF00214598
07:37 < nsh> paperbot
07:37 < nsh> paperbot, http://link.springer.com/article/10.1007%2FBF00214598
07:37 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/Ren%20Daumal%20and%20the%20pataphysics%20of%20liberation.pdf
07:38 < nsh> \o/
07:38 < nsh> why can't i have crypto from springer then?
07:39 * nsh shrugs
08:29 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has quit [Read error: Connection reset by peer]
08:29 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has joined ##hplusroadmap
08:30 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has quit [Read error: Connection reset by peer]
08:31 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has joined ##hplusroadmap
08:32 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has quit [Read error: Connection reset by peer]
08:32 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has joined ##hplusroadmap
08:47 -!- kuldeepdhaka [~kuldeepdh@unaffiliated/kuldeepdhaka] has joined ##hplusroadmap
08:55 -!- yashgaroth [~ffffff@cpe-76-167-105-53.san.res.rr.com] has joined ##hplusroadmap
09:40 -!- kuldeepdhaka [~kuldeepdh@unaffiliated/kuldeepdhaka] has quit [Read error: Connection reset by peer]
09:44 -!- comma8 [comma8@gateway/shell/yourbnc/x-rbwelggerzfacdyt] has quit [Read error: Connection reset by peer]
09:46 < kanzure> nsh: think of it as a science lottery
09:46 < nsh> kk
09:47 < nsh> kanzure, do you know what the department of justice's policy is on name change?
09:47 < kanzure> nope
09:47 < nsh> like, if i were to get my name officially changed to Wang Dong, would they update the indictments, or just start using it on newer ones?
09:47 < nsh> i'm tempted to call and ask
09:47 < kanzure> they would probably not know
09:47 < nsh> true
09:47 < nsh> would be a waste of having to talk to an american
09:48 < kanzure> but you should pick wu tang over wang dong
09:48 < nsh> kk
09:49 < kanzure> or, "hi, yes, i am no longer named, please remove that old name and replace it with blanks"
09:58 < nsh> hehe
10:18 -!- ielo [~ielo@46.233.72.123] has joined ##hplusroadmap
10:18 -!- gene_hacker [~chatzilla@c-50-137-46-240.hsd1.or.comcast.net] has joined ##hplusroadmap
10:19 -!- Viper168 [~Viper@unaffiliated/viper168] has quit [Ping timeout: 245 seconds]
10:26 -!- Viper168 [~Viper@unaffiliated/viper168] has joined ##hplusroadmap
10:30 -!- ielo [~ielo@46.233.72.123] has quit [Ping timeout: 255 seconds]
10:49 < dingo_> man
10:49 < dingo_> lastnight i was partially asleep
10:49 < dingo_> and then woosh... it felt like a bear ran full speed into my house
10:49 < dingo_> then im remembering it again this morning and going huh
10:49 < kanzure> i had a dream about abandoning you in a war zone
10:50 < dingo_> i wonder if there are any bears here
10:50 < kanzure> sorry i abandoned you
10:50 < dingo_> but it was an earthquake, not a bear
10:50 < dingo_> thats ok, i dreamed about a war last night too
10:58 -!- lichen [~lichen@c-50-139-11-6.hsd1.or.comcast.net] has joined ##hplusroadmap
11:00 -!- CheckDavid [uid14990@gateway/web/irccloud.com/x-vfdmqnccghkqdzrs] has joined ##hplusroadmap
11:24 -!- strangewarp_ [~strangewa@c-76-25-206-3.hsd1.co.comcast.net] has joined ##hplusroadmap
11:25 -!- strangewarp [~strangewa@c-76-25-206-3.hsd1.co.comcast.net] has quit [Ping timeout: 246 seconds]
11:57 < nmz787> nsh: which paper can't you have?
11:57 < nsh> http://link.springer.com/chapter/10.1007/978-3-642-31912-9_16
11:57 < nsh> it's a chapter of proceedings
11:57 < nsh> which might be the problem
11:58 < nmz787> nbsp is my name
11:58 < nmz787> dingo_: you live in NorCal?
11:59 < nmz787> ah, yes I saw that yesterday but didn't have my proxy logins
11:59 < nmz787> lemme check
11:59 < nsh> kk
11:59 < nmz787> nsh: want chapter or whole book?
12:00 < nsh> either's good
12:00 < dingo_> i live in boulder creek, CA
12:00 < dingo_> pretty far away from it, but i'm sure i felt it
12:01 < nmz787> nsh: http://diyhpl.us/~nmz787/pdf/An_Improved_Known_Plaintext_Attack_on_PKZIP_Encryption_Algorithm.pdf
12:01 < nsh> \o/
12:01 < nsh> tyty
12:01 < kanzure> nsh still hasn't cracked it?
12:01 < nsh> nups
12:01 < nsh> i still technically haven't downloaded 60% of it
12:02 < nmz787> dingo_: ever hang out with old-timers in that area?
12:02 < nmz787> dingo_: computer science old-timers
12:03 -!- ThomasEgi_ [~thomas@p5DDDD23A.dip0.t-ipconnect.de] has quit [Remote host closed the connection]
12:04 -!- justanotheruser [~Justan@unaffiliated/justanotheruser] has quit [Ping timeout: 240 seconds]
12:04 < dingo_> mm nope, i'd like to though, i get along with the older folk
12:04 < dingo_> haven't been here long and been working too much
12:04 < dingo_> i hear you can meet diffie in a coffee shop he's seen at daily
12:05 < nmz787> kanzure: so I was able to post creds to the proxy and get the logged-in sciencedirect page for example, and I see there are like 9 cookies with domain sciencedirect.com
12:05 < nmz787> kanzure: but now I wonder, why isn't the cookie for the proxy you logged into?
12:06 < kanzure> why would you HTTP POST credentials to the proxy? i thought the proxy was supposed to be self-contained.
12:06 < kanzure> which proxy are you talking about
12:06 < kanzure> the one on localhost or the remote one that you don't control
12:06 < nmz787> kanzure: also, the result of logging in using a request.post results in the page content/text... so is it possible to return that using this example's method https://github.com/mitmproxy/mitmproxy/blob/f4a1459ebeca7c72419bce17d931f8b2c846df5e/examples/redirect_requests.py
12:06 < nmz787> remote one
12:06 < kanzure> is that a question?
12:07 < nmz787> yes ?... as the param passed in differs in name from this example's param http://mitmproxy.org/doc/scripting/libmproxy.html
12:08 < nmz787> they both use the reply() but I can't find a definition for reply in the repo
12:08 < nmz787> so i wonder if it's the same
12:08 < nmz787> or not
12:08 < kanzure> you forgot to return flow from that function
12:09 < nmz787> i did?
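
A minimal python-requests sketch of the manual check nmz787 describes at 12:05 (POST credentials to an EZproxy login URL with the target article URL appended, then inspect the returned cookies and redirect history). The proxy host and form field names below are placeholders, not the real ones; the actual form target should be read from the EZproxy login page, as noted later in the conversation.

# diagnostic sketch only; assumes a hypothetical EZproxy host and form fields
import requests

EZPROXY_LOGIN = "http://ezproxy.example.edu/login?url="   # placeholder host
ARTICLE_URL = "http://www.sciencedirect.com/science/article/pii/S0969804397101233"

session = requests.Session()
response = session.post(EZPROXY_LOGIN + ARTICLE_URL,
                        data={"user": "someuser", "pass": "somepass"},
                        allow_redirects=True)

# response.history lists any redirects followed on the way to the journal page
for hop in response.history:
    print(hop.status_code, hop.url)

# the session cookie jar shows which domain each cookie was actually set for
for cookie in session.cookies:
    print(cookie.domain, cookie.name)
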
12:09 < kanzure> oh you didn't write it
12:09 < kanzure> yeah i haven't used reply()
12:10 < nmz787> otherwise i think I can use the libmproxy example to set al the 9 cookies
12:10 -!- justanotheruser [~Justan@unaffiliated/justanotheruser] has joined ##hplusroadmap
12:10 < nmz787> but it seems dumb to make two requests
12:11 < kanzure> i am having trouble understanding your questions
12:12 < kanzure> libmproxy.flow.FlowMaster gives you a handle_request and a handle_response
12:13 < nmz787> in http://mitmproxy.org/doc/scripting/libmproxy.html there is handle_request... the first thing I do in there is grab the URL, the request_iteration int you mentioned you/me will add to pass, and then grab the proxy_url from my proxy_list and append the pdf_url to that... then i use request.post to post to the concatenated url with login creds
12:14 < nmz787> then I can check the URL of the response to see if it contains the login domain
12:14 < nmz787> and if not that means we were redirected to the journal page
12:14 < kanzure> you can also set requests.post to not allow redirects
12:15 < nmz787> oh, ok
12:15 < kanzure> and if it does allow redirects then requests.post returns a response that has response.history which is a list of other responses in the chain
12:15 < nmz787> well my only contention was is i was just using that response to grab the cookies, then I imagine the mitmproxy lib would make another request after I was done adding the cookies
12:16 < nmz787> so if i already had the journal page data, it would be double requests to that page
12:16 < nmz787> i also wondered since the cookie.domain was sciencedirect.com
12:17 < nmz787> not the proxy server domain, if on subsequent tries to the other proxies in the list, if the cookies would be overwritten
12:17 < nmz787> i couldn't find any references in the cookies to the login page URL/domain
12:18 < nmz787> and so was wondering if I should use a separate request Session for each proxy I try
12:18 < nmz787> (i don't think my browser gets confused when I login to multiple proxies at once)
12:18 < kanzure> you can store cookies in a global in your libmproxy daemon thing
12:19 < nmz787> i bet, but I wondered why i coulnd't see any of the login domain traces
12:19 < kanzure> you can also investigate the cookies manually through a python repl, python-requests and one of the ezproxy servers
12:19 < nmz787> yes i did the login with python-requests
12:20 < nmz787> huh, the request response .history is empty
12:21 < kanzure> then it didn't redirect
12:23 < nmz787> repr of the response.cookies shows nothing with login page domain in it
12:24 < kanzure> well what was your request
12:25 < nmz787> i took the URL from my bookmarklet javascript:void(location.href="http://daproxy.lib.dahost.edu:80/login?url="+location.href)
12:25 < nmz787> and appended a sciencedirect URL to that
12:25 < kanzure> ezproxy automatically logs in if your ip address is allowed
12:26 < nmz787> then request.post(thatURL, data={'user':'blah','login':'doody'}
12:26 < nmz787> and the returned object is the response
12:26 < kanzure> /login is just the login form, you should double check the actual form target
12:26 < nmz787> len(list(r.cookies)) is 9
12:26 < nmz787> yep
12:26 < nmz787> i did
12:27 < nmz787> the login works this way
12:27 < nmz787> for this page
12:27 < nmz787> and also the response.context contains the 'Download PDF' string I see in my browser after successful login
12:28 < kanzure> there is no response.context
12:28 < nmz787> (prev it would say 'Purchase')
12:28 < nmz787> sorry content
12:29 < nmz787> the libmproxy example shows msg.headers["cookie"] = cookieValue
12:29 -!- yoleaux [~yoleaux@xn--ht-1ia18f.nonceword.org] has quit [Ping timeout: 250 seconds]
12:29 < nmz787> but there are 9 cookies, and it seems there might also be a msg.cookies
12:29 < nmz787> (each cookie has a unique key)
12:30 < nmz787> but that's when I started wondering about logging into a diff proxy host with this same request session
12:30 < nmz787> if the new login might overwrite existing cookies
12:30 < kanzure> why are oyu using libmproxy?
12:30 < nmz787> idk if that is a think
12:30 < nmz787> thing
12:30 < nmz787> cause you told me to
12:30 < nmz787> the docs suck for it
12:30 < nmz787> fo sho
12:30 < kanzure> why did you need an http proxy?
12:31 < nmz787> you told me
12:32 < kanzure> that's a dumb reason
12:32 < kanzure> hm
12:33 < nmz787> i could just use flask and still have it be a local port
12:33 < nmz787> in that case I could just return the response.content i think
12:34 < nmz787> and if paperbot couldn't find the PDF in that, it would ask again with an incremented request_iteration int (as I'm calling it)
12:37 < kanzure> nmz787: https://gist.github.com/kanzure/8d8c1582883104e8785c
12:39 < nmz787> you'll pass some int in data also?
12:41 < kanzure> either way would work
12:41 < kanzure> fetch_by_ezproxy can either loop, or not
12:41 < nmz787> well if you didn't pass the int, I'd have to parse the response for success
12:42 < nmz787> which seems like paperbot is already doing somewhere
12:42 < kanzure> just use an int in the request
12:43 -!- sapiosexual [~sapiosexu@d173-183-74-37.bchsia.telus.net] has joined ##hplusroadmap
13:02 < bbrittain> There is now an easy app that will store all your location data and post it to an arbitrary endpoint on demand. I know more about android than I ever wanted to.
13:02 < bbrittain> as per my complaining yesterday
13:02 < nmz787> kanzure: ok I think i'm done with the flask server
13:04 < nmz787> kanzure: you need to pass url (I'd prefer if we changed it to pdf_url, and also an int as request_iteration)
13:04 < kanzure> bbrittain: post to localhost
13:05 < kanzure> nmz787: how does paperbot know to stop sending requests?
13:05 < nmz787> i thought it parses/knows if it get's a good PDF
13:05 < kanzure> yes but i mean the other situation
13:05 < nmz787> oh, like if i run out
13:05 < nmz787> umm
13:06 < bbrittain> kanzure: -_- I'm not gonna analyze where I go from my phone
13:06 < nmz787> I could add a header keep_trying= True?
13:06 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has quit [Ping timeout: 245 seconds]
13:06 < nmz787> or proxies_remaining=int
13:06 < kanzure> whatever
13:06 < nmz787> kanzure: when I try running as-is I'm getting socket.error: [Errno 98] Address already in use
13:07 < kanzure> app.run(port=8500)
13:08 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has joined ##hplusroadmap
13:08 < nmz787> kanzure: running
13:08 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
13:09 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/e6be5587dd4be229efb016b786c21680.txt
13:09 < nmz787> (i know it's not hooked up yet, likely)
13:09 < nmz787> i'm returning proxies_remaining in the headers
13:09 < kanzure> were you writing the paperbot part?
13:09 < nmz787> nah didn't touch it
13:10 < nmz787> is it easy to play with the active paperbot?
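
A rough sketch of the local Flask endpoint being discussed from 12:33 onward: it accepts a pdf_url and a request_iteration index, fetches the page through the EZproxy selected by that index, and returns the body with a proxies_remaining header so the caller knows when to stop. The endpoint name /plsget and port 8500 come from the conversation; the proxy list, the credentials, and the choice of form fields (rather than JSON) are assumptions, not the code from kanzure's gist or nmz787's server.

# illustrative sketch only; proxy URLs, credentials, and field names are placeholders
import requests
from flask import Flask, request, make_response

app = Flask(__name__)

# hypothetical list of EZproxy login prefixes
PROXY_LIST = [
    "http://ezproxy1.example.edu/login?url=",
    "http://ezproxy2.example.edu/login?url=",
]
CREDENTIALS = {"user": "someuser", "pass": "somepass"}

@app.route("/plsget", methods=["POST"])
def plsget():
    pdf_url = request.form["pdf_url"]
    iteration = int(request.form.get("request_iteration", 0))
    if iteration >= len(PROXY_LIST):
        resp = make_response("", 404)
        resp.headers["proxies_remaining"] = "0"
        return resp
    # EZproxy-style: append the target URL to the proxy's login URL and POST creds
    proxied = requests.post(PROXY_LIST[iteration] + pdf_url, data=CREDENTIALS)
    resp = make_response(proxied.content)
    resp.headers["Content-Type"] = proxied.headers.get("Content-Type", "")
    # tell the caller how many proxies are left to try after this one
    resp.headers["proxies_remaining"] = str(len(PROXY_LIST) - iteration - 1)
    return resp

if __name__ == "__main__":
    # a non-default port avoids the "Address already in use" error mentioned above
    app.run(port=8500)
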
13:11 < kanzure> $ addaccess paperbot nmz787
13:11 -!- yoleaux [~yoleaux@xn--ht-1ia18f.nonceword.org] has joined ##hplusroadmap
13:11 < kanzure> The user `nmz787' is already a member of `git-paperbot'.
13:11 < kanzure> git push nmz787@diyhpl.us:/srv/git/paperbot.git
13:12 -!- trotsky [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has joined ##hplusroadmap
13:13 < nmz787> kanzure: what to do?
13:13 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has quit [Ping timeout: 264 seconds]
13:14 < kanzure> once you git push, paperbot reloads the modules
13:14 < nmz787> kanzure: i see a ~/paperbot not
13:14 < nmz787> now
13:14 < nsh> nmz787, can you get this too? http://www.springer.com/computer/security+and+cryptology/book/978-1-4419-5905-8
13:14 < nmz787> not sure if that was there before, looks like it was from 2013
13:14 < kanzure> paperbot in irc will reload source code once the git receive-hook is triggered on the server
13:14 < nmz787> so I have to git clone from github first?
13:15 < kanzure> git clone nmz787@diyhpl.us:/srv/git/paperbot.git
13:15 < kanzure> github may be up to date, or not
13:15 < nmz787> in my existing paperbot dir i did git pull
13:15 < nmz787> is that sufficient?
13:15 < nmz787> Updating 6dcb23e..6cb280f
13:15 < kanzure> as long as git rev-parse HEAD shows something that is from origin/master or diyhplus/master
13:16 < kanzure> or github/master
13:17 < nmz787> rev-parse shows 6cb280fa7b31ddefe8844bf71ddb32823b435f9f
13:17 < nmz787> rev-parse HEAD
13:18 < kanzure> what about it?
13:18 < nmz787> nsh: appears I can get it, but not as a single file
13:18 < kanzure> just compare it against git rev-parse github/master or git rev-parse diyhplus/master
13:18 < nsh> interesting
13:18 < nmz787> nsh: you'd need to tell me which section of the 59 pages of sections here :P http://link.springer.com/referencework/10.1007%2F978-1-4419-5906-5
13:18 < kanzure> "Relative to the price of labor, computation has become cheaper by a factor of 7.3 × 10^13 compared to manual calculations."
13:19 < nmz787> fatal: ambiguous argument 'diyhplus/master': unknown revision or path not in the working tree.
13:19 < kanzure> git remote -vv
13:19 < nsh> nmz787, no matter, i have too much to read really
13:19 < kanzure> i don't know what you named your remotes
13:19 < nmz787> origin /srv/git/paperbot.git (fetch)
13:19 < nmz787> origin /srv/git/paperbot.git (push)
13:19 < kanzure> git rev-parse origin/master
13:20 < nmz787> same hash i posted a few mins ago
13:20 < kanzure> looks like you're up to date then
13:20 < nmz787> ok
13:22 < nmz787> kanzure: so after I find out where it determines fail/succes for PDF acquisition, I will want it to try 'http://localhost:8500/plsget' right?
13:23 < kanzure> yep. also something about looping.
13:23 < kanzure> paperbot is very synchronous at the moment, so additional blocking isn't a big deal
13:23 < kanzure> eventually it will be switched over to coroutines with gevent or something
13:24 -!- trotsky [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has quit [Ping timeout: 255 seconds]
13:25 < nmz787> kanzure: so on the page http://www.sciencedirect.com/science/article/pii/S0969804397101233... I bet you see a 'Purchase' link next to an Adobe reader icon
13:25 < nmz787> is that link what I'll be receiving?
13:26 < nmz787> such that the response.content I return /might/ be the actual PDF data?
13:26 < gene_hacker> how does paperbot work anyway?
13:26 < nmz787> rather than just the logged in page with the logged in link to the PDF
13:26 < kanzure> you will sometimes be given a link to html stuff or sometimes a link to a pdf
13:26 < kanzure> zotero sometimes finds the right pdf link
13:26 < gene_hacker> does it need a host account or something to work?
13:26 < kanzure> gene_hacker: paperbot is magic.. https://github.com/kanzure/paperbot
13:27 < gene_hacker> it doesn't need a host account?
13:27 < nmz787> it looks like the current flow is request,get(pdf_url), then if response.status_code != 200:... then if "pdf" in response.headers["content-type"]:... then finally writing out response.content to a file
13:27 < kanzure> gene_hacker: sorry for being opaque
13:27 < kanzure> gene_hacker: it does not use a username/password
13:28 < gene_hacker> that is magic
13:28 < nmz787> so it seems the if "pdf" in response.headers["content-type"]: is where I'd end my loop
13:29 < nmz787> i.e. if the else condition is True, loop again
13:29 < kanzure> no you would also end if the header says there's no more ezproxy urls to try
13:29 < nmz787> ah, right
13:30 < nmz787> but what if the URL provided to me is not the PDF url, it's just the main sciencedirect page.. isn't that post-zotero such that the logged-in page would never get parsed for the PDF link?
13:30 < nmz787> i guess an easy way to tell is to just try it
13:30 < kanzure> there is some html parsing stuff in paperbot but nobody put much thought into it
13:31 < kanzure> paperbot's default behavior when failing is to save the output html as txt so i can debug what went wrong and possibly find the actual pdf link
13:32 < kanzure> zotero is being used because i didn't want to have to write 200 scrapers
13:37 < nmz787> hmm, the 'Purchase' link seems fail when I try using it
13:37 < nmz787> paperbot: http://www.sciencedirect.com/science?_ob=ShoppingCartURL&_method=add&_eid=1-s2.0-S0969804397101233&originContentFamily=serial&_origin=article&_ts=1408912018&md5=649317c0fee32dd289620ef24861ba2a
13:37 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/ca80740576d65cc4908de047d004ff29.txt
13:37 < kanzure> did you push?
13:37 < nmz787> nah
13:38 < kanzure> paperbot doesn't have new code, then
13:38 < nmz787> i mean just if that was what zotero gives me
13:38 < nmz787> yeah I haven't touched it yet
13:38 < nmz787> just thinking
13:38 < kanzure> purchase link is v. different from the pdf download link
13:43 < kanzure> may be possible to use an http proxy with zotero so that zotero requests can go through a proxy that then goes through ezproxy to the publisher
13:44 < kanzure> very hacky compared to just writing parsers in python as plugins
13:46 < nmz787> kanzure: if item.has_key("DOI"): seems like it should be in the else of if "pdf" in response.headers["content-type"]:
13:47 < nmz787> as-is it seems like even if we get the PDF attachment, we'll return the scihub link
13:47 < kanzure> i believe scihub is technically dead now
13:47 < nmz787> err, libgen
13:48 < nmz787> it says module.scihub.libgen
13:48 < nmz787> so w/e
13:48 < kanzure> feel free to change that behavior
13:48 < kanzure> this is all shit and needs to be fixed
14:05 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has quit []
14:06 < nmz787> kanzure: what if I push bad code
14:06 < nmz787> is there a way to test without pushing?
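
A sketch of the caller-side loop discussed at 13:22-13:29, i.e. how paperbot could drive the local /plsget endpoint: keep incrementing request_iteration until a PDF comes back or the proxies_remaining header says nothing is left to try. The parameter names mirror the sketch above and are assumptions; this is not the code that was actually pushed.

# illustrative sketch, assuming the hypothetical /plsget contract sketched earlier
import requests

def fetch_via_local_proxy(pdf_url, endpoint="http://localhost:8500/plsget"):
    iteration = 0
    while True:
        response = requests.post(endpoint, data={"pdf_url": pdf_url,
                                                 "request_iteration": iteration})
        body = response.content
        # success: either the content-type says pdf or the body starts with the PDF magic bytes
        if "pdf" in response.headers.get("content-type", "") or body[:4] == b"%PDF":
            return body
        # stop when the server reports no more ezproxy URLs to try
        if int(response.headers.get("proxies_remaining", 0)) <= 0:
            return None
        iteration += 1
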
14:07 < nmz787> otherwise i'll have to learn how to revert pushes
14:07 < nmz787> which I guess you probably could tell me
14:11 -!- drewbot [~cinch@ec2-54-81-122-8.compute-1.amazonaws.com] has quit [Remote host closed the connection]
14:12 -!- drewbot [~cinch@ec2-54-166-181-23.compute-1.amazonaws.com] has joined ##hplusroadmap
14:16 < nmz787> kanzure: how would a normal proxy work, re: pdf_url... would it be proxy_url+pdf_url that was being .get()ed or is there some special proxy= attribute in th .get()
14:16 < nmz787> ah got it http://docs.python-requests.org/en/latest/user/advanced/#proxies
14:17 < kanzure> like i said, code is shit, there are no unit tests
14:17 < kanzure> unit tests are something that should happen
14:19 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
14:20 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
14:32 -!- gully_foyle_ja [~theghosto@pool-71-116-68-251.snfcca.dsl-w.verizon.net] has joined ##hplusroadmap
14:35 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=ae35ea08 Nathan McCorkle: added looping through the new local variable proxy_list, added provisions for custom_flask_json proxy, which aims to provide a way for remote users to return PDFs
14:35 < gnusha> paperbot: reload papers
14:35 < paperbot> gnusha: (version: 2014-08-24 21:35:06)
14:35 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:36 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/1b93cfc2338d1e77b75a511f34865889.txt
14:38 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:38 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/d9c2a58f335716eed98363117cb0c4e2.txt
14:38 < kanzure> that does not seem to be using a proxy
14:39 < kanzure> paperbot: http://httpbin.org/get
14:39 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/de147e604895c8c46dcf172f37d619fd.txt
14:39 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:39 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/d71146ad6212da5f21232aef85cebbf.txt
14:39 < nmz787> hmm, plsget isn't getting called
14:42 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
14:42 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
14:42 < kanzure> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:42 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/7c083f69992eb24b855b020e8cd12c49.txt
14:43 -!- marciogm [~komanda@unaffiliated/marciogm] has joined ##hplusroadmap
14:46 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=edc2e3c9 Nathan McCorkle: added remoteprint so i can debug, added incrementing and reset of request_iteration
14:46 < gnusha> paperbot: reload papers
14:46 < paperbot> gnusha: (version: 2014-08-24 21:46:23)
14:46 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:46 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/88f63bdcdc78a640eacffdfc0933bf7e.txt
14:47 -!- gully_foyle_ja [~theghosto@pool-71-116-68-251.snfcca.dsl-w.verizon.net] has quit [Quit: Leaving]
14:47 < kanzure> you can just use actual logging instead of http logging
14:48 < nmz787> how can i see it?
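
On the 14:16 question about how a "normal" HTTP proxy differs from EZproxy: a conventional proxy is passed to python-requests through the proxies= argument and the target URL stays unchanged, whereas EZproxy works by prefixing its own login URL onto the target URL. A brief sketch; both proxy hosts below are placeholders.

# illustrative sketch only; proxy hosts are hypothetical
import requests

pdf_url = "http://www.sciencedirect.com/science/article/pii/S0969804397101233"

# ordinary HTTP proxy: same URL, routed through the proxy
via_http_proxy = requests.get(pdf_url,
                              proxies={"http": "http://proxy.example.org:3128"})

# EZproxy-style: the target URL is appended to the proxy's own login URL
via_ezproxy = requests.get("http://ezproxy.example.edu/login?url=" + pdf_url)
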
14:49 < kanzure> /join #paperbot-testing
14:50 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=7d14a71f Nathan McCorkle: more debug
14:50 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=563c1963 Nathan McCorkle: added logchannel call for debug
14:50 < gnusha> paperbot: reload papers
14:50 < paperbot> gnusha: (version: 2014-08-24 21:50:49)
14:52 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=eb88cd64 Nathan McCorkle: changed logchannel to _log
14:52 < gnusha> paperbot: reload papers
14:52 < paperbot> gnusha: (version: 2014-08-24 21:52:24)
14:54 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=c783d607 Nathan McCorkle: added another _log
14:54 < gnusha> paperbot: reload papers
14:54 < paperbot> gnusha: (version: 2014-08-24 21:54:09)
14:55 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=6928d027 Nathan McCorkle: added another _log
14:55 < gnusha> paperbot: reload papers
14:55 < paperbot> gnusha: (version: 2014-08-24 21:55:09)
14:55 < nmz787> kanzure: i don't see any of the logging in paperbot-testing
14:55 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:55 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/3ab0de366a9ad2a0d525867e38cdb2cd.txt
14:56 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
14:56 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
14:57 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=9e7c5ada Nathan McCorkle: changed log to phenny.say
14:57 < gnusha> paperbot: reload papers
14:57 < paperbot> gnusha: (version: 2014-08-24 21:57:14)
14:57 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
14:57 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/65a929edf68cb0edc86e159873f47c77.txt
14:58 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
14:58 -!- ielo [~ielo@88-106-249-86.dynamic.dsl.as9105.com] has joined ##hplusroadmap
14:59 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=9520ff74 Nathan McCorkle: added even earlier phenny.say
14:59 < gnusha> paperbot: reload papers
14:59 -!- marciogm [~komanda@unaffiliated/marciogm] has quit [Ping timeout: 240 seconds]
14:59 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
15:05 < nmz787> kanzure: ok sorry i realize things now
15:09 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=36db657f Nathan McCorkle: changed phenny.say back to _log
15:09 < gnusha> paperbot: reload papers
15:09 < paperbot> got response from translation_url
15:09 < paperbot> gnusha: (version: 2014-08-24 22:09:25)
15:22 -!- enceladu [~asakharov@24.60.79.55] has joined ##hplusroadmap
15:26 -!- gene_hacker_ [~chatzilla@c-50-137-46-240.hsd1.or.comcast.net] has joined ##hplusroadmap
15:26 -!- gene_hacker [~chatzilla@c-50-137-46-240.hsd1.or.comcast.net] has quit [Ping timeout: 260 seconds]
15:26 -!- gene_hacker_ is now known as gene_hacker
15:29 -!- gene_hacker [~chatzilla@c-50-137-46-240.hsd1.or.comcast.net] has quit [Read error: Connection reset by peer]
15:30 -!- EnLilaSko [EnLilaSko@unaffiliated/enlilasko] has quit [Quit: - nbs-irc 2.39 - www.nbs-irc.net -]
15:57 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=66b7c850 Nathan McCorkle: added paperbot_proxy_request class with get method, replaced request.get in sciencedirect section of download_url
15:57 < gnusha> paperbot: reload papers
15:57 < paperbot> gnusha: (version: 2014-08-24 22:57:01)
15:58 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=d1738894 Nathan McCorkle: added _log in sciencedirect section
15:58 < gnusha> paperbot: reload papers
15:58 < paperbot> gnusha: (version: 2014-08-24 22:58:29)
15:59 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=c8ccef9f Nathan McCorkle: moved _log function creation to global scope
15:59 < gnusha> paperbot: reload papers
15:59 < paperbot> gnusha: (version: 2014-08-24 22:59:51)
16:07 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=e1c43f4a Nathan McCorkle: moved _log function creation back as phenny is being passed in there, added nullLog function to act as black hole if a _log function isn't passed to download_url, added _log param to download_url function
16:07 < gnusha> paperbot: reload papers
16:07 < paperbot> gnusha: (version: 2014-08-24 23:07:32)
16:08 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
16:08 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
16:08 < kanzure> i don't think reload() really works
16:11 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=039bb842 Nathan McCorkle: added _log as a class variable, setting it prior to calling paperbot_proxy_request
16:11 < gnusha> paperbot: reload papers
16:11 < paperbot> gnusha: (version: 2014-08-24 23:11:48)
16:14 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=485a7b0a Nathan McCorkle: added _log in exception
16:14 < gnusha> paperbot: reload papers
16:14 < paperbot> gnusha: (version: 2014-08-24 23:14:15)
16:14 -!- ielo [~ielo@88-106-249-86.dynamic.dsl.as9105.com] has quit [Ping timeout: 245 seconds]
16:16 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=68e39fd3 Nathan McCorkle: imported traceback and using that for exception printing
16:16 < gnusha> paperbot: reload papers
16:16 < paperbot> gnusha: (version: 2014-08-24 23:16:23)
16:59 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=0fa79eab Nathan McCorkle: added fallback for getting title with lxml.etree
16:59 < gnusha> paperbot: reload papers
16:59 < paperbot> gnusha: (version: 2014-08-24 23:59:15)
17:00 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=957752c2 Nathan McCorkle: defined _log in classmethod get
17:00 < gnusha> paperbot: reload papers
17:00 < paperbot> gnusha: (version: 2014-08-25 00:00:49)
17:02 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=95d8b68a Nathan McCorkle: changed definition of _log in classmethod get
17:02 < gnusha> paperbot: reload papers
17:02 < paperbot> gnusha: (version: 2014-08-25 00:02:30)
17:08 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=131e6cde Nathan McCorkle: tweaking things
17:08 < gnusha> paperbot: reload papers
17:08 < paperbot> gnusha: (version: 2014-08-25 00:08:47)
17:12 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=aedd1258 Nathan McCorkle: defined proxy_url_index, incrementing it
17:12 < gnusha> paperbot: reload papers
17:12 < paperbot> gnusha: (version: 2014-08-25 00:12:35)
17:15 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=279b395c Nathan McCorkle: I was not passings args right
17:15 < gnusha> paperbot: reload papers
17:15 < paperbot> SyntaxError: invalid syntax (file "/srv/ikiwiki/paperbot/modules/papers.py", line 39)
17:18 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=d499af3a Nathan McCorkle: added debug lies
17:18 < gnusha> paperbot: reload papers
17:18 < paperbot> SyntaxError: invalid syntax (file "/srv/ikiwiki/paperbot/modules/papers.py", line 41)
17:20 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
17:23 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=8188fc3f Nathan McCorkle: tweaking
17:23 < gnusha> paperbot: reload papers
17:36 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
17:37 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=a7ce0b5c Nathan McCorkle: forget I was returning tuple
17:37 < gnusha> paperbot: reload papers
17:37 < paperbot> gnusha: (version: 2014-08-25 00:37:47)
17:50 -!- ThomasEgi [~thomas@185.5.8.81] has joined ##hplusroadmap
17:50 -!- ThomasEgi [~thomas@185.5.8.81] has quit [Changing host]
17:50 -!- ThomasEgi [~thomas@panda3d/ThomasEgi] has joined ##hplusroadmap
18:05 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=2fa2bccc Nathan McCorkle: added check for relative or absolute URL parsed via lxml.etree
18:05 < gnusha> paperbot: reload papers
18:05 < paperbot> gnusha: (version: 2014-08-25 01:05:09)
18:07 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=ce2ded2d Nathan McCorkle: removed some _log, defined headers
18:07 < gnusha> paperbot: reload papers
18:07 < paperbot> gnusha: (version: 2014-08-25 01:07:11)
18:08 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=09d95b50 Nathan McCorkle: defined user_agent
18:08 < gnusha> paperbot: reload papers
18:08 < paperbot> gnusha: (version: 2014-08-25 01:08:17)
18:12 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=
18:12 < gnusha> paperbot: reload papers
18:12 < paperbot> gnusha: (version: 2014-08-25 01:08:17)
18:14 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=0acb27d7 Nathan McCorkle: added newline to push paperbot to reload
18:14 < gnusha> paperbot: reload papers
18:14 < paperbot> gnusha: (version: 2014-08-25 01:14:15)
18:22 < justanotheruser> wow
18:23 < kanzure> :\
18:28 -!- Viper168_ [~Viper@unaffiliated/viper168] has joined ##hplusroadmap
18:29 -!- Viper168 [~Viper@unaffiliated/viper168] has quit [Ping timeout: 246 seconds]
18:52 -!- CheckDavid [uid14990@gateway/web/irccloud.com/x-vfdmqnccghkqdzrs] has quit [Quit: Connection closed for inactivity]
18:59 -!- ThomasEgi [~thomas@panda3d/ThomasEgi] has quit [Remote host closed the connection]
19:00 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=79c58e4a Nathan McCorkle: using data instead of headers, calling json.dumps on data before the request. tried reducing duplication
19:00 < gnusha> paperbot: reload papers
19:00 < paperbot> gnusha: (version: 2014-08-25 02:00:39)
19:06 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
19:06 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
19:15 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has joined ##hplusroadmap
19:50 -!- cpopell [~cpopell@c-76-26-144-132.hsd1.dc.comcast.net] has joined ##hplusroadmap
20:23 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has joined ##hplusroadmap
20:34 -!- CharlieNobody [~CharlieNo@97-85-244-89.static.stls.mo.charter.com] has joined ##hplusroadmap
21:03 -!- CharlieNobody [~CharlieNo@97-85-244-89.static.stls.mo.charter.com] has left ##hplusroadmap ["Leaving"]
21:08 -!- justanotheruser [~Justan@unaffiliated/justanotheruser] has quit [Quit: Reconnecting]
21:08 -!- justanotheruser [~Justan@unaffiliated/justanotheruser] has joined ##hplusroadmap
21:10 -!- justanot1eruser [~Justan@unaffiliated/justanotheruser] has joined ##hplusroadmap
21:10 -!- justanot1eruser [~Justan@unaffiliated/justanotheruser] has quit [Client Quit]
21:21 -!- lichen [~lichen@c-50-139-11-6.hsd1.or.comcast.net] has quit [Quit: Lost terminal]
21:32 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=324073cc Nathan McCorkle: changed proxy url
21:32 < gnusha> paperbot: reload papers
21:32 < paperbot> gnusha: (version: 2014-08-25 04:32:22)
21:33 < nmz787> paperbot: http://www.sciencedirect.com/science/article/pii/S0969804397101233
21:33 < paperbot> http://diyhpl.us/~bryan/papers2/paperbot/%0A%20Measurements%20of%20discrete%20and%20continuous%20X-ray%20spectra%20with%20a%20photodiode%20at%20room%20temperature%0A%20.pdf
21:50 -!- yashgaroth [~ffffff@cpe-76-167-105-53.san.res.rr.com] has quit [Quit: Leaving]
22:04 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=51768614 Nathan McCorkle: added debug lines
22:04 < gnusha> paperbot: reload papers
22:04 < paperbot> gnusha: (version: 2014-08-25 05:04:08)
22:05 -!- enceladu [~asakharov@24.60.79.55] has quit [Quit: quit]
22:06 -!- paperbot [~paperbot@131.252.130.248] has quit [Remote host closed the connection]
22:06 -!- paperbot [~paperbot@131.252.130.248] has joined ##hplusroadmap
22:49 < justanotheruser> any thoughts on bitsharesX (3rd largest altcoin by market cap)? It seems like it doesn't solve any problems and introduces new problems with DPoS
22:56 -!- snuffeluffegus [~snuff@2001:9b0:10:2104:216:3eff:feb7:f845] has quit [Quit: Leaving]
23:08 < kanzure> haha
23:25 -!- augur [~augur@216-164-48-148.c3-0.slvr-ubr1.lnh-slvr.md.cable.rcn.com] has quit [Ping timeout: 240 seconds]
23:28 -!- ruthie [~ruthie@CPEbcc810070371-CMbcc81007036e.cpe.net.cable.rogers.com] has quit [Remote host closed the connection]
23:31 -!- ebowden [~ebowden@CPE-121-223-168-30.lns3.bat.bigpond.net.au] has joined ##hplusroadmap
23:32 -!- augur [~augur@216-164-48-148.c3-0.slvr-ubr1.lnh-slvr.md.cable.rcn.com] has joined ##hplusroadmap
23:34 -!- comma8 [comma8@gateway/shell/yourbnc/x-casekzcveceeqady] has joined ##hplusroadmap
23:38 -!- EnLilaSko [EnLilaSko@unaffiliated/enlilasko] has joined ##hplusroadmap
23:40 -!- kyknos_ [~kyknos@89.233.130.143] has joined ##hplusroadmap
23:45 -!- pi- [~Ohmu@cpc2-oxfd18-2-0-cust90.4-3.cable.virginm.net] has joined ##hplusroadmap
23:45 < gnusha> https://secure.diyhpl.us/cgit/paperbot/commit/?id=56115dec Bryan Bishop: paperbot v2
23:45 < gnusha> paperbot: reload papers
23:45 < paperbot> gnusha: (version: 2014-08-25 05:04:08)
23:46 < kanzure> see https://github.com/kanzure/paperbot/commit/56115dec62ee069095dc045b559c878060843ce1
23:53 < kanzure> https://github.com/kanzure/paperbot/blob/master/paperbot/orchestrate.py
23:54 < kanzure> compare https://github.com/kanzure/paperbot/blob/master/modules/papers.py
--- Log closed Mon Aug 25 00:00:45 2014