--- Log opened Sat Jun 25 00:00:07 2011
00:29 < kanzure> hi fernan
01:10 < fenn> 3d photolithography how-to http://mrsec.wisc.edu/Edetc/nanolab/3D_print/index.html#Procedure
01:12 < fenn> same thing https://nano-cemms.illinois.edu/materials/3d_printing_full
01:12 -!- JayDugger [~duggerj@pool-173-74-79-43.dllstx.fios.verizon.net] has joined ##hplusroadmap
01:23 -!- JayDugger [~duggerj@pool-173-74-79-43.dllstx.fios.verizon.net] has left ##hplusroadmap ["Leaving."]
02:13 -!- alystair [alystair@24-246-14-18.cable.teksavvy.com] has quit [Ping timeout: 260 seconds]
02:32 < kanzure> fenn: what did ##opengl say?
02:41 -!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Read error: Connection reset by peer]
02:42 -!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap
02:42 < fenn> i didnt ask
02:43 < fenn> oh you mean the bug report
02:45 < fenn> no response
03:40 -!- augur [~augur@208.58.6.161] has quit [Remote host closed the connection]
03:40 -!- streety [streety@li139-74.members.linode.com] has quit [Remote host closed the connection]
03:41 -!- streety [streety@li139-74.members.linode.com] has joined ##hplusroadmap
04:04 -!- augur [~augur@129.2.129.34] has joined ##hplusroadmap
04:21 -!- foucist [~foucist@ps14150.dreamhost.com] has joined ##hplusroadmap
05:48 -!- klafka [~textual@cpe-69-205-70-55.rochester.res.rr.com] has joined ##hplusroadmap
05:49 -!- fernan [~pseudo@118.101.154.183] has quit [Ping timeout: 260 seconds]
05:51 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap
06:27 -!- klafka [~textual@cpe-69-205-70-55.rochester.res.rr.com] has quit [Quit: Computer has gone to sleep.]
07:08 -!- BaldimerBrandybo [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap
07:10 -!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Ping timeout: 240 seconds]
07:51 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Quit: Nettalk6 - www.ntalk.de]
07:54 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap
07:55 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Client Quit]
08:20 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]
08:21 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap
08:21 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]
08:21 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap
08:31 -!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has joined ##hplusroadmap
08:31 -!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has left ##hplusroadmap []
08:36 < kanzure> no i meant ##opengl
08:39 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]
08:39 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap
08:39 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]
08:39 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap
08:52 -!- AJolly [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap
08:52 -!- AJolly [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]
08:52 -!- AJolly [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap
08:53 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Ping timeout: 258 seconds]
09:31 -!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has joined ##hplusroadmap
09:31 < lumos> hey what u think of this colour scheme, is it good or is it whack http://s2.postimage.org/suunsdvjt/streem.jpg
09:38 < kanzure> who are you
09:39 < lumos> kanzure, its me lumos
09:39 < lumos> kanzure, chanOP
09:39 < lumos> kanzure, make me chanop 2day plz
09:40 -!- AJolly [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]
09:40 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap
09:40 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]
09:40 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap
10:04 < kanzure> AJollyLife: you should go to the diybio-boston meetup
10:50 < kanzure> can someone please bug me to write up how to remove watermarks from pdfs like from sciencedirect/iop before i forget
10:51 < kanzure> i use pdftk to remove pages from a pdf without converting the rest to pure-image documents
10:51 < kanzure> and then manually remove repeating watermark footers if those are present
10:52 < kanzure> i should write some code to find those repeating watermarks and remove sensitive metadata
11:16 < streety> kanzure: can you explain what you mean by "i use pdftk to remove pages from a pdf without converting the rest to pure-image documents"
11:17 < streety> I ask because I've spent some time today playing around with pdfminer extracting text from pdfs
11:18 < kanzure> pdftk input.pdf cat $pagestart-$pagestop output output.pdf
11:18 < kanzure> that's all i've been using pdftk for so far
11:18 < kanzure> i just learned about it a few weeks ago but i dunno why i haven't seen it before
11:19 < streety> okay, I think I assumed it was more complex due to your mention of images
11:21 < kanzure> well i used imagemagick in the past (via 'convert') to dump pdf to images and then move signatures by coordinates or otherwise blank shit out
11:21 < streety> fair enough, makes sense with context
11:25 < streety> actually you may be interested in what I've been up to with pdfs. I set wget loose on diyhpl.us/~bryan/papers2 (excluding archives) a couple of months ago expecting to get a couple hundred MB but ended up with 4.5G before I realised how much there was. I wasn't sure what to do with it all but decided to try extracting text from the pdfs and then automatically tag and group the files.
11:33 < kanzure> a useful thing for you to do would be DOI number extraction from text-based pdfs as well as image-based pdfs
11:33 < kanzure> doi numbers can lead to additional metadata from the web by throwing the number through a resolver and then parsing metadata in META tags on journal sites
11:33 < kanzure> realistically i'm not sure how many papers in my collection are pure images and how many are text
11:34 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]
11:34 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap
11:34 -!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]
11:34 -!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap
11:36 < streety> Yeah matching the DOI will definitely be useful. I was considering extracting the pdf title by comparing the size of the text to the average for the doc but it's a fudge that probably won't work all that well
11:37 < streety> everything to do with pdfs is a bit of a fudge
11:37 < kanzure> the whole concept of papers is a fudge
11:40 -!- mayko [~mayko@71-22-217-151.gar.clearwire-wmx.net] has joined ##hplusroadmap
11:42 < kanzure> http://thisiscolossal.com/2011/06/markus-kayser-builds-a-solar-powered-3d-printer-that-prints-glass-from-sand-and-a-sun-powered-laser-cutter/
11:43 < archels> solar powered? What a showoff.
11:44 < kanzure> solar powered photocopier
11:45 < streety> I've just taken a look at the distribution of page lengths for the pdfs I've extracted text from so far. Looks like perhaps 10% of the documents contain unusually little text
11:46 < kanzure> hey that's not bad
11:47 < streety> it's not a random sampling of the docs (I'm running through the directory with python's os.walk) but I'm happy with that
11:47 < kanzure> i'm trying to find a paper on the server that has a "Downloaded by" or an IP address watermark
11:48 < kanzure> IEEE always embeds a $xyz amount in a footer somewhere
11:48 < kanzure> example: http://diyhpl.us/~bryan/papers2/neuro/implants/Data%20communication%20between%20brain%20implants%20and%20computer%20-%20short%20-%20IEEENeuralSystemsJune2003.pdf
11:48 < kanzure> i could remove that i guess but it's not particularly harmful
11:52 < streety> I assume it's more removing the name or university that downloaded a document which is more useful
11:53 < kanzure> ah here's one:
11:53 < kanzure> http://diyhpl.us/~bryan/papers2/Patterning%20design%20in%20color%20at%20the%20submicron%20scale.pdf
11:53 < kanzure> see left-hand side
11:58 < kanzure> some of the pdf obj streams seem to be zipped
11:59 -!- uniqanomaly__ [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has quit [Quit: uniqanomaly__]
12:01 < streety> strangely text extraction has largely worked on that pdf but it doesn't include the Downloaded by reference
12:02 < kanzure> i found it by googling
12:03 < streety> it looks like google is doing better than I currently am then
12:04 -!- augur [~augur@129.2.129.34] has quit [Remote host closed the connection]
12:06 < streety> mendeley seems to cope just fine as well
12:13 < kanzure> -_- i just spent 10min trying to figure out why the pdf wouldn't change
12:13 < kanzure> editing the wrong file
12:14 < kanzure> soo anyway my first guess was right
12:16 < kanzure> using that same file, try this:
12:16 < kanzure> cat temp.pdf | grep -a "Length " | sort | uniq -c | sort -k2nr
12:16 < kanzure> as you can see, they repeat the watermark four times (once for each page)
12:17 < kanzure> in this case lines 63-67 inclusive are the watermark on the first page
12:20 < kanzure> iirc pypdf can handle FlateDecode?
12:21 < streety> I'm not using pypdf, I think that was the package that returned text but no spaces between words
12:21 < streety> I'm using pdfminer instead. It was a pain to get my head around how it worked but generally produces good output
12:24 < kanzure> is there a way to use zlib's inflate from stdin on bash?
12:29 < streety> time for me to take off, I'll let you know what I manage to create from all those papers
12:31 < kanzure> unfortunately i'm not sure what the contents of that objstream really means
12:31 < kanzure> cat objstream.dat | python -c'import sys;import zlib;data=sys.stdin.read();print zlib.decompress(data)'
12:37 < kanzure> also that's probably just the display of the text and doesn't actually remove the compressed text from the file
12:41 < kanzure> the objects with "Length 40" in this file are the pdf/display commands
12:42 < kanzure> the objects like on line 6, 13, 19 and 25 are the "Downloaded by" lines
12:46 -!- augur [~augur@208.58.6.161] has joined ##hplusroadmap
12:47 -!- mayko [~mayko@71-22-217-151.gar.clearwire-wmx.net] has quit [Remote host closed the connection]
12:57 < kanzure> "Producer: Acrobat Distiller Command 3.01 for Solaris 2.3 and later (SPARC)"
12:57 < kanzure> acs is running on solaris?
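A minimal Python 3 sketch of the approach kanzure outlines above, under some loud assumptions: it scans the raw PDF bytes for "stream ... endstream" bodies with a regex (a real parser such as pypdf or pdfminer would follow /Length and /Filter instead), inflates whichever bodies zlib accepts, and counts identical decoded bodies, since a per-page watermark tends to show up as the same stream repeated once per page. The names inflate_streams and repeated_streams are made up for illustration. It only locates candidates; as noted at 12:37, decoding a stream does not remove anything from the file.

```python
import re
import zlib
from collections import Counter

# Assumption: stream bodies can be found by keyword alone. The lookbehind
# keeps the "stream" inside "endstream" from being taken as a start marker.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decoded body of every stream that zlib can inflate."""
    decoded = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the trailing newline before "endstream"
            decoded.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not zlib/FlateDecode data; skip it
    return decoded


def repeated_streams(pdf_bytes, min_count=2):
    """List (decoded_body, count) pairs for bodies seen at least min_count times."""
    counts = Counter(inflate_streams(pdf_bytes))
    return [(body, n) for body, n in counts.most_common() if n >= min_count]
```

Something like repeated_streams(open("temp.pdf", "rb").read()) would then list the decoded candidates with their repeat counts, which is roughly the grep/sort/uniq pipeline from 12:16 applied to decoded stream contents rather than to the "Length" lines.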
13:01 -!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has left ##hplusroadmap ["Leaving"]
13:19 -!- eudoxia [~eudoxia@r190-135-41-139.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap
13:19 -!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap
13:22 -!- BaldimerBrandybo [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Ping timeout: 258 seconds]
13:57 -!- eudoxia [~eudoxia@r190-135-41-139.dialup.adsl.anteldata.net.uy] has quit [Read error: Connection reset by peer]
14:06 -!- uniqanomaly [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has joined ##hplusroadmap
14:38 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap
14:58 -!- eudoxia [~eudoxia@r190-135-106-229.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap
15:03 -!- uniqanomaly [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has quit [Quit: uniqanomaly]
15:08 -!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Quit: Nettalk6 - www.ntalk.de]
16:30 -!- augur [~augur@208.58.6.161] has quit [Read error: Connection reset by peer]
16:30 -!- eudoxia [~eudoxia@r190-135-106-229.dialup.adsl.anteldata.net.uy] has quit [Read error: Connection reset by peer]
16:31 -!- augur [~augur@208.58.6.161] has joined ##hplusroadmap
17:03 -!- eridu [~eridu@gateway/tor-sasl/eridu] has joined ##hplusroadmap
19:10 -!- nchaimov [~nchaimov@c-24-20-202-138.hsd1.or.comcast.net] has quit [Read error: Connection reset by peer]
19:11 -!- nchaimov [~nchaimov@c-24-20-202-138.hsd1.or.comcast.net] has joined ##hplusroadmap
19:42 < kanzure> more graph visualization: http://ubietylab.net/ubigraph/
19:56 -!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap
20:00 -!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has quit [Client Quit]
20:06 -!- eridu [~eridu@gateway/tor-sasl/eridu] has quit [Remote host closed the connection]
20:36 -!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap
20:38 -!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has quit [Client Quit]
20:44 -!- eridu [~eridu@gateway/tor-sasl/eridu] has joined ##hplusroadmap
22:20 -!- eridu [~eridu@gateway/tor-sasl/eridu] has quit [Ping timeout: 250 seconds]
23:35 < QuantumG> http://www.youtube.com/watch?v=S7lAlzMBzLQ
23:35 < QuantumG> pretty impressive
--- Log closed Sun Jun 26 00:00:07 2011