2011-06-25.log

--- Log opened Sat Jun 25 00:00:07 2011
kanzurehi fernan00:29
fenn3d photolithography how-to http://mrsec.wisc.edu/Edetc/nanolab/3D_print/index.html#Procedure01:10
fennsame thing https://nano-cemms.illinois.edu/materials/3d_printing_full01:12
-!- JayDugger [~duggerj@pool-173-74-79-43.dllstx.fios.verizon.net] has joined ##hplusroadmap01:12
-!- JayDugger [~duggerj@pool-173-74-79-43.dllstx.fios.verizon.net] has left ##hplusroadmap ["Leaving."]01:23
-!- alystair [alystair@24-246-14-18.cable.teksavvy.com] has quit [Ping timeout: 260 seconds]02:13
kanzurefenn: what did ##opengl say?02:32
-!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Read error: Connection reset by peer]02:41
-!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap02:42
fenni didnt ask02:42
fennoh you mean the bug report02:43
fennno response02:45
-!- augur [~augur@208.58.6.161] has quit [Remote host closed the connection]03:40
-!- streety [streety@li139-74.members.linode.com] has quit [Remote host closed the connection]03:40
-!- streety [streety@li139-74.members.linode.com] has joined ##hplusroadmap03:41
-!- augur [~augur@129.2.129.34] has joined ##hplusroadmap04:04
-!- foucist [~foucist@ps14150.dreamhost.com] has joined ##hplusroadmap04:21
-!- klafka [~textual@cpe-69-205-70-55.rochester.res.rr.com] has joined ##hplusroadmap05:48
-!- fernan [~pseudo@118.101.154.183] has quit [Ping timeout: 260 seconds]05:49
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap05:51
-!- klafka [~textual@cpe-69-205-70-55.rochester.res.rr.com] has quit [Quit: Computer has gone to sleep.]06:27
-!- BaldimerBrandybo [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap07:08
-!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Ping timeout: 240 seconds]07:10
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Quit: Nettalk6 - www.ntalk.de]07:51
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap07:54
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Client Quit]07:55
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]08:20
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap08:21
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]08:21
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap08:21
-!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has joined ##hplusroadmap08:31
-!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has left ##hplusroadmap []08:31
kanzureno i meant ##opengl08:36
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]08:39
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap08:39
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]08:39
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap08:39
-!- AJolly [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap08:52
-!- AJolly [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]08:52
-!- AJolly [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap08:52
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Ping timeout: 258 seconds]08:53
-!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has joined ##hplusroadmap09:31
lumoshey what u think of this colour scheme, is it good or is it whack http://s2.postimage.org/suunsdvjt/streem.jpg09:31
kanzurewho are you09:38
lumoskanzure, its me lumos09:39
lumoskanzure, chanOP09:39
lumoskanzure, make me chanop 2day plz09:39
-!- AJolly [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]09:40
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap09:40
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]09:40
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap09:40
kanzureAJollyLife: you should go to the diybio-boston meetup10:04
kanzurecan someone please bug me to write up how to remove watermarks from pdfs like from sciencedirect/iop before i forget10:50
kanzurei use pdftk to remove pages from a pdf without converting the rest to pure-image documents10:51
kanzureand then manually remove repeating watermark footers if those are present10:51
kanzurei should write some code to find those repeating watermarks and remove sensitive metadata10:52
streetykanzure: can you explain what you mean by "i use pdftk to remove pages from a pdf without converting the rest to pure-image documents"11:16
streetyI ask because I've spent some time today playing around with pdfminer extracting text from pdfs11:17
kanzurepdftk input.pdf cat $pagestart-$pagestop output output.pdf11:18
kanzurethat's all i've been using pdftk for so far11:18
kanzurei just learned about it a few weeks ago but i dunno why i haven't seen it before11:18
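For scripting the same page-range extraction, a minimal Python sketch could wrap the pdftk call. The function names `pdftk_cat_command` and `extract_pages` are made up for illustration; the only thing assumed about pdftk is its `cat <range> output <file>` form, and the binary has to be on PATH.

```python
import shutil
import subprocess


def pdftk_cat_command(src, start, stop, dst):
    """Build the argument list for: pdftk SRC cat START-STOP output DST."""
    if start < 1 or stop < start:
        raise ValueError("page range must satisfy 1 <= start <= stop")
    return ["pdftk", src, "cat", f"{start}-{stop}", "output", dst]


def extract_pages(src, start, stop, dst):
    """Copy pages start..stop of src into dst without rasterising them."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk not found on PATH")
    # check=True raises CalledProcessError if pdftk exits non-zero.
    subprocess.run(pdftk_cat_command(src, start, stop, dst), check=True)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with file names that contain spaces.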
streetyokay, I think I assumed it was more complex due to your mention of images11:19
kanzurewell i used imagemagick in the past (via 'convert') to dump pdf to images and then move signatures by coordinates or otherwise blank shit out11:21
streetyfair enough, makes sense with context11:21
streetyactually you may be interested in what I've been up to with pdfs.  I set wget loose on diyhpl.us/~bryan/papers2 (excluding archives) a couple of months ago expecting to get a couple hundred MB but ended up with 4.5G before I realised how much there was.  I wasn't sure what to do with it all but decided to try extracting text from the pdfs and then automatically tag and group the files.11:25
kanzurea useful thing for you to do would be DOI number extraction from text-based pdfs as well as image-based pdfs11:33
kanzuredoi numbers can lead to additional metadata from the web by throwing the number through a resolver and then parsing metadata in META tags on journal sites11:33
kanzurerealistically i'm not sure how many papers in my collection are pure images and how many are text11:33
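A rough sketch of the DOI extraction step for text that has already been pulled out of a pdf. The regular expression is a commonly used heuristic, not an exhaustive DOI grammar, and the names `find_dois` and `resolver_url` are illustrative. Stripping trailing punctuation is also a guess that can occasionally clip a DOI that really ends in `)`. Image-only pdfs would need OCR first, which is not shown.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_dois(text):
    """Return the distinct DOI-looking strings in text, in order of appearance."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Sentence punctuation often sticks to the end of a DOI in running text.
        doi = match.group(0).rstrip(".,;:)")
        if doi not in found:
            found.append(doi)
    return found


def resolver_url(doi):
    """URL that redirects to the publisher's landing page for this DOI."""
    return "https://doi.org/" + doi
```

Fetching `resolver_url(doi)` and parsing the META tags on the landing page would be the follow-up step described above; that part depends on each journal site and is left out.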
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has quit [Read error: Connection reset by peer]11:34
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has joined ##hplusroadmap11:34
-!- AJollyLife [~quassel@c-68-57-192-88.hsd1.il.comcast.net] has quit [Changing host]11:34
-!- AJollyLife [~quassel@unaffiliated/ajollylife] has joined ##hplusroadmap11:34
streetyYeah matching the DOI will definitely be useful.  I was considering extracting the pdf title by comparing the size of the text to the average for the doc but it's a fudge that probably won't work all that well11:36
streetyeverything to do with pdfs is a bit of a fudge11:37
kanzurethe whole concept of papers is a fudge11:37
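The title-by-font-size idea could look roughly like this. It assumes some extractor has already produced `(text, font_size)` pairs for the first page; how to get those out of pdfminer is not shown, and `guess_title` and the `ratio` threshold are invented for the example. As noted above, it is a fudge and will miss titles set in the same size as the body.

```python
from statistics import mean


def guess_title(spans, ratio=1.2):
    """Pick the largest-font text if it is clearly bigger than the average.

    spans: iterable of (text, font_size) pairs, in reading order.
    Returns the joined title text, or None if nothing stands out.
    """
    sized = [(text.strip(), size) for text, size in spans if text.strip()]
    if not sized:
        return None
    average = mean(size for _, size in sized)
    largest = max(size for _, size in sized)
    if largest < ratio * average:
        return None  # nothing is noticeably larger than the rest
    return " ".join(text for text, size in sized if size == largest)
```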
-!- mayko [~mayko@71-22-217-151.gar.clearwire-wmx.net] has joined ##hplusroadmap11:40
kanzurehttp://thisiscolossal.com/2011/06/markus-kayser-builds-a-solar-powered-3d-printer-that-prints-glass-from-sand-and-a-sun-powered-laser-cutter/11:42
archelssolar powered? What a showoff.11:43
kanzuresolar powered photocopier11:44
streetyI've just taken a look at the distribution of page lengths for the pdfs I've extracted text from so far.  Looks like perhaps 10% of the documents contain unusually little text11:45
kanzurehey that's not bad11:46
streetyit's not a random sampling of the docs (I'm running through the directory with Python's os.walk) but I'm happy with that11:47
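One way to flag the documents with unusually little text, sketched under assumptions: the extracted text is taken to sit in `.txt` files under one directory tree, and "unusually little" is arbitrarily defined as less than a tenth of the median length. Neither the file layout nor the cutoff comes from the discussion; `text_lengths` and `sparse_documents` are hypothetical names.

```python
import os
from statistics import median


def text_lengths(root, suffix=".txt"):
    """Map each extracted-text file under root to its length in characters."""
    lengths = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    lengths[path] = len(handle.read())
    return lengths


def sparse_documents(lengths, fraction=0.1):
    """Return paths whose text is shorter than fraction * median length."""
    if not lengths:
        return []
    cutoff = fraction * median(lengths.values())
    return sorted(path for path, n in lengths.items() if n < cutoff)
```

Files flagged this way are candidates for being image-only scans, though short documents and failed extractions would show up too.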
kanzurei'm trying to find a paper on the server that has a "Downloaded by" or an IP address watermark11:47
kanzureIEEE always embeds a $xyz amount in a footer somewhere11:48
kanzureexample: http://diyhpl.us/~bryan/papers2/neuro/implants/Data%20communication%20between%20brain%20implants%20and%20computer%20-%20short%20-%20IEEENeuralSystemsJune2003.pdf11:48
kanzurei could remove that i guess but it's not particularly harmful11:48
streetyI assume removing the name or university that downloaded a document is the more useful part11:52
kanzureah here's one:11:53
kanzurehttp://diyhpl.us/~bryan/papers2/Patterning%20design%20in%20color%20at%20the%20submicron%20scale.pdf11:53
kanzuresee left-hand side11:53
kanzuresome of the pdf obj streams seem to be zipped11:58
-!- uniqanomaly__ [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has quit [Quit: uniqanomaly__]11:59
streetystrangely text extraction has largely worked on that pdf but it doesn't include the Downloaded by reference12:01
kanzurei found it by googling12:02
streetyit looks like google is doing better than I currently am then12:03
-!- augur [~augur@129.2.129.34] has quit [Remote host closed the connection]12:04
streetymendeley seems to cope just fine as well12:06
kanzure-_- i just spent 10min trying to figure out why the pdf wouldn't change12:13
kanzureediting the wrong file12:13
kanzuresoo anyway my first guess was right12:14
kanzureusing that same file, try this:12:16
kanzurecat temp.pdf | grep -a "Length " | sort | uniq -c | sort -k2nr12:16
kanzureas you can see, they repeat the watermark four times (once for each page)12:16
kanzurein this case lines 63-67 inclusive are the watermark on the first page12:17
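The same counting trick as the grep pipeline, as a small Python sketch: tally the `/Length` values in the raw pdf bytes and report the ones that repeat, since a stream that appears once per page is a watermark candidate. `repeated_stream_lengths` is an illustrative name. One known weakness: when `/Length` is an indirect reference such as `/Length 12 0 R`, the number captured is the object number rather than a byte count.

```python
import re
from collections import Counter

# Requires whitespace after "/Length", so "/Length1" font entries are skipped.
LENGTH_RE = re.compile(rb"/Length\s+(\d+)")


def repeated_stream_lengths(pdf_bytes, min_count=2):
    """Return (length, occurrences) pairs that occur at least min_count times.

    Sorted by occurrences (descending), then by length (ascending).
    """
    counts = Counter(int(m.group(1)) for m in LENGTH_RE.finditer(pdf_bytes))
    repeated = [(length, n) for length, n in counts.items() if n >= min_count]
    return sorted(repeated, key=lambda item: (-item[1], item[0]))
```

For a four-page file with a per-page watermark, a length that shows up exactly four times is the first thing to look at.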
kanzureiirc pypdf can handle FlateDecode?12:20
streetyI'm not using pypdf, I think that was the package that returned text but no spaces between words12:21
streetyI'm using pdfminer instead.  It was a pain to get my head around how it worked but generally produces good output12:21
kanzureis there a way to use zlib's inflate from stdin on bash?12:24
streetytime for me to take off, I'll let you know what I manage to create from all those papers12:29
kanzureunfortunately i'm not sure what the contents of that objstream really means12:31
kanzurecat objstream.dat | python3 -c 'import sys, zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))'12:31
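To avoid cutting each stream out by hand, a sketch that finds every `stream ... endstream` body in the raw file and tries to inflate it. It is deliberately crude: it ignores the `/Filter` and `/Length` entries, does not handle encryption, predictors or other filters, and is no substitute for a real pdf parser. `inflate_streams` is an illustrative name.

```python
import re
import zlib

# Body of each stream object, from after "stream<EOL>" up to "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decompressed contents of every zlib-compressed stream."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left before
            # "endstream"; they end up in unused_data instead of raising.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            continue  # not zlib data (uncompressed stream or another filter)
    return results
```

Searching the returned byte strings for `b"Downloaded by"` would then show which streams carry the watermark text.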
kanzurealso that's probably just the display of the text and doesn't actually remove the compressed text from the file12:37
kanzurethe objects with "Length 40" in this file are the pdf/display commands12:41
kanzurethe objects like on line 6, 13, 19 and 25 are the "Downloaded by" lines12:42
-!- augur [~augur@208.58.6.161] has joined ##hplusroadmap12:46
-!- mayko [~mayko@71-22-217-151.gar.clearwire-wmx.net] has quit [Remote host closed the connection]12:47
kanzure"Producer:       Acrobat Distiller Command 3.01 for Solaris 2.3 and later (SPARC)"12:57
kanzureacs is running on solaris?12:57
-!- lumos [~lumos@afdy30.neoplus.adsl.tpnet.pl] has left ##hplusroadmap ["Leaving"]13:01
-!- eudoxia [~eudoxia@r190-135-41-139.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap13:19
-!- PixelScum [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has joined ##hplusroadmap13:19
-!- BaldimerBrandybo [~PixelScum@ip98-177-175-88.ph.ph.cox.net] has quit [Ping timeout: 258 seconds]13:22
-!- eudoxia [~eudoxia@r190-135-41-139.dialup.adsl.anteldata.net.uy] has quit [Read error: Connection reset by peer]13:57
-!- uniqanomaly [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has joined ##hplusroadmap14:06
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has joined ##hplusroadmap14:38
-!- eudoxia [~eudoxia@r190-135-106-229.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap14:58
-!- uniqanomaly [~ua@dynamic-78-8-84-162.ssp.dialog.net.pl] has quit [Quit: uniqanomaly]15:03
-!- Guest89588 [~Jaakko@host86-131-177-233.range86-131.btcentralplus.com] has quit [Quit: Nettalk6 - www.ntalk.de]15:08
-!- augur [~augur@208.58.6.161] has quit [Read error: Connection reset by peer]16:30
-!- eudoxia [~eudoxia@r190-135-106-229.dialup.adsl.anteldata.net.uy] has quit [Read error: Connection reset by peer]16:30
-!- augur [~augur@208.58.6.161] has joined ##hplusroadmap16:31
-!- eridu [~eridu@gateway/tor-sasl/eridu] has joined ##hplusroadmap17:03
-!- nchaimov [~nchaimov@c-24-20-202-138.hsd1.or.comcast.net] has quit [Read error: Connection reset by peer]19:10
-!- nchaimov [~nchaimov@c-24-20-202-138.hsd1.or.comcast.net] has joined ##hplusroadmap19:11
kanzuremore graph visualization: http://ubietylab.net/ubigraph/19:42
-!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap19:56
-!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has quit [Client Quit]20:00
-!- eridu [~eridu@gateway/tor-sasl/eridu] has quit [Remote host closed the connection]20:06
-!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has joined ##hplusroadmap20:36
-!- eudoxia [~eudoxia@r190-135-86-128.dialup.adsl.anteldata.net.uy] has quit [Client Quit]20:38
-!- eridu [~eridu@gateway/tor-sasl/eridu] has joined ##hplusroadmap20:44
-!- eridu [~eridu@gateway/tor-sasl/eridu] has quit [Ping timeout: 250 seconds]22:20
QuantumGhttp://www.youtube.com/watch?v=S7lAlzMBzLQ23:35
QuantumGpretty impressive23:35
--- Log closed Sun Jun 26 00:00:07 2011
