Originally posted by zorgon
Did you ever do that thread on the Canadian files? I believe you said you had a quick d/l for that one?
Originally posted by idealord
Sorry if I'm overstepping; I have not read this whole thread. I'm going to try to write a scraper that will access this listing:
www.footnote.com...[0]=project+blue+book&nav=4294966953+4294961629
Originally posted by idealord
I've got the first 900 or so images
I just need a glass of wine now!
FWIW, I can get you a complete database listing of all the URLs for each of the images, if that'd be helpful. I can easily write something that'll capture each URL's metadata and annotations. Anything that's not Flash is easy to grab with Perl's WWW::Mechanize and HTML::TokeParser libs.
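For anyone who wants to try the non-Flash part without Perl, here is a rough sketch of the same idea in Python using only the standard library. It is not the scraper described above, just an illustration of pulling image URLs and their basic metadata out of plain HTML. The names `ImageCollector`, `collect_images`, and `SAMPLE_HTML`, along with the `example.invalid` URLs, are made up for the example; the real listing pages may be structured quite differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the URL and basic attributes of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative src values
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src")
        if not src:
            return  # skip images with no usable source
        self.images.append({
            "url": urljoin(self.base_url, src),
            "alt": attr.get("alt") or "",
            "title": attr.get("title") or "",
        })


def collect_images(html, base_url):
    """Return a list of dicts, one per <img> found in the HTML text."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.images


# Hypothetical stand-in for a downloaded listing page.
SAMPLE_HTML = """
<html><body>
  <img src="/img/page-001.jpg" alt="Page 1" title="Example record">
  <img src="https://example.invalid/img/page-002.jpg" alt="Page 2">
  <img alt="no source, skipped">
</body></html>
"""

if __name__ == "__main__":
    for record in collect_images(SAMPLE_HTML, "https://example.invalid/listing"):
        print(record)
```

The resulting list of dicts could be written out as CSV or loaded into a database to build the kind of URL listing mentioned above. Anything rendered inside Flash would not show up this way.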
Another trick I've used is to use a local proxy of some sort, to capture all GET requests. Maybe I'll try that in the morning...
FWIW, I used to use a Japanese program that would record mouse movements to script that kind of stuff. It seems like you decompiled the Flash and it's well-obfuscated?
I got 3200 images downloaded at 1280 resolution and, after reading your response, have stopped running it.
Originally posted by Xtraeme
Perhaps Isaac would like two data sets to work from? Maybe one as CBRs and the other as PDFs? My main goal is to get all the content OCR'ed. Without this it's just too much data to manually sift through.
Originally posted by kkrattiger
reply to post by kkrattiger
I realize this thread is about the problem of getting the Blue Book Footnote pages downloaded, but I was just wondering if anyone read my post at the top of this page. In the bottom right corner, note the handwritten "discoverer of Pluto" annotation.