The 154 GB NARA Blue Book Archive

posted on Sep, 14 2014 @ 09:31 AM
link   
Back in 2011, Isaac issued a challenge to the ATS community to see if anyone could come up with a way to download the entire National Archive Blue Book library.

I realized at the time there was a simple solution. So I threw in my hat and decided to give it a try.

The bulk of the coding was done in my spare time in the month after Isaac posted the initial thread.

code.google.com...

Thanks to Isaac's willingness to volunteer to test the program, I was able to track down most of the bugs pretty quickly.

What I wasn't prepared for though was just how long it would take to download everything.

The first pass took well into 2012 before it completed. During that initial pass lots of files were still missing. So with all the improvements and fixes, the plan was to let the program start all over again from the top to see how many of the holes it could fill in. By 2013 we had a second copy that was finally good enough that it was usable for basic research.

Since then there has been extra development on the side to convert the files to PDFs and automate the OCR process, but life has been busy with lots of other craziness. So the NARA project has dwindled.

That is, up until an hour ago, when I decided it was high time to release what we currently have to the ATS family and let everyone play with the results.

To download the archive you'll need BitTorrent Sync.

link.getsync.com...

There will be additional improvements at some point. I definitely want to convert everything to PDFs, but I need to make sure I have all the pages (there are a couple missing) before I run that leg of the process. [snip]

Anyhow, that is the news!

Enjoy folks!


edit on 2014-9-14 by Xtraeme because: (no reason given)

edit on Sun Sep 14 2014 by DontTreadOnMe because: solicitation snipped



posted on Sep, 14 2014 @ 10:00 AM
link   
Great job, what dedication to ufology you have.


I'm sure there are many, many people who appreciate all your hard work on this project, and I for one want to personally thank you.



posted on Sep, 14 2014 @ 10:03 AM
link   

originally posted by: pez1975
Great job, what dedication to ufology you have.


I'm sure there are many, many people who appreciate all your hard work on this project, and I for one want to personally thank you.


Thanks pez, I appreciate that. The Fold3 project is small potatoes though in comparison to some of the other ideas I have in mind.



edit on 2014-9-14 by Xtraeme because: (no reason given)



posted on Sep, 14 2014 @ 10:23 AM
link   
Great job!

Now we have our own digital hardcopy, away from any manipulative forces. It's passion like yours and Isaac's that continues to bring us closer to the truth.

S+F^10



posted on Sep, 14 2014 @ 10:28 AM
link   
Thank you!
This is fantastic work.
All of us who want to understand all we can
about this entirely real phenomenon get a huge boost
from work like yours.
So my sincere thanks again.



posted on Sep, 14 2014 @ 10:45 AM
link   

originally posted by: eisegesis
Great job!

Now we have our own digital hardcopy, away from any manipulative forces.


Here is an interesting factoid. Did you know the NARA archive only includes 10,823 cases when there should be 12,618?

I suspect the missing 1,795 are mixed in amongst the following folders:

9667997, 9669100, 9669191

However, on average, each case works out to be 10.8026 pages long. So 1,795 (missing cases) * 10.8026 (pages on average per case) should work out to about 19,391 pages.

Yet we only have 13,322 pages leftover. That means we are off by 6,069 pages or potentially 562 cases (nearly 4.5% of the total).
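For anyone who wants to re-run the numbers, here is a minimal sketch of the estimate above. The inputs are the figures quoted in this post; the function and variable names are made up for illustration and are not part of the download tool.

```python
# Figures quoted in the post above.
EXPECTED_CASES = 12_618       # cases Blue Book should contain
ARCHIVED_CASES = 10_823       # cases actually present in the NARA archive
AVG_PAGES_PER_CASE = 10.8026  # average pages per archived case
LEFTOVER_PAGES = 13_322       # pages in the uncategorized folders


def estimate_missing(expected_cases, archived_cases, avg_pages, leftover_pages):
    """Estimate how many pages and cases are unaccounted for."""
    missing_cases = expected_cases - archived_cases
    # Pages we would expect the missing cases to occupy.
    expected_pages = missing_cases * avg_pages
    # Pages still unaccounted for after counting the leftover folders.
    page_shortfall = expected_pages - leftover_pages
    # Convert the page shortfall back into an approximate case count.
    unaccounted_cases = page_shortfall / avg_pages
    return {
        "missing_cases": missing_cases,
        "expected_pages": expected_pages,
        "page_shortfall": page_shortfall,
        "unaccounted_cases": unaccounted_cases,
        "share_of_total": unaccounted_cases / expected_cases,
    }


estimate = estimate_missing(
    EXPECTED_CASES, ARCHIVED_CASES, AVG_PAGES_PER_CASE, LEFTOVER_PAGES
)

if __name__ == "__main__":
    for name, value in estimate.items():
        print(f"{name}: {value:,.4f}")
```

The estimate assumes the missing cases have the same average length as the archived ones, which may not hold.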

Even though the official Blue Book archive is supposed to be complete, Fold3 and the National Archives and Records Administration (NARA) periodically update the material on the website.

So one of the goals was to eventually have the tool randomly look for changes. Unfortunately, I never got around to adding the code to have it do periodic spot checks.

Not enough time, sadly.

Maybe at some point in the future because that is potentially a lot of missing material...
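A rough sketch of what such a spot check could look like, in Python. This is not code from the tool. It assumes each page is stored locally as `<page_id>.jpg` (a made-up layout) and that the caller supplies a hypothetical `fetch_page` function that returns the current bytes of a page from the site.

```python
import hashlib
import random
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def sha256_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def spot_check(
    page_ids: Iterable[str],
    archive_dir: str,
    fetch_page: Callable[[str], bytes],  # hypothetical: returns the remote page bytes
    sample_size: int = 25,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Re-fetch a random sample of pages and report those that differ locally."""
    rng = rng or random.Random()
    ids = list(page_ids)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    changed = []
    for page_id in sample:
        # Assumed layout: one image file per page, named after its ID.
        local_path = Path(archive_dir) / f"{page_id}.jpg"
        remote_digest = sha256_bytes(fetch_page(page_id))
        if not local_path.exists():
            changed.append(page_id)  # page is missing from the local copy
        elif sha256_bytes(local_path.read_bytes()) != remote_digest:
            changed.append(page_id)  # page content changed upstream
    return sorted(changed)
```

Comparing hashes only catches byte-level differences, so a page re-encoded by the site would show up as changed even if it looks identical.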
edit on 2014-9-14 by Xtraeme because: (no reason given)



posted on Sep, 14 2014 @ 11:11 AM
link   
a reply to: Xtraeme

Son, while I'm fully aware that you did in fact try to do a good thing here, you failed.

This is nothing more than a loosely organized collection of non-relational data. And it is absolutely the very last thing needed here.

IF, and that is a very big, resounding IF, you had actually compiled that data into a real database, you would have solved an issue and answered a question that has haunted Terrestrials for at least 70 years. But, unfortunately, it seems that none of you know what a "real database" even looks like, even though several tech companies (like Oracle, Microsoft, and others) have shown all y'all the "way" concerning databases and have provided free tools for y'all to use. It seems that the data surrounding UFOs will remain unavailable in the most important ways.

Did you know that IF you had actually built a relational database (think SQL engine), you could probably have done a simple mining operation and proven everything you wished to prove with this data...

The proof of ET and his visitations, and other activities is very likely contained within that 150Gigs! And without proper handling you/we will never see it.


posted on Sep, 14 2014 @ 11:25 AM
link   
a reply to: tanka418

I don't think you know what you are talking about. The raw data is a series of scanned in pieces of paper from the 50s and 60s. Basically NARA and Fold3 locked the material onto their website with a secure perimeter solution to make it as hard (and slow) as possible for anyone to download the information. That is no longer the case. Now at this point people can do whatever they want with it.

If a person wants to try to automate creating a SQL structured database by manipulating the images -- by all means -- go for it! That is exactly the point of why I am making it available. However before you are going to do that, if you want to do anything useful, you are going to have to get at the data in those scanned pieces of paper. That means OCR. Databases are great when you have data already organized and broken out in rows and columns with common elements between the dataset. Not so much when you don't.
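To make the order of operations concrete, here is a minimal Python sketch of the "OCR first, database second" idea. It is an illustration under assumptions, not an existing pipeline: the `ocr` argument is a placeholder for whatever OCR engine someone plugs in, and the `pages` table layout and function names are invented for this example.

```python
import sqlite3
from typing import Callable, Iterable, List, Tuple


def build_page_index(
    pages: Iterable[Tuple[str, int, bytes]],  # (case_id, page_no, image bytes)
    ocr: Callable[[bytes], str],              # placeholder for a real OCR engine
    db_path: str = ":memory:",
) -> sqlite3.Connection:
    """OCR each scanned page and store the text in a simple SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " case_id TEXT NOT NULL,"
        " page_no INTEGER NOT NULL,"
        " body TEXT NOT NULL,"
        " PRIMARY KEY (case_id, page_no))"
    )
    with conn:  # commit all inserts as one transaction
        conn.executemany(
            "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
            ((case_id, page_no, ocr(image)) for case_id, page_no, image in pages),
        )
    return conn


def search(conn: sqlite3.Connection, term: str) -> List[Tuple[str, int]]:
    """Return (case_id, page_no) for pages whose OCR text contains the term."""
    rows = conn.execute(
        "SELECT case_id, page_no FROM pages WHERE body LIKE ?"
        " ORDER BY case_id, page_no",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

This only gets as far as searchable text per page. Pulling dates, locations, and witness counts out of that text into proper columns is a separate and much harder step.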
edit on 2014-9-14 by Xtraeme because: (no reason given)


posted on Sep, 14 2014 @ 01:07 PM
link   
The following is my opinion as a member participating in this discussion.


originally posted by: tanka418
Son, while I'm fully aware that you did in fact try to do a good thing here, you failed.

He didn't fail, because that was not the idea. This is just the first step of a multi-step process, and someone had to do it.

Thanks to that, anyone can now start the next step.



As an ATS Staff Member, I will not moderate in threads such as this where I have participated as a member.



posted on Sep, 14 2014 @ 01:17 PM
link   
a reply to: Xtraeme

I would love a link to the torrent of the whole thing...



posted on Sep, 14 2014 @ 01:33 PM
link   
Jolly Good show old chap!



posted on Sep, 14 2014 @ 02:05 PM
link   
I am truly thankful for, and impressed by, the work you put into this phenomenon.
It is people like you that make things happen.

And suddenly one day you are a part of history



posted on Sep, 14 2014 @ 02:10 PM
link   
a reply to: Xtraeme




That means OCR. Databases are great when you have data already organized and broken out in rows and columns with common elements between the dataset. Not so much when you don't.


Is this a job for Hadoop?




edit on 14-9-2014 by Bybyots because: because everyone knows SO has a cluster under his desk. We'll use his.



posted on Sep, 14 2014 @ 02:13 PM
link   
a reply to: Spacespider

Me too. I tried BitTorrent Sync and it just sat there not moving. No peers, I'm guessing, or no initial seed.

I don't know, I've never used it. But 154 GB is going to take an eternity and blow my quota lol

ETA

I see it's now moving. It will still take me a decade to get, as I can't afford to blow 154 GB in one transfer, but I can do it over time.

ETA 2

Oh, I wish this was browsable online. I love reading documents like these, but at a trickle it's going to be painful lol. Only 200 MB transferred so far.

Cheers OP!
edit on 14-9-2014 by sn0rch because: (no reason given)



posted on Sep, 14 2014 @ 02:22 PM
link   
a reply to: Xtraeme

For my part, I think you are right about the "raw" data. And you've done a great thing making it available like this, out in the open, before an SQL engine database is built. Because if you'd gone and built that, without making raw data available that was used to build it, then someone would come along and say the raw data was fooled with in making that database…..

This data may have been "fooled with"; we'll never know. But we now have, frozen at this point in time, exactly what NARA holds, so a database can be built from the same raw data they have. That's really something to write home about. Now if someone would just give me the map to get home or let me remember long enough to make it myself…..
LOL.
tetra50



posted on Sep, 14 2014 @ 02:55 PM
link   
Nice!! Thanks to whoever coded this algorithm, now we can kick back and relax reading some blue book files!!

While you're reading the BB files you can also listen to this following:

The Pixies - Motorway to Roswell

And

The Pixies - The Happening

I find it interesting that the band that single-handedly created the "alternative music" genre and is considered to be the greatest if not one of the greatest bands of the 20th century, sings about UFOs.

Even though some of the BB data may be "inaccurate" to say the least, it can serve as a starting point for research and detective work. For example, if we find a sighting that was seen by a number of people, it may have made it into a newspaper report or possibly a police report, and then we can start looking for those news stories and police reports. If need be, we can even find the people and/or their kids and interview them. We can then possibly obtain radar data (if available) and, little by little, reveal whether the story is accurate or not.
edit on 14-9-2014 by deloprator20000 because: (no reason given)


posted on Sep, 14 2014 @ 03:43 PM
link   

originally posted by: Xtraeme
To download the archive you'll need BitTorrent Sync.

link.getsync.com...


Xtraeme's short and modest post doesn't do justice to the amount of effort he put into this project and the potential usefulness of his work to the wider ufological community.

If there were a few more people like Xtraeme active in ufological research, we might actually be able to make a bit of progress!


Thanks again for your hard work Xtraeme.



posted on Sep, 14 2014 @ 03:56 PM
link   
a reply to: Xtraeme
Absolutely a Great Job!

Skills make the difference.
Thanks to you and Isaac

S&F.



posted on Sep, 14 2014 @ 06:52 PM
link   
Great job, but I have to ask: why is the archive so large? If it consists of 129,000 pages and is 154 GB in size, that works out to about 1.2 MB per page. I realize that the pages are scanned images, but in what format? If they are in a non-compressed format (like .tif or .bmp), surely they can be re-sampled as .jpg or .png and reduced in size ten to 100-fold. It would be a lot easier to download 1.5 GB than 154.



posted on Sep, 14 2014 @ 06:53 PM
link   

originally posted by: Spacespider
a reply to: Xtraeme

I would love a link to the torrent of the whole thing...


For the moment I want to keep it in BT Sync, since there are probably about 50 to 100 pages still missing from the dataset. This way, if people stay attached to the BT Sync folder, they can get the extras. Once I am confident there is nothing left to snag, I'll let the PDF and OCR process rip and drop the PDFs in the root folder. Then I'll create two finalized static torrents, one for the images and another for the PDFs. Another thing to add to the todo list.

edit on 2014-9-14 by Xtraeme because: (no reason given)


