Reddit co-founder Aaron Swartz is tired of tussling with federal agencies over Freedom of Information Act requests, and he wants Obama's CTO to put public government data online for all to see. "We've had to do some crazy stuff to get data out of the government," says Swartz, who also runs, a hub for political data and activism. "The average programmer, the first time they have to call up a government agency, [they find] it's not worth it."

Swartz goes even further, proposing a national project to scan the millions of paper documents that live in the National Archives, Library of Congress, and other repositories, a massive program that would dwarf Google's effort to scan library books. "Imagine if we had a public process to take this and put it online, not just for one company, Google, but for the public at large," he says.
Aaron Swartz was one of the chaps i did SWAG with ... have been following his career ever since.  It's interesting that he has started billing himself as an activist.

I posted the Daily Beast article on reddit, and it picked up 427 points and 83 comments.

Looks like if you actually log in and create an account you can start to see what this site is all about. What is amazing is just how closely that page reflects this one, proving again that great minds run in tandem. Btw, this is the URL to my account.

Apparently the data that they are using has been collecting in this directory.


The really big problem with such an amount of data & information is trying to digest it into a useable form. Just trying to find out what happened in the news on a truthful level is beyond most data processing without involving a human.
Well if we are dealing with massive repetitive (relational) data, then i don't understand why making that available to SQL queries would not be the best. In that case we are usually not dealing with interpretations but only raw data, and accuracy and currency should be the operative words and not "truth". The digesting of the data should then be left to whoever is reading it. The same goes for public documents that are published. In that case we just need good indexing to find what you are looking for. IOW, truth and digestion are up to the consumers of the information.

Yep, you need someone unbiased to index it & abstract it which is never absent bias when a human is involved & also takes a lot of resources & time. Partisans who want to bury something need only index it wrongly.
I wonder why liberals & pragmatists seem to be worried about the truth. See this article in the Wikipedia:
source: ... Concept of truth

Going back to James, in pragmatism truth is not ready-made, but jointly we and reality "make" truth. This idea has two senses, one which is often attributed to William James and F.C.S. Schiller, and another that is more widely accepted in pragmatism: (1) truth is mutable, and (2) truth is relative to a conceptual scheme.

(1) Mutability of truth

One major difference within pragmatism about the definition of 'truth' is the question of whether beliefs can pass from being true to being untrue and back. For James, beliefs are not true until they have been made true by verification. James believed propositions become true over the long term through proving their utility in a person's specific situation. The opposite of this process is not falsification, but rather a belief ceasing to be a "live option." F.C.S. Schiller, on the other hand, very clearly asserted that beliefs could pass into and out of truth on a situational basis. Schiller held that truth was relative to specific problems. If I want to know how to return home safely, the true answer will be whatever is useful to solving that problem. Later on, when faced with a different problem, what I came to believe when faced with the earlier problem may now be false. As my problems change and as the most useful way to solve a problem shifts, so does the property of truth.

... to the zen, truth is what's so.

Well it will be interesting to see what Aaron and Co come up with at  My immediate interest is in the Data phase of the project. 
Data. There's a lot of great information out there about politics: votes, lobbying records, campaign finance reports. Unfortunately, it's split across a dozen different web sites and often hidden behind confusing interfaces. We're pulling all of that together and letting you explore it in one elegant, unified interface. (Plus, we're sharing all the results so you can come up with new ways to explore it.)
If you have ever tried to find or get data from a government website you will certainly sympathize with that goal. The question is how far the Obama administration is going to go in that direction to start with. My solution would be to have the US government open up a SQL server with plenty of capacity and bandwidth. Put all redundant information and metadata about government documents in relational tables and define URLs on that data to identify the constants contained therein. Allow people to tag the data just like at fastblogit. From that database anyone should be able to format outputs. Any website could format and display the extracted data. But Aaron will probably opt to just republish the information he finds in RDF. Then it will be up to the user to install high-volume semantic web tools to process it. So the big question is which way is the new administration going to go? They should hire me ... i could tell them how to do it.
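To make the relational idea concrete, here is a rough sketch of what i have in mind, using Python's built-in sqlite3 module as a stand-in for a real public SQL server. Everything specific in it is an assumption made up for illustration: the `documents` and `tags` tables, the sample rows, and the `data.example.gov` base URL are hypothetical and do not describe any actual government system.

```python
import sqlite3

# Hypothetical base URL; no real endpoint is named in this thread.
BASE_URL = "https://data.example.gov"


def build_demo_db():
    """Create an in-memory database with made-up document metadata and a tag table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (
            id        INTEGER PRIMARY KEY,
            agency    TEXT NOT NULL,
            title     TEXT NOT NULL,
            published TEXT NOT NULL
        );
        CREATE TABLE tags (
            document_id INTEGER NOT NULL REFERENCES documents(id),
            tag         TEXT NOT NULL,
            UNIQUE (document_id, tag)
        );
        """
    )
    # Sample rows are invented purely for the demo.
    conn.executemany(
        "INSERT INTO documents (id, agency, title, published) VALUES (?, ?, ?, ?)",
        [
            (1, "Agency A", "Sample budget report", "2008-10-01"),
            (2, "Agency B", "Sample contract list", "2008-11-15"),
        ],
    )
    return conn


def document_url(doc_id):
    """Give every row a stable URL so other sites can link to the raw record."""
    return f"{BASE_URL}/documents/{doc_id}"


def tag_document(conn, doc_id, tag):
    """Attach a user-supplied tag; the same tag on the same document is stored once."""
    conn.execute(
        "INSERT OR IGNORE INTO tags (document_id, tag) VALUES (?, ?)",
        (doc_id, tag.lower()),
    )


def documents_with_tag(conn, tag):
    """Return the unfiltered rows for one tag; formatting is left to the caller."""
    rows = conn.execute(
        """
        SELECT d.id, d.agency, d.title
        FROM documents AS d
        JOIN tags AS t ON t.document_id = d.id
        WHERE t.tag = ?
        ORDER BY d.id
        """,
        (tag.lower(),),
    ).fetchall()
    return [
        {"url": document_url(doc_id), "agency": agency, "title": title}
        for doc_id, agency, title in rows
    ]


if __name__ == "__main__":
    conn = build_demo_db()
    tag_document(conn, 1, "budget")
    tag_document(conn, 2, "contracts")
    for row in documents_with_tag(conn, "budget"):
        print(row)
```

The point of the sketch is the split of responsibilities: the database only holds raw rows, URLs, and tags, and any website can run its own query and format the result however it likes.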

But, imho, this item is not about the metaphysics of truth - it is about raw uninterpreted data. It is about making the raw data generated by the government accessible to ordinary people. Anybody can filter and project the data in various tables and graphs - but there should be a common unfiltered data source.

Mark de LA says
What I am saying is that there is too damned much of it. You may get somewhere by digitizing it & then using some kind of robust ai application to extract the information into a database, but I doubt that's going to happen anytime soon. I guess that truth is just my hang-up! I just hate lies!  If you want to practice on some, try extracting the information in say - this document - & see if you can get anything useful out of it.
 - which of course is not US governmental.

M 2009-01-08 07:04:25 11224
Are you pluke on reddit? Did the reddit experience uncover anything useful or interesting? The first thing I would like to see posted would be Obama's real birth certificate.

Yep i'm pluke. Reddit is obviously more interested in debunking Aaron's contribution to reddit than in anything new that he is doing ... just like you were more interested in your take on information and truth than in any new-age governmental transparency. As to the birth certificate, well i wouldn't expect something as hyperpartisan as that to ever show up on here; but that page needs to be fleshed out. As you can see, my own take on what this should be is far different than what is actually happening, although i did foretell that it appears to be mostly an excuse to do a massive semantic web application. From my point of view, the semantic web is still a pipe dream ... why not lobby the new administration to actually make their data and metadata available via traditional SQL servers? But i'm game ... if the government makes their metadata available to all via the semantic web ... well that would certainly validate TBL's dream.

M 2009-01-08 09:20:27 11224

Hyperpartisan or not, the birth certificate thingy shows how much BHO is really interested in transparency.  I suspect most are more interested in exposing the Bush administration. Too bad they will also hit the Clinton administration which will expose a whole bunch of Obama's appointees.  When you get a large amount of information with a large degree of complexity in the relationships amongst the bits of data then you will find that the needle in the haystack principle prevails & very little useful will come of it. It will be like hiding something in plain sight. That which is probably more interesting is the classified information useful to both our enemies & each other.  That's the way they will tie it all back up behind the veil.  One good thing might be that BHO in his bailout plans might devote a few billion to hire some of us old computer programmers & the like to help solve the problem.
Remember this item is about

There has not been any activity at this website for a long time ... perhaps it is only one of Aaron's dead projects. 

