Jose Sandoval - Software Developer - Software Development: The problem with blogging

Jose Sandoval


About me
My name is Jose Sandoval, and I'm a software developer. I consult on all areas of software engineering. You can reach me at jose@josesandoval.com.

If you are interested in RESTful web services, I wrote a book titled RESTful Java Web Services.

I was named after my father (Jose) and my grandfather (Felix). The Sandoval last name comes from a little town located in the northern part of the province of Burgos, Spain, called Sandoval de la Reina.

One of the first Sandovals to have set foot in the continent of America was Gonzalo de Sandoval, a conquistador who was led by Hernan Cortez in 1519.

As with many Spanish last names, a coat of arms for it exists.

The problem with blogging
Thursday, April 05, 2007

I want to solve one of the problems with blogging. And I have a rather silly solution.

It's not that blogging has many issues (some people may disagree here), but there is a particular problem I'm interested in: the "404 linking" problem, that is, dead or stale links.

The number 404 is the status code the HTTP specification assigns to the "Not Found" response. It gives anyone creating a web server a convention for what to return when something is not found, and it lets the browser display some friendly message to the user, i.e., the dreaded "page not found" error message.
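To make the 404 idea concrete, here is a minimal sketch of how a script could ask a server about a link and decide whether it looks dead. The function names (`fetch_status`, `is_dead`) and the choice to treat 410 ("Gone") and unreachable hosts as dead are my own assumptions, not part of any standard tool.

```python
import urllib.error
import urllib.request

# Assumption: 404 (Not Found) and 410 (Gone) both count as a dead link.
DEAD_STATUSES = (404, 410)


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    # HEAD asks for the headers only, so the page body is not downloaded.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx answers; the code is still available.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and so on.
        return None


def is_dead(status):
    """Decide whether a status from fetch_status means the link is dead."""
    return status is None or status in DEAD_STATUSES
```

Calling `is_dead(fetch_status(link))` for every link in an old entry would flag the stale ones. Some servers reject HEAD requests, so a real checker would probably retry with GET before giving up on a link.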

One of the major aspects of blogging is linking to other sources on the internet (well, it's the lifeline of the web), but many sites go out of date quickly, so whenever anyone looks at historical entries in any blog, some links are no longer valid.

What's a blogger to do, if he or she wants to keep blog entries relevant for all eternity?

My solution is simple, but I'm not sure if it is breaking copyright laws--maybe not; isn't everything on the web free? ;)

What I do is take a snapshot of the original content, either as an image or as a copy of the actual HTML, host it locally, and link to it in my blog entry.

I only do this for a few sites that I think are rather important to the context of my entries. Of course, I still have a link to the original site and then clearly mark the local content as an exact copy of the original. My legal defense is that I'm not selling the information, I'm giving full credit for the content, and my site is not a commercial application.
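The copy-and-credit routine described above can be sketched in a few lines. This is only an illustration of the idea: the file naming scheme, the `.source.txt` note, and the function names are hypothetical choices of mine, and the code assumes the HTML has already been downloaded.

```python
import hashlib
import html
from datetime import datetime, timezone
from pathlib import Path


def snapshot_name(url):
    """Derive a stable local file name from the original URL."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"{digest}.html"


def save_snapshot(url, content, directory, taken_at=None):
    """Store the copied bytes plus a note recording where they came from."""
    taken_at = taken_at or datetime.now(timezone.utc)
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    copy_path = directory / snapshot_name(url)
    copy_path.write_bytes(content)
    # The note marks the file as an exact copy and credits the source.
    note_path = directory / (copy_path.stem + ".source.txt")
    note_path.write_text(
        f"Exact copy of: {url}\nCopied on: {taken_at.isoformat()}\n",
        encoding="utf-8",
    )
    return copy_path


def credit_line(url, copy_href):
    """Build the markup for a blog entry: original link first, then the copy."""
    original = html.escape(url, quote=True)
    local = html.escape(copy_href, quote=True)
    return (
        f'<a href="{original}">original</a> '
        f'(<a href="{local}">exact local copy</a>)'
    )
```

Keeping the link to the original first and labelling the second link as a copy mirrors the two promises made above: credit goes to the source, and the local file is clearly marked as a copy.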

It's more of a referencing exercise. Without references, there would be no new academic work or any new books written, for that matter. So I can use any idea, as crazy as it may be, as long as I give credit where credit is due.

And there are some things you can copy without permission. For example, in Canada I can reproduce up to 10% of any copyrighted material for academic purposes and not break the law. The source of this information was the WLU librarian on duty a few weeks ago, and I'm assuming he knows what he is talking about--I would recommend checking with your friendly neighborhood librarian to find out what the rules are in your location.

So my solution is not really a solution, but a hack. Still, I'm trying to design a permanent, fully legal, and open source solution. One that will work for everyone and forever.

What's similar out there? On social networking sites such as Slashdot or Digg, a mirror is used temporarily to cope with the Slashdot or Digg effect.

What these mirror sites do is copy the content locally so web surfers can temporarily visit the mirror instead of the original, which alleviates the immense load that Slashdot or Digg generates when a story makes it to the front page.

This works, and it has actually created a fringe industry, based on advertising revenues. But it's only temporary. There is neither enough bandwidth nor enough incentive to keep a permanent mirror of other sites' content. In addition, it's probably illegal.

Another candidate is the Wayback Machine. But there are two main reasons it won't work: first, who knows when the site will go belly up and stop archiving web sites; second, it only takes snapshots every so often--it is not really archiving everything published.
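For what it's worth, asking the Wayback Machine whether it happens to hold a copy of a page could look like the sketch below. The endpoint and the JSON field names (`archived_snapshots`, `closest`, `available`, `url`) reflect the Internet Archive's availability API as I understand it, so treat them as assumptions and check the current documentation. The function names are mine.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Internet Archive availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload):
    """Pull the archived URL out of a decoded response, or return None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url, timestamp=None, timeout=10.0):
    """Ask the archive for the snapshot closest to timestamp, if any."""
    query = availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

A `None` result is exactly the second objection above in practice: if no snapshot was taken near the date an entry was written, there is nothing to link to.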

To me, the 404 blogging issue needs fixing, but my solution of copying the content and offering it locally is not scalable.

So, unless every site on the internet stays up for as long as mine does, which is the ideal, what other solution is there?

I'm hoping to come up with a rather good solution, although I'm not working too hard on it.


© Jose Sandoval 2004-2009 jose@josesandoval.com