Friday, June 7, 2013

Freshness Dating Wizard


One of the more elusive search feats is to determine a date of publication. It's right up there with trying to track down an author when the creator of information uses a pseudonym or no name at all.

This prompted me to create a new search wizard to retrieve metadata from pages. In this case, the metadata is the HTTP header information that a server transmits along with a page. If the pages are .htm or .html (static Web pages), the metadata usually includes a Last-Modified date, which may be a clue to the age of the information.
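For readers who want to see what such a lookup involves, here is a minimal Python sketch. It is not the Wizard's actual code; the function names are made up for illustration. It sends a HEAD request, which asks the server for the headers only, and reads the Last-Modified field if one is present.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def last_modified_from_headers(headers):
    """Return the Last-Modified header as a datetime, or None if it is
    missing or cannot be parsed. `headers` is any mapping with .get()."""
    value = headers.get("Last-Modified")
    if value is None:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # The server sent something that is not a valid HTTP date.
        return None


def fetch_last_modified(url):
    """Hypothetical helper: request only the headers for `url`
    and return the parsed Last-Modified date (or None)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return last_modified_from_headers(response.headers)
```

Calling fetch_last_modified with the address of a static page should return a date; for many dynamically created pages it will return None.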

Last-Modified information can be viewed in Firefox using Page Info (right-click on the page), but it seems to have disappeared from other browsers. Since students who use our Information Researcher challenges don't always have access to Firefox, providing another search tool seemed like a good idea.

Last-Modified information is not an exact way to determine when material was created, but it is useful. For example, if you search this blog post (the one you are reading now) using the Wizard, you will get Last-Modified information for the last time the entire blog was updated. Blogspot is an example of a dynamically created page, not a static page. Some elements of the page, namely the ads, did not exist on the page until you requested it. If you search metadata for older posts on this blog you will see the same Last-Modified date. Another method is needed to determine the publication date of a blog post; fortunately, the date is usually easy to find at the top of the post itself or in the URL.

Dynamically created pages don't really send Last-Modified data. Instead, they report the day and time the server sent the information, which is when you requested it. Students using Firefox can be confused by this, because dynamically created pages (those with extensions such as .asp, .php or .xhtml, or with no extension at all) are still displayed in Firefox's Page Info as having a Last Modified date. In our Wizard, the simple version will tell you if Last-Modified is not available.
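The distinction can be sketched in a few lines of Python. This is only an illustration, not how the Wizard works: the function name, the messages and the five-second tolerance are my own assumptions. It compares Last-Modified with the Date header, which records when the server sent the response.

```python
from email.utils import parsedate_to_datetime


def describe_freshness(headers, tolerance_seconds=5):
    """Hypothetical check: say whether Last-Modified looks like a real
    modification date or just the moment the page was requested."""
    last_modified = headers.get("Last-Modified")
    date_sent = headers.get("Date")  # when the server sent the response

    if last_modified is None:
        return "Last-Modified is not available"

    if date_sent is not None:
        gap = parsedate_to_datetime(date_sent) - parsedate_to_datetime(last_modified)
        # Assumption: a gap of a few seconds suggests the page was
        # generated at request time rather than saved earlier.
        if abs(gap.total_seconds()) <= tolerance_seconds:
            return "Last-Modified matches the request time"

    return "Last-Modified: " + last_modified
```

A page whose Last-Modified value is within a few seconds of its Date value was probably built on the fly, so the date says little about the age of the content.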

There's also a more comprehensive version of the Metadata search that retrieves server information, expiration date, cookie information, etc. for those who would like to see more information about a page, particularly dynamically created ones.
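As a rough idea of what a fuller report might pull out, the Python sketch below picks a handful of fields from a block of raw header text. The list of fields and the function name are assumptions chosen for illustration; the Wizard may report different ones.

```python
from email import message_from_string

# Assumed selection of fields; a real tool may show more or fewer.
FIELDS = ("Server", "Date", "Last-Modified", "Expires", "Set-Cookie")


def summarize_headers(raw_headers):
    """Return a dict mapping each field found in `raw_headers` to a list
    of its values (a header such as Set-Cookie can appear more than once).
    `raw_headers` is the header text without the HTTP status line."""
    message = message_from_string(raw_headers)
    summary = {}
    for name in FIELDS:
        values = message.get_all(name)  # None when the field is absent
        if values:
            summary[name] = values
    return summary
```

Fields the server did not send are simply left out of the result, so a missing Last-Modified entry is easy to spot.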

Try the new Wizard!

More information on Static and Dynamically created pages
