Saturday, November 20, 2010

Week 11

Web Search Engines Parts One and Two
David Hawking

After reading the first article in this series, I can’t say I’m an expert on web crawling, but it gave a decent overview and made me more familiar with what the term means.

The second article was easier for me to understand. The portion about ‘term lookup’ was interesting; I had not considered that all languages would be included in a search, but that makes perfect sense.

The Deep Web
Michael Bergman

This is a good discussion and explanation of the deep web versus the surface web. I now realize that when I enter a search term into Google, the results that come up are far fewer than I thought I was getting. I always took it for granted that a search returns good results, and that if it doesn’t, it is my fault because my search term was not optimal. I guess this is not the case if many webpages are buried so far in the deep web that my search never pulls them up. The BrightPlanet technology that the author describes sounds like a positive change for web searching, but unfortunately I don’t think we’ve reached this level yet, and it’s been almost 10 years since the article was published.

Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting
Shreeves, Habing, Hagedorn, Young

As I understand it, OAI is dedicated to the distribution of archival content. OAI has been applied not only in archives but in museums and libraries as well, and the authors describe the current developments of this project. I appreciated that they discussed not only the benefits of this initiative but also its shortcomings and where it could be improved. I also liked learning about the different types of initiatives that are taking place in the field.

1 comment:

  1. It is surprising that, after all of this time, much of the web is still "hidden." I was thinking that this might have something to do not only with page rankings but also with the highly specialized content of those pages.
