One cannot begin to talk about an idea without first presenting its history. The concepts of memory extension and hypertext date back to the mid-1940s, when Vannevar Bush’s “As We May Think” was published in The Atlantic Monthly. He urged scientists to work together to help build a body of knowledge for all mankind:

“The human mind does not work that way. It operates by association… Man cannot hope fully to duplicate this mental process artificially, but he certainly ought to be able to learn from it. In minor ways he may even improve, for his records have relative permanency.”

Bush’s work encouraged scientists to further explore the idea of indexing knowledge and searching through a database of it. Their research led to what we now know as the World Wide Web (WWW) and search engines. Bush’s concept of the mind operating by association, an old idea found in David Hume, leads search engines to implement a measure of association when ranking results. Search engines started out by simply matching the search term against the terms found in documents. The resulting unordered display of results presented problems with data organization and search engine accuracy that required a paradigm shift led by Google.

This paradigm shift toward precision, recall, and ranking enticed users to turn to a search engine when seeking information. ‘Recall’ is the share of all the relevant information available for your search that the engine actually returns, while ‘precision’ is the share of the returned results that are relevant. For example, suppose you search for gorilla and a dog website is the first result listed. That result is not relevant to the search, so it lowers precision. ‘Ranking’ is ordering the results in a meaningful way, usually from most to least relevant. Google was one of the first to implement a “PageRank” indicator to measure a page’s importance and, consequently, its relevance, which led to the first popular subjective search engine algorithm. The details of this subjective algorithm are as follows:

“Academic citation literature has been applied to the web, largely by counting citations or back links to a given page. This gives some approximation of a page’s importance or quality. PageRank extends this idea by not counting links from all pages equally, and by normalizing by the number of links on a page. PageRank is defined as follows:

“We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
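The quoted formula can be applied repeatedly until the values settle. The following is a minimal sketch of that iteration, not Google's implementation: the function name `pagerank`, the starting value of 1.0 for every page, the fixed iteration count, and the three-page example web are all assumptions made for illustration.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over a small link graph.

    `links` maps each page to the pages it links to (hypothetical input format).
    """
    # Treat each page's outgoing links as a set so C(T) counts distinct links
    out = {page: set(targets) for page, targets in links.items()}
    pages = set(out) | {t for targets in out.values() for t in targets}
    # Assumed starting point: every page begins with the same rank
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Each new value is computed from the previous round's values
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in out.items()
                if page in targets
            )
            for page in pages
        }
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page C ends up ranked highest because it receives links from both A and B. Nothing in the calculation looks at what any page actually says.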

This biased approach to a page’s ‘importance or quality’ is what makes this algorithm subjective: a page’s rank depends on the opinions of other sites, expressed as links, rather than on its own content. A search engine is used to acquire information relevant to the search, and relevancy is important when dealing with information. Search engines should therefore produce relevant results, which are the most precise results for a keyword or keyword phrase. My view is that search engines should not take a subjective approach to acquiring results. This subjective relationship approach creates false results. For search engines to be effective, they need to produce the most relevant results.
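To make precision and recall concrete, here is a small sketch that uses the standard information-retrieval definitions: precision is the fraction of returned results that are relevant, and recall is the fraction of relevant pages that were returned. The function name and the page names in the gorilla example are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query's results."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Guard against empty inputs so the sketch never divides by zero
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search for "gorilla": three results come back, one is a dog site
results = ["dog-site", "gorilla-facts", "gorilla-zoo"]
relevant_pages = ["gorilla-facts", "gorilla-zoo", "gorilla-wiki", "gorilla-conservation"]
precision, recall = precision_recall(results, relevant_pages)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Two of the three results are relevant, and two of the four relevant pages were found. The dog site costs precision, while the two missing gorilla pages cost recall.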

I will begin by presenting irrelevant results that appear when a subjective algorithm design is used, under the assumption that relevance is judged from the perspective of the user. Next, I will show how relevant results are excluded under a subjective algorithm design, again from the perspective of the user. Lastly, I will present the kinds of questions that should be asked and probed for answers about search engines and how they are to be structured.

We’ve just seen how Google’s subjective PageRank algorithm is supposed to present the most relevant results to the user. There are two arguments that can demonstrate how a subjective algorithm produces irrelevant results. First, irrelevant results can be found throughout search queries. We will perform a keyword search for the terms ‘failure’ and ‘miserable failure’ using Google. The first result, supposedly the most relevant, comes up as a biography of George W. Bush. Although one may argue that this is the most relevant result for these terms, the fact is that the page presented is not about failure at all. Indeed, the page never mentions ‘failure.’ Google might respond by arguing that its search results are generated by computer programs that rank web pages in large part by examining the number and relative popularity of the sites that link to them, and that, through a practice called Google bombing, determined pranksters can occasionally produce odd results. The term “Google bomb” is used both as a verb and a noun; it is an attempt to influence the rank of a given page by using consistent anchor text (as in the example above) from a large number of sites. This response fails because it does not address the real issue at hand: the relevance of the result to the term.
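The mechanics of a Google bomb can be illustrated with a toy model in which a page is scored for a query by counting the inbound links whose anchor text contains that query. This is only a sketch of the idea described above, not how Google actually computes rank; the function names, URLs, and page text are invented for the example.

```python
from collections import Counter


def anchor_text_scores(links, query):
    """Count, per target page, the inbound links whose anchor text contains the query.

    `links` is an iterable of (source_page, anchor_text, target_page) tuples.
    """
    scores = Counter()
    for _source, anchor, target in links:
        if query.lower() in anchor.lower():
            scores[target] += 1
    return scores


def body_matches(pages, query):
    """Return the pages whose own text contains the query."""
    return [url for url, body in pages.items() if query.lower() in body.lower()]


# Hypothetical pages: only the essay actually uses the phrase
pages = {
    "bio.example/president": "Official biography of the president.",
    "blog.example/essay": "An essay on a miserable failure of planning.",
}
# Five prank sites link to the biography with the same anchor text
links = [(f"prank{i}.example", "miserable failure", "bio.example/president") for i in range(5)]
links.append(("friend.example", "good essay", "blog.example/essay"))

print(anchor_text_scores(links, "miserable failure").most_common(1))
print(body_matches(pages, "miserable failure"))
```

Under this model the biography wins the query on anchor text alone, even though the only page whose text contains the phrase is the essay.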

Google might also argue that “pranks like this may be distracting to some, but they don’t affect the overall quality of our search service, whose objectivity, as always, remains the core of our mission.” This response also fails because such results do affect the overall quality of the search service. Such results go against the mission of “relevant search” and provide evidence against a subjective algorithm. We have seen that none of Google’s potential responses to the argument that subjective algorithms produce irrelevant results succeeds. Hence, we should reject Google’s claim that a subjective algorithm reliably produces relevant results.

The irrelevance of search results has been presented, and I will now turn to the exclusion of relevant results due to the presence of irrelevant ones. My argument relies on the search term ‘blue.’ If one knew nothing about ‘blue’ and went to a subjective search engine to learn more about the term, one would find nothing related to the color as we know it. What one would find instead is sites on cars, mountains, phones, and people. Google might respond that there is no site that provides information on ‘blue,’ which is why the results are merely sites that use the term. However, this response fails because there is a site with an article that provides more information on ‘blue.’

Another way that Google might respond to my arguments is by claiming that the user did not use the correct search terms to find the correct website. This response also fails, because the user never agreed to follow the rules of the subjective algorithm. Rather, the subjective algorithm has attempted to understand the user through its PageRank variable. The user is simply relying on the subjective algorithm to provide more information on ‘blue.’ Google fails to produce this information, so we should reject the claim that a subjective algorithm provides the most relevant results.

We’ve just seen how a subjective algorithm approach to search engines fails to produce relevant results. I will now argue that a new paradigm shift, through true mathematical statements, can produce relevant search results. First, the current weight placed on inbound links should be reduced. This dependence on other sites to place a rank on a page creates subjective search results that lead to irrelevancy. Second, the user, and not the algorithm, should have the final say as to what is relevant. Third, a voting system should be implemented that allows users, those who search, to decide the ‘rank’ of a site. Finally, the final rank should be determined by users and not by an algorithm that cannot fully understand the entire dynamic of web sites. A rank given by users, even if they rank differently from one another, is a more accurate vote than one given by another web page. The strongest objection to a paradigm shift would come from operators of the current subjective algorithm, arguing that it is not possible to have one hundred percent relevant results when dealing with such a large amount of information. This objection does not succeed because the subjective algorithm should not be the only tool used to display search results. It is obvious that the algorithm won’t be able to read the mind of the user, but he or she should not be subjected to the output of the algorithm alone.
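One way to picture the proposal above is a ranking that gives inbound links a reduced weight and lets user votes carry the rest. The sketch below is only an illustration under stated assumptions: the function name, the 0.3 link weight, the 0-to-1 vote scale, the neutral 0.5 score for pages with no votes, and the example URLs are all hypothetical.

```python
def blended_ranking(link_scores, user_votes, link_weight=0.3):
    """Order pages by a weighted mix of link-based score and average user vote.

    `link_scores` maps page -> link-based score (any non-negative scale).
    `user_votes` maps page -> list of user ratings between 0 and 1.
    """
    # Scale link scores into [0, 1]; fall back to 1.0 if every score is zero
    top = max(link_scores.values(), default=0.0) or 1.0
    combined = {}
    for page, link_score in link_scores.items():
        ratings = user_votes.get(page, [])
        # Assumption: pages nobody has voted on get a neutral 0.5
        vote_score = sum(ratings) / len(ratings) if ratings else 0.5
        combined[page] = link_weight * (link_score / top) + (1 - link_weight) * vote_score
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical search for 'blue': the car site has many more inbound links,
# but users who searched for the color voted for the color article
link_scores = {"cars.example/blue": 9.0, "color.example/blue": 2.0}
user_votes = {"cars.example/blue": [0, 0, 1], "color.example/blue": [1, 1, 1]}

print(blended_ranking(link_scores, user_votes))
print(blended_ranking(link_scores, user_votes, link_weight=1.0))
```

With the reduced link weight, the color article comes first. Setting `link_weight` to 1.0 reproduces a purely link-based ordering, and the car site returns to the top.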

From the statements above, the following questions should be asked and probed for answers about search engines and how they are to be structured, in order to create a rival theory. One might ask whether it is possible to have a relevant search engine, or whether it is even desirable to have one. One might also ask how much weight should be placed on the vote from other sites when determining the ranking, in importance, of a site. Finally, one might wonder how much say a user should have in determining the ranking and importance of a site. It is obvious that we are not yet at the point where we can give one hundred percent relevant results. That does not mean that we should give up on the idea. Although the answers to the above questions are not all known, I will leave them open to suggestion and progression. What we do know is that a subjective search engine algorithm is not adequate by itself to produce ‘relevant results.’ Rival theories are necessary before anyone can confidently claim that the current search engine theory, based on a subjective algorithm, produces accurate, relevant results.

A search engine is used to acquire information relevant to the search. Relevancy is important when dealing with information and, therefore, search engines should produce relevant results. It has been shown that a subjective algorithm produces irrelevant results and that relevant results are excluded due to the presence of irrelevant ones. We now know that a paradigm shift is required in the algorithm design to produce relevant search results through true mathematical statements and user selection. One might argue that it is unethical to have irrelevant search results, but that this is unavoidable in practice. I will leave this topic for another day and time.

This article is brought to you by Chad Ledford. Chad specializes in coupon codes.
