Systems and methods are disclosed for improving search quality. Search queries are expanded using a variety of linguistic techniques. For example, the words in a query can be supplemented with related words obtained from a database of compound words, inflectional forms, and/or orthographic variations. The expanded queries can be used to perform searches for responsive documents. A document index can be expanded using similar techniques.
Systems and methods for direct navigation to and/or highlighting a specific portion of a target document such as query-relevant portion of the document are disclosed. The method may include generating a search result link to a search result document and generating an instruction to a client document browser to navigate directly to an intra-document portion related to the query within the search result document. The search result may include a snippet extracted from the search result document such that the instruction causes navigation directly to at least a portion of the snippet. The instruction may be an artificial anchor undefined in the search result document, e.g., designated by a preassigned artificial anchor designator. The client browser may have an artificial anchor module installed to execute the instruction to navigate directly to and optionally highlight the intra-document portion within the target document in response to the document link being selected.
Systems and methods for generation of hyperlinks and anchor text from data such as reference text in HTML and in non-HTML documents are disclosed. The method generally includes locating a text reference in a source document, searching using a search engine for a target document relating to the text reference, computing anchor text from the text reference, generating a hyperlink to the target document, and associating the hyperlink with the computed anchor text. The locating and/or computing may be based on a respective statistical model of text formatting and/or lexical cues. The text reference may be parsed into pieces such that the searching, computing, generating, and associating are performed for each piece of text. The source document may be an HTML or non-HTML document. The text reference may be a reference to, for example, a paper, article, company, institution, product, search engine, image, object, and geographical location.