FAQ

How does indexing work at infotiger?

Text search engine

At the moment, infotiger is a text-only search engine covering two languages (English and German) in two separate indexes.

Pre-processing

Before text crawled from websites is indexed, it goes through a few pre-processing steps. The same pre-processing is applied to every search query before the query is sent to the index.

Tokenizing

In a first step, the "words" in a text or search query are separated, usually at characters such as " " or ".". For example, "de.wikipedia.org" would be split into the three words "de", "wikipedia", and "org".
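
The splitting step can be illustrated with a minimal Python sketch. It assumes that every character that is not a letter, digit, or underscore acts as a separator; the exact separator set used by infotiger is not documented here, and the function name tokenize is hypothetical.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into tokens at every run of non-word characters."""
    # Assumption: anything outside \w (letters, digits, underscore) is a separator.
    return re.findall(r"\w+", text)

print(tokenize("de.wikipedia.org"))  # ['de', 'wikipedia', 'org']
```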

Stopword removal

After tokenizing, the (language-dependent) stopwords are removed, as they carry very little information. For example, "in", "have", and "do" are not indexed and are not searchable, but they still appear in the results and snippets.
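
As a rough illustration, stopword removal can be sketched as a simple filter over the token list. The stopword set below is a deliberately tiny, hypothetical example; the real lists are language dependent and are not published here.

```python
# Hypothetical, deliberately tiny English stopword list for illustration only.
STOPWORDS_EN = frozenset({"in", "have", "do"})

def remove_stopwords(tokens: list[str], stopwords: frozenset = STOPWORDS_EN) -> list[str]:
    """Drop every token that appears in the stopword list (case-insensitive)."""
    return [tok for tok in tokens if tok.lower() not in stopwords]

print(remove_stopwords(["do", "fish", "have", "fins", "in", "winter"]))
# ['fish', 'fins', 'winter']
```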

Stemming

Finally, stemming algorithms are applied, so e.g. "fishing", "fished", and "fisher" would all be stemmed to "fish".
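
To make the idea concrete, here is a toy suffix-stripping stemmer in Python. It is not the algorithm infotiger uses (which is not named here) and is far cruder than established stemmers; the suffix list and the minimum stem length are assumptions chosen only to reproduce the example above.

```python
# Assumed suffixes, tried in this order; real stemmers use much richer rule sets.
SUFFIXES = ("ing", "ed", "er")

def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([toy_stem(w) for w in ["fishing", "fished", "fisher", "fish"]])
# ['fish', 'fish', 'fish', 'fish']
```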

Similarity index

Besides the classical

Is there any extended search syntax?

The art of phrasing queries :)

Boolean search

By default, search terms are treated as "OR", so

Occam's razor

is identical to:

Occam's OR razor

If you want all of your search terms to appear in a page, you could try:

Occam's AND razor

or even:

(Occam's AND razor) OR (Ockhams AND Rasiermesser)
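
Conceptually, OR and AND correspond to the union and intersection of the sets of pages that contain each term. The following Python sketch shows this on a made-up inverted index; the index contents, page numbers, and function names are hypothetical and say nothing about how infotiger evaluates queries internally.

```python
# Hypothetical inverted index: pre-processed term -> set of page ids containing it.
index = {
    "occam": {1, 2},
    "razor": {2, 3},
    "ockhams": {4},
    "rasiermesser": {4, 5},
}

def match_any(*terms: str) -> set[int]:
    """OR: pages that contain at least one of the terms."""
    return set().union(*(index.get(t, set()) for t in terms))

def match_all(*terms: str) -> set[int]:
    """AND: pages that contain every term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

print(match_any("occam", "razor"))   # {1, 2, 3}
print(match_all("occam", "razor"))   # {2}
# Nested query: (occam AND razor) OR (ockhams AND rasiermesser)
print(match_all("occam", "razor") | match_all("ockhams", "rasiermesser"))  # {2, 4}
```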

Search for full phrases

Surround your phrase with quotation marks ("):

"Latin lex parsimoniae"

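A phrase query differs from a plain AND query in that the words must appear next to each other and in the given order. A minimal Python sketch of that check on already tokenized text follows; the function name contains_phrase is hypothetical, and a real index would typically store word positions rather than scan the text.

```python
def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    """Return True if the phrase tokens occur consecutively and in order."""
    n = len(phrase)
    if n == 0:
        return False
    # Slide a window of the phrase length over the page tokens.
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

page = "the latin lex parsimoniae means law of parsimony".split()
print(contains_phrase(page, ["latin", "lex", "parsimoniae"]))  # True
print(contains_phrase(page, ["lex", "latin", "parsimoniae"]))  # False
```
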
Site search

You may restrict your search to a particular site:

site:de.wikipedia.org

or even search within that site:

site:de.wikipedia.org AND "Ockhams razor"
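
A site restriction can be thought of as a filter on the host name of each result URL. The Python sketch below assumes an exact host match; whether infotiger also matches subdomains of the given site is not stated here, and the function name matches_site is hypothetical.

```python
from urllib.parse import urlsplit

def matches_site(url: str, site: str) -> bool:
    """Return True if the URL's host name equals the requested site."""
    # Assumption: exact host comparison; urlsplit already lowercases the host name.
    host = urlsplit(url).hostname or ""
    return host == site.lower()

print(matches_site("https://de.wikipedia.org/wiki/Ockhams_Rasiermesser", "de.wikipedia.org"))  # True
print(matches_site("https://en.wikipedia.org/wiki/Occam%27s_razor", "de.wikipedia.org"))       # False
```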

And how can I filter results even more?

With infotiger you have the possibility to narrow down the results of your search in a variety of ways.

Filter by language

As our search index is language dependent, you have to decide which index to search. The search language is preset to the language your browser uses for displaying web pages, which in most cases corresponds to your national language. If you want to search in our English or German index specifically, set the language for the search query accordingly in the drop-down menu.

Filter by publication date

If a web page has a valid publication date, it is displayed together with the search result. To limit the result set to a period of time, use the corresponding drop-down menu below the input box. Please note: pages without a publication date are not displayed when this filter is applied.
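
The effect of the date filter, including the exclusion of undated pages, can be sketched in Python as follows. The result list, URLs, and function name are hypothetical examples.

```python
from datetime import date

# Hypothetical results: (URL, publication date, or None if no valid date was found).
results = [
    ("https://example.org/a", date(2023, 5, 1)),
    ("https://example.org/b", None),
    ("https://example.org/c", date(2021, 1, 15)),
]

def filter_by_date(results, start: date, end: date):
    """Keep only results whose publication date lies within [start, end]."""
    return [
        (url, published)
        for url, published in results
        # Pages without a publication date are dropped when the filter is active.
        if published is not None and start <= published <= end
    ]

recent = filter_by_date(results, date(2023, 1, 1), date(2023, 12, 31))
print([url for url, _ in recent])  # ['https://example.org/a']
```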

Filter by TigeRank

TigeRank is a ranking for web pages that assigns higher ranks to popular pages. The procedure is based on the well-known PageRank algorithm but is not identical to it. You can, for example, restrict the results to the top 5%, so that only results from pages that TigeRank rates among the highest 5% are displayed.
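
A percentile cut-off of this kind can be sketched in Python as below. The scores are made up, and the choice to round the number of kept pages up is an assumption; how TigeRank scores are computed and how infotiger applies the threshold is not described here.

```python
import math

def top_percent(scores: dict[str, float], percent: int = 5) -> set[str]:
    """Return the pages whose score is among the highest `percent` percent."""
    if not scores:
        return set()
    # Assumption: round the number of kept pages up, so at least one page remains.
    keep = math.ceil(len(scores) * percent / 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])

# Hypothetical scores for 40 pages: page1 has the lowest score, page40 the highest.
scores = {f"page{i}": float(i) for i in range(1, 41)}
print(sorted(top_percent(scores)))  # ['page39', 'page40']
```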

What about the similarity search?

Similarity search

Where can I submit my URL for indexing?

You may submit your URL or site to the infotiger index at our add url page. Currently, only pages in English or German have a chance of making their way into the index.

How do I know that infotiger is visiting my website?

Infotiger visits web pages to index them. If you run a website yourself, read more about our web crawler here.

Is infotiger reachable via the Tor (.onion) network?

Yes, it is: infotiger4xywbfq45mvd5drh43jpqeurakg2ya7gqwvjf2bbwnixzqd.onion