Tool interfaces

This module defines an interface for each tool or decision maker used in the searching process. This makes it easy to try new approaches or to tune one aspect of the search engine while keeping most of the code unchanged.

See the swisstext.cmd.searching.tools module for implementations.

class swisstext.cmd.searching.interfaces.IQueryBuilder

Bases: object

Preprocess the query before submitting it to a search engine. The builder can be used to prepare a query from a seed, for example by quoting words or inserting “AND” keywords.

prepare(query: str, **kwargs)

By default, does nothing.
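As an illustration of the kind of preprocessing described above, here is a minimal sketch of a builder that quotes each word and joins the words with “AND”. The IQueryBuilder class below is a local stand-in so the example runs on its own, and it assumes that "does nothing" means the default returns the query unchanged. QuotedAndQueryBuilder is a hypothetical name, not part of swisstext.

```python
class IQueryBuilder:
    """Local stand-in for swisstext.cmd.searching.interfaces.IQueryBuilder."""

    def prepare(self, query: str, **kwargs) -> str:
        # Assumption: the default implementation returns the query as is.
        return query


class QuotedAndQueryBuilder(IQueryBuilder):
    """Hypothetical builder: quote every word and join the words with AND."""

    def prepare(self, query: str, **kwargs) -> str:
        words = query.split()
        return " AND ".join(f'"{word}"' for word in words)


builder = QuotedAndQueryBuilder()
prepared = builder.prepare("grüezi mitenand")
print(prepared)
```

Because the searching code only depends on prepare(), a builder like this could presumably be swapped in without touching the rest of the pipeline.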

class swisstext.cmd.searching.interfaces.ISaver

Bases: abc.ABC

[ABSTRACT] The saver is responsible for persisting everything somewhere, such as a database, a file or the console.

class LinkStatus

Bases: enum.IntEnum

An enumeration.

BLACKLISTED = 2
EXISTS = 1
NOT_EXIST = 0

Test whether the URL already exists in the persistence layer. Returns false by default.

abstract save_seed(seed: swisstext.cmd.searching.data.Seed, was_used: bool)

[ABSTRACT] Should persist a seed and, if was_used is true, its associated results.

seed_exists(seed: str, **kwargs) → bool

Return whether a seed already exists in the backend.
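A saver that keeps everything in memory can be handy for tests. The sketch below covers only save_seed() and seed_exists(). Both Seed and ISaver are reduced local stand-ins so the example is self-contained: the Seed fields (query, new_links) and the False default of seed_exists() are assumptions, not the real swisstext definitions. InMemorySaver is a hypothetical name.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Seed:
    """Hypothetical stand-in for swisstext.cmd.searching.data.Seed."""
    query: str
    new_links: List[str] = field(default_factory=list)


class ISaver(ABC):
    """Reduced stand-in exposing only the two seed-related methods."""

    @abstractmethod
    def save_seed(self, seed: Seed, was_used: bool):
        ...

    def seed_exists(self, seed: str, **kwargs) -> bool:
        # Assumption: the default reports that nothing is stored.
        return False


class InMemorySaver(ISaver):
    """Hypothetical saver that persists seeds in a dictionary."""

    def __init__(self):
        self.seeds: Dict[str, List[str]] = {}

    def save_seed(self, seed: Seed, was_used: bool):
        # Keep the associated results only when the seed was actually used.
        self.seeds[seed.query] = list(seed.new_links) if was_used else []

    def seed_exists(self, seed: str, **kwargs) -> bool:
        return seed in self.seeds


saver = InMemorySaver()
saver.save_seed(Seed("grüezi mitenand", ["https://example.com/a"]), was_used=True)
saver.save_seed(Seed("unused seed", ["https://example.com/b"]), was_used=False)
print(saver.seed_exists("grüezi mitenand"), saver.seed_exists("never saved"))
```

Since save_seed() is abstract, instantiating the interface directly raises a TypeError, so every concrete saver has to implement it.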

class swisstext.cmd.searching.interfaces.ISearcher

Bases: abc.ABC

[ABSTRACT] This tool is the core of the pipeline: it is an interface to a real search engine.

abstract search(query) → Iterable[str]

[ABSTRACT] Should query a real search engine and return an iterator of results/URLs. The use of an iterator allows for lazy implementations, in case we are limited by an API quota.

top_results(query, max_results=10) → List

Use search() to return the top max_results results for a query.
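To show why a lazy search() matters under an API quota, the following sketch pairs a generator-based searcher with a top_results() built on itertools.islice. The ISearcher class is a local stand-in, and its top_results() body is an assumption about how the real method might work. FakeSearcher, its page_size parameter and the example.com URLs are invented for the illustration.

```python
from abc import ABC, abstractmethod
from itertools import islice
from typing import Iterable, Iterator, List


class ISearcher(ABC):
    """Local stand-in for swisstext.cmd.searching.interfaces.ISearcher."""

    @abstractmethod
    def search(self, query) -> Iterable[str]:
        ...

    def top_results(self, query, max_results=10) -> List[str]:
        # Assumption: consume only as many results as requested.
        return list(islice(self.search(query), max_results))


class FakeSearcher(ISearcher):
    """Hypothetical searcher that simulates a paginated search API."""

    def __init__(self, page_size: int = 3):
        self.page_size = page_size
        self.pages_fetched = 0

    def search(self, query) -> Iterator[str]:
        page = 0
        while True:
            # One simulated API call per page, made only when needed.
            self.pages_fetched += 1
            for i in range(self.page_size):
                yield f"https://example.com/{query}/{page * self.page_size + i}"
            page += 1


searcher = FakeSearcher(page_size=3)
urls = searcher.top_results("dialekt", max_results=4)
print(urls)
print(searcher.pages_fetched)
```

Asking for four results with three results per page triggers only two simulated page requests, even though the generator itself never ends.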