wikibaseintegrator.wbi_helpers

Functions and classes that can be used to interact with a Wikibase instance.

class wikibaseintegrator.wbi_helpers.BColors[source]

Bases: object

Default colors for pretty outputs.

BOLD = '\x1b[1m'
ENDC = '\x1b[0m'
FAIL = '\x1b[91m'
HEADER = '\x1b[95m'
OKBLUE = '\x1b[94m'
OKCYAN = '\x1b[96m'
OKGREEN = '\x1b[92m'
UNDERLINE = '\x1b[4m'
WARNING = '\x1b[93m'
__init__()
wikibaseintegrator.wbi_helpers.delete_page(title=None, pageid=None, reason=None, deletetalk=False, watchlist='preferences', watchlistexpiry=None, login=None, **kwargs)[source]

Delete a page

Parameters:
  • title (str | None) – Title of the page to delete. Cannot be used together with pageid.

  • pageid (int | None) – Page ID of the page to delete. Cannot be used together with title.

  • reason (str | None) – Reason for the deletion. If not set, an automatically generated reason will be used.

  • deletetalk (bool) – Delete the talk page, if it exists.

  • watchlist (str) – Unconditionally add or remove the page from the current user’s watchlist, use preferences (ignored for bot users) or do not change watch. One of the following values: nochange, preferences, unwatch, watch

  • watchlistexpiry (str | None) – Watchlist expiry timestamp. Omit this parameter entirely to leave the current expiry unchanged.

  • login (_Login | None) – A wbi_login.Login instance

  • kwargs (Any)
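
The rule that title and pageid cannot be combined can be checked before any request is sent. The following is a minimal standalone sketch, not the library's code: `build_delete_params` is a hypothetical helper, the key names are assumed to mirror the MediaWiki `action=delete` module, and requiring at least one of the two identifiers is likewise an assumption.

```python
from typing import Any, Optional

# Values listed for the watchlist parameter above.
ALLOWED_WATCHLIST = {"nochange", "preferences", "unwatch", "watch"}


def build_delete_params(title: Optional[str] = None, pageid: Optional[int] = None,
                        reason: Optional[str] = None, deletetalk: bool = False,
                        watchlist: str = "preferences") -> dict[str, Any]:
    """Hypothetical helper: assemble the parameters of a page deletion."""
    # title and pageid are mutually exclusive; one of them is assumed to be required.
    if (title is None) == (pageid is None):
        raise ValueError("Provide exactly one of 'title' or 'pageid'.")
    if watchlist not in ALLOWED_WATCHLIST:
        raise ValueError(f"Unsupported watchlist value: {watchlist!r}")

    params: dict[str, Any] = {"action": "delete", "watchlist": watchlist}
    if title is not None:
        params["title"] = title
    else:
        params["pageid"] = pageid
    if reason is not None:
        # Without a reason, an automatically generated one is used.
        params["reason"] = reason
    if deletetalk:
        params["deletetalk"] = True
    return params
```

Calling `build_delete_params(title="Sandbox", pageid=12)` raises `ValueError`, which surfaces the conflict locally instead of as an API error.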

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.download_entity_ttl(entity, wikibase_url=None, user_agent=None)[source]

Downloads the TTL (Terse RDF Triple Language) content of a specific entity from a Wikibase instance.

Parameters:
  • entity (str) – The identifier of the entity to download the TTL content for.

  • wikibase_url (str | None) – The base URL of the Wikibase instance. If None, the default URL from the configuration will be used.

  • user_agent (str | None) – The user agent string to be used in the HTTP request headers. If None, the default user agent from the configuration will be used if available.

Return type:

str

Returns:

The TTL content of the requested entity.

Raises:

HTTPError – If the HTTP request to retrieve the TTL content fails (status code other than 2xx).

Note: The function relies on a configuration setup (presumably a ‘config’ dictionary) containing at least the keys ‘WIKIBASE_URL’ and ‘USER_AGENT’ for the default Wikibase URL and user agent respectively.
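
The fallback from an explicit wikibase_url to a configured default can be sketched as follows. This is an illustration under stated assumptions: `entity_ttl_url` and `DEFAULT_WIKIBASE_URL` are hypothetical names, and the `Special:EntityData` path is one way Wikibase exposes Turtle output, not necessarily the URL this function requests. The HTTP request itself is left out.

```python
from typing import Optional

# Assumed stand-in for the configured default ('WIKIBASE_URL').
DEFAULT_WIKIBASE_URL = "https://www.wikidata.org"


def entity_ttl_url(entity: str, wikibase_url: Optional[str] = None) -> str:
    """Hypothetical helper: build a URL that serves an entity as Turtle."""
    # Fall back to the default when no base URL is passed.
    base = (wikibase_url or DEFAULT_WIKIBASE_URL).rstrip("/")
    return f"{base}/wiki/Special:EntityData/{entity}.ttl"
```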

wikibaseintegrator.wbi_helpers.edit_entity(data, id=None, type=None, baserevid=None, summary=None, clear=False, is_bot=False, tags=None, site=None, title=None, **kwargs)[source]

Creates a single new Wikibase entity and modifies it with serialised information.

Parameters:
  • data (dict) – The serialized object that is used as the data source. A newly created entity will be assigned an ‘id’.

  • id (str | None) – The identifier for the entity, including the prefix. Use either id or site and title together.

  • type (str | None) – Set this to the type of the entity to be created. One of the following values: form, item, lexeme, property, sense

  • baserevid (int | None) – The numeric identifier for the revision to base the modification on. This is used for detecting conflicts during save.

  • summary (str | None) – Summary for the edit. Will be prepended by an automatically generated comment.

  • clear (bool) – If set, the complete entity is emptied before proceeding. The entity will not be saved before it is filled with the “data”, possibly with parts excluded.

  • is_bot (bool) – Mark this edit as bot.

  • login – A login instance

  • tags (list[str] | None) – Change tags to apply to the revision.

  • site (str | None) – An identifier for the site on which the page resides. Use together with title to make a complete sitelink.

  • title (str | None) – Title of the page to associate. Use together with site to make a complete sitelink.

  • kwargs (Any) – More arguments for Python requests

Return type:

dict

Returns:

The answer from the Wikibase API
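
The interplay of id, type, site and title can be illustrated with a small standalone sketch. `build_edit_params` is a hypothetical helper, the key names are assumed to follow the `wbeditentity` API module, and the label structure follows the Wikibase JSON layout. The precedence rules encoded here are an assumption, not a description of the library's internals.

```python
import json
from typing import Any, Optional


def build_edit_params(data: dict[str, Any], entity_id: Optional[str] = None,
                      entity_type: Optional[str] = None, site: Optional[str] = None,
                      title: Optional[str] = None, clear: bool = False) -> dict[str, Any]:
    """Hypothetical helper: assemble parameters for an entity edit."""
    # Use either an ID, or site and title together.
    if entity_id is not None and (site is not None or title is not None):
        raise ValueError("Use either an ID or site and title, not both.")
    if (site is None) != (title is None):
        raise ValueError("site and title must be given together.")

    # The serialized entity is sent as a JSON string.
    params: dict[str, Any] = {"action": "wbeditentity", "data": json.dumps(data)}
    if entity_id is not None:
        params["id"] = entity_id
    elif site is not None:
        params["site"] = site
        params["title"] = title
    elif entity_type is not None:
        # No existing entity is referenced, so a new one of this type is created.
        params["new"] = entity_type
    else:
        raise ValueError("Nothing identifies the entity to edit or create.")
    if clear:
        params["clear"] = True
    return params


# A minimal serialized entity carrying one English label (illustrative value).
label_data = {"labels": {"en": {"language": "en", "value": "Example item"}}}
```

For instance, `build_edit_params(label_data, entity_type="item")` yields a payload that asks for a new item, while passing `entity_id="Q42"` targets an existing entity.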

wikibaseintegrator.wbi_helpers.execute_sparql_query(query, prefix=None, endpoint=None, user_agent=None, max_retries=1000, retry_after=60)[source]

Executes any SPARQL query against a SPARQL endpoint.

Parameters:
  • prefix (str | None) – The URI prefixes required for an endpoint, default is the Wikidata specific prefixes

  • query (str) – The actual SPARQL query string

  • endpoint (str | None) – The URL string for the SPARQL endpoint. Default is the URL for the Wikidata SPARQL endpoint

  • user_agent (str | None) – Set a user agent string for the HTTP header to let the Query Service know who you are.

  • max_retries (int) – The number of times this function should retry in case of header reports.

  • retry_after (int) – The number of seconds to wait after receiving an error code or when the Query Service is not reachable.

Return type:

dict[str, dict]

Returns:

The results of the query are returned in JSON format
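
Assuming the returned dictionary follows the standard SPARQL 1.1 JSON results layout (a `head` and a `results` key), the bindings can be flattened into plain rows. The sample data below is illustrative and not the output of a real query, and `flatten_bindings` is a hypothetical helper.

```python
from typing import Any

# Illustrative response in the SPARQL 1.1 JSON results layout.
sample_results: dict[str, dict[str, Any]] = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"},
                "itemLabel": {"type": "literal", "xml:lang": "en", "value": "Douglas Adams"},
            }
        ]
    },
}


def flatten_bindings(results: dict[str, dict[str, Any]]) -> list[dict[str, str]]:
    """Hypothetical helper: keep only the 'value' of every bound variable."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in results["results"]["bindings"]
    ]
```

Each row then maps a variable name to its string value, which is usually all a caller needs.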

wikibaseintegrator.wbi_helpers.format2wbi(entitytype, json_raw, allow_anonymous=True, wikibase_url=None, **kwargs)[source]

Return type:

BaseEntity

Parameters:
  • entitytype (str)

  • json_raw (str)

  • allow_anonymous (bool)

  • wikibase_url (str | None)

wikibaseintegrator.wbi_helpers.format_amount(amount)[source]

A formatting function mostly used for the Quantity datatype.

Parameters:

amount (int | str | float) – An int, float or str you want to pass to a Quantity value.

Return type:

str

Returns:

A correctly formatted amount string, following the Wikibase standard.
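
Wikibase quantity amounts are strings that carry an explicit sign. The sketch below shows one way such a normalisation could work. `format_amount_sketch` is a hypothetical stand-in written for illustration, and dropping a trailing `.0` is an assumption about the desired output rather than documented behaviour.

```python
from typing import Union


def format_amount_sketch(amount: Union[int, str, float]) -> str:
    """Hypothetical stand-in: return a signed amount string."""
    value = float(amount)
    # Whole numbers are rendered without a fractional part (assumption).
    if value % 1 == 0:
        amount = int(value)
    text = str(amount)
    # Non-negative amounts get an explicit '+' prefix.
    if not text.startswith("+") and value >= 0:
        text = "+" + text
    return text
```

Negative inputs keep their `-` sign, so only zero and positive amounts are modified.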

Performs a full-text search on the MediaWiki instance. This is an exception to the “only Wikibase-related functions” rule: WikibaseIntegrator otherwise focuses on Wikibase-only functions to avoid spreading out and covering all functions of MediaWiki.

Parameters:
  • search (str) – Search for page titles or content matching this value. You can use the search string to invoke special search features, depending on what the wiki’s search backend implements.

  • max_results (int) – How many total pages to return. The value must be between 1 and 500.

  • allow_anonymous (bool) – Allow anonymous interaction with the MediaWiki API. ‘True’ by default.

  • kwargs (Any) – Extra parameters for mediawiki_api_call_helper()

Return type:

list[dict[str, Any]]

Returns:

wikibaseintegrator.wbi_helpers.generate_entity_instances(entities, allow_anonymous=True, **kwargs)[source]

A method which allows for retrieval of a list of Wikidata entities. The method generates a list of tuples where the first value in the tuple is the entity’s ID, whereas the second is the new instance of a subclass of BaseEntity containing all the data of the entity. This is most useful for mass retrieval of entities.

Parameters:
  • entities (str | list[str]) – A list of IDs. Item, Property or Lexeme.

  • allow_anonymous (bool) – Allow anonymous interaction with the MediaWiki API. ‘True’ by default.

  • kwargs (Any)

Return type:

list[tuple[str, BaseEntity]]

Returns:

A list of tuples, first value in the tuple is the entity’s ID, second value is the instance of a subclass of BaseEntity with the corresponding entity data.

wikibaseintegrator.wbi_helpers.get_user_agent(user_agent=None)[source]

Return a user agent string suitable for interacting with the Wikibase instance.

Parameters:

user_agent (str | None) – An optional user-agent. If not provided, will generate a default user-agent.

Return type:

str

Returns:

A correctly formatted user agent.

wikibaseintegrator.wbi_helpers.lexeme_add_form(lexeme_id, data, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Adds Form to Lexeme

Parameters:
  • lexeme_id – ID of the Lexeme, e.g. L10

  • data – The serialized object that is used as the data source.

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)
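
The serialized object passed as data might look like the structure built below. This is a sketch under assumptions: the keys follow the Form layout used by the `wbladdform` API module, `build_form_data` is a hypothetical helper, and the grammatical feature IDs are placeholders.

```python
def build_form_data(representations: dict[str, str],
                    grammatical_features: list[str]) -> dict:
    """Hypothetical helper: build the serialized data for a new Form."""
    return {
        # One representation per language code.
        "representations": {
            lang: {"language": lang, "value": value}
            for lang, value in representations.items()
        },
        # Item IDs describing the grammatical features of the Form.
        "grammaticalFeatures": list(grammatical_features),
    }


# Placeholder values for illustration only.
form_data = build_form_data({"en": "colors"}, ["Q1", "Q2"])
```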

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.lexeme_add_sense(lexeme_id, data, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Adds a Sense to a Lexeme

Parameters:
  • lexeme_id – ID of the Lexeme, e.g. L10

  • data – JSON-encoded data for the Sense, i.e. its glosses

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.lexeme_edit_form(form_id, data, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Edits representations and grammatical features of a Form

Parameters:
  • form_id (str) – ID of the Form or the concept URI, e.g. L10-F2

  • data – The serialized object that is used as the data source.

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.lexeme_edit_sense(sense_id, data, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Edits glosses of a Sense

Parameters:
  • sense_id (str) – ID of the Sense or the concept URI, e.g. L10-S2

  • data – The serialized object that is used as the data source.

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)
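
Form and Sense IDs such as L10-F2 or L10-S2 embed the ID of their Lexeme. A minimal sketch for splitting them is shown below. `split_sub_entity_id` is a hypothetical helper that handles plain IDs only, not concept URIs.

```python
import re

# Lexeme ID, a hyphen, then a Form (F) or Sense (S) ID.
_SUB_ENTITY_ID = re.compile(r"^(L[1-9]\d*)-([FS][1-9]\d*)$")


def split_sub_entity_id(sub_id: str) -> tuple[str, str]:
    """Hypothetical helper: split 'L10-F2' into ('L10', 'F2')."""
    match = _SUB_ENTITY_ID.match(sub_id)
    if match is None:
        raise ValueError(f"Not a Form or Sense ID: {sub_id!r}")
    return match.group(1), match.group(2)
```

The first element is the Lexeme whose revision a baserevid would refer to.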

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.lexeme_remove_form(form_id, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Removes Form from Lexeme

Parameters:
  • form_id (str) – ID of the Form or the concept URI, e.g. L10-F2

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.lexeme_remove_sense(sense_id, baserevid=None, tags=None, is_bot=False, **kwargs)[source]

Removes Sense from Lexeme

Parameters:
  • sense_id (str) – ID of the Sense, e.g. L10-S20

  • baserevid (int | None) – Base Revision ID of the Lexeme, if edit conflict check is wanted.

  • tags (list[str] | None) – Change tags to apply to the revision.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict

Returns:

wikibaseintegrator.wbi_helpers.mediawiki_api_call(method, mediawiki_api_url=None, session=None, max_retries=100, retry_after=60, **kwargs)[source]

A function to call the MediaWiki API.

Parameters:
  • method (str) – ‘GET’ or ‘POST’

  • mediawiki_api_url (str | None)

  • session (Session | None) – If a session is passed, it will be used. Otherwise, a new requests session is created

  • max_retries (int) – If the API request fails due to rate limiting, maxlag, or read-only mode, retry up to max_retries times

  • retry_after (int) – Number of seconds to wait before retrying request (see max_retries)

  • kwargs (Any) – Any additional keyword arguments to pass to requests.request

Return type:

dict

Returns:

The data returned by the API as a dictionary
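
The interaction of max_retries and retry_after can be sketched as a simple loop. This is an illustration, not the library's implementation: `call_with_retries` and `is_transient_error` are hypothetical names, and the error codes are assumed from the MediaWiki API's usual error payloads. The sleep function is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Any, Callable

# Error codes treated as temporary (assumption).
RETRYABLE_CODES = {"maxlag", "readonly", "ratelimited"}


def is_transient_error(response: dict[str, Any]) -> bool:
    """Return True when the response reports a temporary failure."""
    error = response.get("error")
    return isinstance(error, dict) and error.get("code") in RETRYABLE_CODES


def call_with_retries(request: Callable[[], dict[str, Any]], max_retries: int = 100,
                      retry_after: int = 60,
                      sleep: Callable[[float], Any] = time.sleep) -> dict[str, Any]:
    """Hypothetical helper: repeat a request until it stops failing temporarily."""
    for _ in range(max_retries):
        response = request()
        if not is_transient_error(response):
            return response
        # Wait before the next attempt.
        sleep(retry_after)
    raise RuntimeError(f"Gave up after {max_retries} attempts.")
```

Here `request` is any zero-argument callable that performs one API call and returns the decoded JSON.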

wikibaseintegrator.wbi_helpers.mediawiki_api_call_helper(data, login=None, mediawiki_api_url=None, user_agent=None, allow_anonymous=False, max_retries=1000, retry_after=60, maxlag=5, is_bot=False, **kwargs)[source]

A simplified function to call the MediaWiki API. Pass the data related to the action you want to call as a dictionary; all common options are managed automatically.

Parameters:
  • data (dict[str, Any]) – A dictionary containing the JSON data to send to the API

  • login (_Login | None) – A wbi_login._Login instance

  • mediawiki_api_url (str | None) – The URL to the MediaWiki API (default Wikidata)

  • user_agent (str | None) – The user agent (Recommended for Wikimedia Foundation instances)

  • allow_anonymous (bool) – Allow an unidentified edit to the MediaWiki API (default False)

  • max_retries (int) – The maximum number of retries

  • retry_after (int) – The timeout between each retry

  • maxlag (int) – If applicable, the maximum lag allowed for the replication (a lower number reduces the load on the replicated database)

  • is_bot (bool) – Flag the edit as a bot

  • kwargs (Any) – Any additional keyword arguments to pass to requests.request

Return type:

dict

Returns:

The data returned by the API as a dictionary

wikibaseintegrator.wbi_helpers.merge_items(from_id, to_id, login=None, ignore_conflicts=None, is_bot=False, **kwargs)[source]

Merges two items.

Parameters:
  • from_id (str) – The ID to merge from. This parameter is required.

  • to_id (str) – The ID to merge to. This parameter is required.

  • login (_Login | None) – A wbi_login.Login instance

  • ignore_conflicts (list[str] | None) – List of elements of the item to ignore conflicts for. Can only contain values of “description”, “sitelink” and “statement”

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict
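
The restriction on ignore_conflicts can be validated up front. In the sketch below, `build_merge_params` is a hypothetical helper and the key names are assumed to mirror the `wbmergeitems` API module, which takes multiple values as a pipe-separated string.

```python
from typing import Any, Optional

# The only values ignore_conflicts may contain.
ALLOWED_CONFLICTS = ("description", "sitelink", "statement")


def build_merge_params(from_id: str, to_id: str,
                       ignore_conflicts: Optional[list[str]] = None) -> dict[str, Any]:
    """Hypothetical helper: assemble the parameters of an item merge."""
    params: dict[str, Any] = {"action": "wbmergeitems", "fromid": from_id, "toid": to_id}
    if ignore_conflicts:
        unknown = sorted(set(ignore_conflicts) - set(ALLOWED_CONFLICTS))
        if unknown:
            raise ValueError(f"Unsupported ignore_conflicts values: {unknown}")
        # Multiple values are joined with a pipe.
        params["ignoreconflicts"] = "|".join(ignore_conflicts)
    return params
```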

wikibaseintegrator.wbi_helpers.merge_lexemes(source, target, login=None, summary=None, is_bot=False, **kwargs)[source]

Merges two lexemes.

Parameters:
  • source (str) – The ID to merge from. This parameter is required.

  • target (str) – The ID to merge to. This parameter is required.

  • login (_Login | None) – A wbi_login.Login instance

  • summary (str | None) – Summary for the edit.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict

wikibaseintegrator.wbi_helpers.remove_claims(claim_id, summary=None, baserevid=None, is_bot=False, **kwargs)[source]

Delete a claim from an entity

Parameters:
  • claim_id (str) – One GUID or several (pipe-separated) GUIDs identifying the claims to be removed. All claims must belong to the same entity.

  • summary (str | None) – Summary for the edit. Will be prepended by an automatically generated comment.

  • baserevid (int | None) – The numeric identifier for the revision to base the modification on. This is used for detecting conflicts during save.

  • is_bot (bool) – Mark this edit as bot.

  • kwargs (Any)

Return type:

dict
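
Since all claims must belong to the same entity, several GUIDs can be checked and joined before the call. This sketch assumes the usual statement GUID shape, an entity ID followed by `$` and a UUID. `join_claim_guids` is a hypothetical helper and the GUIDs in the usage note are placeholders.

```python
from typing import Iterable


def join_claim_guids(guids: Iterable[str]) -> str:
    """Hypothetical helper: build a pipe-separated claim_id string."""
    guid_list = list(guids)
    if not guid_list:
        raise ValueError("At least one GUID is required.")
    # The part before '$' names the entity that owns the claim.
    owners = {guid.split("$", 1)[0].upper() for guid in guid_list}
    if len(owners) != 1:
        raise ValueError(f"Claims belong to different entities: {sorted(owners)}")
    return "|".join(guid_list)
```

For example, `join_claim_guids(["Q42$AAAA", "Q42$BBBB"])` returns a single string suitable for claim_id, while mixing `Q42$...` and `Q43$...` raises `ValueError`.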

wikibaseintegrator.wbi_helpers.search_entities(search_string, language=None, strict_language=False, search_type='item', max_results=50, dict_result=False, allow_anonymous=True, **kwargs)[source]

Performs a search for entities in the Wikibase instance using labels and aliases. You can have more information on the parameters in the MediaWiki API help (https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities)

Parameters:
  • search_string (str) – A string which should be searched for in the Wikibase instance (labels and aliases)

  • language (str | None) – The language in which to perform the search. This only affects how entities are selected. Default is ‘en’ from wbi_config. You can see the list of languages for Wikidata at https://www.wikidata.org/wiki/Help:Wikimedia_language_codes/lists/all (Use the WMF code)

  • strict_language (bool) – Whether to disable language fallback. Default is ‘False’.

  • search_type (str) – Search for this type of entity. One of the following values: form, item, lexeme, property, sense, mediainfo

  • max_results (int) – The maximum number of search results returned. The value must be between 0 and 50. Default is 50

  • dict_result (bool) – Return the results as a detailed dictionary instead of a list of IDs.

  • allow_anonymous (bool) – Allow anonymous interaction with the MediaWiki API. ‘True’ by default.

  • kwargs (Any)

Return type:

list[dict[str, Any]]
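
How the arguments could map onto a `wbsearchentities` request is sketched below. `build_search_params` is a hypothetical helper, the key names are assumed from the API help page linked above, and the `'en'` default stands in for the value taken from wbi_config.

```python
from typing import Any

# Entity types accepted by search_type.
SEARCH_TYPES = {"form", "item", "lexeme", "property", "sense", "mediainfo"}


def build_search_params(search_string: str, language: str = "en",
                        strict_language: bool = False, search_type: str = "item",
                        max_results: int = 50) -> dict[str, Any]:
    """Hypothetical helper: assemble the parameters of an entity search."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"Unsupported search_type: {search_type!r}")
    if not 0 <= max_results <= 50:
        raise ValueError("max_results must be between 0 and 50.")

    params: dict[str, Any] = {
        "action": "wbsearchentities",
        "search": search_string,
        "language": language,
        "type": search_type,
        "limit": max_results,
        "format": "json",
    }
    if strict_language:
        # Disables language fallback.
        params["strictlanguage"] = True
    return params
```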