Using BertClient

Installation

The best way to install the client is via pip. Note that the client can be installed separately from BertServer or even on a different machine:

pip install bert-serving-client

Note

The client runs on both Python 2 and 3.

Client-side API

class client.BertClient(ip='localhost', port=5555, port_out=5556, output_fmt='ndarray', show_server_config=False, identity=None, check_version=True, check_length=True, check_token_info=True, ignore_all_checks=False, timeout=-1)[source]

Bases: object

A client object connected to a BertServer

Create a BertClient that connects to a BertServer. Note that the server must be ready at the moment you call this function. If you are not sure whether the server is ready, set ignore_all_checks=True

You can also use it as a context manager:

with BertClient() as bc:
    bc.encode(...)

# bc is automatically closed out of the context
Parameters:
  • ip (str) – the IP address of the server
  • port (int) – port for pushing data from client to server; must be consistent with the server-side config
  • port_out (int) – port for publishing results from server to client; must be consistent with the server-side config
  • output_fmt (str) – the output format of the sentence embeddings, either a numpy array or a Python List[List[float]] (ndarray/list)
  • show_server_config (bool) – whether to show the server config when first connected
  • identity (str) – the UUID of this client
  • check_version (bool) – check whether the server has the same version as the client; raises AttributeError if they differ
  • check_length (bool) – check whether the sentence length exceeds the server's max_seq_len before sending
  • check_token_info (bool) – check whether the server can return tokenization results
  • ignore_all_checks (bool) – ignore all checks; set it to True if you are not sure whether the server is ready when constructing BertClient()
  • timeout (int) – the timeout (in milliseconds) for receive operations on the client; -1 means no timeout, i.e. wait until the result returns
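The difference between the two output_fmt settings can be illustrated without a live server. The 768-dimensional shape below is only an assumption for illustration; the actual embedding size depends on the BERT model the server loads:

```python
import numpy as np

# output_fmt='ndarray' yields a numpy array of shape
# (num_sentences, embedding_dim); output_fmt='list' returns the same
# values as a plain Python list of lists.
as_ndarray = np.zeros((3, 768))   # what output_fmt='ndarray' looks like
as_list = as_ndarray.tolist()     # what output_fmt='list' looks like
```

The ndarray form is usually preferable for downstream numpy work; the list form is handy when the result must be JSON-serializable.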
close()[source]

Gracefully close all connections of the client. If you are using BertClient as a context manager, this is not necessary.

encode(texts, blocking=True, is_tokenized=False, show_tokens=False)[source]

Encode a list of strings to a list of vectors

texts should be a list of strings, each representing a sentence. If is_tokenized is set to True, then texts should be a list[list[str]]: the outer list represents the sentences and each inner list the tokens of a sentence. Note that if blocking is set to False, you need to fetch the result manually afterwards.

with BertClient() as bc:
    # encode untokenized sentences
    bc.encode(['First do it',
               'then do it right',
               'then do it better'])

    # encode tokenized sentences
    bc.encode([['First', 'do', 'it'],
               ['then', 'do', 'it', 'right'],
               ['then', 'do', 'it', 'better']], is_tokenized=True)
Parameters:
  • is_tokenized (bool) – whether the input texts are already tokenized
  • show_tokens (bool) – whether to include the tokenization result from the server. If True, the function returns a tuple
  • texts (list[str] or list[list[str]]) – list of sentences to be encoded. A larger list gives better efficiency.
  • blocking (bool) – wait until the encoded result is returned from the server. If False, returns immediately.
  • timeout (bool) – throw a timeout error when the encoding takes longer than the predefined timeout.
Returns:

encoded sentence/token-level embeddings, rows correspond to sentences

Return type:

numpy.ndarray or list[list[float]]

encode_async(batch_generator, max_num_batch=None, delay=0.1, **kwargs)[source]

Async encode batches from a generator

Parameters:
  • delay – delay in seconds before running the fetcher
  • batch_generator – a generator that yields a list[str], or a list[list[str]] (for is_tokenized=True), on each iteration
  • max_num_batch – stop after encoding this number of batches
  • **kwargs – for the remaining parameters, see encode()
Returns:

a generator that yields encoded vectors in ndarray, where the request id can be used to determine the order

Return type:

Iterator[tuple(int, numpy.ndarray)]
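encode_async expects a generator of batches. A minimal batch generator is sketched below; only the generator itself runs here, and the commented-out consumption loop assumes a running server (the batch size and sentences are illustrative):

```python
# Yield list[str] batches of a fixed size, suitable for encode_async.
def batch_generator(sentences, batch_size=2):
    for i in range(0, len(sentences), batch_size):
        yield sentences[i:i + batch_size]

sentences = ['First do it', 'then do it right', 'then do it better']
batches = list(batch_generator(sentences))

# With a running server, it would be consumed like:
# with BertClient() as bc:
#     for req_id, vecs in bc.encode_async(batch_generator(sentences),
#                                         max_num_batch=10):
#         print(req_id, vecs.shape)
```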

fetch(delay=0.0)[source]

Fetch the encoded vectors from server, use it with encode(blocking=False)

Use it after encode(texts, blocking=False). If there are no pending requests, it will return None. Note that fetch() does not preserve the order of the requests! Say you have two non-blocking requests, R1 and R2, where R1 has 256 samples and R2 has one sample; R2 may well return first.

To fetch all results in the original sending order, use fetch_all(sort=True)

Parameters:delay (float) – delay in seconds before running the fetcher
Returns:a generator that yields (request id, encoded vectors) tuples, where the request id can be used to determine the order
Return type:Iterator[tuple(int, numpy.ndarray)]
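Since fetch() yields results out of order, the sending order can also be restored by hand by sorting on the request id. The ids and shapes below are made up for illustration:

```python
import numpy as np

# Simulated out-of-order results as fetch() might yield them:
# request 2 (one sample) arrived before request 1 (three samples).
results = [(2, np.zeros((1, 768))), (1, np.zeros((3, 768)))]

# Sorting by request id restores the sending order.
ordered = sorted(results, key=lambda r: r[0])
# ordered now lists request 1 first, then request 2
```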
fetch_all(sort=True, concat=False)[source]

Fetch all encoded vectors from server, use it with encode(blocking=False)

Use it after encode(texts, blocking=False). If there are no pending requests, it will return None.

Parameters:
  • sort (bool) – sort results by their request ids; set it to True if you want to preserve the sending order
  • concat (bool) – concatenate all results into one ndarray
Returns:

encoded sentence/token-level embeddings in sending order

Return type:

numpy.ndarray or list[list[float]]
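What concat=True does can be sketched with plain numpy. Conceptually, fetch_all collects one embedding array per pending request and stacks them along the row axis; the shapes here are illustrative:

```python
import numpy as np

# One embedding array per pending request, already in sending order.
per_request = [np.zeros((3, 768)), np.zeros((1, 768))]

# concat=True stacks them into a single ndarray of all rows.
concatenated = np.concatenate(per_request, axis=0)
# concatenated.shape is (4, 768)
```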

server_status
Get the current status of the server connected to this client
Returns:a dictionary containing the current status of the server connected to this client
Return type:dict[str, str]
status
Get the status of this BertClient instance
Return type:dict[str, str]
Returns:a dictionary containing the status of this BertClient instance
class client.ConcurrentBertClient(max_concurrency=10, **kwargs)[source]

Bases: client.BertClient

A thread-safe client object connected to a BertServer

Create a BertClient that connects to a BertServer. Note that the server must be ready at the moment you call this function. If you are not sure whether the server is ready, set check_version=False and check_length=False

Parameters:max_concurrency (int) – the maximum number of concurrent connections allowed
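The usage pattern is the same as for BertClient, except that encode() may safely be called from multiple threads. The sketch below stubs out the encode call, since a real ConcurrentBertClient requires a live server; only the threading pattern is the point:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for ConcurrentBertClient.encode(); a real client
# would be created once with ConcurrentBertClient(max_concurrency=10)
# and shared across all the worker threads.
def encode_stub(texts):
    return [[0.0] * 4 for _ in texts]  # fake fixed-size embeddings

batches = [['First do it'], ['then do it right'], ['then do it better']]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(encode_stub, batches))
```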
close()[source]

Gracefully close all connections of the client. If you are using BertClient as a context manager, this is not necessary.

encode(**kwargs)[source]

Encode a list of strings to a list of vectors

texts should be a list of strings, each representing a sentence. If is_tokenized is set to True, then texts should be a list[list[str]]: the outer list represents the sentences and each inner list the tokens of a sentence. Note that if blocking is set to False, you need to fetch the result manually afterwards.

with BertClient() as bc:
    # encode untokenized sentences
    bc.encode(['First do it',
               'then do it right',
               'then do it better'])

    # encode tokenized sentences
    bc.encode([['First', 'do', 'it'],
               ['then', 'do', 'it', 'right'],
               ['then', 'do', 'it', 'better']], is_tokenized=True)
Parameters:
  • is_tokenized (bool) – whether the input texts are already tokenized
  • show_tokens (bool) – whether to include the tokenization result from the server. If True, the function returns a tuple
  • texts (list[str] or list[list[str]]) – list of sentences to be encoded. A larger list gives better efficiency.
  • blocking (bool) – wait until the encoded result is returned from the server. If False, returns immediately.
  • timeout (bool) – throw a timeout error when the encoding takes longer than the predefined timeout.
Returns:

encoded sentence/token-level embeddings, rows correspond to sentences

Return type:

numpy.ndarray or list[list[float]]

encode_async(**kwargs)[source]

Async encode batches from a generator

Parameters:
  • delay – delay in seconds before running the fetcher
  • batch_generator – a generator that yields a list[str], or a list[list[str]] (for is_tokenized=True), on each iteration
  • max_num_batch – stop after encoding this number of batches
  • **kwargs – for the remaining parameters, see encode()
Returns:

a generator that yields encoded vectors in ndarray, where the request id can be used to determine the order

Return type:

Iterator[tuple(int, numpy.ndarray)]

fetch(**kwargs)[source]

Fetch the encoded vectors from server, use it with encode(blocking=False)

Use it after encode(texts, blocking=False). If there are no pending requests, it will return None. Note that fetch() does not preserve the order of the requests! Say you have two non-blocking requests, R1 and R2, where R1 has 256 samples and R2 has one sample; R2 may well return first.

To fetch all results in the original sending order, use fetch_all(sort=True)

Parameters:delay (float) – delay in seconds before running the fetcher
Returns:a generator that yields (request id, encoded vectors) tuples, where the request id can be used to determine the order
Return type:Iterator[tuple(int, numpy.ndarray)]
fetch_all(**kwargs)[source]

Fetch all encoded vectors from server, use it with encode(blocking=False)

Use it after encode(texts, blocking=False). If there are no pending requests, it will return None.

Parameters:
  • sort (bool) – sort results by their request ids; set it to True if you want to preserve the sending order
  • concat (bool) – concatenate all results into one ndarray
Returns:

encoded sentence/token-level embeddings in sending order

Return type:

numpy.ndarray or list[list[float]]

server_status

Get the current status of the server connected to this client

Returns:a dictionary containing the current status of the server connected to this client
Return type:dict[str, str]
status

Get the status of this BertClient instance

Return type:dict[str, str]
Returns:a dictionary containing the status of this BertClient instance