Getting Started

Installation

The best way to install bert-as-service is via pip. Note that the server and client can be installed separately, or even on different machines:

pip install -U bert-serving-server bert-serving-client

Warning

The server MUST be running on Python >= 3.5 with TensorFlow >= 1.10 (one-point-ten, not one-point-one). Again, the server does not support Python 2!

Note

The client can run on both Python 2 and 3.

Download a Pre-trained BERT Model

Download one of the models listed below, then uncompress the zip file into some folder, say /tmp/english_L-12_H-768_A-12/ (if you prefer to script the download, see the sketch after the list).

List of pretrained BERT models released by Google AI:

BERT-Base, Uncased: 12-layer, 768-hidden, 12-heads, 110M parameters
BERT-Large, Uncased: 24-layer, 1024-hidden, 16-heads, 340M parameters
BERT-Base, Cased: 12-layer, 768-hidden, 12-heads, 110M parameters
BERT-Large, Cased: 24-layer, 1024-hidden, 16-heads, 340M parameters
BERT-Base, Multilingual Cased (New): 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
BERT-Base, Multilingual Cased (Old): 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
BERT-Base, Chinese: Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters
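
If you prefer to script the download, below is a minimal Python sketch. The URL is the one Google AI published for BERT-Base, Uncased (verify it is still current); the zip unpacks into its own folder, whose name may differ from the example path above:

import urllib.request
import zipfile

# BERT-Base, Uncased as published by Google AI (verify the URL is still current)
url = 'https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip'
urllib.request.urlretrieve(url, '/tmp/bert_model.zip')
with zipfile.ZipFile('/tmp/bert_model.zip') as zf:
    zf.extractall('/tmp/')  # creates e.g. /tmp/uncased_L-12_H-768_A-12/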

Note

As an optional step, you can also fine-tune the model on your downstream task.

Start the BERT service

After installing the server, you should be able to use the bert-serving-start CLI as follows:

bert-serving-start -model_dir /tmp/english_L-12_H-768_A-12/ -num_worker=4

This will start a service with four workers, meaning that it can handle up to four concurrent requests. Additional concurrent requests will be queued by a load balancer.
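
You can also start the server from Python rather than the CLI; a minimal sketch, mirroring the CLI call above:

from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

# same arguments as the bert-serving-start call above
args = get_args_parser().parse_args(['-model_dir', '/tmp/english_L-12_H-768_A-12/',
                                     '-num_worker', '4'])
server = BertServer(args)
server.start()
# to stop it later: BertServer.shutdown(port=5555)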

Below is what the server looks like when it starts correctly:

[server-demo.gif: animated demo of the server starting up]

Start the BERT service in a Docker container

Alternatively, one can start the BERT service in a Docker container:

docker build -t bert-as-service -f ./docker/Dockerfile .
NUM_WORKER=1
PATH_MODEL=/PATH_TO/YOUR_MODEL/
docker run --runtime nvidia -dit -p 5555:5555 -p 5556:5556 -v $PATH_MODEL:/model -t bert-as-service $NUM_WORKER
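
Once the container is up, a quick sanity check from the host (a sketch assuming the port mappings above and bert-serving-client installed locally):

from bert_serving.client import BertClient

# 5555/5556 are the request/result ports mapped by the docker run command above
bc = BertClient(port=5555, port_out=5556, timeout=10000)  # timeout in milliseconds
print(bc.encode(['hello world']).shape)  # e.g. (1, 768) for BERT-Base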

Use Client to Get Sentence Encodings

Now you can encode sentences simply as follows:

from bert_serving.client import BertClient
bc = BertClient()  # connects to a server on localhost:5555 by default
bc.encode(['First do it', 'then do it right', 'then do it better'])

It will return an ndarray in which each row is the fixed-length representation of one sentence. You can also have it return a pure Python object of type List[List[float]].
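
A sketch using the client's output_fmt argument to switch between the two:

from bert_serving.client import BertClient

bc = BertClient(output_fmt='list')  # 'ndarray' (default) or 'list'
vecs = bc.encode(['First do it'])
print(type(vecs), len(vecs[0]))  # <class 'list'> 768 for BERT-Base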

As a feature of BERT, you can get the encoding of a pair of sentences by concatenating them with ||| (with whitespace before and after), e.g.

bc.encode(['First do it ||| then do it right'])

Below is what the server looks like while encoding:

[server-run-demo.gif: animated demo of the server while encoding]

Use BERT Service Remotely

One may also start the service on one (GPU) machine and call it from another (CPU) machine as follows:

# on another CPU machine
from bert_serving.client import BertClient
bc = BertClient(ip='xx.xx.xx.xx')  # ip address of the GPU machine
bc.encode(['First do it', 'then do it right', 'then do it better'])

Note

In this case you only need pip install -U bert-serving-client on the client machine; the server package is not required.
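
If you started the server with non-default -port or -port_out values, pass them to the client as well; a sketch (xx.xx.xx.xx is the placeholder from above):

from bert_serving.client import BertClient

# port/port_out must match the server's -port and -port_out (defaults: 5555/5556)
bc = BertClient(ip='xx.xx.xx.xx', port=5555, port_out=5556)
bc.encode(['First do it', 'then do it right', 'then do it better'])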

Want to learn more? Check out our tutorials below: