Getting Started¶
Installation¶
The best way to install bert-as-service is via pip. Note that the server and client can be installed separately, or even on different machines:
pip install -U bert-serving-server bert-serving-client
Warning
The server MUST be running on Python >= 3.5 with TensorFlow >= 1.10 (one-point-ten). Again, the server does not support Python 2!
Note
The client can run on both Python 2 and 3.
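The two version requirements above can be checked programmatically; a minimal sketch using only the standard library (the helper names are illustrative, not part of bert-as-service):

```python
import sys

def server_python_ok(version_info=sys.version_info):
    # The server requires Python >= 3.5; only the client runs on Python 2.
    return tuple(version_info[:2]) >= (3, 5)

def server_tf_ok(tf_version):
    # The server requires TensorFlow >= 1.10 ("one-point-ten", not 1.1):
    # compare (major, minor) numerically, not as strings.
    major, minor = (int(x) for x in tf_version.split(".")[:2])
    return (major, minor) >= (1, 10)

print(server_python_ok(), server_tf_ok("1.10.0"))
```

Comparing version components numerically matters here: a naive string comparison would rank "1.9" above "1.10".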
Download a Pre-trained BERT Model¶
Download a model listed below, then uncompress the zip file into some folder, say /tmp/english_L-12_H-768_A-12/
List of pretrained BERT models released by Google AI:

| Model | Details |
| ----- | ------- |
| BERT-Base, Uncased | 12-layer, 768-hidden, 12-heads, 110M parameters |
| BERT-Large, Uncased | 24-layer, 1024-hidden, 16-heads, 340M parameters |
| BERT-Base, Cased | 12-layer, 768-hidden, 12-heads, 110M parameters |
| BERT-Large, Cased | 24-layer, 1024-hidden, 16-heads, 340M parameters |
| BERT-Base, Multilingual Cased (New) | 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters |
| BERT-Base, Multilingual Cased (Old) | 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters |
| BERT-Base, Chinese | Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters |
Note
As an optional step, you can also fine-tune the model on your downstream task.
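Before starting the server, it can help to verify that the uncompressed folder contains the configuration and vocabulary files that ship in Google's pretrained zips (the bert_model.ckpt checkpoint files are expected as well). A minimal sketch; the helper name is illustrative, not part of bert-as-service:

```python
import os

# File names as shipped in Google's pretrained BERT zips; the checkpoint
# files (bert_model.ckpt.*) are also required but their suffixes vary.
REQUIRED_FILES = ("bert_config.json", "vocab.txt")

def missing_model_files(model_dir):
    """Return the required files that are absent from model_dir."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(model_dir, name))]
```

If this returns a non-empty list for your folder, the zip was probably uncompressed into the wrong place.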
Start the BERT service¶
After installing the server, you should be able to start it with the bert-serving-start CLI as follows:
bert-serving-start -model_dir /tmp/english_L-12_H-768_A-12/ -num_worker=4
This will start a service with four workers, meaning that it can handle up to four concurrent requests. More concurrent requests will be queued in a load balancer.
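The scheduling behaviour described above can be illustrated with a fixed-size thread pool from Python's standard library; this only mimics the queueing idea, it is not how the server is implemented:

```python
from concurrent.futures import ThreadPoolExecutor

def handle(request_id):
    # Stand-in for encoding one client request.
    return "encoded-%d" % request_id

# Like -num_worker=4: at most four requests are processed at once;
# the remaining six wait in the pool's internal queue.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, range(10)))

print(results[:2])
```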
Start the BERT service in a Docker container¶
Alternatively, one can start the BERT service in a Docker container:
docker build -t bert-as-service -f ./docker/Dockerfile .
NUM_WORKER=1
PATH_MODEL=/PATH_TO/_YOUR_MODEL/
docker run --runtime nvidia -dit -p 5555:5555 -p 5556:5556 -v $PATH_MODEL:/model -t bert-as-service $NUM_WORKER
Use the Client to Get Sentence Encodings¶
Now you can encode sentences simply as follows:
from bert_serving.client import BertClient
bc = BertClient()
bc.encode(['First do it', 'then do it right', 'then do it better'])
It will return an ndarray, in which each row is the fixed-length representation of a sentence. You can also have it return a pure Python object of type List[List[float]].
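To make the two return types concrete: for BERT-Base the hidden size is 768, so encoding three sentences yields a 3×768 array. A sketch with NumPy standing in for the real client output:

```python
import numpy as np

# Stand-in for BertClient.encode() on three sentences with BERT-Base.
vec = np.random.rand(3, 768).astype(np.float32)

as_list = vec.tolist()  # the pure-Python List[List[float]] form
print(vec.shape, len(as_list), len(as_list[0]))
```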
As a feature of BERT, you may get the encoding of a pair of sentences by concatenating them with ||| (with a space on each side), e.g.
bc.encode(['First do it ||| then do it right'])
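When encoding many pairs, a tiny helper keeps the separator consistent; join_pair is an illustrative name, not part of the client API:

```python
def join_pair(first, second):
    # ' ||| ' (with a space on each side) marks the boundary between
    # the two sentences of a pair in the string passed to encode().
    return "%s ||| %s" % (first, second)

print(join_pair("First do it", "then do it right"))
```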
Use BERT Service Remotely¶
One may also start the service on one (GPU) machine and call it from another (CPU) machine as follows:
# on another CPU machine
from bert_serving.client import BertClient
bc = BertClient(ip='xx.xx.xx.xx') # ip address of the GPU machine
bc.encode(['First do it', 'then do it right', 'then do it better'])
Note
You only need pip install -U bert-serving-client in this case; the server side is not required.
Want to learn more? Check out our tutorials below: