Like other notes on this site, this note contains only a few noteworthy points of the topic.
👉 My Github Repo for this note: dinhanhthi/google-vertex-ai
- You should choose the same location/region for all services (google project, notebook instances,...). 👉 Check section “Choose the same locations”.
- gcloud ai references (for Vertex AI).
- Always use the Logging service to track down problems.
- When making models, especially for serving on prod, don't forget to use logging services.
- When creating a new notebook instance, consider choosing a larger size for the "boot disk" (the default 100GB is not enough).
- If you run GCP command lines in Workbench, you don't have to provide credentials to connect to GCP. They are passed automatically.
- What is Vertex AI? -- Official video.
- Google Cloud Vertex AI Samples -- Official github repository.
- Vertex AI Documentation AIO: Samples - References -- Guides.
If you are going to build images with docker inside the virtual machine, choose more boot disk space (the default is 100GB, but you should choose more than that). If you want to change the disk size later, go to Compute Engine / Disks (ref).

Remember to shut down the notebook when you are not using it!!
👉 Note: Google Colab
👉 Official doc. Below are some notable points.
# Start instance
gcloud compute instances start thi-managed-notebook --zone=europe-west1-d

# Stop instance
gcloud compute instances stop thi-managed-notebook --zone=europe-west1-d
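The start and stop commands above differ only in the action, so they are easy to wrap when scripting. Below is a minimal sketch: the helper names `instance_command` and `run_instance_command` are hypothetical, and actually running the command assumes the gcloud CLI is installed and authenticated.

```python
import subprocess


def instance_command(action, name, zone):
    """Build the gcloud argument list for starting or stopping an instance."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action!r}")
    return ["gcloud", "compute", "instances", action, name, f"--zone={zone}"]


def run_instance_command(action, name, zone):
    """Run the command (assumes gcloud is on PATH and already authenticated)."""
    subprocess.run(instance_command(action, name, zone), check=True)


if __name__ == "__main__":
    # Only print the command here, so the sketch is safe to try anywhere.
    print(" ".join(instance_command("stop", "thi-managed-notebook", "europe-west1-d")))
```

Calling `run_instance_command("stop", ...)` at the end of a working session is one way to avoid forgetting to shut the notebook down.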
Inside the notebook, open the Terminal tab. Then install the GitHub CLI (ref),

curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
  | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
sudo apt update
sudo apt install gh

Log in to gh,

gh auth login

Then follow the prompts.
JupyterLab runs on the Vertex notebook at port 8080. You have 2 options to open it on your local machine:

You have to use a user-managed notebook! A managed notebook doesn't allow you to use SSH (officially). If you want to connect via SSH to a managed notebook, read the section “SSH to managed notebook” in this note.

First, connect using gcloud

👉 Note: Google Cloud CLI.
👉 Note: SSH
You have to re-install the GPU driver on the virtual machine. Check this official instruction.
When creating a new notebook, make sure to enable terminal for this notebook. Open the notebook and then open the terminal.
# On your local machine => check the public keys
cat ~/.ssh/id_rsa.pu

# On managed notebook, make sure you're at /home/jupyter
pwd
mkdir .ssh
touch .ssh/authorized_keys
vim .ssh/authorized_keys
# Paste the public key here
# Then save & exit (Press ESC then type :wq!)

# Check
cat .ssh/authorized_keys

# Check the external ip address of this notebook instance
curl -s http://whatismyip.akamai.com

Connect from local,

ssh -i ~/.ssh/id_rsa jupyter@<ip-returned-in-previous-step>
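Since JupyterLab listens on port 8080 on the notebook, one common way to open it locally over this SSH connection is a local port forward (`ssh -L`). The sketch below only assembles that command. The function name, the choice of local port, and the example IP (taken from the 203.0.113.0/24 documentation range) are assumptions, not part of the steps above.

```python
def ssh_tunnel_command(host, user="jupyter", key="~/.ssh/id_rsa",
                       local_port=8080, remote_port=8080):
    """Build an ssh command that forwards local_port to remote_port on the notebook."""
    return [
        "ssh",
        "-i", key,  # "~" is expanded by the shell when the printed command is pasted
        "-N",       # no remote command, just keep the tunnel open
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder; use the IP returned by the curl step.
    print(" ".join(ssh_tunnel_command("203.0.113.10")))
```

With the tunnel open, JupyterLab should be reachable at http://localhost:8080 in a local browser.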
Remark: This section is mostly for my own reference (all the steps here are already described in previous sections).
Remember to shut down the notebook when you are not using it!!
👉 The official blog about this task (the notebook is good but you need to read this blog too, there are useful points and links)
👉 My already-executed notebook (it contains my comments).
This is for the case where you skip the training phase and just use a model provided by the Hugging Face community.
👉 Notebook for testing load/use models from Hugging Face.
👉 Notebook for creating an image and deploying to vertex AI.
To make some tests with curl, check REST API with cURL. Below is a short Python snippet,

import base64
import json

import requests

# APP_NAME is assumed to be defined earlier (the name of the served model).
instance = b"Who are you voting for in 2020?"
b64_encoded = base64.b64encode(instance)
test_instance = {
    "instances": [
        {
            "data": {
                "b64": b64_encoded.decode("utf-8")
            },
            "labels": ["Europe", "public health", "politics"]
        }
    ]
}

payload = json.dumps(test_instance)
r = requests.post(
    f"http://localhost:7080/predictions/{APP_NAME}/",
    headers={"Content-Type": "application/json", "charset": "utf-8"},
    data=payload
)

r.json()
You can check a full example in this notebook. In this section, I note down how to use Transformers' pipeline with TorchServe and Vertex AI.

The main idea centers on the file custom_handler.py, which is used with TorchServe when creating a new container image for serving the model.

In this custom_handler.py file, we have to create the methods initialize(), preprocess() and inference() in a class that extends BaseHandler. Most of the problems come from the format of the outputs of these methods.

To use pipeline(), we can define initialize(), preprocess() and inference() in this file (the linked notebook has the full version).

For online prediction requests, format the prediction input instances as JSON with base64 encoding as shown here:

[
  {
    "data": {
      "b64": "<base64 encoded string>"
    }
  }
]
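To make the three-method shape and the base64 instance format more concrete, here is a standalone sketch. It is not the handler from the notebook: the class name `PipelineHandlerSketch` and the `fake_pipe` function are hypothetical, the class does not extend BaseHandler (so it runs without TorchServe or Transformers installed), and the shape of the data reaching preprocess() is an assumption, since it depends on how TorchServe unwraps the request.

```python
import base64


class PipelineHandlerSketch:
    """Mimics the initialize/preprocess/inference shape of a TorchServe handler.

    Assumption: a real handler would extend BaseHandler and build a
    Transformers pipeline in initialize(); a callable is injected here instead.
    """

    def __init__(self):
        self.pipe = None

    def initialize(self, pipe):
        # A real handler receives a TorchServe context here (simplified).
        self.pipe = pipe

    def preprocess(self, instances):
        # Decode each {"data": {"b64": ...}} instance back to text.
        return [
            base64.b64decode(item["data"]["b64"]).decode("utf-8")
            for item in instances
        ]

    def inference(self, texts):
        # One JSON-serializable result per input, in the same order.
        return [self.pipe(text) for text in texts]


def fake_pipe(text):
    # Hypothetical stand-in for a Transformers pipeline.
    return {"text": text, "length": len(text)}


if __name__ == "__main__":
    handler = PipelineHandlerSketch()
    handler.initialize(fake_pipe)
    encoded = base64.b64encode(b"Who are you voting for in 2020?").decode("utf-8")
    batch = handler.preprocess([{"data": {"b64": encoded}}])
    print(handler.inference(batch))
```

Returning a list with exactly one entry per input keeps the output format predictable, which is where most of the problems mentioned above tend to show up.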
👉 Vertex locations (You can check all supported locations here)
Below are some code snippets where you have to specify the location in which your service will run (Remark: this is not all of them, just the ones I've met in these notebooks),
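One way to keep locations consistent is to define the region once and derive everything else from it. The sketch below is illustrative only: the helper names are hypothetical and `europe-west1` is just an example value. It relies on two naming conventions: a zone is its region plus a letter suffix, and the regional Vertex AI API host has the form `REGION-aiplatform.googleapis.com`.

```python
REGION = "europe-west1"  # assumption: example region only


def region_of(zone):
    """Drop the trailing zone letter, e.g. 'europe-west1-d' -> 'europe-west1'."""
    return zone.rsplit("-", 1)[0]


def vertex_api_endpoint(region):
    """Regional Vertex AI API host for the given region."""
    return f"{region}-aiplatform.googleapis.com"


def zones_outside_region(region, zones):
    """Return the zones that do not belong to the given region."""
    return [zone for zone in zones if region_of(zone) != region]


if __name__ == "__main__":
    print(vertex_api_endpoint(REGION))
    print(zones_outside_region(REGION, ["europe-west1-d", "us-central1-a"]))
```

Running a check like `zones_outside_region` before creating resources is a cheap way to catch a notebook or endpoint that was accidentally placed elsewhere.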
Step 1: Activate the Artifact Registry API.
Step 2: Go to Artifact Registry. If you see a warning like "You have gcr.io repositories in Container Registry. Create gcr.io repositories in Artifact Registry?", click CREATE GCR.IO REPOSITORIES.
Step 3: Copy images from Container Registry to Artifact Registry. What you need is the URLs of "from" CR and "to" AR.
- Check in page AR, there is a small warning icon ⚠️, hover over it to see the "not complete" url. Example: Copy your images from eu.gcr.io/ideta-ml-thi to europe-docker.pkg.dev/ideta-ml-thi/eu.gcr.io
- Check in page CR, click the copy button, a full url of the image will be copied to the clipboard, e.g. gcr.io/ideta-ml-thi/pt-xlm-roberta-large-xnli_3
- Finally, combine them with the tag (use :latest if you don't have others already).
- Example, from gcr.io/ideta-ml-thi/pt-xlm-roberta-large-xnli_3:latest to us-docker.pkg.dev/ideta-ml-thi/gcr.io/pt-xlm-roberta-large-xnli_3:latest.
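The "from" and "to" URLs in the examples above follow a regular pattern, so the combination can be sketched as a small function. The function name is hypothetical. The `gcr.io` and `eu.gcr.io` entries come from the examples in this note, while the `us.gcr.io` and `asia.gcr.io` entries are my assumption based on how Google maps gcr.io hosts to multi-regions. Images referenced by digest are not handled.

```python
# Container Registry host -> Artifact Registry multi-region.
GCR_HOST_TO_AR_LOCATION = {
    "gcr.io": "us",
    "eu.gcr.io": "europe",
    "us.gcr.io": "us",      # assumption, not from the examples above
    "asia.gcr.io": "asia",  # assumption, not from the examples above
}


def to_artifact_registry(image, default_tag="latest"):
    """Map 'HOST/PROJECT/IMAGE[:TAG]' in CR to its URL in the AR gcr.io repository."""
    host, project, rest = image.split("/", 2)
    if host not in GCR_HOST_TO_AR_LOCATION:
        raise ValueError(f"not a Container Registry host: {host!r}")
    name, has_tag, tag = rest.partition(":")
    if not has_tag:
        tag = default_tag  # fall back to :latest when no tag is given
    location = GCR_HOST_TO_AR_LOCATION[host]
    return f"{location}-docker.pkg.dev/{project}/{host}/{name}:{tag}"


if __name__ == "__main__":
    print(to_artifact_registry("gcr.io/ideta-ml-thi/pt-xlm-roberta-large-xnli_3"))
```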
👉 Transitioning to repositories with gcr.io domain support (also on this link, copy from container to artifact)
Step 4: Route to AR (After step 3, the images in AR have the same route as in CR, but traffic still only recognizes the CR one. We need this step to make all traffic use AR instead). You need these permissions to perform the action (click the button ROUTE TO ARTIFACT).