Google Vertex AI

Anh-Thi Dinh
Like other notes on this site, this note contains only a few noteworthy points of the topic.
👉 My Github Repo for this note: dinhanhthi/google-vertex-ai

Good to know

  • Choose the same location/region for all services (Google project, notebook instances, ...). 👉 Check section “Choose the same locations”.
  • When making models, especially for serving on prod, don't forget to use logging services.
  • When creating a new notebook instance, consider choosing a larger "boot disk" (the default 100GB is often not enough).
  • If you run gcloud commands inside Workbench, you don't have to provide credentials to connect to GCP. They are passed automatically.

Tutorials & references

  1. What is Vertex AI? -- Official video.
  2. Google Cloud Vertex AI Samples -- Official github repository.
  3. Vertex AI Documentation AIO: Samples - References -- Guides.

Notebooks (Workbench)

If you are going to build Docker images inside the virtual machine, choose more boot disk space than the default 100GB. If you want to change the disk size later, go to Compute Engine / Disks (ref).
🚨
Remember to shut down the notebook when you are not using it!

Workbench notebook vs Colab

👉 Note: Google Colab

"Managed notebook" vs "User-managed notebook"

👉 Official doc. Below are some notable points.

gcloud CLI

# Start instance
gcloud compute instances start thi-managed-notebook --zone=europe-west1-d
# Stop instance
gcloud compute instances stop thi-managed-notebook --zone=europe-west1-d

Sync with Github using gh CLI

Inside the notebook, open the Terminal tab, then install the Github CLI (ref):
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
  | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
sudo apt update
sudo apt install gh
Log in to gh:
gh auth login
Then follow the prompts.

Open Jupyter notebook on your local machine

JupyterLab runs on the Vertex notebook instance at port 8080. You have 2 options to open it on your local machine:

SSH to User-managed notebook

⚠️
You have to use a User-managed notebook! A Managed notebook doesn't (officially) allow SSH. If you want to connect to a managed notebook via SSH, read section “SSH to managed notebook” in this note.
First, connect using gcloud 👉 Note: Google Cloud CLI.
👉 Note: SSH

If you change the GPU type?

You have to re-install the GPU driver on the virtual machine. Check this official instruction.

SSH to managed notebook

When creating a new notebook, make sure to enable terminal for this notebook. Open the notebook and then open the terminal.
# On your local machine => check the public key
cat ~/.ssh/id_rsa.pub
# On managed notebook, make sure you're at /home/jupyter
pwd
mkdir .ssh
touch .ssh/authorized_keys
vim .ssh/authorized_keys
# Paste the public key here
# Then save & exit (Press ESC then type :wq!)
# Check
cat .ssh/authorized_keys
# Check the external ip address of this notebook instance
curl -s http://whatismyip.akamai.com
Connect from your local machine:
ssh -i ~/.ssh/id_rsa jupyter@<ip-returned-in-previous-step>

AIO steps

Remark: This section is mostly for my own reference (all the steps here are already described in previous sections).
🚨
Remember to shut down the notebook when you are not using it!

Troubleshooting

With Hugging Face models (A-Z)

A-Z text classification with PyTorch

👉 The official blog about this task (the notebook is good, but read the blog too: it contains useful points and links).
👉 My already-executed notebook (with my comments).

Just deploying?

This applies when you skip the training phase and just use a model provided by the Hugging Face community.
👉 Notebook for testing how to load and use models from Hugging Face.
👉 Notebook for creating an image and deploying to Vertex AI.
To run some tests with curl, check REST API with cURL. Below is a short snippet:
import base64
import json

import requests

# APP_NAME is assumed to be defined earlier in the notebook.
instance = b"Who are you voting for in 2020?"
b64_encoded = base64.b64encode(instance)
test_instance = {
    "instances": [
        {
            "data": {
                "b64": b64_encoded.decode('utf-8')
            },
            "labels": ["Europe", "public health", "politics"]
        }
    ]
}

payload = json.dumps(test_instance)
r = requests.post(
    f"http://localhost:7080/predictions/{APP_NAME}/",
    headers={"Content-Type": "application/json", "charset": "utf-8"},
    data=payload
)

r.json()

Using Transformers' pipeline with Vertex AI?

You can check a full example in this notebook. In this section, I note how to use Transformers' pipeline with TorchServe and Vertex AI.
The main idea centers on the file custom_handler.py, which is used with TorchServe when creating a new container image for serving the model.
In this custom_handler.py file, we have to create the methods initialize(), preprocess() and inference() in a class which extends BaseHandler. Most of the problems come from the format of the outputs of these methods.
For using pipeline(), we can define initialize(), preprocess() and inference() like below,
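The handler code from the notebook is not reproduced in this note, so here is only a minimal sketch of what such a handler might look like. The class name PipelineHandler, the "text-classification" task and the accepted payload shapes are assumptions for illustration. The added postprocess() is also my assumption: the three methods named above are not enough if the base implementation cannot convert a list of dicts. The fallback to object when TorchServe is missing exists only so the sketch can be imported outside the serving container.

```python
import base64

try:
    # TorchServe's base class; available inside the serving container.
    from ts.torch_handler.base_handler import BaseHandler
except ImportError:
    # Fallback so this sketch can be imported where TorchServe is absent.
    BaseHandler = object


class PipelineHandler(BaseHandler):
    """Hypothetical handler wrapping a Transformers pipeline."""

    def initialize(self, context):
        # Imported lazily so the module loads without transformers installed.
        from transformers import pipeline

        model_dir = context.system_properties.get("model_dir")
        # Assumption: the model archive holds a text-classification model.
        self.pipe = pipeline("text-classification", model=model_dir)
        self.initialized = True

    def preprocess(self, data):
        # Each request row carries its payload under "data" or "body".
        # Assumption: it arrives as bytes, str, or a {"b64": ...} dict.
        texts = []
        for row in data:
            payload = row.get("data") or row.get("body")
            if isinstance(payload, dict) and "b64" in payload:
                payload = base64.b64decode(payload["b64"])
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            texts.append(payload)
        return texts

    def inference(self, texts):
        # Called on a list, the pipeline returns one result per input text.
        return self.pipe(texts)

    def postprocess(self, outputs):
        # Return a plain list with one entry per request row.
        return list(outputs)
```

The point of the sketch is the output format at each stage: preprocess() returns a list of strings, inference() returns whatever the pipeline returns for that list, and postprocess() returns a list with one item per input.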

Encode example text in base64 format

For online prediction requests, format the prediction input instances as JSON with base64 encoding as shown here:
[
  {
    "data": {
      "b64": "<base64 encoded string>"
    }
  }
]
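As a minimal sketch of the format above, the helper below builds such an instance list from plain text using only the Python standard library. The function name make_instances is hypothetical and not part of any Vertex AI SDK.

```python
import base64
import json


def make_instances(texts):
    """Wrap each text in the {"data": {"b64": ...}} shape shown above."""
    instances = []
    for text in texts:
        # b64encode works on bytes and returns bytes, so encode and decode around it.
        encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
        instances.append({"data": {"b64": encoded}})
    return instances


def to_json(texts):
    """Serialize the instance list as a JSON string."""
    return json.dumps(make_instances(texts), indent=2)
```

Calling to_json(["Who are you voting for in 2020?"]) gives a JSON string in the shape shown above, with the placeholder replaced by the encoded text.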

Testing created endpoint

Some remarks for Hugging Face's things

Choose the same locations

👉 Vertex locations (You can check all supported locations here)
Below are some code snippets where you have to indicate the location on which your service will run (Remark: this is not an exhaustive list, just what I've met in these notebooks),
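The snippets from the notebooks are not reproduced in this note. As a rough sketch of the idea, one way to keep locations consistent is to define the region once and derive every location-dependent value from it. The function name regional_settings and the project id in the usage note are hypothetical; the region europe-west1 is only an example and should be replaced by your own.

```python
REGION = "europe-west1"  # assumption: replace with your own region


def regional_settings(project_id, region=REGION):
    """Derive every location-dependent value from a single region string."""
    return {
        # Value to pass wherever a "location" parameter is requested.
        "location": region,
        # Regional Vertex AI API endpoint.
        "api_endpoint": f"{region}-aiplatform.googleapis.com",
        # Host of a regional Docker repository in Artifact Registry.
        "docker_host": f"{region}-docker.pkg.dev",
        # Parent resource path used by Vertex AI API calls.
        "parent": f"projects/{project_id}/locations/{region}",
    }
```

For example, regional_settings("my-project")["parent"] gives "projects/my-project/locations/europe-west1". Reading all values from one place makes it harder to mix regions by accident.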

Container Registry to Artifact Registry

Step 1: Activate the Artifact Registry API.
Step 2: Go to Artifact Registry. If you see a warning like "You have gcr.io repositories in Container Registry. Create gcr.io repositories in Artifact Registry?", click CREATE GCR.IO REPOSITORIES.
Step 3: Copy images from Container Registry to Artifact Registry. What you need is the URLs of the "from" CR and the "to" AR.
  • On the AR page, there is a small warning icon ⚠️. Hover over it to see the "incomplete" URL. Example: Copy your images from eu.gcr.io/ideta-ml-thi to europe-docker.pkg.dev/ideta-ml-thi/eu.gcr.io
  • Finally, combine them with the tag (use :latest if you don't have any other tag).
  • Example, from gcr.io/ideta-ml-thi/pt-xlm-roberta-large-xnli_3:latest to us-docker.pkg.dev/ideta-ml-thi/gcr.io/pt-xlm-roberta-large-xnli_3:latest.
Step 4: Route to AR (After step 3, the images in AR have the same route as in CR, but traffic still only recognizes them from CR. We need this step to make all traffic use AR instead). You need these permissions to perform the action (click the button ROUTE TO ARTIFACT).
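The URL combination in Step 3 can be sketched as a small helper. The function name to_artifact_registry is hypothetical. Only the gcr.io and eu.gcr.io rows of the host table come from the examples in this note; the us.gcr.io and asia.gcr.io rows follow the same pattern and are worth checking against the warning icon on the AR page.

```python
# Container Registry host -> Artifact Registry multi-region host.
GCR_TO_AR_HOST = {
    "gcr.io": "us-docker.pkg.dev",
    "us.gcr.io": "us-docker.pkg.dev",
    "eu.gcr.io": "europe-docker.pkg.dev",
    "asia.gcr.io": "asia-docker.pkg.dev",
}


def to_artifact_registry(image, default_tag="latest"):
    """Map a Container Registry image URL to its Artifact Registry URL."""
    # Expected input shape: <host>/<project>/<image path>[:tag]
    host, project, rest = image.split("/", 2)
    if host not in GCR_TO_AR_HOST:
        raise ValueError(f"not a Container Registry host: {host}")
    # Append the default tag only when the name has neither a tag nor a digest.
    name = rest.rsplit("/", 1)[-1]
    if ":" not in name and "@" not in name:
        rest = f"{rest}:{default_tag}"
    return f"{GCR_TO_AR_HOST[host]}/{project}/{host}/{rest}"
```

Applied to the example in Step 3, to_artifact_registry("gcr.io/ideta-ml-thi/pt-xlm-roberta-large-xnli_3:latest") produces the "to" URL given there.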

Problems?