Serving using Azure Machine Learning

Pre-requisites

All commands below are assumed to be run from the Triton server directory:

cd inference/triton_server

Setting up the AML environment

Set the following environment variables for AML:

export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod

Also remember to edit the yml files under azure_ml/ accordingly, so that they match these values.

Pushing the Docker image to the Container Registry

az acr login --name $DOCKER_REGISTRY
docker tag tts_triton $DOCKER_REGISTRY.azurecr.io/tts/triton-tts-coqui:latest
docker push $DOCKER_REGISTRY.azurecr.io/tts/triton-tts-coqui:latest

Creating the execution environment

az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
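
For reference, a minimal sketch of what azure_ml/environment.yml might contain for a custom Triton container is shown below. The environment name and routes are assumptions (Triton serves its HTTP API on port 8000 by default); adjust them to your actual config:

# Hypothetical sketch of azure_ml/environment.yml; the name and routes are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: tts-triton-env
image: dhruvaprod.azurecr.io/tts/triton-tts-coqui:latest
inference_config:
  # Triton's standard health and inference routes on its default HTTP port.
  liveness_route:
    port: 8000
    path: /v2/health/live
  readiness_route:
    port: 8000
    path: /v2/health/ready
  scoring_route:
    port: 8000
    path: /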

Deployment

Since there is a separate model per language, we recommend grouping some of the models together and deploying each group as a single unit, to reduce the number of deployments. How the models are grouped depends on how many of them fit into the GPU memory of the instance being deployed on.

In our case, we group them as follows:

  • North-Indian languages
    • Indo-Aryan languages: as, bn, gu, hi, mr, or, pa, raj
    • Language-wise folders should be placed in inference/checkpoints/indo-aryan/checkpoints
  • South-Indian languages
    • Dravidian languages: kn, ml, ta, te
    • Language-wise folders should be placed in inference/checkpoints/dravidian/checkpoints
  • Remaining languages
    • Miscellaneous languages: en, brx, mni
    • Language-wise folders should be placed in inference/checkpoints/misc/checkpoints

In this tutorial, we show an example of how to perform a deployment for the North-Indian languages; the config files for this group are available in the azure_ml/indo-aryan directory. (For the other groups, follow the same steps.)

Registering the model

az ml model create --file azure_ml/indo-aryan/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
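
As a rough sketch, azure_ml/indo-aryan/model.yml might look like the following; the model name, version, and checkpoint path are assumptions and should point at the folder holding the Indo-Aryan checkpoints described above:

# Hypothetical sketch of azure_ml/indo-aryan/model.yml; name, version, and path are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/model.schema.json
name: tts-indo-aryan
version: 1
# Path to the checkpoint folder, relative to this yml file; adjust to your layout.
path: ../../checkpoints/indo-aryan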

Publishing the endpoint for online inference

az ml online-endpoint create -f azure_ml/indo-aryan/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
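
A minimal sketch of what azure_ml/indo-aryan/endpoint.yml might contain is given below; the endpoint name is an assumption, and key-based auth matches the testing steps later in this guide:

# Hypothetical sketch of azure_ml/indo-aryan/endpoint.yml; the name is an assumption.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: tts-indo-aryan
auth_mode: key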

Now, from the Azure Portal, open the Container Registry and grant the AcrPull permission to the above endpoint's identity, so that it is allowed to pull the Docker image.

Attaching a deployment

az ml online-deployment create -f azure_ml/indo-aryan/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
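
For reference, azure_ml/indo-aryan/deployment.yml might look roughly like the sketch below; the deployment name, GPU SKU, and the model/environment references are assumptions and must match whatever was registered in the earlier steps:

# Hypothetical sketch of azure_ml/indo-aryan/deployment.yml; names and SKU are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: tts-indo-aryan
# References to the registered model and environment (name:version).
model: azureml:tts-indo-aryan:1
environment: azureml:tts-triton-env:1
# Pick a GPU instance type with enough memory for the whole model group.
instance_type: Standard_NC6s_v3
instance_count: 1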

Testing if inference works

  1. From Azure ML Studio, go to the "Consume" tab, and get the endpoint domain (without https:// or trailing /) and an authentication key.
  2. In client.py, set ENABLE_SSL = True, and then set the ENDPOINT_URL variable as well as the Authorization value inside HTTP_HEADERS.
  3. Run python3 client.py