# Serving using Azure Machine Learning
## Pre-requisites

```bash
cd inference/triton_server
```
## Setting AML environment

Set the environment variables for AML:

```bash
export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod
```

Also remember to edit the `yml` files accordingly.
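Optionally (not required by this guide), the resource group and workspace can be registered as Azure CLI defaults, so they need not be repeated with `-g`/`-w` on every command below:

```bash
# Optional: make the resource group and workspace the az CLI defaults
az configure --defaults group="$RESOURCE_GROUP" workspace="$WORKSPACE_NAME"
```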
## Pushing the docker image to Container Registry

```bash
az acr login --name $DOCKER_REGISTRY
docker tag tts_triton $DOCKER_REGISTRY.azurecr.io/tts/triton-tts-coqui:latest
docker push $DOCKER_REGISTRY.azurecr.io/tts/triton-tts-coqui:latest
```
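To confirm the push went through, the repository's tags can be listed in the registry; a sketch, assuming the `az acr login` from above has already succeeded:

```bash
# The fully-qualified image reference pushed above
IMAGE="$DOCKER_REGISTRY.azurecr.io/tts/triton-tts-coqui:latest"
echo "Pushed: $IMAGE"

# List tags to confirm the image landed in the registry
# (guarded so the snippet is harmless where the Azure CLI is unavailable)
if command -v az >/dev/null 2>&1; then
  az acr repository show-tags --name "$DOCKER_REGISTRY" \
    --repository tts/triton-tts-coqui --output table || true
fi
```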
## Creating the execution environment

```bash
az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
## Deployment

Since we have different models for different languages, we recommend grouping some of the models and deploying each group together, based on how many fit into the GPU memory of the machine we're deploying on; this reduces the number of deployments.
In our case, we group them as follows:

- **North-Indian languages** (Indo-Aryan languages): `as`, `bn`, `gu`, `hi`, `mr`, `or`, `pa`, `raj`
  - Language-wise folders should be placed in `inference/checkpoints/indo-aryan/checkpoints`
- **South-Indian languages** (Dravidian languages): `kn`, `ml`, `ta`, `te`
  - Language-wise folders should be placed in `inference/checkpoints/dravidian/checkpoints`
- **Remaining languages** (Miscellaneous: a combination of Indian-English and Tibeto-Burman languages): `en`, `brx`, `mni`
  - Language-wise folders should be placed in `inference/checkpoints/misc/checkpoints`
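As an illustration, the expected folder skeleton for the Indo-Aryan group can be created like this (only the directories; the model files that go inside each language folder are not shown):

```bash
# One folder per language code under the Indo-Aryan group (codes from the list above)
for lang in as bn gu hi mr or pa raj; do
  mkdir -p "inference/checkpoints/indo-aryan/checkpoints/$lang"
done
ls inference/checkpoints/indo-aryan/checkpoints
```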
In this tutorial, we show how to perform a deployment for the North-Indian languages group; its config files are available in the directory `azure_ml/indo-aryan`. (Follow the same steps for the other groups.)
## Registering the model

```bash
az ml model create --file azure_ml/indo-aryan/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```
## Publishing the endpoint for online inference

```bash
az ml online-endpoint create -f azure_ml/indo-aryan/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```

Now, from the Azure Portal, open the Container Registry and grant the `AcrPull` permission to the above endpoint, so that it is allowed to pull the docker image.
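The same permission can also be granted from the CLI; a sketch, assuming the endpoint uses a system-assigned managed identity, with `tts-indo-aryan` as a placeholder for the endpoint name defined in `azure_ml/indo-aryan/endpoint.yml`:

```bash
# Placeholder; use the endpoint name from azure_ml/indo-aryan/endpoint.yml
ENDPOINT_NAME=tts-indo-aryan

# Principal ID of the endpoint's system-assigned managed identity
PRINCIPAL_ID=$(az ml online-endpoint show -n "$ENDPOINT_NAME" \
  -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" --query identity.principal_id -o tsv)

# Resource ID of the container registry
ACR_ID=$(az acr show -n "$DOCKER_REGISTRY" --query id -o tsv)

# Grant pull rights on the registry to the endpoint's identity
az role assignment create --assignee "$PRINCIPAL_ID" --role AcrPull --scope "$ACR_ID"
```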
## Attaching a deployment

```bash
az ml online-deployment create -f azure_ml/indo-aryan/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
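If the deployment does not come up healthy, its provisioning state and container logs can be inspected from the CLI; a sketch, with `tts-indo-aryan` and `indo-aryan` as placeholders for the names defined in the `azure_ml/indo-aryan` config files:

```bash
# Placeholders; use the names from azure_ml/indo-aryan/endpoint.yml and deployment.yml
ENDPOINT_NAME=tts-indo-aryan
DEPLOYMENT_NAME=indo-aryan

# Provisioning state of the deployment (e.g. Succeeded / Failed)
az ml online-deployment show -n "$DEPLOYMENT_NAME" -e "$ENDPOINT_NAME" \
  -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" --query provisioning_state -o tsv

# Container logs, useful when the deployment is unhealthy
az ml online-deployment get-logs -n "$DEPLOYMENT_NAME" -e "$ENDPOINT_NAME" \
  -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME"
```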
## Testing if inference works

- From Azure ML Studio, go to the "Consume" tab, and get the endpoint domain (without `https://` or trailing `/`) and an authentication key.
- In `client.py`, enable `ENABLE_SSL = True`, and then set the `ENDPOINT_URL` variable as well as the `Authorization` value inside `HTTP_HEADERS`.
- Run `python3 client.py`.