This repo demonstrates dynamic model reload using Triton Inference Server with DeepStream.
It runs the PeopleNet model in a DeepStream pipeline and shows how to load either of two available model versions without stopping the pipeline.
These steps were tested on an NVIDIA Jetson target running JetPack 6.2 and DeepStream 7.1.
- Connect a camera to the Jetson (or, optionally, change the DeepStream source file to use file input instead).
- Run `tao_download_and_convert_to_plan.sh`, following its instructions to install any necessary dependencies. This downloads the TAO files from NVIDIA and prepares a local directory with extracted and converted contents suitable for use with this demo.
- Run `setup_model_repo_with_version.sh $version` to set up the model repo with the desired PeopleNet version, based on the versions listed in the `environment.sh` file. Running it with no arguments prints a list of supported versions.
- Open a dedicated command window (or screen/tmux session) and run `start_triton_server.sh` to start the Triton server, configured for the model location set up in the previous step. You can leave this window open to monitor the Triton server.
- Open a command window on a session attached to a UI screen and run `start_deepstream_pipeline.sh` to start the DeepStream pipeline.
- Open a command window and run `get_model_stats.sh`. Note that the version printed here should match the version of the model loaded in the `setup_model_repo_with_version.sh` step.
- With the pipeline still running, set up the model repo with a new model version using the `setup_model_repo_with_version.sh` script.
- With the pipeline still running, reload the model on the inference server using the `reload_model_on_server.sh` script. You should notice:
  - The Triton server logs a version change.
  - The pipeline continues running, now with the updated model.
  - The version reported by `get_model_stats.sh` matches the version of the newly loaded model.
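Under the hood, the reload step can be done through Triton's model control HTTP API. A minimal sketch of the equivalent raw requests, assuming Triton's default HTTP port 8000 and a model named `peoplenet` (both assumptions, not taken from the repo scripts); note that the load endpoint only works when the server runs with explicit model control enabled (`--model-control-mode=explicit`):

```shell
#!/bin/sh
# Assumed values -- adjust to match your setup; the model name and port
# below are illustrative, not read from the repo scripts.
TRITON_URL="http://localhost:8000"
MODEL_NAME="peoplenet"

# POST to the repository load endpoint re-reads the model's directory in
# the model repo and loads the new version without restarting the server.
LOAD_URL="${TRITON_URL}/v2/repository/models/${MODEL_NAME}/load"

# GET on the model metadata endpoint reports the currently loaded versions.
META_URL="${TRITON_URL}/v2/models/${MODEL_NAME}"

# Print the requests to run against a live server:
echo "curl -s -X POST ${LOAD_URL}"
echo "curl -s ${META_URL}"
```

`reload_model_on_server.sh` and `get_model_stats.sh` presumably wrap requests along these lines.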
To show the effect of the model reload more obviously, you can use the `clobber_model.py` script to corrupt one of the input models.
- Run the `clobber_model.py` script to corrupt the model weights, passing the path to the `.onnx` file downloaded under `ngc_models`.
- Regenerate the `model.plan` file using the `trtexec` step in the `tao_download_and_convert_to_plan.sh` script.
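For reference, regenerating the plan file comes down to a `trtexec` invocation along these lines. The file paths below are hypothetical placeholders, not the actual names used by the repo scripts:

```shell
#!/bin/sh
# Hypothetical paths -- substitute the locations that
# tao_download_and_convert_to_plan.sh uses on your system.
ONNX_FILE="ngc_models/peoplenet.onnx"          # assumed filename
PLAN_FILE="model_repo/peoplenet/1/model.plan"  # assumed repo layout

# trtexec builds a TensorRT engine (.plan) from the ONNX file; the engine
# is specific to the GPU and TensorRT version it is built on.
CMD="trtexec --onnx=${ONNX_FILE} --saveEngine=${PLAN_FILE}"
echo "$CMD"
```

Because the corrupted weights are baked into the rebuilt engine, reloading it makes the before/after difference visible in the pipeline output.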