# Example usage of the PrePostProcessor

The PrePostProcessor can be used to add pre and post processing operations to an existing model.

## Installation

Please install the latest onnxruntime_extensions package using
`pip install --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-extensions`.
The official release will be available on PyPI soon.

## Initial imports

Import the pre/post processing infrastructure. This includes the PrePostProcessor class, the available pre and post
processing steps, and utilities to simplify creating new model inputs.

```py
import onnx
from onnxruntime_extensions.tools.pre_post_processing import *
```

## Example of creating the pre and post processing pipelines

The following is an example pre-processing pipeline that updates a model to take the bytes of a jpg or png image as input.
The original model input was pre-processed float data with shape {1, channels, 224, 224}, requiring the user to
manually convert their input image to this format.

### Create new input/s for the model

First, if you're adding pre-processing you need to create new inputs to the model that the pre-processing will use.

In our example we'll create a new input called 'image' containing uint8 data of length 'num_bytes'.

```py
new_input = create_named_value('image', onnx.TensorProto.UINT8, ['num_bytes'])
```

### Create PrePostProcessor

Create our PrePostProcessor instance with the new input/s and the ONNX opset to use.
Minimum allowed is opset 16. Opset 18 or higher is preferred as the Resize operator has anti-aliasing support.

```py
pipeline = PrePostProcessor([new_input], onnx_opset=18)
```

### Add pre-processing steps

Add the pre-processing steps to the PrePostProcessor in the desired order.
You can pick and choose from the predefined steps in the pre_post_processing.Steps module or create your own custom steps.
If some common pre or post processing functionality is missing, please reach out and we'll look at adding
the necessary Step implementations for it.

Configure the steps as needed.

```py
pipeline.add_pre_processing(
    [
        ConvertImageToBGR(),            # jpg/png image to BGR in HWC layout. output shape is {h_in, w_in, channels}
        Resize(256),                    # resize so smallest side is 256.
        CenterCrop(224, 224),           # crop the centre. output shape is {224, 224, channels}
        ChannelsLastToChannelsFirst(),  # ONNX models are typically channels first. output shape is {channels, 224, 224}
        ImageBytesToFloat(),            # convert uint8 values in range 0..255 to float values in range 0..1
        Unsqueeze(axes=[0]),            # add batch dim so shape is {1, channels, 224, 224}. we now match the original model input
    ]
)
```

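To make the shape comments in this pipeline concrete, here is a minimal sketch in plain Python that traces the tensor shape through each step. It does not call the library: the function name `trace_preprocessing_shapes` is hypothetical, and rounding the resized dimensions to the nearest integer is an assumption of this sketch that may differ from what the Resize step actually does.

```python
def trace_preprocessing_shapes(height, width, channels=3, resize_to=256, crop=(224, 224)):
    """Return (step name, output shape) pairs for the example pre-processing pipeline."""
    shapes = []

    # ConvertImageToBGR: decoded image in HWC layout.
    shapes.append(("ConvertImageToBGR", (height, width, channels)))

    # Resize: scale so the smallest side equals resize_to, keeping the aspect ratio.
    # Rounding to the nearest integer is an assumption of this sketch.
    scale = resize_to / min(height, width)
    height, width = round(height * scale), round(width * scale)
    shapes.append(("Resize", (height, width, channels)))

    # CenterCrop: keep the central crop[0] x crop[1] region.
    height, width = crop
    shapes.append(("CenterCrop", (height, width, channels)))

    # ChannelsLastToChannelsFirst: HWC -> CHW.
    shapes.append(("ChannelsLastToChannelsFirst", (channels, height, width)))

    # ImageBytesToFloat: changes the data type and value range, not the shape.
    shapes.append(("ImageBytesToFloat", (channels, height, width)))

    # Unsqueeze(axes=[0]): add the batch dimension.
    shapes.append(("Unsqueeze", (1, channels, height, width)))
    return shapes


for step, shape in trace_preprocessing_shapes(512, 1024):
    print(f"{step:28s} {shape}")
```
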
Outputs from the previous step are automatically connected to the next step (or to the model in the case of the last step),
in the same order.
i.e. the first output of the previous step is connected to the first input of the next step, the second output to the
second input, and so on until we run out of outputs or inputs (whichever happens first).

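The positional rule can be illustrated with a few lines of plain Python. This is only a sketch of the rule as described in this document, not the library's implementation; `auto_connect` and the value names are hypothetical.

```python
def auto_connect(producer_outputs, consumer_inputs):
    """Pair outputs with inputs by position, stopping when either list runs out."""
    return list(zip(producer_outputs, consumer_inputs))


# Three outputs feeding a consumer with one input: only the first output is connected.
print(auto_connect(["Y", "Cb", "Cr"], ["model_input"]))

# One output feeding a consumer with two inputs: the second input is left unconnected.
print(auto_connect(["pixels"], ["input_0", "input_1"]))
```
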
It is also possible to manually specify connections. See [IoMapEntry](#iomapentry-usage).


### Add post-processing steps

The post-processing is assembled the same way. Let's say it's simply a case of applying Softmax to the
first model output:

```py
pipeline.add_post_processing(
    [
        Softmax()
    ]
)
```

Neither pre-processing nor post-processing is required. Simply add what you need for your model.

### Execute pipeline

Once we have assembled our pipeline we simply run it with the original model, and save the output.

The last pre-processing step is automatically connected to the original model inputs,
and the first post-processing step is automatically connected to the original model outputs.

```py
model = onnx.load('my_model.onnx')
new_model = pipeline.run(model)
onnx.save_model(new_model, 'my_model.with_pre_post_processing.onnx')
```


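The saved model can then be run with ONNX Runtime. The following is a minimal sketch of how that might look, assuming the `onnxruntime` and `numpy` packages are installed and that the added steps rely on custom operators from onnxruntime-extensions, which is why the extensions library is registered with the session. The function name `run_updated_model` and the file paths are hypothetical; the input name `image` matches the input created earlier in this example.

```python
def run_updated_model(model_path, image_path):
    """Run the updated model on the raw bytes of a jpg or png file."""
    # Imported here so the sketch can be defined even if the packages are not installed.
    import numpy as np
    import onnxruntime as ort
    from onnxruntime_extensions import get_library_path

    options = ort.SessionOptions()
    # Assumption: the added steps use custom operators provided by
    # onnxruntime-extensions, so its shared library must be registered.
    options.register_custom_ops_library(get_library_path())
    session = ort.InferenceSession(model_path, options, providers=["CPUExecutionProvider"])

    # The new model input is a 1-D uint8 tensor holding the encoded image bytes.
    image_bytes = np.fromfile(image_path, dtype=np.uint8)
    return session.run(None, {"image": image_bytes})
```

For example, `run_updated_model('my_model.with_pre_post_processing.onnx', 'my_image.jpg')` would return a list with one entry per model output.
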
## Helper to create new named model inputs

The `create_named_value` helper from [pre_post_processing.utils](./docs/pre_post_processing/utils.md#) can be used
to create model inputs.

- The `name` value must be unique for the model.
- The `data_type` should be an onnx.TensorProto value like onnx.TensorProto.UINT8 or onnx.TensorProto.FLOAT from the
list defined [here](https://github.com/onnx/onnx/blob/759907808db622938082c6eeaa8f685dee3dc868/onnx/onnx.proto#L483).
- The `shape` specifies the input shape. Use int for dimensions with known values and strings for symbolic dimensions.
  e.g. ['batch_size', 1024] would be a rank 2 tensor with a symbolic first dimension named 'batch_size'.


## IoMapEntry usage

When the automatic connection of outputs from the previous step to inputs of the current step is insufficient,
an IoMapEntry can be used to explicitly specify connections.

As an example, let's look at a subset of the operations in the pre and post processing for a super resolution model.
In the pre-processing we convert the input from RGB to YCbCr using `PixelsToYCbCr`.
That step produces 3 separate outputs - `Y`, `Cb` and `Cr`. The model has one input, which is automatically connected
to the `Y` output when PixelsToYCbCr is the last pre-processing step.
We want to consume the `Cb` and `Cr` outputs in the post-processing by joining them with the new `Y'` model output.


```py
pipeline = PrePostProcessor(inputs)
pipeline.add_pre_processing(
    [
        ...
        # this produces Y, Cb and Cr outputs. each has shape {h_in, w_in}. only Y is input to model
        PixelsToYCbCr(layout="BGR"),
    ]
)
```

In order to do that, the post-processing entry can be specified as a tuple of the Step and a list of IoMapEntries.
Each IoMapEntry has a simple structure of `IoMapEntry(producer, producer_idx, consumer_idx)`.
- The `producer` is the name of the Step that produces the output.
- The `producer_idx` is the index of the output from that step.
- The `consumer_idx` is the input number of the Step that we want to connect to.


```py
pipeline.add_post_processing(
    [
        # as we're selecting outputs from multiple previous steps we need to map them to the inputs using step names
        (
            YCbCrToPixels(layout="BGR"),
            [
                # the first model output is automatically joined to consumer_idx=0
                IoMapEntry("PixelsToYCbCr", producer_idx=1, consumer_idx=1),  # Cb value
                IoMapEntry("PixelsToYCbCr", producer_idx=2, consumer_idx=2)   # Cr value
            ],
        ),
        ConvertBGRToImage(image_format="png")
    ]
)
```

By default the name of each Step is its class name. When instantiating a step you can override the `name` property
to provide a more descriptive name or resolve ambiguity (e.g. if there are multiple steps of the same type).

In our example, if we used `PixelsToYCbCr(layout="BGR", name="ImageConverter")` in the pre-processing step,
we would use `IoMapEntry("ImageConverter", producer_idx=1, consumer_idx=1)` in the post-processing step to match that
name.

Note that the automatic connection between steps will still occur. The list of IoMapEntry values is used to override the
automatic connections, so you only need to provide an IoMapEntry for connections that need customization. In our
example the model output is automatically connected to the first input of the `YCbCrToPixels` step so it wasn't
necessary to provide an IoMapEntry for consumer_idx=0.


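The way explicit entries override the automatic connections can be sketched in plain Python. This is an illustrative model of the rule described in this section, not the library's code: the `IoMapEntry` below is a stand-in namedtuple with the same three fields, and `resolve_inputs` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in with the same fields as the library's IoMapEntry, defined here only
# so the sketch runs without onnxruntime-extensions installed.
IoMapEntry = namedtuple("IoMapEntry", ["producer", "producer_idx", "consumer_idx"])


def resolve_inputs(previous_name, previous_outputs, num_inputs, io_map, outputs_by_step):
    """Return the (producer, value) pair connected to each consumer input.

    Automatic positional connections are made first, then each IoMapEntry
    overrides (or fills in) the connection for its consumer_idx.
    """
    resolved = {}
    for idx, value in enumerate(previous_outputs[:num_inputs]):
        resolved[idx] = (previous_name, value)
    for entry in io_map:
        value = outputs_by_step[entry.producer][entry.producer_idx]
        resolved[entry.consumer_idx] = (entry.producer, value)
    # Inputs with no connection are reported as None.
    return [resolved.get(idx) for idx in range(num_inputs)]


outputs_by_step = {"PixelsToYCbCr": ["Y", "Cb", "Cr"]}
io_map = [
    IoMapEntry("PixelsToYCbCr", producer_idx=1, consumer_idx=1),  # Cb value
    IoMapEntry("PixelsToYCbCr", producer_idx=2, consumer_idx=2),  # Cr value
]
# The model produces a single output (Y'); YCbCrToPixels takes three inputs.
connections = resolve_inputs("model", ["Y'"], 3, io_map, outputs_by_step)
print(connections)
```

Input 0 comes from the automatic connection to the model output, while inputs 1 and 2 come from the explicit entries.
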
## Debug step usage

If you are creating your own pipeline, it can sometimes be necessary to inspect the output of a pre or post processing
step when the final results are unexpected. The easiest way to do this is to insert a `Debug` step into the pipeline.

The Debug step will create graph outputs for the outputs from the previous step. That means they will be available
as outputs when running the updated model, and can be inspected.

The Debug step will also pass through its inputs to the next step, so no other changes to the pipeline are required.

Considering our pre-processing example, if we wanted to inspect the result of the conversion from an input image
we can insert a Debug step like below. The existing steps remain unchanged.

```py
pipeline.add_pre_processing(
    [
        ConvertImageToBGR(),  # jpg/png image to BGR in HWC layout. output shape is {h_in, w_in, channels}
        Debug(),              # exposes the output of ConvertImageToBGR as a graph output
        Resize(256),          # resize so smallest side is 256.
        # ... remaining steps are unchanged
    ]
)
```

The model will now have an additional output called 'bgr_data' (the default output name of the ConvertImageToBGR step).

Note that if the previous step produces multiple outputs the Debug step must be configured with this information.

e.g.

```py
PixelsToYCbCr(layout="BGR"),
Debug(num_inputs=3),
...
```