Using mixed-mode (FP + QUANT) for network evaluation

When debugging an accuracy-loss situation, in many cases the majority of the loss is associated with only a few layers in the model. To test this, and to identify which layers those are, we can use the “mixed-mode” operation of the emulator: we emulate some layer(s) in full precision (32-bit) while the rest of the model stays optimized/quantized.

Here is a basic function for doing so:

def network_mixed_eval(runner, target, image, native_layers, all_native=False):
    # Run a single image through the emulator, with the selected layers switched
    # back to full-precision (native) emulation and the rest left quantized.
    with runner.infer_context(target) as ctx:
        wrapped_keras_model_mixed = runner.get_keras_model(ctx)
        keras_model_mixed = wrapped_keras_model_mixed.layers[0]
        for lname, layer in keras_model_mixed.layers.items():
            if all_native and 'postprocess' not in lname:
                # Revert every layer except post-processing to full precision
                layer.disable_lossy()
            elif lname.endswith(native_layers):
                # Revert only layers whose names end with one of the given suffixes
                layer.disable_lossy()
        result = keras_model_mixed(np.expand_dims(image, axis=0))
        return result
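
The function relies on the per-layer disable_lossy() method of the emulator's Keras model: layers whose names end with one of the suffixes in native_layers are emulated in full precision, while the rest of the model stays quantized. Passing all_native=True reverts every layer except the post-processing, which gives a fully native reference run.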

Here is a proposed way of using that function. For this example, we’re going to use a simple 224x224 Mobilenet-v1 model.
First, we need to load a test image:

resized_image = np.array(Image.open('000000000139.jpg').resize((224, 224)))

Second, we need an optimized Mobilenet-v1 model; let’s load it into a runner:

runner = ClientRunner(har='mnv1.q.har')
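
If you don’t already have a quantized HAR, below is a minimal sketch of how one is typically produced with the Hailo Dataflow Compiler; the model path, network name and calibration array are placeholders, and the exact calls may differ between SDK versions:

# Hypothetical sketch, assuming the standard Dataflow Compiler flow;
# 'mnv1.onnx', 'mnv1' and calib_data (an array of shape [N, 224, 224, 3]) are placeholders.
runner = ClientRunner(hw_arch='hailo8')
runner.translate_onnx_model('mnv1.onnx', 'mnv1')
runner.optimize(calib_data)
runner.save_har('mnv1.q.har')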

The last part is the execution:

debug = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, ('conv13', 'conv14', 'fc1'))
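
The call above runs the quantized emulation with the 'conv13', 'conv14' and 'fc1' layers reverted to full precision. To see how much of the accuracy loss these layers account for, you can compare the mixed run against a fully-native and a fully-quantized run of the same image. A minimal sketch, assuming the model has a single classification output:

# Hypothetical comparison sketch; assumes a single output tensor.
full_native = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, (), all_native=True)
full_quant = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, ())
for name, res in (('native', full_native), ('quantized', full_quant), ('mixed', debug)):
    res = np.asarray(res).squeeze()
    print(name, 'top-1 class:', int(np.argmax(res)))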

For completeness, these are the imports needed for all of the above:

from hailo_sdk_client import ClientRunner, InferenceContext
import numpy as np
from PIL import Image