Using mixed-mode (FP + QUANT) for network evaluation

When debugging an accuracy-loss situation, in many cases the majority of the loss is associated with only a few layers in the model. To test this, and to identify which layers those are, we can use the “mixed-mode” operation of the emulator: we emulate some layer(s) in full precision (32-bit) while the rest of the model stays optimized/quantized.

Here is a basic function for doing so:

def network_mixed_eval(runner, target, image, native_layers, all_native=False):
    # Run a single image through the emulator, with the selected layers switched
    # back to full-precision (native) emulation and the rest left quantized.
    with runner.infer_context(target) as ctx:
        wrapped_keras_model_mixed = runner.get_keras_model(ctx)
        keras_model_mixed = wrapped_keras_model_mixed.layers[0]
        for lname, layer in keras_model_mixed.layers.items():
            if all_native and 'postprocess' not in lname:
                # Revert every layer except post-processing to full precision
                layer.disable_lossy()
            elif lname.endswith(native_layers):
                # Revert only layers whose names end with one of the given suffixes
                layer.disable_lossy()
        result = keras_model_mixed(np.expand_dims(image, axis=0))
        return result
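
The function relies on the per-layer disable_lossy() method of the emulator's Keras model: layers whose names end with one of the suffixes in native_layers are emulated in full precision, while the rest of the model stays quantized. Passing all_native=True reverts every layer except the post-processing, which gives a fully native reference run.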

Here is a proposed way of using that function. For this example, we’re going to use a simple 224x224 Mobilenet-v1 model.
First, we need to load a test image:

resized_image = np.array(Image.open('000000000139.jpg').resize((224, 224)))

Second, we need an optimized Mobilenet-v1 model; let’s load it into a runner:

runner = ClientRunner(har='mnv1.q.har')
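
If you don’t already have a quantized HAR, below is a minimal sketch of how one is typically produced with the Hailo Dataflow Compiler; the model path, network name and calibration array are placeholders, and the exact calls may differ between SDK versions:

# Hypothetical sketch, assuming the standard Dataflow Compiler flow;
# 'mnv1.onnx', 'mnv1' and calib_data (an array of shape [N, 224, 224, 3]) are placeholders.
runner = ClientRunner(hw_arch='hailo8')
runner.translate_onnx_model('mnv1.onnx', 'mnv1')
runner.optimize(calib_data)
runner.save_har('mnv1.q.har')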

The last part is the execution:

debug = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, ('conv13', 'conv14', 'fc1'))
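
The call above runs the quantized emulation with the 'conv13', 'conv14' and 'fc1' layers reverted to full precision. To see how much of the accuracy loss these layers account for, you can compare the mixed run against a fully-native and a fully-quantized run of the same image. A minimal sketch, assuming the model has a single classification output:

# Hypothetical comparison sketch; assumes a single output tensor.
full_native = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, (), all_native=True)
full_quant = network_mixed_eval(runner, InferenceContext.SDK_QUANTIZED, resized_image, ())
for name, res in (('native', full_native), ('quantized', full_quant), ('mixed', debug)):
    res = np.asarray(res).squeeze()
    print(name, 'top-1 class:', int(np.argmax(res)))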

For completeness, these are the imports needed for all of the above:

from hailo_sdk_client import ClientRunner, InferenceContext
import numpy as np
from PIL import Image