
FPGA Output is Zero in CNN model with 8,512 parameters. #1048

Open · zsrabbani opened this issue Aug 8, 2024 · 7 comments

@zsrabbani commented Aug 8, 2024

I have a CNN model. I used hls4ml, and all the files and the bitfile were generated completely. However, when I run the deployment code on the FPGA (ZCU104), the prediction output of the FPGA is always zero.

Total params: 8512 (33.25 KB)
Trainable params: 8344 (32.59 KB)
Non-trainable params: 168 (672.00 Byte)

I would appreciate any help.

Here is the model:

import keras
from keras.layers import (Input, Conv1D, BatchNormalization, Activation,
                          MaxPooling1D, Flatten, Dense, Dropout)

rf_in = Input(shape=(1024, 2), name='rf_input')

x = Conv1D(16, 7, activation=None, padding='same', use_bias=False)(rf_in)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(16, 7, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(16, 5, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(16, 3, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(8, 5, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(8, 3, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Conv1D(4, 3, activation=None, padding='same', use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling1D(2, strides=2, padding='same')(x)

x = Flatten()(x)

dense_1 = Dense(64, activation='relu', use_bias=False)(x)
dropout_1 = Dropout(0.35)(dense_1)
dense_2 = Dense(16, activation='relu', use_bias=False)(dropout_1)
dropout_2 = Dropout(0.55)(dense_2)
softmax = Dense(7, activation='softmax', use_bias=False)(dropout_2)

model = keras.Model(rf_in, softmax)
opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

model.summary()

Here is the hls4ml conversion code:

[screenshot of the hls4ml conversion code]
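Since the screenshot does not reproduce here, a minimal sketch of what such a conversion typically looks like with hls4ml's standard API (the precision, output directory, and board name below are assumptions, not taken from the screenshot):

import hls4ml

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
config['Model']['Precision'] = 'ap_fixed<16,6>'   # assumed precision

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='hls4ml_prj',         # assumed path
    backend='VivadoAccelerator',     # accelerator backend used for board deployments
    board='zcu104',                  # assumed board identifier
    io_type='io_stream',             # io_stream is required for Conv1D layers
)
hls_model.compile()
hls_model.build(csim=False, synth=True, export=True, bitfile=True)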

Here is the deployment code:

[screenshot of the deployment code]
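Likewise, a rough sketch of a PYNQ-style deployment driver for the generated accelerator (the bitstream name, DMA instance name, and dtype are assumptions; hls4ml's VivadoAccelerator backend also generates an axi_stream_driver.py with a NeuralNetworkOverlay class that wraps this pattern):

import numpy as np
from pynq import Overlay, allocate

ol = Overlay('hls4ml_nn.bit')       # assumed bitstream name
dma = ol.axi_dma_0                  # assumed DMA instance name in the block design

in_buf = allocate(shape=(1024, 2), dtype=np.float32)   # matches the model input
out_buf = allocate(shape=(7,), dtype=np.float32)       # matches the 7-class output

in_buf[:] = x_sample                # one preprocessed sample (assumed name)
dma.sendchannel.transfer(in_buf)
dma.recvchannel.transfer(out_buf)
dma.sendchannel.wait()
dma.recvchannel.wait()
print(out_buf)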

@GeorgeMentzos commented

I can confirm that I am encountering similar behaviour. I am using the standard CNN from the hls4ml tutorial, quantized to 6 bits, and the prediction output I get is also random. I would add that this only occurs when I use the Resource strategy: I observe a considerable accuracy loss (from 84% to 18%) just by switching from Latency to Resource mode.
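For reference, the switch in question is the model-level Strategy key in the hls4ml config; a minimal sketch (the ReuseFactor value is illustrative):

import hls4ml

config = hls4ml.utils.config_from_keras_model(model, granularity='model')
config['Model']['Strategy'] = 'Resource'   # default is 'Latency'
config['Model']['ReuseFactor'] = 64        # illustrative; Resource mode reuses multipliers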

@vloncar (Contributor) commented Aug 8, 2024

Try to get a better understanding from the documentation of how the configuration of types works and what the effects of fixed-point precision and quantization are, and ultimately profile your application. See the tutorial, especially parts 2 and 4.
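As a concrete starting point, a minimal profiling sketch using hls4ml's numerical profiling (X_sample is an assumed name for a small batch of representative inputs):

import hls4ml

# Plots the distributions of weights and layer outputs against the
# ranges covered by the configured fixed-point types.
hls4ml.model.profiling.numerical(model=model, hls_model=hls_model, X=X_sample)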

@zsrabbani (Author) commented

As you can see, I used the correct setup but did not get any results.
Could you help me with it?

@returnwellbeing commented

Hi, I've encountered the same issue. I am using the example for the extension API, KReverse: https://fastmachinelearning.org/hls4ml/advanced/extension.html#
I used VivadoAccelerator and got the final hardware block, but when I deploy the hardware on a PYNQ-Z2 board, I get only a zero-filled output.

@nghielme (Contributor) commented Sep 3, 2024

Hi, I would suggest first of all checking whether hls_model.predict(x) (the HLS model simulated on the CPU) matches model.predict(x) (the Keras model); they should at least be close to each other. If they are not, the problem can be related to the accumulator data types in the network. For that you can try using 'auto', so that the size of each accumulator is inferred from the operations that use it. The jmitrevs:keras-config-auto branch can be helpful for using 'auto' to properly handle accumulator data types.
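A minimal sketch of that check (X_test is an assumed name for a batch of test inputs):

import numpy as np

y_keras = model.predict(X_test)
y_hls = hls_model.predict(np.ascontiguousarray(X_test))

print('max abs diff:    ', np.max(np.abs(y_keras - y_hls)))
print('argmax agreement:', np.mean(y_keras.argmax(axis=1) == y_hls.argmax(axis=1)))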

@returnwellbeing commented Sep 6, 2024

@nghielme Thanks for the suggestion. I found that hls_model.predict(x) and model.predict(x) are indeed different. Your advice was a great help in finding the cause.

@zsrabbani In my case, there were some errors when generating {OUTPUT_DIR}/firmware/myproject.cpp.
There must be some function calls at the end of myproject.cpp; please check yours.

void myproject(
    // Here are some inputs
) {

    // hls-fpga-machine-learning insert IO
    // Here are some pragmas

#ifndef __SYNTHESIS__
    static bool loaded_weights = false;
    if (!loaded_weights) {
        // hls-fpga-machine-learning insert load weights
        loaded_weights = true;
    }
#endif

    // ****************************************
    // NETWORK INSTANTIATION
    // ****************************************

    // hls-fpga-machine-learning insert layers
    // Some function calls should be here; if they are missing, the outputs of the hardware block are always zero.
}
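
One rough way to check this programmatically (OUTPUT_DIR is whatever output_dir was passed to hls4ml; testing for the nnet:: namespace is a heuristic, since the generated layer calls live there):

from pathlib import Path

src = (Path(OUTPUT_DIR) / 'firmware' / 'myproject.cpp').read_text()
if 'nnet::' not in src:
    print('No layer calls were generated; the hardware block will output only zeros.')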

@zsrabbani (Author) commented

@returnwellbeing
I checked myproject.cpp; everything looks fine, and I don't see the issue you described.
