Clarification on Threshold Setting and Using Side Information for Binarized Data #157
Comments
Hi Ankit,
Here's an example with sideinfo on both sides: smurff/python/test/test_macau.py Line 17 in c5c5d50
When I put threshold=0.5 I got output like this: The test data is also a sparse matrix and it is in |
I think your train/test matrix is sparse, not scarce. Have a look at the difference here.
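The distinction matters for binary data. Below is a minimal pure-Python sketch, assuming the usual reading of the two terms: in a sparse matrix the unstored cells are real zeros, while in a scarce matrix they are unknown. The `observed` dictionary and both helper functions are hypothetical illustrations, not part of SMURFF.

```python
# Hypothetical 2x3 interaction matrix, stored as {(row, col): value}.
# Only the non-zero cells are stored, as most sparse formats do.
observed = {(0, 0): 1, (0, 2): 1, (1, 1): 1}
shape = (2, 3)

def as_sparse(cells, shape):
    """Sparse reading: every cell that is not stored is a real 0."""
    rows, cols = shape
    return [[cells.get((r, c), 0) for c in range(cols)] for r in range(rows)]

def as_scarce(cells, shape):
    """Scarce reading: every cell that is not stored is unknown (None)."""
    rows, cols = shape
    return [[cells.get((r, c)) for c in range(cols)] for r in range(rows)]

sparse_view = as_sparse(observed, shape)
scarce_view = as_scarce(observed, shape)

# Count the explicit negatives (zeros) that each reading exposes.
sparse_zeros = sum(row.count(0) for row in sparse_view)
scarce_zeros = sum(row.count(0) for row in scarce_view)
print(sparse_zeros, scarce_zeros)
```

Under the scarce reading this toy matrix contains no negative examples at all, which is one way a binary model can end up with nothing to separate.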
This is my code and all the data is in

```python
import smurff
logging.basicConfig(level = logging.INFO)
train = load_npz("train_matrix.npz")
c_threshold = 0.5
trainSession.addTrainAndTest(train, test, smurff.ProbitNoise(c_threshold))
```

Even after converting the input data into
Can you plot me a histogram on the values in |
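For a matrix that is expected to hold only 0 and 1, a plain value count can serve as the histogram. The sketch below uses only the standard library and assumes the stored values have already been extracted into a list; `stored_values` and `value_histogram` are hypothetical names.

```python
from collections import Counter

def value_histogram(values):
    """Return {value: count} for the stored entries, sorted by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))

# Hypothetical stored entries of a binarized training matrix.
stored_values = [1, 1, 0, 1, 0, 1, 1]
hist = value_histogram(stored_values)

for value, count in hist.items():
    # One text bar per distinct value.
    print(f"{value}: {'#' * count} ({count})")
```

If only a single value shows up, the matrix carries no explicit examples of the other class.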
Can you try:
It's not working |
Of course not, it should be |
I am using SMURFF for matrix factorization with already binarized data (values are either 0 or 1). I noticed in one of your notebooks that data is binarized during the training process using a threshold (pIC50 > 6.0). Since my data is already binarized, I am unsure about the correct threshold to set. Could you clarify the following points?
1. Threshold Setting: Should I still set a threshold in `ProbitNoise` when my data is already binary (0 or 1)? If yes, what value should it be? If no, how do I handle this?
2. Side Information: My dataset includes side information for both rows and columns. How should I incorporate this dual side information effectively in my setup? For instance:
   - Should I use `direct = False` for both row and column side information?
   - Is there anything specific I need to modify in my current pipeline?
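On the threshold question, a small sketch may help. It assumes a threshold splits values into two classes with a strict greater-than comparison, as in the notebook's `pIC50 > 6.0` rule. The `binarize` helper is hypothetical and is not SMURFF code.

```python
def binarize(values, threshold):
    """Label a value 1 when it is strictly above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

already_binary = [0, 1, 1, 0, 1]

# Any threshold strictly between 0 and 1 leaves 0/1 data unchanged.
unchanged = binarize(already_binary, 0.5)

# A threshold at or above 1 turns every value into the negative class.
all_negative = binarize(already_binary, 1.0)

print(unchanged, all_negative)
```

Under that assumption, a threshold of 0.5 (the value tried in this thread) keeps the original labels, while a value outside the open interval (0, 1) would collapse the data into a single class.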