Description
Hi authors of Numo-narray
I'm working with biological networks, so I need to use matrix operations frequently. In my project I was using the old NMatrix library, but recently I found your project. Your NumPy-like style and your support for the community convinced me to replace NMatrix with Numo-narray in my project. I have seen a great improvement in memory use, speed and code clarity with your library, but I have problems serializing a large matrix. With a matrix of 17078x17078 elements I get the following error:
in dump: long too big to dump (TypeError)
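A quick size estimate may help explain the message. This is only a sketch under two assumptions that I have not verified against the numo-narray source: that the dump contains the raw element data as one binary string (8 bytes per DFloat element), and that Marshal writes lengths as signed 32-bit integers.

```ruby
# Rough size estimate for the matrix payload.
# Assumptions: 8 bytes per DFloat element, and Marshal stores
# lengths as signed 32-bit integers.
rows = 17_078
cols = 17_078
bytes_per_element = 8

payload_bytes = rows * cols * bytes_per_element
marshal_limit = 2**31 - 1

puts "payload: #{payload_bytes} bytes"
puts "limit:   #{marshal_limit} bytes"
puts "exceeds limit: #{payload_bytes > marshal_limit}"
```

If these assumptions hold, the data is about 2.3 GB, which is above the 2 GiB that a signed 32-bit length can describe.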
Here is some context for the error. In my pipeline, the first step is to take a network defined by pairs, transform it into a Numo::NArray object and then write it to disk with:
File.binwrite(options[:output_matrix_file], Marshal.dump(matrix))
This way I can run several different executions without wasting time on the plain text to matrix transformation: I read my serialized matrix and perform the operation I'm interested in. The plain text to matrix transformation is done with the following code:
# Store the weight of the directed pair node_a -> node_b in a nested hash.
def add_pair(node_a, node_b, weight, connections)
  query = connections[node_a]
  if !query.nil?
    query[node_b] = weight
  else
    subhash = Hash.new(0.0)
    subhash[node_b] = weight
    connections[node_a] = subhash
  end
end

connections = {}
source.each do |line|
  node_a, node_b, weight = line.chomp.split("\t")
  # Pairs without a weight column default to 1.0.
  weight = weight.nil? ? 1.0 : weight.to_f
  # The network is undirected, so store both directions.
  add_pair(node_a, node_b, weight, connections)
  add_pair(node_b, node_a, weight, connections)
end

names = connections.keys
matrix = Numo::DFloat.zeros(names.length, names.length)
connections.each do |nodeA, subhash|
  index_A = names.index(nodeA)
  subhash.each do |nodeB, weight|
    index_B = names.index(nodeB)
    matrix[index_A, index_B] = weight
  end
end
Here, source is an IO object for the plain text file. My strategy is to build a hash of the pair relations and then use it to fill the matrix. The data needed to create this matrix is available at the following link:
https://www.dropbox.com/s/vqmzkazag3m3fgz/pairs.tar.gz?dl=0
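To make the strategy easier to follow without downloading the data, here is a minimal sketch of the same idea on a made-up three-line input. Everything that is not in my code above is hypothetical: the StringIO input, the index_of lookup hash (used in place of names.index) and the plain nested array, which only stands in for Numo::DFloat.zeros so that the sketch runs without the gem.

```ruby
require 'stringio'

# Hypothetical three-line input standing in for the real pairs file.
# The second line has no weight column, so it defaults to 1.0.
source = StringIO.new("a\tb\t0.5\nb\tc\nc\ta\t2.0\n")

# Nested hash: node => { neighbour => weight }.
connections = Hash.new { |hash, key| hash[key] = {} }
source.each do |line|
  node_a, node_b, weight = line.chomp.split("\t")
  weight = weight.nil? ? 1.0 : weight.to_f
  # Undirected network: store both directions.
  connections[node_a][node_b] = weight
  connections[node_b][node_a] = weight
end

names = connections.keys
# name => position lookup, built once (hypothetical alternative to names.index).
index_of = names.each_with_index.to_h

# Plain nested array as a stand-in for Numo::DFloat.zeros(n, n).
matrix = Array.new(names.length) { Array.new(names.length, 0.0) }
connections.each do |node_a, subhash|
  subhash.each do |node_b, weight|
    matrix[index_of[node_a]][index_of[node_b]] = weight
  end
end

matrix.each { |row| puts row.join("\t") }
```

The result is a symmetric matrix whose rows and columns follow the order in which the nodes first appear in the input.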
This was run with numo-narray (0.9.1.5) on openSUSE:
openSUSE 12.3 (x86_64)
VERSION = 12
PATCHLEVEL = 3
CODENAME = Malachite
Thank you in advance
Pedro Seoane