Issue with marshal dump #144

Open
@seoanezonjic

Description

Hi, authors of numo-narray,
I work with biological networks, so I need matrix operations frequently. My project used the old NMatrix library, but I recently found your project. Your NumPy-like style and your support for the community convinced me to replace NMatrix with numo-narray. I have seen a great improvement in memory use, speed and code clarity with your library, but I have problems serializing large matrices. With a matrix of 17078x17078 elements I get the following error:
in dump: long too big to dump (TypeError)
Here is the context for the error. The first step of my pipeline takes a network defined by pairs, transforms it into a Numo::NArray object and writes it to disk with:
File.binwrite(options[:output_matrix_file], Marshal.dump(matrix))
This way I can run several different executions without wasting time on the plain-text-to-matrix transformation: I read my serialized matrix and perform the operation I am interested in. The plain-text-to-matrix transformation is done with the following code:

def add_pair(node_a, node_b, weight, connections)
	query = connections[node_a]
	if !query.nil?
		query[node_b] = weight
	else
		subhash = Hash.new(0.0)
		subhash[node_b] = weight
		connections[node_a] = subhash
	end
end

connections = {}
source.each do |line|
	node_a, node_b, weight = line.chomp.split("\t")
	weight = weight.nil? ? 1.0 : weight.to_f
	add_pair(node_a, node_b, weight, connections)
	add_pair(node_b, node_a, weight, connections)
end
names = connections.keys
matrix = Numo::DFloat.zeros(names.length, names.length)
connections.each do |nodeA, subhash|
	index_A = names.index(nodeA)
	subhash.each do |nodeB, weight|
		index_B = names.index(nodeB)
		matrix[index_A, index_B] = weight
	end
end	

Here, source is an IO object for the plain text file. My strategy is to build a hash of the pair relations and use it to fill the matrix. The data needed to create this matrix is available at the following link:
https://www.dropbox.com/s/vqmzkazag3m3fgz/pairs.tar.gz?dl=0
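
A quick size check may help explain the error. The sketch below computes how many bytes a 17078x17078 array of 64-bit floats occupies and compares it with the largest signed 32-bit integer. It assumes, without confirmation from the Marshal source, that Marshal cannot write a length field above that value and that the matrix is dumped as a single byte string. The names payload_bytes and INT32_MAX are illustrative only.

```ruby
# Largest signed 32-bit integer; assumed here to be the ceiling for a
# length field written by Marshal (an assumption, not verified).
INT32_MAX = 2**31 - 1

# Bytes needed for the raw element data of a rows x cols matrix.
def payload_bytes(rows, cols, bytes_per_element)
  rows * cols * bytes_per_element
end

# Numo::DFloat elements are 64-bit floats, i.e. 8 bytes each.
size = payload_bytes(17_078, 17_078, 8)

puts "payload bytes: #{size}"
puts "int32 max:     #{INT32_MAX}"
puts "exceeds limit: #{size > INT32_MAX}"
```

If that assumption holds, the raw data alone is already larger than what a single dumped string could describe, which would be consistent with the "long too big to dump" message.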

This was run with numo-narray (0.9.1.5) on openSUSE:
openSUSE 12.3 (x86_64)
VERSION = 12
PATCHLEVEL = 3
CODENAME = Malachite
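
One possible workaround, sketched below, is to bypass Marshal and write the shape plus the raw element bytes directly. The helpers write_raw_matrix and read_raw_matrix are hypothetical and not part of numo-narray. The sketch assumes the array can be turned into a byte string with to_binary and rebuilt with Numo::DFloat.from_binary(bytes, shape); those calls appear only in a comment and were not run against 0.9.1.5.

```ruby
# Hypothetical helpers (not part of numo-narray): store a small header with
# the shape, followed by the raw element bytes, without using Marshal.

# Write `shape` (Array of Integers) and `bytes` (binary String) to `path`.
def write_raw_matrix(path, shape, bytes)
  File.open(path, 'wb') do |f|
    f.write([shape.length].pack('Q<'))  # number of dimensions, 64-bit little-endian
    f.write(shape.pack('Q<*'))          # each dimension, 64-bit little-endian
    f.write(bytes)                      # raw element data, written as-is
  end
end

# Read back [shape, bytes] from a file produced by write_raw_matrix.
def read_raw_matrix(path)
  File.open(path, 'rb') do |f|
    ndim  = f.read(8).unpack1('Q<')
    shape = f.read(8 * ndim).unpack('Q<*')
    bytes = f.read                      # everything after the header
    [shape, bytes]
  end
end

# Intended use with Numo (assumed API, not executed in this sketch):
#   write_raw_matrix(path, matrix.shape, matrix.to_binary)
#   shape, bytes = read_raw_matrix(path)
#   matrix = Numo::DFloat.from_binary(bytes, shape)
```

The element bytes are stored in whatever byte order the array uses in memory, so a file written this way is only safe to read back on a machine with the same endianness.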

Thank you in advance
Pedro Seoane
