This repository has been archived by the owner on Dec 22, 2020. It is now read-only.

Document too large: This BSON document is limited to 4194304 bytes. (BSON::InvalidDocument) #101

Open
wdarosh opened this issue Aug 3, 2015 · 4 comments

Comments

@wdarosh

wdarosh commented Aug 3, 2015

I am migrating a system from MongoDB 2.4.12 to PostgreSQL 9.4.X. Most of the collections translate, but I am unable to get past the error below.

I have tried swapping out the driver, but I have had no luck getting MoSQL to detect and use the new driver.

/var/lib/gems/1.9.1/gems/bson-1.12.3/lib/bson/bson_c.rb:20:in `serialize': Document too large: This BSON document is limited to 4194304 bytes. (BSON::InvalidDocument)
        from /var/lib/gems/1.9.1/gems/bson-1.12.3/lib/bson/bson_c.rb:20:in `serialize'
        from /var/lib/gems/1.9.1/gems/bson-1.12.3/lib/bson.rb:19:in `serialize'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/schema.rb:212:in `transform'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:147:in `block (3 levels) in import_collection'
        from /var/lib/gems/1.9.1/gems/mongo-1.12.3/lib/mongo/cursor.rb:343:in `each'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:146:in `block (2 levels) in import_collection'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:70:in `block in with_retries'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:68:in `times'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:68:in `with_retries'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:145:in `block in import_collection'
        from /var/lib/gems/1.9.1/gems/mongo-1.12.3/lib/mongo/collection.rb:291:in `find'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:144:in `import_collection'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:122:in `block (2 levels) in initial_import'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:120:in `each'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:120:in `block in initial_import'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:108:in `each'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:108:in `initial_import'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/streamer.rb:28:in `import'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/cli.rb:167:in `run'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/lib/mosql/cli.rb:16:in `run'
        from /var/lib/gems/1.9.1/gems/mosql-0.4.3/bin/mosql:5:in `<top (required)>'
        from /usr/local/bin/mosql:23:in `load'
        from /usr/local/bin/mosql:23:in `<main>'
@dmitrypisanko

I have the same problem. Any solution?

@bbdurall

bbdurall commented Jun 16, 2016

I've figured out a solution, but my Ruby knowledge is very minimal, so I'll need some assistance in getting this patch into the proper form to add to the repo.

First, install the deep clone gem, from the Unix shell:
gem install ruby_deep_clone

Then, comment out line 212 of schema.rb:
obj = BSON.deserialize(BSON.serialize(obj))

and underneath insert the following lines:
require "deep_clone"
obj = DeepClone.clone(original)

I don't think this is the proper way to introduce an external dependency to the project, but as a quick hack it worked for me. It's quite slow on large objects (it took over 5 minutes to process about 2000 rows containing large PDFs), but it eventually inserts them into the Postgres database.
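
To illustrate why the copy on that line has to be a deep one, here is a minimal sketch using only the Ruby standard library. `Marshal` stands in for `DeepClone.clone` so the example runs without extra gems. The sample `doc` hash and the `deep_copy` helper are made up for illustration and are not part of MoSQL, and the sketch does not check whether `Marshal` copes with every BSON type in a real collection.

```ruby
# Hypothetical document shaped like a Mongo record with an embedded hash.
doc = { "_id" => 1, "meta" => { "tags" => ["a", "b"] } }

# Hypothetical helper: round-trip through Marshal so every nested object
# is copied. Marshal is only a stand-in for DeepClone.clone here.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

shallow = doc.dup          # copies only the top-level hash
deep    = deep_copy(doc)   # copies the nested hash and array as well

shallow["meta"]["tags"] << "c"   # leaks into doc: the nested hash is shared
deep["meta"]["tags"] << "d"      # leaves doc alone

puts doc["meta"]["tags"].inspect
```

Only the change made through the shallow copy shows up in `doc`, which is the kind of accidental mutation a deep copy is meant to prevent.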

@ebroder
Contributor

ebroder commented Jun 16, 2016

Hmm, the issue here is likely that BSON.serialize uses the original default maximum BSON size (4MB). The maximum has since been raised, but increasing it relies on negotiating the new limit with the connection.

Replacing BSON.serialize with something like BSON::BSON_CODER.serialize(obj, false, false, 16*1024*1024) will likely also fix your issue, without requiring a new dependency.
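
The numbers line up with the error message: 4 MB is the 4194304 bytes it quotes. A rough sketch of the arithmetic follows. The `fits?` helper and the dummy payload are hypothetical and only compare byte counts. Nothing here calls the BSON gem, and the exact comparison the gem performs is not something this sketch asserts.

```ruby
# The limit quoted in the error message, and the larger limit suggested above.
OLD_LIMIT = 4 * 1024 * 1024    # 4194304 bytes
NEW_LIMIT = 16 * 1024 * 1024   # 16777216 bytes

# Hypothetical helper: would a payload of this many bytes pass a given limit?
def fits?(byte_size, limit)
  byte_size <= limit
end

# Stand-in for a serialized document carrying a large PDF (5 MiB of zero bytes).
payload = "\0" * (5 * 1024 * 1024)

puts fits?(payload.bytesize, OLD_LIMIT)   # false
puts fits?(payload.bytesize, NEW_LIMIT)   # true
```

A document between 4 MB and 16 MB would be rejected under the old limit but pass under the new one.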

@bbdurall

I've verified that changing line 212 of schema.rb to:
obj = BSON.deserialize(BSON::BSON_CODER.serialize(obj, false, false, 16*1024*1024))
fixes the issue. I tried to push a new branch with the fix, but I don't seem to have permission to do so. What's the best way to get this fix into the master branch?

jdjkelly added a commit to uniiverse/mosql that referenced this issue Oct 22, 2018
ankitpopli1891 added a commit to upkeepapp/mosql that referenced this issue Dec 11, 2018