I'm trying to use Senti4SD on a large dataset (~100M lines of text) and would like to drive most of the pipeline from R to improve performance. In particular, I want to avoid creating the large CSV file that contains the features.
To do that, I want to run Senti4SD on chunks of the data. However, this slows the whole process down considerably, because Senti4SD-fast.jar has to reload dsm.bin every time the script is called. To get around this, I want to use rJava to start the JVM from R itself, load dsm.bin once, and run the feature extraction on each chunk without writing the result to a file.
Is there any documentation that explains how to call the feature extraction through rJava without creating files?