Hipi on Spark #31

sdikby · 2016-11-03T08:36:01Z

Dear HIPI developers,

Do you plan to integrate Apache Spark in place of the old MapReduce? If so, when?
Otherwise, could you give me some hints on how to do it?
My use case is that I need to classify millions of images, and MapReduce will not be as efficient as I need it to be.
@sweeneychris @liuliu @voigtlandier @zverham @hafnium

yangboz · 2016-11-04T11:56:38Z

@sdikby have you successfully run HIPI's hibImport.sh on millions of images?

sdikby · 2016-11-13T21:05:12Z

@yangboz sorry for the delay.
No, I haven't even started using HIPI. My use case is to process millions of images in Hadoop, but I don't think MapReduce is performant enough, or that it is even possible with HIPI, since it has not been maintained for around a year now (the last commit was on 12 April).

yangboz · 2016-11-14T01:16:44Z

@sdikby thanks for your reply. I totally agree with your comment about the lack of updates to the HIPI source code; I also found issue #30, which got no response.
By the way, apart from HIPI, are there any other Hadoop sequence file solutions for millions of image files?

sdikby · 2016-11-14T09:52:48Z

@yangboz I know of two other tools for image processing, but I haven't tried them yet (I just began my master's thesis :) ).
There is Mipr: https://github.com/sozykin/mipr
and this one: https://github.com/okstate-robotics/hipl
Both are also based on MapReduce.
Otherwise I don't know the difference between them. Feel free to test them; I would be happy to get feedback from you.

yangboz · 2016-11-14T10:04:59Z

@sdikby thanks for the suggestions, I will try them. My ideas come from:
http://dinesh-malav.blogspot.com/2015/05/image-processing-using-opencv-on-hadoop.html
It is a great tutorial on CDH (MR1) + HIPI v1 + Ant, but nowadays HIPI uses gradlew and is at version v2+, which is why I am struggling with code base modifications.

sdikby · 2016-11-14T10:24:13Z

@yangboz it would also be great to know how the 3 tools/frameworks store images on HDFS (to deal with the block size problem, for example) and the major differences between them (read/write performance from/into HDFS).
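
To make the block size problem concrete: HDFS keeps metadata for every file in NameNode memory and uses large blocks (128 MB by default in Hadoop 2), so millions of small image files are usually packed into a few large container files plus an index. Below is a minimal sketch of that idea in plain Python. It assumes a hypothetical length-prefixed record layout; it is not the actual on-disk format of HIPI, Mipr, or hipl, and `pack_images` / `read_image` are illustrative names only.

```python
import io
import struct

# Hypothetical record layout (an assumption for illustration only):
#   4-byte big-endian name length, 4-byte big-endian data length,
#   then the UTF-8 file name, then the raw image bytes.
HEADER = struct.Struct(">II")


def pack_images(images, out):
    """Append (name, data) pairs to the binary stream `out`.

    Returns an index of (name, offset) pairs so that a single image
    can later be read without scanning the whole bundle.
    """
    index = []
    for name, data in images:
        name_bytes = name.encode("utf-8")
        offset = out.tell()
        out.write(HEADER.pack(len(name_bytes), len(data)))
        out.write(name_bytes)
        out.write(data)
        index.append((name, offset))
    return index


def read_image(bundle, offset):
    """Read one (name, data) record starting at `offset` in `bundle`."""
    bundle.seek(offset)
    name_len, data_len = HEADER.unpack(bundle.read(HEADER.size))
    name = bundle.read(name_len).decode("utf-8")
    data = bundle.read(data_len)
    return name, data


if __name__ == "__main__":
    # Tiny in-memory demo with fake image bytes.
    bundle = io.BytesIO()
    index = pack_images([("a.jpg", b"\x01\x02"), ("b.jpg", b"\x03")], bundle)
    for name, offset in index:
        print(name, offset, read_image(bundle, offset))
```

With a layout like this, one bundle file can span many HDFS blocks while the NameNode only tracks a single file; how each of the three tools actually does it is exactly what would need to be compared.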

yangboz · 2016-11-14T10:32:36Z

@sdikby before those 3 tools/frameworks, I studied existing solutions based on Ceph and even Cassandra image blob storage. Conclusions are coming soon.

yangboz · 2016-11-28T13:15:47Z

@sdikby comparing Mipr: https://github.com/sozykin/mipr (full documentation, and the code example passed)
with this one: https://github.com/okstate-robotics/hipl (documentation is missing!)

sdikby · 2016-11-28T18:24:28Z

@yangboz oh, good job! And what about performance? Did you compare the two in terms of images written/read per second?
And how do they both store images on HDFS, especially how do they deal with the block size problem?

yangboz · 2016-12-05T06:22:19Z

@sdikby there is a paper (please drop me a line if you need it) comparing Hadoop and Spark performance, including indexing and retrieval.
According to its results, integrating Hadoop and Spark to process 160k pictures on a 30-node cluster improves efficiency.

sdikby · 2016-12-09T12:44:25Z

@yangboz could you please send me this paper?
I plan to run a performance test of the 3 cited frameworks in the coming months.
