
Not able to embed more than 1000 bytes in provided png image #60

Open
sharmaprakhar opened this issue May 5, 2020 · 1 comment


sharmaprakhar commented May 5, 2020

  • SteganoGAN version or git commit: 0.1.2
  • Python version (output of python --version): 3.7
  • Pip version (output of pip --version): 19.3.1
  • PyTorch version (output of python -c "import torch; print(torch.__version__)"): 1.3.0
  • Operating System: iOS

Description

Describe what you were trying to get done.
Tell us what happened, what went wrong, and what you expected to happen.
I installed SteganoGAN using pip (following the instructions in the repo). I was experimenting with how many bytes I could embed into the 'input.png' image provided in the 'research' directory. I can successfully encode a random Python ascii_lowercase string of up to 1000 bytes into the image to generate a steganographic image. Any more than that and the decode function fails to find a message.

The Python script I am using is this:

import subprocess
import sys
from random import choice
from string import ascii_lowercase

# infile and savepath are set earlier in the script
m = 2
l = 1000
while True:
    # Build a random lowercase string and double its length until
    # sys.getsizeof reports more than 2000 bytes.
    cur_str = ''.join(choice(ascii_lowercase) for i in range(l))
    if sys.getsizeof(cur_str) > 2000:
        print('length of current string:', len(cur_str))
        print('size of the string:', sys.getsizeof(cur_str))
        break
    l = l * m

process_str = "steganogan encode " + infile + " -o " + savepath + " " + cur_str
subprocess.call(process_str, shell=True)

What I am looking for

The paper says that RS-BPP is dependent on the dataset and the message. Could you point me to an image in which I can encode close to or more than 4.4 BPP using off-the-shelf SteganoGAN? If not, could you share which dataset/settings/hyperparameters I should use to train my own model?
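As a back-of-the-envelope aside (not from the thread itself): bits per pixel is just the payload size in bits divided by the number of cover-image pixels, so you can check how far a given message is from the 4.4 BPP figure. The image dimensions below are hypothetical.

```python
def bits_per_pixel(message_bytes: int, width: int, height: int) -> float:
    """Relative payload: message bits divided by cover-image pixels."""
    return (message_bytes * 8) / (width * height)

# Hypothetical example: a 1000-byte message in a 360x360 cover image.
print(bits_per_pixel(1000, 360, 360))  # ~0.062 BPP, far below 4.4
```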


k15z commented May 5, 2020

The image provided in the research directory looks significantly different from the types of images in the training set (i.e. it's much higher resolution and contains different types of objects); I would hypothesize that this is why the performance is significantly lower than expected. You should be able to achieve better results using images from the MS-COCO or Div2K datasets (specifically, their validation sets). I believe the current set of pre-trained models was trained on MS-COCO, so you might want to start there.

Also, you shouldn't use sys.getsizeof to try to estimate the message size; you'll probably want to do the math yourself. Here's a relevant article explaining why this function shouldn't be used. In your example, you're only handling regular ASCII characters, so regardless of the encoding (UTF-8 or ASCII) each character takes up 8 bits; this means the 2,000-character string your loop ends with requires 16,000 bits of storage, which is slightly different from what sys.getsizeof reports.
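To make that concrete (my sketch, not part of the original comment): counting encoded bytes directly gives the true payload, while sys.getsizeof also includes CPython's per-object overhead.

```python
import sys

s = 'a' * 2000  # same length as the string the loop above ends with

payload_bits = len(s.encode('ascii')) * 8  # 8 bits per ASCII character
print(payload_bits)      # 16000 bits of actual message data
print(sys.getsizeof(s))  # larger: includes CPython object overhead
```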

Finally, if you want to train the model yourself to reproduce the results in the paper, you should train it on the MS-COCO dataset with the default hyperparameters. The Dense Fit.ipynb notebook in the research directory shows an example of how to use the fit API.
