
Not able to embed more than 1000 bytes in provided png image #60

Open
sharmaprakhar opened this issue May 5, 2020 · 1 comment


sharmaprakhar commented May 5, 2020

  • SteganoGAN version or git commit: 0.1.2
  • Python version (output of python --version): 3.7
  • Pip version (output of pip --version): 19.3.1
  • PyTorch version (output of python -c "import torch; print(torch.__version__)"): 1.3.0
  • Operating System: iOS

Description

Describe what you were trying to get done.
Tell us what happened, what went wrong, and what you expected to happen.
I installed SteganoGAN using pip (following the instructions in the repo). I was experimenting with how many bytes I could embed into the 'input.png' image provided in the 'research' directory. I can successfully encode a random Python ascii_lowercase string of up to 1000 bytes into the image to generate a steganographic image. Any more than that and the decode function fails to find a message.

The Python script I am using is this:

import subprocess
import sys
from random import choice
from string import ascii_lowercase

# infile and savepath are set earlier in the script
m = 2
l = 1000
while True:
    # Build a random lowercase string and double its length until
    # sys.getsizeof reports more than 2000 bytes.
    cur_str = ''.join(choice(ascii_lowercase) for i in range(l))
    if sys.getsizeof(cur_str) > 2000:
        print('length of current string:', len(cur_str))
        print('size of the string:', sys.getsizeof(cur_str))
        break
    l = l * m

process_str = "steganogan encode " + infile + " -o " + savepath + " " + cur_str
subprocess.call(process_str, shell=True)

What I am looking for

The paper says that RS-BPP is dependent on the dataset and the message. Could you point me to an image in which I can encode close to or more than 4.4 BPP using off-the-shelf SteganoGAN? If not, could you share which dataset/settings/hyperparameters I should use to train my own model?
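As a back-of-the-envelope aside (not from the thread itself): bits per pixel is just the payload size in bits divided by the number of cover-image pixels, so you can check how far a given message is from the 4.4 BPP figure. The image dimensions below are hypothetical.

```python
def bits_per_pixel(message_bytes: int, width: int, height: int) -> float:
    """Relative payload: message bits divided by cover-image pixels."""
    return (message_bytes * 8) / (width * height)

# Hypothetical example: a 1000-byte message in a 360x360 cover image.
print(bits_per_pixel(1000, 360, 360))  # ~0.062 BPP, far below 4.4
```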


k15z commented May 5, 2020

The image provided in the research directory looks significantly different from the types of images in the training set (i.e. it's much higher resolution and contains different types of objects); I would hypothesize that this is why the performance is significantly lower than expected. You should be able to achieve better results using images from the MS-COCO or Div2K datasets (specifically, their validation sets). I believe the current set of pre-trained models was trained on MS-COCO, so you might want to start there.

Also, you shouldn't use sys.getsizeof to try to estimate the message size; you'll probably want to do the math yourself. Here's a relevant article explaining why this function shouldn't be used. In your example, you're only handling regular ASCII characters, so regardless of the encoding (UTF-8 or ASCII) each character takes up 8 bits; this means the 2,000-character string your loop ends with requires 16,000 bits of storage, which is slightly different from what sys.getsizeof reports.
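To make that concrete (my sketch, not part of the original comment): counting encoded bytes directly gives the true payload, while sys.getsizeof also includes CPython's per-object overhead.

```python
import sys

s = 'a' * 2000  # same length as the string the loop above ends with

payload_bits = len(s.encode('ascii')) * 8  # 8 bits per ASCII character
print(payload_bits)      # 16000 bits of actual message data
print(sys.getsizeof(s))  # larger: includes CPython object overhead
```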

Finally, if you want to train the model yourself to reproduce the results in the paper, you should train it on the MS-COCO dataset with the default hyperparameters. The Dense Fit.ipynb notebook in the research directory shows an example of how to use the fit API.
