report two Ace bugs #126

Open
Hitatm opened this issue Jan 14, 2019 · 15 comments

Comments

@Hitatm

Hitatm commented Jan 14, 2019

I am interested in your CrashMonkey, but I found some obvious bugs. Maybe you can fix them.

1. At copy_diff.sh:27

     if [ -e build/diff* ]

When two or more diff* files match, the glob expands to several words and the shell script exits incorrectly.
Suggestion: if [ -n "$(ls build/ | grep diff)" ]
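
The script itself is shell, so the following is only an illustration of the intended check ("does at least one path match build/diff*") written in Python. The helper name `any_diff_file` and the throwaway directory are assumptions made for the example, not part of CrashMonkey.

```python
import glob
import os
import tempfile


def any_diff_file(build_dir):
    """Return True if at least one entry in build_dir matches diff*."""
    return len(glob.glob(os.path.join(build_dir, "diff*"))) > 0


# Throwaway directory with two matching files, the case that breaks
# `[ -e build/diff* ]` in the shell script.
with tempfile.TemporaryDirectory() as build_dir:
    print(any_diff_file(build_dir))  # no matching files yet
    for name in ("diff_a", "diff_b"):
        open(os.path.join(build_dir, name), "w").close()
    print(any_diff_file(build_dir))  # two matches still give one boolean
```

The point is that the number of matches never changes the shape of the answer, which is what the suggested `-n "$(ls ... | grep diff)"` form achieves in shell.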

2. In ace.py, the type check on the variable file_names is incomplete:

   if isinstance(file_names, basestring):
       ...
   else:
       file_name = file_names[0]   # when file_names is None this fails, so nested seq-1 cannot be generated

I tried "python ace.py -l 1 -n True -d false", and ACE always reports a NoneType error.
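
A minimal sketch of the missing case, assuming `file_names` may be None, a single string, or a list. The helper `first_file_name` is hypothetical and is not ACE's code; it uses `str` (Python 3) where ace.py checks `basestring` (Python 2).

```python
def first_file_name(file_names):
    """Return one file name from None, a string, or a sequence of strings."""
    if file_names is None:
        return None  # the caller decides what "no file" means
    if isinstance(file_names, str):  # ace.py checks `basestring` on Python 2
        return file_names
    if len(file_names) == 0:
        return None
    return file_names[0]


print(first_file_name(None))
print(first_file_name("foo"))
print(first_file_name(["bar", "baz"]))
```

Whether returning None is the right behaviour for nested seq-1 workloads depends on how ACE uses the value afterwards; the sketch only shows where the None case would need to be handled.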

@Hitatm
Author

Hitatm commented Jan 14, 2019

To reproduce the first bug: run your seq2 test cases for a few seconds, interrupt the run, and rerun it. Some test cases will then report "Could not run test".

@ashmrtn
Member

ashmrtn commented Jan 14, 2019

From the description of bug 1, it sounds like an artifact of how CrashMonkey exits when interrupted. My guess would be that CrashMonkey hasn't cleaned up some of the resources it uses, so the second time you try to run the test it fails due to the resource issue. I know CrashMonkey does not exit cleanly when interrupted. The resources that cause it to fail will be the cowBrd and DiskWrapper kernel modules already inserted in the kernel and a socket with CrashMonkey in the name in /tmp. Looking in /tmp for a socket from CrashMonkey and using lsmod to see if the cowBrd or DiskWrapper kernel modules are inserted after the test harness has been interrupted should be enough to confirm this is the issue.
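
The check described above could be scripted roughly as follows. This is a sketch under assumptions: the module names `cow_brd` and `disk_wrapper` are taken from the dmesg output quoted later in this thread, the helper names are made up for the example, and the /tmp scan matches any entry with "crashmonkey" in its name without verifying that it is a socket.

```python
import os

# Module names as they appear in the dmesg output in this thread.
MODULE_NAMES = ("cow_brd", "disk_wrapper")


def loaded_modules(proc_modules_text, wanted=MODULE_NAMES):
    """Return the wanted module names found in /proc/modules-style text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in wanted if name in loaded]


def leftover_entries(directory="/tmp", marker="crashmonkey"):
    """List entries whose name contains marker (case-insensitive)."""
    try:
        names = os.listdir(directory)
    except OSError:
        return []
    return sorted(n for n in names if marker in n.lower())


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        text = ""
    print("modules still loaded:", loaded_modules(text))
    print("possible leftover entries in /tmp:", leftover_entries())
```

If either list is non-empty after an interrupted run, that would be consistent with the resource explanation; removing the modules and the socket before rerunning is then the obvious thing to try.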

As for bug 2, I'm not really familiar with the inner workings of ACE, so I'm not sure what the issue is off the top of my head.

@Hitatm
Author

Hitatm commented Jan 16, 2019

Thanks @ashmrtn for your detailed response. For bug 2, I found that I can use the scripts under the directory crashmonkey/ace/specific_generator_scripts.
In addition, I notice that you have done a lot of work on cowBrd and DiskWrapper. I have read your paper (B3) in detail and tried to apply your driver to my Android phone, but my phone's Linux kernel (Google's AOSP; the current version is 4.9 and the next version is 4.14) does not match your driver. I'm not familiar with the block layer; I tried, but couldn't make it work :( . Could you please adapt your driver to 4.9 and 4.14? If you could help, I'd really appreciate it :)

@ashmrtn
Member

ashmrtn commented Jan 16, 2019

I can see if I have some extra time in the next few weeks to try porting CrashMonkey to 4.9. My guess would be it's some mismatch in the names of variables or the build process, but I don't know if it'll be a quick fix or not since I've never tried running it on an Android device.

So that I have a little more background information, how are you building CrashMonkey for Android? The Makefile currently in the repo uses Linux kernel headers to compile the kernel modules in the project, so I would assume something similar for Android would be required, though I've never actually tried it.

Could you also give me some more information about what exactly is not working, so that I might have a better idea of what the problem may be (e.g. build errors, some error while executing, etc.)? Output from these errors would also be helpful, and maybe a quick link/guide/overview of how you are building stuff, if you don't mind, since I've never worked with Android kernels before :)

@Hitatm
Author

Hitatm commented Jan 17, 2019

Sorry @ashmrtn, I did not make it clear. I'm trying to test my filesystem under mainline Linux on a PC (QEMU). Since the Android kernel is 4.9, I plan to move my filesystem to Linux 4.9 and do some crash-consistency testing.
My kernel version is 4.9.0-rc8+. There are no build errors, and I can insmod cowBrd and DiskWrapper correctly.

Running the following commands produces no errors:
sudo insmod cow_brd.ko;
sudo insmod disk_wrapper.ko target_device_path=/dev/cow_ram_snapshot1_0   flags_device_path=/dev/vdb
dmesg | tail 

[  580.753693] cow_brd: module unloaded
[  591.843479] cow_brd: module loaded with 1 disks and 1 snapshots
[  883.808620] hwm: Hello World from module
[  883.809523] hwm: Wrapping device /dev/cow_ram_snapshot1_0 with flags device /dev/vdb
[  883.811676] hwm: working with queue with:
[  883.811676]  flags 0xf02a00
[  883.814521] hwm: initialized

Then I rmmod the two drivers and run
sudo python xfsMonkey.py -f /dev/vdb -d /dev/cow_ram0 -t f2fs -e 102400 -u build/tests/seq1/
It got stuck immediately. Here are the test log and the trace messages.

Recorded workload:
bio # time               sector             size
0     2679.261549        0                  0
    flags 0x8000000000000000: checkpoint
1     2679.299274        0x1000             0x3000
    flags 0x3070            : sync, meta, prio, flush, read ahead,
2     2679.318784        0x1                0
    flags 0x8000000000000000: checkpoint
========== PHASE 3: Running tests based on recorded data ==========
Writing data out to each Checkpoint and checking with fsck

I guess the filesystem was destroyed while the workload was being replayed, but I have no idea how to fix it.

trace

cow_brd: module verification failed: signature and/or required key missing - tainting kernel
cow_brd: module loaded with 1 disks and 20 snapshots
c_harness (1976): drop_caches: 3
hwm: Hello World from module
hwm: Wrapping device /dev/cow_ram_snapshot1_0 with flags device /dev/vdb
hwm: working with queue with:
 flags 0xf02a00
hwm: initialized
hwm: clearing data logs
hwm: turning on data logging
hwm: bio rw of size 12288 headed for 0x200000 (sector 0x1000) has flags:
hwm: making checkpoint in log
hwm: turning off data logging
hwm: no log entry here
hwm: Cleaning up bye!
------------[ cut here ]------------
WARNING: CPU: 6 PID: 1976 at fs/f2fs/node.c:1159 __get_node_page.part.33+0x220/0x420
Modules linked in: cow_brd(OE) [last unloaded: disk_wrapper]
CPU: 6 PID: 1976 Comm: c_harness Tainted: G           OE   4.9.0-rc8+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
 ffffacda446cfc28 ffffffffbb41ebaf 0000000000000000 0000000000000000
 ffffacda446cfc68 ffffffffbb0a13db 00000487afdb0800 ffffd84448d2bcc0
 ffff9e11afdb0800 0000000000000003 ffff9e11b05a5d48 0000000000000000
Call Trace:
 [<ffffffffbb41ebaf>] dump_stack+0x63/0x84
 [<ffffffffbb0a13db>] __warn+0xcb/0xf0
 [<ffffffffbb0a150d>] warn_slowpath_null+0x1d/0x20
 [<ffffffffbb36d710>] __get_node_page.part.33+0x220/0x420
 [<ffffffffbb36d922>] get_node_page.part.34+0x12/0x20
 [<ffffffffbb36f20b>] get_node_page+0x1b/0x20
 [<ffffffffbb350f9e>] f2fs_iget+0x11e/0x800
 [<ffffffffbb35c03a>] f2fs_fill_super+0x82a/0x1060
 [<ffffffffbb24e4dc>] mount_bdev+0x17c/0x1b0
 [<ffffffffbb35b810>] ? f2fs_commit_super+0xf0/0xf0
 [<ffffffffbb358485>] f2fs_mount+0x15/0x20
 [<ffffffffbb24efb2>] mount_fs+0x32/0x160
 [<ffffffffbb26ca3d>] vfs_kern_mount.part.18+0x5d/0xf0
 [<ffffffffbb26ef60>] do_mount+0x520/0xc30
 [<ffffffffbb246c9a>] ? __check_object_size+0xba/0x1f0
 [<ffffffffbb2203cf>] ? kmem_cache_alloc_trace+0x15f/0x1c0
 [<ffffffffbb26e82c>] ? copy_mount_options+0x2c/0x220
 [<ffffffffbb26f988>] SyS_mount+0x98/0xe0
 [<ffffffffbb85bd77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
---[ end trace 32fbab667e323f53 ]---
F2FS-fs (cow_ram_snapshot1_0): Failed to read root inode
------------[ cut here ]------------

@ashmrtn
Member

ashmrtn commented Jan 17, 2019

Thanks for the extra information!

That is actually an interesting dmesg log that you got back. Based on the dmesg output, I think you are correct that the file system somehow gets destroyed.

I'll see if I have some time this weekend to look at this more closely

@Hitatm
Author

Hitatm commented Jan 18, 2019

OK, thanks.

@jayashreemohan29
Member

@Hitatm Thanks for identifying the nit in the shell script. It has been fixed in #127.

@Hitatm
Author

Hitatm commented Jan 25, 2019

> @Hitatm Thanks for identifying the nit in the shell script. It has been fixed #127

OK, nice, thanks. @jayashreemohan29

@vijay03
Member

vijay03 commented Jan 27, 2019

@Hitatm @ashmrtn ported Crashmonkey to 4.9 here: #128. Can you test your code with that?

@Hitatm
Author

Hitatm commented Jan 31, 2019

> @Hitatm @ashmrtn ported Crashmonkey to 4.9 here: #128. Can you test your code with that?

Sorry, I've just noticed your message. Thanks @vijay03 and @ashmrtn, I will run the test and give you feedback.

@Hitatm
Author

Hitatm commented Jan 31, 2019

I tested seq-1 on my VM, and it works well. Thanks for your excellent work, @vijay03 @ashmrtn. (By the way, at xfsMonkey.py:94, adding the -p option to the mkdir command would be better for reruns after it exits abnormally. @jayashreemohan29)
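
The effect of `-p` can be sketched in Python as an idempotent directory creation. This is an illustration only: what xfsMonkey.py:94 actually creates is not shown in this thread, so the helper `ensure_dir` and the directory name below are hypothetical.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "results", "seq1")  # hypothetical directory
    ensure_dir(target)
    ensure_dir(target)  # second call models a rerun after an interrupted run
    print(os.path.isdir(target))
```

A plain `mkdir` without `-p` fails on the second call because the directory is still there from the interrupted run, which matches the situation described above.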

@Hitatm Hitatm closed this as completed Jan 31, 2019
@Hitatm
Author

Hitatm commented Jan 31, 2019

To clarify: the driver modules no longer trigger any kernel WARNING, but 3 bugs are reported. They might be my own fault.

Running test #281 : j-lang34 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864
Running test #283 : j-lang26 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 65536
Running test #312 : j-lang46 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864
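
To compare the three failures side by side, the report can be parsed with a short script. This is a sketch that assumes the report always has the line layout shown above; `size_mismatches` is a name made up for the example and is not part of CrashMonkey.

```python
import re

REPORT = """\
Running test #281 : j-lang34 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864
Running test #283 : j-lang26 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 65536
Running test #312 : j-lang46 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864
"""


def size_mismatches(report):
    """Return (test name, expected, actual, actual - expected) per failure."""
    results = []
    name = None
    expected = None
    for line in report.splitlines():
        m = re.match(r"Running test #\d+ : (\S+) : Failed test", line)
        if m:
            name, expected = m.group(1), None
            continue
        m = re.match(r"\s*Expected File Size = (\d+)", line)
        if m:
            expected = int(m.group(1))
            continue
        m = re.match(r"\s*Actual File Size = (\d+)", line)
        if m and name is not None and expected is not None:
            actual = int(m.group(1))
            results.append((name, expected, actual, actual - expected))
    return results


for name, expected, actual, extra in size_mismatches(REPORT):
    print(f"{name}: expected {expected}, actual {actual}, extra {extra} bytes")
```

For the report above, the recovered file is 4096 bytes larger than expected in j-lang34 and j-lang46, and 32768 bytes larger in j-lang26.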

@Hitatm Hitatm reopened this Jan 31, 2019
@Hitatm Hitatm closed this as completed Feb 1, 2019
@jayashreemohan29
Member

@Hitatm These are bugs. Can you tell me which file system and kernel version you are running these tests on? If you take a look at the workload files corresponding to the bugs you have reported, they are three different scenarios of the fallocate syscall with FALLOC_FL_KEEP_SIZE flag set. In all three cases, although the file size should have been unmodified after fallocate, CrashMonkey is reporting that the file size has changed. CrashMonkey found a similar bug on F2FS (https://github.com/utsaslab/crashmonkey/blob/master/newBugs.md#bug-9--fallocate-beyond-the-eof-recovers-to-an-incorrect-file-size), which corresponds to j-lang26.cpp. But the other two bugs you are reporting seem new.

@Hitatm
Author

Hitatm commented Feb 1, 2019

I moved the f2fs from Android P to Linux 4.9.0-rc8 and tested it on a QEMU VM. After I applied the f2fs bug 9 patch, cases j-lang26 and j-lang34 pass. Thanks @jayashreemohan29. As for the j-lang46 case, I may have introduced something wrong myself; I am working on it.
