Hang inside _IO_proc_close() during destruction. #10

Open
acidtonic opened this issue Feb 4, 2019 · 4 comments

@acidtonic

I am running into hangs when destroying the Gnuplot object; its destructor appears to hang inside an fclose call.

When this happens I am rendering a chart every few seconds, destroying it, and rendering another chart with new values.

My first suspicion was a dangling newline or something similar causing gnuplot to wait for input, but I cannot validate that idea.
I also ran the program in valgrind to make sure I wasn't corrupting memory or anything wonky.

Have you run into this before? It seems to happen only on some of the machines I test on; others run through just fine without any hangs. Is my cycling running out of file descriptors or something? When it happens, the machine has been constantly rendering charts for a few minutes.
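One way to test the file-descriptor theory is to count the process's open descriptors before and after a number of open/close cycles. The following is only a sketch: it is Linux-specific (it reads /proc/self/fd), the helper names are hypothetical, and `command` stands in for whatever the real code launches. It uses plain popen/pclose rather than the Gnuplot class.

```cpp
#include <dirent.h>
#include <stdio.h>

// Counts entries in /proc/self/fd (Linux-specific). The total includes ".",
// ".." and the descriptor opendir() itself holds, so only differences between
// two calls are meaningful. Returns -1 if the directory cannot be opened.
int count_open_fds() {
    DIR* dir = opendir("/proc/self/fd");
    if (!dir) return -1;
    int n = 0;
    while (readdir(dir) != nullptr) ++n;
    closedir(dir);
    return n;
}

// Runs `cycles` popen/pclose cycles and returns how many descriptors were
// gained. A result that grows with `cycles` would point to a leak.
int fd_growth_after_cycles(const char* command, int cycles) {
    int before = count_open_fds();
    for (int i = 0; i < cycles; ++i) {
        FILE* pipe = popen(command, "w");
        if (pipe) pclose(pipe);
    }
    return count_open_fds() - before;
}
```

Calling `fd_growth_after_cycles` around the real render loop (instead of the popen stand-in) would show whether descriptors accumulate over a few minutes of rendering.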

Below is the stack trace. I have redacted unrelated function names on my side.

#0  0x00007f6a717ee848 in _IO_proc_close () from /lib64/libc.so.6
#1  0x00007f6a717fb97c in _IO_file_close_it () from /lib64/libc.so.6
#2  0x00007f6a717ec3fe in fclose () from /lib64/libc.so.6
#3  0x00005588888fae09 in gnuplotio::FileHandleWrapper::fh_close (this=0x7fff9c71c7e0) at [REDACTED]gnuplot-iostream.h:1536
#4  0x00005588888fb572 in gnuplotio::Gnuplot::~Gnuplot (this=0x7fff9c71c750, __in_chrg=<optimized out>, __vtt_parm=<optimized out>)
    at [REDACTED]gnuplot-iostream.h:1647
#5  0x00005588888feda4 in BaseChart<float>::render (this=0x55888a485c90, height=463, width=1763) at [REDACTED]BaseChart.hpp:80
#6  0x00005588888e7fb2 in [REDACTED] (this=0x55888ae4f800, io_condition=Glib::IO_IN, ch=..., is_out=false)
    at [REDACTED]:655
#7  0x00005588888eb3f9 in [REDACTED]::<lambda(Glib::IOCondition)>::operator()(Glib::IOCondition) const (__closure=0x55888b198d90, cond=Glib::IO_IN)
    at [REDACTED]:1242
#8  0x00005588888f8760 in sigc::adaptor_functor<[REDACTED]::[REDACTED]()::<lambda(Glib::IOCondition)> >::operator()<const Glib::IOCondition&>(const Glib::IOCondition &) const (this=0x55888b198d90, _A_arg1=@0x7fff9c71d164: Glib::IO_IN) at /usr/include/sigc++-2.0/sigc++/adaptors/adaptor_trait.h:89
#9  0x00005588888f7d4e in sigc::internal::slot_call1<[REDACTED]::[REDACTED]()::<lambda(Glib::IOCondition)>, bool, Glib::IOCondition>::call_it(sigc::internal::slot_rep *, sigc::type_trait_take_t) (rep=0x55888b198d60, a_1=@0x7fff9c71d164: Glib::IO_IN) at /usr/include/sigc++-2.0/sigc++/functors/slot.h:148
#10 0x00007f6a889bdc1f in ?? () from /usr/lib64/libglibmm-2.4.so.1
#11 0x00007f6a862e47f5 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#12 0x00007f6a862e4bb8 in ?? () from /usr/lib64/libglib-2.0.so.0
#13 0x00007f6a862e4ed2 in g_main_loop_run () from /usr/lib64/libglib-2.0.so.0
#14 0x00007f6a882819b5 in gtk_main () from /usr/lib64/libgtk-3.so.0
#15 0x00007f6a899cca96 in Gtk::Main::run(Gtk::Window&) () from /usr/lib64/libgtkmm-3.0.so.1
#16 0x0000558888866544 in main (argc=2, argv=0x7fff9c71d628) at [REDACTED]:135

Appreciate any help.

@acidtonic
Author

Forgot to mention the platform is Linux on Gentoo.
Linux 4.14.61-gentoo

@dstahlke
Owner

I replied to the GitHub email a while back, but I'm not sure whether that went anywhere, since it didn't appear on this page.

I'm running Linux too (Ubuntu), so that's the most tested platform. I have no idea what could cause the hang. The pclose function will hang until the gnuplot process exits, which should normally happen right away. Are you using fork? That might do something odd to the file descriptors. Threads should be okay, I think, as long as you don't have two threads messing with the same Gnuplot object.
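The blocking behaviour of pclose can be seen in isolation with a short sketch. The function name is hypothetical and `command` is a stand-in for the gnuplot invocation; the point is only that pclose does not return until the child has exited.

```cpp
#include <chrono>
#include <stdio.h>

// Returns how many seconds pclose() spent waiting for the child to exit,
// or -1.0 if the child could not be started.
double seconds_blocked_in_pclose(const char* command) {
    FILE* pipe = popen(command, "w");
    if (!pipe) return -1.0;
    auto start = std::chrono::steady_clock::now();
    pclose(pipe);  // closes our end of the pipe, then waits for the child
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

With a child such as `sleep 1`, which never reads its input, the call takes about a second. A child that is itself stuck, as gnuplot appears to be in this report, would keep the caller waiting indefinitely.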

As for running out of file descriptors... I would hope that would not be a problem as long as they are being closed. But I don't know that much about it. I've never used it in a mode where the Gnuplot objects are rapidly created and destroyed. I am interested in finding out what's going on here, but if you need a workaround you might try just creating a single Gnuplot object and re-using that. The included example program example-misc.cc contains an animation example that rapidly re-plots new data with a single gnuplot invocation.
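The single-object workaround amounts to opening one pipe, sending many batches of commands through it, and closing it once at the end. Below is a minimal sketch of that pattern using plain popen rather than the library's Gnuplot class; `PipeSession` and `render_frames` are hypothetical names, and the plot command is only illustrative.

```cpp
#include <stdio.h>
#include <string>

// Owns one long-lived pipe to a child process.
class PipeSession {
public:
    explicit PipeSession(const std::string& command)
        : pipe_(popen(command.c_str(), "w")) {}
    ~PipeSession() { close(); }
    PipeSession(const PipeSession&) = delete;
    PipeSession& operator=(const PipeSession&) = delete;

    bool ok() const { return pipe_ != nullptr; }

    // Sends one batch of commands and flushes so the child sees it right away.
    bool send(const std::string& commands) {
        if (!pipe_) return false;
        if (fputs(commands.c_str(), pipe_) == EOF) return false;
        return fflush(pipe_) == 0;
    }

    // Closes the pipe once and returns the child's wait status.
    // Returns -1 if the pipe was never opened or is already closed.
    int close() {
        if (!pipe_) return -1;
        int status = pclose(pipe_);
        pipe_ = nullptr;
        return status;
    }

private:
    FILE* pipe_;
};

// Sends `frames` plot commands through a single child process.
int render_frames(const std::string& command, int frames) {
    PipeSession session(command);
    if (!session.ok()) return -1;
    for (int i = 0; i < frames; ++i) {
        if (!session.send("plot sin(x + " + std::to_string(i) + ")\n")) return -1;
    }
    return session.close();
}
```

Here the child is started and waited on once per session instead of once per chart, so the create/destroy cycle that precedes the hang is avoided entirely.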

Can you send me a minimal example program that reproduces the problem?

@acidtonic
Author

acidtonic commented Feb 22, 2019

Sorry for the delay. I thought I had fixed it, but I ended up reproducing it.

I have determined that gnuplot is hung as well when this happens. It might be an issue on their side; I'm not sure.

I noticed that when this happens, I have a spare gnuplot process running with the -persist argument. I attached gdb to it and got a stack trace (no debug symbols for now, but it's in syscalls anyway).

I cannot get a test program together at the moment, since the codebase on our side is massive. I'll try to see if I can get something, but it's doubtful.

The only thing I noticed is that gnuplot is inside a signal handler and blocked on a write. Our side has already entered the destructor, so perhaps our side is destroying the pipe too quickly and there is a race?

(gdb) bt
#0  0x00007ffff3193148 in write () from /lib64/libc.so.6
#1  0x00007ffff311e74d in _IO_file_write () from /lib64/libc.so.6
#2  0x00007ffff311d953 in ?? () from /lib64/libc.so.6
#3  0x00007ffff311fb12 in _IO_do_write () from /lib64/libc.so.6
#4  0x00007ffff311ef27 in _IO_file_xsputn () from /lib64/libc.so.6
#5  0x00007ffff31115e8 in fwrite () from /lib64/libc.so.6
#6  0x00007ffff79471db in ?? () from /usr/lib64/libgd.so.3
#7  0x00007ffff2c32dba in png_write_sig () from /usr/lib64/libpng16.so.16
#8  0x00007ffff2c2f8b9 in png_write_info_before_PLTE () from /usr/lib64/libpng16.so.16
#9  0x00007ffff2c2fad7 in png_write_info () from /usr/lib64/libpng16.so.16
#10 0x00007ffff7949fdd in ?? () from /usr/lib64/libgd.so.3
#11 0x00007ffff794b2b9 in gdImagePng () from /usr/lib64/libgd.so.3
#12 0x000055555566aaf6 in ?? ()
#13 0x0000555555673343 in ?? ()
#14 0x00005555555e9136 in ?? ()
#15 <signal handler called>
#16 0x00007ffff3193148 in write () from /lib64/libc.so.6
#17 0x00007ffff311e74d in _IO_file_write () from /lib64/libc.so.6
#18 0x00007ffff311d953 in ?? () from /lib64/libc.so.6
#19 0x00007ffff311fb12 in _IO_do_write () from /lib64/libc.so.6
#20 0x00007ffff311ffb3 in _IO_file_overflow () from /lib64/libc.so.6
#21 0x00007ffff3121154 in _IO_default_xsputn () from /lib64/libc.so.6
#22 0x00007ffff311ef71 in _IO_file_xsputn () from /lib64/libc.so.6
#23 0x00007ffff31115e8 in fwrite () from /lib64/libc.so.6
#24 0x00007ffff79471db in ?? () from /usr/lib64/libgd.so.3
#25 0x00007ffff2c32e31 in png_write_chunk_data () from /usr/lib64/libpng16.so.16
#26 0x00007ffff2c32f80 in ?? () from /usr/lib64/libpng16.so.16
#27 0x00007ffff2c34fd8 in ?? () from /usr/lib64/libpng16.so.16
#28 0x00007ffff2c358df in ?? () from /usr/lib64/libpng16.so.16
#29 0x00007ffff2c30312 in png_write_row () from /usr/lib64/libpng16.so.16
#30 0x00007ffff2c3067b in png_write_image () from /usr/lib64/libpng16.so.16
#31 0x00007ffff794a2d3 in ?? () from /usr/lib64/libgd.so.3
#32 0x00007ffff794b2b9 in gdImagePng () from /usr/lib64/libgd.so.3
#33 0x000055555566aaf6 in ?? ()
#34 0x0000555555673292 in ?? ()
#35 0x0000555555673303 in ?? ()
#36 0x000055555567cf87 in ?? ()
#37 0x000055555559c7da in ?? ()
#38 0x000055555559cb1d in ?? ()
#39 0x000055555558b16c in ?? ()
#40 0x00007ffff30c0ae7 in __libc_start_main () from /lib64/libc.so.6
#41 0x000055555558c3ea in ?? ()

@dstahlke
Owner

At first I thought the signal might be SIGPIPE, sent when the writer (the gnuplot-iostream library) calls pclose. However, the pipe(7) man page says that SIGPIPE is only sent to the writer. The reader (gnuplot) should just get EOF. So I have no idea what that signal is or why gnuplot would hang on writing the png, unless the filesystem is blocking (e.g. if it is flushing the write cache after tons of writes).
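The asymmetry described in pipe(7) can be checked with a small sketch. The function names are hypothetical, and SIGPIPE is temporarily ignored in the second function so that the failure is reported through errno instead of terminating the process.

```cpp
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Reader side: once every write end is closed, read() returns 0 (EOF).
// No signal is delivered to the reader.
long read_after_writer_closed() {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[1]);  // the writer goes away
    char buf[16];
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    return n;
}

// Writer side: once every read end is closed, write() raises SIGPIPE.
// With SIGPIPE ignored, write() fails and errno is set to EPIPE instead.
int errno_of_write_after_reader_closed() {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    auto previous = signal(SIGPIPE, SIG_IGN);
    close(fds[0]);  // the reader goes away
    ssize_t n = write(fds[1], "x", 1);
    int err = (n == -1) ? errno : 0;
    close(fds[1]);
    signal(SIGPIPE, previous);  // restore the earlier disposition
    return err;
}
```

The first function returns 0 and the second returns EPIPE, which matches the reading that closing the command pipe should only produce EOF on gnuplot's side.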

It's really bizarre that the stack trace goes png_write_image -> fwrite -> signal handler -> png_write_info -> fwrite. Maybe gdb is confused here or the stack got screwed up.
