[VPP-1711] vppctl makes VPP hang-up occasionally #3174

Closed
@vvalderrv

Description

> Description

We found that VPP itself sometimes freezes when we run the vppctl command repeatedly.

The issue does not appear immediately, but it typically occurs within 3 to 7 days under heavy, repeated use of vppctl (e.g., calling vppctl once per second).

> VPP version

VPP 18.10 with 2 patches:

    - https://gerrit.fd.io/r/#/c/12680/
    - https://gerrit.fd.io/r/#/c/18826/

> log & coredump

This is the gdb backtrace taken when VPP hung and was stopped manually by HAP.

Please see the attached crash dump file for details.

    (gdb) bt
    #0 0x00007fbdc1f24207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
    #1 0x00007fbdc1f258f8 in __GI_abort () at abort.c:90
    #2 0x0000000000407cee in os_exit (code=code@entry=1) at /usr/src/debug/vpp-18.10/src/vpp/vnet/main.c:349
    #3 0x00007fbdc35e749c in unix_signal_handler (signum=<optimized out>, si=<optimized out>, uc=<optimized out>) at /usr/src/debug/vpp-18.10/src/vlib/unix/main.c:157
    #4 <signal handler called>
    #5 0x00007fbdc1fd0d47 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
    #6 0x00007fbdc2d0a0be in spin_acquire_lock (sl=sl@entry=0x7fbd808a5384) at /usr/src/debug/vpp-18.10/src/vppinfra/dlmalloc.c:466
    #7 0x00007fbdc2d0b960 in mspace_malloc (msp=0x7fbd808a5010, bytes=bytes@entry=13) at /usr/src/debug/vpp-18.10/src/vppinfra/dlmalloc.c:4339
    #8 0x00007fbdc2d0d1c8 in mspace_get_aligned (msp=0x7fbd808a5010, n_user_data_bytes=13, n_user_data_bytes@entry=9, align=align@entry=8, align_offset=align_offset@entry=8) at /usr/src/debug/vpp-18.10/src/vppinfra/dlmalloc.c:4178
    #9 0x00007fbdc2d02880 in clib_mem_alloc_aligned_at_offset (os_out_of_memory_on_failure=1, align_offset=8, align=8, size=9) at /usr/src/debug/vpp-18.10/src/vppinfra/mem.h:118
    #10 vec_resize_allocate_memory (v=v@entry=0x0, length_increment=length_increment@entry=1, data_bytes=9, header_bytes=<optimized out>, header_bytes@entry=0, data_align=data_align@entry=8) at /usr/src/debug/vpp-18.10/src/vppinfra/vec.c:59
    #11 0x00007fbdc35ddf73 in _vec_resize_inline (data_align=<optimized out>, header_bytes=<optimized out>, data_bytes=<optimized out>, length_increment=<optimized out>, v=<optimized out>) at /usr/src/debug/vpp-18.10/src/vppinfra/vec.h:145
    #12 unix_cli_add_pending_output (uf=0x7fbd8181238c, buffer=0x7fbdc35efa6d "\r", buffer_bytes=1, cf=<optimized out>) at /usr/src/debug/vpp-18.10/src/vlib/unix/cli.c:544
    #13 0x00007fbdc35e0e50 in unix_cli_file_welcome (cf=0x7fbd819af9cc, cm=<optimized out>) at /usr/src/debug/vpp-18.10/src/vlib/unix/cli.c:1216
    #14 0x00007fbdc2cebb6c in timer_interrupt (signum=<optimized out>) at /usr/src/debug/vpp-18.10/src/vppinfra/timer.c:129
    #15 <signal handler called>
    #16 0x00007fbdc2d0b866 in mspace_malloc (msp=0x7fbd808a5010, bytes=bytes@entry=20) at /usr/src/debug/vpp-18.10/src/vppinfra/dlmalloc.c:4355
    #17 0x00007fbdc2d0d1c8 in mspace_get_aligned (msp=0x7fbd808a5010, n_user_data_bytes=20, n_user_data_bytes@entry=16, align=align@entry=8, align_offset=align_offset@entry=8) at /usr/src/debug/vpp-18.10/src/vppinfra/dlmalloc.c:4178
    #18 0x00007fbdc2d02880 in clib_mem_alloc_aligned_at_offset (os_out_of_memory_on_failure=1, align_offset=8, align=8, size=16) at /usr/src/debug/vpp-18.10/src/vppinfra/mem.h:118
    #19 vec_resize_allocate_memory (v=v@entry=0x0, length_increment=1, data_bytes=16, data_bytes@entry=8, header_bytes=<optimized out>, header_bytes@entry=0, data_align=data_align@entry=8) at /usr/src/debug/vpp-18.10/src/vppinfra/vec.c:59
    #20 0x00007fbdc35e1f70 in _vec_resize_inline (data_align=8, header_bytes=0, data_bytes=8, length_increment=1, v=<optimized out>) at /usr/src/debug/vpp-18.10/src/vppinfra/vec.h:145
    #21 vlib_process_get_events (data_vector=<optimized out>, vm=0x7fbdc37fd200 <vlib_global_main>) at /usr/src/debug/vpp-18.10/src/vlib/node_funcs.h:562
    #22 unix_cli_process (vm=0x7fbdc37fd200 <vlib_global_main>, rt=0x7fbd819d6000, f=<optimized out>) at /usr/src/debug/vpp-18.10/src/vlib/unix/cli.c:2530
    #23 0x00007fbdc35ba126 in vlib_process_bootstrap (_a=<optimized out>) at /usr/src/debug/vpp-18.10/src/vlib/main.c:1232
    #24 0x00007fbdc2cc26a4 in clib_calljmp () from /lib64/libvppinfra.so.18.10
    #25 0x00007fbd80fffbd0 in ?? ()
    #26 0x00007fbdc35bb299 in dispatch_process (vm=0x7fbdc37fd200 <vlib_global_main>, p=0x7fbd819d6000, last_time_stamp=0, f=0x0) at /usr/src/debug/vpp-18.10/src/vlib/main.c:1254
    #27 0x0000000000000000 in ?? ()
    (gdb)
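
Reading the backtrace above: timer_interrupt (frame #14) runs on top of an in-progress mspace_malloc (frame #16) and, through unix_cli_file_welcome, allocates from the same mspace (msp=0x7fbd808a5010 in both frames #7 and #16). spin_acquire_lock (frame #6) therefore appears to be waiting on a lock held by the very frame it interrupted. The sketch below illustrates one common way to avoid that pattern in general: the signal handler only sets a flag, and the allocation happens later in normal context. It is not VPP code and not necessarily what any VPP patch does; every name in it (on_timer, install_timer_handler, poll_pending, build_welcome) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not VPP code. */

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t welcome_pending = 0;

/* Async-signal-safe handler: no malloc and no locks, only a flag store. */
static void on_timer(int signum)
{
    (void) signum;
    welcome_pending = 1;
}

/* Install the handler for SIGALRM. Returns 0 on success, -1 on failure. */
static int install_timer_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Runs in normal context, so allocating here cannot interrupt the allocator. */
static char *build_welcome(void)
{
    const char *text = "welcome\r\n";
    char *buf = malloc(strlen(text) + 1);
    if (buf != NULL)
        strcpy(buf, text);
    return buf;
}

/* Called from the main loop. Returns a heap-allocated message if the
 * handler requested one since the last call, otherwise NULL.
 * The caller frees the result. */
static char *poll_pending(void)
{
    if (!welcome_pending)
        return NULL;
    welcome_pending = 0;
    return build_welcome();
}
```

A main loop would call install_timer_handler() once at startup and then call poll_pending() on each iteration, writing out and freeing any message it returns.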


    Thanks

    Tatsumi.

    Assignee

    Chris Luke

    Reporter

    Yusuke Tatsumi

    Comments

    • chrisluke (Fri, 12 Jul 2019 15:04:18 +0000): Thank you for testing the patch and for letting us know the outcome. I shall close this issue now, but let us know of any further issues.
    • yusuketatsumi (Fri, 12 Jul 2019 15:01:18 +0000): Chris Luke,

    With your patch, our VPPs work fine so far.

    Thanks a lot!

    • chrisluke (Wed, 10 Jul 2019 20:45:47 +0000): Yusuke Tatsumi The patch has now been merged; please let us know if it appears to resolve the issue you observed. Thanks!
    • yusuketatsumi (Wed, 10 Jul 2019 04:05:40 +0000): Chris Luke

    Thank you for the patch-set! I'll try this immediately.

    • chrisluke (Wed, 10 Jul 2019 03:41:41 +0000): https://gerrit.fd.io/r/20573 is my candidate for a real cure for this; please try it if you are able. Thanks!
    • yusuketatsumi (Mon, 24 Jun 2019 02:39:25 +0000): I used the split command to divide the large files into smaller pieces.

    Please reassemble the files as below:

    cat vpp_main-1560739085.20856.tar.gz.a* > vpp_main-1560739085.20856.tar.gz

    cat vpp-debuginfo-18.10-4_ci_builddate_20190603_1508.x86_64.rpm.a* > vpp-debuginfo-18.10-4_ci_builddate_20190603_1508.x86_64.rpm

    cat vpp-plugins-18.10-4_ci_builddate_20190603_1508.x86_64.rpm.a* > vpp-plugins-18.10-4_ci_builddate_20190603_1508.x86_64.rpm

    Thanks.

    Original issue: https://jira.fd.io/browse/VPP-1711
