KafkaImporter crashes on RHEL 7.9

Comments

  • Candido Dessanti

    Hi @bsharsha ,

    Is this happening only on RHEL 7.9, or also with older versions of the OS?

    Regards, Candido

  • Raj Kiran

    Hi Candido,

    No. We have been using KafkaImporter on an older RHEL version (7.0) and it is stable there. We have seen this issue only on RHEL 7.9. We in fact tried upgrading glibc to the latest version, but the result is the same: KafkaImporter fails to start with the error above (as reported by Harsha).

  • Candido Dessanti

    Hi @Raj_Kiran,

    So I guess we are using an outdated version of the Kafka library and should probably upgrade it.

    I'll check with the engineering team and get back to you.

    Regards, Candido

  • Raj Kiran

    Thanks @candido.dessanti. We will await your further input. We use this utility extensively, so this has become a blocker for us.

  • Harsha

    @candido.dessanti, could you please advise on the feedback received from the engineering team?

    Regards, Harsha

  • Candido Dessanti

    Hi,

    I don't have any news yet. We aren't able to reproduce the issue, so I'm going to build the tool against a more recent librdkafka and give it to you, but I need some time.

    Regards, Candido

  • Harsha

    Thank you for the feedback, @candido.dessanti. Appreciated :+1:

    We will await your further suggestions. Please let us know if you need any debug information from our labs that may help your analysis.

  • Candido Dessanti

    Hi @bsharsha and @Raj_Kiran ,

    I compiled the utility with a recent version of the library (1.8.2) and shared it via our Google Drive:

    https://drive.google.com/file/d/1jhRmvfdUhvZGbHrQjycuoOFfSG8LNb1_/view?usp=sharing

    Let me know if the problem is resolved.

    Regards, Candido

  • Harsha

    Thank you, @candido.dessanti. On verification, we still observe the module crashing with the memory corruption error shown below. Kernel version details are included for reference. Please advise further, and let us know if you need any more details.

    Observation

        [volte@unvpcrf02 scripts]$ tr Error in `/opt/volte/KafkaImporter': malloc(): memory corruption: 0x0000000001d5cb30
        ======= Backtrace: =========
        /lib64/libc.so.6(+0x82aa6)[0x7f38ddef0aa6]
        /lib64/libc.so.6(__libc_malloc+0x4c)[0x7f38ddef36fc]
        /opt/volte/KafkaImporter(Znwm+0x15)[0xb82655]
        /opt/volte/KafkaImporter(_ZN17RowToColumnLoader18get_row_descriptorEv+0x5e)[0x64e20e]
        /opt/volte/KafkaImporter(_Z11msg_consumePN7RdKafka7MessageER17RowToColumnLoaderN13import_export10CopyParamsERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSF_12regex_traitsIcNSF_16cpp_regex_traitsIcEEEEEESt14default_deleteISL_EESE_ISC_SM_ISC_EEESt4lessISC_ESaISD_IKSC_SR_EEEb+0x16c)[0x6351bc]
        /opt/volte/KafkaImporter(_Z12kafka_insertR17RowToColumnLoaderRKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSA_12regex_traitsIcNSA_16cpp_regex_traitsIcEEEEEESt14default_deleteISG_EES9_IS7_SH_IS7_EEESt4lessIS7_ESaIS8_IKS7_SM_EEERKN13import_export10CopyParamsEbS7_S7_S7+0x79b)[0x63675b]
        /opt/volte/KafkaImporter(main+0x196d)[0x63195d]
        /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f38dde90555]
        /opt/volte/KafkaImporter[0x634d2e]
        ======= Memory map: ========
        00400000-00e11000 r-xp 00000000 fd:00 301657947 /opt/volte/KafkaImporter
        01010000-01067000 r-xp 00a10000 fd:00 301657947 /opt/volte/KafkaImporter
        01067000-0107a000 rwxp 00a67000 fd:00 301657947 /opt/volte/KafkaImporter
        0107a000-01086000 rwxp 00000000 00:00 0
        01d37000-01dc1000 rwxp 00000000 00:00 0 [heap]
        7f38c4000000-7f38c4021000 rwxp 00000000 00:00 0
        7f38c4021000-7f38c8000000 ---p 00000000 00:00 0
        7f38c8000000-7f38c8021000 rwxp 00000000 00:00 0
        7f38c8021000-7f38cc000000 ---p 00000000 00:00 0
        7f38cc000000-7f38cc021000 rwxp 00000000 00:00 0
        7f38cc021000-7f38d0000000 ---p 00000000 00:00 0
        7f38d0000000-7f38d0021000 rwxp 00000000 00:00 0
        7f38d0021000-7f38d4000000 ---p 00000000 00:00 0
        7f38d4000000-7f38d4021000 rwxp 00000000 00:00 0
        7f38d4021000-7f38d8000000 ---p 00000000 00:00 0
        7f38db240000-7f38db255000 r-xp 00000000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
        7f38db255000-7f38db454000 ---p 00015000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
        7f38db454000-7f38db455000 r-xp 00014000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
        7f38db455000-7f38db456000 rwxp 00015000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
        7f38db456000-7f38db462000 r-xp 00000000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
        7f38db462000-7f38db661000 ---p 0000c000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
        7f38db661000-7f38db662000 r-xp 0000b000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
        7f38db662000-7f38db663000 rwxp 0000c000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
        7f38db663000-7f38db669000 rwxp 00000000 00:00 0
        7f38db669000-7f38db66a000 ---p 00000000 00:00 0
        7f38db66a000-7f38dbe6a000 rwxp 00000000 00:00 0
        7f38dbe6a000-7f38dbe6b000 ---p 00000000 00:00 0
        7f38dbe6b000-7f38dc66b000 rwxp 00000000 00:00 0
        7f38dc66b000-7f38dc66c000 ---p 00000000 00:00 0
        7f38dc66c000-7f38dce6c000 rwxp 00000000 00:00 0
        7f38dce6c000-7f38dce6d000 ---p 00000000 00:00 0
        7f38dce6d000-7f38dd66d000 rwxp 00000000 00:00 0
        7f38dd66d000-7f38dd66e000 ---p 00000000 00:00 0
        7f38dd66e000-7f38dde6e000 rwxp 00000000 00:00 0
        7f38dde6e000-7f38de032000 r-xp 00000000 fd:00 202382 /usr/lib64/libc-2.17.so
        7f38de032000-7f38de231000 ---p 001c4000 fd:00 202382 /usr/lib64/libc-2.17.so
        7f38de231000-7f38de235000 r-xp 001c3000 fd:00 202382 /usr/lib64/libc-2.17.so
        7f38de235000-7f38de237000 rwxp 001c7000 fd:00 202382 /usr/lib64/libc-2.17.so
        7f38de237000-7f38de23c000 rwxp 00000000 00:00 0
        7f38de23c000-7f38de253000 r-xp 00000000 fd:00 202408 /usr/lib64/libpthread-2.17.so
        7f38de253000-7f38de452000 ---p 00017000 fd:00 202408 /usr/lib64/libpthread-2.17.so
        7f38de452000-7f38de453000 r-xp 00016000 fd:00 202408 /usr/lib64/libpthread-2.17.so
        7f38de453000-7f38de454000 rwxp 00017000 fd:00 202408 /usr/lib64/libpthread-2.17.so
        7f38de454000-7f38de458000 rwxp 00000000 00:00 0
        7f38de458000-7f38de559000 r-xp 00000000 fd:00 202390 /usr/lib64/libm-2.17.so
        7f38de559000-7f38de758000 ---p 00101000 fd:00 202390 /usr/lib64/libm-2.17.so
        7f38de758000-7f38de759000 r-xp 00100000 fd:00 202390 /usr/lib64/libm-2.17.so
        7f38de759000-7f38de75a000 rwxp 00101000 fd:00 202390 /usr/lib64/libm-2.17.so
        7f38de75a000-7f38de75c000 r-xp 00000000 fd:00 202388 /usr/lib64/libdl-2.17.so
        7f38de75c000-7f38de95c000 ---p 00002000 fd:00 202388 /usr/lib64/libdl-2.17.so
        7f38de95c000-7f38de95d000 r-xp 00002000 fd:00 202388 /usr/lib64/libdl-2.17.so
        7f38de95d000-7f38de95e000 rwxp 00003000 fd:00 202388 /usr/lib64/libdl-2.17.so
        7f38de95e000-7f38de980000 r-xp 00000000 fd:00 202375 /usr/lib64/ld-2.17.so
        7f38deb65000-7f38deb6d000 rwxp 00000000 00:00 0
        7f38deb7c000-7f38deb7f000 rwxp 00000000 00:00 0
        7f38deb7f000-7f38deb80000 r-xp 00021000 fd:00 202375 /usr/lib64/ld-2.17.so
        7f38deb80000-7f38deb81000 rwxp 00022000 fd:00 202375 /usr/lib64/ld-2.17.so
        7f38deb81000-7f38deb82000 rwxp 00000000 00:00 0
        7ffc5c0bb000-7ffc5c0dd000 rwxp 00000000 00:00 0 [stack]
        7ffc5c178000-7ffc5c17a000 r-xp 00000000 00:00 0 [vdso]
        ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

    Kernel details

        uname -a
        Linux unvpcrf02.ims.unwls.com 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

    Server version

        Red Hat Enterprise Linux Server release 7.9 (Maipo)

    KafkaImporter version

        md5sum /opt/volte/KafkaImporter
        66858f5fa0d0a47287bf835d515de27f /opt/volte/KafkaImporter
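
When reading a glibc backtrace like the one in this comment, it can help to split each frame into module, symbol, offset, and address so that the frames belonging to KafkaImporter stand out from the libc ones. The following is a minimal sketch of such a parser; the names `parse_frame` and `frames_in_module` are hypothetical helpers, not part of any HeavyDB tooling, and the pattern only assumes the `module(symbol+offset)[address]` frame layout visible in the output above.

```python
import re

# Matches glibc backtrace frames such as
#   /lib64/libc.so.6(__libc_malloc+0x4c)[0x7f38ddef36fc]
#   /opt/volte/KafkaImporter[0x634d2e]
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"                                   # binary or shared object path
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"  # optional "(symbol+offset)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"                         # absolute address
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None for other lines."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return {
        "module": match.group("module"),
        "symbol": match.group("symbol") or None,  # empty symbol becomes None
        "offset": match.group("offset"),
        "address": match.group("address"),
    }


def frames_in_module(lines, module_suffix):
    """Keep only the frames whose module path ends with module_suffix."""
    frames = (parse_frame(line) for line in lines)
    return [f for f in frames if f is not None and f["module"].endswith(module_suffix)]
```

Calling `frames_in_module(lines, "KafkaImporter")` on the backtrace lines leaves only the frames inside the importer binary; their mangled symbols can then be passed to a demangler such as `c++filt` to get readable C++ names.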


  • Candido Dessanti

    Hi,

    Could you share with us the DDL of the table you are loading, the command line you are using, and the logs of KafkaImporter itself? You can find them in a directory called log, typically created under the path from which you run the executable.

    The log directory looks like this:

    candido@zion-legion:/mapd_storage/github/alter_session/heavydb-internal/build/bin/log$ ls -ltr
    total 100
    -rw-rw-r-- 1 candido candido 12520 lug  1 16:21 KafkaImporter.INFO.20220701-162113.log
    -rw-rw-r-- 1 candido candido 12435 lug  4 10:33 KafkaImporter.INFO.20220704-103342.log
    -rw-rw-r-- 1 candido candido 12435 lug  4 10:34 KafkaImporter.INFO.20220704-103359.log
    -rw-rw-r-- 1 candido candido 12435 lug  4 10:36 KafkaImporter.INFO.20220704-103519.log
    -rw-rw-r-- 1 candido candido 12435 lug  4 10:37 KafkaImporter.INFO.20220704-103625.log
    lrwxrwxrwx 1 candido candido    38 lug  4 10:39 KafkaImporter.INFO -> KafkaImporter.INFO.20220704-103907.log
    -rw-rw-r-- 1 candido candido 12435 lug  4 10:39 KafkaImporter.INFO.20220704-103907.log
    

    You can also try to bump up the log level to DEBUG1 or DEBUG2 with the --log-severity switch of the utility.
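
To collect the most recent log for a report like this, the timestamped file names in the listing above can be sorted directly. This is a minimal sketch, assuming the default `KafkaImporter.INFO.<YYYYMMDD>-<HHMMSS>.log` naming shown in the listing; `newest_log` and `tail` are hypothetical helper names, and the pattern needs to be adjusted if a custom `--log-file-name` is in use.

```python
from collections import deque
from pathlib import Path


def newest_log(log_dir, pattern="KafkaImporter.INFO.*.log"):
    """Return the most recent log file in log_dir, or None if none match.

    The timestamp embedded in the name (YYYYMMDD-HHMMSS) sorts
    chronologically as plain text, so the lexically largest name is the newest.
    """
    candidates = [
        path
        for path in Path(log_dir).glob(pattern)
        if path.is_file() and not path.is_symlink()  # skip symlinks to the current log
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.name)


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

For example, `tail(newest_log("log"), 50)` returns the last 50 lines of the latest run, which is usually the part closest to the crash.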

    Regards, Candido

  • VishnuCleago

    Hello @candido.dessanti, please find the requested DDL and log files at the link below. The KafkaImporter md5sum and the crash time are also noted in the file. Let me know if you need more data.

    https://drive.google.com/file/d/1RNXSkHOejCwWvJqjVnfelxdk8ztfhs4k/view?usp=sharing

  • VishnuCleago

    Hello @candido.dessanti. Please also find below the command lines we use to start the module:

        nohup /opt/omnisci/bin/KafkaImporter WDBS_GX_REQUEST voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222  --topic GxReqTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.GXREQUEST.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 --print_error > /dev/null &
    
        nohup /opt/omnisci/bin/KafkaImporter WDBS_GX_RESPONSE voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic GxResTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.GXRESPONSE.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
    
        nohup /opt/omnisci/bin/KafkaImporter WDBS_RX_REQUEST voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic RxReqTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.RXREQUEST.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
    
        nohup /opt/omnisci/bin/KafkaImporter WDBS_RX_RESPONSE voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic RxResTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.RXRESPONSE.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
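
The four invocations above differ only in table name, topic, and log file tag (and the first one additionally passes `--print_error`). As a minimal sketch, the shared options can be kept in one place so the variants cannot drift apart. `build_command` is a hypothetical helper; every flag and value is copied from the commands above, except the password, which is taken as a parameter here rather than hard-coded.

```python
def build_command(table, topic, log_tag, password, print_error=False):
    """Build the KafkaImporter argument list for one table/topic pair."""
    args = [
        "/opt/omnisci/bin/KafkaImporter", table, "voltetracker",
        "-u", "volte", "-p", password,
        "--port", "6274",
        "--batch", "1",
        "--delim", ",",
        "--host", "10.0.1.222",
        "--brokers", "10.0.1.222",
        "--topic", topic,
        "--group-id", "TRACKER",
        "--log-directory", "/opt/volte/var/log/",
        "--log-file-name", f"%d-%m-%Y.kafkaImp.{log_tag}.%H%M.log",
        "--log-severity", "DEBUG2",
        "--log-rotate-daily", "1",
        "--log-rotation-size", "524288000",
    ]
    if print_error:
        args.append("--print_error")  # only the first command above uses this
    return args


# The four table/topic/tag combinations used in this thread.
IMPORTERS = [
    ("WDBS_GX_REQUEST", "GxReqTopic", "GXREQUEST"),
    ("WDBS_GX_RESPONSE", "GxResTopic", "GXRESPONSE"),
    ("WDBS_RX_REQUEST", "RxReqTopic", "RXREQUEST"),
    ("WDBS_RX_RESPONSE", "RxResTopic", "RXRESPONSE"),
]
```

Each returned list can be handed to `subprocess.Popen` as-is, which also avoids shell quoting issues around the `','` delimiter.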
    
  • Candido Dessanti

    Hi @VishnuCleago,

    I have been able to crash the tool, with this message:

    malloc(): invalid next size (unsorted) Aborted (core dumped)

    To trigger it I had to alter the messages, changing the size of an integer between one message and the next. I'll dig deeper with a debug build tomorrow.
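
While the root cause is being investigated, one way to spot messages whose integer fields vary in size is to check them against the expected column types before they reach the topic. This is only a sketch under stated assumptions, not a diagnosis of the crash: `check_message` and the type table are hypothetical, the column types must be filled in from your own DDL, and the ranges assume plain signed two's-complement widths.

```python
# Assumed bit widths for integer column types (fill in from your own DDL).
INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "INTEGER": 32, "BIGINT": 64}


def fits(value_text, sql_type):
    """Return True if value_text parses as an integer within sql_type's range."""
    bits = INT_BITS[sql_type]
    try:
        value = int(value_text)
    except ValueError:
        return False
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


def check_message(message, column_types, delim=","):
    """Return a list of human-readable problems found in one delimited message."""
    fields = message.split(delim)
    if len(fields) != len(column_types):
        return [f"expected {len(column_types)} fields, got {len(fields)}"]
    problems = []
    for index, (field, sql_type) in enumerate(zip(fields, column_types)):
        # Non-integer column types are not checked by this sketch.
        if sql_type in INT_BITS and not fits(field, sql_type):
            problems.append(f"field {index} ({field!r}) does not fit {sql_type}")
    return problems
```

Running this over a sample of the topic's messages gives a quick list of the ones worth attaching to a bug report.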

  • VishnuCleago

    Hi @candido.dessanti

    Thank you

  • Candido Dessanti

    Hi guys,

    I have some news and, as usual, it is both good and bad.

    The good news is that we're able to reproduce your issue with your DDLs and data on CentOS 7.9.

    I haven't tried older OS versions, so I'm taking it for granted that, with the same version of the database, you aren't experiencing the issue on previous versions of CentOS. The same executable with the same data and HeavyDB version works on Ubuntu, so it's likely an issue affecting CentOS only.

    The bad news is that we aren't able to fix it right now, because it isn't clear which library is causing the crash (the utility crashed in a different part of the code each time, at random). So I'm going to open an internal ticket about it, and I hope it will be fixed.

    In the meantime, you can use an older version of KafkaImporter, which appears to work on CentOS with your DDLs and data.

    For your convenience, you can download it here (it's version 5.5.3).

    Let me know if the workaround works.

    Regards, Candido

  • Candido Dessanti

    Hi,

    As an update, the engineering team has come up with a solution to the problem.

    I'm sharing the fixed version. Let us know if this fixes your issue.

    Best regards, Candido

  • VishnuCleago

    Hi @candido.dessanti, thank you for your great assistance.

  • Candido Dessanti

    Hi,

    I think I could have solved this faster, but without the data you gave me we weren't able to reproduce the issue; I should have asked for it earlier.

    Anyway, have you tried the new executable? Is it working for you?

  • VishnuCleago

    Hi @candido.dessanti, we have tested with the executable you shared and have not seen any failures or crashes so far. It is working with our data. We will let you know if we observe any crashes during further testing. Thank you.

  • Candido Dessanti

    Hi,

    Thanks for testing. The fixed KafkaImporter will be part of 6.1.1, which is going to be released soon.
