How to use KafkaImporter transform option

Comments

8 comments

  • Avatar
    Candido Dessanti

    Hi @zachariam,

    The transformation option of KafkaImporter works like the one of StreamImporter so that you can apply a regular expression to one or more fields, and it can't add or remove columns.

    The transformation format is like this Transformation format: <column name>:s/<regex pattern>/<fmt string>/ and you can specify transformations for more than one column using commas as delimiters, for example, using -t f1:"s/2002/02/", f2:"s/17/12/" will transform the fields called f1 and f2, and the output of the transformation would be

    transform f1: s/2002/02/
    transform f2: s/17/12/
    

    You can use the transformations you want using a syntax compatible with regex_replace of boost.

    If you want to combine or remove fields using only the HeavyAI's Kafkaimporter, you must transform the data after the load.

    I can create a Jira to track this new feature request.

    Regrads, Candido

    0
    Comment actions Permalink
  • Avatar
    zacharia mathew

    can you give regular expression to replace NULL or empty data to replace with -1. i tried s//-1 , but not working

    0
    Comment actions Permalink
  • Avatar
    Rasheeda Ka

    Hi Candido,

    Thank you for sharing transform format details of Kafka Importer. Can u please suggest how to achieve if value itself is not present in topic data to use reg expression and replace.

    We have a set of Events data EVENT 1 (Topic1) : Param1, Param2,Param3,Params4 EVENT 2 (Topic2) : Param1,Param5

    Table EVENT_TAB to load each event data in real time . EVENT_TAB(PARAM1 TEXT, PARAM2 TEXT,PARAM3 TEXT,PARAM4 TEXT,PARAM5 TEXT)

    so in case of EVENT1 : Param5 should be inserted as -1 in EVENT_TAB and rest values same as that of data published. ex: insert into EVENT_TAB(Param1,param2,param3,param4,-1)

    EVENT2: Details which are not present should be filled with -1. insert into EVENT_TAB(Param1,-1,-1,-1,Param5) .

    Looking forward to your valuable suggestions.

    0
    Comment actions Permalink
  • Avatar
    zacharia mathew

    regular expression of multiple field not working, i given 2 replace but only first one is working. --transform NAME:"s/zacharia/mathew/", MARK:"s/20/12" , i given like this in KafkaImporter and send kafka data as "zacharia,20" , but in table only NAME column value replaced, MARK column value not changed.

    select * from EMP;

    NAME|MARK mathew|20

    0
    Comment actions Permalink
  • Avatar
    Candido Dessanti

    Hi @zachariam ,

    The teansformation for each field would end with a slash character to work. Thats the reason why mark hasn't been changed.

    0
    Comment actions Permalink
  • Avatar
    Candido Dessanti

    I tried to replace a non existent value ith something elae rhia afternoon bu it hasn't worked. I had to use a space to replace an arbitray value with the expression s/ /-1/ Using s//-1 hasn't worked. I guess that I should go more in deep the syntax of regex_replace syntax

    0
    Comment actions Permalink
  • Avatar
    zacharia mathew

    i tried --transform NAME:"s/zacharia/mathew/",MARK:"s/20/12/" , but not working,

    select * from EMP;

    NAME|MARK mathew|20

    can u please test and verify it, please confirm for your test case is it working or not, i don't know why it is not working for me.

    0
    Comment actions Permalink
  • Avatar
    Candido Dessanti

    Hi,

    Ah, I'm sorry but it's needed also a space between the two transformations so NAME:“s/zacharia/mathew/”, MARK:“s/20/12/” it's also working using just 1 space without the comma NAME:“s/zacharia/mathew/” MARK:“s/20/12/”

    I have to double-check the source of the commands and extend our docs for StramInserver and KafkaImporter, because they obliviously lack some pieces of information.

    Thanks for pointing me out to this.

    Regards, Candido

    1
    Comment actions Permalink

Please sign in to leave a comment.