Is there any way to synchronize data from other databases to HeavyDB in real time?

Comments

4 comments

  • Candido Dessanti

    Hi jieguo,

    Without using any third-party solution, and if you are on the Free or Enterprise version of the software, you can use Heavy Connect to cache data from external sources in a local cache in near real time. Those sources can be CSV or Parquet files, or an RDBMS accessed through the ODBC interface (at the time of writing, we support only a few databases, such as PostgreSQL, Snowflake, Google BigQuery, and Amazon Redshift). Data can be synchronized automatically each time a query is run, on a schedule, or manually.
    This kind of sync works only if data in the source database is append-only; in the case of massive updates or deletions in the source database, the cache has to be rebuilt manually, which can take some time depending on the size of the source data and the number of columns requested.
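
    As a sketch of how this looks in practice, a Heavy Connect foreign table over an ODBC source can be declared with refresh options along these lines. The server, table, and column names below are made up, and while the option names follow the Heavy Connect documentation, they should be verified against your HeavyDB version:

    ```sql
    -- Illustrative sketch: a Heavy Connect foreign table over a PostgreSQL ODBC source.
    -- Names are placeholders; verify option names against your HeavyDB version's docs.
    CREATE SERVER pg_server FOREIGN DATA WRAPPER odbc
    WITH (data_source_name = 'postgres_dsn');

    CREATE FOREIGN TABLE sales_cache (
      id      INTEGER,
      amount  DOUBLE,
      sold_at TIMESTAMP(0)
    )
    SERVER pg_server
    WITH (
      sql_select              = 'SELECT id, amount, sold_at FROM public.sales',
      refresh_update_type     = 'APPEND',     -- fetch only newly appended rows
      refresh_timing_type     = 'SCHEDULED',  -- or 'MANUAL'
      refresh_start_date_time = '2024-01-01T00:00:00',
      refresh_interval        = '1H'          -- re-sync every hour
    );

    -- A manual re-sync can be triggered at any time:
    REFRESH FOREIGN TABLES sales_cache;
    ```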

    The documentation about HC can be found on our docs site at this link. Specifically, the ODBC data wrapper doc is here.

    Besides that, you can consider software like Oracle's GoldenGate, using the JDBC generic handler as a target together with our JDBC driver, or other open-source software like OpenLogReplicator to capture changes from Oracle's redo logs into a Kafka topic, which can then be subscribed to and loaded using the KafkaImporter.
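
    For the Kafka route, the KafkaImporter sample utility shipped with HeavyDB can subscribe to a topic and stream delimited rows into a table. A rough sketch of an invocation follows; the broker, topic, table, and credential values are placeholders, and the flag names should be checked against the utility's `--help` output for your release:

    ```shell
    # Illustrative only: consume CSV-formatted messages from a Kafka topic
    # into the HeavyDB table "sales". Verify flags with `KafkaImporter --help`.
    KafkaImporter sales heavyai \
      -u admin -p HyperInteractive \
      --port 6274 \
      --brokers localhost:9092 \
      --topic sales_changes \
      --group-id heavydb-loader \
      --batch 10000
    ```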

    Then you have other alternatives, such as using SQLImporter, or COPYing CSV or Parquet data from S3 sources.
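
    For the COPY path, the statements look roughly like this. The table, file, bucket, and credential values are placeholders, and the Parquet option name has changed across releases, so check the COPY FROM documentation for your version:

    ```sql
    -- Load a local, headered CSV into an existing table.
    COPY sales FROM '/data/sales_2024.csv' WITH (header = 'true');

    -- Load a Parquet file straight from S3 (credentials/region are placeholders).
    COPY sales FROM 's3://my-bucket/sales/part-0001.parquet'
    WITH (
      source_type   = 'parquet_file',  -- older releases used parquet = 'true'
      s3_region     = 'us-east-1',
      s3_access_key = '<access-key>',
      s3_secret_key = '<secret-key>'
    );
    ```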

    Hope this helps,
    Candido.

  • jieguo
    Importing data sources such as CSV does not seem to achieve the incremental, real-time refresh that Oracle GoldenGate provides.
  • Candido Dessanti

    Hi,

    If the table is incrementally loaded without deletions or updates, importing a CSV or Parquet file containing the newly created records into the source table achieves similar outcomes to using GoldenGate.

    With GoldenGate, you can feed modifications of the source table into another table (possibly in CSV format—although I've typically used a table for such scenarios).

    Nevertheless, keeping two different databases in sync often requires third-party software and involves a complex and time-consuming process that can be challenging to manage.

    Have you attempted to use Heavy Connect with an ODBC data source?



  • Adam Leszczyński

    If you want to use OpenLogReplicator to get data from an Oracle instance, you don't have to spool the output to a Kafka topic. You can also use JSON files as the output for transactions.

    The architecture of OLR is very flexible, so if there is demand for other formats, they could be introduced too.

