Advice on running an API
Greetings,
Looking to use omnisci to serve as an API endpoint to our frontend dashboards, and expect to run no more than 10 queries in parallel. We are currently experimenting with some data right now and the speedup is convincing. I have some specific questions that I couldn't get answered directly by documentation, and if I missed anything, my apologies.
There is a limit of one active query at any given time, so I am looking at sharding the database to overcome that. It looks like sharding can be set up as a single instance where it appears to function like horizontal partitioning, or it can be set up as a cluster where queries will be distributed across nodes.
-
The way I see it is, when used without clustering, sharding will reduce memory usage and increases performance of joins, but outside of that, will not necessarily make the search quicker for large datasets, is this correct?
-
How far does #1 get me before I have to use clustering? I realize that speed of queries is important here, but if we ignore that, there are other things to consider:
- database size on disk, and RAM. What is feasible here in regards to data set size?
-
the database will periodically experience heavy writes as a result of ETL when we import data, and we can't interrupt service.
-
Given what I said above, should I be looking at clustering. Is this available in the opensource configuration or is it enterprise only?
-
Our main back-end development platform is Java, but the drivers seem to be relatively basic. What is the recommended driver to use in regards to support (I saw: JDBC, ODBC, python, javascript)? I got the sense that the javascript implementation has the most features, is this correct?
-
Hi @why_not,
Thanks for your interest in OmniSci Database and to have joined our community.
About your BIG question, I suggest reading this message written by one of our engineers
https://community.heavy.ai/t/concurrent-queries/2399/2?u=candido.dessanti
Depending on your current configurations, you can change the num_executors parameter and check if the response time of the batch of queries is better.
Anyway, we are working hard to overcome the limitation of running one query at a time on a system. Still, I wouldn't overlook this feature because the increase of performance isn't huge and depends heavily on the kind of queries you are running against the database (did some benchmarks by myself).
To reply more precisely to your questions. About sharding.
Sharding a table can make your queries faster on lots of scenarios, even in a single system, because it gives you better control over the number of rows sent to each GPU. On our database, the table data is subdivided in fragments, and the fragments are sents to GPUs for processing; if the number of fragments is equal to the number of the GPUs, you will get the best performance from the query, but if the number if different, the performance will be worse.
Imagine having 2 GPUs and a table of 90 Million records (2 fragments of 32 Million records each and a fragment of 26 Million); the fragments 1 and 2 will be processed by the GPUs at the same time, while the 3rd will be processed by the first GPU with the second one idling; but creating a sharded table with many shards as the number of GPUs in the server will increase the performance because during execution all GPUs will be crunching data.
So while sharding won't make your queries run in parallel (cluster or single server is the same), it can reduce the response time of your queries, so the reply to you first question is... it's not correct
-
You have to use clustering when you run out of CPU or GPU Ram; typically, on a on-premise server, you can't use more than 8 GPUs, so assuming you are using a bunch of RTX 8000 that comes with 48GB of VRAM each, each query is limited to 384GB. The total amount of data that fit in a dual-socket is around 2/4TB (assuming you can mount 8x256GB dimms per socket or Intel's Optane memory), so depending on your hardware and the data volume you need to process, you will be forced to move to a cluster. Loading data won't stop your application from querying the database; the response time will be a little higher because the new data has to be moved to CPU or GPU Ram to be processed.
-
Cluster is available with Enterprise Edition only as of the ODBC driver; this table will give you an idea about the features you are missing using the OS version
I suggest you ask for a temporary enterprise key (here) and try out the DB with Immerse; it's an amazing experience
- I'm not sure about javascript, but I used Python and JDBC without big problems, and if java is your development platform, maybe is better to go with JDBC.
The driver can be extended and has a dedicated repository on Github
If you have further questions feel, free to ask ;)
Regards
-
-
@candido.dessanti Thanks for an excellent answer. This helps a lot. I did not mean to make it sound like I wanted to run parallel queries on the database level. I meant to give you a general idea of number simultaneous request that could be made with "reasonable" queries. Either way, you answered my question, your extensive response is very much appreciated. We already have a test environment, and I am currently profiling the performance with various schema configurations. I'm sure we will need to test the enterprise level setup at some point, but not just yet.
Please sign in to leave a comment.
Comments
2 comments