A specific practical scenario requiring fast GPU groupby queries

    Candido Dessanti

    Hi Yanzhuo Zhou,

    If you run the same queries with different filter values, the compiled LLVM plan can typically be reused, so the query does not have to be recompiled each time. Additionally, GPUs filter faster than CPUs thanks to their higher memory bandwidth.

    One helpful optimization to consider is 'fragment skipping,' a form of partition pruning. In our database, each table is divided into fragments, with a default size of 32 million rows. When a simple filter requests values that fall outside the minimum and maximum of a column within a given fragment, that fragment is skipped: it is neither read from disk nor cached, so the query never touches the unneeded data.
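    To make the min/max check concrete, here is a minimal Python sketch of the idea. It is only an illustration, not the database's actual implementation: the `Fragment` class and the `make_fragments` and `scan_between` helpers are hypothetical names, and the fragment size is tiny so the result is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical stand-in for a table fragment with min/max metadata."""
    rows: list
    lo: int  # minimum value of the filtered column in this fragment
    hi: int  # maximum value of the filtered column in this fragment


def make_fragments(values, fragment_size):
    """Split a column into fixed-size fragments and record min/max for each."""
    fragments = []
    for start in range(0, len(values), fragment_size):
        chunk = values[start:start + fragment_size]
        fragments.append(Fragment(chunk, min(chunk), max(chunk)))
    return fragments


def scan_between(fragments, low, high):
    """Return rows with low <= value <= high and the number of fragments read."""
    matches, fragments_read = [], 0
    for frag in fragments:
        # The filter range does not overlap [lo, hi]: skip without reading rows.
        if frag.hi < low or frag.lo > high:
            continue
        fragments_read += 1
        matches.extend(v for v in frag.rows if low <= v <= high)
    return matches, fragments_read


fragments = make_fragments(list(range(100)), fragment_size=10)
matches, fragments_read = scan_between(fragments, 42, 47)
print(f"read {fragments_read} of {len(fragments)} fragments, {len(matches)} rows matched")
```

    In this toy setup only the fragment covering 40 to 49 overlaps the filter, so the other nine are never scanned.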

    If you frequently filter on a specific field, loading the data sorted on that field increases the likelihood of fragment skipping. Sorted data gives each fragment a narrow, mostly non-overlapping range of values, so a filter matches fewer fragments and queries run faster.
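    The effect of load order can be sketched with a small simulation. This is an assumption-laden toy model (the `fragments_read` helper is hypothetical and the sizes are arbitrary); it only counts how many fragments have a min/max range that overlaps the filter.

```python
import random


def fragments_read(values, fragment_size, low, high):
    """Count fragments whose [min, max] range overlaps the filter [low, high]."""
    read = 0
    for start in range(0, len(values), fragment_size):
        chunk = values[start:start + fragment_size]
        if max(chunk) >= low and min(chunk) <= high:
            read += 1
    return read


values = list(range(10_000))          # data loaded sorted on the filter column
shuffled = values[:]
random.Random(0).shuffle(shuffled)    # same data loaded in arbitrary order

sorted_reads = fragments_read(values, 1_000, 2_000, 2_999)
shuffled_reads = fragments_read(shuffled, 1_000, 2_000, 2_999)
print(f"sorted load: {sorted_reads} fragments read")
print(f"shuffled load: {shuffled_reads} fragments read")
```

    With the sorted load a single fragment covers the requested range, whereas after shuffling nearly every fragment spans most of the value range and has to be read.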

    If this type of partition pruning proves beneficial for your workloads, you might also consider adjusting the fragment size. Experimenting with different fragment sizes can help you tune query performance for your specific use cases.
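    As a rough way to reason about fragment size, the sketch below counts how many rows would still be scanned for one narrow filter on sorted data at several fragment sizes. The `rows_scanned` helper and all numbers are hypothetical, and the model ignores any per-fragment overhead, so treat it as a starting point for experiments rather than a recommendation.

```python
def rows_scanned(sorted_values, fragment_size, low, high):
    """Rows in fragments that cannot be skipped for the filter [low, high]."""
    scanned = 0
    for start in range(0, len(sorted_values), fragment_size):
        chunk = sorted_values[start:start + fragment_size]
        # Data is sorted, so the first and last values are the fragment min/max.
        if chunk[-1] >= low and chunk[0] <= high:
            scanned += len(chunk)
    return scanned


values = list(range(100_000))  # column loaded in sorted order
for size in (50_000, 10_000, 1_000):
    scanned = rows_scanned(values, size, 42_000, 42_499)
    print(f"fragment_size={size}: {scanned} rows scanned")
```

    In this model smaller fragments prune more precisely, but the cost of managing many small fragments is not captured, which is why measuring on real workloads matters.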


    Regards,
    Candido

    I'm truly sorry for the late reply; I only just saw your message.

     

     
