Query fails on join with date
The following query fails with a generic server error.
TIOStreamTransport.java:read:130:org.apache.thrift.transport.TTransportException: Socket is closed by peer.
-- root_node()
SELECT r."date__id",
max_dim_date."the_date" AS "date",
r."amount"
FROM
(
-- leaf_node(a, fact_sales)
SELECT fact_sales."the_date" AS "date__id",
SUM(fact_sales."sales_amt") AS "amount"
FROM fact_sales
GROUP BY fact_sales."the_date"
-- [end] leaf_node(a, fact_sales)
) AS r
JOIN max_dim_date
ON max_dim_date.the_date = r.date__id
ORDER BY r."amount" DESC
LIMIT 10000
-- [end] root_node()
Looks like it's because of the JOIN
on a date column, because if I comment it out the query runs.
-- root_node()
SELECT r."date__id",
-- max_dim_date."the_date" AS "date",
r."amount"
FROM
(
-- leaf_node(a, fact_sales)
SELECT fact_sales."the_date" AS "date__id",
SUM(fact_sales."sales_amt") AS "amount"
FROM fact_sales
GROUP BY fact_sales."the_date"
-- [end] leaf_node(a, fact_sales)
) AS r
--JOIN max_dim_date
-- ON max_dim_date.the_date = r.date__id
ORDER BY r."amount" DESC
LIMIT 10000
-- [end] root_node()
Also note that a similar query that joins on an integer columns works fine.
-- root_node()
SELECT r."region__id",
dim_region."name" AS "region",
r."amount"
FROM
(
-- leaf_node(a, fact_sales)
SELECT dim_store."region_id" AS "region__id",
SUM(fact_sales."sales_amt") AS "amount"
FROM fact_sales
JOIN dim_store
ON dim_store.store_id = fact_sales.store_id
GROUP BY dim_store."region_id"
-- [end] leaf_node(a, fact_sales)
) AS r
JOIN dim_region
ON dim_region.region_id = r.region__id
ORDER BY r."amount" DESC
LIMIT 10000
-- [end] root_node()
-Larry
-
Hi @Larry_Parker,
run a query similar to the on is crashing your server, but it's working
select a.ClientEventTime,b.ClientEventTime, sum_income from ( select ClientEventTime,sum(income) sum_income from hits_v1 group by 1) a join hits_v1_ts b on a.ClientEventTime=b.ClientEventTime order by sum_income limit 10;
could you share the DDLs of your tables?
-
Hi @Larry_Parker,
I have been able to reproduce the error you are getting.
It happens with DATE and TIMESTAMP datatypes when you use CTE and you join the result placing the field of the right table in the left part of the condition
so the
JOIN max_dim_date ON max_dim_date.the_date = r.date__id
will make the server crash, while the condition written this way
JOIN max_dim_date ON r.the_date = max_dim_date.date__id
will work.
So if you run this query it will work without any problem
-- root_node() SELECT r."date__id", max_dim_date."the_date" AS "date", r."amount" FROM ( -- leaf_node(a, fact_sales) SELECT fact_sales."the_date" AS "date__id", SUM(fact_sales."sales_amt") AS "amount" FROM fact_sales GROUP BY fact_sales."the_date" -- [end] leaf_node(a, fact_sales) ) AS r JOIN max_dim_date ON r.date__id = max_dim_date.the_date ORDER BY r."amount" DESC LIMIT 10000 -- [end] root_node()
it's obviously a bug and I guess it will be corrected in the future release.
-
Hi @candido.dessanti,
You're saying the issue on this thread has been fixed, or the one I just posted about?
https://community.heavy.ai/t/left-outer-join-and-is-not-null/2671
How do I find the OmniSci version?
SELECT VERSION()
doesn't seem to work.-Larry
Please sign in to leave a comment.
Comments
8 comments