An issue with approx_count_distinct operator in 6.X version
Hello Team,
In 6. X throws the error: Failure from KBNet Engine - "View Location_Availability republish failed!. Error: TOmniSciException({ErrorMsg: View 'm_645b86c1_Location_Availability' query has failed with an error: 'Consider using approx_count_distinct operator instead of count_distinct operator to lower the memory requirements'.\nThe view must be dropped and re-created to resolve the error.
We ran the below query:
SELECT COUNT(Distinct @location.id) AS "m_Location_Count", COUNT(Distinct Case when @location.activeflag =1 Then @location.id else Null End) AS "m_Active_Location_Count", COUNT(DISTINCT Case when @loc_summary.locationid IS NULL and @location.activeflag =1 Then @location.id else Null End) AS "m_Available_Location_Count", COUNT(DISTINCT Case when @loc_summary.locationid IS NOT NULL Then @location.id else Null End) AS "m_Used_Location_Count" FROM #location LEFT JOIN #loc_summary ON @location.id = @loc_summary.locationid WHERE @location.locationgroup_name = 'Main'
Gtt the following error message:
Failure from KBNet Engine - "View Location_Availability republish failed!. Error: TOmniSciException({ErrorMsg:View 'm_645b86c1_Location_Availability' query has failed with an error: 'Consider using approx_count_distinct operator instead of count_distinct operator to lower the memory requirements'.\nThe view must be dropped and re-created to resolve the error. \nQuery:\nSELECT COUNT(DISTINCT m_645b86c1_location.\"m_id\") AS \"m_Location_Count\", COUNT(DISTINCT CASE WHEN m_645b86c1_location.\"m_activeflag\" = 1 THEN m_645b86c1_location.\"m_id\" ELSE NULL END) AS \"m_Active_Location_Count\", COUNT(DISTINCT CASE WHEN m_645b86c1_loc_summary.\"m_locationid\" IS NULL AND m_645b86c1_location.\"m_activeflag\" = 1 THEN m_645b86c1_location.\"m_id\" ELSE NULL END) AS \"m_Available_Location_Count\", COUNT(DISTINCT CASE WHEN m_645b86c1_loc_summary.\"m_locationid\" IS NOT NULL THEN m_645b86c1_location.\"m_id\" ELSE NULL END) AS \"m_Used_Location_Count\" FROM m_645b86c1_location LEFT JOIN m_645b86c1_loc_summary ON m_645b86c1_location.\"m_id\" = m_645b86c1_loc_summary.\"m_locationid\" WHERE m_645b86c1_location.\"m_locationgroup_name\" = 'Main';})"
could you please help?
Thanks,
Kalyani
-
Hi Kaylani,
Please provide the exact version you're using, by going to http(s)://yourImmerseServer/version.txt and reporting the results. TOmniSci exception is surprising to observe in currently supported versions of the application, so confirming the version is helpful.
The error message is warning that your system is likely unable to process your query without running out of memory. The messaging stems from the use of our watchdog process which is enabled by default. You can disable this by setting `enable-watchdog = false` in your heavy.conf file and restarting services. However, if your system does not have adequate memory to process the query, you will instead encounter the followingERR_OUT_OF_CPU_MEM: Not enough host memory to execute the query
However, you should definitely consider the suggestion provided by the error message to, replace within your view all instances of `COUNT(DISTINCT expression) with APPROX_COUNT_DISTINCT(expression). Count Distinct is indeed a very memory intensive operation, and replacing with APPROX_COUNT_DISTINCT will significantly improve performance and reduce memory requirements.
Here's some further background on APPROX_COUNT_DISTINCT function:
APPROX_COUNT_DISTINCT(x, e) gives an approximate count of the value x, based on an expected error rate defined in e. The error rate is an integer value from 1 to 100. The lower the value of e, the higher the precision, and the higher the memory cost. Select a value for e based on the level of precision required. On large tables with large cardinalities, consider using APPROX_COUNT_DISTINCT when possible to preserve memory. When data cardinalities permit, HEAVY.AI uses the precise implementation of COUNT(DISTINCT x) for APPROX_COUNT_DISTINCT.
-
HI Kayalani,
In the 6.x the engineering did some changes on count distinct, so probably your case is a taking a new code path.
could you run the query
select min(m_645b86c1_location.m_id=,max(m_645b86c1_location."m_id"),approx_count_distinct(m_645b86c1_location."m_id",10) from m_645b86c1_location
This could help us to reproduce what's going on with your data and fix the problem.
Candido
Please sign in to leave a comment.
Comments
4 comments