HeavyIQ - Auto-select data source
I expect HeavyIQ to automatically pick the correct data source when I don't select one, especially if no other table has the same column name or column tag, or even a similar meaning.
All my columns are tagged, and my prompt uses appropriate wording.
But I always get
Internal Server Error: Internal Server Error
until I manually select the data source. Then, with the same prompt, HeavyIQ finds the data and works fine.
-
Gianfranco Campana, are you saying the auto-selection of tables is not working, but SQL query generation works if you manually select tables?
It sounds like it could be a version mismatch of either HeavyIQ or the web server... Can you paste the output of the following?
http://<immerse_ip>:<immerse_port>/version.txt
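For example, from a shell on any machine that can reach the server (a quick sketch; the angle-bracket values are placeholders for your actual host and port):
# Fetch the Immerse version string
curl http://<immerse_ip>:<immerse_port>/version.txt
-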
I upgraded to 8.1, and I'm still experimenting with auto-selection of the source table.
What I noted is that the search for an appropriate data source table is actually executed in the background: if I log in with a user that only has visibility on very few tables, HeavyIQ returns:
NLtoTableException: No relevant tables found to answer question.
So this means HeavyIQ is searching among the available tables for one appropriate to the query and cannot find it.
When I use a user with full visibility on all tables, the error changes to:
Internal Server Error: Internal Server Error
meaning some other error is happening while searching for, or once it finds, a relevant table.
-
I found the culprit in the /logs/iq/console log file: the error was generated by a view referencing a non-existent table. Maybe the error description in Immerse could be improved, because in the console log file it is explained very well:
TDBException(error_msg="View 'top_cat' query has failed with an error: 'SQL Error: From line 3, column 6 to line 3, column 31: Object 'gr_pro' not found'.\nThe view must be dropped and re-created to resolve the error.
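The fix is what the log says: drop the view and recreate it against an existing table, along these lines (the SELECT below is only a placeholder for the real view definition, and 'gr_pro_new' is a hypothetical name for the corrected table):
-- Drop the broken view, then recreate it with a valid table reference
DROP VIEW top_cat;
CREATE VIEW top_cat AS
SELECT category, COUNT(*) AS cnt
FROM gr_pro_new   -- hypothetical replacement for the missing 'gr_pro' table
GROUP BY category;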
This issue can be marked as solved, thank you.
-
Thanks Candido.
Now the search for a data source takes place, but it hangs with these messages in the HeavyIQ console:
Parsing nodes: 0%| | 0/30 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████| 30/30 [00:00<00:00, 563.29it/s]
Generating embeddings: 0%| | 0/32 [00:00<?, ?it/s]
127.0.0.1:49912 - "GET /version.txt HTTP/1.1" 200
while the Immerse Notebook keeps showing the "Working on it..." message.
-
Hi Gianfranco Campana, this is likely because your system is trying to run embeddings locally on CPU. We just created a public URL for an online embeddings server; can you add this to your `heavy.conf` file?
rag_embed_server_base = "https://embeddings.heavy.ai"
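(A quick, unofficial sanity check that the endpoint is reachable from the server host; this just prints the HTTP status code returned:)
# Expect an HTTP status code back if the embeddings server is reachable
curl -sS -o /dev/null -w "%{http_code}\n" https://embeddings.heavy.ai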
-
Hi Todd,
I added the online embeddings server to my heavy.conf:
allowed-import-paths = ["/var/share/ingesting/"]
enable-logs-system-tables = 1
use-estimator-result-cache = 0
# allow-cpu-retry = 1
enable-watchdog = 0
[web]
jupyter-url = "http://jupyterhub:8000"
servers-json = "/var/lib/heavyai/servers.json"
rag_embed_server_base = "https://embeddings.heavy.ai"
but now the log reports this error:
started chromadb server... Args: --path storage/rag_storage/chromadb --port 8009 See logs at chromadb.log
Started monitoring chromaDB server process...
127.0.0.1:39242 - "GET /version.txt HTTP/1.1" 200
127.0.0.1:33000 - "GET /version.txt HTTP/1.1" 200
Parsing nodes: 0%| | 0/30 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████| 30/30 [00:00<00:00, 874.60it/s]
Generating embeddings: 0%| | 0/32 [00:00<?, ?it/s]
[2024-08-07 07:02:12 +0000] [69] [ERROR] Worker (pid:61631) was sent code 139!
[2024-08-07 07:02:12 +0000] [61695] [INFO] Booting worker with pid: 61695
[2024-08-07 07:02:12 +0000] [61695] [INFO] Started server process [61695]
[2024-08-07 07:02:12 +0000] [61695] [INFO] Waiting for application startup.
[2024-08-07 07:02:13 +0000] [61695] [INFO] Application startup complete.
127.0.0.1:33592 - "GET /version.txt HTTP/1.1" 200
while the Immerse Notebook shows an error that appears to involve OpenAI.
As far as I know, I'm using the HeavyIQ model in the docker image, without any customization, therefore if OpenAI is being used it is not intentional. Though I don't see how.
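(Side note on the trace above: exit code 139 from a worker is 128 + 11, i.e. the process died with a segmentation fault. The console also points at chromadb.log; assuming that file sits in the HeavyIQ working directory, something like this would show its tail:)
# Inspect the ChromaDB log referenced by the console output
tail -n 100 chromadb.log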
-
This is the full heavy.conf; I added custom_llm_type, with no luck:
allowed-import-paths = ["/var/share/ingesting/"]
enable-logs-system-tables = 1
use-estimator-result-cache = 0
# allow-cpu-retry = 1 # Falls back to CPU if GPU memory is insufficient
enable-watchdog = 0 # Runs on GPU in successive stages or virtual layers
[web]
jupyter-url = "http://jupyterhub:8000"
servers-json = "/var/lib/heavyai/servers.json"
rag_embed_server_base = "https://embeddings.heavy.ai"
[iq]
custom_llm_type = "API_VLLM" # API or AZURE
And I got the same result:
Parsing nodes: 0%| | 0/29 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████| 29/29 [00:00<00:00, 1136.25it/s]
Generating embeddings: 0%| | 0/30 [00:00<?, ?it/s]
127.0.0.1:51280 - "GET /version.txt HTTP/1.1" 200