Keeping geometric data types in any capacity when joining
Hi,
I want to create a join a table that I will call objects
and a table that I call geometries
. The objects
table has a geometry ID that is also present in the geometries
table. Other than the geometry ID, the geometries
table has three additional columns with point, polygon and multipolygon information.
However when I run the following SQL query:
SELECT * FROM objects INNER JOIN geometries ON geometries.geometry_id= objects.geometry_id LIMIT 100
I get the following exception:
Exception: Columnar conversion not supported for variable length types
I tried changing what I am joining on what or directly writing this to a new table. However, all of these queries had the same issues in the end.
I am aware that this touches on the know issues but I was wondering whether this is absolutely impossible or whether you have any idea on how to bring together information of the geospatial
with the objects
table?
Thanks so much in advance
Best Frederic
-
Hi @candido.dessanti , yes this happened when I was just joining the two tables.
More specifically this happens when I add one of the geometry columns to the Select statement. If I just return the id or any other non-geometric column it works like a charm.
Let me know if I should to replicate this in one of the demo environments.
Best Frederic
-
Hi,
I can't reproduce the issue.
I created two tables that I join with a bigint column. The object table contains 5m records and the geometries table around 200k, and the relationship is N-1.
omnisql> \\d geometries CREATE TABLE geometries ( sez2011 BIGINT, omnisci_mp GEOMETRY(MULTIPOLYGON, 4326) ENCODING COMPRESSED(32), omnisci_p GEOMETRY(POINT, 4326) ENCODING COMPRESSED(32)); omnisql> \\d objects CREATE TABLE objects ( sez2011 BIGINT, name TEXT, place TEXT, SHARED DICTIONARY (name) REFERENCES italy_poi(name), SHARED DICTIONARY (place) REFERENCES italy_poi(place)); omnisql> select count(*) from objects; EXPR$0 5769509 1 rows returned. Execution time: 167 ms, Total time: 168 ms omnisql> select count(*) from geometries; EXPR$0 269009 1 row returned. Execution time: 12 ms, Total time: 13 ms
if I execute a query similar to yours, I haven't any problem
omnisql> select * from objects o inner join geometries g on o.sez2011=g.sez2011 limit 10; 240080000039|NULL|NULL|240080000039|MULTIPOLYGON (((11.3269491453315 45.5305159862761,11.3276593439875 45.5305015274931,11.3278205279856 45.5307614922201,11.3273821544495 45.5310764841414,11.3269112591291 45.5309173118001,11.326338188409 45.5305284334023,11.3258662034412 45.5305740728651,11.3254992437202 45.5304487634126,11.3252829067992 45.5306376915102,11.3253637083459 45.5308115740916,11.3250277616666 45.531050206875,11.3247265160665 45.5310428308002,11.3252616167652 45.5301205699938,11.3248200581059 45.5300507906499,11.3248009473667 45.5302852324817,11.3244266115708 45.5302140701237,11.3244778249992 45.529981262763,11.3240775053036 45.529899371569,11.3241412915867 45.5295830385432,11.3235102180967 45.529420387712,11.3229827449298 45.5293613372042,11.3218416326315 45.5291730377493,11.3216229487777 45.5296162727892,11.3220948499265 45.5298789616348,11.3218293950529 45.5301206538128,11.3213630259602 45.5299906085851,11.3212807994901 45.5298595156194,11.3210435916302 45.5297743554831,11.3209727645484 45.5289881748748,11.3211777859 45.5288354985085,11.3214424864023 45.5285758271481,11.3210858364222 45.52838953935,11.3209041167614 45.5282537106091,11.3211873412697 45.5281332207509,11.3215317536711 45.5280227053575,11.3212812185852 45.5276925002819,11.3214639440744 45.5276190329006,11.3214224536537 45.5275456074288,11.3213707373111 45.5274543704127,11.3217754155964 45.5272526180033,11.3225007854973 45.5270646119151,11.322808988077 45.5266195328565,11.3232088048585 45.5260668301611,11.3233267382362 45.5258978929026,11.3232647121527 45.5256358746093,11.3229429308898 45.5253633789371,11.3226431102132 45.5250791905099,11.3230833277679 45.5245729235581,11.3232384767957 45.5243762422001,11.3239054248314 45.5242479152625,11.3245117717071 45.5241230668147,11.3246626459643 45.5245227997771,11.3249096606508 45.5250781008625,11.3250829145895 45.5253198349501,11.3252082240419 45.5256413228464,11.3255518820721 45.5261338853865,11.3253329467612 45.5262598653912,11.3253891893315 45.5263824926346,11.3254340325135 45.5264603186056,11.324857273756 45.5266813493924,11.3247152843162 45.5268867898392,11.3250928052352 45.5269578264686,11.3247249235049 45.5274311525409,11.3244183973057 45.5278401894159,11.3232347049393 45.5275492535567,11.3231568370588 45.527917135287,11.3234762713888 45.5279961347245,11.3249085710034 45.5283905451784,11.3265077543103 45.5286640047695,11.3262441434554 45.5290249295202,11.3260760024777 45.5294514007538,11.3261315744958 45.5294795220389,11.3260974601498 45.5298155106277,11.3263204187743 45.5300989865931,11.3269491453315 45.5305159862761)))|POINT (11.3233019278028 45.526939092915) [CUT] 10 rows returned. Execution time: 30 ms, Total time: 34 ms
I also tried to duplicate the records in the geometries table, changing the relationship from N-1 to N-N, but the query runs without issues. Which datatype are you using to join the tables? Wich relationship between the tables?
We do this kind of join in Immerse since 4.0 release for choropleth maps, so I'm really curious to find out the reason of error you are getting.
Regards, Candido
-
Hi.
Thank you for your quick answer and I am happy to hear that this should generally be possible.
Below you will find the table definitions for the geometries table and the relevant part of the objects table.
If this helps in any way I could also try to make a small part of our data available to you for testing internally.
CREATE TABLE geometries ( location_mapped_id TEXT ENCODING DICT(32), api TEXT ENCODING DICT(16), original_type TEXT ENCODING DICT(16), creation_date DATE ENCODING DAYS(32), point_wkt GEOMETRY(POINT, 4326) ENCODING COMPRESSED(32), polygon_wkt GEOMETRY(POLYGON, 4326) ENCODING COMPRESSED(32), mpolygon_wkt GEOMETRY(MULTIPOLYGON, 4326) ENCODING COMPRESSED(32));
And the relevant part of the objects table
CREATE TABLE objects ( location_mapped_id TEXT, location_mapped_voc TEXT, location_mapped_country TEXT, SHARED DICTIONARY (location_mapped_id) REFERENCES objects(location_mapped_id), SHARED DICTIONARY (location_mapped_voc) REFERENCES objects(location_mapped_voc), SHARED DICTIONARY (location_mapped_country) REFERENCES objects(location_mapped_country);
Just by looking at it, I see that they are different data types and one is part of a shared dictionary that I must have forgotten in the other table. If you recommend this I would be happy to try changing the geometries table accordingly.
The relationship is / should be N-1 meaning we have a repetition of the location_mapped_id variable in the objects table but not in the geometries table.
Best Frederic
-
Hi @ldnker,
Looking at your DDLs, I noticed that you are referencing as a shared dictionary column the table itself.
[quote="ldnker, post:5, topic:2726"]
CREATE TABLE objects ( location_mapped_id TEXT, location_mapped_voc TEXT, location_mapped_country TEXT, **SHARED DICTIONARY (location_mapped_id) REFERENCES objects(location_mapped_id),** SHARED DICTIONARY (location_mapped_voc) REFERENCES objects(location_mapped_voc), SHARED DICTIONARY (location_mapped_country) REFERENCES objects(location_mapped_country);
[/quote]
It's wrong, and I haven't any idea about the problem that could arise from this syntax (likely nothing special), but I tried. The join is working, as it's working without using a shared dictionary, and it's also working using a shared dictionary key used by both tables.
CREATE TABLE geometries_de ( sez2011 TEXT NOT NULL ENCODING DICT(32), omnisci_mp GEOMETRY(MULTIPOLYGON, 4326) ENCODING COMPRESSED(32), omnisci_p GEOMETRY(POINT, 4326) ENCODING COMPRESSED(32), omnisci_pl GEOMETRY(POLYGON, 4326) ENCODING COMPRESSED(32));
CREATE TABLE objects_de_s ( sez2011 TEXT NOT NULL, name TEXT ENCODING DICT(32), place TEXT ENCODING DICT(32), SHARED DICTIONARY (sez2011) REFERENCES geometries_de(sez2011));
the query
select * from objects_de_s o inner join geometries_de g on o.sez2011=g.sez2011 limit 10;
is running without issues with many-to-one and many-to-many relationship.
I'm sharing the dump of both tables here
https://drive.google.com/drive/folders/1Wgk5E79vBNeGlIJDknRvlWSE-3YARuVy?usp=sharing
open omnisql and run the commands
restore table objects from '/path/objects_de_s.dmp' restore table geometries from '/path/geometries.dmp'
the path can be the $OMNISCI_STORAGE/mapd_import or another one in the import allow-list (allowed-import-paths = ["root_path_1", "root_path_2", ...])
If you can share your tables I can try out them in our systems
Regards, Candido
-
Hi @ldnker,
I think I reproduced your issue but to be sure you would run this query on your
geometries
table.select max(cast(offset_in_fragment() as bigint))+1 as max_fragment_offset,count(*) the_count from geometries
If the max_fragment_offset is lower than the_count it0s likely your table is spawning more than fragment, so the error when you run your join with the limit keyword (you should get another one without the limit keyword).
To try to fix this, you can try to
Increase the fragment size (the default is 32M, meaning that in 1 fragment can't be more than 32M rows) and the chunck_size to the max value of 2GB with a CTAS
create table geometries_uf as select * from geometries_mf with (fragment_size=64000000,max_chunk_size=2147483648);
and re-run the query to be sure that the table is one fragment only.
Regards, Candido
Please sign in to leave a comment.
Comments
9 comments