Create Line Geometry similarly to ST_Point
I am currently working on preparing data for plotting a Linemap. Natively we have the data in the following format :
id | lat_start | lon_start | lat_end | lon_end
Ideally, I would like to solve this within the OmniSciDB. I saw in this thread and in the documentation that it is possible to convert lon and lat to a point with the ST_Point function. However, there does not seem to be a similar function for loading in
LINESTRINGS. Is this correct?
A second approach I thought of was to concat values together to hack myself towards a WKT representation could then maybe be used in the ST_GeomFromText function. However, this was not possible as my longitude and latitude data is stored numerically and OmniSciDB does not seem to like being casted to TEXTs (as also established in this thread).
I then tried to use IBIS but realized that even though the interface is python the background code is just being compiled by the DB and therefore the casting limitations are the same or?
Lastly, as the table is not especially large I could theoretically just download it, do the transformation in python and then upload it again. However, it seems that loading pandas dataframes through ibis is not yet supported in the create_table() function or?
I would very much appreciate any suggestions on how to approach or circuvemt these problems.
We have an experimental parameter called enable-interoperability that make usable the sqlite functions on projections query, so at some extent its possible to cast numbers into dictionary encoded none strings, but it unlikely you will be able to build an WKT and use it in omnisql to directly insert linestrings.
I hope those limitations will be removed soon
Anyway, If you haven't any other tools or for whatever reason, you want to use just the omnisql tool, you can do the job using the parameter I talked to you about before.
Start the database enabling --enable-interoperability; this will enable some text function on simple projection query.
so assuming you have two tables that will call linestring_source and linestring_target
CREATE TABLE linestring_source ( lon0 FLOAT, lat0 FLOAT, lon1 FLOAT, lat1 FLOAT, wkt TEXT ENCODING DICT(32)); CREATE TABLE linestring_target ( ls GEOMETRY(LINESTRING, 4326) ENCODING COMPRESSED(32));
you can write a query that, with some string concatenations, will build a WKT
omnisql> select 'LINESTRING('||lon0||' '||lat0||','||lon1||' '||lat1||')' as wkt from linestring_source; wkt LINESTRING(42.2000007629395 75.1999969482422,43.2000007629395 75.1999969482422) 1 rows returned.
then copy to an external file, and copy that file to the target table
omnisql> copy (select 'LINESTRING('||lon0||' '||lat0||','||lon1||' '||lat1||')' from linestring_source) to '/tmp/ls.csv' with (header='false'); omnisql> copy linestring_target from '/tmp/ls.csv' with(header='false'); Result Loaded: 1 recs, Rejected: 0 recs in 0.022000 secs
now you have the table with a linestring geometry, populated with the linestring.
I did some benchmarks, using a quad-core processor with the NYC Taxi dataset, and create a table with roughly 14 million records, took 36051+52975 to unload and load (36051+11162 using a target temporary table) so the TP is floating around 157000 and 296000 records per second; the performance isn't so bad, but everything it's a little tricky.
Let me know if this helps
Please sign in to leave a comment.