Speaking of scale bars, I've been trying to look for landmass/coastline datasets, and it seems they like describing their accuracy in terms of scale like "1:1000000", and I'm confused because I'm looking for vector data that I can use in computations, I'm not going to be displaying or printing this at any scale, how is a scale ratio supposed to tell me how accurate your data is, is a smaller or larger value better, ARGH!
Also, most of the alleged datasets I'm finding references to seem to be dead links, if anyone bothered to provide a link at all.
Vector data sources (split from “Most widely used projections”)
Re: Most widely used projections
Milo wrote: Wed Aug 30, 2023 10:43 pm
Speaking of scale bars, I've been trying to look for landmass/coastline datasets, and it seems they like describing their accuracy in terms of scale like "1:1000000", and I'm confused because I'm looking for vector data that I can use in computations, I'm not going to be displaying or printing this at any scale, how is a scale ratio supposed to tell me how accurate your data is, is a smaller or larger value better, ARGH!

What region and coverage are you looking for, and at what scale(!)?
Imagine a physical map at the scale given. Meaning, pick whatever local area you wish that is contained in the data set, and assume a conformal projection for that region at the given scale. Then imagine digitizing this map, producing vectors that faithfully represent the linework. That’s what the scale means. All else being equal, given two data sets of the same region but with different scales, the one with the smaller denominator (the larger scale, so 1:1,000,000 rather than 1:10,000,000) will have more detail.
— daan
Re: Most widely used projections
Whole world, in whatever scale I can find.
What I'm looking for is a list of all landmasses (continents/islands), with detailed coastline coordinates, along with the name and total area of each. I want it to include even really small islands like Moto Nui and Maher Island, and to be able to recognize that, say, Afro-Eurasia is a single contiguous landmass, but Severny Island and Yuzhny Island are two separate islands.
First-level landmasses are enough, I don't need lake islands.

I noticed that the software used to compute Point Nemo has recently been made publicly available, so I've been trying to play around with it, but it needs datasets.
Yeah, but... physical ink-on-paper maps can have varying resolution. Some could be printed with enough detail in the placement of ink molecules that you would need a magnifying glass to make it out, while others could be visibly pixellated even to the naked eye.
Sure, usually you'd aim to have about the level of detail that the human eye can make out, because more than that is overkill, but it's not like everyone's eyes are identical, so it's good to have a little overkill just to be on the safe side.
So you can tell me that you scaled a kilometer on the planet to a millimeter on the map, or whatever, but how much actual detail is represented in that one millimeter!? It's not like a computer using floating-point numbers would have any difficulty processing digitized nanometer-scale data, if I tell it to.
Re: Most widely used projections
Maybe you could assume 300 dpi or so. Yes, the printer is likely to have printed the map at 600 dpi or more, but I wouldn't expect the level of detail represented to exceed around 300 dpi.
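To put a number on that, here is a minimal sketch that turns a scale denominator plus an assumed dpi into the ground distance covered by one dot. The function name and the list of example scales are only illustrative, and the dpi figure is the guess from this post, not a property of any particular dataset.

```python
INCH_IN_METERS = 0.0254


def ground_size_of_dot(scale_denominator: float, dpi: float) -> float:
    """Ground distance in meters covered by one dot at the given map scale.

    scale_denominator is the N in "1:N"; dpi is an assumed level of detail
    on the imagined paper map.
    """
    dot_on_paper_m = INCH_IN_METERS / dpi
    return dot_on_paper_m * scale_denominator


if __name__ == "__main__":
    # Example scales only; 300 dpi is an assumption, not a dataset property.
    for denominator in (1_000_000, 10_000_000, 50_000_000, 110_000_000):
        meters = ground_size_of_dot(denominator, 300)
        print(f"1:{denominator:,} at 300 dpi: about {meters:,.0f} m per dot")
```

Lowering the assumed dpi makes each dot cover proportionally more ground, so the result is only as good as the dpi guess.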
Re: Most widely used projections
While 300 dpi is a reasonable assumption for a printed map, the “frequency” of directional change in a vector isn’t going to be anywhere near that rapid. Trying to translate into DPI is fraught, but most linework caricatures aren’t even going to change with a frequency of 72 dpi.
If not a scale value, what would you suggest as a measure of detail that is still meaningful to the bulk of cartographers who are not particularly technical?
— daan
Re: Most widely used projections
Milo wrote: Thu Aug 31, 2023 1:27 am
Whole world, in whatever scale I can find.
What I'm looking for is a list of all landmasses (continents/islands), with detailed coastline coordinates, along with the name and total area of each. I want it to include even really small islands like Moto Nui and Maher Island, and to be able to recognize that, say, Afro-Eurasia is a single contiguous landmass, but Severny Island and Yuzhny Island are two separate islands.
First-level landmasses are enough, I don't need lake islands

By far the best freely usable data set for worldwide coverage at small scales is Natural Earth Data. It has no licensing restrictions and is kept up to date. You can choose from 1:10,000,000, 1:50,000,000, and 1:110,000,000 levels of detail. Obviously you can do your own generalization, but these data sets are curated beyond automated generalization. They are in Shapefiles, if that’s all you care about, but they come with SQL tables for querying features as well.
I haven’t checked whether the database tables include area (probably not), but GeographicLib will do that for you given any arbitrary geographical polygon. Also freely licensed.
— daan
Re: Most widely used projections
daan wrote: Thu Aug 31, 2023 8:08 am
I haven’t checked whether the database tables include area (probably not), but GeographicLib will do that for you given any arbitrary geographical polygon. Also freely licensed.

On the other hand, a planar polygon area calculation is trivial, and so is projecting to an equal-area map (even with ellipsoid accuracy), so that’s what I do to compute areas, unless the region intersects with a projection boundary, in which case you’ve got more complications to deal with. GeographicLib is handy for handling all that on the reference spheroid.
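For anyone following along, the project-then-measure approach might look roughly like the sketch below. It makes several simplifying assumptions that are not from the post: a sphere rather than an ellipsoid (ellipsoid accuracy would need authalic latitudes), the Lambert cylindrical equal-area projection as the equal-area map, polygon edges treated as straight lines in the projected plane, and no handling of rings that cross the projection boundary at ±180°. All function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical assumption


def project_equal_area(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Lambert cylindrical equal-area projection on a sphere."""
    x = radius * math.radians(lon_deg)
    y = radius * math.sin(math.radians(lat_deg))
    return x, y


def shoelace_area(points):
    """Unsigned planar area of a ring given as a list of (x, y) pairs."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_area_m2(ring_lonlat):
    """Approximate area in square meters of a ring of (lon, lat) degrees."""
    projected = [project_equal_area(lon, lat) for lon, lat in ring_lonlat]
    return shoelace_area(projected)
```

A ring whose edges follow meridians and parallels comes out exact on the sphere with this projection; for arbitrary coastline rings, the error depends on how closely spaced the vertices are.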
— daan
Re: Most widely used projections
Maximum error, measured in degrees, arcseconds, meters, or the like. So if you say your data is accurate to 1 kilometer, then that means all features of 1 kilometer or larger are represented, with coordinates that lie within 1 kilometer of their correct location, and that points making up a vector polyline are typically 1 kilometer apart.
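One part of that definition, the typical distance between consecutive polyline points, is easy to measure directly on whatever data you download. A rough sketch, assuming a spherical Earth and coordinates given as (lon, lat) pairs in degrees; the function names are hypothetical, and this says nothing about positional error or about which features are missing:

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical assumption


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_vertex_spacing_km(polyline):
    """Median distance between consecutive vertices of a (lon, lat) polyline.

    Needs at least two vertices.
    """
    spacings = [haversine_km(*p, *q) for p, q in zip(polyline, polyline[1:])]
    return median(spacings)
```

The median is used rather than the mean so that a few long straight segments do not dominate the result.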
daan wrote: Thu Aug 31, 2023 8:08 am
By far the best freely usable data set for worldwide coverage at small scales is Natural Earth Data. It has no licensing restrictions and is kept up to date. You can choose from 1:10,000,000, 1:50,000,000, and 1:110,000,000 levels of detail. Obviously you can do your own generalization, but these data sets are curated beyond automated generalization. They are in Shapefiles, if that’s all you care about, but they come with SQL tables for querying features as well.

I've looked at it a little. The only dataset that includes names appears to be the Physical Labels one, and it doesn't list any of the three islands that defined Point Nemo, and it's completely separate from the main coastline data. I can't find SQL anywhere?
Though I guess it's possible they're represented as unnamed islands (Hrvoje Lukatela mentions - all the way at the bottom - that his original DCW dataset - which appears to have disappeared off the web - didn't name Pandora Islet, either).
Also the Land dataset casually mentions that "continental polygons broken into smaller, contiguous pieces to avoid having too many points in any one polygon", which is exactly what I don't want. I guess if it's ONLY the major continents then I might be able to fix it myself by gluing them back together, but how hard that is depends on just how many pieces they mean, and if they just casually mention something like this it suggests that keeping landmasses in one piece wasn't considered a priority, so I can't be 100% confident how reliable the data is even for smaller islands.
And apparently it's normal for shapefiles to split their polygons along the 180th meridian in order to avoid confusing GIS software that doesn't understand that the earth is round (how the heck does such software even exist!?). I suppose I could use Wikipedia's list of islands that cross the 180th meridian to make sure I catch them all...
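For rings that are stored in one piece but with longitudes wrapped into the -180 to 180 range, a small helper can undo the wrap and flag the ones that cross the 180th meridian. This is only a sketch with made-up function names; it assumes rings are lists of (lon, lat) pairs in degrees, and it does not glue back together polygons that a dataset has already cut into separate pieces.

```python
def unwrap_longitudes(ring):
    """Shift longitudes by multiples of 360 so that consecutive vertices
    never differ by more than 180 degrees. Latitudes are left untouched."""
    if not ring:
        return []
    unwrapped = [ring[0]]
    for lon, lat in ring[1:]:
        prev_lon = unwrapped[-1][0]
        # Smallest signed longitude step from the previous vertex.
        delta = (lon - prev_lon + 180.0) % 360.0 - 180.0
        unwrapped.append((prev_lon + delta, lat))
    return unwrapped


def crosses_antimeridian(ring):
    """True if the unwrapped ring extends beyond +/-180 degrees.

    A ring that only touches the meridian exactly at 180 is not flagged.
    """
    return any(abs(lon) > 180.0 for lon, _ in unwrap_longitudes(ring))
```

After unwrapping, the ring can be fed to ordinary planar code without the spurious jump from one edge of the map to the other.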
Other possible sources I've found are VMAP0 (apparently an official successor to DCW?) and GSHHG, but I haven't tried downloading them yet to check if they suit my purposes, and the web pages are rather... uninformative... about what exactly they're offering.
Re: Most widely used projections
Milo wrote: Thu Aug 31, 2023 8:50 am
Maximum error, measured in degrees, arcseconds, meters, or the like. So if you say your data is accurate to 1 kilometer, then that means all features of 1 kilometer or larger are represented, with coordinates that lie within 1 kilometer of their correct location, and that points making up a vector polyline are typically 1 kilometer apart.

While more rigorous, that’s probably a lot less interpretable for most mapmakers.

Milo wrote:
I can't find SQL anywhere?

The tables are part of the Shapefile format and are found in the files having the .dbf extension. Strictly speaking they are dBASE attribute tables rather than SQL, but GIS tools let you query them much like database tables.

Milo wrote:
Also the Land dataset casually mentions that "continental polygons broken into smaller, contiguous pieces to avoid having too many points in any one polygon", which is exactly what I don't want. I guess if it's ONLY the major continents then I might be able to fix it myself by gluing them back together, but how hard that is depends on just how many pieces they mean, and if they just casually mention something like this it suggests that keeping landmasses in one piece wasn't considered a priority, so I can't be 100% confident how reliable the data is even for smaller islands.

I don’t think that comment in the Lands link is accurate. The Shapefile spec does not even permit mixed geometry types, so if a Shapefile has some polygons in it, then all of what it holds must be polygons. I can confirm that all of Eurasia/Africa is a single polygon, as is North and South America, in the coastline data set.

Milo wrote:
The only dataset that includes names appears to be the Physical Labels one, and it doesn't list any of the three islands that defined Point Nemo, and it's completely separate from the main coastline data.

I haven’t paid any attention to the labeling, but given that there are labels and points associated with them for every coverage, the situation looks a lot better to me than you describe. For example, there is a minor_islands_coastline data set, and a minor_islands_label_points dataset that presumably pairs with the coastline data set.
— daan
Re: Most widely used projections
Milo wrote: Thu Aug 31, 2023 8:50 am
And apparently it's normal for shapefiles to split their polygons along the 180th meridian in order to avoid confusing GIS software that doesn't understand that the earth is round (how the heck does such software even exist!?). I suppose I could use Wikipedia's list of islands that cross the 180th meridian to make sure I catch them all...

Natural Earth does not do this. As for the rant about GIS software… see Chrisman 2016, Calculating on a round planet. Yes, we are all boggled.
— daan