I managed to find a Global Islands / Global Shoreline Vector (apparently a description, rather than an actual name?) that boasts 30-meter resolution and a total of 340691 islands, including an explicit layer for ones smaller than 0.0036 km^2. That... should be enough for my purposes.
Now what the heck is an .mpk file and how do I open it?
It appears to be a 7-Zip archive, but the files inside it are equally opaque. A lot of them have "gdb" in their name, which probably stands for Geo Database or something (and not GNU Debugger like I'm used to).
And why does it contain two separate copies of its gigabyte-large data, differing only in the tiny .mxd header file at their root?
I can probably open it with this. Doesn't explain what the .mpk or .mxd files are.
Ah, found it. (Here, too.) Well, sorta. Still doesn't clarify what the info in the .mxd file is and whether I actually need it, but I think probably not. Looks like it's largely relevant to drawing the data as an actual graphic map, which I'm not doing.
Okay. I can probably work with this. Let's see...
Yup, I'm in!
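For anyone following along, here is a minimal sketch of how one might poke at the extracted geodatabase through GDAL's Python bindings. It assumes the .mpk has already been unpacked with 7-Zip and that the GDAL build can read file geodatabases; the function name `describe_gdb` and the example path are made up for illustration, not taken from the dataset.

```python
def describe_gdb(gdb_path):
    """Return (layer name, feature count, field names, proj4 string) per layer."""
    # Imported lazily so this file still loads where GDAL is not installed.
    from osgeo import ogr

    # Opens read-only; yields None on failure unless GDAL exceptions are enabled.
    ds = ogr.Open(gdb_path)
    if ds is None:
        raise RuntimeError(f"could not open {gdb_path!r} as a vector dataset")

    summary = []
    for i in range(ds.GetLayerCount()):
        layer = ds.GetLayer(i)
        defn = layer.GetLayerDefn()
        fields = [defn.GetFieldDefn(j).GetName()
                  for j in range(defn.GetFieldCount())]
        srs = layer.GetSpatialRef()  # None if the layer carries no CRS
        proj4 = srs.ExportToProj4() if srs is not None else None
        summary.append((layer.GetName(), layer.GetFeatureCount(), fields, proj4))
    return summary


# Hypothetical usage; the path is wherever the .gdb folder ended up:
# for name, count, fields, proj4 in describe_gdb("extracted/islands.gdb"):
#     print(name, count, fields, proj4)
```

Listing the field names and the projection string up front is a cheap way to spot things like duplicate area columns or an unexpected coordinate system before doing any real work.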
The areas this time are:
Island        | Dataset            | Wikipedia   | Error
Borneo        | 723154.066521 km^2 | 748168 km^2 | 3.34%
Madagascar    | 592521.410312 km^2 | 587041 km^2 | 0.93%
Baffin Island | 507204.948981 km^2 | 507451 km^2 | 0.05%
Sumatra       | 428134.156904 km^2 | 443065 km^2 | 3.37%
Maybe it's Wikipedia that's wrong.
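For reference, the error column is consistent with taking the absolute difference relative to the Wikipedia figure. A small sketch of that arithmetic, with the helper name `percent_error` being my own:

```python
def percent_error(measured_km2, reference_km2):
    """Absolute difference as a percentage of the reference value."""
    return abs(measured_km2 - reference_km2) / reference_km2 * 100.0


# (dataset area, Wikipedia area) in km^2, copied from the table above.
areas = {
    "Borneo": (723154.066521, 748168),
    "Madagascar": (592521.410312, 587041),
    "Baffin Island": (507204.948981, 507451),
    "Sumatra": (428134.156904, 443065),
}

for name, (dataset, wiki) in areas.items():
    print(f"{name:<14} {percent_error(dataset, wiki):.2f}%")
```

Dividing by the dataset value instead would shift the percentages slightly, so the choice of reference is worth stating whenever the numbers get compared.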
Maher Island and Motu Nui are definitely in this time, with area values that look right. Pandora might be... I'm finding too many islands named Pandora and it would take some more work to figure out which is the relevant one. Oh yeah, this database has name data as well.
Though confusingly, this dataset appears to give two different area values for each island. The second one has the database field name "Area_Geode", which is presumably short for "geodesic" and not, say, mineral geodes (field names seem to be limited to 10 characters).
Why doesn't this thing come with documentation?
...Okay, found something. Hidden deep inside "a00000004.gdbtable" is the following junk, which I'd be surprised to find any way of retrieving from within GDAL:
<Process ToolSource="c:\arcgis\pro\Resources\ArcToolbox\toolboxes\Data Management Tools.tbx\CalculateGeometryAttributes" Date="20200323" Time="164400">CalculateGeometryAttributes FinalMerged_GlobalIslands_Clean "IslandArea_km2 AREA;IslandCoastline_km PERIMETER_LENGTH;Area_Geodesic_km2 AREA_GEODESIC;Coastline_Geodesic_km PERIMETER_LENGTH_GEODESIC" Kilometers "Square kilometers" #</Process>
Apparently "AREA_GEODESIC" is an actual data type in ArcGIS.
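To make that log entry easier to read, here is a rough sketch that pulls the field-to-property pairs out of it. It assumes the third argument of the logged command is a semicolon-separated list of "field PROPERTY" pairs, which is my reading of the snippet rather than anything documented; `geometry_attributes` is a made-up helper name.

```python
import shlex
import xml.etree.ElementTree as ET

# The <Process> element exactly as quoted above (raw string keeps the backslashes).
PROCESS_XML = r'''<Process ToolSource="c:\arcgis\pro\Resources\ArcToolbox\toolboxes\Data Management Tools.tbx\CalculateGeometryAttributes" Date="20200323" Time="164400">CalculateGeometryAttributes FinalMerged_GlobalIslands_Clean "IslandArea_km2 AREA;IslandCoastline_km PERIMETER_LENGTH;Area_Geodesic_km2 AREA_GEODESIC;Coastline_Geodesic_km PERIMETER_LENGTH_GEODESIC" Kilometers "Square kilometers" #</Process>'''


def geometry_attributes(process_xml):
    """Map each output field name to the geometry property stored in it."""
    element = ET.fromstring(process_xml)
    # shlex keeps the quoted field list together as a single token.
    tokens = shlex.split(element.text)
    # Assumed layout: tool name, input table, then the "field PROPERTY;..." list.
    field_list = tokens[2]
    return dict(pair.split(" ", 1) for pair in field_list.split(";"))


mapping = geometry_attributes(PROCESS_XML)
for field, prop in mapping.items():
    print(f"{field:<22} {prop}")
```

Laid out this way, it is clear that each of area and coastline length was computed twice, once planar and once geodesic.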
Here's the documentation. And... what. WHAT. Why would you even include non-geodesic measurements!?
The planar and geodesic areas are almost the same because the data is nominally in the Mollweide projection, which is, of course, equal-area (at least when you account for ellipsoidal flattening before projecting). Which makes it slightly worrying that they're not exactly the same. But the data also includes planar coastline lengths and why would you include that. Not that I need coastline lengths for my current application, but still. If I ever do, I'll need to be careful to query "Coast_Geod" and not "IslandCoas".
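To put a number on why planar lengths are suspect here, this is a small sketch of how much Mollweide stretches a short east-west segment at a given latitude. It uses the textbook spherical Mollweide formulas, which is an assumption; the dataset's actual projection parameters may differ, and the function names are mine.

```python
import math


def mollweide_theta(lat_deg, tol=1e-12):
    """Solve 2*theta + sin(2*theta) = pi*sin(lat) by Newton's method."""
    if abs(abs(lat_deg) - 90.0) < 1e-9:
        # The derivative vanishes at the poles, so return the known answer.
        return math.copysign(math.pi / 2, lat_deg)
    lat = math.radians(lat_deg)
    theta = lat  # reasonable starting guess
    for _ in range(100):
        f = 2 * theta + math.sin(2 * theta) - math.pi * math.sin(lat)
        step = f / (2 + 2 * math.cos(2 * theta))
        theta -= step
        if abs(step) < tol:
            break
    return theta


def east_west_scale(lat_deg):
    """Projected length divided by true length for a short east-west segment."""
    theta = mollweide_theta(lat_deg)
    projected = (2 * math.sqrt(2) / math.pi) * math.cos(theta)  # dx/dlon, unit sphere
    true = math.cos(math.radians(lat_deg))                      # along the parallel
    return projected / true


for lat in (0, 30, 60, 70):
    print(f"lat {lat:>2} deg: east-west scale {east_west_scale(lat):.3f}")
```

The projection can preserve area while still distorting lengths, so a perimeter measured in projected coordinates depends on where the island is and which way its coast runs.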
Okay, so using PROPER areas this time:
Island        | Dataset (geodesic) | Wikipedia   | Error
Borneo        | 718332.882620 km^2 | 748168 km^2 | 3.99%
Madagascar    | 589438.117224 km^2 | 587041 km^2 | 0.41%
Baffin Island | 509659.742371 km^2 | 507451 km^2 | 0.44%
Sumatra       | 425283.080335 km^2 | 443065 km^2 | 4.01%
It's not like I was expecting them to match. If they did, I'd have noticed sooner.
I seriously hope that the actual polygon coordinates are not stored in the Mollweide projection rather than something sensible.
Terrible data formats aside, I really think this is the dataset I'm looking for. Now to see if I can get the other half of my scheme to work...
EDIT: It looks like mainland Antarctica is missing from the dataset, even though Antarctic islands are present. (Or at least, Siple Island and Maher Island are included. Some other Antarctic islands seem to be missing.)
UPDATE: Okay, so it turns out that yes, the internal coordinates are in the Mollweide projection.
Why would you take a perfectly good dataset and then do this to it. Why.
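For the record, getting back from Mollweide x/y to latitude and longitude is not hard, just annoying. Below is a minimal sketch of the spherical inverse formulas. The sphere radius (WGS84 semi-major axis) and the central meridian of 0 are assumptions on my part and should be checked against the layer's actual spatial reference; a proper reprojection library would be the safer route for real work.

```python
import math

R = 6378137.0  # ASSUMED sphere radius in metres (WGS84 semi-major axis)


def mollweide_inverse(x, y, radius=R, lon0_deg=0.0):
    """Convert spherical Mollweide x/y (metres) to (lat, lon) in degrees."""
    # Clamping guards against rounding just outside asin's domain.
    s = max(-1.0, min(1.0, y / (radius * math.sqrt(2))))
    theta = math.asin(s)
    sin_lat = (2 * theta + math.sin(2 * theta)) / math.pi
    lat = math.asin(max(-1.0, min(1.0, sin_lat)))
    cos_theta = math.cos(theta)
    if cos_theta < 1e-12:
        # At a pole the longitude is undefined; fall back to the central meridian.
        lon = math.radians(lon0_deg)
    else:
        lon = math.radians(lon0_deg) + math.pi * x / (
            2 * radius * math.sqrt(2) * cos_theta)
    return math.degrees(lat), math.degrees(lon)


print(mollweide_inverse(0.0, 0.0))                   # projection centre
print(mollweide_inverse(2 * math.sqrt(2) * R, 0.0))  # right edge of the map
```

Running a few vertices of an unnamed polygon through something like this is enough to tell roughly where on the globe it sits.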
Also, the dataset appears to do the "splitting at the 180th meridian" thing. The dataset recognizes six continents: Australia, Africa, South America, North America, Eurasia, and Chukchi Peninsula. (They aren't actually named in the files - despite having names for minor islands, they didn't bother to name the continents for some reason - but I identified most of them by matching their areas, and Chukchi by deciphering its coordinates.)
Now that I'm looking, I can tell that Vanua Levu is likewise split, with the data explicitly having entries for "Vanua Levu east of dateline" (area 5784.13 km^2) and "Vanua Levu west of dateline" (area 281.57 km^2). Which is confusing, since according to Wikipedia the whole island is only 5587.1 km^2. It isn't even using "dateline" correctly, since the international date line isn't simply the 180th meridian all the way through, and it definitely runs east of Vanua Levu.
I'm not certain what's up with other islands that are supposed to be on the 180th meridian. Taveuni only gets named in the dataset once.
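If the split entries need to be stitched back together, one rough approach is to strip the "east/west of dateline" suffix and sum the areas. The sketch below assumes every split island follows the same naming pattern as Vanua Levu, which I have not verified across the dataset; the helper name is hypothetical.

```python
import re
from collections import defaultdict

# Matches the suffix seen on the two Vanua Levu entries.
SPLIT_SUFFIX = re.compile(r"\s+(east|west) of dateline$")


def merge_split_islands(records):
    """Sum areas of entries that differ only by an east/west-of-dateline suffix."""
    totals = defaultdict(float)
    for name, area_km2 in records:
        totals[SPLIT_SUFFIX.sub("", name)] += area_km2
    return dict(totals)


records = [
    ("Vanua Levu east of dateline", 5784.13),
    ("Vanua Levu west of dateline", 281.57),
]

merged = merge_split_islands(records)
wikipedia_km2 = 5587.1
excess = (merged["Vanua Levu"] - wikipedia_km2) / wikipedia_km2 * 100.0
print(f"Vanua Levu: {merged['Vanua Levu']:.2f} km^2, {excess:.1f}% over Wikipedia")
```

The two halves add up to about 6065.70 km^2, roughly 8.6% more than the Wikipedia figure, so the discrepancy is not just an artifact of reading one half in isolation.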
ANOTHER EDIT: So it turns out this dataset thinks Delmarva is an island, instead of a peninsula. Among other issues.
Not that I care greatly about Delmarva specifically, but how do you mistake a peninsula for an island and then claim to have 30 meter accuracy? (According to Wikipedia, the narrowest point of the isthmus connecting Delmarva to the mainland is 19 kilometers.)