build

Creates a new feature library from an OpenStreetMap data file.

Usage:

gol build [<options>] <gol-file> <source-file>

<gol-file> is the name of the library to build. If no extension is given, .gol will be added.

<source-file> is the file that contains the OpenStreetMap data. Currently, only files in OSM-PBF format are supported. This format is popular due to its compact size and wide tool support.

Sources of OpenStreetMap Data

A complete copy of the worldwide map data (updated weekly) is available for download on the OSM website. Because of the size of the download (multiple gigabytes) and limited server bandwidth, it is highly recommended to use torrents.

A better alternative may be using a mirror, or a provider of processed datasets.

If you are looking for regional or country-sized subsets, GeoFabrik offers a large selection of datasets (updated daily).

Hardware Requirements

Building a library is a resource-intensive operation. To build a GOL that contains the worldwide OpenStreetMap dataset, you should have a machine with at least 8 physical cores and 24 GB of RAM, or the build process may take multiple hours.

For smaller datasets, any reasonably modern machine will do fine (a 10-year-old dual-core laptop with 8 GB RAM will build country-size datasets such as Germany, France or Japan in about 20 minutes each). A fast solid-state drive is recommended in any case.

Your drive should have free space equal to at least three times the size of the .osm.pbf file (in addition to that file). So if you wish to download the planet file (currently 60 GB) and turn it into a GOL, you should have 240 GB of disk space available.

The resulting GOL itself will only be 30% to 50% larger than the planet file (The additional storage is needed to accommodate temporary files).
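The disk-space rule above is simple arithmetic. The following sketch reproduces the planet-file figure; the function name and its parameters are hypothetical and not part of the gol tool.

```python
def disk_space_needed_gb(pbf_size_gb, free_factor=3.0):
    """Return (free space needed for the build, total including the source file)."""
    # Free space for temporary files and the resulting GOL: three times the source size.
    free_needed = free_factor * pbf_size_gb
    # The .osm.pbf file itself also occupies the drive.
    total = free_needed + pbf_size_gb
    return free_needed, total


free_needed, total = disk_space_needed_gb(60)   # planet file, currently 60 GB
print(free_needed, total)                       # 180.0 240.0
```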

An alternative to building a GOL is downloading a GeoDesk tile set (via the load command). This is much faster and requires only minimal hardware.

Options

-c, --config=<FILE>

Instructs the command to use a specific configuration file.

If the option is specified without a value, the program writes a configuration file containing the default settings to the path where the library would be created, without actually building the library itself. The user can then customize these settings and invoke the build command again.

For example, gol build gols/planet -c creates planet.fab in the gols folder.

-q, --quiet

Displays only minimal output. Apart from error messages, only minimal progress updates are written to stderr.

-k, --keep-work

Retains the temporary files generated by the build process (instead of deleting them by default). If you might want to change a setting and rebuild the library, adding this option may significantly cut the build time by re-using work done by the previous invocation of the command. It does, however, increase the required amount of disk space.

-s, --silent

No output at all is written to stderr, not even error messages. (Whether a command succeeded or failed can only be ascertained via the status code returned by the process).

-u, --updatable

Creates a GOL that can be incrementally updated.

See build setting updatable.

-v, --verbose

Writes extra information to stderr.

Build Settings

area-tags 0.2

The tags that determine whether a closed OSM way is treated as an area or a linear ring. Rules can be specified for one or more keys. A closed way is treated as an area if it fulfills at least one of these rules (or is explicitly tagged area=yes), and is not tagged area=no.

Key rules have the following format:

key [ ( only|except value+ ) ]

Multiple key rules and values must be separated by whitespace and/or commas.

Example:

area-tags:
  building                            // Any "building" tag (except "building=no")
  barrier (only city_wall, ditch)     // "barrier=city_wall" or "barrier=ditch" 
  man_made (except embankment)        // Any "man_made" tag (but not "man_made=embankment" 
                                      // or "man_made=no")     
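To illustrate how these rules combine, here is a minimal Python sketch of the area test described above. It is not the gol implementation: the rule table is a hand-written representation of the example rules, and the function names are hypothetical.

```python
# Hand-written mirror of the example rules (not parsed from a configuration file):
# key -> None (any value), ("only", values) or ("except", values)
AREA_RULES = {
    "building": None,
    "barrier": ("only", {"city_wall", "ditch"}),
    "man_made": ("except", {"embankment"}),
}


def matches_rule(tags, key, rule):
    value = tags.get(key)
    if value is None or value == "no":    # a "no" value never satisfies a rule
        return False
    if rule is None:                      # bare key: any other value qualifies
        return True
    mode, values = rule
    return value in values if mode == "only" else value not in values


def is_area(tags, rules=AREA_RULES):
    """Decide whether a closed way with these tags is treated as an area."""
    if tags.get("area") == "no":          # explicit opt-out wins
        return False
    if tags.get("area") == "yes":         # explicit opt-in
        return True
    return any(matches_rule(tags, key, rule) for key, rule in rules.items())


print(is_area({"barrier": "ditch"}))          # True
print(is_area({"man_made": "embankment"}))    # False
```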

id-indexing 0.3

Value: yes / no (defaults to value of updatable)

If enabled, instructs the build command to retain the external ID indexes used during building, so incremental updates can be processed faster (updatable must be enabled).

In case this option is disabled, the update command can also re-create these indexes if needed.

indexed-keys

To enhance query performance, GOLs organize features into separate indexes based on their tags. The indexed-keys section specifies which keys should be considered for indexing. The ideal keys for indexing are those that create categories of features (similar to layers in a traditional GIS database), such as highway, landuse or shop. As the number of indexes is limited (see max-key-indexes), multiple keys may be consolidated into one index (this is done automatically on a per-type, per-tile basis). Features whose tags have multiple indexed keys (e.g. tourism and amenity for a hotel that is also a restaurant) are consolidated with features with the same key, or placed into a separate mixed-key index.

Keys that should always be placed into the same index can be specified as key-pairs by placing forward slashes between these keys (useful for rare-but-similar categories like telecom/communication).

Place an exclamation mark (!) after a key or key-pair to indicate keys that should be considered more important (i.e. more likely to be queried) than others. Likewise, mark entries with a question mark (?) to lower their importance. (Since version 0.2.)

Example:

indexed-keys:
  amenity
  building?
  highway
  natural/geological
  shop
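The entry syntax (a key or slash-separated key-pair, optionally followed by ! or ?) can be read with a few lines of Python. This is only an illustrative sketch; the function name and the "high" / "normal" / "low" labels are assumptions, not terms used by the gol tool.

```python
def parse_indexed_key(entry):
    """Split one indexed-keys entry into its keys and an importance hint."""
    entry = entry.strip()
    importance = "normal"
    if entry.endswith("!"):               # more likely to be queried
        importance, entry = "high", entry[:-1]
    elif entry.endswith("?"):             # less likely to be queried
        importance, entry = "low", entry[:-1]
    keys = entry.split("/")               # key-pairs always share one index
    return keys, importance


for line in ["amenity", "building?", "highway", "natural/geological", "shop"]:
    print(parse_indexed_key(line))
```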

key-index-min-features

Value: 0 – 1,000,000 (default: 300)

If there are fewer features in a key index than this number, these features will be consolidated into another index.

Used with indexed-keys and max-key-indexes.

max-key-indexes

Value: 0 – 30 (default: 8)

The maximum number of key-based indexes to create, per feature type (node, way, area, relation). A higher number boosts the performance of queries that make use of indexed keys (queries that require the presence of a key/tag). However, a higher number of key indexes may reduce the performance of queries not based on indexed keys. Key indexes are very storage-efficient, so specifying a higher number has a minimal impact on file size.

If the number of key indexes is lower than the number of keys and key-pairs in indexed-keys, features with less frequent keys will be consolidated in one or more combined indexes. Index consolidation also happens if the number of features in an index is below key-index-min-features.
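The interaction of max-key-indexes and key-index-min-features can be pictured with the following sketch. It is a deliberate simplification and an assumption about the general idea only: it ranks keys once by feature count and uses a single pool of consolidated keys, whereas the build command decides per feature type and per tile. The function name and the sample counts are hypothetical.

```python
def plan_key_indexes(feature_counts, max_key_indexes=8, min_features=300):
    """Return (keys that get their own index, keys that are consolidated)."""
    # Most frequent keys first; ties are broken alphabetically for stable output.
    ranked = sorted(feature_counts, key=lambda k: (-feature_counts[k], k))
    own, consolidated = [], []
    for key in ranked:
        if len(own) < max_key_indexes and feature_counts[key] >= min_features:
            own.append(key)
        else:
            consolidated.append(key)      # too rare, or no index slot left
    return own, consolidated


counts = {"highway": 5000, "building": 9000, "shop": 120, "amenity": 800}
print(plan_key_indexes(counts, max_key_indexes=2))
```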

Specifying 0 disables key indexing.

max-strings

Value: 256 – 65,535 (default: 16,384)

The maximum number of strings that will be stored in the GOL’s Global String Table. A higher value results in a smaller GOL file and increased query performance. (Loading a larger string table consumes more memory and may cause a slight delay when opening a GOL file, but the impact of this is generally negligible.)

The actual number of strings will be less if fewer strings meet the minimum usage threshold (min-string-usage).

max-tiles

Value: 1 – 8,000,000 (default: 65,535)

The maximum number of tiles into which the features of the GOL are organized. The actual number of tiles may be significantly less, based on min-tile-density. A lower tile count results in a more compact GOL, while a higher tile count improves the performance of certain large-area spatial queries.

A higher setting is also preferred if you intend to host a tile repository, as a more granular tileset reduces the amount of data users will have to download for their regions of interest.

If this number is set too low, a tile may exceed the maximum size of 1 GB (uncompressed). An unreasonably low setting may also cause the build process to fail with an OutOfMemoryError.

min-string-usage

Value: 1 – 100,000,000 (default: 300)

Specifies the minimum number of times a string must be used by features (as a tag key or value, or as a role in a relation) in order to be included in the GOL’s Global String Table.

See max-strings.
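The relationship between min-string-usage and max-strings can be sketched as a filter followed by a cut-off. The code below assumes that, when more strings qualify than fit, the most frequently used ones are kept; this ordering is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def select_global_strings(usage, max_strings=16_384, min_string_usage=300):
    """Pick the strings that would go into the string table."""
    # Keep only strings used often enough.
    eligible = [(s, n) for s, n in usage.items() if n >= min_string_usage]
    # Assumption: the most frequently used strings win when space runs out.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [s for s, _ in eligible[:max_strings]]


usage = Counter({"highway": 1000, "name": 900, "residential": 700, "Main Street": 12})
print(select_global_strings(usage, max_strings=2))   # ['highway', 'name']
```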

min-tile-density

Value: 1 – 10,000,000 (default: 75,000)

If there are fewer nodes in a tile area than this number, the tile will be omitted, and all features in the tile area will be placed into tiles at lower zoom levels. A lower threshold will result in more tiles, up to the maximum specified by max-tiles.

properties

A section with key-value pairs that are stored as GOL metadata, which are displayed by gol info and can be read by other applications.

Common properties include:

  • generator: The program used to create the GOL (“geodesk/gol 0.2.0”)
  • copyright: Text indicating the copyright holder of the data (“OpenStreetMap contributors”)
  • license: The license under which the data is distributed (“Open Database License 1.0”)
  • license-url: Link to the website where the license text can be found (“https://opendatacommons.org/licenses/odbl/1-0/”)
  • tileset-url: The default URL from which tiles can be downloaded or updated (e.g. “https://data.geodesk/world”)

If you wish to distribute tilesets based on OpenStreetMap data, you must do so in accordance with the Open Database License. You can use the build command to create a GOL from any geodata in OSM-PBF format, so in theory, GOLs could contain data from non-OSM sources (or very old OSM datasets distributed under a Creative Commons License) – but in general, you should not override the defaults for copyright, license and license-url.

To set properties from the command line, use --property:property=value or -p:property=value.
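If you invoke the build from a script, the property options can be assembled programmatically. The sketch below only builds the argument list, following the usage line and the --property:property=value form given above; the helper name, the file names and the example URL are hypothetical.

```python
def build_command(gol_file, source_file, properties=None):
    """Assemble the argument list for `gol build` with --property options."""
    args = ["gol", "build"]
    for key, value in (properties or {}).items():
        args.append(f"--property:{key}={value}")   # one option per property
    args += [gol_file, source_file]                # options precede the file arguments
    return args


cmd = build_command(
    "planet",                                      # hypothetical library name
    "planet.osm.pbf",                              # hypothetical source file
    {"tileset-url": "https://example.com/tiles"},  # placeholder URL
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run as is, which avoids shell-quoting problems with property values.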

rtree-bucket-size

Value: 1 – 256 (default: 16)

GOLs use R-tree indexes to accelerate spatial queries. This setting specifies the maximum number of features (or child buckets) in each bucket (A bucket is a node in the R-tree). A larger number reduces the size of the GOL, a smaller number increases query performance (but set too low, it may have the opposite effect). Square numbers (4, 9, 16, 25, etc.) tend to perform best.

tag-duplicate-nodes

Value: yes / no (default)

By default, untagged nodes are discarded (unless they are members of one or more relations), since they simply exist as locations on ways. In certain cases, problems arise if two or more untagged nodes share the same location, because it becomes unclear whether features with coincident geometry are supposed to be connected. Enabling this option causes such duplicate nodes to be tagged with geodesk:duplicate=true, which turns them into feature nodes (with a distinct identity).
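As a rough illustration of what counts as a duplicate, the following sketch groups untagged nodes by location and tags every node that shares its location with another one. Whether the build command tags all coincident nodes or all but one is not stated here, so tagging all of them is an assumption; the function name and the input layout are hypothetical.

```python
from collections import defaultdict


def tag_duplicate_nodes(untagged_nodes):
    """untagged_nodes maps node id -> (lon, lat); returns node id -> added tags."""
    by_location = defaultdict(list)
    for node_id, location in untagged_nodes.items():
        by_location[location].append(node_id)
    # Assumption: every node at a shared location receives the tag.
    return {
        node_id: {"geodesk:duplicate": "true"}
        for ids in by_location.values() if len(ids) > 1
        for node_id in ids
    }


nodes = {1: (10.0, 20.0), 2: (10.0, 20.0), 3: (11.0, 20.0)}
print(tag_duplicate_nodes(nodes))
```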

tag-orphan-nodes

Value: yes / no (defaults to value of updatable)

By default, any nodes that have no tags and aren’t referenced by any ways or relations are discarded. However, if this option is enabled, such orphan nodes are tagged geodesk:orphan=true and retained as feature nodes.

If a GOL is updatable, this option is enabled by default.

tile-zoom-levels

The zoom levels at which tile-tree nodes should be created. Together with max-tiles and min-tile-density, this setting shapes the tile structure of a GOL.

  • Zoom levels must be between 0 and 12.
  • The difference between zoom levels must not exceed 3 (e.g. you can specify 0,3,6,9, but not 0,4,6,12).
  • The root level (0) is always included implicitly.
  • Fewer zoom levels result in a flatter tree that may yield better query performance, but cause a higher variance in tile sizes.
  • Setting the top zoom level too low may cause the maximum tile size (1 GB uncompressed) to be exceeded. (Very large tiles may also cause the build process to run out of memory.)
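The first three rules above are easy to check before starting a long build. This is a minimal sketch of such a check, not the validation performed by the gol tool; the function name is hypothetical.

```python
def validate_zoom_levels(levels):
    """Return the normalized list of zoom levels, or raise ValueError."""
    normalized = sorted(set(levels) | {0})     # the root level 0 is always included
    if normalized[0] < 0 or normalized[-1] > 12:
        raise ValueError("zoom levels must be between 0 and 12")
    for lower, upper in zip(normalized, normalized[1:]):
        if upper - lower > 3:
            raise ValueError(f"gap between levels {lower} and {upper} exceeds 3")
    return normalized


print(validate_zoom_levels([3, 6, 9]))         # [0, 3, 6, 9]
try:
    validate_zoom_levels([0, 4, 6, 12])
except ValueError as err:
    print(err)                                 # gap between levels 0 and 4 exceeds 3
```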

updatable 0.3

Value: yes / no (default)

Enables incremental updates to the GOL file (using the update command). If enabled, additional storage is required for the way-node indexes (an extra 20% above the size of the GOL). The build command will also take slightly longer.

For GOLs built from a planet-wide dataset, it is highly recommended to also enable id-indexing, which speeds up the processing of updates. The id-indexing option does not create extra work for the build command, as it has to create these indexes anyway. If kept, these indexes do however consume significant extra storage (25% above the size of the GOL, with a higher ratio for extracts).

If you no longer need to update a GOL, you can delete its external indexes and recover the storage. If you delete the ID indexes (or disable id-indexing during the initial build), you can later re-create them with the --index option of the update command. However, deleting the way-node indexes permanently disables updating of the GOL, as these indexes cannot be re-created (You would then need to run the build command again to build a new updatable GOL).
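Two other settings, id-indexing and tag-orphan-nodes, take their default from updatable. The sketch below shows how the effective values could be resolved when a setting is left unspecified; the function name is hypothetical, and the sketch does not enforce the requirement that id-indexing needs updatable to be enabled.

```python
def effective_settings(updatable=False, id_indexing=None, tag_orphan_nodes=None):
    """Resolve settings whose default follows `updatable` (None = not specified)."""
    return {
        "updatable": updatable,
        "id-indexing": updatable if id_indexing is None else id_indexing,
        "tag-orphan-nodes": updatable if tag_orphan_nodes is None else tag_orphan_nodes,
    }


print(effective_settings(updatable=True))
print(effective_settings(updatable=True, id_indexing=False))
```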