Shapefile
The ESRI Shapefile is a file format for storing geospatial vector data.
- @loaders.gl/shapefile
- Wikipedia
- ESRI Shapefile Whitepaper
- Notes on Shapefile usage
- DBF header
- data types
- code pages
- _implementation notes_
Note that Shapefiles are falling out of favor in modern usage (likely due to the significant inconvenience of having to deal with multiple files). However, a lot of valuable geospatial data is still provided in Shapefile format, and sometimes only in this format. Additional information and some strong opinions can be found at switchfromshapefile.org.
A multi-file format
A Shapefile consists of a number of files that must be read and written together.
Because of this, they are typically stored together with the same file name but different extensions.
These related files are usually stored in the same directory or inside a common zip archive.
While it is possible to just load the geometries from a .shp file, files with extensions .shp, .shx, .dbf are often expected to exist,
and additional files with other extensions such as .prj and .cpg may also exist, if needed.
A common problem with shapefiles is that the user only opens the .shp file but not the accompanying files such as .dbf.
| File | Type | Contents |
|---|---|---|
| .shp | Binary | The geometry, i.e. the geometry column in the resulting table. |
| .dbf | Binary | The attributes, i.e. the data columns in the resulting table. |
| .shx | Binary | The index (technically required, however it is sometimes possible to open shapefiles without the index). |
| .prj | Text | A small, usually single-line text file containing a WKT-CRS style projection. WGS84 is assumed if not present. |
| .cpg | Text | A small text file containing a text encoding name for the DBF text fields. latin1 is assumed if not present. |
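As an illustration of the "same name, different extensions" convention described above, the following minimal Python sketch looks for the sibling files of a given .shp path. The function name `find_shapefile_parts` and the returned dictionary layout are hypothetical and are not part of loaders.gl; the sketch also assumes lowercase extensions and that all files sit in the same directory.

```python
from pathlib import Path
from typing import Dict, Optional

# Extensions listed in the table above (assumed lowercase on disk).
SHAPEFILE_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj", ".cpg")


def find_shapefile_parts(shp_path: str) -> Dict[str, Optional[Path]]:
    """Map each known extension to its sibling file, or None if it is missing."""
    base = Path(shp_path)
    parts: Dict[str, Optional[Path]] = {}
    for ext in SHAPEFILE_EXTENSIONS:
        # Same directory and file name, different extension.
        candidate = base.with_suffix(ext)
        parts[ext] = candidate if candidate.is_file() else None
    return parts
```

A caller could use the result to decide whether attributes (.dbf), a projection (.prj) or an encoding (.cpg) are available before loading.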
Coordinate Systems
Arbitrary coordinate reference systems are supported for Shapefiles.
Such coordinate systems are reprojected to WGS84 on import.
Encodings
The optional "code page" file (.cpg) specifies the encoding of any text data in the Shapefile (or more precisely, in the sidecar .dbf file). If no .cpg file is provided, latin1 encoding is assumed.
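The fallback rule above can be sketched in a few lines of Python. This is only an illustration, not how loaders.gl resolves encodings: the function name `resolve_dbf_encoding` is hypothetical, and two behaviors are assumptions of this sketch rather than facts from the format: bare numeric code pages such as `1252` are retried with a `cp` prefix, and unrecognized names fall back to latin1.

```python
import codecs
from typing import Optional

DEFAULT_DBF_ENCODING = "latin1"  # assumed when no .cpg file is provided


def resolve_dbf_encoding(cpg_text: Optional[str]) -> str:
    """Return a Python codec name for the contents of a .cpg file."""
    if cpg_text is not None:
        name = cpg_text.strip()
        # Try the name as written, then as a Windows-style code page (assumption).
        for candidate in (name, "cp" + name):
            try:
                return codecs.lookup(candidate).name
            except LookupError:
                continue
    # Missing or unrecognized .cpg content: fall back to latin1.
    return codecs.lookup(DEFAULT_DBF_ENCODING).name
```

The returned name can be passed to `bytes.decode` when reading text fields from the .dbf file.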
Geometries
A Shapefile always encodes a single geometry type. The following geometry types are defined:
| Shape type | GeoJSON | loaders.gl | Value | Fields |
|---|---|---|---|---|
| Null shape | null | ✅ | 0 | None |
| Point | Point | ✅ | 1 | X, Y |
| Polyline | LineString | ✅ | 3 | MBR, Number of parts, Number of points, Parts, Points |
| Polygon | Polygon | ✅ | 5 | MBR, Number of parts, Number of points, Parts, Points |
| MultiPoint | MultiPoint | ✅ | 8 | MBR, Number of points, Points |
| PointZ | Point | ✅ | 11 | X, Y, Z. Optional: M |
| PolylineZ | LineString | ✅ | 13 | MBR, Number of parts, Number of points, Parts, Points, Z range, Z array. Optional: M range, M array |
| PolygonZ | Polygon | ✅ | 15 | MBR, Number of parts, Number of points, Parts, Points, Z range, Z array. Optional: M range, M array |
| MultiPointZ | MultiPoint | ✅ | 18 | MBR, Number of points, Points, Z range, Z array. Optional: M range, M array |
| PointM | Point | ✅ | 21 | X, Y, M |
| PolylineM | LineString | ✅ | 23 | MBR, Number of parts, Number of points, Parts, Points. Optional: M range, M array |
| PolygonM | Polygon | ✅ | 25 | MBR, Number of parts, Number of points, Parts, Points. Optional: M range, M array |
| MultiPointM | MultiPoint | ✅ | 28 | MBR, Number of points, Points. Optional: M range, M array |
| MultiPatch | N/A | ❌ | 31 | MBR, Number of parts, Number of points, Parts, Part types, Points, Z range, Z array. Optional: M range, M array |
Value is the internal shapefile encoding of the shape type. MBR is the minimum bounding rectangle.
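To show where the numeric value from the table appears in practice, here is a minimal Python sketch that reads the shape type from the 100-byte main header of a .shp file. It relies on the header layout from the ESRI whitepaper (big-endian file code 9994 at byte 0, little-endian shape type at byte 32); the function name `read_shape_type` is hypothetical and the sketch is not taken from loaders.gl.

```python
import struct

# Shape type codes from the table above.
SHAPE_TYPES = {
    0: "Null shape", 1: "Point", 3: "Polyline", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolylineZ", 15: "PolygonZ", 18: "MultiPointZ",
    21: "PointM", 23: "PolylineM", 25: "PolygonM", 28: "MultiPointM",
    31: "MultiPatch",
}

SHP_HEADER_SIZE = 100  # bytes
SHP_FILE_CODE = 9994   # magic number, stored big-endian at byte 0


def read_shape_type(header: bytes) -> str:
    """Return the shape type name encoded in a .shp main file header."""
    if len(header) < SHP_HEADER_SIZE:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack_from(">i", header, 0)
    if file_code != SHP_FILE_CODE:
        raise ValueError(f"unexpected file code {file_code}")
    # The shape type is a little-endian 32-bit integer at byte offset 32.
    (shape_type,) = struct.unpack_from("<i", header, 32)
    return SHAPE_TYPES.get(shape_type, f"Unknown ({shape_type})")
```

For a file on disk, the header can be obtained with `open(path, "rb").read(100)`.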
Version History
- The shapefile format was introduced with ArcView GIS version 2 in the early 1990s.
Troubleshooting
- No data columns: The most common problem with shapefiles is probably that the user only opens the main .shp file. In this case only the geometry is included, but no data columns are present.
- Geometry projection issues: Geometry may fail to load or be visualized incorrectly without the associated .prj file.
- Incorrect strings: Encodings may not be correct without the .cpg file.
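The checklist above could be turned into a simple pre-flight check. The Python sketch below is purely illustrative: the function name `shapefile_warnings` and the message wording are hypothetical, and it only inspects which extensions are present, not the file contents.

```python
from typing import Iterable, List

# One warning per troubleshooting item above, keyed by the missing extension.
MISSING_FILE_WARNINGS = {
    ".dbf": "No .dbf file: only the geometry is loaded, no data columns.",
    ".prj": "No .prj file: WGS84 is assumed, geometry may be misplaced.",
    ".cpg": "No .cpg file: latin1 is assumed, strings may be decoded incorrectly.",
}


def shapefile_warnings(present_extensions: Iterable[str]) -> List[str]:
    """Return a warning for each sidecar file that was not provided."""
    present = {ext.lower() for ext in present_extensions}
    return [
        message
        for ext, message in MISSING_FILE_WARNINGS.items()
        if ext not in present
    ]
```

Calling it with only `[".shp"]` yields all three warnings, which matches the situation where a user opens just the main file.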
Also note that there is a very large number of possible projections and it is hard to test that every possible projection is supported. If your data is old or known to be problematic, it may be worth double checking that things look correct after importing.