
Shapefile

The ESRI Shapefile is a file format for storing geospatial vector data.

Note that Shapefiles are falling out of favor in modern usage (likely due to the significant inconvenience of having to deal with multiple files). However, a lot of valuable geospatial data is still published in Shapefile format, and sometimes only in this format. Additional information and some strong opinions can be found at switchfromshapefile.org.

A multi-file format

A Shapefile consists of several files that must be read and written together. These files typically share the same base file name but have different extensions, and are stored in the same directory or inside a common zip archive. While it is possible to load just the geometries from the .shp file, the .shp, .shx and .dbf files are usually all expected to be present. Additional files with other extensions, such as .prj and .cpg, may accompany them when needed.

A common problem with shapefiles is that the user only opens the .shp file but not the accompanying files such as .dbf.

| File | Type   | Contents |
| ---- | ------ | -------- |
| .shp | Binary | The geometry, i.e. the geometry column in the resulting table. |
| .dbf | Binary | The attributes, i.e. the data columns in the resulting table. |
| .shx | Binary | The index (technically required, however it is sometimes possible to open shapefiles without the index). |
| .prj | Text   | A small, usually single-line text file containing a WKT-CRS style projection. WGS84 is assumed if not present. |
| .cpg | Text   | A small text file containing a text encoding name for the DBF text fields. latin1 is assumed if not present. |
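
As a minimal sketch of the file layout described above, the following Python snippet checks which sidecar files sit next to a given .shp path. The function names `find_sidecar_files` and `missing_required` are hypothetical, and treating .shp, .shx and .dbf as "required" is this sketch's reading of the table, not the behavior of any particular loader. It also assumes the extensions are lower case, which may not hold on case-sensitive file systems.

```python
from pathlib import Path

# Extensions taken from the table above. Which ones count as "required"
# is an assumption of this sketch, not a rule enforced by a specific loader.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")
OPTIONAL_EXTENSIONS = (".prj", ".cpg")


def find_sidecar_files(shp_path):
    """Map each known extension to its sibling path, or None if it is absent."""
    base = Path(shp_path)
    found = {}
    for ext in REQUIRED_EXTENSIONS + OPTIONAL_EXTENSIONS:
        candidate = base.with_suffix(ext)
        found[ext] = candidate if candidate.is_file() else None
    return found


def missing_required(shp_path):
    """List the required extensions that have no file next to shp_path."""
    found = find_sidecar_files(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS if found[ext] is None]
```

Calling `missing_required("roads.shp")` before loading gives an early hint that, for example, the attribute columns will be absent because the .dbf file was not copied along with the .shp file.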

Coordinate Systems

Arbitrary coordinate reference systems are supported for Shapefiles.

Geometries stored in a coordinate system other than WGS84 are reprojected to WGS84 on import.

Encodings

The optional "code page" file (.cpg) specifies the encoding of any text data in the Shapefile (or more precisely, in the sidecar .dbf file). If no .cpg file is provided, latin1 encoding is assumed.
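
The fallback rule above can be sketched in a few lines of Python. This is an illustration only: the function name `dbf_text_encoding` is hypothetical, and falling back to latin1 when the .cpg file names an encoding that Python does not recognize is an assumption of this sketch, since the text only covers the case where the file is missing.

```python
import codecs
from pathlib import Path

DEFAULT_ENCODING = "latin1"  # the fallback stated in the text above


def dbf_text_encoding(shp_path):
    """Return a normalized Python codec name for the .dbf text fields."""
    cpg_path = Path(shp_path).with_suffix(".cpg")
    if not cpg_path.is_file():
        # No .cpg sidecar: assume latin1.
        return codecs.lookup(DEFAULT_ENCODING).name

    # Encoding names are plain ASCII; drop any stray non-ASCII bytes.
    declared = cpg_path.read_bytes().decode("ascii", errors="ignore").strip()
    try:
        return codecs.lookup(declared).name
    except LookupError:
        # Assumption of this sketch: unknown names also fall back to latin1.
        return codecs.lookup(DEFAULT_ENCODING).name
```

The returned name can be passed to `bytes.decode` when decoding the text fields of the .dbf file. Real .cpg files may use names that Python's codec registry does not know, in which case an explicit mapping table would be needed.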

Geometries

A Shapefile always encodes a single geometry type. The following shape types are supported:

| Shape type  | GeoJSON    | loaders.gl | Value | Fields |
| ----------- | ---------- | ---------- | ----- | ------ |
| Null shape  | null       |            | 0     | None |
| Point       | Point      |            | 1     | X, Y |
| Polyline    | LineString |            | 3     | MBR, Number of parts, Number of points, Parts, Points |
| Polygon     | Polygon    |            | 5     | MBR, Number of parts, Number of points, Parts, Points |
| MultiPoint  | MultiPoint |            | 8     | MBR, Number of points, Points |
| PointZ      | Point      |            | 11    | X, Y, Z. Optional: M |
| PolylineZ   | LineString |            | 13    | MBR, Number of parts, Number of points, Parts, Points, Z range, Z array. Optional: M range, M array |
| PolygonZ    | Polygon    |            | 15    | MBR, Number of parts, Number of points, Parts, Points, Z range, Z array. Optional: M range, M array |
| MultiPointZ | MultiPoint |            | 18    | MBR, Number of points, Points, Z range, Z array. Optional: M range, M array |
| PointM      | Point      |            | 21    | X, Y, M |
| PolylineM   | LineString |            | 23    | MBR, Number of parts, Number of points, Parts, Points. Optional: M range, M array |
| PolygonM    | Polygon    |            | 25    | MBR, Number of parts, Number of points, Parts, Points. Optional: M range, M array |
| MultiPointM | MultiPoint |            | 28    | MBR, Number of points, Points. Optional: M range, M array |
| MultiPatch  |            |            | 31    | MBR, Number of parts, Number of points, Parts, Part types, Points, Z range, Z array. Optional: M range, M array |
  • value is the internal shapefile encoding
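
To illustrate how the Value column is used, here is a minimal sketch that reads the shape type from the main .shp file header. The byte offsets are not stated in this document; they follow the ESRI Shapefile Technical Description (a 100-byte header, a big-endian file code of 9994 at byte 0, and a little-endian shape type at byte 32). The function name `read_shape_type` is hypothetical.

```python
import struct

# Shape type codes from the table above.
SHAPE_TYPE_NAMES = {
    0: "Null shape", 1: "Point", 3: "Polyline", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolylineZ", 15: "PolygonZ", 18: "MultiPointZ",
    21: "PointM", 23: "PolylineM", 25: "PolygonM", 28: "MultiPointM",
    31: "MultiPatch",
}

SHP_HEADER_SIZE = 100   # bytes, per the ESRI technical description
SHP_FILE_CODE = 9994    # magic number stored at byte 0, big-endian


def read_shape_type(header):
    """Return (value, name) for the shape type stored in a .shp header."""
    if len(header) < SHP_HEADER_SIZE:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack_from(">i", header, 0)
    if file_code != SHP_FILE_CODE:
        raise ValueError("not a .shp file: unexpected file code")
    # The shape type is a little-endian 32-bit integer at byte offset 32.
    (shape_type,) = struct.unpack_from("<i", header, 32)
    return shape_type, SHAPE_TYPE_NAMES.get(shape_type, "unknown")
```

A typical call would pass the first 100 bytes of the file, for example `read_shape_type(open(path, "rb").read(100))`. Because a Shapefile holds a single geometry type, this one value describes every non-null record in the file.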

Version History

  • The shapefile format was introduced with ArcView GIS version 2 in the early 1990s.

Troubleshooting

  • No data columns: The most common problem with shapefiles is probably that the user only opens the main .shp file. In this case only the geometry is loaded, and no data columns are present.
  • Geometry projection issues: Geometry may fail to load, or may be displayed incorrectly, without the associated .prj file.
  • Incorrect strings: Encodings may not be correct without the .cpg file.

Also note that there is a very large number of possible projections and it is hard to test that every possible projection is supported. If your data is old or known to be problematic, it may be worth double checking that things look correct after importing.