How georeference.it works

Millions of natural history specimens in GBIF have no coordinates — only a written description of where they were collected. georeference.it is a platform for volunteers to fix that, one locality at a time.

The problem

Natural history collections hold hundreds of millions of specimens collected over centuries. Most were recorded long before GPS existed. A specimen label might read "Serra da Estrela, Portugal" or "near the old mill, Sierra Nevada" — rich locality information, but no decimal coordinates that a computer can use.

Without coordinates, these records are invisible to species distribution models, protected area assessments, climate change analyses, and most modern biodiversity informatics workflows. The records exist in GBIF but are excluded from the majority of research uses.

What georeferencing is

Georeferencing is the process of interpreting a textual locality description and assigning it decimal coordinates and a coordinate uncertainty radius — the smallest circle that could reasonably contain the actual collection location.

A well-georeferenced record includes:

  • decimalLatitude / decimalLongitude — the best estimate of the location
  • coordinateUncertaintyInMeters — the radius of the uncertainty circle
  • geodeticDatum — the coordinate reference system (WGS84)
  • georeferenceProtocol — the method used
  • georeferencedBy / georeferencedDate — provenance

We follow the Georeferencing Best Practices (Chapman & Wieczorek, 2020) and the Georeferencing Quick Reference Guide.
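
As a rough illustration, the fields listed above could be held in a small Python record like the one below. Only the field names come from the list; the class name `Georeference`, the range checks, and every example value (coordinates, radius, contributor name, date) are assumptions made for this sketch, not the platform's data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Georeference:
    """Hypothetical container for the fields of a georeferenced record."""
    decimalLatitude: float
    decimalLongitude: float
    coordinateUncertaintyInMeters: float
    georeferenceProtocol: str
    georeferencedBy: str
    georeferencedDate: date
    geodeticDatum: str = "WGS84"

    def __post_init__(self):
        # Basic sanity checks (assumed here, not taken from the platform).
        if not -90 <= self.decimalLatitude <= 90:
            raise ValueError("decimalLatitude must be between -90 and 90")
        if not -180 <= self.decimalLongitude <= 180:
            raise ValueError("decimalLongitude must be between -180 and 180")
        if self.coordinateUncertaintyInMeters <= 0:
            raise ValueError("coordinateUncertaintyInMeters must be positive")


# Illustrative values only: an approximate point with a 5 km uncertainty radius.
example = Georeference(
    decimalLatitude=40.32,
    decimalLongitude=-7.61,
    coordinateUncertaintyInMeters=5000,
    georeferenceProtocol="Georeferencing Best Practices (Chapman & Wieczorek, 2020)",
    georeferencedBy="Example Volunteer",
    georeferencedDate=date(2024, 1, 1),
)
```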

Locality groups

Many GBIF records share the same written locality — different specimens of different species all collected at "Parque Nacional da Peneda-Gerês, Minho, Portugal", for example. Georeferencing one of them effectively georeferences all of them.

georeference.it groups records by their combination of locality fields (country, state/province, county, municipality, island, verbatim locality). Each group is georeferenced once, and the result applies to all records in the group. This makes the work much more efficient: a single georeferencing effort can improve hundreds or thousands of GBIF records simultaneously.
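
The grouping described above can be sketched in a few lines of Python. This is a minimal sketch, not the platform's implementation: the Darwin Core style key names in `LOCALITY_FIELDS`, the helper names, and the normalisation (trimming whitespace and ignoring case) are all assumptions.

```python
from collections import defaultdict

# Assumed field names for the six locality fields mentioned in the text.
LOCALITY_FIELDS = (
    "country",
    "stateProvince",
    "county",
    "municipality",
    "island",
    "verbatimLocality",
)


def locality_key(record):
    """Build a grouping key from the locality fields of one record.

    Missing fields count as empty strings. Trimming and case-folding
    are assumptions of this sketch.
    """
    return tuple((record.get(f) or "").strip().casefold() for f in LOCALITY_FIELDS)


def group_by_locality(records):
    """Return a dict mapping each locality key to the records that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[locality_key(record)].append(record)
    return dict(groups)


records = [
    {"country": "Portugal", "stateProvince": "Minho",
     "verbatimLocality": "Parque Nacional da Peneda-Gerês"},
    {"country": "Portugal", "stateProvince": "Minho",
     "verbatimLocality": "parque nacional da peneda-gerês "},
    {"country": "Portugal", "verbatimLocality": "Serra da Estrela"},
]
groups = group_by_locality(records)
```

In this example the first two records fall into the same group, so a single georeference would apply to both.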

Suggestions and validation

Anyone can submit a georeferencing suggestion for a locality group, and other contributors review it. To avoid relying on a single person's judgement, the platform uses a consensus mechanism:

  1. A contributor places a point on the map and sets the uncertainty radius. They may add remarks explaining their reasoning.
  2. Other contributors review the suggestion. They can agree with it (adding a validation vote), submit a competing suggestion if they disagree, or leave a comment.
  3. Once a suggestion accumulates enough weighted validation votes, it is marked validated and becomes the platform's official georeferencing for that locality group.
  4. Data publishers (natural history collections) can retrieve validated georeferences via the API and submit them back to GBIF, improving the public dataset.

Consistency checking

For localities where GBIF already has coordinates (georeferenced by the data publisher), georeference.it runs an automatic consistency check. It clusters all georeferenced occurrences within a locality group and checks whether they agree spatially.

If records in the same named locality are spread across widely separated locations, something is likely wrong — a coordinate error, a transcription mistake, or a genuine ambiguity in the locality name. These groups are flagged as inconsistent and presented to contributors for review, with one competing suggestion per cluster so the community can identify which coordinates are correct.
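
One way to picture the check is the sketch below. The clustering method (single linkage on great-circle distance) and the 10 km threshold are assumptions chosen for illustration; the platform's actual algorithm and parameters are not described here.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def cluster_points(points, max_gap_m=10_000):
    """Single-linkage clustering: a point joins (and merges) every existing
    cluster that has a member within max_gap_m of it."""
    clusters = []
    for p in points:
        merged, rest = [p], []
        for c in clusters:
            if any(haversine_m(p, q) <= max_gap_m for q in c):
                merged.extend(c)
            else:
                rest.append(c)
        clusters = rest + [merged]
    return clusters


def is_inconsistent(points, max_gap_m=10_000):
    """Flag a locality group whose coordinates form more than one cluster."""
    return len(cluster_points(points, max_gap_m)) > 1


# Illustrative coordinates: two points close together and one far away.
points = [(41.73, -8.16), (41.74, -8.15), (38.70, -9.10)]
clusters = cluster_points(points)
```

Here the two nearby points form one cluster and the distant point forms another, so the group would be flagged, and each cluster could seed one competing suggestion.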

System auto-suggestions

Many locality groups contain a mix of records: some already have GBIF coordinates (georeferenced by the collection), others do not. When a group has at least one georeferenced occurrence, georeference.it automatically creates a system suggestion using those existing coordinates as a starting point.

These auto-suggestions are marked as coming from the georeference.it system and carry lower weight than human contributions. They serve as a baseline: a pre-filled suggestion that contributors can accept, refine, or replace based on their own assessment of the locality description.

Auto-suggestions are only created when no human suggestion already exists for the group, so they never override community work.
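
The rule above can be sketched as a small function. The dictionary keys, the `"source"` values, and the function name are hypothetical. The text does not say how the platform chooses among several georeferenced occurrences, so this sketch simply takes the first one as an assumption.

```python
def make_system_suggestion(occurrences, existing_suggestions):
    """Return a system suggestion for a locality group, or None.

    A suggestion is created only if no human suggestion exists and at
    least one occurrence already carries coordinates.
    """
    # Never override community work.
    if any(s.get("source") == "human" for s in existing_suggestions):
        return None

    georeferenced = [
        o for o in occurrences
        if o.get("decimalLatitude") is not None
        and o.get("decimalLongitude") is not None
    ]
    if not georeferenced:
        return None

    seed = georeferenced[0]  # assumption: use the first georeferenced record
    return {
        "source": "system",
        "decimalLatitude": seed["decimalLatitude"],
        "decimalLongitude": seed["decimalLongitude"],
        "coordinateUncertaintyInMeters": seed.get("coordinateUncertaintyInMeters"),
    }


occurrences = [
    {"decimalLatitude": None, "decimalLongitude": None},
    {"decimalLatitude": 41.73, "decimalLongitude": -8.16,
     "coordinateUncertaintyInMeters": 2000},
]
suggestion = make_system_suggestion(occurrences, existing_suggestions=[])
```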

Contributor levels and vote weight

Not all validation votes carry the same weight. Contributors earn experience by having their suggestions validated by others. As their track record grows, their votes carry more weight — reflecting the community's trust in their judgement.

A suggestion becomes validated when its total accumulated vote weight reaches 60 points. Current contributor levels:

Level        Minimum validated contributions  Vote weight  Validations needed to pass a suggestion
Beginner     None                             10           6
Contributor  10                               20           3
Mapper       100                              30           2
Curator      500                              40           2
Expert       1000                             50           2

Levels and thresholds are reviewed periodically as the community grows.
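
The arithmetic behind the table can be checked with a short sketch. The threshold and level values come from the table above; the function names are hypothetical, and the sketch assumes that only validation votes count towards the 60 points, which the text does not spell out.

```python
import math

VALIDATION_THRESHOLD = 60

# (level name, minimum validated contributions, vote weight), lowest first
LEVELS = [
    ("Beginner", 0, 10),
    ("Contributor", 10, 20),
    ("Mapper", 100, 30),
    ("Curator", 500, 40),
    ("Expert", 1000, 50),
]


def level_for(validated_contributions):
    """Return (level name, vote weight) for a contributor's track record."""
    name, weight = LEVELS[0][0], LEVELS[0][2]
    for level_name, minimum, level_weight in LEVELS:
        if validated_contributions >= minimum:
            name, weight = level_name, level_weight
    return name, weight


def is_validated(voter_contribution_counts):
    """True once the summed vote weight of all voters reaches the threshold."""
    total = sum(level_for(count)[1] for count in voter_contribution_counts)
    return total >= VALIDATION_THRESHOLD


def votes_needed(weight):
    """Votes required if every voter has the same weight."""
    return math.ceil(VALIDATION_THRESHOLD / weight)


for level_name, _, weight in LEVELS:
    print(level_name, votes_needed(weight))
```

Votes from different levels add up, so one Curator vote (40) plus one Contributor vote (20) would also reach 60 under these assumptions.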

Status labels

Ungeoreferenced

No coordinates exist for this locality group in GBIF or georeference.it.

Suggestion pending

A georeferencing suggestion has been submitted and is awaiting validation.

Conflicted

Two or more competing suggestions exist. The community needs to resolve the disagreement.

Validated ✓

A suggestion has accumulated enough validation votes and is the platform's accepted georeferencing.

GBIF georef

GBIF already has coordinates for this group from the data publisher. These coordinates may still need review.

Reviewed ✓

GBIF's coordinates have been reviewed and confirmed correct by the community.

Ready to contribute?

No account is needed to start georeferencing. Each locality takes about two minutes.