Metadata
Definition
Metadata are ‘data about data’—descriptions that explain what a dataset is, how it was created, when it was updated, who owns it, where it applies, its quality, license, and how to access it. In geospatial contexts, metadata cover spatial extent, coordinate reference system, scale, lineage, accuracy statistics, attributes, usage constraints, and contact information. Good metadata are structured for machines yet readable for humans. They enable discovery in catalogs, support reproducibility, and reduce misuse by clarifying fitness-for-use. Metadata should travel with data through pipelines so derivatives retain provenance. Persistent identifiers and versioning allow citation. Without metadata, maps become untrustworthy pictures. Treat metadata updates as code changes—reviewed, versioned, and deployed—so catalogs remain trustworthy as systems evolve. Treat metadata updates as code changes—reviewed, versioned, and deployed—so catalogs remain trustworthy as systems evolve. Provide clear usage notes, QA artifacts, and version history to aid reuse and review. Provide clear usage notes, QA artifacts, and version history to aid reuse and review. Provide clear usage notes, QA artifacts, and version history to aid reuse and review.
Application
Agencies publish open data with standardized metadata to power portals. Scientists attach metadata to datasets for peer-reviewed research. Companies maintain internal catalogs so analysts can find authoritative layers instead of duplicating work. Emergency operations rely on up-to-date metadata for confidence in critical decisions.
FAQ
How detailed should metadata be?
Enough to enable reuse without guesswork: purpose, methods, dates, CRS, accuracy, lineage, license, and contacts. Avoid boilerplate; be specific.
Can metadata be generated automatically?
Partially: schemas, CRS, and field types can be detected; lineage and intent require human input. Pipelines should fill what they can and prompt owners for the rest.
How are versions handled?
Assign identifiers and changelogs; record deprecated fields; keep prior versions accessible or reproducible for audits.
What happens if metadata are missing?
Trust erodes; analysts may misuse data or rebuild it from scratch. Treat metadata as a first-class deliverable.