Here’s how to make your ecology mapping workflows accurate, efficient & standardised in time for next field season. PART 4: Data
Introduction
In part 1 of this article we looked at how to conduct a GIS audit and develop a GIS Roadmap to guide your upgrade of mapping workflows. The next part looked at how to ensure you have the right GIS software and IT infrastructure to support it. The third looked at staff learning pathways, to ensure they are able to benefit from incredible open-source software like QGIS.
We are lucky. The UK is one of the best mapped countries in the world. Which shouldn’t be too much of a surprise; the Ordnance Survey (OS) has been working on it since 1747! So our ‘spatial data framework’ is well established. Right down to your garden shed.
In this final part, we turn our attention to data. We’ll explore how the data you choose to use as background directly impacts the accuracy of your work, and the time it takes to deliver it. Specifically, I will:
show you how to appraise the usefulness of a dataset
discuss how you license them for use on a project
signpost some great datasets
explore how these datasets improve your workflows
Appraising accuracy & precision of GIS datasets
When I worked at GiGL, London’s Local Environmental Record Centre, we would classify data as Internal or External. It’s a distinction I still find useful today. By Internal, I mean data you own / have created. By External, I mean data that you have sourced / licensed to inform your work. When I’m thinking about incorporating External data into my workflows, some key things I check, either by reviewing the metadata or by loading the data into QGIS, are:
Accuracy
Positional
How close the feature's location is to its true position.
Visually inspect alignment with a trusted base map, like OS MasterMap or OS National Geographic Database.
Attribute Accuracy
How correct the descriptive data is (e.g. habitat classification).
Compare to your own ‘ground truth’ / survey data, or a trusted dataset.
Temporal Accuracy
How current or timely the data is.
Is it time-stamped / clearly attributed with a date of capture?
Precision
Map projection
There are many projections, but the most accurate for the UK is EPSG:27700 (British National Grid): check the data is in that, or expect accuracy of no better than about 2m.
Coordinate Precision
Have the Eastings / Northings been rounded, leaving 0s at the end? E.g. 523000, 112000 is rounded to the 1km grid.
If you are looking at raster data, the pixel size will determine what you can identify, so zoom in and measure it (see also the Raster Data section below).
Digitising Precision
Check the smoothness and detail in vector shapes and scale of data capture, e.g. a field boundary with too few nodes will appear jagged.
Attribute Precision
Check the level of detail in descriptive data (e.g., “tree” vs. “Oak tree”).
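The projection and coordinate precision checks above can be roughed out in a few lines of Python. This is a minimal sketch rather than a definitive test: the function names are hypothetical, the extent used for the British National Grid (0 to 700,000m east, 0 to 1,300,000m north) is the nominal grid extent, and the rounding check assumes coordinates have been exported as whole metres.

```python
# Nominal extent of the British National Grid (EPSG:27700), in metres.
BNG_MAX_EASTING = 700_000
BNG_MAX_NORTHING = 1_300_000


def guess_coordinate_type(x, y):
    """Heuristic guess at what a coordinate pair represents.

    Returns "degrees" for values small enough to be longitude / latitude,
    "bng" for values inside the nominal British National Grid extent,
    and "unknown" for anything else (e.g. Web Mercator metres).
    """
    if abs(x) <= 180 and abs(y) <= 90:
        return "degrees"
    if 0 <= x <= BNG_MAX_EASTING and 0 <= y <= BNG_MAX_NORTHING:
        return "bng"
    return "unknown"


def implied_precision_m(coords):
    """Return the largest power of ten (in metres) that divides every coordinate.

    coords is a list of (easting, northing) pairs as whole metres.
    A result of 1000 suggests the data has been rounded to the 1km grid;
    a result of 1 means at least one coordinate is recorded to the metre.
    """
    step = 100_000
    while step > 1:
        if all(e % step == 0 and n % step == 0 for e, n in coords):
            return step
        step //= 10
    return 1
```

For the example given above, implied_precision_m([(523000, 112000)]) returns 1000, which flags the point as likely rounded to the 1km grid. A single point can end in zeros by chance, so run it over the whole dataset before drawing conclusions.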
Availability
Essentially, this comes down to licence terms and conditions, and cost. All datasets come with a licence, and you will need to check that it allows you to use the data in the way you want to. Here’s a useful guide to licensing from the Open Data Institute. Some common licences you will come across are:
The UK Open Government Licence (most UK government and OS open data uses this licence)
Public Sector Geospatial Agreement (if you are working on a project for a PSGA organisation you can look to use OS premium data as a contractor)
Google Geo-guidelines
BING terms
Client expectations
A final point on appraising data is to remember that clients and LPAs will be appraising the data you provide them in much the same way as above. The days when you could hide poor quality digitising by displaying your data on a small scale PDF map in a report are coming to an end. Depending on the client, you may well now be asked to provide the data as evidence of your BNG assessments, or to share the data with project partners, like landscape architects. Using good quality (background) data to underpin your analysis will make it a lot easier to achieve good quality outputs. Use something like Google or BING as your background and you’ll likely find your data ends up in the middle of the road when compared with LPA / OS data.
Signposting data
There are a myriad of GIS datasets available now. The above section will help you appraise datasets for yourself, but I thought it might also be helpful to signpost some sources I think you’ll find super useful.
Vector Data
As you are no doubt aware, this represents geographic features as points, lines and polygons. With vector data, it’s easy in QGIS to digitise your own data by snapping to existing boundaries and/or re-classify features based, for instance, on your habitat survey.
OS National Geographic Database
Probably the single most useful vector data I can think of is the enhanced land cover in the NGD. This data is as accurate as OS MasterMap but includes EUNIS and UK BAP Broad Habitat definitions, and UKHab is coming soon. Being OS, it provides unrivalled coverage, accuracy and precision, plus regular updates.
UK Government Opendata portals
Authoritative UK government protected area, habitat, and other environmental datasets are available from the following data portals:
Raster Data
As a reminder, this is a grid-based format consisting of pixels, each with a value (e.g., elevation, temperature, land cover). What you can distinguish depends on the resolution / pixel size. Drone pixels are typically c.2cm, aerial imagery c.10cm, and satellite images c.10m. There has been an explosion of Earth Observation (EO) data in recent years, and many tools are available to process and classify the resulting raster data in QGIS. For instance, you can now easily calculate vegetation density / health via the Normalised Difference Vegetation Index (NDVI). Here are a few datasets that may be helpful to you:
Bluesky Aerial Imagery & National Tree Map
I appraised a bunch of datasets that could usefully inform a BNG assessment. I liked the Bluesky data so much I signed up as a reseller. This was because, unlike Google and BING:
You know the date the image was taken (and it’s guaranteed to be within the last 2 years)
The images are 12.5 or 5 cm resolution, and are ortho-rectified so they align with OS MasterMap/NGD (no more sides of buildings / not knowing where the true footprint is).
The Bluesky National Tree Map data points can be classified for BNG, which saves you needing to digitise an individual tree layer at all.
The licensing is clear, unlike Google and BING.
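Where your imagery includes a near-infrared band alongside the red band (an assumption: check which bands your licensed product actually contains), the Normalised Difference Vegetation Index mentioned at the start of this section is simple to compute. Here is a minimal sketch in plain Python, with hypothetical function names and bands held as nested lists; in practice you would more likely use the raster calculator in QGIS.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red).

    With non-negative band values the result lies between -1 and 1.
    Returns None when both bands are zero, where NDVI is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two bands of identical shape.

    Each band is a list of rows, and each row is a list of pixel values.
    """
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Dense, healthy vegetation reflects strongly in the near-infrared and absorbs red light, so it scores towards 1, while bare ground and hard surfaces sit close to 0.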
Environment Agency LiDAR data
Light Detection and Ranging (LiDAR) is an airborne mapping technique, which uses a laser to measure the height of the terrain and surface objects on the ground such as trees and buildings. There is c.99% coverage of England at 1m spatial resolution.
It is an incredible, free resource that can be used to calculate areas on slopes, conduct visual impact assessments, and identify different kinds of vegetation and archaeological features.
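To illustrate the point about areas on slopes: a polygon measured on a flat map understates the true ground area wherever the land is sloping. The sketch below is one simple way to estimate the surface area from a grid of heights, assuming the heights are ground levels in metres sampled on a regular grid (such as the 1m LiDAR) and already read into nested Python lists. The function name is hypothetical and the gradient is estimated with basic forward differences, so treat it as an approximation.

```python
import math


def surface_area_m2(heights, cell_size=1.0):
    """Approximate the true surface area covered by a grid of heights.

    heights is a list of rows of ground heights in metres, spaced
    cell_size metres apart. Each square between four neighbouring
    heights is scaled by sqrt(1 + (dz/dx)^2 + (dz/dy)^2), the ratio of
    sloping area to flat (plan) area for that gradient.
    """
    total = 0.0
    for r in range(len(heights) - 1):
        for c in range(len(heights[r]) - 1):
            # Forward differences give the gradient in each direction.
            dz_dx = (heights[r][c + 1] - heights[r][c]) / cell_size
            dz_dy = (heights[r + 1][c] - heights[r][c]) / cell_size
            total += cell_size ** 2 * math.sqrt(1 + dz_dx ** 2 + dz_dy ** 2)
    return total
```

A 3 by 3 grid of heights spans four 1m squares, so flat ground gives 4 square metres. Tilt the same grid at 45 degrees and the surface area rises by a factor of the square root of 2, to roughly 5.66 square metres.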
Conclusion
As we wrap up this four-part series, I think perhaps the key takeaway from this final part on data is: the UK is one of the best mapped countries in the world, so make the most of the data that is already available.
I keep hearing from clients that they are increasingly being asked to prove their BNG habitat boundaries are accurate, and/or supply the exact location of quadrats used on a condition assessment. If you're starting a project from scratch and haven’t sourced or assessed existing datasets, there’s a good chance you’re wasting valuable time digitising features that have already been mapped - often more accurately. With so many high-quality datasets now available, both open and licensed, there’s rarely a need to reinvent the wheel.
Without high-quality baseline data, in the correct projection, even the best-intentioned ecology mapping workflows can be a real struggle. Misaligned data will cause headaches when trying to match up your red line boundary with baseline habitat data or post-development layouts. At best, this slows down your workflow and undermines confidence in your outputs. At worst, it could result in your habitat polygons ending up in the middle of the road when the Local Planning Authority (LPA) inspects your data.
Good data is the foundation of a robust BNG assessment. Taking the time now to standardise your data inputs, align projections (EPSG:27700 is your friend), and integrate trusted sources will save you hours later - and help ensure your work holds up under scrutiny.
Maplango is an OS Partner and Bluesky reseller. If you would like to discuss integrating premium data into your workflows, or would like to do more with open data, drop me a line: matt@maplango.com.