README for digitizing Northern Ellesmere Island ice shelf and ice tongue extents (and other complementary geospatial data)

Derek Mueller, Carleton University 2016-11-16 (derek.mueller@carleton.ca)

Recommended Citation: 
English:
Mueller, DR, Copland, L, Jeffries, MO (2017) Northern Ellesmere Island ice shelf and ice tongue extents, v. 1.0 (1906-2015). Nordicana D28, doi:10.5885/45455XD-24C73A8A736446CC

French:
Mueller, DR, Copland, L, Jeffries, MO (2017) Étendue des plates-formes de glace et des langues glacières de la côte nord de l'île d'Ellesmere, v. 1.0 (1906-2015). Nordicana D28, doi:10.5885/45455XD-24C73A8A736446CC

===================================================================================================================
Purpose: This dataset was generated to examine the extent of the Ellesmere Island ice shelves and ice tongues over time.  Data from 13 observation years were combined to examine ice shelves and ice tongues along the northern coast of Ellesmere Island.  The dataset also contains polygons for other ice types and can be used to track environmental change in the Arctic.  Precursors of this dataset have been used in Mueller DR, Crawford A, Copland L, Van Wychen W (2013) Ice Island and Iceberg Fluxes from Canadian High Arctic Sources. Northern Transportation Assessment Initiative, Innovation Policy Branch, Transport Canada, Ottawa and in the 2015, 2011 and 2008 editions of the Atlas of Canada, Natural Resources Canada.

For more information, please consult the following:  Mueller DR, Copland L, Jeffries MO (2017) Changes in Canadian Arctic ice shelf extent since 1906. In: Copland L, Mueller DR (eds) Arctic Ice Shelves and Ice Islands, Springer SBM, Dordrecht 

Abstract (from this book chapter)
The ice shelves along the northern coast of Ellesmere Island have been in a state of decline since at least the early 20th century. Available data derived from explorers' journals, aerial photographs and satellite imagery have been compiled into a single geospatial database of ice shelf and glacier ice tongue extent over 13 observation periods between 1906 to 2015. During this time there was a loss of 8,061 km2 (94%) in ice shelf area. The vast majority of this loss occurred via episodic calving, in particular during the first six decades of the 20th century. More recently, between 1998 and 2015, 515 km2 of shelf ice calved. Some ice shelves also thinned in situ, transitioning to thinner and weaker ice types that can no longer be considered ice shelf, although the timing of this shift is difficult to constrain with the methods used here. Some ice shelves composed partly of ice tongues (glacier or composite ice shelves) also disintegrated to the point where the ice tongues were isolated, representing a loss of ice shelf extent. Our digitization methods were typically repeatable to within 3%, and generally agreed with past determinations of extent. This research highlights the fact that the break-up of these massive features is an ongoing phenomenon. It is hoped that this comprehensive dataset will provide a basis for comparison of future changes in this region.

Keywords: Ice shelf, Ice tongue, Break-up, Calving, Climate change, Change detection, Remote sensing, Geographical Information System (GIS), Arctic, Satellite imagery, Synthetic Aperture Radar (SAR), Ellesmere Island, Coastal ice

===================================================================================================================
What is included in this data set: 

The data are available in a single comma-separated-values file, EllesmereIS.csv, whose fields are defined below. 
The data are also provided as shapefiles: "EllesmereIS.shp" contains the same data as EllesmereIS.csv, and there is one shapefile per observation year, named "IS_YYYY_#Season.shp".  

Files are provided in the following directories:
-scripts -- code written in R and Python to manipulate the data and create outputs (maps, graphs) for analysis. Note that some of these products are identical to those included in the book chapter; if you wish to use them, please consider copyright issues.  
-outputs -- what the scripts generate
-vector/250K_Ellesmere -- shapefiles of the coast and glacier layers from NRCan NTDB (2009) that were used in generating the dataset
-vector/IS_Extent -- the geospatial data, plus a few other things (soundings by Ross Marvin, from Bushnell 1956, and ObjAreas, which defines fiord areas for some ice type object identifiers) 
-raster -- no raster files are included at this time (copyright issues) 

===================================================================================================================
Data description: 

The data were generated at 1:25,000 scale but are not always accurate at that scale (this depends on the data source).  They are sufficiently accurate for use at 1:100,000 or smaller scales. 

obsyear: The year and season of the observation
	Observation years are as follows: 
		Summer: 1906, 1959, 1963, 1988
		Winter: 1992, 1998, 2003, 2006, 2009, 2011, 2012, 2013, 2015

type: The type of ice 
	Valid types: IS, IF, EL, FY, MI, FG, IR, II (Ice shelf, Ice shelf/tongue fragment, Epishelf lake, First-year ice, Multiyear landfast sea ice, Floating glacier (ice tongue), Ice rise, Ice island)

	Data completeness/consistency by type: 
		- IS - all done for each year
		- IF - all done but coverage is variable (due to imagery availability)
		- II - offshore IF essentially - not examined systematically due to coverage issues. 
		- FG - all done for glaciers that are >1-2 km wide that are within the 1906 IS extent
		- EL - only done for some years and locations but could easily be generated by GIS operations
		- FY - only for areas that are 'behind' the coast line (see below)
		- IR - not done at all really
		- Any other codes are not 'official' and have special meaning - see comments for details (if any)
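
As an illustration of how the type codes above might be handled in a script, here is a minimal Python sketch. The function names describe_type and unofficial_codes are hypothetical and are not part of the scripts shipped with this dataset.

```python
# Ice-type codes and their meanings, as listed in this README.
ICE_TYPES = {
    "IS": "Ice shelf",
    "IF": "Ice shelf/tongue fragment",
    "EL": "Epishelf lake",
    "FY": "First-year ice",
    "MI": "Multiyear landfast sea ice",
    "FG": "Floating glacier (ice tongue)",
    "IR": "Ice rise",
    "II": "Ice island",
}


def describe_type(code):
    """Return the meaning of an official code, or a pointer to the comments otherwise."""
    return ICE_TYPES.get(code, "unofficial code: see comment field")


def unofficial_codes(codes):
    """Return the sorted, de-duplicated codes that are not in the official list."""
    return sorted(set(codes) - set(ICE_TYPES))
```

For example, unofficial_codes(["IS", "XX", "FG"]) would single out "XX" as a code whose meaning has to be looked up in the comment field.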

name: The name of the feature (unofficial)

obj: Object ID that encodes provenance (ice shelf and ice tongue) or location (other ice types)
	Each ice shelf and floating glacier (ice tongue) was given an object ID that encodes its provenance.  This is automatically generated by scripts and this object ID can be used to analyze spatiotemporal relationships. 

area: The area of each polygon in km^2

comment: Comment field 
note: Second comment field

imgref1: The name of the image or data source that was used to create the polygon. Note that the date/time of the image is encoded in the imgref1 name

imgref2: Second imgref for polygons with more than one source
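
The fields above can be read with any CSV reader. The following Python sketch totals polygon area per observation year for one ice type. It assumes the CSV header uses exactly the field names listed above, which should be checked against the actual file. The sample rows (names, obj values, areas, image references) are invented for illustration and are not taken from the dataset, and obsyear is treated as an opaque text label.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for EllesmereIS.csv (values are invented).
SAMPLE = """obsyear,type,name,obj,area,comment,note,imgref1,imgref2
1906,IS,Example shelf A,01000,100.5,,,source_a,
1906,FG,Example tongue,50000,12.0,,,source_a,
2015,IS,Example shelf A,01000,40.25,,,source_b,
2015,IS,Example shelf B,02000,9.75,,,source_b,source_c
"""


def area_by_year(handle, ice_type="IS"):
    """Sum the area field (km^2) per obsyear for rows of the given ice type."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        if row["type"] == ice_type:
            totals[row["obsyear"]] += float(row["area"])
    return dict(totals)


totals = area_by_year(io.StringIO(SAMPLE))
```

To try this on the real data, the same function could be called on a file handle opened with open("EllesmereIS.csv", newline=""), provided the header names match.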

===================================================================================================================
Methods summary: 
-digitized polygons by tracing each against the 2009 1:250,000 coast vector.  This was not always adequate so crossing the coast was done in certain places (see below)
-used the same shapefile template for each observation year
-2 fields were mandatory - the type (ice type -- see above) and imgref1 (image reference 1) - note the date of the source is embedded into the file name
-imgref2 recorded secondary source info
-the name field was used for convenience
-the obsyear was calculated 
-polygon area was calculated (in km2)
-coast was calculated using an ArcPy script (four values: 2009 = against the coast, -2009 = 'behind' the coast, 0 = offshore, 999 = crossing the coast). The last value flagged a breach of protocol and these cases were eliminated (note that the dataset extends to the minimum extent of all land-based features)
-comments - contained info on the image, object or other relevant matters - not always completed the same way every time. 
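
The coast values in the summary above can be interpreted with a small lookup. This Python sketch is only an illustration: it assumes each record is a dictionary with a "coast" key, and the helper name split_by_coast is hypothetical. It is not the ArcPy script itself.

```python
# Meaning of the coast field values, as given in the methods summary.
COAST_CODES = {
    2009: "against the coast",
    -2009: "'behind' the coast",
    0: "offshore",
    999: "crossing the coast (error flag)",
}


def split_by_coast(records):
    """Separate records into usable ones and those flagged as crossing the coast."""
    usable = [r for r in records if r["coast"] != 999]
    flagged = [r for r in records if r["coast"] == 999]
    return usable, flagged


# Hypothetical records for illustration only.
records = [
    {"name": "a", "coast": 2009},
    {"name": "b", "coast": 999},
    {"name": "c", "coast": 0},
]
usable, flagged = split_by_coast(records)
```

Here the flagged list holds the polygons that would need to be split at the coast before being accepted.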

===================================================================================================================
More detailed methods and further data quality rules/assessments:

For complete methods and error analysis, please consult Mueller DR, Copland L, Jeffries MO (2017) Changes in Canadian Arctic ice shelf extent since 1906. In: Copland L, Mueller DR (eds) Arctic Ice Shelves and Ice Islands, Springer SBM, Dordrecht. In Press. 

Data used in this project were typically synthetic aperture radar (SAR) imagery along with optical satellite data and aerial photos, maps and other sources.  Many of these images are copyrighted and so they are not included here. The images were first projected and aligned with the coast (and their quality was confirmed). 
-Digitization of polygons proceeded using a shapefile template in ArcGIS
-Mandatory fields - ice type and imgref1
-Polygons were traced against the coast layer but can cross the coast layer if required (when the coast is obviously wrong) - such polygons must be split later
-No multipart features - made sure to explode if needed (explode tool on edit toolbar) - QGIS has a topology checker
-Polygons were clipped to ensure no ice types occupied the same place at the same time -- order is important
	- IF clips EL; EL clips IS; and so on... 
	- Say you clip IS with EL (internal EL on Serson and Milne for example)
		- Select EL inside the IS with mouse, make new temp shp. 
		- Merge them all into one multipart poly. 
		- Copy the IS poly into that temp shp
		- **Uncheck** the original layer and then select the merged EL poly, clip/discard the overlap
		- Copy that clipped IS back into the original; delete the unclipped one if it still exists

-If the coastline was crossed then data were split - selected the coast layer and used the topology polygon split tool (advanced editor toolbar), cluster set to default
	- this cut polygons that cross the coast layer into pieces inside and outside of the coast. 
	
-ran coast.py/model 
	- this identified the relationship with the coast layer.  
	- 0 = no relationship; 2009 = touching coast; -2009 = on 'land'; 999 = crosses coast (error)
	- this model can be batched
	- may need to reload Arc (buggy) to view tables
-calculate Area with field geometry calculator.  Use km2 - can calculate in R too
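
For readers without ArcGIS, polygon area can also be computed directly from vertex coordinates. The Python sketch below uses the shoelace formula and assumes the coordinates are in a projected coordinate system with units of metres and that the polygon is a simple ring without holes. It is an illustration only and may not reproduce the field geometry calculator result exactly.

```python
def polygon_area_km2(ring):
    """Planar area (km^2) of a simple polygon ring given as (x, y) pairs in metres.

    Uses the shoelace formula; the ring may be open or closed (first == last).
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0 / 1e6  # m^2 to km^2


# A hypothetical 2 km by 3 km rectangle.
rectangle = [(0, 0), (2000, 0), (2000, 3000), (0, 3000)]
area = polygon_area_km2(rectangle)
```

The absolute value makes the result independent of whether vertices are listed clockwise or counter-clockwise.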

Data QA/QC
There are several scripts in R to do this:
- Check Geometry // Repair Geometry  - remove null shapes
	- run R tool to find offending rows

- Overlaps and gaps
	- run the ESRI intersect tool to determine if there are any overlapping polygons per obsyear.  
	- QGIS has a topology checker that can find overlaps and gaps.  Note that the tolerance for Arc is 0.001 m and it is nanometres in QGIS. 
	- fix manually by clipping or moving vertices  or use the integrate or eliminate tool

- Make sure fields are sensible and complete - in R run scripts to show 
	- coast = 999  - if so, return to split above
	- imgref empty  - insert imgref
	- area not correct  - rerun calc Area
	- geometry bad  - Fix by moving vertices or clipping
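
The field checks listed above are done with R scripts in this dataset. As a rough Python equivalent, the sketch below reports problems for one record. The function name qa_issues, the dictionary keys and the 1% area tolerance are all assumptions made for illustration; the independently recalculated area is passed in by the caller.

```python
def qa_issues(rec, computed_area_km2, rel_tol=0.01):
    """Return a list of QA problems for one record (an empty list means it passed).

    rec is a dict with 'coast', 'imgref1' and 'area' keys (assumed names).
    rel_tol is a hypothetical relative tolerance for the area comparison.
    """
    issues = []
    if rec.get("coast") == 999:
        issues.append("coast = 999: split the polygon at the coast")
    if not (rec.get("imgref1") or "").strip():
        issues.append("imgref1 empty: insert imgref")
    area = rec.get("area")
    limit = rel_tol * max(computed_area_km2, 1e-9)
    if area is None or abs(area - computed_area_km2) > limit:
        issues.append("area not correct: rerun area calculation")
    return issues


# Hypothetical records for illustration only.
bad = {"coast": 999, "imgref1": " ", "area": 10.0}
good = {"coast": 2009, "imgref1": "source_a", "area": 10.0}
bad_report = qa_issues(bad, computed_area_km2=12.0)
good_report = qa_issues(good, computed_area_km2=10.0)
```

Geometry validity is not covered here, since checking it requires a geometry library or the GIS tools mentioned above.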

The data were then moved out of the ArcGIS framework and processed with a series of R scripts that cleaned the data and did mapping, plotting and sundry analyses.  These scripts should work on subsequent versions of the data set and are included here.  One notable thing that the scripts do is generate an object ID (obj).  The principle is described below:
     Obj -- The script written in R does the job.  Propagate through each IS splitting as you go. 
	- 01 is the first feature, then digits 3 and 4 yield the next designator, from 00 (for the biggest section) to 01 (for the next biggest section), etc.
	- as the feature splits, the next digit is used to designate children, always starting with 0 (for the biggest section), then 1, 2, 3. 
	- to view a feature and all its descendants, do wildcard searches - 9???? or 901?? 
	- repeat process with FG.  
	- for other ice types - EL, IF and so on... There is a shapefile that defines areas where they belong so you can filter on fiord essentially. 
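
The wildcard searches and the largest-first labelling of children described above can be sketched in Python as follows. This is not the R script included with the dataset. The obj values are invented, the sketch assumes obj is stored as fixed-width text, and label_children assumes a single appended digit (so fewer than ten children per split); both function names are hypothetical.

```python
from fnmatch import fnmatchcase


def match_obj(obj_ids, pattern):
    """Return the obj IDs matching a wildcard pattern such as '9????' or '901??'."""
    return [o for o in obj_ids if fnmatchcase(o, pattern)]


def label_children(parent_prefix, child_areas):
    """Append one digit per child: 0 for the largest area, then 1, 2, 3...

    Labels are returned in the same order as child_areas.
    """
    order = sorted(range(len(child_areas)),
                   key=lambda i: child_areas[i], reverse=True)
    labels = [None] * len(child_areas)
    for rank, i in enumerate(order):
        labels[i] = f"{parent_prefix}{rank}"
    return labels


# Hypothetical obj values for illustration only.
ids = ["90100", "90101", "90200", "80100"]
family = match_obj(ids, "9????")
branch = match_obj(ids, "901??")
children = label_children("0100", [5.0, 20.0, 1.5])
```

Each "?" matches exactly one character, so the pattern length must equal the width of the obj values being searched.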

The script outputs maps for sections of the study area (included in data set).   


