Tuesday, January 26, 2010

Research Needs for Water Resource Applications of GIS

Water resources are challenging to model in a geographic information system because these features are often highly variable over time. In the article, “Water Resource Applications of Geographic Information Systems,” John P. Wilson et al. discuss many methods of water resource management and assessment. These topics repeatedly return to the relationship between data sharing and the increasing level of technology. Technological advancements allow a growing number of less professionally trained individuals to contribute potentially high quality data with varying degrees of uncertainty. For this reason, future technology will need to account for these individuals’ levels of training and geographic and scientific understanding in the framework of tool building, and interfaces for such tools must be designed to serve a wide audience efficiently. Otherwise, difficulty in using the technology translates directly into uncertainty. Scale is also addressed frequently when reviewing issues ranging from the integration of spatial data acquisition technology to data sharing practices.
Primarily, the authors stress that raster grid resolution should be chosen to match the size of the smallest features being modeled.  Accuracy degrades quickly when topographic features are aggregated too heavily.  Generalization of this kind reduces the possible variation within a given area where a smaller stream feature, for instance, could exist in reality.  Using a raster resolution of 30 meters rather than the 2-10 meter resolutions suggested by several of the studies mentioned easily smooths over smaller changes in topography.  Further, a 30 meter cell size yields only 21% - 30% correct slope gradients.  However, if only larger (major) features need to be analyzed or communicated, 30 meter resolution can sometimes be quite sufficient.
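To make the resolution effect concrete, here is a minimal sketch (not taken from the article) showing how aggregating a DEM to a coarser cell size flattens computed slope gradients. The terrain is synthetic and the slope calculation uses simple finite differences; a real workflow would read elevation from a file with GDAL or arcpy.

```python
import numpy as np

def slope_degrees(dem, cellsize):
    """Slope from simple finite differences (Horn-style kernels omitted for brevity)."""
    dz_dy, dz_dx = np.gradient(dem, cellsize)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def aggregate(dem, factor):
    """Block-average a DEM by an integer factor (e.g., 3 m cells -> 30 m cells at factor=10)."""
    rows, cols = dem.shape
    trimmed = dem[: rows - rows % factor, : cols - cols % factor]
    return trimmed.reshape(trimmed.shape[0] // factor, factor,
                           trimmed.shape[1] // factor, factor).mean(axis=(1, 3))

# Synthetic terrain: small channels superimposed on a regional trend, 3 m cells.
x, y = np.meshgrid(np.arange(0, 300, 3.0), np.arange(0, 300, 3.0))
dem_3m = 0.05 * x + 2.0 * np.sin(x / 10.0) * np.cos(y / 12.0)

fine_slope = slope_degrees(dem_3m, cellsize=3.0)
coarse_slope = slope_degrees(aggregate(dem_3m, factor=10), cellsize=30.0)
print("mean slope at 3 m:", fine_slope.mean(), "vs 30 m:", coarse_slope.mean())
```

Running this, the 30 m slopes come out noticeably gentler than the 3 m slopes because the aggregation averages away the small channels before the gradient is ever computed.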

Many water resource applications are GIS-based.  The National Hydrography Dataset is a prime example of how to digitally store and communicate a hydrologic network, logically displaying geographic information with its organizational and descriptive parameters embedded within a GIS environment.  An example of temporal water quality monitoring is the Florida DEP's Watershed Monitoring program and the EPA and FDEP’s STORET (“STOrage and RETrieval”) database program, where water quality data from across the state of Florida are collected, managed in databases, served internally or publicly via an internet mapping service, and provided to in-house personnel for GIS analysis via ArcSDE and ArcGIS Server. Advancements such as these increase the dissemination of information to a growing number of skilled as well as minimally trained stewards and contributors of information.  Increased availability must be mirrored by increased accounting of data collection information in the form of metadata to ensure reliability.  Uncertainty can be controlled at every step of data collection by reporting appropriate metadata parameters prior to distributing any data.  Although this information is mainly included with official final drafts of published datasets, more can be done deeper within an organization to record metadata that aids in the quality assurance of the final versions of disseminated data.
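As a small, hypothetical illustration of what "accounting for data collection" can look like in practice, the sketch below records a few collection-level metadata fields alongside a dataset before it is passed up the chain. A real program would follow the full FGDC or ISO 19115 standards rather than this ad-hoc structure, and every field shown here is an invented example.

```python
import json
from datetime import date

metadata = {
    "title": "Surface water quality samples, example watershed",
    "abstract": "Field-collected grab samples; illustrative record only.",
    "collection_date": date.today().isoformat(),
    "collection_method": "handheld GPS, WAAS-corrected",
    "horizontal_accuracy_m": 5.0,          # estimated positional uncertainty
    "coordinate_system": "EPSG:4269 (NAD83)",
    "process_steps": [
        "points captured in the field",
        "attributes transcribed from field sheets",
        "QA review prior to loading into the master database",
    ],
    "contact": "data steward name / office",
}

# Write the record next to the data so it travels with the dataset.
with open("water_quality_points.metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```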

Overall, it will be important to develop efficient models of water resources that can be used by an audience with contrasting degrees of familiarity with geographic information.  By conducting further research on the most effective way to properly incorporate data with varying degrees of uncertainty, as well as investigating the most efficient scales at which modeling hydrologic networks should take place, the GIS community can continue to accurately model water resources in the future.

Monday, January 25, 2010

Challenges in using GIS for Emergency Preparedness & Response


My sister and me after Hurricane Andrew in 1992, south Miami

Hazards are common to every location on Earth and often shape the characteristics that define a given landscape. In the US, California is well known for earthquakes and wildfires, Florida is known for hurricanes, and the Midwest is specifically referred to as “tornado alley.” Of course these occurrences are by no means limited to these locations, and these locations are not limited to these natural hazards. Hazards include any natural event (meteorological, geological, hydrological, biological, etc.) or human-induced process or phenomenon (war, industrial accidents, etc.) that may potentially impact, endanger, or threaten lives, resources, and the environment. Having a sound understanding of how such events impact and shape the geography of a region is essential for efficient planning and for mitigating the effects of damage and loss of life when these events arise in the future. Using GIS as a tool to analyze the factors associated with a specific hazard allows for rapid assessment of and response to costly situations. Although GIS can be used as a powerful tool to analyze information in an attempt to reduce risk, damage, loss, and recovery time resulting from any hazard or combination of hazards, limitations of time and resources frequently characterize the overall usefulness of GIS in three general stages. The May 2000 ESRI white paper, “Challenges for GIS in Emergency Preparedness and Response,” explains the stages of a hazard event and goes on to discuss challenges for GIS in the management and analysis of information during these periods.

Prior to an event, proactive planning assesses potential risk, and whenever possible evacuation procedures are imposed to limit the effects of the hazard. During this stage, time grows short as the hazard becomes imminent, and valid information must be produced quickly. The urgency and importance of such information can be exemplified in a situation where rising floodwaters threaten a populated location. In this circumstance, it is imperative to provide planning and rescue efforts with a list of affected roadway names. To produce accurate information, a project must begin with accurate and up-to-date information. If changes to a city’s roadway network have been added or altered without being reflected in a GIS coverage, responding emergency vehicles could be sent into flooded areas. Risk mapping and emergency simulation require the most current information. Reactive response begins when the event begins. After the event has ended, another reactive stage begins to unfold, during which appropriate resource management and planning must occur to efficiently begin the path to rebuilding and recovery. Overall, there is a need to move from reactive response toward prevention and planning to be most efficient in minimizing loss. Adhering to a proactive planning method will reduce the overall cost of evacuation when that time presents itself in the future.
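As a hedged sketch of the flooded-roads scenario above (not the white paper's workflow), the snippet below pulls the names of roads that fall inside a mapped flood extent. The two layers, roads.shp with a NAME field and flood_extent.shp with the inundation polygon, are hypothetical.

```python
import geopandas as gpd

roads = gpd.read_file("roads.shp")
flood = gpd.read_file("flood_extent.shp")

# Make sure both layers share a coordinate system before intersecting.
flood = flood.to_crs(roads.crs)

# Roads that touch the flooded area; an out-of-date roads layer makes this list wrong.
affected = gpd.sjoin(roads, flood, how="inner", predicate="intersects")
print(sorted(affected["NAME"].unique()))
```

The point of the example is the last caveat: the spatial selection is only as good as the currency of the roads layer feeding it.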

The white paper discusses a number of important factors concerning the acquisition and integration of spatial data in an emergency planning environment. It is necessary to identify what sources of data are available, what kinds of data need to be obtained, who needs access to which types of data, what problems to expect when integrating sources of spatial information, what amount of uncertainty comes with a dataset, and to what extent that uncertainty is acceptable. When working with several agencies in an emergency, it is important to have strategies in place to avoid the loss of efficiency caused by poor or difficult methods of integration. One solution comes from the advancement of distributed computing systems: highly efficient, temporally narrow analysis can be conducted on interactive geographic information systems via a link to continuously updated remotely sensed information from the field. However, if the organizations involved are unaware of how to share this information, the efficiency of planning and response can be significantly hindered.

The article mentions that dynamic representation, or temporal visualization, is an area that needs to be researched further, and that GIS is not the best method for representing temporal data. This was written in 2000, however, and since then a number of powerful options have emerged that allow GIS information to be integrated with other software packages to create dynamic maps and render time-lapsed data quite effectively. Such tools have been built directly into ArcGIS for the past few versions and can create powerful digital video with ease from even the simplest of datasets.

Finally, scale is an issue that must be addressed in emergency management with GIS. One main problem is that DEM datasets can be inadequate for analysis in certain cases – especially when modeling hydrologic events. Shuttle Radar Topography Mission (SRTM) data come in 30 meter and 90 meter resolutions. Emergency management often deals with larger scales than these resolutions would support for more than very basic analysis. Higher resolution lidar would be a more appropriate choice for performing the surface analysis needed to model emergency situations involving flooding.
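A simple "bathtub" sketch shows why the DEM's resolution matters here: every cell below a given water surface elevation is flagged as inundated. The file name and flood stage are hypothetical, and rasterio is just one common way to read a DEM.

```python
import numpy as np
import rasterio

flood_stage_m = 3.5  # assumed water surface elevation

with rasterio.open("lidar_dem_2m.tif") as src:
    dem = src.read(1, masked=True)
    cell_area_m2 = abs(src.transform.a * src.transform.e)

# Cells at or below the assumed stage are treated as flooded.
inundated = dem <= flood_stage_m
print("inundated area (km^2):", inundated.sum() * cell_area_m2 / 1e6)

# Repeating this on a 30 m or 90 m SRTM grid smooths out levees, ditches, and
# building pads, so the inundated footprint can change substantially.
```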

GIS attempts to reduce the amount of risk, damage, loss, and recovery time resulting from many different hazards in the field of emergency management. There are many ways to plan proactively to contribute to a safe and efficient means of protecting society. Keeping up-to-date records and communicating with other agencies are some of the most important issues to consider to provide a well prepared emergency response strategy that will minimize loss for many affected parties.

Sunday, January 24, 2010

Hydrographic Vocab: Streams

Parts of a stream:
Some of these definitions were obtained from, or adapted from, the Wikipedia entry on Streams.

Spring - The point at which a stream emerges from an underground course through unconsolidated sediments or through caves. A stream can, especially with caves, flow aboveground for part of its course, and underground for part of its course.
Swalette -  The portion of a disappearing stream where the water drains through a sinkhole into an aquifer, subterranean cave, or culvert; characteristic of karst hydrology and topography.  This system may return to the surface at a spring.
Source - The spring from which the stream originates, or other point of origin of a stream.
Headwaters - The part of a stream or river proximate to its source. The word is most commonly used in the plural where there is no single point source.
Confluence - The point at which two streams merge. If the two tributaries are of approximately equal size, the confluence may be called a fork.
Run - A somewhat smoothly flowing segment of the stream.
Pool - A segment where the water is deeper and slower moving.
Riffle - A segment where the flow is shallower and more turbulent.
Channel - A depression created by constant erosion that carries the stream's flow.
Floodplain - Lands adjacent to the stream that are subject to flooding when a stream overflows its banks.
Stream bed - The bottom of a stream.
Gauging station - A point of demarcation along the route of a stream or river, used for reference marking or water monitoring.
Thalweg - The river's longitudinal section, or the line joining the deepest point in the channel at each stage from source to mouth.
Wetted perimeter - The line on which the stream's surface meets the channel walls.
Nickpoint - The point on a stream's profile where a sudden change in stream gradient occurs.
Waterfall or cascade - The fall of water where the stream goes over a sudden drop called a nickpoint; some nickpoints are formed by erosion when water flows over an especially resistant stratum, followed by one less so. The stream expends kinetic energy in "trying" to eliminate the nickpoint.
Mouth - The point at which the stream discharges, possibly via an estuary or delta, into a static body of water such as a lake or ocean.
Meander - The natural bending and winding of a section of river; usually indicative of a mature or old, established feature.  Meandering is the result of the mechanics of erosion and sediment transport.  Faster, more turbulent water has a higher capacity to carry load, whereas the same water column drops its load when it slows.  As a stream approaches a turn in a watercourse, water flows more slowly along the inside shoreline, where sediment load is dropped and deposition occurs.  Conversely, water moves faster along the outer edge, where undercutting and erosion of the shoreline occur; the eroded material can be deposited downstream.  Over time, this process shifts the section of river laterally (from the stream run), creating a turn that can occur in succession along a watercourse.
Oxbow Lake - A relic of a stream meander that has exceeded its maximum potential to change course.  A meander eventually erodes through neighboring shorelines to reconnect as a straighter, shorter stream run.  This process leaves a closed segment of the meander's previous course as a non-flowing, U-shaped lake to the side of the current stream.  Over a long period of geologic time, a floodplain may exhibit signs of meander scarring, where many iterations of sinuous variation have occurred.

Temporal Establishment:
Perennial - Established waterbodies, present throughout a year under a normal hydroperiod
Intermittent - A stream that is present for most of the year, which stops flowing for weeks or months at a time
Ephemeral - Short lived features, usually flowing after significant precipitation events
Winterbourne - A stream which flows only during winter months and is dry during the summer (dry season)

Drainage Patterns:
Images courtesy Michael E. Ritter. Read more about these drainage patterns in his free e-book: The Physical Environment: an Introduction to Physical Geography


Dendritic - A very common branching fractal pattern found in nature; it can also be seen in leaves, branches, tree roots, and veins, to name a few.
Large scale example in Apalachee Bay south of St. Marks, Florida (be sure to zoom in and out, and pan to the east for a few miles to observe the varying scales in which these patterns exist in nature)
Smaller scale example showing the confluence in Pensacola Bay in western Florida

Dendritic drainage pattern

Read more about these additional types of drainage in the free e-book, The Physical Environment: an Introduction to Physical Geography.


Parallel - Similar to dendritic features, but the branches are skewed to run more or less in a similar direction, influenced by changes in elevation

Parallel drainage pattern


Trellis - Read more about this pattern here.

Trellis drainage pattern


Rectangular - Read more about this pattern here.

Rectangular drainage pattern



Radial - Caused by centrally located elevated landforms resulting from uplift, volcanoes, or other geological phenomena.

Radial drainage pattern



Centripetal - Caused by a centrally located depression in the landscape, which can drain into a number of features including a depressional wetland, a swalette in a disappearing stream, or an intermittent or ephemeral lake that can leave salt flats in the dry lake bed.

Centripetal drainage pattern


Deranged - A sign of significant disturbance of historically established features.

Deranged drainage pattern

Saturday, January 23, 2010

Multispectral & Hyperspectral Remote Sensing Image Extraction

Multispectral sensors such as the Landsat series, SPOT, IKONOS, and QuickBird acquire anywhere from three to ten simultaneous bands of information across a scene.  Each of these bands covers a relatively broad spectral range of electromagnetic radiation.  Hyperspectral remote sensing, as its name implies, is generally composed of a greater number of spectral bands, each observing a more precise (narrow) spectral range.  The MODIS sensor uses 36 bands, while NASA's AVIRIS sensor captures as many as 224 bands.

In his book, Introductory Digital Image Processing: A Remote Sensing Perspective, John R. Jensen describes two methods of multi- and hyperspectral image acquisition: the whiskbroom system, and the linear and area array technique.  In both methods, the instantaneous field-of-view (IFOV) radiant flux – observed reflectance from the Earth's surface – is passed through a spectrometer, which disperses the light into separate bands ranging from blue to near infrared (NIR) and IR wavelengths and focuses them onto an array of detectors that digitally record the field of view.  The whiskbroom method uses a rotating mirror to reflect and direct radiant flux through the spectrometer to a linear array detector, which individually measures the radiation in each separated band.  This technique is best suited to capturing broad spectral ranges and is used by multispectral sensors such as Landsat MSS and SPOT.  These sensors undersample the observed radiant flux by making only a few measurements across wide spectral bands – as wide as several hundred nanometers – which may cover more than one color of the spectrum simultaneously.  Although both methods use a dispersing element to separate incoming light into individual bands, the array technique does not use a scanning mirror, allowing a detector more time to record the incoming radiant flux for a given area.  This extra dwell time yields improved geometric and radiometric accuracy.

These methods have dramatic implications on the type of information that is produced.  The varying techniques influence spectral, temporal, radiometric, and even spatial resolution of a produced image.  Various types of investigation require unique parameters in the type of data that will be used in analysis.  Having an understanding of the different types of information that are produced by multi- and hyperspectral imagery with a respect to ground conditions that are being observed will ensure the most accurate results are obtained during image analysis and exploration.

The process of extracting information from multispectral and hyperspectral imagery is largely similar; however, there are a few preliminary steps one must complete before analyzing hyperspectral datasets.  Various forms of calibration – radiometric, geometric, etc. – are common to both types of imagery, but the larger number of highly specific spectral bands in hyperspectral imagery permits the construction of "spectra" that closely resemble the spectral signatures captured by spectroradiometers in laboratories.  Further, initial image quality assessment of hyperspectral data can be a much more tedious undertaking, although it is occasionally possible to use poor-quality hyperspectral images for atmospheric correction.  Because this visual examination is time consuming and repetitive, many image processing packages offer animation functions that provide a more efficient means of inspecting up to hundreds of images in a single session.

Overall, dimensionality is the most distinguishing characteristic between these two types of remote sensing.  The low data dimensionality of multispectral imagery makes it significantly more accessible and less difficult to work with in basic research because of the small number of spectral bands that make up an image.  The greater number of bands involved with hyper- and ultraspectral imagery generates numerous obstacles, ranging from data storage to processing demands.  Numerous images of highly specific spectral bands – often representing bandwidths of just 10 nm each – produce a certain amount of redundant information.  Statistical analysis aids in identifying, removing, or transforming data to reduce the dimensionality of the overall hyperspectral dataset, improving the efficiency of exploration and analysis.  Although variations do exist in the image processing techniques applied to these two types of remote sensing data, the causes are again largely due to dimensionality.  An analyst must decide whether a few spectrally broad images or numerous, highly specific images will best suit his or her analysis.
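A minimal sketch of this kind of dimensionality reduction, using principal component analysis on a hypothetical hyperspectral cube of shape (bands, rows, cols). scikit-learn's PCA is one common tool; minimum noise fraction (MNF) transforms are another option, and the random cube here is only a stand-in for real reflectance data.

```python
import numpy as np
from sklearn.decomposition import PCA

bands, rows, cols = 224, 100, 100
cube = np.random.rand(bands, rows, cols)          # stand-in for real reflectance data

pixels = cube.reshape(bands, -1).T                # one row per pixel, one column per band
pca = PCA(n_components=10)                        # keep the first 10 components
reduced = pca.fit_transform(pixels)

print("variance explained:", pca.explained_variance_ratio_.sum())
compressed_cube = reduced.T.reshape(10, rows, cols)
```

On real data, a handful of components typically capture most of the variance, which is exactly the redundancy the paragraph above describes.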

Monday, January 18, 2010

Google Installer Virus / House Full of Virii



A fractal model of malware code found on my computer*

Not sure where it came from, but I recently had a small battle with some sort of virus that presented itself as a Google installation.  I never would have noticed (I'm getting used to Lappy Tappy running slowly), but every five minutes or so I would get an "End Program" error about a Google Installer program failing.

Then one day every web site I went to redirected me to comcastsomething.net.  I forget the actual URL, but it looked quite legit.  The web site claimed to be Comcast asking for user info - which screamed 1997 AOL scam.  Running my previous methods using Malwarebytes cured the browser redirection, but I still had a problem with the Google installer, and there were a few infections showing up in my virus/malware scanners.

Enter ComboFix from Bleepingcomputer:

This is great.  Very light.  Very effective.  As a side note, my roommate had a virus of his own that prohibited him from using Windows in anything but Safe Mode for more than five minutes.  He was ready to drop over a hundred bucks at a local computer tech shop, but ComboFix kicked that trouble to the curb**!

** ComboFix allowed my roommate and me to get around the virus/malware to scan, clean, and continue protecting our computers again.  Here's a list of my recommended scanners and utilities:

Recommended Scanners:
All of my recommendations are freeware.  If you have the means, purchase a license or contribute a few bucks, as these are great utilities and are well worth the money.  Otherwise, get ready to spend at least $100 at a tech shop or Best Buy.


ComboFix  - (download from bleepingcomputer.com only) This is a very light (3.64 MB) standalone/no-install program that searches deep within the boot sector of your hard drive for malware that may be preventing other utilities from running.  Malicious software often blocks this and many other scanners, so it may be necessary (just go ahead and do this) to rename the executable file from ComboFix.exe to anything else that is not ComboFix.exe.  I used the suggested Ieexplore.exe and it worked swimmingly.

I saved this to C:\Program Files\ComboFix Anti-Malware\ and copied the shortcut (with the modified name) into my Start\Programs\Utilities menu for easy future access (ala forbid).  If you use this, follow the directions listed on the bleepingcomputer website verbatim.  This was not difficult to run, but there seem to be more warnings than are usually provided with such software, so be careful.  Overall, there aren't really any decisions.  Just follow its directions and choose Yes to install the Windows Recovery Console thing and you'll be good to go.

AVG Anti-Virus - This is great anti-virus software.  It's also FREE and seems lighter and faster than other software packages I've used in the past.  McAfee was great but it sometimes got clunky.

Malwarebytes' Anti-Malware - This saved me from the dreaded Spyware Guard 2008 pain in the ass extortionware fiasco.  I don't think the software is completely free, but I used the free portion to disable malware that was blocking other free scanners.  Life saver.  Good scanner to have in this case.

Lavasoft Ad-Aware - A standard in adware detection and removal software.  I rarely run or scan, but it's a good tool to have in the arsenal.

Spybot Search and Destroy - Another free package.  Many ups provided by users.

Tips to Consider:
  • If a software package does not seem to run, or you cannot install it on an infected machine, rename the file you're trying to run.  I had to burn Malwarebytes to a CD from another computer under a fake install name to disable the malware that was prohibiting me from scanning.  Same thing with ComboFix, where I had to rename ComboFix.exe to Ieexplore.exe, and had no subsequent trouble.
  • If you're lazy like me and HATE scanning your computer on a regular basis, at least update your scanners every once in a while.  It takes ten minutes tops and can really save your butt - just in case.
  • If you find an infection, stop using your computer and scan the shit out of it.  It may take a few days of scanning overnight, waking up, running another scan, going to work, coming home, and following up with another damn scan.  I recommend several software packages because A) they're free, 2) they're quite powerful and comprehensive, and D) they're free.  Take the time to fix your computer on your own.  Otherwise have fun shelling out a wad of cash to lose your computer for a week or more.
  • Upgrading to the full version of these programs seems to provide "live scans" and more control over scheduled scans.  No thanks.  The free versions of these software packages are wonderful.  Again, purchase or support whenever possible.  These rock.



* This is actually the AIDS virus - not code, nor a fractal

Saturday, January 16, 2010

An Essay on Scale



Scale can be described as the level or extent at which observations or phenomena are represented geographically or temporally.  Scale can be described in aggregate or specific terms: generally, one can describe varying levels using such terms as “individual, household, neighborhood, city-wide, nation-wide, daily, annually,” etc., or conversely by using more discrete definitions that quantify the physical dimensions or duration of represented events.  In cartography, scale refers more specifically to the ratio and relationship between elements in reality and their representation on a map.  This ratio is often expressed as a fraction – for instance 1/100,000 or 1:100,000, where one unit of measure on a map equals 100,000 of the same units in reality.
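A tiny sketch of the representative fraction in practice: converting a distance measured on a 1:100,000 map to its ground distance (same units on both sides of the ratio).

```python
def ground_distance(map_distance, scale_denominator):
    """Ground distance in the same units as the map measurement."""
    return map_distance * scale_denominator

# 2.5 cm on a 1:100,000 map represents 250,000 cm on the ground;
# dividing by 100,000 cm per km gives 2.5 km.
print(ground_distance(2.5, 100_000) / 100_000, "km")
```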

When obtaining data, it is important to consider scale to determine the appropriateness of a GIS coverage for a given project.  For example, analysis conducted at a large scale – such as a parcel of forest land – requires data obtained at a high resolution.  Whether these data are vector rivers or roadways, digitized at a high resolution or captured using many GPS observations, or the data have been obtained from high resolution raster imagery or interpolated surfaces, it is important for the data to provide enough information about a geographic extent to give a suitable representation of reality for the desired level of study.  Conversely, the same high resolution data may not be necessary for analysis of an area at a smaller scale – such as a state-wide or regional analysis of forested lands.  In this smaller-scale example, data at a scale of 1:100,000 would be more practical to obtain and use for analysis, whereas data for parcel-level analysis may need to be obtained at a scale greater than 1:10,000.

These examples briefly illustrate how the accuracy of data is defined by scale.  If features are digitized at a scale greater than 1:10,000, the layer becomes a more accurate depiction of ground conditions as the project’s scale decreases – the 1:10,000-scale stream features are much more accurate at 1:100,000 scale.  In contrast, features digitized at 1:100,000 quickly lose accuracy as the project’s scale increases to parcel level.  Further, a high resolution raster dataset may affect the efficiency of running analysis over a wide extent, while a low resolution surface may not appropriately represent the events occurring across a given area.  For these reasons, it is apparent that scale characterizes the degree to which data are valid and useful.

Scale is an important issue to consider in environmental and ecological studies.  One frequently occurring problem is a mismatch between the scale at which ecological processes occur and the scale at which decisions regarding those processes are made.  Many parameters influence the outcome of any project, and these parameters time and again lie at varying levels of scale.  Reading materials from class explain how focusing on a single scale may neglect important interactions that are vital to the complex issue at hand.  Further, examination of an ecosystem may be performed at too large or too small a scale to fully observe an occurring phenomenon.  Finally, the modifiable areal unit problem (MAUP) is another example of scale-based error that emphasizes the previous challenge, though it may be more a matter of preference to meet a desired result.  The problem stems from the imposition of artificial aggregation on an unbounded area: when a continuous area is artificially bounded, the results can be unavoidably skewed, and altering the scale of aggregation allows control over the resulting spatial patterns.  Basing policy on assumptions containing overlooked conditions can easily impose unpredictable and degrading consequences on the environment and society; thus, scale is an important feature to observe and regard when examining spatial information.

A few years ago I contributed to a pilot project at the Florida Department of Environmental Protection to update Florida’s version of the 1:24,000-scale National Hydrography Dataset (24k NHD).  During this project I encountered a fundamental issue discussed previously: at what scale should features be digitized to provide a valid 1:24,000-scale coverage?  It was determined that state-wide hydrologic features could be captured most efficiently and validly by digitizing at a larger scale, between 1:15,000 and 1:17,000.  By increasing the scale, a slightly more accurate depiction of ground conditions is observed and captured for the 24k dataset, while enough generalization remains to progress across the vast landscapes of the state with some speed. Additionally, several issues of scale were discussed in the article, “Seagrass as pasture for seacows: landscape-level dugong habitat evaluation.” Raster imagery of seagrass meadows was obtained at 200 meter resolution.  At this resolution the continuous nature of the seagrasses was suitably sampled, and analysis was performed appropriately.  It was not necessary to acquire imagery at a larger scale, as patches of seagrass species were easily captured at this resolution, while a coarser resolution would begin to overlook patches of assorted species.

Remote Sensing for Landscape Level Dugong Habitat Evaluation

This is a review of an article presented to Dr. Yang's GIS for Environmental Modeling class at Florida State University in the spring of 2008.
Sheppard, James K., Lawler, Ivan R., Marsh, Helene. (2007). Seagrass as Pasture for Seacows: Landscape-Level Dugong Habitat Evaluation. Estuarine Coastal & Shelf Science, 71(1-2):117-132.




The dugong is a cousin of the West Indian manatee and feeds on specific types of seagrasses in areas around northern Australia and Indonesia, and along similar latitudes in coastal areas of the Indian Ocean.  This study uses GIS to determine which specific types of seagrasses are preferred by grazing dugongs in waters off Queensland, Australia.  GIS is used together with remotely sensed data and near infrared spectroscopy to quantify the composition and configuration of seagrass communities.

Dugongs are tagged with GPS tracking devices, and their locations are tracked using the Animal Movement Analyst Extension in ArcView.  After three months of tracking, a study area is developed from the many line features created by the movement of the animals to estimate the extent of popular grazing locations in Hervey Bay.  Several field methods are used to capture samples of seagrasses at various locations throughout the study area – ranging from intertidal treks in shallow water to sample plant species by hand, to using submarine video equipment to analyze plant structure from a boat.  Both methods use GPS to record latitude, longitude, and other information.  Further, a steel sediment grab is used to sample benthic composition and sediment nutrients.

The information is loaded into a GIS, where bathymetry, seagrass cover, and nutrient profiles are created from the captured point data.  Each of these layers is interpolated into a raster surface using the Kriging method, as it yields the most precise output for the data captured.  Nutrient analysis is conducted with Near Infrared Reflectance Spectroscopy (NIRS) due to its ability to identify the composition of organic samples very efficiently.

The analysis sampled five seagrasses with varying levels of nitrogen, starch, and fiber.  The dugongs convert nitrogen into protein and use starch for energy.  Fiber is a tertiary element in the seagrasses that is difficult for the dugongs to digest, and it is therefore mainly avoided in large amounts.  The analysis in this study confirms these findings from past research.

Remote sensing in this study is limited in evaluating biomass and species at low densities.  More valid results are easier to obtain when higher concentrations of any particular species exist over a larger area, though patches can be discovered and accounted for when sampling.  This aligns well with the general structure of seagrass meadows.  Further, remote sensing is limited to shallow water, or to where seagrass stalks reach shallow depths where the electromagnetic wave can penetrate and return to the capturing device.  Water quickly absorbs much of the beam emitted from sensing devices.  In addition, turbid water is another limiting factor when using remote sensing in open, flowing water.

Overall, this is a very interesting article that touches on several powerful uses of GIS and remote sensing technology.  The biological analysis becomes moderately complex; however, the conclusions are clearly represented and easily interpreted in the context of the spatial component of this research.

Thursday, January 14, 2010

An Essay on Spatial Interpolation

Spatial interpolation is the technique of estimating a complete, continuous raster surface based on a patchwork of known point values captured across a surface in reality.  The surface that is created attempts to represent the gradation, or smooth progression, between varying spatial values found in real-world observations.  Interpolation is a powerful visualization method used to make sense of point data that are otherwise very difficult to envision and interpret.  Applications of such raster surfaces include maps of population densities, precipitation levels, temperature differences, etc.  This technique has nearly limitless uses and can create a surface for any spatially based phenomenon – one of the most notable and easily interpreted of which is an interpolation of elevation values.  It is possible to use interpolation techniques in a vector environment using a triangulated irregular network (TIN); however, much of my experience has remained in the world of raster-based interpolation.

Spatial interpolation works by estimating values for the unknown, unsampled area between two or more known points.  A basic method is to derive isolines between sample points – this can even be done by hand.  I performed a manual contour interpolation in a meteorology course, where isopleths – lines of equal barometric pressure – were estimated from a map containing only weather station point data.  These lines were sketched, and another layer – frontal boundaries – could then be inferred from the location and orientation of the isolines.  Although this exercise included only a handful of points, a surprisingly accurate map could be produced by hand; however, adding more points quickly makes manual interpolation time consuming and increases the chance of error.  Another method is to create a continuous surface – usually using computer algorithms to automatically calculate the hundreds, if not millions, of raster cells that make up the interpolated surface.  My experience with these types of raster surfaces begins with a project conducted for the Florida State University Environmental Service Program in 2006, which used raster surfaces to model the efficiency of the University’s recycling program on campus and to suggest new recycling bin locations.  Point locations were collected for centers of student population around campus – classroom buildings, libraries, parking lots, dorms, etc. – as well as for trash can and recycling bin locations.  Each point was assigned an appropriate weight value, and a raster surface was created for each layer.  The interpolated surfaces for the centers of student population were combined using a raster calculation, and contour surfaces were created and overlaid with the raster surface of recycling bin distribution on campus.  It was then very easy to observe how the centers of high student traffic compared with the current distribution of recycling facilities on campus.
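A hedged numpy sketch of the kind of weighted raster overlay described above: several interpolated "student traffic" surfaces are combined, then compared with a surface of recycling bin density. The arrays, weights, and thresholds are synthetic stand-ins; the original project used ArcGIS raster calculations rather than this code.

```python
import numpy as np

rows, cols = 200, 200
classrooms = np.random.rand(rows, cols)   # interpolated surface, classroom traffic
libraries  = np.random.rand(rows, cols)   # interpolated surface, library traffic
dorms      = np.random.rand(rows, cols)   # interpolated surface, dorm traffic
bins       = np.random.rand(rows, cols)   # interpolated recycling-bin density

# Weighted sum of the population surfaces (weights are illustrative only).
traffic = 0.5 * classrooms + 0.3 * libraries + 0.2 * dorms

# Cells with heavy traffic but sparse bin coverage are candidate bin locations.
candidates = (traffic > np.percentile(traffic, 90)) & (bins < np.percentile(bins, 25))
print("candidate cells:", int(candidates.sum()))
```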

One final description of spatial interpolation is reinforced by Tobler’s First Law of Geography, which states that locations close to each other have more in common than locations that are farther apart.  Spatial autocorrelation, a formalization of Tobler’s Law, measures the degree to which a set of spatial features is clustered (positive spatial autocorrelation) or dispersed (negative spatial autocorrelation).  These ideas illustrate the significance of neighboring points (or point sets) upon one another and how neighboring points affect the resulting interpolated surface.  More advanced interpolation techniques using these concepts are discussed below.
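To make spatial autocorrelation concrete, here is a compact sketch of global Moran's I for a set of sample values, using an inverse-distance weight matrix. Libraries such as PySAL provide production implementations; this is only meant to show the formula at work, and the sample points are random.

```python
import numpy as np

def morans_i(values, coords):
    """Global Moran's I: I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2)."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(values)
    # Inverse-distance weights, zero on the diagonal.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = np.where(d > 0, 1.0 / d, 0.0)
    z = values - values.mean()
    return (n / w.sum()) * (w * np.outer(z, z)).sum() / (z ** 2).sum()

pts = np.random.rand(50, 2)
vals = pts[:, 0] + 0.1 * np.random.rand(50)   # values that trend with x -> clustered
print("Moran's I:", morans_i(vals, pts))      # positive when clustered, near 0 when random
```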

As with any area of scientific estimation and modeling, there are many issues that affect the accuracy of an interpolated raster or vector surface.  Basic effects stem from any combination of the number of points used for interpolation (the more, the better) and the distribution (positive or negative spatial autocorrelation) or distance between points (the location and proximity of neighbors significantly affect estimations between points).  More advanced effects include directional influence, the presence of barriers, local neighbor under- or overcorrection, and instrument error.  Finally, possibly the most important factor affecting the accuracy of an interpolated surface is choosing the most appropriate method to perform the surface interpolation.

Because the chosen method of spatial interpolation has such a great effect on the accuracy of the resulting surface, it is important to carefully choose a method that is most appropriate for the nature of the given data.  The class notes discuss the differences between several methods in detail.  Generally, Inverse Distance Weighted (IDW) methods limit the influence of neighbors as their distance increases (implementing Tobler’s Law).  More sample points promote a smoother surface; however, areas of little or no data will skew the surface toward the overall mean of the dataset, creating holes – an evenly spaced distribution of input points avoids these holes.  Thiessen (Voronoi, or natural neighbor) polygons often have odd-shaped boundaries at the transitions between polygons, and continuous variables are not well represented.  The trend surface method uses multiple regression (predicting a dependent elevation variable, Z, from independent X and Y location variables) to approximate values; however, the resulting surface rarely intersects the original points.  Splines are used to interpolate smooth curves and are best for surfaces that are already smooth.  Kriging is a more involved, geostatistical weighted-average technique using more advanced algorithms and spatial autocorrelation, best suited when correlated distances are known or there is directional bias in the data.  Finally, it is important to understand the issues associated with exact versus approximate interpolators, as well as deterministic versus geostatistical interpolators.
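As a sketch of the simplest of these methods, here is a bare-bones inverse distance weighted (IDW) interpolator, included only to make the idea concrete; ArcGIS, QGIS, and scipy offer far more capable implementations, and the rainfall samples below are hypothetical.

```python
import numpy as np

def idw(sample_xy, sample_z, grid_xy, power=2.0):
    """Estimate values at grid_xy from known samples; nearer samples weigh more."""
    d = np.linalg.norm(grid_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)                 # avoid division by zero at sample points
    w = 1.0 / d ** power
    return (w @ sample_z) / w.sum(axis=1)

# Hypothetical rainfall samples interpolated onto a small grid.
samples = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
rain = np.array([10.0, 14.0, 12.0, 20.0])
gx, gy = np.meshgrid(np.linspace(0, 1, 25), np.linspace(0, 1, 25))
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = idw(samples, rain, grid).reshape(25, 25)
print(surface[0, 0], surface[-1, -1])        # near 10 at (0,0), near 20 at (1,1)
```

Raising the power parameter shortens each sample's reach, which is exactly the distance-decay behavior described above.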

The goal of interpolation is to minimize error.  It is the responsibility of the GIS user to understand the differences between the various techniques used to interpolate surfaces.  Combining a working knowledge of the issues pertaining to spatial interpolation with available GIS reference resources will assist in the creation of accurate representations.

An Essay on Uncertainty in Spatial Data Integration

Spatial data come in a variety of forms, from oral or text descriptions and values to any number of digital representations of the real world.  There is a plethora of techniques used to describe the world spatially through raster and vector data alone.  The quality of such data, however, can be described by addressing four key factors.  Accuracy measures how closely data match true values and descriptions.  Not all values can be measured exactly; due to human error or the detection and precision limits of equipment, accuracy can easily be skewed.  Accuracy, however, is largely associated with scale – decreasing scale increases accuracy.  Precision – a reproducible performance quality – measures how exactly data are measured and stored.  In an assessment with high precision, errors repeat in similar patterns, whereas low precision gives little or no consistent representation of the original error.  Error is a typical deviation or variation from reality.  Finally, uncertainty is the overall doubt or lack of confidence in data.  Though error and uncertainty are similar, with error the discrepancies are known and can possibly be avoided, whereas uncertainty deals with the absence of knowledge about the truth of a situation.  Taken together, spatial data quality is a measure of how well GIS data represent reality.

Because a map is a model of reality, and a model can never be a completely valid representation of reality, it is important to expect a certain level of uncertainty when working on a project, as part of quality control and assurance.  Further, different data types attempting to represent the same real-world features inherently exhibit a degree of variation.  For instance, a raster image and a vector representation of the same coastline will rarely align exactly; they may differ by any number of spatial units.  Adjusting the scale at which the features are obtained may alleviate this misalignment.  Vector data digitized separately will rarely align perfectly when combined; combining two such coverages will result in sliver polygons, disconnected vertices, or polyline dangles.  It will be necessary to use overlay tools – such as a spatial join, union, etc. – to aggregate datasets while avoiding unwanted errors.
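A short sketch of the overlay step described above: unioning two separately digitized polygon layers and flagging the tiny sliver polygons the union creates. The layer names and the area threshold are hypothetical, and the layers are assumed to be in a projected coordinate system so areas are meaningful.

```python
import geopandas as gpd

land_use = gpd.read_file("land_use.shp")
wetlands = gpd.read_file("wetlands.shp").to_crs(land_use.crs)

combined = gpd.overlay(land_use, wetlands, how="union")

# Slivers show up as very small polygons along mismatched boundaries; inspect or
# merge them into neighbors rather than treating them as real features.
slivers = combined[combined.geometry.area < 10.0]   # square map units
print(len(slivers), "sliver polygons below the area threshold")
```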

Storing information in metadata files is another powerful method that can be used to track the errors present in a GIS application.  Metadata is a means of recording information about a given dataset.  It can describe, among many other parameters, the methods that were used to capture the data in a coverage.  Providing such information easily and efficiently informs others who use the dataset of the deviation or error they should anticipate.  A lecture in class discussed how improved awareness of data quality is necessary, as data quality can have large impacts on geographic data analysis if it is not addressed properly, or is overlooked entirely.

Scale is another issue that may assist in the avoidance of uncertainty.  Determining an appropriate level of scale at which the error occurring in a dataset will be minimized is a sure way to improve the dataset's accuracy.

Controlled uncertainty is another spatial data quality issue, and one that works inversely to the previously discussed methods.  Deliberate degradation of data is used as a mechanism for protecting the data at hand, whether a dataset covers records or locations whose position or confidentiality must be guarded.  The Fisher reading summarizes well that analysis performed on a project without accommodating uncertainty will have significantly degraded validity, and its usefulness is therefore questionable.  By planning for a certain degree of error, it is possible to continue to create valid results.