Current research activities

This page provides an overview of my research activities (current and past) as a Research Scientist for the Commonwealth Scientific and Industrial Research Organisation (CSIRO), within the Data61 business unit (Remote Sensing and Image Integration team) in Canberra, Australia. Please feel free to contact me directly for more information on these topics; alternatively, check out the various publications cited at the bottom of this page.

Monitoring the water quality of inland waterbodies

Cyanobacteria (blue-green algae) can form blooms in freshwaters that impact on water quality and pose a major hazard to both consumptive and environmental water users. Monitoring algae in Australia's surface freshwaters is therefore essential to ensure the safety of all water users and the supply of safe potable water. Cyanobacterial management has traditionally been supported by a monitoring program, where water samples are collected in the field from major reservoirs and rivers and then sent to a laboratory for analysis. This involves the identification and counting of the cells present by microscopy. This method, while reliable and the best currently available, may have a lag time of several days between sample collection in the field and the analytical data becoming available from the laboratory. Sampling water from point locations also only provides localised information, and does not provide adequate data on the spatial extent of the bloom.

This project aims to develop procedures that use remote sensing technologies to support cyanobacterial monitoring, including real-time data for immediate management and a wider spatial coverage of major inland waters. Implementation of this technology leads to improvements in the management of algal blooms in Australia's water bodies, and a more efficient use of resources in targeting specific problem areas.

As part of this project, we have developed a software framework and product delivery system to deploy algal detection algorithms, with the aim of automated, rapid processing of satellite data streams into algal reports. This framework has been implemented on Australia's National Computational Infrastructure (NCI) and makes use of the Australian Geoscience Data Cube infrastructure (a joint venture between CSIRO, Geoscience Australia and the NCI), thereby leveraging Australia's highest-performance computing cloud to enable efficient processing of a 30-year time series of fully-processed satellite imagery over the Australian continent.

Outputs from the water quality algorithm, applied to each available time slice in the time series, are subsequently presented to the user in a web interface for interactive spatial and temporal investigation (see Figure 1). The algal alert system is designed such that it can accommodate new sensors and data streams, can be scaled up to large spatial extents and/or other countries (e.g. continental scale, developing countries, etc.) and can be easily updated with new algorithms.

Figure 1: Web interface displaying a water quality map from the Algal Alert System prototype over Lake
Burley Griffin in Canberra, ACT. The eastern parts of the lake show elevated levels of suspended sediments,
pointing to an increased risk of potential algal bloom occurrence.

Fusion of SAR and Landsat data

This project is concerned with the combined processing of Landsat imagery and synthetic aperture radar (SAR) data, for the purpose of forest mapping and monitoring. Optical satellite imagery typically provides a vast historical data archive (going back to 1972 in the case of Landsat), but can be significantly affected by clouds. SAR-based remote sensing data represents a more recent dataset, but can be acquired round-the-clock and regardless of weather conditions. Analysing the advantages (and drawbacks) of combining such datasets is therefore very important for the development of large-scale forest information systems based on the use of remote sensing imagery. The following figure shows a representative example of Landsat and ALOS-PALSAR data over a small study site in north-eastern Tasmania, which includes one of three national calibration sites considered as part of the GEO-FCT task, of which this project is a part.

Figure 2: Examples of Landsat data (bands 5/4/2 in RGB) and PALSAR data (HH/HV/HH-HV in RGB) in north-eastern
Tasmania, Australia, in 2007; white areas represent missing data (masked out pixels).

A statistical method that is central to the joint analysis of Landsat and PALSAR data in our work is that of Canonical Variate Analysis (CVA). This is a linear discriminant technique that provides (among other things) quantifiable metrics of separation between various land cover types over the study area. For instance, CVA can be used as a tool to select a number of separable land cover classes, which can be subsequently used as input to a maximum-likelihood (ML) classifier of the remote sensing data. Figure 3 shows an example of ML output (forest/non-forest, F/NF) using this approach for a small area which is known to be challenging for a Landsat-only or a PALSAR-only classification (subset of the images presented in Figure 2). This result shows that the joint processing of optical and SAR data is advantageous from a F/NF perspective: the joint classification results (bottom-right image) are much closer to the ground truth (top-right) than the Landsat-only and PALSAR-only classifications (bottom-centre and bottom-left, respectively). CVA can also be used to derive measures of how much information is provided by various image bands for the classification, as well as the amount of information provided by different types of data such as Landsat and PALSAR (and thus determine which dataset is more relevant for the classification of certain types of land covers). More information on this topic can be found in [1,2].

Figure 3: Forest/non-forest classifications from Landsat data, PALSAR data, and combined
(bottom row), with corresponding data and ground truth (top row).
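
As a rough illustration of the CVA idea described above, the following Python sketch computes canonical vectors and canonical roots from labelled training pixels. It is a minimal textbook formulation, not the implementation used in [1,2]: the function name canonical_variates, the three-band synthetic data and the class means are all assumptions made for the example.

```python
import numpy as np


def canonical_variates(X, y):
    """Canonical vectors (as columns) and canonical roots, largest root first.

    X: (n_samples, n_bands) training pixels; y: (n_samples,) class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_bands = X.shape[1]
    grand_mean = X.mean(axis=0)
    W = np.zeros((n_bands, n_bands))  # pooled within-class scatter
    B = np.zeros((n_bands, n_bands))  # between-class scatter
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        W += (Xc - class_mean).T @ (Xc - class_mean)
        d = (class_mean - grand_mean)[:, None]
        B += Xc.shape[0] * (d @ d.T)
    # Solve B v = root * W v by whitening with the Cholesky factor of W.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    roots, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(roots)[::-1]
    return L_inv.T @ U[:, order], roots[order]


# Synthetic stand-in for training pixels (values are illustrative only).
rng = np.random.default_rng(0)
forest = rng.normal([0.2, 0.5, 0.1], 0.05, size=(200, 3))
non_forest = rng.normal([0.4, 0.3, 0.3], 0.05, size=(200, 3))
X = np.vstack([forest, non_forest])
y = np.array([1] * 200 + [0] * 200)

vectors, roots = canonical_variates(X, y)
scores = X @ vectors[:, 0]  # first canonical variate for every pixel
```

Each canonical root is the ratio of between-class to within-class scatter along its canonical vector, so larger roots indicate better-separated classes. With two classes only the first root is non-zero; comparing roots obtained from Landsat bands, PALSAR bands, or both stacked together is one simple way to gauge what each dataset contributes.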

Our second approach to a joint multi-temporal processing of the Landsat and PALSAR data is closely related to the methods developed as part of Australia's National Carbon Accounting System (NCAS), an operational continental-scale forest monitoring system developed by the CSIRO. These methods involve the use of a contrast-directed CVA in order to select linear discriminant functions and thresholds for classification of each image of remote sensing data. The resulting forest probabilities are subsequently considered as observations of a latent state (F/NF) variable; estimates of this state variable (i.e. forest presence/absence) are finally provided as the output of a Conditional Probability Network (CPN, based on a hidden Markov model) which carries out a spatial-temporal processing of each pixel in each image for all years of the considered time series. Figure 4 provides a symbolic representation of this approach. In our work, the time series of remote sensing data typically contains a mixture of Landsat and PALSAR data, thus leading to an output that effectively "blends" the two different datasets into one single forest information product. The detail of this work can be found in [3,4].

Figure 4: Symbolic representation of spatial-temporal CPN processing. The resulting F/NF estimates (in orange,
bottom row) draw on the information provided by all images (i.e. SAR and optical) in the time series.
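
The temporal part of this idea can be sketched with a two-state hidden Markov model and the standard forward-backward recursions, shown below in Python for a single pixel. This is a simplified illustration only: it ignores the spatial component of the CPN described in [3,4], and the function name smooth_forest_probs, the persistence probability p_stay and the example series are assumptions.

```python
import numpy as np


def smooth_forest_probs(p_obs, p_stay=0.95, prior_forest=0.5):
    """Posterior forest probability per date for one pixel.

    p_obs: per-image forest probabilities (NaN where the pixel is masked).
    p_stay: assumed probability that the cover state persists between dates.
    """
    p_obs = np.asarray(p_obs, dtype=float)
    p_obs = np.where(np.isnan(p_obs), 0.5, p_obs)  # missing = uninformative
    p_obs = np.clip(p_obs, 1e-6, 1.0 - 1e-6)
    n_dates = p_obs.size
    A = np.array([[p_stay, 1.0 - p_stay],
                  [1.0 - p_stay, p_stay]])       # states: 0 = NF, 1 = F
    E = np.column_stack([1.0 - p_obs, p_obs])    # evidence for NF and F

    alpha = np.zeros((n_dates, 2))
    beta = np.ones((n_dates, 2))
    alpha[0] = np.array([1.0 - prior_forest, prior_forest]) * E[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n_dates):                  # forward pass
        alpha[t] = (alpha[t - 1] @ A) * E[t]
        alpha[t] /= alpha[t].sum()
    for t in range(n_dates - 2, -1, -1):         # backward pass
        beta[t] = A @ (E[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    posterior = alpha * beta
    posterior /= posterior.sum(axis=1, keepdims=True)
    return posterior[:, 1]


# Hypothetical series mixing optical and SAR dates, with one masked date.
series = [0.90, 0.85, np.nan, 0.20, 0.90, 0.88]
smoothed = smooth_forest_probs(series)
```

Because every date draws on all other dates, the masked third date and the isolated low value on the fourth date are both pulled towards the forest state supported by their neighbours. Which sensor supplied each probability does not matter to the recursion, which is what allows a mixed time series to be blended into a single product.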

Data assimilation for water resources assessment

Research on water resources assessment and accounting requires making the best use of multiple sources of data in order to produce reliable accounting predictions. For any given quantity of interest such as soil moisture (one of the main physical quantities with a key role in water resources accounting), the available data could be observed directly by ground probes, derived indirectly from a remotely-sensed surrogate (e.g. through brightness temperature using retrieval models), or produced as the output of deterministic hydrological models. For tasks such as model-data fusion or the evaluation of remotely-sensed products, multiple data sources for individual quantities of interest must be assimilated to optimise the use of available information. Spatial statistical models, constructed in a Bayesian hierarchical framework, offer an intuitive and unifying approach for this purpose. This approach coherently utilises multiple data sources by addressing the mismatch in both spatial and measurement scales (ground-based vs. remotely-sensed products), and can also spatially interpolate missing data to infer likely values.

Figure 5: Symbolic representation of Bayesian hierarchical model for soil moisture
data assimilation (figure © G. Chiu, reproduced from [6]).

In this project (part of the WIRADA partnership between CSIRO and Australia's Bureau of Meteorology), we developed a number of Bayesian hierarchical models (BHMs) for purposes such as the blending/assimilation of multiple soil moisture datasets, as well as the evaluation (benchmarking) of remotely-sensed soil moisture products against weather station data. Figure 5 shows a symbolic representation of one such statistical framework, representing a single unified model that integrates three different data sources: in situ ground-based probe data, remotely-sensed soil moisture from the AMSR-E satellite, and the rainfall (precipitation) product AWAP, with each dataset having its own spatial scale (different pixel sizes, or point-level measurements). Figure 5 shows the relationships between these different variables and the latent soil moisture (state) variable. Here, the precipitation covariate is assumed to be the principal driver for soil moisture, which in turn generates the response variables measured either on the ground or via remote sensing. An additional spatial term is also modelled as part of the precipitation covariate to impart spatial correlation (smoothness) to the various estimated quantities. These relationships between the variables and their spatial characteristics are then modelled through a mathematical BHM formulation, and the model is subsequently fitted via Markov Chain Monte Carlo (MCMC) simulations. This ultimately provides model-based estimates of various quantities of interest, such as the latent soil moisture field, inference of missing data points, spatial correlation characteristics, etc., a few examples of which are provided in Figure 6 over the Murrumbidgee River Catchment in New South Wales, Australia (the ground probe locations are shown as black dots in the plots). All of these estimated quantities also come with an assessment of their reliability/uncertainty (e.g. variance, credible intervals, etc.), which is of particular importance for the subsequent use or assessment of these products. More information on this work can be found in [5,6].

Figure 6: Datasets involved in the spatial BHM for soil moisture blending. Top row: AMSR-E data (left)
with imputed pixels (missing data) outlined in blue, and AWAP precipitation covariate (right).
Bottom row: soil moisture data from ground probes (left) with imputed values (missing data) shown as
white bars, and latent soil moisture estimates (blended product) from the fitted BHM (right).

Vegetation trends

In perennial and natural vegetation systems, monitoring changes in vegetation over time is of fundamental interest for identifying and quantifying impacts of management and natural processes. Subtle changes in vegetation cover can be identified by calculating the trends of a vegetation density index over time. As part of this project, operational methods were developed to apply such an index-trends approach to continental-scale monitoring of disturbances within forested regions of Australia. In essence, this vegetation trends product is a time-series summary providing a visual indication of within-forest vegetation changes (disturbance and recovery) over time at 25 m resolution. It is based on a national archive of calibrated Landsat TM/ETM+ data from 1989 to 2006 produced for Australia's National Carbon Accounting System (NCAS).

Figure 7: Symbolic representation of processing steps for the generation of forest cover trends.

Figure 7 shows a symbolic representation of the main concepts contained in this approach. The trends product relies on the identification of an appropriate Landsat-based vegetation cover index that is sensitive to changes in forest density. To produce the trends information, statistical summaries of the index response over time (such as slope and quadratic curvature) are then calculated and finally displayed as maps where the different colours indicate the approximate timing, direction (decline or increase), magnitude and spatial extent of the changes in vegetation cover. This information highlights subtle changes within forested areas and provides the capacity to identify processes affecting forests which are of primary interest to ecologists and land managers, at scales that are relevant for natural resource management and environmental reporting. The detail of this work is documented in [7].
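
The per-pixel summaries mentioned above (slope and quadratic curvature of the index over time) can be sketched in Python with an ordinary least-squares polynomial fit. This is only an illustration under stated assumptions: the function name index_trend_summaries, the quadratic fit on mean-centred years and the two-pixel synthetic stack are not taken from [7], and the sketch does not handle missing (NaN) values.

```python
import numpy as np


def index_trend_summaries(years, index_stack):
    """Slope and quadratic-curvature maps from a stack of index images.

    years: (n_dates,) acquisition years.
    index_stack: (n_dates, rows, cols) vegetation index values, no NaNs.
    """
    t = np.asarray(years, dtype=float)
    t = t - t.mean()                      # centre time for a stable fit
    n_dates, rows, cols = index_stack.shape
    Y = index_stack.reshape(n_dates, -1)  # one column per pixel
    # polyfit returns coefficients from highest power to lowest, per column.
    quad, lin, _ = np.polyfit(t, Y, deg=2)
    slope = lin.reshape(rows, cols)       # index change per year at mid-period
    curvature = quad.reshape(rows, cols)  # coefficient of the squared term
    return slope, curvature


# Two synthetic pixels: a steady decline and a decline followed by recovery.
years = np.arange(1989, 2007)
tc = years - years.mean()
stack = np.empty((years.size, 1, 2))
stack[:, 0, 0] = 0.8 - 0.01 * (years - 1989)
stack[:, 0, 1] = 0.5 + 0.002 * tc ** 2

slope, curvature = index_trend_summaries(years, stack)
```

A negative slope with near-zero curvature flags a steady decline, while a positive curvature with near-zero slope flags a decline followed by recovery. Mapping such summaries to colour channels is one way to produce the kind of trends display described above.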

Rainfall extremes in Australia

Understanding how extremes of temperature, precipitation and other climate variables will change in the future is a key element in planning for the impacts of potential climate change. Analysis of data from weather stations can be used to assess past and current trends in extremes, but projections from global and regional climate models (GCMs/RCMs) are required to project into the future under different greenhouse gas emissions scenarios. As part of CSIRO's Climate Adaptation Flagship, this statistical climatology project investigates the use of Bayesian hierarchical modelling to model the observed rainfall data and the GCM/RCM outputs (current climate and projections) in a spatially consistent manner. This approach will provide an efficient tool to analyse the characteristics of future extreme weather events, such as return levels and intensity-frequency-duration curves.

Figure 8: 100-year return levels for rainfall simulated via spatial Bayesian hierarchical modelling (posterior mean and 95%
credible interval, CI), based on a 56-year rainfall dataset recorded at 42 weather stations around Sydney, NSW.
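
To illustrate how a return level with a posterior mean and credible interval can be derived, the Python sketch below assumes that annual rainfall maxima follow a generalised extreme value (GEV) distribution and that posterior draws of its parameters are available from a fitted model. Both the GEV choice and the parameter draws are assumptions made for the example; they are not values or model details from this project.

```python
import numpy as np


def gev_return_level(mu, sigma, xi, period):
    """Level exceeded on average once every `period` years under a GEV model."""
    mu, sigma, xi = (np.asarray(v, dtype=float) for v in (mu, sigma, xi))
    y = -np.log(1.0 - 1.0 / period)
    near_zero = np.abs(xi) < 1e-8
    xi_safe = np.where(near_zero, 1.0, xi)  # avoids dividing by zero below
    general = mu + sigma / xi_safe * (y ** (-xi_safe) - 1.0)
    gumbel = mu - sigma * np.log(y)         # limiting case as xi -> 0
    return np.where(near_zero, gumbel, general)


# Hypothetical posterior draws for one location (not values from the project).
rng = np.random.default_rng(1)
mu_draws = rng.normal(80.0, 3.0, size=5000)     # location, mm
sigma_draws = rng.normal(25.0, 2.0, size=5000)  # scale, mm
xi_draws = rng.normal(0.10, 0.03, size=5000)    # shape

levels = gev_return_level(mu_draws, sigma_draws, xi_draws, period=100)
posterior_mean = levels.mean()
ci_low, ci_high = np.percentile(levels, [2.5, 97.5])
```

Evaluating the return level for every posterior draw, rather than only at the mean parameter values, propagates parameter uncertainty into the credible interval. Repeating the calculation at each grid location would give maps of the kind shown in Figure 8.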

Blending methodologies for multiple Earth Observation products

From February to July 2013, I led a team of scientists on a project focussing on blending methodologies for remote sensing data. This project was part of the CSIRO Transformational Capability Platform on Earth Observation Informatics (EOI TCP).

Important environmental variables are often available from multiple remote sensing (RS) platforms simultaneously; e.g. Landsat and ALOS-PALSAR may both be used to provide measures of vegetation cover, and the ASCAT and SMOS sensors both deliver soil moisture estimates. This project focuses on the assessment and further development of existing/new data blending methodologies for RS products. These techniques need to handle several challenges, including different temporal/spatial resolutions, heterogeneous scales of uncertainty, different acquisition periods, and exploitation of other important covariates (topography, vegetation, etc.).

Seabed mapping

For a short time, I was also involved in a project related to the development of spatial tools for seabed condition mapping using acoustic swath data. The aim was to develop operational procedures for analysing and classifying underwater sonar data, so as to improve the current state of scientific management/investigation of human impacts on marine environments and living resources in coastal areas. My involvement within this project was mainly related to the pre-processing of data from the EM300 multibeam echo sounder system. In particular, several procedures were developed to fit a depth surface to the (noisy) EM300 data, to fit slope angles to the depth data, and to correct the EM300 backscatter data with respect to the local slope angle (thus improving on the flat-bottom assumption typically used for calibration/normalisation of backscatter data).

Figure 9: Depth data from EM300 multibeam sensor, with overlaid backscatter intensity.
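
The surface and slope fitting steps mentioned above can be illustrated with a least-squares plane fit over a local patch of soundings, sketched below in Python. The function name fit_plane_slope, the planar model and the synthetic soundings are assumptions for illustration; the procedures developed in the project may have differed, and the backscatter correction itself is not shown.

```python
import numpy as np


def fit_plane_slope(x, y, depth):
    """Fit depth = a*x + b*y + c by least squares; return a, b, c, slope in degrees."""
    G = np.column_stack([x, y, np.ones_like(x)])
    (a, b, c), *_ = np.linalg.lstsq(G, depth, rcond=None)
    # Angle between the fitted plane and the horizontal.
    slope_deg = np.degrees(np.arctan(np.hypot(a, b)))
    return a, b, c, slope_deg


# Synthetic noisy soundings over a 100 m x 100 m patch (illustrative only).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 100.0, size=200)
y = rng.uniform(0.0, 100.0, size=200)
depth = 50.0 + 0.1 * x + rng.normal(0.0, 0.05, size=200)

a, b, c, slope_deg = fit_plane_slope(x, y, depth)
```

The fitted gradient (a, b) gives both the magnitude and the direction of the local slope, which is the information needed to replace a flat-bottom incidence angle with one measured relative to the actual seabed.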


[1] E. Lehmann, Z.-S. Zhou, P. Caccetta, A. Mitchell, A. Milne, K. Lowell and S. McNeill, Combined analysis of optical and SAR remote sensing data for forest mapping and monitoring, International Symposium on Digital Earth, Perth, Australia, August 2011.
[2] E. Lehmann, P. Caccetta, Z.-S. Zhou, A. Mitchell, I. Tapley, A. Milne, A. Held, K. Lowell, and S. McNeill, Forest discrimination analysis of combined Landsat and ALOS-PALSAR data, International Symposium on Remote Sensing of Environment, Sydney, Australia, April 2011.
[3] E. Lehmann, Z.-S. Zhou, P. Caccetta, A. Milne, A. Mitchell, K. Lowell and A. Held, Forest mapping and monitoring in Tasmania using multi-temporal Landsat and ALOS-PALSAR data, IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, July 2012.
[4] E. Lehmann, P. Caccetta, Z.-S. Zhou, S. McNeill, X. Wu and A. Mitchell, Joint processing of Landsat and ALOS-PALSAR data for forest mapping and monitoring, IEEE Transactions on Geoscience and Remote Sensing, vol. 50, nr. 1, pp. 55-67, January 2012.
[5] G. Chiu and E. Lehmann, Bayesian hierarchical modelling: incorporating spatial information in water resources assessment and accounting, International Congress on Modelling and Simulation, pp. 3349-3355, Perth, Australia, December 2011.
[6] G. Chiu, E. Lehmann and J. Bowden, A spatial modelling approach for the blending and error characterization of remotely sensed soil moisture products, Journal of Environmental Statistics, vol. 4, nr. 9, April 2013.
[7] E. Lehmann, J. Wallace, P. Caccetta, S. Furby and K. Zdunic, Forest cover trends from time series Landsat data for the Australian continent, International Journal of Applied Earth Observation and Geoinformation, vol. 21, pp. 453-462, April 2013.