The world craves data. We want it now. And we want to know more.
Many of us can’t just go for a run anymore. Our fitness trackers record our time, distance, pace and heart rate. If we don’t have the statistics, it’s as though our run never happened.
Or think back a couple of years to when Australia was going through its second wave of COVID-19 and we hung out for the daily announcements of case numbers. We compared ourselves to other states and territories, and to other countries.
“As public servants, it is important to have good information – good statistics – to inform the work we do,” says Lyndon Ang, a Sir Roland Wilson scholar from the Australian Bureau of Statistics (ABS).
“However, there is a tension between the cost of producing reliable statistics and the increasing demand from the public for timely and accurate data.”
More data is available to us now than ever before.
“From mobile phones to supermarket scanners, online search engines and satellites in space – information is constantly being collected about people, places and things.”
Through his PhD research, Lyndon aims to improve the way we harness externally sourced datasets alongside sample surveys to produce statistics that provide reliable conclusions.
“Many of these alternative data sources can be described as non-probability data – which means we often don’t know how the person, business or thing was selected in the dataset. This can be problematic if the data do not represent everyone in the population.
“Imagine you have a group of 20 people and someone has tossed a coin to decide who from the group will be selected to complete a survey. If you don’t know anything about the coin toss – how many sides the coin has, or whether the coin toss is fair for all 20 people – then we don’t know if the resulting survey data reliably represents the original 20 people.”
Lyndon says the key is to have supplementary or complementary information about the population that can help address the issues in the non-probability data.
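To make the idea concrete, here is a minimal sketch in Python of one standard textbook approach, post-stratification, in which known population group sizes are used to reweight a sample whose selection process is unknown. This is an illustration only, not a description of Lyndon's method: the group names, the exercise figures and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical population of 20 people: (age_group, weekly_exercise_hours).
# All names and numbers here are made up for illustration.
POPULATION = [("young", 5.0)] * 8 + [("older", 2.0)] * 12

# A non-probability sample: we do not know how these 8 people were selected,
# and younger people happen to be over-represented.
SAMPLE = [("young", 5.0)] * 6 + [("older", 2.0)] * 2

# Supplementary information: group sizes known from a trusted source
# (for example, a census or a well-designed probability survey).
POPULATION_COUNTS = {"young": 8, "older": 12}


def naive_mean(records):
    """Average the values as if every record represented the population equally."""
    return sum(value for _, value in records) / len(records)


def post_stratified_mean(sample, population_counts):
    """Weight each group's sample mean by that group's known population size."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in sample:
        totals[group] += value
        counts[group] += 1
    missing = set(population_counts) - set(counts)
    if missing:
        raise ValueError(f"no sample records for groups: {sorted(missing)}")
    weighted_sum = sum(
        population_counts[g] * totals[g] / counts[g] for g in population_counts
    )
    return weighted_sum / sum(population_counts.values())


if __name__ == "__main__":
    print("true mean:           ", naive_mean(POPULATION))
    print("naive sample mean:   ", naive_mean(SAMPLE))
    print("post-stratified mean:", post_stratified_mean(SAMPLE, POPULATION_COUNTS))
```

In this toy case the naive average (4.25 hours) overstates the true figure (3.2 hours) because the sample leans young, and reweighting by the known group sizes recovers 3.2. It works this neatly only because everyone within a group has the same value, which real data would not guarantee.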
His research looks for new ways to fill the gaps reliably and efficiently.
“For official statistics, this potentially provides opportunities to enhance the way that we run surveys in terms of the questions we ask, who we ask to respond and how we design the surveys.”
Lyndon hopes his research will help government agencies beyond the ABS to fill in the gaps in incomplete datasets.
“Addressing the issues with non-probability data will reduce the cost of data collection and decrease the burden on the population – meaning fewer or shorter surveys for people to complete.”
The Sir Roland Wilson Scholarship is a three-year, full-pay scholarship for PhD research at ANU for high-performing EL1 and EL2 APS employees.
Image: Kelly Chen Photography