Session 23

Assessing the Quality of Survey Data

This session presents a series of original investigations of data quality in both national and international contexts. The starting premise is that all survey data contain a mixture of substantive and methodologically induced variation. Most current work focuses primarily on random measurement error, which is usually treated as normally distributed. However, there are many different kinds of systematic measurement error or, more precisely, many different sources of methodologically induced variation, and all of them may strongly influence the “substantive” solutions. These sources include response sets and response styles, misunderstandings of questions, translation and coding errors, uneven standards between the research institutes involved in the data collection (especially in cross-national research), item and unit non-response, and faked interviews. We consider data to be of high quality when the methodologically induced variation is low, i.e. when the differences in responses can be interpreted on the basis of theoretical assumptions in the given area of research. The aim of the session is to discuss different sources of methodologically induced variation in survey research, how to detect them, and the effects they have on substantive findings. Keywords: Quality of data, task simplification, response styles, satisficing.




1. Combining Research-Elicited Data and Process-Produced Data in the Sociology of Deviance

Khumo Motshwari (Universität Augsburg, Germany)

Process-produced mass data have long been criticized for their limitations and their inability to address certain kinds of research questions. This is supported by Baur (2019), who highlighted that such data are usually not produced for research purposes but are a side product of social processes. Process-produced data, such as administrative data collected by government ministries and organizations, constitute an important data source for research but are often not sufficient to answer many kinds of research questions when used in isolation. Whilst these limitations are evident, it is equally clear that such data can be effectively harnessed to answer specific types of research questions, and this paper offers an example of the usefulness of process-produced data in studying juvenile delinquency. The paper uses the Botswana Youth Risk Behavioral and Biological Surveillance Survey by the Ministry of Basic Education as a process-produced data set to analyze the factors associated with juvenile delinquency in Botswana. Whilst the report itself provides a basis from which to launch the study, it cannot answer qualitative research questions that require thorough descriptions from the juveniles involved in these acts of deviance. The paper therefore uses it as a starting point to explore the possibilities of combining process-produced data with research-elicited data, with the ultimate aim of demonstrating how doing so yields more satisfactory research results. This is a methodological approach that has not yet been extensively applied in Botswana and other African countries, and the paper will therefore open discussion on the possibilities of conducting research in this manner, especially where process-produced data sets are publicly accessible.


2. Challenges of Representativeness in Survey Research: An Evaluation of the ERiK Surveys 2020

Gedon Benjamin (German Youth Institute (DJI), Germany)

At what point can we claim that survey data and results are representative of the entire population? And what actions can researchers take to improve representativeness? Despite the fact that these questions are fundamental to almost any research endeavour, they are rarely explicitly addressed. This presentation aims to address this gap and showcases the challenges of generalisability faced by many researchers by discussing the approach taken by the national study “Indicator based monitoring of structural quality in the German early childhood education and care system (ERiK for short)”. We first describe the ERiK survey concept, which consists of five cross-sectional surveys in 2020, covering the multiple stakeholder perspectives of directors and pedagogical staff in day-care facilities, family day-care workers, youth welfare offices and day-care facility providers. We then evaluate the quality of the ERiK data collection with regard to their representativeness, especially for the 16 German federal states, focusing in particular on selectivity due to varying sampling and participation probabilities. We use several different measurements for this assessment, including comparisons between actual and ideal sample size, the share of respondents of the total population and the response rate. Additionally, we discuss other possible sources of error at different stages of the survey process (summarised in the concept of the Total Survey Error) and develop appropriate weighting factors.

We conclude that the ERiK surveys 2020 can be considered representative and therefore can be used to make generalised statements about the quality of child day-care in Germany. Finally, we review some limitations of the datasets.
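
The three indicators named in this abstract (response rate, respondents as a share of the population, and actual versus ideal sample size) can be illustrated with a minimal sketch. The abstract does not give formulas or figures, so the definitions, the class and field names, and all numbers below are hypothetical assumptions for illustration only, not the ERiK procedure or results.

```python
from dataclasses import dataclass


@dataclass
class StateSample:
    """Hypothetical fieldwork figures for one federal state."""
    state: str
    population: int   # units in the target population (assumed known)
    invited: int      # gross sample contacted
    respondents: int  # completed questionnaires
    ideal_n: int      # target sample size set at the design stage


def indicators(s: StateSample) -> dict:
    """Simple descriptive indicators; the definitions are assumptions."""
    return {
        "response_rate": s.respondents / s.invited,
        "population_share": s.respondents / s.population,
        "actual_to_ideal": s.respondents / s.ideal_n,
    }


# Invented numbers, not taken from the ERiK surveys.
samples = [
    StateSample("State A", population=4000, invited=800, respondents=320, ideal_n=400),
    StateSample("State B", population=1000, invited=400, respondents=100, ideal_n=250),
]

for s in samples:
    print(s.state, indicators(s))
```

Comparing such indicators across states is one way to spot where participation departs most from the design, which is where selectivity is most likely to matter.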


3. Evidence on Non-Response and Coverage Bias in German Provider Surveys

Lisa Ulrich (Deutsches Jugendinstitut, Germany)

In Germany, nearly all pre-school children attend a childcare center run by a public or private provider. Since early childhood education and care (ECEC) has long-lasting educational benefits for children, debates about the quality of services and related research are increasing, as is the number of provider surveys being conducted. However, sampling frames for providers are not available and provider registries are not maintained in Germany, so the quality of these provider surveys and the consequences for the resulting survey estimates have not been scrutinized. In this article, we assess the risk of non-response and coverage bias in provider surveys using an exemplary point estimate, namely the share of children attending childcare centers. We do so by comparing distributions based on registry data on childcare facilities with survey data on providers and on childcare facilities from the ERiK Surveys 2020. We find that non-response and non-coverage combined bias the point estimate by up to 10 percent. Considering the detrimental consequences of biases that high for political and societal planning, the results show that distributions based on unweighted provider survey data can only be used to a limited extent. Furthermore, the paper shows that statistical adjustment through weighting can minimize bias due to non-coverage and non-response for the exemplary outcome. The paper thereby introduces a procedure by which future provider studies can evaluate their survey quality and the generalisability of their point estimates.
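
How weighting can move a biased point estimate back towards the population value can be sketched as follows. The abstract does not say which weighting method the authors use, so this example assumes simple post-stratification on a single variable. The strata, the population counts, and the outcome values are all invented for illustration.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Weight = population share of a stratum / sample share of that stratum."""
    n = len(sample_strata)
    total = sum(population_counts.values())
    sample_counts = Counter(sample_strata)
    return [
        (population_counts[s] / total) / (sample_counts[s] / n)
        for s in sample_strata
    ]


def weighted_mean(values, weights):
    """Weighted average of the outcome values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: public providers are over-represented (3 of 4),
# although the assumed registry counts show an even split.
strata = ["public", "public", "public", "private"]
values = [0.2, 0.2, 0.2, 0.6]  # invented outcome per responding provider
population_counts = {"public": 500, "private": 500}

weights = poststratification_weights(strata, population_counts)
unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In this toy case the unweighted mean understates the outcome because the over-represented stratum has lower values. The weights restore the assumed population mix. An adjustment of this kind only removes bias that is related to the variables used to form the strata.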


4. Pilot Studies: A Useful Methodological Principle in Quantitative Research

Joy Tauetsile (University of Botswana, Botswana)

Commonly known as ‘feasibility’ studies, pilot studies are designed to assess the feasibility of a large, expensive full-scale study and are an essential prerequisite in social science research. The objective of this paper is to share lessons learnt from conducting a pilot study by presenting its key aspects, including (a) sample selection and data collection; (b) outcomes of the pilot and their application to the main study; (c) deciding on the measures and/or scales to use in the main study based on the pilot results; (d) fieldwork protocol and logistical problems experienced during the pilot; (e) the criteria for evaluating the success of a pilot study; and (f) reporting the results of a pilot investigation. In discussing these aspects, the paper refers to two papers: “Measuring Employee Engagement: Utrecht Work Engagement Scale (UWES) or Intellectual, Social, and Affective Scale (ISA)” and “Employee Engagement in Non-Western contexts: The link between Social Resources, Ubuntu and Employee Engagement”. The former was the pilot inquiry for the latter. The paper concludes by discussing the challenges encountered and the valuable lessons learnt during the pilot investigation, which informed the methodological decisions for the main inquiry.

Key Words: Pilot Study, Employee Engagement, Botswana