Frequently Asked Questions
General FAQ
Users should first read the study guide to understand the study design and the sample. Also review the questionnaires to understand the study content and variables included.
The SG LEADS project was led by Principal Investigator Professor Wei-Jun Jean Yeung and a team of multidisciplinary Co-PIs (see People tab).
The Singapore Longitudinal EArly Development Study (SG LEADS) was funded by the Ministry of Education Social Science Research Thematic Grant (MOE 2016 – SSRTG – 044).
You may follow the citation below when citing the study:
Singapore Longitudinal Early Development Study (SG LEADS), Wei-Jun Jean Yeung (PI), Ding Xiaopan, Ryan Hong, Shirlena Huang, Lim Sun Sun, Leher Singh, Brenda Yeoh, funded by The Ministry of Education Social Science Research Thematic Grant (MOE2016 – SSRTG – 044) and conducted by the Centre for Family and Population Research, National University of Singapore, 2017-2022.
If information from the study guides or technical reports has been used, please cite them and their specific authors accordingly. The list of citations can be found on our Documentation page.
Wave 1 of the Core National Survey was conducted from 2018-2019.
Wave 2 of the Core National Survey was conducted in 2021.
In Wave 1 (2018-2019), we interviewed a nationally representative sample of 5,006 Singaporean children under 7, and their primary caregivers from 3,477 households across the nation. Children from all socio-economic statuses and racial groups were properly represented in this survey.
In Wave 2 (2021), the study successfully re-interviewed 3,017 households and 4,351 children from the same sample.
In Wave 1, up to 2 eligible children were interviewed in each household. If there were more than 2 eligible children, 2 of them were randomly selected. See Study Guides for more details.
A probability sample of households with at least one child under the age of 7 in Wave 1 was selected from all planning areas in Singapore. SG LEADS also oversampled the low-income population (i.e., households living in 1-3 room HDB units). With the sampling weights created by study staff, the sample properly represents the gender, age and race of the children, and the households living in different housing types. More information on the sampling design is provided in the Wave 1 Study Guide.
In Wave 1, we interviewed up to 2 eligible children and their primary caregiver from each household. If there were more than 2 eligible children, 2 of them were randomly selected. See the Wave 1 Study Guide for more details.
In Wave 2, we re-interviewed the same target child(ren) and their primary caregiver at the time of interview.
The primary caregiver is the person who primarily takes care of the target child. The order of priority in selecting the primary caregiver is mother, father, then other adults.
We conducted face-to-face Computer-Assisted Personal Interviews (CAPI) with each primary caregiver at the child’s home, with one interviewer interviewing the child’s primary caregiver, and another interviewer performing assessments with the target child(ren).
Yes, you should use the weights for descriptive statistics. As mentioned earlier, we oversampled households in 1-3 room HDB units. Without applying the weights, the statistics will not accurately represent the nation.
For multivariate analysis, we also recommend using the sampling weight. However, as practices differ across disciplines, please follow the conventions of your own discipline. It is always good practice to compare the weighted and unweighted results in multivariate analysis.
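As a rough illustration of why weighting changes descriptive statistics, the sketch below compares an unweighted and a weighted mean in Python. The records, the field names `score` and `weight`, and all values are hypothetical; the actual weight variables are described in the sampling weight technical report.

```python
# Hypothetical child-level records: an outcome value and a sampling weight.
# Oversampled groups carry smaller weights so that they do not dominate
# national estimates. All numbers are invented for illustration.
records = [
    {"score": 10.0, "weight": 0.5},  # from an oversampled group
    {"score": 10.0, "weight": 0.5},  # from an oversampled group
    {"score": 20.0, "weight": 2.0},  # from a group sampled at a lower rate
]

def unweighted_mean(rows, var):
    """Simple average that ignores the sampling design."""
    return sum(r[var] for r in rows) / len(rows)

def weighted_mean(rows, var, weight_var):
    """Average in which each row counts in proportion to its weight."""
    total_weight = sum(r[weight_var] for r in rows)
    return sum(r[var] * r[weight_var] for r in rows) / total_weight

plain = unweighted_mean(records, "score")
weighted = weighted_mean(records, "score", "weight")
print(plain, weighted)
```

In this toy example the two means differ because the oversampled rows are down-weighted. Standard errors for weighted estimates need survey-aware methods, which this sketch does not cover.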
The interview covers a wide range of topics, as we assessed motor, social-emotional, linguistic, cognitive, health and other indicators of the socio-psychological well-being of the primary caregiver and child(ren). We also collected information on factors that potentially shape the child's well-being, child development and family resilience, such as early childcare arrangements, preschool attendance, time and technology use, parenting behaviors and practices, neighborhood environment, financial and non-monetary investment in children, mother's and father's roles, and family stress.
Our survey also includes a pen-and-paper cognitive assessment conducted with each target child.
For more detailed information on questionnaire constructs, head to our Documentation page.
The assessments are administered in English. Target children aged 3 years and above who can either speak or understand English participate in the assessments.
Some cognitive assessments included Woodcock-Johnson Tests, Delay of Gratification, Digit Span, etc.
The Woodcock-Johnson Test of Achievement IV (WJ ACH IV) provides a normed set of tests for academic achievement. Four subsets of the test are administered to SG LEADS children aged 3 and above in each wave: the Letter-Word Identification, the Passage Comprehension, the Applied Problems and the Calculation tests. These subsets can be used individually, or in the case of the four subscales, combined to create scores for Broad Reading and Broad Math. See either Study Guide for a detailed description.
The WJ tests have standardized administrative and scoring protocols. The tests are designed to provide a normative score that shows the target child's reading and math abilities in comparison to the national average for the child's age. Singapore (SG) normed scores are constructed based on the child's raw score on the test (essentially the number of correct items completed) and the child's age to the nearest month. Raw scores are charted on normative tables based on the child's age and the percentile the child falls into. More information on SG norming is provided in the Technical Reports.
For children aged 12 to 30 months: Communicative Development Inventory (Language Skill)
For children aged 3 years and below: Temperament, Prosocial behavior, Self-control
For children aged 3 years and above: Woodcock-Johnson Test of Achievement (Achievement), Behavior Problem Index, Digit Span (Working Memory), Delay of Gratification
For all children: Height and Weight
Dataset FAQ
The questionnaire includes 4 booklets: 1. Household Information Form (Screener), 2. PCG-Household Booklet (HB), 3. Child Booklet (CB), 4. Child Assessment (CA).
The primary caregiver of the child(ren) answers one Household Information Form (screener) and one PCG-Household Booklet (HB). The primary caregiver also responds to one Child Booklet (CB) for each participating child. If two children in the same household participated in the SG LEADS study, there will be two different sets of Child Booklet and Child Assessment, each corresponding to one child. (Refer to the data setup documentation for more details.)
The Household Information Form (screener) and PCG-Household Booklet (HB) are organized at the household level, while the Child Booklet (CB) and Child Assessment (CA) are organized at the child level. The cross-sectional merged dataset that combines all four booklets is organized at the child level. To select household-level data from a cross-sectional merged dataset, use “unique_W1” or “unique_W2”. In both waves, unique=0 means the child is the only child interviewed in the household. Unique=1 and unique=2 mean the first and second of the two children interviewed in the household, respectively. Therefore, use “unique<=1” to select the household-level data in the merged dataset, which keeps exactly one record per household. Note that the interview order does not necessarily match the birth order of the children.
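The “unique<=1” rule can be sketched in a few lines of Python. This is only an illustration on made-up rows: the household ID 11456 is invented, and in practice the same filter would be applied in your statistics software.

```python
# Hypothetical child-level rows from a merged Wave 1 file.
# unique_W1: 0 = only child interviewed, 1 = first of two, 2 = second of two.
rows = [
    {"HHID_W1": 11123, "unique_W1": 1},
    {"HHID_W1": 11123, "unique_W1": 2},
    {"HHID_W1": 11456, "unique_W1": 0},
]

# Keep one row per household: the only child (0) or the first child (1).
household_rows = [r for r in rows if r["unique_W1"] <= 1]

household_ids = [r["HHID_W1"] for r in household_rows]

# Sanity check: each household now appears exactly once.
assert len(household_ids) == len(set(household_ids))
print(household_ids)
```

After filtering, household-level variables (for example those with the “HB_” prefix) can be analysed without counting two-child households twice.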
In Wave 1, each household was assigned a unique Household ID (HHID_W1, for example 11123), and each child in a household was assigned a unique Child ID (CHID_W1), which consists of HHID_W1 and the child interview order (e.g., 11123child1). These two identifiers are carried over to every subsequent wave.
From Wave 2 onwards, there are also wave-specific Household IDs and Child IDs. If a two-child household split up into 2 households, each with 1 SG LEADS child, these two “split-off” households would share the same baseline Household ID (HHID_W1) and the children would carry their Wave 1 Child IDs. However, these two households and their children would be assigned different wave-specific IDs.
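One way to spot split-off households is to look for a baseline Household ID that maps to more than one wave-specific Household ID. The Python sketch below assumes the Wave 2 household identifier is called `HHID_W2`, which is a guess at the name and should be checked against the codebook. All ID values other than 11123 are invented.

```python
from collections import defaultdict

# Hypothetical Wave 2 child-level rows. HHID_W2 is an ASSUMED name for the
# wave-specific household ID; check the codebook for the actual variable.
rows = [
    {"CHID_W1": "11123child1", "HHID_W1": 11123, "HHID_W2": 21001},
    {"CHID_W1": "11123child2", "HHID_W1": 11123, "HHID_W2": 21002},
    {"CHID_W1": "11456child1", "HHID_W1": 11456, "HHID_W2": 21003},
]

# Collect the distinct wave-specific household IDs under each baseline ID.
wave2_ids = defaultdict(set)
for r in rows:
    wave2_ids[r["HHID_W1"]].add(r["HHID_W2"])

# A baseline household linked to more than one Wave 2 household has split.
split_households = sorted(h for h, ids in wave2_ids.items() if len(ids) > 1)
print(split_households)
```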
Variables from different booklets have different prefixes in their variable names (refer to the Data Structure documentation). Variables from the Household Information Form (Screener) carry the prefix “S_”. Variables from the Primary Caregiver – Household Booklet start with “HB_”. The prefix “CB_” is for variables in the Child Booklet, and “OB_” is for interviewer observation variables at the end of the Child Booklet. Variables with the prefix “CA_” refer to the Child Assessment.
Variables are generally named by their question number in the questionnaire. For example, Question A1 in the household booklet (the primary caregiver’s (PCG’s) gender) is named HB_A1 (refer to the questionnaire for questions and their numbers).
The W1 variable names have a suffix of “_W1”, and W2 variables have a suffix of “_W2”. For example, HB_A1_W1 is the gender of W1 PCG, and HB_A1_W2 is the gender of W2 PCG.
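The naming convention above (booklet prefix, question number, wave suffix) can be sketched as a pair of small helper functions in Python. The helpers `variable_name` and `split_variable_name` and the dictionary keys are hypothetical, introduced only for illustration. Constructed variables may not follow this pattern.

```python
# Booklet prefixes as described in the text above.
PREFIXES = {
    "screener": "S",
    "household_booklet": "HB",
    "child_booklet": "CB",
    "interviewer_observation": "OB",
    "child_assessment": "CA",
}

def variable_name(booklet, question, wave):
    """Build a name such as HB_A1_W1 from booklet, question number and wave."""
    return f"{PREFIXES[booklet]}_{question}_W{wave}"

def split_variable_name(name):
    """Split a name into (prefix, question number, wave suffix)."""
    prefix, rest = name.split("_", 1)
    question, wave = rest.rsplit("_", 1)
    return prefix, question, wave

pcg_gender_w1 = variable_name("household_booklet", "A1", 1)
pcg_gender_w2 = variable_name("household_booklet", "A1", 2)
parts = split_variable_name("HB_A1_W2")
print(pcg_gender_w1, pcg_gender_w2, parts)
```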
Users can go through the questionnaire to look for questions of interest (e.g., Wave 1 household booklet question number B1: How long have you lived in current neighborhood?). Use the prefix, question number and wave suffix to search for variables of interest (e.g., HB_B1_W1). Alternatively, search for keywords in the variable list to locate the variables of interest. For example, searching for “neighbor” in the variable list returns a list of variables relating to the neighborhood (e.g., HB_B1_W1, HB_B1_W2).
SG LEADS staff also cleaned and created some variables for ease of use (e.g., total family income). Users can search for keywords either in the variable list in their statistics software or in the codebook for constructed variables. Codebooks provide the variable name, label and coding scheme for each variable.
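A keyword search of this kind might look like the following Python sketch. The variable list is a small made-up sample: the names follow the convention described above, but the label wording and the helper `search_variables` are invented for illustration.

```python
# Hypothetical variable list mapping names to labels. The label text is
# invented; real labels are given in the codebooks.
variable_labels = {
    "HB_A1_W1": "PCG gender",
    "HB_B1_W1": "Years lived in current neighborhood",
    "HB_B1_W2": "Years lived in current neighborhood",
}

def search_variables(labels, keyword):
    """Return names whose name or label contains the keyword (case-insensitive)."""
    key = keyword.lower()
    return sorted(
        name
        for name, label in labels.items()
        if key in name.lower() or key in label.lower()
    )

matches = search_variables(variable_labels, "neighbor")
print(matches)
```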
Weights at the household level and child level have been created (refer to the sampling weight technical report for details of weight construction). Users should use the weight that is consistent with the level of analysis (e.g., the household-level weight for household-level analysis). In Wave 2, longitudinal weights have also been created.
For cross-sectional analysis, please use the weight created for that specific wave.
For longitudinal analysis of participants who took part in all waves, use the longitudinal weights from the latest wave.
SG LEADS staff have created a list of secondary/constructed variables that users can use directly in their analysis. These constructs include basic family demographic variables (biological parents’ information, the head of household’s information), and some constructed scales (e.g., the externalizing and internalizing behaviour problem index). Generally, the constructed variables are placed after the raw variables used to create them. Refer to the construct variable codebook for details.
If constructed variables are available, we recommend using the constructed variables rather than the raw variables. When creating these constructs, SG LEADS staff took into consideration different aspects of the information available in the dataset. For a few constructed scales, a technical report describing the construction process is available (e.g., income constructs, behavior problems index).