Structural equation model framework

  • Method

This technique is useful for describing unmeasured variables or concepts, such as pathogen transmission or cognitive ability, using available quantitative measures: handwashing practices and household toilet access for the former, or memory performance and visual responses to stimuli for the latter. These measured predictor variables serve to quantify the association between an unmeasured variable and a causal outcome of interest. Ki has applied this methodology to its extensive data resources to confirm and explore causal pathways, using measured predictors and assumed pathways to determine their influence on healthy growth and development.


SEM is not a single technique, but a general framework that integrates several multivariate techniques into a single model.[1]

SEM is defined as a “path analysis using latent variables.”[1]

Path analysis, or a structural model, is a visual representation of the model, including regression equations between measured variables (see specific notation and example in Figure 1) and a specified causal order.
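
As a rough illustration of the regression equations behind a path diagram, the sketch below fits a three-variable path model (x → m → y, plus a direct path x → y) as two ordinary least-squares regressions. The variable names, the simulated data, and the path values 0.5, 0.7, and 0.2 are assumptions made for this example; they do not come from Figure 1 or from any Ki analysis.

```python
import numpy as np


def fit_path_model(x, m, y):
    """Estimate the path model x -> m -> y (with a direct path x -> y)
    using one least-squares regression per equation."""
    ones = np.ones_like(x)
    # Equation 1: m = a0 + a*x + e1
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    # Equation 2: y = b0 + b*m + c*x + e2
    coefs = np.linalg.lstsq(np.column_stack([ones, m, x]), y, rcond=None)[0]
    b, c = coefs[1], coefs[2]
    # The indirect effect of x on y is the product of the two paths it travels.
    return {"a": a, "b": b, "direct": c, "indirect": a * b}


# Simulated (hypothetical) data that follow the specified causal order.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.7 * m + 0.2 * x + rng.normal(size=n)

paths = fit_path_model(x, m, y)
print(paths)
```

Dedicated SEM software estimates all equations simultaneously rather than one at a time; the equation-by-equation fit here is only meant to show what each arrow in a path diagram stands for.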

Latent variables are theoretical constructs that are not directly measured, such as dietary intake.[2]

  • Multiple measured variables can be used to construct a latent variable.
    • A measured variable comprises a true score and error. Error may be systematic (bias) or random (chance fluctuation with no consistent direction). Including multiple measured variables yields a better estimate of both the true score and the error.
  • Parameter values (factor loadings) estimate the correlation between the latent variable and each selected measured variable (and its error). A measured variable whose standardized parameter value is close to one is highly correlated with the latent variable and is therefore considered a good indicator of it.
  • For example, to describe childhood dietary intake as a predictor of growth impairment, a parent may be asked to complete a food diary detailing the types of foods consumed, calories per meal, and supplemental vitamins consumed. The SEM can then test how well the measured variables documented in the food diary relate to childhood dietary intake, the unmeasured latent variable.
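
The idea that several measured variables capture a true score better than one can be sketched with simulated data. In the example below, each indicator is the same latent true score plus independent random error, and a simple average of three indicators is compared with a single indicator. The sample size, error variance, and number of indicators are assumptions chosen for illustration, and averaging is a simplification: an actual SEM estimates a separate parameter and error term for each indicator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Latent true score (unobservable in practice; simulated here for illustration).
true_score = rng.normal(size=n)

# Three measured variables: each is the true score plus independent random error.
indicators = np.column_stack(
    [true_score + rng.normal(scale=1.0, size=n) for _ in range(3)]
)

# Correlation of the true score with one indicator versus the average of all three.
single_r = np.corrcoef(true_score, indicators[:, 0])[0, 1]
composite_r = np.corrcoef(true_score, indicators.mean(axis=1))[0, 1]

print(f"single indicator r = {single_r:.2f}")
print(f"three-indicator composite r = {composite_r:.2f}")
```

Because the random errors partly cancel when indicators are combined, the composite tracks the true score more closely than any single indicator does.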

SEM provides confirmatory (hypothesis testing) or exploratory (hypothesis generating) analyses.

Confirmatory factor analysis models are imposed on the data and aim to estimate parameters and assess fit of the model to the data.[2]

Statistical methods[3]

  • The raw data are not incorporated into the model, but rather the variance and covariance of observed data are used to construct a matrix.
  • The analysis of the variance/covariance matrix (Figure 2) assesses the model’s ability to summarize the observed matrix.
    • The variance (σ²) of each variable (depicted along the diagonal of the matrix) and the covariance between each pair of measured variables make up the cells in the matrix.
    • If the hypothesized model is true, the observed matrix should closely match the model-implied matrix.

FIGURE 2. Example variance-covariance matrix
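
The comparison between an observed and a model-implied matrix can be sketched as follows. The example assumes a hypothetical one-factor model in which three measured variables load on a single latent variable with unit variance, so the implied matrix is the outer product of the loadings plus a diagonal matrix of error variances. The loadings (0.8, 0.7, 0.6) and the simulated data are illustrative assumptions, not values from Figure 2.

```python
import numpy as np

# Hypothetical one-factor model: three measured variables, one latent variable.
loadings = np.array([0.8, 0.7, 0.6])   # assumed illustrative values
error_var = 1.0 - loadings**2          # chosen so each variable has variance 1

# Model-implied matrix, with the latent variance fixed to 1:
# Sigma = loadings * loadings' + diag(error variances)
implied = np.outer(loadings, loadings) + np.diag(error_var)

# Simulate raw data that follow the model.
rng = np.random.default_rng(2)
n = 5000
latent = rng.normal(size=n)
errors = rng.normal(size=(n, 3)) * np.sqrt(error_var)
data = latent[:, None] * loadings + errors

# Observed matrix: variances on the diagonal, covariances off the diagonal.
observed = np.cov(data, rowvar=False)

# Residuals near zero suggest the model reproduces the observed matrix well.
residuals = observed - implied
print(np.round(observed, 2))
print(np.round(residuals, 2))
```

Here the data were generated from the model itself, so the residuals reflect only sampling variation; with real data, large residuals would point to relationships the model fails to capture.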

  • Maximum likelihood estimation (MLE) estimates the unknown model parameters by maximizing a function based on the probability of the sample data[3] (see the “Distributions” section of the “Categorical” Sheet).
  • SEM incorporates traditional statistical models, such as regression, factor, and complex path models.[2]
  • Statistical software used for SEM utilizes complex algorithms to maximize the fit of the model to the data.
  • There is an implied structure for covariances between observed variables.[2]
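
One common way to express how well an implied matrix reproduces the observed one is the maximum likelihood discrepancy function, F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p, where S is the observed matrix, Σ the model-implied matrix, and p the number of measured variables. The sketch below evaluates this function for an assumed 3 × 3 observed matrix; the matrix values and the function name `ml_discrepancy` are hypothetical, and specific SEM packages may scale or extend the function differently.

```python
import numpy as np


def ml_discrepancy(observed, implied):
    """Maximum likelihood discrepancy between an observed covariance matrix S
    and a model-implied matrix Sigma: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = observed.shape[0]
    _, logdet_implied = np.linalg.slogdet(implied)
    _, logdet_observed = np.linalg.slogdet(observed)
    trace_term = np.trace(observed @ np.linalg.inv(implied))
    return logdet_implied + trace_term - logdet_observed - p


# Assumed observed matrix for three standardized measured variables.
observed = np.array([
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
])

# A model that reproduces the observed matrix exactly has zero discrepancy.
perfect_fit = ml_discrepancy(observed, observed)

# A model that implies no covariance between variables fits worse.
independence = ml_discrepancy(observed, np.diag(np.diag(observed)))

print(perfect_fit, independence)
```

Estimation software searches for the parameter values whose implied matrix makes this discrepancy as small as possible, which is equivalent to maximizing the likelihood under the normality assumption.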

Advantages of SEM

  • SEM depicts systems of relationships with numerous dependent variables within a causal system.
  • Error terms for each observed variable enable SEM to correct for measurement error.[3]
  • Latent variables extend the model to complex causal relationships that involve unmeasured variables. Describing each latent variable with multiple quantitative measures also reduces error.
  • Maximum likelihood estimation (MLE) is a statistically efficient method that delivers unbiased estimates in large samples.[3]
    • These properties hold when the data are continuous and normally distributed, which MLE requires.
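
To show why correcting for measurement error matters, the sketch below simulates a predictor measured with random error and compares the regression slope from the error-laden measure with a slope corrected by the measure's reliability. All values are assumptions for illustration. In particular, the reliability is treated as known here, whereas an SEM estimates the error variance from multiple measured variables.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

# Hypothetical true predictor and outcome (true slope = 0.6).
true_x = rng.normal(size=n)
y = 0.6 * true_x + rng.normal(scale=0.5, size=n)

# The predictor is observed with random measurement error of variance 1.
observed_x = true_x + rng.normal(scale=1.0, size=n)


def slope(x, y):
    """Simple regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Ignoring measurement error shrinks the slope toward zero (attenuation).
naive = slope(observed_x, y)

# Reliability = true variance / observed variance (assumed known: 1 / (1 + 1)).
reliability = 1.0 / (1.0 + 1.0)
corrected = naive / reliability

print(f"naive slope = {naive:.2f}, corrected slope = {corrected:.2f}")
```

The naive slope is pulled toward zero by roughly the reliability factor, while the corrected slope recovers a value close to the true 0.6.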

Disadvantages of SEM

  • The limitations of an SEM depend on the statistical models incorporated into the framework.

FIGURE 1. Notation for a Path Diagram & an Example Path Diagram. The example diagram explores the association of the latent variable dietary intake with growth.

  • Causal interpretation of SEM results is common even when non-experimental data are used. Causal conclusions cannot be drawn from correlational data alone.[2] Causality can be inferred only when experimental data are used.


  1. Sturgis P. Structural Equation Modeling: What is it and what can we use it for? (part 1 of 3) [Online]. 2016. Available from: Accessed 26 Nov 2017.
  2. Hox J, Bechger T. An Introduction to Structural Equation Modeling. Family Science Review. 11:354-73.
  3. Sturgis P. Key ideas, terms & concepts in Structural Equation Modeling (part 2 of 3) [Online]. 2016. Available from: Accessed 26 Nov 2017.


Last Updated

October, 2020