A common reason for a poor fit is a mismatch between your model and the experimental data you have provided. The source of this mismatch may be either the model or the data. Check basic items such as:
- you have entered the correct data for each set of scenario inputs / experiments and have not mixed up your scenarios and data
- your data make sense in the context of the scenario inputs; e.g. the data do not indicate 1 kg of filtrate when there was only 10 g of mother liquor (or vice versa)
- your model contains the correct types of phases and rates to describe your system (e.g. liquids, solids, feed, mass transfer, etc.).
Beyond these basic aspects, the quality of your fit can be assessed in many ways. Two of the most common and useful are:
Visual inspection: The default Data versus Model view in Fitting helps you judge by eye whether the model corresponds well to reality. Even better, running the model with the updated parameter values in Simulator gives a good indication of the quality of the fit (i.e. do the lines go near or through the datapoints?). This is an important first check of the fit and shows any obvious problems (e.g. flat species profiles, which can occur at a local or 'false' minimum).
Review the Fitting results tab and Fitting Report: Even if the fit looks good visually, it is advisable to check the numbers in the Fitting results display and the Fitting Report. In particular:
Check for large confidence intervals on parameters. A large interval means that a wide range of parameter values can fit the data reasonably well and there is no 'sharp minimum'. Large confidence intervals can imply:
- Too many parameters are being fitted (many combinations of parameter values give an equally good fit). If this is the case, some values in the correlation matrix will probably be near 1 (or near -1). The correlation matrix will indicate which pairs of parameters are correlated, and it may help to introduce different parameters to be fitted (e.g. try creating a new parameter that is the ratio of two correlated parameters).
- Problems with the data (e.g. several outliers are distorting the fit). The confidence intervals are based on the assumption that the residuals are normally distributed. If this is not the case (e.g. if several outlier data points are included), the confidence intervals will be inaccurate (too wide). Identify possible outlier points, de-select them and re-fit.
- Incomplete reaction mechanism or process description. If the reaction scheme (or other rate processes you have included) does not adequately describe the data (e.g. trying to fit a first order reaction when the actual reaction is second order), it is possible that significant mechanisms or phenomena have been left out of the model or lumped together. This can often be seen by viewing the residuals (using the data analysis functions) in Simulator. If the residuals show a systematic trend (i.e. not randomly spread around the time axis) then it is likely that the parameter list (e.g. reaction mechanism, or other set of rate processes on the Process sheet) is incomplete.
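Two of the checks above can be sketched outside the software in a few lines of Python. This is an illustration only, not how Fitting or Simulator computes its results: the function names, the 0.9 threshold, and the covariance and residual values are all made up for the example. The first part converts a parameter covariance matrix into a correlation matrix and flags pairs with a correlation near 1 or -1. The second part counts sign runs in a residual series; far fewer runs than expected for random scatter suggests a systematic trend.

```python
import math

def correlation_matrix(cov):
    """Convert a parameter covariance matrix into a correlation matrix."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]

def strongly_correlated_pairs(corr, threshold=0.9):
    """Return (i, j, r) for parameter pairs with |r| >= threshold (assumed cutoff)."""
    pairs = []
    for i in range(len(corr)):
        for j in range(i + 1, len(corr)):
            if abs(corr[i][j]) >= threshold:
                pairs.append((i, j, corr[i][j]))
    return pairs

def sign_runs(residuals):
    """Return (observed, expected) number of sign runs in time-ordered residuals.

    A run is a stretch of consecutive residuals with the same sign. The expected
    count is the mean of the runs-test distribution for randomly ordered signs.
    """
    signs = [r > 0 for r in residuals if r != 0]
    if not signs:
        return 0, 0.0
    observed = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    expected = 2.0 * n_pos * n_neg / len(signs) + 1.0
    return observed, expected

# Hypothetical 2-parameter covariance matrix (illustrative values only).
cov = [[4.0, 5.7],
       [5.7, 9.0]]
corr = correlation_matrix(cov)
flagged = strongly_correlated_pairs(corr)

# Hypothetical residuals (data minus model) in time order, drifting from + to -.
trend_residuals = [0.5, 0.4, 0.2, -0.1, -0.3, -0.4]
runs_observed, runs_expected = sign_runs(trend_residuals)

print("Correlated parameter pairs:", flagged)
print("Sign runs observed / expected:", runs_observed, "/", runs_expected)
```

In this made-up case the two parameters are flagged as strongly correlated, and the residuals show fewer runs than random scatter would give, which points towards a missing or lumped rate process rather than noise.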
Check the user-defined or assumed errors on the individual datapoints, as reported in the Fitting Report. Compare these to the measured values of the responses and judge whether they are reasonable.
- If you have supplied the errors, are they too large or too small? Should they be relative or absolute? Each of these possibilities affects the calculation of chi-squared and the reported Goodness of Fit statistic.
- If you have used Weighted or Unweighted SSQ, the error values give a rough indication of the likely predictive accuracy of your model. Is this level of accuracy tight enough?
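To see why the choice between absolute and relative errors matters, the sketch below evaluates the textbook chi-squared sum, in which each residual is divided by the error on its datapoint, for the same data under both choices. It assumes the standard weighted least-squares definition; the exact statistic reported by the software may be defined differently, and all names and numbers here are hypothetical.

```python
def chi_squared(observed, predicted, sigmas):
    """Sum of squared residuals, each scaled by the error on its datapoint."""
    return sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigmas))

def absolute_sigmas(observed, abs_error):
    """Same error (in measurement units) on every datapoint."""
    return [abs_error for _ in observed]

def relative_sigmas(observed, rel_error, floor=1e-12):
    """Error proportional to each measured value; floor avoids division by zero."""
    return [max(abs(o) * rel_error, floor) for o in observed]

def reduced_chi_squared(chi2, n_points, n_params):
    """Chi-squared per degree of freedom; values near 1 suggest consistent errors."""
    return chi2 / (n_points - n_params)

# Hypothetical measured and model-predicted responses (illustrative values only).
observed = [10.0, 5.0, 1.0]
predicted = [9.0, 5.5, 1.2]

chi2_abs = chi_squared(observed, predicted, absolute_sigmas(observed, 0.5))
chi2_rel = chi_squared(observed, predicted, relative_sigmas(observed, 0.10))

print("chi-squared, absolute error 0.5:", round(chi2_abs, 3))
print("chi-squared, relative error 10%:", round(chi2_rel, 3))
print("reduced (1 fitted parameter):",
      round(reduced_chi_squared(chi2_abs, len(observed), 1), 3))
```

With an absolute error the largest measurement dominates the sum, whereas with a relative error the smallest measurement contributes most, so the same model can look better or worse depending on how the errors were specified.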