Nonlinear regression is a statistical model that fits an equation to observed data using a generated line. Whereas linear regression uses a straight-line equation (such as Y = c + mx), nonlinear regression shows the association using a curve, making the model nonlinear in its parameters.
A simple nonlinear regression model is expressed as follows:
Y = f(X,β) + ϵ
X is a vector of P predictors
β is a vector of k parameters
f(⋅) is the known regression function
ϵ is the error term
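As a minimal sketch of fitting Y = f(X, β) + ϵ in practice, the example below uses SciPy's `curve_fit` on synthetic data; the exponential mean function is a hypothetical choice made purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, b0, b1):
    # Nonlinear in b1: the derivative with respect to b1 depends on b1 itself
    return b0 * np.exp(b1 * x)

rng = np.random.default_rng(0)
x = np.linspace(0, 2, 50)
y = f(x, 2.0, 1.5) + rng.normal(0, 0.1, size=x.size)  # add the error term eps

# p0 supplies starting values for the iterative fit
beta_hat, cov = curve_fit(f, x, y, p0=[1.0, 1.0])
```

The returned `beta_hat` should land close to the true parameters (2.0, 1.5) used to generate the data.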
Alternatively, the model can also be written as follows:
Yi = h(xi(1), xi(2), … , xi(m); θ1, θ2, …, θp) + εi
Yi is the response variable
h is the regression function
xi(1), …, xi(m) are the inputs
θ1, …, θp are the parameters to be estimated
εi is the error term
Since each parameter can be evaluated individually to determine whether it enters the model linearly or nonlinearly, a given function Yi can include a mix of linear and nonlinear parameters. The model counts as nonlinear because the function h cannot be written as linear in its parameters; instead, its form is deduced from theory.
The term “nonlinear” refers to the parameters in the model, as opposed to the independent variables. Unlimited possibilities exist for describing the deterministic part of the model. Such flexibility provides a good ground on which to make statistical inferences.
The goal of fitting the model is to minimize the sum of squared residuals using iterative numerical procedures. The parameters are estimated by the principle of least squares, which measures how far the observations deviate from the fitted curve. It is also worth noting that linear and nonlinear regression models differ in how the least squares estimates are calculated: a linear model has a closed-form solution, while a nonlinear model must be solved iteratively.
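The iterative minimization of the residual sum of squares can be sketched with SciPy's `least_squares`; the exponential model and the small data set below are hypothetical, chosen only to show the mechanics.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([2.1, 4.3, 9.2, 19.8, 40.5])  # made-up observations

def residuals(theta):
    b0, b1 = theta
    return y - b0 * np.exp(b1 * x)          # observed minus fitted values

# x0 supplies starting values; the solver iterates to shrink the squared residuals
fit = least_squares(residuals, x0=[1.0, 1.0])
sse = np.sum(fit.fun ** 2)                  # minimized sum of squared residuals
```

The solver stops when further iterations no longer reduce `sse` meaningfully, which is exactly the "iterative numerical procedure" described above.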
Summary
- Nonlinear regression is a mathematical function that uses a generated line, typically a curve, to fit an equation to some data.
- The sum of squares, computed from the difference between each data point and the mean, is used to determine how well a regression model fits the data.
- Nonlinear regression models are used because of their ability to accommodate different mean functions.
How to Calculate the Sum of Squares
The sum of squares is calculated by first computing the difference between every data point and the mean of the data set. Each difference is then squared, and all the squared figures are summed. The sum of squares indicates how well a model fits the data: as a rule, the smaller the sum of the squared values, the better the model fits the data set.
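The three steps above (difference from the mean, square, sum) can be written out directly; the five data points are made up for illustration.

```python
import numpy as np

data = np.array([4.0, 7.0, 5.0, 9.0, 10.0])
mean = data.mean()                  # 7.0
deviations = data - mean            # difference between each point and the mean
total_ss = np.sum(deviations ** 2)  # square each difference, then sum
print(total_ss)                     # 26.0
```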
Estimating how well the curve fits involves determining the goodness of fit using the computed least squares. It is premised on the idea that the smaller the differences between the curve and the data points, the better the curve fits the data.
The similarity between nonlinear and linear regression is that both models seek to determine the robustness of predictability from a set of variables graphically. However, it is more challenging to develop a nonlinear model, given that its parameters are estimated iteratively, often through trial and error. Several established algorithms, such as Levenberg-Marquardt and Gauss-Newton, are used to fit nonlinear models.
Sometimes, data that fits a linear regression model appears nonlinear at first glance. A curve estimation approach identifies the nature of the functional relationship at play in a data set, so that the correct model, whether linear or nonlinear, can be chosen for the functional association.
While a linear regression model forms a straight line, it can also create curves depending on the form of its equation. Similarly, a nonlinear regression equation can be transformed to mimic a linear regression equation using algebra.
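As a sketch of such an algebraic transformation, the nonlinear equation y = a·exp(b·x) becomes linear in its parameters after taking logarithms: ln(y) = ln(a) + b·x, which ordinary linear least squares can then fit. The numbers below are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * np.exp(0.5 * x)              # noiseless data, for clarity

# Fit a straight line to (x, ln y): slope = b, intercept = ln(a)
b, ln_a = np.polyfit(x, np.log(y), 1)
a = np.exp(ln_a)
print(a, b)                            # recovers a = 2.0, b = 0.5
```

Note that this transformed fit minimizes squared errors on the log scale, which is not identical to least squares on the original scale when the data are noisy.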
Applications of Nonlinear Regression
Overall, a nonlinear regression model is used to accommodate different mean functions, even though it can be harder to fit than a linear regression model. Some of its advantages include prediction, parsimony, and interpretability. Financial forecasting is one way that nonlinear regression can be applied.
A scatterplot of changing financial prices over time shows an association between changes in prices and time. Because the relationship is nonlinear, a nonlinear regression model is the best model to use.
A logistic price change model can provide estimates of the market prices that were not measured, as well as a projection of future changes in market prices. The majority of financial and macroeconomic time series show different features over time, depending on the state of the economy.
To illustrate, recessions versus expansions, bull versus bear stock markets, and low versus high volatility are some of the dual regimes that call for nonlinear models in economic time series data. Such nonlinear time series with dual regimes, commonly referred to as state-dependent models, include regime-switching, smooth transition, and threshold models.
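A logistic price change model of the kind described above can be sketched by fitting a logistic curve to a synthetic price series; the series, the parameter values, and the forecast horizon below are all hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    # L: ceiling price level, k: growth rate, t0: midpoint in time
    return L / (1.0 + np.exp(-k * (t - t0)))

t = np.linspace(0, 10, 60)
rng = np.random.default_rng(2)
prices = logistic(t, 100.0, 1.2, 5.0) + rng.normal(0, 1.0, t.size)

# Fit the curve, then project a price beyond the observed window
params, _ = curve_fit(logistic, t, prices, p0=[80.0, 1.0, 4.0])
forecast = logistic(12.0, *params)
```

The fitted curve interpolates prices at unobserved times and extrapolates the series toward its estimated ceiling `L`.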
Accurate specification and description of the relationship between the dependent and independent variables guarantees accurate results from a nonlinear regression. Also, given that poor starting values may produce a non-convergent model, good starting values are necessary. Most often, nonlinear regression uses quantitative dependent and independent variables.