Phase Space Reconstruction

Introduction

In other posts, we’ve conducted analyses and experiments on the assumption that nonlinear systems can, at least to some degree, be deterministic; that is, that their distributional properties show some coherence over time. The concept this post tackles is Phase Space Reconstruction. This method forms the basis for several nonlinear analysis techniques such as Recurrence Quantification Analysis (RQA) and Empirical Dynamic Modelling (EDM). The rest of this post covers the steps of phase space reconstruction, the underlying theories and assumptions that support it, and code outputs along with graphs to visualise these arguments. As with previous posts, these concepts and methods are tested against a predator-prey dataset, shown in Figure 1.

Figure 1: Predator-prey full dataset

You can find the full dataset, code and figures on our GitHub page. For this post we will only focus on the graphs from the ‘sheep’ time-series, but all three of them show similar properties.

When analysing time-series, we seek to uncover the dynamics of a system, but we can’t do that without making some assumptions about the distribution of the data. To properly explain the concept of determinism, consider a system for which we know (or can determine) all of its future states once the present state is known; this is called a purely deterministic system (Kantz & Schreiber, 2004). Soft-assembled dynamical systems are not fully deterministic, so simple rules and equations do not fully capture the interactions between a complex system’s component elements (Properties of Dynamical Systems). Reconstructing a system’s states in the phase space is an effective method to uncover these ‘governing laws’.

Takens’ Theorem

The reconstruction of a phase space in Cognitive Science is partly based on a prominent theorem posed by Takens (1981): the time series of one signal of a system contains information about the other signals in the whole system. If what Takens states is true (let’s assume that it is), the ‘sheep’ variable can be chosen to reconstruct the whole ecosystem in the phase space. We hypothesise that the ‘sheep’ variable will do the best job at reconstructing the phase space, because it is directly influenced by both the ‘wolves’ and ‘grass’ variables. The reconstruction ensures that, under certain conditions, we obtain an (almost) accurate depiction of the qualitative and quantitative properties of the original time-series (Takens, 1981).

This is feasible, provided that the method of delays is applied correctly; i.e., that we properly estimate or choose two parameters:

  • The time delay τ (tau)
  • The embedding dimension d

The output of such a reconstruction is a matrix of d columns, where each column is the original series shifted by a multiple of the lag τ. An improper choice of these two parameters can lead to a misrepresentation of the system’s properties, and even a false indication of nonlinearity when the system is actually linear (Garcia & Almeida, 2005). Figure 2 shows a plot of such a matrix with one embedding dimension on each axis (retrieved from Sayed et al., 2017).

Figure 2: Phase Space Reconstruction (Sayed et al., 2017)
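
As a small illustration of how such an embedding matrix can be built by hand, here is a minimal sketch in base R. The vector `sheep` and the helper name `delay_embed` are ours for illustration, not part of any package, and the parameter values are arbitrary:

```r
# Minimal sketch of a delay embedding: d columns, each the series shifted
# by a multiple of tau. Assumes `sheep` is a numeric vector of the series.
delay_embed <- function(x, d, tau) {
  n <- length(x) - (d - 1) * tau  # number of reconstructed state vectors
  cols <- sapply(0:(d - 1), function(k) x[(1 + k * tau):(n + k * tau)])
  colnames(cols) <- paste0("lag_", (0:(d - 1)) * tau)
  cols  # n x d matrix: one reconstructed state per row
}

head(delay_embed(sheep, d = 3, tau = 2))  # illustrative values for d and tau
```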

To optimise the time delay τ and embedding dimension d, a number of statistical tools have proven effective, such as (Average) Mutual Information, autocorrelation, False Nearest Neighbours (FNN), high-order correlations, average displacement (AD), etc. (Wallot & Mønster, 2018). This post uses a total of four formal methods to compute the optimal time delay and embedding dimension.

Optimising tau (τ)

In order to find the right time delay τ, we need to find the first lag at which the series and its delayed copy are maximally independent of each other. For this, two options exist:

  • The levelling-off of the Average Mutual Information (AMI) function
  • The first zero-crossing of the Autocorrelation Function (ACF)

In the most basic explanation, the AMI quantifies the amount of information two random variables hold about each other (Farvardin, 2003). In our case this is between the original time-series and the same series shifted by different values of tau. The point at which the function levels off (or reaches its first minimum) therefore indicates the time delay at which the shifted series holds the least information about the present time point (Fraser & Swinney, 1986). This method is popular because it also captures nonlinear correlations.
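
Formally, following Fraser & Swinney (1986), the AMI at lag τ can be written as

I(\tau) = \sum_{i,j} p_{ij}(\tau) \log_2 \frac{p_{ij}(\tau)}{p_i \, p_j}

where p_i is the probability that x_t falls in bin i of a discretised version of the series, and p_{ij}(τ) is the joint probability that x_t falls in bin i while x_{t+τ} falls in bin j.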

Figure 3: Average Mutual Information plot for ‘sheep’ time-series

Figure 3 gives the Average Mutual Information (AMI) for the ‘sheep’ time-series, plotted for lags up to 50. In principle, the point on the x-axis where the AMI value (y-axis) starts to level off, at lag 14, is the optimal time delay τ (Kantz & Schreiber, 2004). The timeLag function in R with the ‘ami’ technique marks this point with a green dotted line.
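
As a sketch, the call looks roughly like this, assuming the `nonlinearTseries` package (which provides timeLag) and a numeric vector `sheep` holding the series:

```r
library(nonlinearTseries)

# Time delay from the AMI; this selection method picks the lag where
# the AMI has decayed to 1/e of its initial value.
tau_ami <- timeLag(sheep,
                   technique = "ami",
                   selection.method = "first.e.decay",
                   lag.max = 50,
                   do.plot = TRUE)
tau_ami  # 14 for the 'sheep' series (Figure 3)
```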

As explained in Properties of Dynamical Systems, the ACF shows the correlation of the time-series with lagged versions of itself. At the lag where the ACF first crosses zero, the time-series and its lagged copy are (linearly) uncorrelated; in other words, the series holds no memory of itself at that lag. Again, we can use the timeLag function, but this time with the technique set to ‘acf’ (Kantz & Schreiber, 2004).
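
The same call with the ACF technique might look like this (again a sketch, under the same assumptions as above):

```r
# Time delay from the autocorrelation function: first zero-crossing.
tau_acf <- timeLag(sheep,
                   technique = "acf",
                   selection.method = "first.zero",
                   lag.max = 50,
                   do.plot = TRUE)
```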

Figure 4: Autocorrelation function for ‘sheep’ time-series

The ACF of the ‘sheep’ time-series is visible in Figure 4. The ACF gives the optimal time delay at a much higher lag, namely lag 35. You can also see from the results on our GitHub page that there are minor differences between the time-series.

Because of this much longer lag, we choose to work with the shorter lag given by the AMI function. A shorter lag leaves more room for higher embedding dimensions, since the full embedding window spans (d − 1)·τ time points. That gives us more freedom when constructing the phase space, even though in this post we will only visualise it in three dimensions.

Embedding dimensions

In order to determine the optimal embedding dimension, we choose to use two methods:

  • False Nearest Neighbour method
  • Cao’s method

The logic behind optimising the embedding dimension with the False Nearest Neighbour algorithm is a tricky one. We use the concept of distance in the phase space to compute the optimal value of d (Kennel et al., 1992). Wallot & Mønster (2018) explain this well: the distance between two data points is their difference in magnitude, and they are neighbours in the actual space if this distance is less than some threshold. If, after embedding in a higher dimension, this distance has changed significantly, they were false neighbours; if the distance did not change significantly, they are true neighbours. The embedding is repeated for increasing dimensions d, and the dimension at which the fraction of false neighbours stops changing (greatly) is the optimal embedding dimension.

For running the code we set the maximum embedding dimension to 7, since beyond that no more false neighbours are found. Furthermore, the Theiler window is set to 50. In simple words, the Theiler window enforces a minimum number of time steps between two candidate neighbours; otherwise, the algorithm may count points as neighbours simply because they are adjacent in time. The other parameters are left at the algorithm’s default settings.
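
A sketch of an FNN call with these settings, assuming the `tseriesChaos` package, the `sheep` vector as before, and the delay of 14 found above:

```r
library(tseriesChaos)

# False nearest neighbours: max dimension m = 7, delay d = 14 (from the AMI),
# Theiler window t = 50; remaining parameters at their defaults.
fnn <- false.nearest(sheep, m = 7, d = 14, t = 50)
plot(fnn)  # fraction of false neighbours per embedding dimension
```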

Figure 5: False Nearest Neighbours algorithm ‘sheep’ time-series

Figure 5 shows the optimal embedding dimension to be (approximately) 3 for the ‘sheep’ time-series. This method of False Nearest Neighbours by Kennel et al. (1992) may seem foolproof, but some researchers have found issues with it. As a result, Cao (1997) developed a method that does not depend on subjective parameters (except for the time delay of the embedding) and is less dependent on the number of data points available. He mentions more benefits, but these two are of utmost importance for our research purposes. Therefore, we will also use Cao’s method (Cao, 1997) to estimate the minimal embedding dimension.

In Cao’s method you will see two lines, E1(d) and E2(d), which are needed to find the optimal embedding dimension. E1(d) depends only on the dimension d and the time lag τ; the dimension at which this value levels off near 1 is the optimal embedding dimension. E2(d) is a check to make sure the signal is deterministic: for a time-series with some determinism, there must be a dimension d where E2(d) is not equal to 1.
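
In R, Cao’s method is available through estimateEmbeddingDim in `nonlinearTseries`, which draws E1(d) and E2(d) when do.plot = TRUE. A sketch under the same assumptions as before:

```r
# Cao's method for the minimal embedding dimension, using tau = 14.
emb_dim <- estimateEmbeddingDim(sheep,
                                time.lag = 14,
                                max.embedding.dim = 15,
                                do.plot = TRUE)
emb_dim  # 8 for the 'sheep' series (Figure 6)
```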

Figure 6: Cao’s method for ‘sheep’ time-series

As can be seen from the plot above, Cao’s method gives us a higher minimal embedding dimension, namely 8. The logic behind the FNN seems adequate for estimating the optimal embedding dimension, but Cao (1997) states that the FNN is unstable with respect to the misclassification of false neighbours. For that reason, and because a higher embedding dimension gives more flexibility when reconstructing the phase space, we choose to move forward with Cao’s method.

Phase Space Reconstruction

Now that the parameters for phase space reconstruction have been estimated, it is possible to do the actual reconstruction and give a small overview of what it looks like.
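
The embedding itself can be built with buildTakens from `nonlinearTseries`, using the parameters estimated above (a sketch; `sheep` as before):

```r
# Delay embedding of the 'sheep' series with d = 8 and tau = 14.
takens <- buildTakens(sheep, embedding.dim = 8, time.lag = 14)
dim(takens)   # n x 8: one reconstructed state per row
head(takens)  # the matrix shown in Figure 7
```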

Figure 7: ‘sheep’ time-series delayed reconstructed in eight dimensions

As expected, there are a total of eight columns because our optimal embedding dimension is 8 (Figure 6). Now the three original time-series will be plotted, so we have something to compare against.
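
One way to draw such a plot is with the `plot3D` package. This is a sketch: the data frame `pp` with columns `sheep`, `wolves` and `grass` is our assumption about how the dataset is stored:

```r
library(plot3D)

# The three original signals against each other, one variable per axis.
scatter3D(pp$sheep, pp$wolves, pp$grass,
          type = "l", ticktype = "detailed",
          xlab = "sheep", ylab = "wolves", zlab = "grass")
```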

Figure 8: Original signals plotted against each other in 3D

Consider the X, Y, and Z axes in Figure 8. There is a clear pattern in how the points move through the phase space. This indicates that, in the phase space, even a soft-assembled dynamical predator-prey system that is supposedly chaotic by nature can exhibit behaviour that follows a certain pattern.

Firstly, notice how the three variables interact with each other in the actual space. When the sheep are low, the wolves are also low and the grass is abundant; this corresponds to the dynamics of an actual ecosystem. It is also clear how the variables increase and decrease together. All of this indicates some correlation, partly negative, between the sheep, the wolves, and the grass. Since they all influence each other, we would expect to be able to recover this structure with our time-delayed embeddings.

Figures 9, 10 and 11 show the reconstructed phase spaces of the three signals. From the total of eight dimensions in the embedded manifold, three were chosen (Figure 7) to fit the 3D plot.
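
Plotting three of the embedding columns works the same way (a sketch, reusing the `takens` matrix and `plot3D` from above):

```r
# Reconstructed phase space: three of the eight embedding columns as axes.
scatter3D(takens[, 1], takens[, 2], takens[, 3],
          type = "l", ticktype = "detailed",
          xlab = "x(t)", ylab = "x(t + 14)", zlab = "x(t + 28)")
```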

Figure 9: reconstructed phase space through ‘sheep’ time-series
Figure 10: reconstructed phase space through ‘wolves’ time-series
Figure 11: reconstructed phase space through ‘grass’ time-series

In Figure 9 the sheep signal was plotted, and contrary to our hypothesis, the trajectories in the sheep phase space follow a path less similar to the actual space (Figure 8) than the wolves variable does (Figure 10). We do, however, see similar properties of the whole system in the three different signals. For example, Figures 9 and 10 have a similar hollow shape to the actual space, and both do a good job of depicting the qualitative and quantitative properties of the system. Lastly, all three signals follow the circular path. Because of these similar properties, we found some evidence that different variables in the system hold information about the system as a whole, which is the foundation of Takens’ theorem.

Conclusion

This post provided a brief but informative analysis of the phase space reconstruction of the predator-prey dataset. We talked about determinism and the lack thereof in dynamical systems, Takens’ theorem along with time dependence, mutual information, nearest neighbours, and an improvement thereof (Cao’s method). All of this was eventually used to embed the original data in a d-dimensional manifold where all possible states could (theoretically) be determined, given a set of initial conditions.

The 3D plots of the wolves and sheep signal trajectories (Figures 9 & 10) do not look significantly different from the 3D plot of the original variables in Figure 8. Several properties visible in the reconstructed signals in the phase space are also present for the original signals in the actual space. We consider this reconstruction a success, as the behaviour of the whole system is depicted reasonably accurately through two different signals.

This story was written by Malik Rigot, Ayrton Sambo & Niels Meulmeester, originally created as part of Travis J. Wiltshire’s Complex Systems Methods for Cognitive and Data Scientists course at Tilburg University.

References

Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1–2), 43–50. https://doi.org/10.1016/s0167-2789(97)00118-8

Farvardin, N. (2003). Source coding, theory and applications. Encyclopedia of Physical Science and Technology, 377–395. https://doi.org/10.1016/b0-12-227410-5/00714-6

Fraser, A. M., & Swinney, H. L. (1986). Independent coordinates for strange attractors from mutual information. Physical Review A, 33(2), 1134–1140. https://doi.org/10.1103/physreva.33.1134

Garcia, S. P., & Almeida, J. S. (2005). Nearest neighbor embedding with different time delays. Physical Review E, 71(3). https://doi.org/10.1103/physreve.71.037204

Kantz, H., & Schreiber, T. (2004). Nonlinear time series analysis. Cambridge University Press.

Kennel, M. B., Brown, R., & Abarbanel, H. D. (1992). Determining embedding dimension for phase-space reconstruction using a geometrical construction. Physical Review A, 45(6), 3403–3411. https://doi.org/10.1103/physreva.45.3403

Sayed, K., Kamel, M., Alhaddad, M., Malibary, H. M., & Kadah, Y. M. (2017). Characterization of phase space trajectories for brain-computer interface. Biomedical Signal Processing and Control, 38, 55–66. https://doi.org/10.1016/j.bspc.2017.05.007

Takens, F. (1981). Detecting strange attractors in turbulence. Lecture Notes in Mathematics, 366–381. https://doi.org/10.1007/bfb0091924

Wallot, S., & Mønster, D. (2018). Calculation of average mutual information (AMI) and false-nearest neighbors (FNN) for the estimation of embedding parameters of multidimensional time series in Matlab. Frontiers in Psychology, 9. https://doi.org/10.3389/fpsyg.2018.01679
