On the time series support vector machine using dynamic time warping kernel for brain activity classification

Запропоновано нову технологію аналізу даних, що використовується для класифікації нормальних і передуючих нападам електроенцефалограм. Технологія заснована на використанні ядра динамічного перетворення масштабу часу, об'єднаного з методом опорних векторів (SVM). Результати експериментів показал...

Повний опис

Збережено в:
Бібліографічні деталі
Опубліковано в: :Кибернетика и системный анализ
Дата:2008
Автори: Chaovalitwongse, W.A., Pardalos, P.M.
Формат: Стаття
Мова:Англійська
Опубліковано: Інститут кібернетики ім. В.М. Глушкова НАН України 2008
Теми:
Онлайн доступ:https://nasplib.isofts.kiev.ua/handle/123456789/71979
Теги: Додати тег
Немає тегів, Будьте першим, хто поставить тег для цього запису!
Назва журналу:Digital Library of Periodicals of National Academy of Sciences of Ukraine
Цитувати:On the time series support vector machine using dynamic time warping kernel for brain activity classification / W.A. Chaovalitwongse, P.M. Pardalos // Кибернетика и системный анализ. — 2008. — № 1. — С. 159-173. — Бібліогр.: 52 назв. — англ.

Репозитарії

Digital Library of Periodicals of National Academy of Sciences of Ukraine
_version_ 1859822115078799360
author Chaovalitwongse, W.A.
Pardalos, P.M.
author_facet Chaovalitwongse, W.A.
Pardalos, P.M.
citation_txt On the time series support vector machine using dynamic time warping kernel for brain activity classification / W.A. Chaovalitwongse, P.M. Pardalos // Кибернетика и системный анализ. — 2008. — № 1. — С. 159-173. — Бібліогр.: 52 назв. — англ.
collection DSpace DC
container_title Кибернетика и системный анализ
description Запропоновано нову технологію аналізу даних, що використовується для класифікації нормальних і передуючих нападам електроенцефалограм. Технологія заснована на використанні ядра динамічного перетворення масштабу часу, об'єднаного з методом опорних векторів (SVM). Результати експериментів показали, що запропонована технологія значно перевершує стандартну SVM і дозволяє покращити класифікацію активності мозку.
first_indexed 2025-12-07T15:26:31Z
format Article
fulltext UDC 612.821:51+519.7+519.8 W.A. CHAOVALITWONGSE, P.M. PARDALOS ON THE TIME SERIES SUPPORT VECTOR MACHINE USING DYNAMIC TIME WARPING KERNEL FOR BRAIN ACTIVITY CLASSIFICATION 1 Keywords: time series, classification, EEG, brain dynamics, optimization, dynamic time warping, epilepsy, support vector machines. 1. INTRODUCTION The human brain is among the most complex systems known to man. In neuroscience research, countless number of studies have attempted to comprehend the mechanism of brain functions through detailed analysis of neuronal excitability and synaptic transmission. Many theories of brain functions have been proposed over the last century. Only in the last few years has it become feasible to capture simultaneous responses from large enough numbers of neurons to empirically test those long-standing hypotheses about brain function. However, most neuro-scientific experiments have resulted in massive datasets, in a form of multi-dimensional time series data. These data contain both spatial and temporal properties of brain functions. Making sense of such massive data requires very efficient and sophisticated techniques that are capable of capturing both spatial and temporal properties simultaneously. Current research studies in data mining and classification are mostly focused on the data with only spatial or temporal properties. In addition, very few studies in quantitative neuroscience are not tailored to exploit both spatial and temporal properties of this relentless flood of information. In this study, epilepsy will be a case point. Epilepsy is the second most common brain disorder after stroke, yet the most devastating one. The most disabling aspect of epilepsy is the uncertainty of recurrent seizures, which can be characterized by a chronic medical condition produced by temporary changes in the electrical function of the brain. Most epilepsy studies employ electroencephalograms (EEGs) as a tool for capturing the electrical changes and evaluating physiological states (normal and abnormal) of the brain. Although EEGs offer excellent spatial and temporal resolution of brain activity, EEG data are so enormous, in a form of long-term multi-dimensional time series, that neuroscientists understand very little about the dynamical transitions to neurological dysfunctions of seizures. A necessary first step to advance epilepsy research is to develop a seizure prediction/warning system. Therefore, the main goal of this study to employ techniques in data mining and optimization to discover seizure-precursor patterns encrypted in the enormous EEGs. In order to validate the reliability of a seizure prediction/warning system, one has to test the hypothesis that the EEGs during the normal period differ from the EEGs during the seizure-precursor. This will, in turn, lead to a classical classification problem. However, the data used in this classification problem has spatio-temporal properties. We herein propose an efficient and effective spatio-temporal data mining/classification method for multi-dimensional time series classification of brain activity. The organization of the succeeding sections of this paper is as follows. The background of this work including classification techniques in the literature and previous studies in seizure prediction and classification will be discussed in Sec. 2. Subsequently in Sec. 3, basic concepts and standard classification procedure of support vector machines are discussed. The methods employed in this study including the quantification of the brain dynamics, the EEG data acquisition, the support vector machines with dynamic ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 159 1Research was partially supported by Rutgers Research Council grant-202018, the NSF grants CCF-0546574, DBI-980821, EIA-9872509, CCF 0546574, and NIH grant R01-NS-39687-01A1. � W.A. Chaovalitwongse, P.M. Pardalos, 2008 time warping kernel are given in Sec. 4. The design of experiments and the details of the empirical study are described in Sec. 5. The results on the statistical evaluation and the performance characteristics of the proposed classification method are provided in Sec. 6. The concluding remarks and future works are then later discussed in Sec. 7. 2. BACKGROUND In this section, we give an overview of classification techniques in the literature. We will subsequently focus on optimization-based classification techniques like support vector machines. Later in this section, we give a brief background about epilepsy and the significance of this work to epilepsy research. 2.1. Classification Techniques. Generally, classification techniques are combinatorial in nature as they are involved with discrete decisions. Thus, classification problems can be naturally posted as discrete optimization problems [3, 6, 15, 18, 19, 22, 34]. There have been enormous number of optimization techniques for classification problems developed during the past few decades including classification tree, support vector machines (SVMs), linear discriminant analysis, logic regression, least squares, nearest neighbors, etc. Most optimization methods in classification have been applied to SVMs. A number of linear programming formulations for SVMs have been used to explore the properties of the structure of the optimization problem and solve large-scale problems [5, 35]. The SVM technique proposed in [35] was also demonstrated to be applicable to the generation of complex space partitions similar to those obtained by C4.5 [39] and CART [7]. Current SVM research mainly focuses on extending SVMs to multi-class problems [20, 31, 45]. Nevertheless, there have been very few studies in the literature that address the use of SVMs for time series classification. Moreover, most of the existing studies are only focused on pattern recognition and similarity search [12, 13, 17, 29, 27, 28] or the use of kernel functions for single time series transformation like speech recognition [47, 43, 50, 4, 52]. Multi-dimensional time series classification (MDTSC) has to deal with massive time series data with both spatial and temporal properties (e.g., EEGs). However, to our knowledge, almost all of SVM studies only address either spatial or temporal classification problem. None of current SVM studies attempts to simultaneously consider both problems. In this study, a novel SVM approach for MDTSC is developed based on the improved kernel of finding a separating plane of SVMs in time series sample as well as the idea of projecting the data from the temporal properties. This technique represents a bridge between the parametric techniques that require a priori knowledge of the distributions underlying the data and nonparametric techniques, which presuppose the functional form of the discriminant surfaces separating the different pattern classes. 2.2. Epilepsy Research. Epilepsy will be a case in point in this proposal. Epilepsy is the second most common brain disorder, currently afflicting at least 2 million Americans. The diagnosis and treatment of epilepsy is complicated by the disabling aspect that seizures occur spontaneously and unpredictably. A major epilepsy research lies in the study of how neuronal circuitries of the brain support these electrical changes. Most epilepsy studies use EEGs as a tool for capturing the electrical changes and evaluating physiological states (normal and abnormal) of the brain. Although EEGs have been widely used for the past few decades, neuro-scientists understand very little about the dynamical transitions to neurological dysfunctions of seizures [21]. In some types of epilepsy (e.g., focal or partial epilepsy), there is a localized structural change in neuronal circuitry within the cerebrum which produces organized quasi-rhythmic discharges, which spread from the region of origin (epileptogenic zone) to activate other areas of the cerebral hemisphere. The development of the epileptic state can be considered as changes in network circuitry of neurons in the brain that produce changes in voltage potential, which can be captured by EEG recordings. These changes are reflected by wriggling lines along the time axis in a typical EEG recording. A typical electrode montage for intracranial EEG recordings in our study is shown in Fig. 1. The 10-second EEG profiles 160 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 during the normal and pre-seizure periods of patient 1 are illustrated in Figs. 2a and 2b. The EEG onset of a typical epileptic seizure is illustrated in Fig. 2c. Fig. 2d shows the post-seizure state of a typical epileptic seizure, respectively. 2.3. Seizure Prediction and Brain Activity Classification. If seizures could be predicted, it would lead to the development of completely novel diagnostic and therapeutic advances in controlling epileptic seizures. This will tremendously improve the quality of life for those patients who currently suffer from epilepsy. There is growing evidence that human epileptic seizures are preceded by physiological changes that are reflected in the dynamical characteristics of the EEG signals. Our group reported pre-seizure convergence of STLmax values (calculated from intracranial or scalp electrode EEG recordings) which occurred tens of minutes prior to epileptic seizures [24]. Subsequently, Elger and Lehnertz [14, 32] reported reductions in the effective correlation dimension (D 2 eff , a measure of the complexity of the EEG signals) that were more prominent in pre-seizure EEG samples than at times more distant from a seizure. They estimated that a detectable change in dynamics could be observed at least 2 minutes before a seizure in most cases [14]. Martinerie and coworkers [36] also reported significant differences between dimension measures obtained in pre-seizure versus normal EEG samples. They found an abrupt decrease in dimension during the pre-seizure transition. This study also employed relatively brief (40 minutes) samples of pre-seizure and normal data. More recently, this group reported changes in brain dynamics obtained from scalp electrode recordings of the EEG. By comparing pre-seizure EEG samples to a reference sample selected from normal data, they found evidence for dynamical changes that anticipated temporal lobe seizures by periods of up to 15 minutes [40]. Recently, Litt and coworkers [33] reported sus- tained bursts of energy in some EEG channels visually selected by one of the investigators. ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 161 Fig. 1. Inferior transverse views of the brain, illustrating approximate depth and subdural electrode placement for EEG recordings are depicted. Sub- dural electrode strips are placed over the left orbitofrontal (LOF), right orbitofrontal (ROF), left subtemporal (LST), and right subtemporal (RST) cortex. Depth electrodes are placed in the left temporal depth (LTD) and right temporal depth (RTD) to record hippocampal activity. Fig. 2. Twenty-second EEG recordings of (a) normal activity (b) pre-seizure activity (c) seizure onset activity (d) post-seizure activity from patient 1 obtained from 32 electrodes. Each horizontal trace represents the volt- age recorded from electrode sites listed in the left column (see Fig. 1 for anatomical location of electrodes). c d a b During the past decade, seizure predictability has been demonstrated through the above-mentioned studies including our previous studies in [9,10,25,38]. These studies were motivated by mathematical models used to analyze multidimensional complex systems (e.g., neuronal network in the brain) based on the chaos theory and optimization techniques. The results of those studies demonstrated that a seizure is essentially a reflecting transition of progressive changes of hidden dynamical patterns in EEG. Such transitions have been shown to be detectable through the quantitative analysis of the brain dynamics [9,10,38]. However, in order for one to validate the seizure predictability, one would have to demonstrate, qualitatively and quantitatively, that the normal EEGs differ from the pre-seizure (abnormal) EEGs. The discriminant ability to differentiate and classify a pre-seizure EEG signal is logically a prerequisite and a necessary first step of seizure prediction/warning development. Thus far, to our knowledge, none of current epilepsy studies in the literature is undertaken to test this hypothesis. Our group has attempted to apply data mining techniques using hidden dynamical characteristics to differentiate normal and pre-seizure EEGs [11]. 3. SUPPORT VECTOR MACHINES (SVMs) In this section, we discuss some basic concepts of SVMs. Then, we explain a general classification procedure of SVMs. Later, we address the use of kernel functions, the most widely used trick of SVMs. 3.1. Basic Concepts. SVMs is one of the classification techniques widely used in practice. The essence of support vector machines is to construct separating surfaces that will minimize the upper bound on the out-of-sample error. In the case of one linear surface (plane) separating the elements from two classes, this approach will choose the plane that maximizes the sum of the distances between the plane and the closest elements from each class (i.e., the “gap” between the elements from different classes). The mathematical definition of support vector machines can be described as follows. Let all the data elements be represented as n-dimensional vectors (or points in the n-dimensional space), then these elements can be separated geometrically by constructing the surfaces that serve as the “borders” between different groups of points. One of the common approaches is to use linear surfaces (planes) for this purpose, however, different types of nonlinear (e.g., quadratic) separating surfaces can be considered in certain applications. Note that, in practice, it is not possible to find a surface that would “perfectly” separate the points according to the value of some attribute. In other words, data points with different values of the given attribute may not necessarily lie at the different sides of the surface. However, in general, the number these errors should be small enough. The classification problem of support vector machines can be represented as the problem of finding geometrical parameters of the separating surface(s). As it will be described below, these parameters can be found by solving the optimization problem of minimizing the misclassification error for the elements in the training dataset (so-called “in-sample error”). After determining these parameters, every new data element will be automatically assigned to a certain class, according to its geometrical location in the elements space. The procedure of using the existing dataset for classifying new elements is often called “training the classifier” (and the corresponding dataset is referred to as the “training dataset”). It means that the parameters of separating surfaces are “tuned” (or, “trained”) to fit the attributes of the existing elements to minimize the number of errors in their classification. However, a crucial issue in this procedure is not to “overtrain” the model, so that it would have enough flexibility to classify new elements, which is the primal purpose of constructing the classifier. An example of hyperplanes separating the brain’s pre-seizure, normal, and post-seizure states is illustrated in Fig. 3. 3.2. SVMs Mathematical Formulation. The main idea of applying SVMs to the classification of EEG time series is to embed EEG data (both normal and pre-seizure) into higher dimensional space and try to find a hyperplane to separate the data. The problem can be formally defined as follows. Let all the EEG data samples be represented 162 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 as n-dimensional vectors (or points in the n-dimensional space). A very common SVM approach is to find a plane which would separate all the vectors (points) in the n-dimensional space defined in A from the vectors in B. If a plane is defined by the standard expression xT � �� , where � � �� ( ,... , )1 n T is an n-dimensional vector of real numbers, and � is a scalar, then this plane will separate all the elements from A and B. Thus, the discrimination rules can be formulated as an optimization problem to determine vectors � and � such that the separating hyperplane defines two open half spaces, { | , }x x xn T� �R � � and { | , }x x xn T� �R � � , which contain most data points in A and B respectively. However, in practice it is usually not possible to perfectly separate two sets of elements by a plane. For this reason, one should try to minimize the average measure of misclassifications. The violations of these constraints are modeled by introducing nonnegative variables u and �. The most common mathematical model for SVMs that minimizes the total average measure of misclassification errors is given by: min , , ,� � � � u i i m j j k m u k 1 1 1 1� � � �� (1) s.t. A u e e� �� � , (2) A e e� � � � , (3) u 0 0, � . (4) As one can see, this is a linear programming problem, and the decision variables here are the geometrical parameters of the separating plane � and �, as well as the variables representing misclassification error u and �. Although in many cases this type of problems may involve high dimensionality of data, they can be efficiently solved by available LP solvers, for instance Matlab, Xpress-MP, or CPLEX. 3.3. Time Series Kernel Functions. In this section, we will discuss the use of kernel functions in time series classification, one of the most widely used technique in SVM learning. Generally, kernel functions are used to extend the decision functions of SVMs to the nonlinear separation case. The main idea of kernel functions is to map the data from the input space X into a high dimensional feature space � by a function � X: � and solving the linear learning problem to find a separating hyperplane in �. The actual kernel function � does not need to be known, it suffices to have a kernel function k, which calculates the inner product in the feature space k x y x y( , ) ( ) ( )� �� � . ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 163 Fig. 3. Example of hyperplanes separating different brain’s states. The kernel function can be viewed as a similarity (distance) measure in the input space [46]. The similarity between the samples x and y can be shown as the kernel function k x y( , ) as the following: d x y x y k x x k x y k y y2 2 2( , ) ( ( ) ( )) ( , ) ( , ) ( , )� � �� � . 3.3.1. Linear Kernel. The most simple kernel function is the linear kernel, k x y x y( , ) � � . The decision function takes the formula, f x wx b( ) � � . In time series prediction, the linear kernel can be interpreted as an statistical autoregressive model of the order k (AR[k]). This can be shown by x f x x w x bT T T k t T t t k � � � � �( ,... , )1 1 . The interpretation of this kernel function is that the time series are considered to be similar if they are generated by the same AR-model. 3.3.2. Radial Basis Function Kernel. Another commonly used kernel function is the radial basis function (RBF) kernel, k x y x y� �( , ) exp( )� 2 . The similarity of two samples in the RBF kernel can be interpreted as their euclidian distance. In time series prediction, the RBF kernel, in turn, has a parallel in the phase space representation. This can be explained as follows. Assume the time series is generated by a function f such that x f x xT T T k� ( ,... , )1 . If one takes the time series x1 ,... , x xk N,... , and plots it in the (k � 1)-dimensional phase space. It can be easily observed that the resulting plot is a part of the graph of f , so the function f can be estimated from the time series. Especially, assuming that the function is linear and the time series is generated by x f x xT T T k� � ( ,... , )1 �, where � is a Gaussian noise. Clearly, the time series model is AR[1]) and it can be shown that most of the time series data points lies in an ellipsoid defined by the mean of the time series and the variance of �. The interpretation of this kernel function is that the time series are considered to be similar in means of the euclidian distance in the phase space. 3.3.3. Fourier Kernel. Fourier transform is among the most common transformation in time series analysis. The Fourier kernel function is advantageous when the information or pattern of the time series does not lie in the individual values at each time point but in the frequency of some events. The inner product of the Fourier expansion of two time series can be directly calculated by the regularized kernel function, k x y q q x y q F ( , ) ( cos ( )) � � 1 2 1 2 2 2 , where 0 1� �q and X n� [ , ]0 2� [49]. 4. METHODS In this section, we describe the methods used in each step of multi-dimensional EEG classification starting from quantifying the brain dynamics from EEG signals to implementing the dynamic time warping kernel with SVMs to analyze the multidimensional time series of the brain dynamics. 4.1. Quantification of the Brain Dynamics. Quantification of the brain dynamics from EEGs in this study is suitable to the investigation of a nonstationary system such as the brain because it is capable of automatically identifying and appropriately weighing existing transients in the data. This technique is motivated by mathematical models from chaos theory used to characterize multi-dimensional complex systems and reduce the dimensionality of EEGs [1, 26, 37, 42, 48]. To quantify the brain dynamics, we divide EEG signals into sequential 10.24-second epochs (non-overlapping windows) to properly account for possible nonstationarities in the epileptic EEG. For each epoch of each channel of EEG signals, we estimate the measures of chaos to quantify the chaoticity of the attractor. These measures include Short-Term Maximum Lyapunov Exponent and Angular Frequency. A chaotic system like human brain is a system in which orbits that originate from similar initial conditions or nearby points in the phase space diverge 164 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 exponentially in expansion process. The rate of divergence is an important aspect of the dynamical system and is reflected in the value of Lyapunov exponents and Angular Frequency. In other words, the Lyapunov exponents and Angular Frequency measure the average uncertainty along the local eigenvectors and phase differences of an attractor in the phase space, respectively. Next, we will give a short overview of mathematical models used in the estimation of the Short-Term Maximum Lyapunov Exponent and Angular Frequency from EEG signals. 4.1.1. EEG Time Series Embedding. In the study of the brain dynamics, the initial step in analyzing the dynamical properties of EEG signals is to embed it in a higher dimensional space of dimension p, which enables us to capture the behavior in time of the p variables that are primarily responsible for the dynamics of the EEG. We can now construct p-dimensional vectors X t( ) , whose components consist of values of the recorded EEG signal x t( ) at p points in time separated by a time delay. Construction of the embedding phase space from a data segment x t( ) of duration T is made with the method of delays. The vectors X i in the phase space are constructed as: X x t x t x t pi i i i� � � ( ( ), ( )... ( ( )* ))� �1 , where � is the selected time lag between the components of each vector in the phase space, p is the selected dimension of the embedding phase space, and t T pi � [ , ( ) ]1 1 � . The vectors X i in the phase space are illustrated in Fig. 4. 4.1.2. Estimation of Short-Term Maximum Lyapunov Exponent (STLmax ). The method for estimation of STLmax for nonstationary data (e.g., EEG time series) is previously explained in [23, 51]. In this section, we will only give a short description and basic notation of our mathematical models used to estimate STLmax . First, let us define the following notation: — �t is the evolution time for �X i j, , that is, the time one allows �X i j, to evolve in the phase space. If the evolution time �t is given in second, then L is in bits per second. ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 165 Fig. 4. Diagram illustrating an EEG epoch embedded in phase space for the quantification of brain dynamics: assume p � 4. The fiducial trajectory, the first three local Lyapunov exponents (L1, L2, L3), is shown. — t0 is the initial time point of the fiducial trajectory and coincides with the time point of the first data in the data segment of analysis. In the estimation of STLmax , for a complete scan of the attractor, t0 should move within [ , ]0 �t . — N a is the number of local STLmax ’s that will be estimated within a duration T data segment. Therefore, if Dt is the sampling period of the time domain data, T N D N t pt a� � � ( ) ( )1 1� �. — X t i( ) is the point of the fiducial trajectory t X t( ( ))0 with t t i� , X t x t( ) ( ( ),...0 0� x t p( ( )* ))0 1� � , and X t j( ) is a properly chosen vector adjacent to X t i( ) in the phase space. — �X i j, ( )0 � X t i( ) X t j( ) is the displacement vector at t i , that is, a perturbation of the fiducial orbit at t i , and �X ti j, ( )� � X t ti( )� � X t tj( )� � is the evolution of this perturbation after time �t. — t t i ti � � 0 1( )* � and t t j tj � � 0 1( )* � , where i N a�[ , ]1 and j N�[ , ]1 with j i� . STLmax is defined as the average of local Lyapunov exponents in the state space and can be calculated by the following equation: STL N ta i N a max log� � � 1 2 1� | ( )| | ( )| , , � � X t X i j i j � 0 . 4.1.3. Estimation of Angular Frequency (�). Similar to the estimation of STLmax , the estimation of the Angular Frequency, �, is motivated by the representation of a state as a vector in the state space. � is merely an average uncertainty along the phase differences of an attractor in the phase space. First, let us define the difference in phase between two evolved states X t i( ) and X t ti( )� � as ��i . Then, denoting with (��) the average of the local phase differences ��i between the vectors in the state space, we have �� � � � 1 1N a i N a ��i , where N a is the total number of phase differences estimated from the evolution of X t i( ) to X t ti( )� � in the state space, according to �� � � i i i i i X t X t t X t X t t � � � � � arccos ( ) ( ) ( ) ( ) . Then, the average angular frequency � is defined as � � � � �� t � . If �t is given in second, then � is given in rad/sec. Thus, while STLmax measures the local stability of the state of the system on average, � measures how fast a local state of the system changes on average (e.g., dividing � by 2�, the rate of the change of the state of the system is expressed in sec �1 Hz). 4.2. Dynamic Time Warping Kernel. Give two time series (or vector sequences) X and Y of equal length | | | |X Y n� � , pattern similarity is determined by aligning time series X with time series Y with the distortion of alignment D X Yalign ( , ) . Dynamic time warping (DTW) is used to compute the best possible alignment warp between two time series by selecting the one with the minimum distortion. In other words, The DTW distance is a distance measure (or similarity measure) between two time series by computing the best possible alignment or the minimum mapping (aligning) distance between two time series. In this study, all our EEG data samples are equal in length; however, the DTW can be extended to the case where the lengths of the two time series are not equal. DTW has been widely used in many contexts including data mining [30, 2], gesture recognition [16], robotics [44], speech processing [41, 47, 50], and medicine [8]. 166 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 The problem of calculating the DTW distance can be solved by a dynamic programming approach. The basic concept can be described as follows. First, construct an alignment in an n n� matrix so that each vector (data points) in X is matched with a corresponding vector in Y . Typically, the Euclidean distance is used as the local distance between two vectors, d x y x yi j i j( , ) ( )� 2 , where the (ith, jth) element of the matrix is the distance d x yi j( , ) between the ith point of time series X , and the jth point of time series Y . Then, we construct a warp path W w wK� 1 , ... , , where K is the length of the warp path and max(| | , | | )X Y � � � �K X Y| | | | . The kth element of the warp path represents the matching point of two time series, w i jk � ( , ) , where ( , )i j corresponds to index i from time series X , and index j from time series Y (see Fig. 5). The warp path must start at the beginning of each time series and finish at the end of both time series. In other words, the path starts from the beginning of each time series, w1 1 1� ( , ) , and finishes at the end of both time series, w n nK � ( , ) . The warp path can actually be calculated in reverse order starting at the end of both time series. There is also a constraint on the warp path that forces indices i and j to be monotonically increasing in the warp path; that is, w i jk � ( , ) and w i jk� � � �1 ( , ) where i i i� � � � 1 and j j j� � � � 1. Note that there can be an exponential number of warping paths that satisfy the above conditions. However, the optimal warp path is the one with a minimum warping (distortion) cost defined by D X Y K d w wki kj k K align ( , ) min ( , )� � � 1 1 . In a dynamic programming approach, the warp path must either be incremented by one unit (adjacent) or stay at the same i or j axes. Therefore, we only need to evaluate the recurrence of the cumulative distance found in the adjacent elements: D i j d x y D i j D i j D i j i j( , ) ( , ) min ( , ), ( , ), ( , ). � � � � 1 1 1 1 � � � 4.3. SVMs with DTW Kernel. The essence of the SVMs framework with DTW kernel in this application can be described as follows. Based on the concept of DTW, one employ a Euclidean distance measure to find the optimal path that minimizes the accumulated distance of the warping path. The SVMs with DTW uses inner product or kernel function to find the optimal path that maximize the accumulated similarity (or minimize the distance) as follows: k x y M m k x yDTW k k k L I J I J ( , ) max ( ) , ( ) ( )� � � 1 1 , (5) s.t. 1 1� � � � I k I k L( ) ( ) , (6) 1 1� � � � J k J k L( ) ( ) , (7) where L X Y� �| | | | , I k( ) and J k( ) are linear warping functions, m k( ) is a nonnegative (path) weighting coefficient, and M is a (path) normalizing factor [47]. The linear discriminant function of SVMs with DTW kernel for time series classification can be then expressed the same way as the original linear SVMs function except using the DTW kernel. It is important to note that, in our case, we have multiple time series; therefore, the similarity of the DTW kernel will be based ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 167 Fig. 5. A warping matrix with the minimum-distance warp path of two time series X and Y . on a pair-wise manner. In other words, if we let N be the total number of electrodes, then we have to calculate a total of N N( ) 1 2 kernel functions or similarity indices. These similarity indices can, in turn, be considered as the attributes of the input data. Finally, note that the same algorithms used to solve a standard SVMs can be used to solve the SVMs with DTW kernel as well. 5. EMPIRICAL STUDY The underlying hypothesis in this empirical study is that the proposed SVMs with DTW kernel is capable of discriminating/classifying different physiological stages (normal and pre-seizure) of the brain. The features of input data are in a form of time series of the the brain dynamics measure (i.e., STLmax and �). In this section, we discuss in detail each step of our empirical study. 5.1. EEG Data Acquisition. The datasets in this study consisted of continuous long-term (3 to 9 days) multichannel intracranial EEG recordings from bilaterally, surgically implanted macroelectrodes in the hippocampus, temporal and frontal lobe cortexes of 3 epileptic patients with medically intractable temporal lobe epilepsy (outlined in Table 1). The recordings were obtained as part of a pre-surgical clinical evaluation, using a Nicolet BMSI 4000 recording system with amplifiers of an input range of 0.6 mV, sampling rate of 200 Hz and filters with a frequency range of a 0.5–70Hz. Each recording included a total of 28 to 32 intracranial electrodes (8 subdural and 6 hippocampal depth electrodes for each cerebral hemisphere, and a strip of 4 additional electrodes if deemed necessary by the neurologist). Note that we only use the EEG recording from 26 electrodes in this study as those electrodes are most commonly used. The recorded EEG signals were digitized and stored on magnetic media for subsequent off-line analysis. These EEG recordings were viewed by two independent electroencephalographers to determine the number and type of recorded seizures, seizure onset and end times, and seizure onset zones. 5.2. Data Sampling and Pre-Processing. In this study, the classification will be performed separately for each subject. Per individual, we use the Monte-Carlo sampling technique to randomly select EEG data from 2 groups (normal and pre-seizure states) from the continuous recordings. Each data sample contains a 5-minute epoch of EEG data from 26 electrodes. Note that in this analysis we only consider clinical seizures and un-clustered seizures. From the data set, we consider 22, 7, and 15 seizures in the EEG data from Patients 1, 2, and 3, respectively. Since the data set of each patient is very much different in length and the total number of seizures, for each patient we randomly select three epochs of pre-seizure EEG data per seizure. In other words, 66, 21, and 45 epochs of the EEG data are selected from the group of pre-seizure EEG data in Patients 1, 2 and 3, respectively. Per patient, 200 epochs of EEG data from the normal state are randomly and uniformly sampled. The criteria used in determining normal and pre-seizure states of EEG data is as follows. Normal EEG samples are selected from EEG recordings that is more than 8 hours apart from a seizure. Pre-seizure EEG samples are selected from EEG recordings during the 30-minute interval before. For instance, we analyze 22 seizures from Patient 1’s data; therefore, 266 EEG epochs (200 normal and 66 pre-seizure) are sampled. After EEG data are sampled, we first calculate measures of the brain dynamics (i.e., STLmax and �) from the EEG data using the methods described in Sec. 4. Each measure is calculated continuously for each non-overlapping 10.24-second segment of EEG data; therefore, each of EEG epoch contains 30 data points of the brain dynamical time series. 168 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 Patient ID Gender Age Seizure Types Duration of EEG, days Number of Seizure 1 F 41 CP 9.06 24 2 M 45 CP, SC 3.63 9 3 M 29 CP, SP 6.07 19 Total 18.76 59 CP — Complex Partial; SC — Subclinical T a b l e 1 . EEG dataset characteristics 5.3. Classification Procedure. The input data consist of 66, 21, and 45 of N m* -dimensional time series, where N is the number of electrodes and m is the length of each EEG epoch times two (2 dynamical measures). We then calculate a pair-wise kernel function for every electrode pair of multi-dimensional EEG time series. Therefore, the total number of feature vectors corresponding to Patients 1, 2, and 3 is N N( ) * � 1 2 2 650. The method used to calculate the kernel function is described in Sec. 4. Subsequently, we employ SVMs to classify these EEG data. We then use Matlab to solve the constructed SVMs model is to find a plane which would separate all the vectors of normal and pre-seizure EEGs. 5.4. Training and Testing. In this section, we describe the step of how the SVMs are trained and tested. There are many choices of how to divide the data into training and test sets. In order to reduce the bias of training and test data, we propose to implement a leave-one-out cross validation scheme, extensively used as a method to estimate the generalization error based on “resampling”. It is important to note that the classification techniques will be trained and tested individually for each patient. To train the SVMs, it is important to note that, in general, the training of support vectors machines is optimized when the number of pre-seizure and normal samples are comparable. Otherwise, the SVMs will be biased to classify most samples to the physiological state with larger size samples. To adequately implement the SVMs, we train the classifier with the same number of pre-seizure and normal samples by implementing Monte Carlo sampling simulation. First, we shuffle (random order) the pre-seizure and normal samples individually. Since the size of pre-seizure samples is much larger than the size of normal samples, the number of pre-seizure samples will be used to determine the size of the training and testing sets. Then, we divide the first of pre-seizure samples for the training and the other half for the testing. After that, we randomly select training data (with the same size) from normal samples. For individual patient, we run the simulation 100 times. 6. RESULTS To evaluate the performance characteristics of the proposed classification technique, we calculate the sensitivity and specificity of the proposed classification technique. These results will be discussed in this section. 6.1. Performance Evaluation of Classification Schemes. In general, to evaluate the classifier, we categorize the classification into two classes: positive (pre-seizure) and negative (normal). Then we consider four subsets of classification results: 1. True positives (TP) — denoting correct classifications of positive cases. 2. True negatives (TN) — denoting correct classifications of negative cases. 3. False positives (FP) — denoting incorrect classifications of negative cases into class positive. 4. False negatives (FN) — denoting incorrect classifications of positive cases into class negative. To better explain the concept of the evaluation of classifiers, let us consider in the case of the detection of pre-seizure EEG data (see Fig. 6). A classification result was considered to be true positive if we classify a pre-seizure EEG sample as a pre-seizure sample. A classification result was considered to be true negative if we classify a normal EEG sample as a normal sample. A classification result was considered to be false positive when we classify a normal EEG sample as a pre-seizure sample. A classification result was considered to be false negative when we classify a pre-seizure EEG sample as a normal sample. Sensitivity and specificity are widely used in the medical domain as classification performance measures Sensitivity measures the fraction of positive cases that are classified as positive. Specificity measures the fraction of negative cases classified as negative. The sensitivity and specificity are defined as follows: Sensitivity � � TP TP FN , Specificity � � TN TN FP . ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 169 Fig. 6. The evaluation concept of classification results. Note that we use “pre-seizure” and “abnormal” inter- changeably. In fact, the sensitivity can be considered as a probability of accurately classifying EEG samples in the pre-seizure case. The specificity can be considered as a probability of accurately classifying EEG samples in the normal case. In general, a good classifier is the one with high sensitivity and specificity. 6.2. Performance Characteristics of the Proposed Classification Methods. After running 100 simulations, we then report the average classification performance in this section. Figure 7 illustrates the overall classification results of the proposed SVMs-DTW algorithm. Table 2 and Fig. 8 illustrate a performance comparison of the standard SVMs for EEG classification proposed in [11] versus the results from the proposed SVMs with DTW proposed in this paper tested on 3 patients. For all cases, the incorporation of the DTW kernel function, we can achieve substantially better classification results (about 8% better on average). In Patient 1, the proposed algorithm achieve about 92% sensitivity and over 93% specificity on average. This result demonstrates the improvement in classification performance of almost 8% on average. In Patient 2, the proposed algorithm achieve about 80% sensitivity and about 78% specificity on average. This result demonstrates the improvement in classification performance of almost 5% on average. In Patient 3, the proposed algorithm achieve about 84% sensitivity and over 85% specificity on average. This result demonstrates the improvement in classification performance of over 10% on average. Overall, the proposed SVMs-DTW can achieve the sensitivity of correctly classifying pre-seizure of 83.91%, and the specificity of correctly classifying normal EEGs of 85.52%, respectively. This reflects to almost 8% accurate classification improvement. In Fig. 8, we observe that the incorporation of the DTW kernel function can improve the classification performance of SVMs in every case. It is very interesting to note that the classification performance of our algorithm for each patient is consistent with the standard SVMs [11]. Specifically, the EEG data from Patient 1 tend to be more classifiable than those of Patients 2 and 3. We speculate that the number of seizures in the EEG data set could play a very important role in terms of 170 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 T a b l e 2 . Performance characteristics of the support vector machine classifier for individual patient Patient SVMs [11] SVMs-DTW Sensitivity, % Specificity, % Overall, % Sensitivity, % Specificity, % Overall, % 1 81.21 87.46 84.34 92.15 93.84 92.99 2 71.18 76.85 74.02 79.45 78.07 78.76 3 74.13 70.60 72.37 80.13 84.66 82.39 Average 75.51 78.30 76.91 83.91 85.52 84.72 Fig. 7. Average classification results among all 3 patients using leave-one-out cross validation to train and test the algorithm over 100 simulations. providing more training data for abnormal (pre-seizure) EEGs. Because our algorithm yields the worst classification results in Patient 2 among all 3 cases, it is very intuitive to claim that there is so much the classifier can learn from 7 seizure samples as apposed to 22 and 15 samples. Nonetheless, these results confirm our hypothesis that the brain’s states are classifiable based on the brain dynamics measures and data mining techniques applied to EEG signals. The framework of classifiers proposed in this study can be extended to development of an abnormal brain activity classifier or an online brain activity monitoring. 7. CONCLUDING REMARKS This study addresses the open question of the classifiability of the brain’s preseizure and normal EEGs. The results of this study is another proof concept of the application of the quantification of the brain dynamics and data mining techniques. This framework was proved successful in providing insights and characterizing different states of brain activities reflected from pathological dynamical interactions of brain network. In addition, these results also confirm our hypothesis that it is possible to differentiate and classify the brain’s pre-seizure and normal activities based on optimization, data mining, and dynamical system approaches in multichannel intracranial EEG recordings. Also, the incorporation of DTW kernel function with SVMs is very straightforward and not difficult to implement. The optimization problems in the framework of support vector machines can be solved in reasonable time. All of the programming was done in Matlab environment on a desktop computer Pentium IV 2.4 GHz with 1 GB of RAM. The proposed technique very fast and scalable. The running time for statistical cross-validation technique is less than 5 minutes on average. In the future, more cases (patients and seizures) will be studied to validate the observation across patients as well as the development of multi-class classifier based on support vector machines framework. In addition, the feature selection study will be possible in the future. This study will help us to select electrodes that show prominent changes, which might lead us to the solution to the epileptogenic zone localization problem. REFERENCES 1. B a b l o y a n t z A . , D e s t e x h e A . Low dimensional chaos in an instance of epilepsy // Proc. Nat. Acad. Sci. USA. — 1986. — 83. — P. 3513–3517. 2. B e r n d t D . , C l i f f o r d J . Using dynamic time warping to find patterns in time series // Proc. of the AAAI–94 Workshop on Knowledge Discovery in Databases (KDD-94). — 1994. 3. B e r t s i m a s D . , D a r n e l l C . R . , S o u c y R . Portfolio construction through mixed-integer pro- gramming at grantham, mayo, van otterloo and company // Interfaces. — 1999. — 29, N 1. — P. 49–66. 4. B o r g w a r d t M . K . , V i s h w a n a t h a n S . V . N . , K r i e g e l H - P . Class prediction from time series gene expression profiles using dynamical systems kernels // Pacific Symp. on Biocomput. — 2006. — P. 547–558. 5. B r a d l e y P . S . , F a y y a d U . , M a n g a s a r i a n O . L . Mathematical programming for data mining: Formulations and challenges // INFORMS J. Computing. — 1999. — 11. — P. 217–238. ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 171 Fig. 8. Performance comparison of the standard SVMs for EEG classification proposed in [11] versus the the proposed SVMs-DTW algorithm among all 3 patients using leave-one-out cross validation to train and test the algorithm over 100 simulations. 6. B r a d l e y P . S . , M a n g a s a r i a n O . L . , S t r e e t W . N . Clustering via concave minimization. // M.C. Mozer, M.I. Jordan, and T. Petsche, editors. Adv. in Neural Inform. Proces. Systems. — Cambridge: MIT Press, 1997. — P. 368–374. 7. B r e i m a n L . , F r i e d m a n J . , O l s e n R . , S t o n e C . Classification and regression trees. — Belmont: Wadsworth Inc, 1993. 8. C a i a n i E . G . , P o r t a A . , B a s e l l i G . e t a l . Warped-average template technique to track on a cy- cle-by-cycle basis the cardiac filling phases on left ventricular volume // IEEE Comput. in Cardiology. — 1998. — 25, N 98. — CH36292. 9. C h a o v a l i t w o n g s e W . A . , P a r d a l o s P . M . , I a s e m i d i s L . D . e t a l . Applications of global optimization and dynamical systems to prediction of epileptic seizures // P.M. Pardalos, J.C. Sackellares, L.D. Iasemidis, P.R. Carney, editors. Quantitative Neuroscience. — Dordrecht: Kluwer, — 2003. — P. 1–36. 10. C h a o v a l i t w o n g s e W . A . , P a r d a l o s P . M . , P r o k o y e v O . A . Reduction of multi-quadratic 0–1 programming problems to linear mixed 0–1 programming problems // Oper. Res. Letters. — 2004. — 32, N 6. — P. 517–522. 11. C h a o v a l i t w o n g s e W . A . , P a r d a l o s P . M . , P r o k o y e v O . A . Electroencephalogram (EEG) time series classification: Applications in epilepsy // Ann. Oper. Res. — 2006. — 148, N 1. — P. 227–250. 12. D a s g u p t a D . , F o r r e s t S . Novelty detection in time series data using ideas from immunology // In- tern. Conf. on Intell. Systems. — 1999. 13. D i e z J . J . R . , G o n z a l e z C . A . Applying boosting to similarity literals for time series classification // Intern. Workshop on Multiple Classifier Systems. — 2000. — P. 210–219. 14. E l g e r C . E . , L e h n e r t z K . Seizure prediction by non-linear time series analysis of brain electrical activity // Eur. J. Neurosci. — 1998. — 10. — P. 786–789. 15. F u n g G . M . , M a n g a s a r i a n O . L . Proximal support vector machines // 7th ACM SIGKDD Intern. Conf. on Knowledge Discovery and Data Mining. — 2001. 16. G a v r i l a D . M . , D a v i s L . S . Towards 3-d model-based tracking and recognition of human movement: a multi-view approach // Proc. of the Intern. Workshop on Autom. Face- and Gesture-Recognition. — 1995. 17. G e u r t s P . Pattern extraction for time series classification // Principles of Data Mining and Knowledge Discovery, 5th Eur. Conf. — 2001. — P. 115–127. 18. G r o s s m a n R . L . , K a m a t h C . , K e g e l m e y e r P . e t a l . Data mining for scientific and engi- neering applications. — Dordrecht: Kluwer Acad. Publ., 2001. — 628 p. 19. H a n d D . J . , M a n n i l a H . , S m y t h P . Principle of data mining. — Concord: Bradford Books, 2001. — 584 p. 20. H s u C . - W . , L i n C . - J . A comparison of methods multi-class support vector machines // IEEE Trans. on Neural Networks. — 2002. — 13. — P. 415–425. 21. H u r n A . S . , L i n d s a y K . A . , M i c h i e C . A . Modelling the lifespan of human t-lymphocite sub- sets // Math. Biosciences. — 1997. — 143. — P. 91–102. 22. I a n n a t i l l i F . J . , R u b i n P . A . Feature selection for multiclass discrimination via mixed-integer linear programming // IEEE Trans. on Pattern Analysis and Machine Learning. — 2003. — 25. — P. 779–783. 23. I a s e m i d i s L . D . On the dynamics of the human brain in temporal lobe epilepsy: PhD thesis. — Univ. of Michigan, Ann Arbor. — 1991. 24. I a s e m i d i s L . D . , P a r d a l o s P . M . , S a c k e l l a r e s J . C . , S h i a u D . - S . Quadratic binary programming and dynamical system approach to determine the predictability of epileptic seizures // J. Comb. Optimiz. — 2001. — 5. — P. 9–26. 25. I a s e m i d i s L . D . , S h i a u D . - S . , C h a o v a l i t w o n g s e W . A . e t a l . Adaptive epileptic sei- zure prediction system // IEEE Trans. Biomed. Eng. — 2003. — 5, N 5. — P. 616–627. 26. I a s e m i d i s L . D . , Z a v e r i H . P . , S a c k e l l a r e s J . C . , W i l l i a m s W . J . Phase space analysis of eeg in temporal lobe epilepsy // IEEE Eng. in Medicine and Biology Soc., 10th Ann. Intern. Conf. — 1988. — P. 1201–1203. 27. K e o g h E . , C h a k r a b a r t i K . , P a z z a n i M . , M e h r o t r a S . Dimensionality reduction for fast similarity search in large time series databases // Knowledge and Inform. Systems. — 2000. — 3, N 3. — P. 263–286. 28. K e o g h E . , K a s e t t y S . On the need for time series data mining benchmarks: A survey and empirical demon- stration // 8th ACM SIGKDD Intern. Conf. on Knowledge Discovery and Data Mining. — 2002. — P. 102–111. 29. K e o g h E . , P a z z a n i M . An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback // 4th Int’l Conf. on Knowledge Discovery and Data Mining. — 1998. — P. 239–241. 30. K e o g h E . , P a z z a n i M . Scaling up dynamic time warping for datamining applications // Proc. of the 6th ACM SIGKDD Intern. Conf. on Knowledge Discovery and Data Mining. — 2000. — P. 285–289. 31. K r e b e l U . Pairwise classification and support vector machines // Adv. in Kernel Methods — Support Vector Learning. — Cambridge: MIT Press, 1999. — P. 255–268. 32. L e h n e r t z K . , E l g e r C . E . Can epileptic seizures be predicted? Evidence from nonlinear time series analysis of brain electrical activity // Phys. Rev. Lett. — 1998. — 80. — P. 5019–5022. 33. L i t t B . , E s t e l l e r R . , E c h a u z J . e t a l . Epileptic seizures may begin hours in advance of clinical onset: A report of five patients // Neuron. — 2001. — 30. — P. 51–64. 34. M a n g a s a r i a n O . L . Linear and nonlinear separation of pattern by linear programming // Oper. Res. — 1965. — 31. — P. 445–453. 172 ISSN 0023-1274. Êèáåðíåòèêà è ñèñòåìíûé àíàëèç, 2008, ¹ 1 35. M a n g a s a r i a n O . L . , S t r e e t W . N . , W o l b e r g W . H . Breast cancer diagnosis and prognosis via linear programming // Ibid. — 1995. — 43, N 4. — P. 570–577. 36. M a r t i n e r i e J . , V a n A d a m C . , V a n Q u y e n M . L . Epileptic seizures can be anticipated by non-linear analysis // Nature Medicine. — 1998. — 4. — P. 1173–1176. 37. P a c k a r d N . H . , C r u t c h f i e l d J . P . , F a r m e r J . D . Geometry from time series // Phys. Rev. Lett. — 1980. — 45. — P. 712–716. 38. P a r d a l o s P . M . , C h a o v a l i t w o n g s e W . A . , I a s e m i d i s L . D . e t a l . Seizure warning algorithm based on spatiotemporal dynamics of intracranial EEG // Math. Program. — 2004. — 101, N 2. — P. 365–385. 39. Q u i n l a n J . R . C4.5: Programs for Machine Learning. — Orlando: Morgan Kaufmann, 1993. — 302 p. 40. V a n Q u y e n M . L . , M a r t i n e r i e J . , B a u l a c M . , V a r e l a F . Anticipating epileptic seizures in real time by non-linear analysis of similarity between eeg recordings // Neuro Rep. — 1999. — 10. — P. 2149–2155. 41. R a b i n e r L . , J u a n g B . Fundamentals of speech recognition. — Upper Saddle River: Prentice Hall, 1993. — 496 p. 42. R a p p P . E . , Z i m m e r m a n I . D . , A l b a n o A . M . Experimental studies of chaotic neural behav- ior: cellular activity and electroencephalographic signals // H.G. Othmer, editor. Nonlinear oscillations in biology and chemistry. — New York: Springer-Verlag, 1986. — P. 175–205. 43. R ��u p i n g S . SVM kernels for time series analysis // R. Klinkenberg, S. R��uping, A. Fick, N. Henze, C. Herzog, R. Molitor, O. Schr��oder, editors. LLWA 01 — Tagungsband der GI–Workshop–Woche Lernen –Lehren–Wissen–Adaptivit��at. — 2001. — P. 43–50. 44. S c h m i l l M . , O a t e s T . , C o h e n P . Learned models for continuous planning // Proc. of the Seventh Intern. Workshop on Artif. Intell. and Statist. — 1999. — P. 278–282. 45. S c h o l k o p f B . , B u r g e s C . , V a p n i k V . Extracting support data for a given task // Proc. First In- tern. Conf. on Knowledge Discovery and Data Mining. — Menlo Park: AAAI Press. — 1995. 46. S c h ��o l k o p f B . The kernel trick for distances: Techn. Rep. / Microsoft Research. — 2000. 47. S h i m o d a i r a H . , N o m a K . , N a k a M . , S a g a y a m a S . Support vector machine with dynamic time-alignment kernel for speech recognition // Proc. of Eurospeech. — 2001. — P. 1841–1844. 48. T a k e n s F . Detecting strange attractors in turbulence // D.A. Rand, L.S. Young, editors. Dynamical systems and turbulence; Lecture Notes in Mathematics. — Berlin: Springer-Verlag, 1981. 49. V a p n i k V . N . The nature of statistical learning. — Berlin: Springer, 1995. — 16 p. 50. W a n V . , C a r m i c h a e l J . Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training data // Proc. of Interspeech. — 2005. — P. 3321–3324. 51. W o l f A . , S w i f t J . B . , S w i n n e y H . L . , V a s t a n o J . A . Determining Lyapunov exponents from a time series. Physica D. — 1985. — 16. — P. 285–317. 52. Y a n g K . , S h a h a b i C . A pca-based kernel for kernel pca on multivariate time series // Proc. of ICDM 2005 Workshop on Temporal Data Mining: Algorithms, Theory and Applications held in conjunction with The Fifth IEEE Intern. Conf. on Data Mining (ICDM’05). — 2005. Ïîñòóïèëà 24.11.2006
id nasplib_isofts_kiev_ua-123456789-71979
institution Digital Library of Periodicals of National Academy of Sciences of Ukraine
language English
last_indexed 2025-12-07T15:26:31Z
publishDate 2008
publisher Інститут кібернетики ім. В.М. Глушкова НАН України
record_format dspace
spelling Chaovalitwongse, W.A.
Pardalos, P.M.
2014-12-15T17:27:34Z
2014-12-15T17:27:34Z
2008
On the time series support vector machine using dynamic time warping kernel for brain activity classification / W.A. Chaovalitwongse, P.M. Pardalos // Кибернетика и системный анализ. — 2008. — № 1. — С. 159-173. — Бібліогр.: 52 назв. — англ.
https://nasplib.isofts.kiev.ua/handle/123456789/71979
612.821:51+519.7+519.8
Запропоновано нову технологію аналізу даних, що використовується для класифікації нормальних і передуючих нападам електроенцефалограм. Технологія заснована на використанні ядра динамічного перетворення масштабу часу, об'єднаного з методом опорних векторів (SVM). Результати експериментів показали, що запропонована технологія значно перевершує стандартну SVM і дозволяє покращити класифікацію активності мозку.
Research was partially supported by Rutgers Research Council grant-202018, the NSF grants CCF-0546574, DBI-980821, EIA-9872509, CCF 0546574, and NIH grant R01-NS-39687-01A1.
en
Інститут кібернетики ім. В.М. Глушкова НАН України
Кибернетика и системный анализ
Системный анализ
On the time series support vector machine using dynamic time warping kernel for brain activity classification
Article
published earlier
spellingShingle On the time series support vector machine using dynamic time warping kernel for brain activity classification
Chaovalitwongse, W.A.
Pardalos, P.M.
Системный анализ
title On the time series support vector machine using dynamic time warping kernel for brain activity classification
title_full On the time series support vector machine using dynamic time warping kernel for brain activity classification
title_fullStr On the time series support vector machine using dynamic time warping kernel for brain activity classification
title_full_unstemmed On the time series support vector machine using dynamic time warping kernel for brain activity classification
title_short On the time series support vector machine using dynamic time warping kernel for brain activity classification
title_sort on the time series support vector machine using dynamic time warping kernel for brain activity classification
topic Системный анализ
topic_facet Системный анализ
url https://nasplib.isofts.kiev.ua/handle/123456789/71979
work_keys_str_mv AT chaovalitwongsewa onthetimeseriessupportvectormachineusingdynamictimewarpingkernelforbrainactivityclassification
AT pardalospm onthetimeseriessupportvectormachineusingdynamictimewarpingkernelforbrainactivityclassification