
1 Introduction

1.1 Motivation

As an example, consider a data set (shown below) composed of many variables, all of which are numerical except for two categorical ones, gender and marital status [50]:

Table 1. Original mixed variables

Many machine learning models require the data to be of numerical type, so the categorical variables have to be converted into numerical ones. The most efficient way of converting a categorical variable is the introduction of dummy variables (one-hot encoding), in which a new (dummy) variable is created for each category of the categorical variable except the last one, since the last category is redundant: its value can be determined once all the other dummy variables are known. These dummy variables are binary and can assume only two values, 1 and 0. The value 1 means the sample belongs to that category and 0 means the opposite.

Here, for this example, we have two categorical variables:

  1. Gender: there are only two categories, so we need to create one dummy variable.

  2. Marital Status: there are three categories, so we need to create two new dummy variables.

The result after the creation of dummy variables is shown in Table 2.

Table 2. The original variables after the introduction of dummy variables.

After this transitional step, we can use any machine learning model on this data set, as all of its variables are now numerical.
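As an illustration of this step, a minimal sketch in Python using pandas is shown below; the data frame values are hypothetical stand-ins for Table 1, and drop_first=True keeps m − 1 dummies per variable, mirroring the rule described above.

```python
import pandas as pd

# Hypothetical version of the data in Table 1 (values are illustrative only).
df = pd.DataFrame({
    "Age":            [25, 32, 47, 51],
    "Gender":         ["Male", "Female", "Female", "Male"],
    "Marital Status": ["Single", "Married", "Divorced", "Married"],
    "Income":         [40000, 55000, 72000, 68000],
})

# One-hot encode the two categorical columns; drop_first=True keeps m - 1
# dummies per variable, since the dropped category is implied when all
# remaining dummies are 0.
encoded = pd.get_dummies(df, columns=["Gender", "Marital Status"], drop_first=True)
print(encoded)
```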

In general, for any categorical variable of “m” categories (classes), we need to create “m − 1” dummy variables. A problem arises when a categorical variable has a large number of categories (based on our work, larger than 8). In these cases, the number of dummy variables that must be created becomes so large that the data becomes high dimensional. High dimensionality leads to the “curse of dimensionality” and all of its related issues, such as the need for an exponential increase in the number of data rows and difficulties in computing distances. Obviously, one needs to avoid this situation since, in addition to these problems, the curse of dimensionality also leads to misleading results from machine learning models, such as false patterns discovered from noise or random chance. Moreover, higher dimensionality leads to higher computational cost, slower model response, and lower robustness, all of which should be avoided. Therefore, in the process of transforming categorical data into numerical data, we must limit the number of newly created numerical variables in order to keep the dimension of the data low [50].

Two examples of categorical variables with a large number of categories or classes are “country of residence” and URL-related data such as “the last site visited by the user”. For the first variable, there are more than 150 categories; for the second, there are potentially as many categories as there are users, which is a very large number (on the order of millions). To address these types of problems, this work establishes a new approach that reduces the number of categories (when the number of categories in a categorical variable is larger than 10) to K categories for \( {\text{K}} \le 10 \). This way, we create only a limited number of dummy variables to replace the categorical variable in the data set.

For some types of categorical variables, such as “country of residence”, we may find attributes online; using these attributes together with web scraping and clustering models, we can create only a handful of dummy variables to replace a categorical variable with many categories [50].

However, there are other types of categorical variables, such as the “URL” variable, for which it is not possible to scrape features online, and thus the above method [50] cannot be applied. This paper focuses on a method for dealing with this type of categorical data.

2 The Approach Used in This Work

2.1 The Difficulties in Dealing with Modern Data

Quite often, machine learning models can use only numerical data, while practically all data sets used in machine learning are of mixed type, containing both numerical and categorical data. When such data are fed to models that accept only numerical input, mixed data types are handled in one of three ways: the first approach is to use, instead, models that can handle mixed data types; the second is to ignore (drop) the categorical variables; the third is to convert the categorical variables to numerical type by introducing dummy variables. The first approach introduces many limitations, as only a limited number of models can handle mixed data and those models are often not the best fit for the data set. The second approach discards much of the information in the data set, namely the categorical data. The practical approach is the third one, i.e., the conversion of categorical data into numerical data. As explained above, this works well only when all categorical variables have a limited number of categories (10 or less); otherwise, it leads to high dimensional data that causes, among other problems, machine learning models to produce meaningless (biased) results. In other words, when a variable has many classes, this approach becomes infeasible because the number of variables will be too large for the numerical models to handle.

This work detects a much smaller number of “latent classes” that underpin the original categories of each categorical variable. This way, high dimensionality is avoided, and we can apply the dummy variable generation described above to these latent classes and then use any machine learning model. The small number of latent categories is detected using k-means clustering.

The basic idea is that categorical variables that have many values (or unique values for each sample) provide little information for other samples. To maintain the useful information in these variables, the best method is to keep their useful (latent) information. This work does so by finding the latent categories, i.e., by clustering all categories into similar groups. When applying k-means clustering to the categories of a categorical variable, we may encounter two distinct cases. The first is when each category comes with given features or attributes; this is rarely seen in real data sets. The second case is when there are no such attributes for the categories and we need to create them.

In cases where we have features for all categories or classes of a variable, we can use k-means clustering directly. Quite often, though, there is no attribute information about these classes in the data set. This work uses NLP (Natural Language Processing) [2, 13, 18,19,20, 53, 57] models to address the case of categorical variables without any attributes or features. The objective is to find a small number of dummy variables to replace the categorical variable that we want to convert into a numerical one.
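For the first case, where per-category attributes are available, the clustering step is direct. Below is a minimal sketch with scikit-learn; the attribute table for a “country of residence” variable is a hypothetical illustration, not data from this work.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical attribute matrix: one row per category (country), with
# numeric attributes such as GDP per capita and population (millions).
country_features = np.array([
    [65000, 330],   # country A
    [42000, 83],    # country B
    [2300, 1400],   # country C
    [1900, 210],    # country D
])

# Scale the attributes, then group the categories into a small number
# of latent classes with k-means.
X = StandardScaler().fit_transform(country_features)
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)

# Each original category is now represented by its cluster label, which
# can be one-hot encoded instead of the full category list.
print(kmeans.labels_)
```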

We show our approach using the important example of a URL variable.

2.2 Application of Our Model by Using the Example of URL Data

Categorical variables containing URLs are an important example of this type of categorical variable. They are frequently present in click data and often have a very large number of possible values, sometimes as many as the number of users.

To extract the latent categories from these URL variables, we cluster them into groups of similar URLs, i.e., URLs with similar paths. We extract word and character n-gram vector representations from the URLs and then cluster these vector representations using k-means clustering.

URL clustering is a good example because of the difficulty of the task. The difficulty stems not only from the number of URLs but also from the lack of information (attributes) about them that can be used for clustering. When no information about the variable is available, we need to use NLP. It is important that we use NLP to perform the clustering because we have no knowledge of the format of the URLs, i.e., we have no attributes for each URL, and clustering cannot be done without attributes. In this case, we use NLP to build the needed attributes for the URLs. When all URLs share the same domain, such as www.google.com, the clusters would all fall under www.google.com. However, the URLs could also span multiple domains, in which case the clusters would be spread across multiple domains. A predetermined algorithm would not be able to handle this variability dynamically. This is another reason that, in the case of URLs, we use NLP to cluster them based on syntactic similarity, specifically word bigrams, i.e., groups of two consecutive words. Our categorical variable has 500 categories, all under the domain www.adobe.com. A few of these categories are shown in Fig. 1:

Fig. 1. The example URL variable list with 500 different categories.

For the algorithm to work best, we first strip the URLs of any characters and words that provide little information for clustering (since they introduce no new information). These include punctuation and common words such as “http” and “www”. We therefore perform pre-processing on the list, which includes removing punctuation, queries (anything after the character “?”), and stop-words (http, com, www, html, etc.). After this step, we are left with the URLs as space-separated words representing the path of each URL (Fig. 2).

Fig. 2. The process of deleting noisy words from the URL variable.
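A minimal sketch of this pre-processing step in Python is given below; the stop-word list and the regular expression are illustrative assumptions, not the exact rules used in this work.

```python
import re

# Illustrative stop-words; in practice the list covers scheme and domain
# boilerplate and file extensions that carry no clustering signal.
STOP_WORDS = {"http", "https", "www", "com", "html"}

def clean_url(url: str) -> str:
    url = url.split("?", 1)[0]                 # drop the query string
    tokens = re.split(r"[^A-Za-z0-9]+", url)   # split on punctuation
    tokens = [t.lower() for t in tokens if t and t.lower() not in STOP_WORDS]
    return " ".join(tokens)                    # space-separated path words

# Hypothetical example URL:
print(clean_url("https://www.adobe.com/products/photoshop.html?promo=1"))
# -> "adobe products photoshop"
```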

A sample of the result is shown in Fig. 3:

Fig. 3. The URL data after the removal of words that may be irrelevant for clustering.

One of the most popular tools in NLP is the representation of words as numerical vectors in an n-dimensional space: using the context of a word, the word can be mapped into an n-dimensional vector space. Learned representations such as word embeddings are increasingly popular for modeling semantics in NLP, as they reduce semantic composition to simple vector operations. We have modified and extended traditional representation learning techniques [13, 18, 50] to support multiple word senses and uncertain representations.

In this work, we use a modification in which, instead of projecting individual words, we project whole URLs containing multiple words. We use these words and their contexts as features for the projection of the whole URL (Fig. 4).

Fig. 4. Vector representation of the URL data.

Using the cleaned list, we extract vector representations of the URLs with the tool “Sally”. Sally maps a set of strings to a set of vectors. The features that we use for this mapping are word bi-grams and character tri-grams. Thus, using these n-grams of the URLs as features, we project the URLs into vector space using Sally. Sally represents the URLs using a sparse matrix representation: the URLs are projected into very long vectors, with each dimension representing an n-gram that has been seen in the dataset. If an n-gram has been observed in a URL, its value in the vector is 1; otherwise, the value is 0. This results in a long vector with most values equal to 0 and a few values equal to 1. All the vectors together form a matrix that is sparse because of its many 0 values. Finally, we apply k-means clustering to this embedding. Given that the URLs have been transformed into points in an n-dimensional vector space, k-means clustering can find groups of points and partition them into clusters. Given a number K, the number of clusters for the algorithm to discover, k-means finds the best partitioning of the dataset such that the points within each cluster are as similar to each other as possible. In the context of URLs, this means finding the groups of URLs that share the most n-grams. Figure 5 shows that the best K value is 10.

Fig. 5. The computation of the optimal number of clusters using word and character n-grams.
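A minimal sketch of the vectorization and clustering steps is given below, using scikit-learn's CountVectorizer as a stand-in for Sally; the binary word-bigram and character-trigram features and K = 10 follow the description above, while the function name and any sample URLs are assumptions for illustration.

```python
from scipy.sparse import hstack
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

def cluster_urls(cleaned_urls, k=10):
    """Cluster cleaned URL strings into k latent categories."""
    # Binary word-bigram and character-trigram features, analogous to
    # the sparse string representation produced by Sally.
    word_vec = CountVectorizer(analyzer="word", ngram_range=(2, 2), binary=True)
    char_vec = CountVectorizer(analyzer="char", ngram_range=(3, 3), binary=True)
    X = hstack([word_vec.fit_transform(cleaned_urls),
                char_vec.fit_transform(cleaned_urls)])
    # k-means groups the URLs that share the most n-grams.
    return KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)

# Usage (assuming the 500 cleaned URLs of Fig. 3 are held in `cleaned_urls`):
# labels = cluster_urls(cleaned_urls, k=10)
```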

2.3 Computing the Optimal Number of Clusters

To compute the optimal number of clusters, we use the Silhouette method, which is based on minimizing the dissimilarities inside each cluster and maximizing the dissimilarities between clusters [31, 50]:

The Silhouette model computes s(i) for each data point in the data set for each K:

$$ s(i) = \frac{b(i) - a(i)}{\max \left\{ a(i), b(i) \right\}} $$

where \( a\left( i \right) \) is the mean distance of point i to all the other points in its own cluster, and \( b\left( i \right) \) is the mean distance of point i to all the points in its closest neighboring cluster, i.e., \( b\left( i \right) \) is the minimum, over all clusters that i is not a member of, of the mean distance from point i to the points of that cluster.

The optimal K is the K that maximizes the average score s(i) over the whole data set. The score values lie in the range [−1, 1], with −1 being the worst possible score and +1 the optimal one. Thus, the K whose average score (over all points) is closest to +1 is the optimal K. Our experiments show that the value of K has an upper bound of 10. Here, we use not only the score but also the separation and compactness of the clusters, as measured by the distance between clusters and the uniformity of the cluster widths, to test and validate our model while computing the optimal K. Figure 6 depicts the Silhouette model for different K [50].

Fig. 6. Using the Silhouette model to compute the optimal number of clusters, found to be 10.
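A minimal sketch of this K selection is shown below, assuming the sparse n-gram matrix X from the previous step; scikit-learn's silhouette_score returns the mean s(i) over all points for each candidate K, and the helper name is an assumption for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k(X, k_min=2, k_max=10):
    """Return the K in [k_min, k_max] with the highest mean silhouette score."""
    scores = {}
    for k in range(k_min, k_max + 1):
        labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
        scores[k] = silhouette_score(X, labels)  # mean s(i) over all points
    return max(scores, key=scores.get), scores

# Usage (X is the sparse n-gram matrix of the URLs):
# k_opt, scores = best_k(X)
```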

Using the results from the Silhouette model, we apply k-means clustering to the URL data. Some of the clusters are shown in Fig. 7.

Fig. 7. Some of the clusters for the URL data.

As the figure above shows, our method has grouped together URLs with similar paths and separated URLs with dissimilar paths.

3 The Results and Conclusion

This work provides a method of converting categorical variables to numerical variables so that machine learning models can use the data. For this conversion to be feasible for categorical variables with many classes, we propose using clustering to reduce the number of classes in the variable to a small number before dummy variable generation. Some variables have accessible features that make it possible to cluster them directly, but many variables lack the information or features needed by clustering models. This work deals effectively with the latter type of categorical variable and assumes that no extra features or information are available, either explicitly or implicitly (e.g., by web scraping), for such variables. For the model to work, we use NLP to create a vector representation of the variables; we then use this vector representation to cluster the variables, i.e., to cluster the categories of each variable.

This work provides a new method, and the only practical one, for standardizing categorical variables when the variables have a large number of categories or classes and have no explicitly or implicitly available features. Our model avoids the deletion of categorical variables and thus the loss of information that causes machine learning models to produce meaningless results. This work also avoids the creation of high dimensional data, where the “curse of dimensionality” leads to high computational cost, the need for exponentially larger data sets, distorted values for distance metrics, and biased models.