CASE: A Platform for Crowdsourcing Based API Search

Liang, Tingting; Chen, Liang; Xie, Zhining; Yang, Wei; Wu, Jian

doi:10.1007/978-3-662-48616-0_34

Tingting Liang¹⁷,
Liang Chen¹⁷,
Zhining Xie¹⁷,
Wei Yang¹⁷ &
…
Jian Wu¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 9435))

Included in the following conference series:

International Conference on Service-Oriented Computing

1552 Accesses

Abstract

With the rapid growth of Web APIs on the Internet, searching appropriate APIs is becoming a challenging problem. General API search systems (e.g., ProgrammableWeb) implement API search through simple keywords matching leading to unsatisfactory search results. In this paper, we presents a crowdsourcing based API search engine CASE. Specifically, the API search engine leverages social information, Twitter List, a tool used by individual users to organize accounts that interest them on semantics. Based on the lists information, Latent Semantic Indexing (LSI) model is employed to compute the semantic similarity between the APIs and queries. Furthermore, the popularity of APIs inferred from the lists number is integrated with the semantic similarity to generate the final search result.

You have full access to this open access chapter, Download conference paper PDF

Exploiting Social Tagging in Web API Search

Crowdsourcing for Web Service Discovery

What did you see? A study to measure personalization in Google’s search engine

Article Open access 16 December 2019

Tobias D. Krafft, Michael Gamer & Katharina A. Zweig

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction and Motivation

With the development of Web and mobile applications, a form of service called application programming interface (API) becomes prevalent. An API is a gateway to other people’s software and can be employed to invoke a third-party software component over the Internet. Compared to the traditional web services, APIs are in a more popular style and can make solution more accessible and more useful, which lead to a rapid growth of Web APIs. According to the analysis offered by ProgrammableWeb (PW)^{Footnote 1}, a platform dedicated to manage APIs and mashups, the number of its following APIs increased fast in recent years and has reached 10850 by October 2014. Thus, how to discover the appropriate APIs becomes a hot issue. Since general API search systems consider the keywords match rather than the real semantic or topic information of APIs, the performance of API search is limited. Taking PW as an example, if a consumer takes “travel” as the query in search system, the top result is “Webcams.travel” which is a directory of touristic webcams and classified in Video category. Absolutely, “Webcams.travel” is not the objective API. The reason is that PW search system only seeks APIs whose names or descriptions contain the query.

To alleviate the limitation of general API search systems, introducing crowdsourcing information into the search method is a popular and novel strategy. And it has been proved effective in many other fields, such as information retrieval, social network, etc. [2, 4]. In this paper, we introduce extra crowdsourcing information, named as Twitter lists [3], to improve the performance of API search. Twitter lists, which are used to help users organize the Twitter accounts they are interested in, include list names and descriptions that contain a wealth of semantic information. Figure 1 illustrates the contents of lists by showing several lists including LinkedIn. Most of the list names are related to ‘Social Media’ and ‘Career’ along with the corresponding list descriptions, which distinctly reflect the topic of LinkedIn. After being matched with APIs in PW, we crawled 3877 APIs’ Twitter accounts and the lists involving them. One API could be involved in many lists and the most reaches 84511. 30.8 % of total APIs are organized by more than 100 lists, and 70.1 % are included by at least 10 lists. Thus the list information of most APIs is full of richness.

In this paper, we build a demonstration of API search engine called CASE. The critical idea of CASE is to infer an API’s topic by analyzing semantics of lists including the API Twitter account. Based on the lists information, Latent Semantic Indexing (LSI) [1] is employed to compute the semantic similarity between APIs and queries. Additionally, the popularity of APIs inferred from the lists number is integrated with the semantic similarity to generate the final search result. Therefore, CASE could provide semantics matched and popularity satisfied results for a given query.

2 Algorithm Framework

Figure 2 illustrates the whole framework of algorithm applied in the API search engine. The list data for API search is crawled from Twitter after being matched with APIs crawled from PW. Generally, the process of the algorithm could be divided into two parts. The first part is utilizing the name and description information of lists to compute the semantic similarity, including three steps: pre-processing, LSI and similarity calculation. After the data pre-processing like tokenization, stop words removal, stemming, etc., the features extracted from lists and the given query could be mapped into a conceptual space through LSI. The returned relevant APIs are ranked according to cosine similarity, thus the semantics based rank score of an API and a given query is defined as:

$$\begin{aligned} score_{s}(q,API_{i}) =sim(q,d_{i}) =\frac{\sum _{j}w_{q,j}w_{i,j}}{\sqrt{\sum _{j}{w^{2}}_{q,j}}\sqrt{\sum _{j}{w^{2}}_{i,j}}}, \end{aligned}$$

(1)

where $\mathbf w $ denotes term vector of an API or query in latent semantic space. Another part is about popularity calculation for each API based on its list number, which reflects how popular an API is among Twitter users to a certain extent. The popularity based rank score of an API is define as:

$$\begin{aligned} score_{p}(API_{i})=\frac{\log _{10} N-\log _{10} min}{\log _{10} max-\log _{10} min} \end{aligned}$$

(2)

where N represents the number of lists including the ith API, min and max respectively denote the minimal and maximal numbers of APIs’ lists.

The final rank score is decided by the integration of semantic similarity and API’s popularity, since the two measures synthetically satisfy the demands of users. Thus, combining the last two equations, we define the final rank score as:

$$\begin{aligned} score(q,API_{i})=\lambda score_{s}(q,API_{i}) +(1-\lambda ) score_{p}(API_{i}),\lambda \in [0,1] \end{aligned}$$

(3)

where $\lambda $ is a weight to balance the importance of semantic similarity and popularity.

3 User Interface

Our crowdsourcing based API search engine is online available, users can use it to search APIs by visiting http://zjumsi.com/projects/case/index.php. The overall pages in the search platform are designed based on HTML5. The first page offers the crowdsourcing based API search function of the platform, and the next pages show some information and analysis about APIs, Twitter lists, and the demo video^{Footnote 2}.

Figure 3(a) depicts the search interface and 52 sample queries from 12 categories are offered here. Users can click any sample query or input a query manually for searching the appropriate APIs. Figure 3(b) shows the search result page while user chooses a sample query holiday. It can be easily found that each search result entity mainly includes four parts: (1) API name; (2) API description; (3) the category API belongs to; (4) the lists number including API which shows API’s popularity. Moreover, the first icon beside the API name links to the API page in ProgrammableWeb, which offers more detailed information about the API. When click the second icon, it will show the page of Twitter lists including the API, helping users to better understand the crowdsourcing knowledge about API.

4 Conclusion and Outlook

This paper demonstrates a crowdsourcing based API search platform named CASE. CASE applies LSI model to calculate semantic similarity based on Twitter lists information, and integrates it with API popularity to infer the final search result. Therefore, the search platform offers semantics matched and popularity satisfied results for a user query.

To enrich the function of the API search platform, it is considered to assign tags for each API which can be used to find APIs with similar function, and recommend APIs for a specific mashup in our future work.

Notes

1.
ProgrammableWeb: http://www.programmableweb.com.
2.
The demo video also can be found at https://youtu.be/D6xTzFkAXjQ.

References

Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. JAsIs 41(6), 391–407 (1990)
Article Google Scholar
Ghosh, S., Sharma, N., Benevenuto, F., Ganguly, N., Gummadi, K.: Cognos: crowdsourcing search for topic experts in microblogs. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 575–590. ACM (2012)
Google Scholar
Kallen, N.: Twitter blog: Soon to launch: Lists (2009)
Google Scholar
Lamere, P.: Social tagging and music information retrieval. J. New Music Res. 37(2), 101–114 (2008)
Article Google Scholar

Download references

Acknowledgment

This research was partially supported by the National Technology Support Program under grant of 2011BAH16B04, the National Natural Science Foundation of China under grant of 61173176, Science and Technology Program of Zhejiang Province under grant of 2013C01073, National High-Tech Research and Development Plan of China under Grant No. 2013AA01A604.

Author information

Authors and Affiliations

Zhejiang University, Hangzhou, China
Tingting Liang, Liang Chen, Zhining Xie, Wei Yang & Jian Wu

Authors

Tingting Liang
View author publications
You can also search for this author in PubMed Google Scholar
Liang Chen
View author publications
You can also search for this author in PubMed Google Scholar
Zhining Xie
View author publications
You can also search for this author in PubMed Google Scholar
Wei Yang
View author publications
You can also search for this author in PubMed Google Scholar
Jian Wu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tingting Liang .

Editor information

Editors and Affiliations

Queensland University of Technology, Brisbane, Queensland, Australia
Alistair Barros
Université Paris Dauphine, Paris, France
Daniela Grigori
M.S. Ramaiah University, Bangalore, India
Nanjangud C. Narendra
University of Wollongong, Wollongong, New South Wales, Australia
Hoa Khanh Dam

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Liang, T., Chen, L., Xie, Z., Yang, W., Wu, J. (2015). CASE: A Platform for Crowdsourcing Based API Search. In: Barros, A., Grigori, D., Narendra, N., Dam, H. (eds) Service-Oriented Computing. ICSOC 2015. Lecture Notes in Computer Science(), vol 9435. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48616-0_34

Download citation

DOI: https://doi.org/10.1007/978-3-662-48616-0_34
Published: 25 November 2015
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48615-3
Online ISBN: 978-3-662-48616-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

CASE: A Platform for Crowdsourcing Based API Search

Abstract

Similar content being viewed by others