Abstract
We discuss the significance of designing views on data in a computational system that assists scientists in the process of discovery. A view on data is a particular way of interpreting the data. In the scientific literature, devising a new view that captures the essence of the data is a key to discovery. HypothesisCreator, a system we have been developing to assist scientists in the process of discovery, supports users in designing views on data and can search for good views on the data. In this paper we report a series of computational experiments on scientific data with HypothesisCreator, together with analyses of the hypotheses it produced. Some of these hypotheses select several views that explain the given data well, found by searching over ten million designed views. Through these experiments we have become convinced that the view is one of the important factors in the discovery process, and that discovery systems should be able to design and select views on data in a systematic way, so that experts on the data can apply their knowledge and ideas efficiently to their own purposes.
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maruyama, O., Uchida, T., Sim, K.L., Miyano, S. (1999). Designing Views in HypothesisCreator: System for Assisting in Discovery. In: Arikawa, S., Furukawa, K. (eds) Discovery Science. DS 1999. Lecture Notes in Computer Science, vol 1721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46846-3_11
DOI: https://doi.org/10.1007/3-540-46846-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66713-1
Online ISBN: 978-3-540-46846-2
eBook Packages: Springer Book Archive