Wednesday, April 28, 2010

Project proposal submitted

I submitted the proposal last week. The title was "Genre classification of books using Non-negative Matrix Factorization".

I will be getting all the text data (ebooks) from Project Gutenberg (http://www.gutenberg.org/). I plan to use 100 ebooks and classify them. Romance, horror, fantasy and politics are the main genres I am looking for. I wanted to have thriller in my list as well, but the keyword occurrences in horror and thriller are very similar, which would not have given clear results in the end. Feeding in the data and converting it into vector space models so that it can be used for affect recognition will be an issue. I plan to ask Mac Kim for help with this, as he has already done research in this area.
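To make the plan a bit more concrete, here is a minimal sketch of the two steps mentioned above: turning texts into a vector space model and factorizing the result with NMF. Everything in it is an assumption for illustration, not the final method: the four toy "books", the smoothed TF-IDF weighting, the helper names (tfidf_matrix, nmf, frobenius_error), the rank of 2 and the Lee and Seung multiplicative update rule are all placeholders. Real Gutenberg texts would also need proper tokenization and stop-word removal.

```python
import math
import random
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, V), where V[i][j] is a TF-IDF weight of term i in document j."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenised for w in toks})
    n_docs = len(docs)
    # Number of documents each term appears in.
    doc_freq = Counter(w for toks in tokenised for w in set(toks))
    counts = [Counter(toks) for toks in tokenised]
    # Assumed weighting: raw count times log(1 + N / df), which is always positive.
    V = [[counts[j][w] * math.log(1 + n_docs / doc_freq[w]) for j in range(n_docs)]
         for w in vocab]
    return vocab, V

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - WH, used here only to check the fit."""
    WH = matmul(W, H)
    return math.sqrt(sum((v - wh) ** 2
                         for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (terms x docs) as W (terms x rank) times H (rank x docs)."""
    rng = random.Random(seed)
    n_terms, n_docs = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_terms)]
    H = [[rng.random() + eps for _ in range(n_docs)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H

# Toy stand-ins for ebooks (hypothetical data, not from Project Gutenberg).
docs = [
    "love heart kiss wedding love",
    "heart romance kiss love",
    "ghost blood scream haunted ghost",
    "haunted scream blood grave",
]
vocab, V = tfidf_matrix(docs)
W, H = nmf(V, rank=2)

# Each column of H holds one document's weights over the 2 latent components;
# the largest weight is taken as that document's component.
dominant = [max(range(2), key=lambda k: H[k][j]) for j in range(len(docs))]
print(dominant)
```

The idea is that each latent component in W groups terms that tend to occur together, and each column of H says how strongly a book loads on those components. Whether the components line up with the genres above is exactly what the project has to find out.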

Thursday, April 15, 2010

Building up on my thesis proposal

I have read the initial literature given by Rafa. I found the paper "Opinion Mining and Sentiment Analysis" by Bo Pang and Lillian Lee very comprehensible for a beginner like me in this area. The paper "Towards a Framework of Literature Review Process in Support of Information Systems Research" by Yair Levy and Timothy J. Ellis gave me a solid understanding of why a good literature review is needed, and encouraged me to look for quality literature to develop my background knowledge in this area. "Sentiment Analysis in Student Experiences of Learning" by Sunghwan Mac Kim and Rafael A. Calvo introduced me to the various techniques applied to judging sentiment in text and to how emotion is modelled. So far I have found the following papers, which will assist me with my thesis:

Shaikh, M. A., Prendinger, H., and Ishizuka, M. 2008. Sentiment assessment of text by analyzing linguistic features and contextual valence assignment. Appl. Artif. Intell. 22, 6 (Jul. 2008), 558-601. DOI= http://dx.doi.org/10.1080/08839510802226801

Wu, C., Chuang, Z., and Lin, Y. 2006. Emotion recognition from text using semantic labels and separable mixture models. ACM Transactions on Asian Language Information Processing (TALIP) 5, 2 (Jun. 2006), 165-183. DOI= http://doi.acm.org/10.1145/1165255.1165259

Gill, A. J., French, R. M., Gergle, D., and Oberlander, J. 2008. The language of emotion in short blog texts. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work (San Diego, CA, USA, November 08 - 12, 2008). CSCW '08. ACM, New York, NY, 299-302. DOI= http://doi.acm.org/10.1145/1460563.1460612

Strapparava, C. and Mihalcea, R. 2008. Learning to identify emotions in text. In Proceedings of the 2008 ACM Symposium on Applied Computing (Fortaleza, Ceara, Brazil, March 16 - 20, 2008). SAC '08. ACM, New York, NY, 1556-1560. DOI= http://doi.acm.org/10.1145/1363686.1364052