Monday, August 9, 2010

Log file

I have created the repositories of the ebooks.


I am using a utility called Luke to look inside the repository. Next I will implement the NMF algorithm to obtain the reduced vector space models.
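As a rough illustration of what the NMF step does, here is a minimal sketch in plain Python. It uses the standard multiplicative update rules for the squared-error objective, which is one common way to compute NMF and not necessarily the variant I will end up using. The toy matrix, the function names, and the parameters (`rank`, `iterations`, `seed`, `eps`) are all assumptions made up for this example; none of this is TML code.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (terms x documents) into W and H.

    W is terms x rank and H is rank x documents, both non-negative.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Random positive starting point (hypothetical choice for this sketch).
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [
            [h[i][j] * numer[i][j] / (denom[i][j] + eps) for j in range(n_cols)]
            for i in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [
            [w[i][j] * numer[i][j] / (denom[i][j] + eps) for j in range(rank)]
            for i in range(n_rows)
        ]
    return w, h


# Toy term-document matrix: 4 terms (rows) x 4 documents (columns).
term_doc = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
w_start, h_start = nmf(term_doc, rank=2, iterations=0)
w, h = nmf(term_doc, rank=2, iterations=200)
error_before = frobenius_error(term_doc, w_start, h_start)
error_after = frobenius_error(term_doc, w, h)
```

Each column of `h` is a document described by only `rank` numbers instead of one number per term, which is the reduced representation. A simple way to use it for classification would be to assign each document to the factor with the largest weight in its column.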

Tuesday, June 1, 2010

Creating Repositories for the ebooks

Rafa provided me with the TML (Text Mining Library) code, which proved very useful for loading the ebooks onto the computer. This utility builds repositories of the processed documents and indexes them at the same time. Once the repositories of the ebooks are created, I can implement the NMF (non-negative matrix factorization) algorithms to classify the ebooks.

Wednesday, April 28, 2010

Project proposal submitted

I submitted the proposal last week. The title was "Genre classification of books using Non-negative Matrix Factorization".

I will be getting all the text data (ebooks) from Project Gutenberg (http://www.gutenberg.org/). I plan to use 100 ebooks and classify them. Romance, horror, fantasy, and politics are the main genres I am looking for. I wanted to include thriller in my list, but the keyword occurrences in horror and thriller are very similar, which would not have given clear results in the end. Loading the data and converting it into vector space models so that it can be used for affect recognition will be a challenge. I plan to ask Mac Kim for help with this, as he has already done research in this area.
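To make the "converting it into vector space models" step concrete, here is a minimal sketch in plain Python that builds a TF-IDF weighted term-document matrix from raw text. The tokenizer, the weighting formula (raw term count times log(N / document frequency)), the function names, and the three toy sentences are all assumptions for illustration; the real pipeline may tokenize and weight the ebooks differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[t][d] is the TF-IDF weight
    of term t in document d."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    n_docs = len(documents)
    # Number of documents each term appears in.
    doc_freq = {term: sum(1 for c in counts if term in c) for term in vocabulary}
    matrix = [
        [c[term] * math.log(n_docs / doc_freq[term]) for c in counts]
        for term in vocabulary
    ]
    return vocabulary, matrix


# Three made-up "documents" standing in for ebooks.
docs = [
    "the ghost haunted the castle",
    "the lovers met at the castle",
    "the ghost screamed",
]
vocabulary, matrix = build_term_document_matrix(docs)
```

With this weighting, a word that occurs in every document (here "the") gets a weight of zero, while words that occur in only one document get the largest weights. All weights are non-negative, which is what NMF requires of its input matrix.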

Thursday, April 15, 2010

Building up on my thesis proposal

I have read the initial literature given by Rafa. I found the paper "Opinion Mining and Sentiment Analysis" by Bo Pang and Lillian Lee very accessible for a beginner like me in this area. The paper "Towards a Framework of Literature Review Process in Support of Information Systems Research" by Yair Levy and Timothy J. Ellis gave me a solid understanding of why a good literature review is needed, and encouraged me to look for quality literature to develop my background knowledge in this area. "Sentiment Analysis in Student Experiences of Learning" by Sunghwan Mac Kim and Rafael A. Calvo introduced me to the various techniques applied to judging sentiment in text and to how emotion is modelled. So far I have found the following papers, which will assist me with my thesis:

Shaikh, M. A., Prendinger, H., and Ishizuka, M. 2008. Sentiment assessment of text by analyzing linguistic features and contextual valence assignment. Appl. Artif. Intell. 22, 6 (Jul. 2008), 558-601. DOI= http://dx.doi.org/10.1080/08839510802226801

Wu, C., Chuang, Z., and Lin, Y. 2006. Emotion recognition from text using semantic labels and separable mixture models. ACM Transactions on Asian Language Information Processing (TALIP) 5, 2 (Jun. 2006), 165-183. DOI= http://doi.acm.org/10.1145/1165255.1165259

Gill, A. J., French, R. M., Gergle, D., and Oberlander, J. 2008. The language of emotion in short blog texts. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work (San Diego, CA, USA, November 08 - 12, 2008). CSCW '08. ACM, New York, NY, 299-302. DOI= http://doi.acm.org/10.1145/1460563.1460612

Strapparava, C. and Mihalcea, R. 2008. Learning to identify emotions in text. In Proceedings of the 2008 ACM Symposium on Applied Computing (Fortaleza, Ceara, Brazil, March 16 - 20, 2008). SAC '08. ACM, New York, NY, 1556-1560. DOI= http://doi.acm.org/10.1145/1363686.1364052