We resolved several open questions from last week:
1. The problem of how to transfer data stored in an xls file into an arff file is completely solved.
To convert xls to arff, we save the xls data as a csv file, in which each message is stored with its fields separated by commas. A header is then added to the csv file. The header identifies each attribute (column) of the data and its type.
The date and time in the original data have to be merged into a single field in a fixed format so that Weka can recognize them.
2. We found that only a few tweets contain location data. Under the current circumstances, we cannot write the advanced algorithms and programs needed to collect and filter the messages. So none of the suggested topics that involve location are feasible from now on.
3. Some other topics were suggested this week. Movie classification is more practicable than the original topics. The original topics are very creative, but given the time available and our capabilities, they are not practical.
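The xls-to-arff conversion described in item 1 could look roughly like the following. This is only a minimal sketch, not the exact procedure we used: it assumes the csv file exported from xls has columns named date, time and text, and that dates look like 2012-03-05 and times like 14:30:00. The function names csv_to_arff and quote, the relation name tweets and the attribute names are all made up for illustration.

```python
import csv
import io
from datetime import datetime


def quote(value):
    """Quote a value for the ARFF data section, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")   # escape backslashes first
             .replace("'", "\\'")     # then single quotes
             .replace("\n", "\\n")    # keep each message on one line
    )
    return "'" + escaped + "'"


def csv_to_arff(csv_text, relation="tweets",
                date_format="%Y-%m-%d", time_format="%H:%M:%S"):
    """Convert csv text with date, time and text columns (assumed names) to ARFF."""
    rows = csv.DictReader(io.StringIO(csv_text))
    lines = [
        "@relation " + relation,
        "",
        # Date and time are merged into one date attribute with a fixed format.
        '@attribute timestamp date "yyyy-MM-dd HH:mm:ss"',
        "@attribute text string",
        "",
        "@data",
    ]
    for row in rows:
        merged = datetime.strptime(
            row["date"] + " " + row["time"],
            date_format + " " + time_format,
        )
        stamp = merged.strftime("%Y-%m-%d %H:%M:%S")
        lines.append(quote(stamp) + "," + quote(row["text"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = 'date,time,text\n2012-03-05,14:30:00,"Great movie, loved it"\n'
    print(csv_to_arff(sample))
```

If the exported dates or times use a different layout, the date_format and time_format arguments would need to be changed to match. Messages that contain commas stay intact because the csv module handles the quoting on input and every value is quoted again on output.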
These are the problems that we have to solve next week:
1. We have to learn how to use Weka. Weka is a very powerful data mining tool, but its operation and algorithms are complicated. The right filters and classifiers have to be chosen to ensure accurate results.
2. We have to learn to combine online services, Weka results and human analysis to arrive at the right answer.