search - Best way to pre-process text messages using Hadoop


I am using Hadoop to process text messages (SMS). I am not sure of the best way to pre-process this data so that I can search it efficiently. For example, after preprocessing the data, if someone searches for 'ny', I should be able to display all the messages containing the word 'ny'. Is it advisable to write the pre-processed data to an XML file and not to a database?

Note: I have around 200k text messages in a .csv file.

The way to import preprocessed data into HDFS is to first import the data (a CSV file in this case) into a database and create a table view fine-tuned to your needs. Then import the data into HDFS using Sqoop. More information on Sqoop can be found here:
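The first step (CSV into a database, then a view shaped for search) could look roughly like the sketch below. It is only an illustration: SQLite is used as a stand-in so the example runs on its own, whereas a Sqoop import would normally read from a JDBC database such as MySQL. The column names (id, sender, body) and the names sms and sms_clean are assumptions, not something taken from the question.

```python
import csv
import io
import sqlite3


def load_messages(conn, csv_file):
    """Load rows from a CSV file object into a table named sms.

    Assumes (hypothetically) that the CSV has the columns id, sender, body.
    Returns the number of rows inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sms ("
        "id INTEGER PRIMARY KEY, sender TEXT, body TEXT)"
    )
    reader = csv.DictReader(csv_file)
    rows = [(int(r["id"]), r["sender"], r["body"]) for r in reader]
    conn.executemany(
        "INSERT INTO sms (id, sender, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def create_clean_view(conn):
    """Create a view that lower-cases and trims the text and drops empty messages."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS sms_clean AS "
        "SELECT id, sender, lower(trim(body)) AS body "
        "FROM sms "
        "WHERE body IS NOT NULL AND trim(body) <> ''"
    )


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real 200k-row CSV file.
    sample = io.StringIO(
        "id,sender,body\n"
        "1,alice,  Flying to NY tomorrow \n"
        "2,bob,\n"
        "3,carol,Lunch?\n"
    )
    conn = sqlite3.connect(":memory:")
    load_messages(conn, sample)
    create_clean_view(conn)
    for row in conn.execute("SELECT id, body FROM sms_clean ORDER BY id"):
        print(row)
```

Doing the normalization in a view keeps the raw messages untouched, and the view is what would then be pulled into HDFS.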

http://www.cloudera.com/blog/2009/06/introducing-sqoop/

For doing a Sqoop import from a database, take a look at:

http://archive.cloudera.com/cdh/3/sqoop/sqoopuserguide.html#_connecting_to_a_database_server
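As a rough illustration of what such an import might look like, the sketch below assembles a Sqoop 1 command line in Python without running it. The JDBC URL, user name, view name and HDFS directory are all placeholders, not values from the question. A single mapper is assumed because a view has no primary key for Sqoop to split on; with more mappers a --split-by column would be needed.

```python
import shlex


def build_sqoop_import(jdbc_url, table, username, target_dir, num_mappers=1):
    """Return the argument list for a basic `sqoop import` invocation.

    All values passed in are caller-supplied placeholders. -P makes Sqoop
    prompt for the password instead of taking it on the command line.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--username", username,
        "-P",
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


if __name__ == "__main__":
    cmd = build_sqoop_import(
        jdbc_url="jdbc:mysql://dbhost/smsdb",   # hypothetical database
        table="sms_clean",                      # hypothetical view name
        username="hadoop",                      # hypothetical user
        target_dir="/user/hadoop/sms_clean",    # hypothetical HDFS path
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed command can be reviewed and then run from a shell on a machine where Sqoop and the matching JDBC driver are installed.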

