PHP performance optimization
I have the following code, but it's slow:
<?php
class ngram
{
    const sample_directory = "samples/";
    const generated_directory = "languages/";
    const source_extension = ".txt";
    const generated_extension = ".lng";
    const n_gram_min_length = "1";
    const n_gram_max_length = "6";

    public function __construct()
    {
        mb_internal_encoding('utf-8');
        $this->generatengram();
    }

    // Collect the paths of all sample files with the source extension.
    private function getfilepath()
    {
        $filespath = array();
        $excludes = array('.', '..');
        $path = rtrim(self::sample_directory, DIRECTORY_SEPARATOR . '/');
        $files = scandir($path);
        $files = array_diff($files, $excludes);
        foreach ($files as $file) {
            if (is_dir($path . DIRECTORY_SEPARATOR . $file))
                // fetchdir() and $callback are not defined anywhere in this snippet
                fetchdir($path . DIRECTORY_SEPARATOR . $file, $callback);
            else if (!preg_match('/^.*\\' . self::source_extension . '$/', $file))
                continue;
            else
                $filespath[] = $path . DIRECTORY_SEPARATOR . $file;
        }
        unset($file);
        return $filespath;
    }

    protected function removeunicharcategories($string)
    {
        // Replace punctuation (' " # % & ! . : , ? ¿) with a space " ".
        // Example: 'you&me' becomes 'you me'.
        $string = preg_replace("/\p{Po}/u", " ", $string);
        //--------------------------------------------------
        // Drop everything that is not a letter or a space separator.
        $string = preg_replace("/[^\p{Ll}|\p{Lm}|\p{Lo}|\p{Lt}|\p{Lu}|\p{Zs}]/u", "", $string);
        $string = trim($string);
        $string = mb_strtolower($string, 'utf-8');
        return $string;
    }

    private function generatengram()
    {
        $files = $this->getfilepath();
        foreach ($files as $file) {
            $file_content = file_get_contents($file, FILE_TEXT);
            $file_content = $this->removeunicharcategories($file_content);
            $words = explode(" ", $file_content);
            $tokens = array();
            foreach ($words as $word) {
                $word = "_" . $word . "_";
                $length = mb_strlen($word, 'utf-8');
                // Every n-gram length from the minimum up to the maximum (or the word length).
                for ($i = self::n_gram_min_length, $min = min(self::n_gram_max_length, $length); $i <= $min; $i++) {
                    // Every start position for an n-gram of length $i.
                    for ($j = 0, $li = $length - $i; $j <= $li; $j++) {
                        $token = mb_substr($word, $j, $i, 'utf-8');
                        if (trim($token, "_")) {
                            $tokens[] = $token;
                        }
                    }
                }
            }
            unset($word);
            // Rank n-grams by frequency, most frequent first.
            $tokens = array_count_values($tokens);
            arsort($tokens);
            $ngrams = array_slice(array_keys($tokens), 0);
            file_put_contents(
                self::generated_directory . str_replace(self::source_extension, self::generated_extension, basename($file)),
                implode(PHP_EOL, $ngrams)
            );
        }
        unset($file);
    }
}

$ii = new ngram();
?>
How can I make it faster? Thanks.
A quick search for 'how to profile PHP' on Google led me to this Stack Overflow question: "Simplest way to profile a PHP script", which provides a brief answer to your question.
Not to mention that you may find useful information here: http://www.php.net/apd http://www.xdebug.org/docs/profiler
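If installing a profiler is not an option right away, a rough first pass is to time each stage by hand with microtime(true). The sketch below is only an illustration, not a replacement for a real profiler: the helper time_stage(), the $timings array, and the sample text are all hypothetical names made up for this example, and it assumes PHP 7 or later with the mbstring extension loaded.

```php
<?php
// Hypothetical helper (not part of the original script): runs a callable
// and adds the elapsed wall-clock time, in seconds, to $timings[$label].
function time_stage(string $label, callable $stage, array &$timings)
{
    $start = microtime(true);
    $result = $stage();
    $timings[$label] = ($timings[$label] ?? 0.0) + (microtime(true) - $start);
    return $result;
}

$timings = array();

// Made-up sample input standing in for the contents of one sample file.
$text = str_repeat("some sample text for timing ", 1000);

// Stage 1: split the text into words.
$words = time_stage('explode', function () use ($text) {
    return explode(" ", trim($text));
}, $timings);

// Stage 2: the nested n-gram loop, with lengths 1 to 6 as in the question.
$tokens = time_stage('mb_substr loop', function () use ($words) {
    $tokens = array();
    foreach ($words as $word) {
        $word = "_" . $word . "_";
        $length = mb_strlen($word, 'UTF-8');
        for ($i = 1, $max = min(6, $length); $i <= $max; $i++) {
            for ($j = 0; $j <= $length - $i; $j++) {
                $tokens[] = mb_substr($word, $j, $i, 'UTF-8');
            }
        }
    }
    return $tokens;
}, $timings);

// Print the stages, slowest first.
arsort($timings);
foreach ($timings as $label => $seconds) {
    printf("%-16s %.4f s\n", $label, $seconds);
}
```

Whichever stage dominates the output is the one worth optimizing first. The Xdebug profiler gives the same kind of breakdown per function without having to edit the script.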