January 2009 Archives

(Web-service) Japanese Analyzer update

I added a new lightweight pop-up dictionary to the Japanese Analyzer.
In the new version, two links are displayed for each word. 'Tangorin' opens in a new window as before, while 'EDICT' opens in a small pop-up.
EDICT may be less accurate and less informative than Tangorin, but I think it's more convenient for learners who want to get through a sentence quickly.

Japanese Analyzer

(Processing) tree, tripod

I posted some videos to Vimeo for the first time. They are just study pieces, nothing to show off.

This one is basically a fractal made of tripod-like shapes. I added a flickering effect that reminds me of old films.



tree, tripod from kynd on Vimeo.



(Web-service) smlfeeds - find similar feeds

What you write is what you are interested in. So how about a service that automatically searches for other people's blog posts on the same topics as yours?

My idea is to:
  1. Pick out keywords from the blog's feed using Yahoo's term extraction service.
  2. Then search for those keywords on Technorati and return the list of matching pages as a new RSS feed.
The result? Not bad. Some articles are indeed to my taste, though some are totally irrelevant, and I feel there's plenty of room for improvement.
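The two steps above can be sketched roughly as follows in Python. This is not the code behind smlfeeds: a simple word-frequency count stands in for Yahoo's term extraction service, and the search URL (search.example.com) is a hypothetical placeholder rather than Technorati's real API.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlencode

# Tiny stop-word list for the illustration; a real service would need far more.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for", "on", "with"}

def feed_text(rss_xml):
    """Collect the title and description text of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    parts = []
    for item in root.iter("item"):
        for tag in ("title", "description"):
            node = item.find(tag)
            if node is not None and node.text:
                parts.append(node.text)
    return " ".join(parts)

def top_keywords(text, count=3):
    """Stand-in for a term extraction service: the most frequent non-stop words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    return [word for word, _ in Counter(words).most_common(count)]

def search_url(keywords):
    """Build a query URL for a hypothetical blog search engine."""
    return "https://search.example.com/?" + urlencode({"q": " ".join(keywords)})

SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Fractal trees in Processing</title>
<description>Drawing a fractal with Processing and recursion.</description></item>
<item><title>More fractal sketches</title>
<description>Another Processing fractal study.</description></item>
</channel></rss>"""

keywords = top_keywords(feed_text(SAMPLE_FEED), count=2)
print(keywords)
print(search_url(keywords))
```

A frequency count favors words that are merely common, which may be one reason some results come back irrelevant; a real term extractor weighs words against a much larger sample.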

Give it a try if you like. Enter the URL of your blog's feed, a del.icio.us bookmark feed, or anything else (RSS 1.0/2.0 and Atom 1. are supported), then subscribe to the RSS feed generated from it, which is updated every day.

smlfeeds, looks up blog posts that might be similar to yours.

(Web-service) A note about keyword extraction

I've been interested in analyzing and processing text ever since I tested MECAPI, a Japanese text analyzer.

The term extraction service that Yahoo! provides is a handy tool for picking out keywords, or words that seem to characterize a given text (I've uploaded a sample to test this API: http://www.kynd.info/library/termextraction/).

According to Tatsuo-no change log, the formula below can be used to compute an index of how characteristic a word (i) is of a text (j). To calculate the index, you need a large number of sample documents (for example, all the documents registered in Yahoo!'s database) and the number of search hits for the word.


  1. tf_{i,j} is the number of occurrences of i in j
  2. df_i is the number of documents containing i
  3. N is the total number of documents
In short, this means a word is 'characteristic' if it appears many times in the text but is rare in the sample as a whole. Tatsuo-no change log also provides sample code that tests this formula using Yahoo! search results.
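As a rough illustration, and assuming the formula meant here is the standard TF-IDF weight, tf_{i,j} × log(N / df_i), the index can be computed like this in Python. The tiny in-memory corpus is made up for the example and stands in for Yahoo! hit counts; this is not the sample code from Tatsuo-no change log.

```python
import math

def tf_idf(word, text_words, documents):
    """Weight of `word` in `text_words`, using `documents` as the sample corpus.

    tf = occurrences of the word in the text
    df = number of sample documents containing the word
    N  = total number of sample documents
    """
    tf = text_words.count(word)
    df = sum(1 for doc in documents if word in doc)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(documents) / df)

# Made-up sample corpus: each document is a list of words.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the fractal tree grows".split(),
    "the weather is fine".split(),
]
text = corpus[2]
scores = {w: tf_idf(w, text, corpus) for w in set(text)}
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

Here 'the' appears in every sample document, so log(N / df) is zero and its weight vanishes, while 'fractal' appears in only one and scores highest.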

(misc) RSS and Atom specs

RSS 2.0 at Harvard Law

RSS Specifications
"This site is a comprehensive rss reference detailing you need to know about RSS."
http://www.atomenabled.org/
Atom is a simple way to read and write information on the web, allowing you to easily keep track of more sites in less time, and to seamlessly share your words and ideas by publishing to the web.
/*note*/

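// Fetch a feed through an HTTP proxy (replace the host and port with your own).
$proxy_opts = array('http' => array('proxy' => 'tcp://proxy.name.com:8080'));
$proxy_context = stream_context_create($proxy_opts);

// $rss holds the URL of the feed to fetch.
$file = file_get_contents($rss, false, $proxy_context);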


PHP: stream_context_create - Manual

(PHP) header() function

/*note*/

text: xml/html/css/javascript/plain
header("Content-Type: text/xml");
image: jpeg/gif/png
header("Content-Type: image/jpeg");

Status codes
header("HTTP/1.0 301 Moved Permanently");
header("HTTP/1.0 401 Unauthorized");
header("HTTP/1.0 403 Forbidden");
header("HTTP/1.0 404 Not Found");
header("HTTP/1.0 500 Internal Server Error");

Redirect
header("Location: http://fkob.net");

No cache
header( 'Expires: Mon, 26 Jul 1997 05:00:00 GMT' );
header( 'Last-Modified: ' . gmdate( 'D, d M Y H:i:s' ) . ' GMT' );
header( 'Cache-Control: no-store, no-cache, must-revalidate' );
header( 'Cache-Control: post-check=0, pre-check=0', false );
header( 'Pragma: no-cache' );

I was testing the Japanese Analyzer I introduced in the last post and found that it tends to classify unknown words as interjections. According to the original documentation of MeCab, the analyzer engine that MECAPI is built on, 'MeCab guesses the part-of-speech when the word is not registered in the dictionary'. So if the sentence to be analyzed contains an unknown word or a typo, the API may return inaccurate information. I think it would be better to simply report the word as unknown when it's not in the dictionary, but there seems to be no way to change the setting. It's a bit of a shame.

Japanese Analyzer - kynd.info

(Web-service) Japanese Analyzer powered by MECAPI

MECAPI is a convenient web API that analyzes Japanese sentences, breaks them down into words, and returns the result with additional information such as the part of speech of each word.
This is very useful when building an application such as a search or recommendation engine, because the words in a Japanese sentence are not separated by spaces as they are in European languages.
For example, 'anatanoie' means 'your house', or literally house (ie) of (no) you (anata), but the only way to tell where the words are separated is to know each of the words.
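To illustrate why a dictionary is needed, here is a toy longest-match segmenter in Python. It is not how MeCab or MECAPI work internally; the three-word dictionary and the romanized input are assumptions made up for the example.

```python
# Toy dictionary: romanized word -> (meaning, part of speech). Hypothetical data.
DICTIONARY = {
    "anata": ("you", "pronoun"),
    "no": ("of", "particle"),
    "ie": ("house", "noun"),
}

def segment(sentence, dictionary):
    """Split an unspaced string by greedily taking the longest dictionary word.

    Characters that start no known word are returned one at a time,
    tagged as unknown (None).
    """
    longest = max(len(word) for word in dictionary)
    result = []
    pos = 0
    while pos < len(sentence):
        for length in range(min(longest, len(sentence) - pos), 0, -1):
            candidate = sentence[pos:pos + length]
            if candidate in dictionary:
                result.append((candidate, dictionary[candidate]))
                pos += length
                break
        else:
            # No dictionary word starts here: emit one unknown character.
            result.append((sentence[pos], None))
            pos += 1
    return result

for word, info in segment("anatanoie", DICTIONARY):
    print(word, info)
```

Without the dictionary there is nothing in the string itself that marks the boundaries between 'anata', 'no', and 'ie'.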

As a brief test, I implemented a Japanese sentence analyzer in a straightforward way.
Type a sentence in the text box and submit it; the pronunciation, part of speech, inflection, and base form of each word are then displayed in a table. Clicking a base form opens an online dictionary (tangorin.com) in a new window.

Japanese Analyzer - kynd.info


(AS3) Tumblrview updated

I've updated Tumblrview, a project that displays feeds from Tumblr in a Flash-driven interface.

  • The size of the thumbnails can be changed with the slider at the bottom left, so a user can choose between browsing dozens of posts at once and looking at them one by one.
  • The user id can be specified in the query string, for example http://www.kynd.info/tumblrview/?id=kyndnote. This means a user can bookmark or send a link to a page that automatically connects to his or her tumblelog.
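
Tumblrview itself is written in ActionScript; purely as an illustration of the idea, reading the `id` parameter from such a URL could look like this in Python. The fallback value "default" is hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def user_id_from_url(url, fallback="default"):
    """Return the value of the `id` query parameter, or `fallback` if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    return values[0] if values else fallback

print(user_id_from_url("http://www.kynd.info/tumblrview/?id=kyndnote"))
print(user_id_from_url("http://www.kynd.info/tumblrview/"))
```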

Tumblrview - kynd.info


(Processing) Synchronizing video image with sound

One of my complaints about Flash is that it still can't capture the waveform of sound from sources like the microphone or line-in.

With Processing, it's like taking candy from a baby.
Here is a sample that gets audio input from your PC and uses the sample data to create a visual effect on a video image.
I didn't upload the completed applet because it didn't seem to work on a web page (neither does the example in the Minim library's online documentation).
If run on a PC with a webcam and a microphone, this example shows the image from the camera distorted according to the waveform as you talk or sing.



import ddf.minim.*;
import processing.video.*;

Minim minim;
AudioInput in;
Capture cam;
 
void setup() {
  size(320, 240);
  cam = new Capture(this, width, height);
  minim = new Minim(this);
  minim.debugOn();
  in = minim.getLineIn(Minim.MONO, 512);
}
 
void draw() {
  drawImage();
}

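void drawImage() {
  float lev;  // vertical offset for the current column, taken from the audio sample
  int ty;     // source row after applying the offset
  PImage img = createImage(width, height, RGB);
  if (cam.available() == true) {
    cam.read();
    cam.loadPixels();

    for (int i = 0; i < width; i++) {
      // Map column i to an index in the 512-sample buffer and scale it to pixels.
      lev = in.left.get(int(float(i) / width * 511)) * 60;
      for (int j = 0; j < height; j++) {
        // Shift the column vertically, wrapping around at the top and bottom.
        ty = int(j + lev);
        if (ty < 0) { ty += height; }
        if (ty >= height) { ty -= height; }
        img.pixels[i + j * width] = cam.pixels[i + ty * width];
      }
    }
    img.updatePixels();
    image(img, 0, 0);
  }
}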

 
void stop() {
  // Release the audio input and Minim before the sketch exits.
  in.close();
  minim.stop();
  super.stop();
}
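
The core of the effect, shifting each pixel column vertically by an amount taken from the audio buffer and wrapping around at the edges, can be tried without a camera or microphone. Below is a small Python sketch on a made-up 3×4 "image"; nested lists stand in for the pixel array, and the offsets are hypothetical sample values already scaled to pixels.

```python
def displace_columns(image, offsets):
    """Shift each column of `image` vertically by its offset, wrapping around.

    `image` is a list of rows; `offsets[x]` is the integer shift for column x.
    Output pixel (x, y) is read from source row (y + offset) modulo the height.
    """
    height = len(image)
    width = len(image[0])
    result = [[None] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            source_y = (y + offsets[x]) % height
            result[y][x] = image[source_y][x]
    return result

image = [
    ["a0", "b0", "c0"],
    ["a1", "b1", "c1"],
    ["a2", "b2", "c2"],
    ["a3", "b3", "c3"],
]
for row in displace_columns(image, [0, 1, -1]):
    print(row)
```

Python's `%` operator already returns a non-negative result for negative offsets, so the explicit add and subtract used for wrapping in the Processing version is not needed here.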

(Processing) Playing MP3 with Minim

Playing MP3 using Minim. Clicking on the stage plays/stops the sound.