The Beta Things: scala

Showing posts with label scala. Show all posts

2008/12/10

Merge Sort in Scala

Merge Sort is a kind of fast sorting algorithm with O(nlog(n)) order.

Merge sort itself is a recursive algorithm and can be expressed neatly even in an Imperative programming like Java. However pattern matching and high order functions in Scala can make this even interesting.


def mergeSort(list:List[Int]):List[Int] = list match {
 case x::Nil=> List(x)
 case _ =>
   val (a, b) = list.splitAt(list.length /2)
   merge(mergeSort(a), mergeSort(b))     
}

def merge(aList:List[Int], bList:List[Int]):List[Int]= bList match {       
 case Nil => aList
 case _ =>
   aList match {
   case Nil => bList
   case x::xs =>
     if (x < bList.head)
    x::merge(xs, bList)
     else
    bList.head::merge(aList, bList.tail)               
   }   
}

Insertion Sort in Scala

Insertion sort is a so much simple and straightforward sorting algorithm. The pseudocode looks like following:


for j ← 2 to length[A]
    do key ← A[j]
       i ← j - 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i - 1
       A[i + 1] ← key

But do it in Scala using FP is a quite interesting experience.

Firstly, the traditional way in Scala:


def insertSort(list:Array[Int]) = {     
 var j = 1
 while(j<list.length){   
   val key = list(j)
   var i = j-1
   while(i>=0 && list(i)>key){
     list(i+1) = list(i)
     i=i-1
   }
   list(i+1)=key
   println("In the middle " + list.mkString(","))
   j=j+1
 }
 list
}

I use while rather than for, because for means more than looping.

The pure functional way requires no side-effect, which also generally means immutable and no loop.

So the first function is insert(list:List[Int], t:Int]:List[Int], which inserts a value t into a sorted list list and returns the result.


def insert(list:List[Int], t:Int):List[Int]= list match {
   case Nil => List[Int](t)
   case x::r if x > t=> t::x::r
   case x::r => x::insert(r,t)      
 }

Pattern match makes this like a breeze.

Then the sorting function:


def insertSortFp(list:List[Int])={
    (list :\ List[Int]())((a,b)=>insert(b,a))
  }

I use foldRight here just because of my laziness.

2008/11/12

GWT RPC from non-servlet-container - finally

Finally, I get to this part after the first and second rounds.

GwtRpc

A Scala module(object) GwtRpc, is created for this RPC purpose, which defines three traits of interest:

GwtRpcSpp
RpcService
SppHolder

GwtRpcSpp is a simple trait extending GWT's SerializationPolicyProvider, which declares an abstract method getPolicyFileInputStream:


def getPolicyFileInputStream(moduleBaseUrl:String,
strongName:String):Option[InputStream]

moduleBaseUrl and stringName are two parameters carried by client request of GWT RPC. This method is supposed to be implemented to allow users to hook up their own logic about getting serialize policy file generated by GWT RPC.

RpcService is another simple trait which is supposed to be mixed into entities who is going to be the end-implementations of GWT RPC services.

trait RpcService { this:SppHolder with RemoteService =>

 def call(final in: => String)(handler: PartialFunction[CallResult, Unit])={
   ...
 }
 ...
}

Method call is the main entry of a RpcService implementation, which takes two arguments.

in - a Call-by-name string: the UTF-8 payload of an RPC request;
handler - a partial function which is provided by user to match the result of a call and write result back to response

SppHolder is a container of SerializationPolicyProvider. RpcService declares it as a part of its self type annotation which sort of declares a dependency to SppHolder, only the dependency needs to be fulfilled by its implementation.

Actually, this GwtRpc module is container-independent. It can be used in any container which provides IO stream as abstraction of HTTP request.

The Restlet implementation

Cause I (in this series) take more interesting in serving GWT RPC via Restlet, I would give a simple implementation for Restlet environment.

Generally, Restlet allows you to serve a HTTP request via a Restlet or a Resource. As a result, two traits implementations for each of them are provided respectively.

For Restlet:

trait RpcRestlet extends Restlet with RpcService with SppHolder{this:RemoteService=>
 override def handle(req:Request, resp:Response){
   req.getMethod match{
     case Method.POST=>
       call(fromStream(req.getEntity.getStream))(writeRpcResult(resp))
     case _ => // do nothing
   }
 }
}

For Resource:

trait RpcResource extends Resource with RpcService with SppHolder{this:RemoteService=>
 override def acceptRepresentation(rep: Representation){
   call(fromStream(rep.getStream))(writeRpcResult(getResponse)) 
 }
}

fromStream is a method to read payload from an InputStream. This payload is simple a UTF-8 string and will be decoded to an instance of RpcRequest by GWT's RPC.decodeRequest(String).

writeRpcResult is a functional variable, which takes a Response as argument and returns a PartialFunction:

val writeRpcResult: Response=>PartialFunction[CallStatus, Unit] = {r=>{
  case CallSuc(msg)=>
    val rep = new StringRepresentation(msg, MediaType.APPLICATION_JSON)
    rep.setCharacterSet(CharacterSet.UTF_8)
    rep.setDownloadable(true)
    rep.setDownloadName("attachment")
    r.setEntity(rep)
    r.setStatus(Status.SUCCESS_OK)
  case CallFail(msg)=>
    r.setEntity(new StringRepresentation(msg, MediaType.TEXT_PLAIN))
    r.setStatus(Status.SERVER_ERROR_INTERNAL)
}
}

To create an implementation of GWT RemoteService, just do almost same thing as in GWT but change the parent class to either one of above. For example:

abstract class PingServiceImpl extends RpcRestlet with PingService with SppHolder{
def ping(msg:String)={
  "Pong-> " + msg + "@" + new Date
}
}

PingServiceImpl is still abstract, because its dependency to SerializationPolicyProvider has not been fulfilled. To make thing simple, I reuse the App created in last installment as the base and make it a SerializationPolicyProvider for the whole Application scope.

The implementation looks like following code, using Directory gwtDir is a lazy way here.


class App extends Application with GwtApp with GwtRpcSpp{self=>
def getPolicyFileInputStream(moduleBaseUrl:String,
                            strongName:String):Option[InputStream]={
  val ref = new Reference(moduleBaseUrl)
  val response = gwtDir.get(ref.getLastSegment+"/"+getPolicyFileName(strongName))
  if(Status.SUCCESS_OK == response.getStatus)
    Some(response.getEntity.getStream)
  else
    None
}
}

Finally attach ping service to the router of this application and specify the SppHolder dependency:


override def createRoot={
//...
r.attach("/ping", new PingServiceImpl{
 def provider = self
})
//...
}

There is no need to change code of GWT client, just let ping service call endpoint http://localhost:8080/ping will done.

For complete source code of this sample, see http://edgebox.googlecode.com/svn/sandbox/gwt-restlet-demo/.

2008/10/31

GWT RPC from non-servlet-container (2)

In the second post, I would like to show-off the project structure of a simple GWT application served by pure Restlet (which means not running in a servlet container) Application as a sample and basis of further discussion.

The GWT Client

In order to deploy a GWT application in non-servlet container like Restlet, I can not package the GWT client in a war. I choose to package everything (class files, java sources, gwt compiled js and other static resources) into a single jar. It might be also friend to OSGi environment.

I use gwt-maven plugin as the build tool, which is quite handy in our case. For detail usage of gwt-maven plugin, please check the example. I made some modification in order to create this "special" jar.


<plugin>
 <groupId>com.totsp.gwt</groupId>
 <artifactId>maven-googlewebtoolkit2-plugin</artifactId>
 <version>2.0-beta24</version>
 <configuration>
   ...
   <output>${project.build.directory}/classes/_gwt</output>
 </configuration>
</plugin>

I changed the output directory of GWT resources, so that they can easily get packaged by maven jar plugin.


<plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-jar-plugin</artifactId>
 <executions>
   <execution>
     <phase>package</phase>
     <goals>
       <goal>jar</goal>
     </goals>
     <configuration>
       <classifier>gwt</classifier>
       <excludes>
         <exclude>_gwt/.gwt-tmp/**</exclude>
       </excludes>
     </configuration>
   </execution>
 </executions>
 <configuration>
   <excludes>
     <exclude>_gwt/**</exclude>
   </excludes>
 </configuration>
</plugin>

Two jars are created, one with classifier "gwt" includes all static resources.


<build>
<resources>
<resource>
<directory>./src/main/java</directory>
</resource>
<resource>
<directory>./src/main/resources</directory>
</resource>
</resources>
...
</build>

This is a small trick to package java sources into GWT jar.

The Restlet Application

(Sorry for using Scala as the example language, but hopefully you will see how Scala can ease the development)
At current moment, there is no standard way to deploy a web application using Restlet. Basically in Restlet, you expose dynamic part as a Resource, and serve static things using Directory (see example on Restlet site).

I create a GwtApp trait for serving GWT client an Application scope.


trait GwtApp extends Application{
 val gwtDir = new Directory(null, "clap://class/_gwt/")

 abstract override def createRoot():Restlet={
   gwtDir.setContext(getContext)
   super.createRoot match{
     case root:Router=>
       root.attach("/_gwt", gwtDir);root
     case r@_=>
       val root = new Router
       root.attachDefault(r)
       root.attach("/_gwt",gwtDir)
       root
   }
 }
}

The small "abstract override" trick creates a kind of AOP around aspect over createRoot method, see this. So I can just mixin this trait into my application without any change.


class App extends Application with GwtApp{

 override def createRoot():Restlet={
   val r = // ... get Root as a Router
   r.attach("/test", classOf[TestResource])
   //attach the main view
   r.attach("/main", new Restlet(getContext){
     override def handle(req:Request, resp:Response){
       val r = new InputRepresentation(
       getClass.getResourceAsStream("/host.html"),
         MediaType.TEXT_HTML)
       resp.setEntity(r)
     }                  
   })
   r
 }
}

Two endpoints are exposed by this application.

/main - return the host page of above GWT client, the src location of GWT bootstrap js file is /_gwt/gr.demo.App/gr.demo.App.nocache.js
/test - a very simple restful service accepting text/html request and serving back a text/plain representation, see below;


class TestResource(ctx:Context, req:Request, resp:Response)
   extends Resource(ctx,req,resp){

 getVariants.add(new Variant(MediaType.TEXT_HTML))
 override def represent(v:Variant)={
   new StringRepresentation("Helo@"+new Date, MediaType.TEXT_PLAIN);
 }
}

After you package all of above into a Component;


val com = new Component
Array(HTTP, FILE, CLAP).foreach(x=>com.getClients.add(x))
com.getServers.add(HTTP, 8080)
com.getDefaultHost.attachDefault(new App)

You get a client served by pure Restlet application.

So next post of this series will be the prototype of serving GWT RPC using Restlet, hopeful it will be cool.

2008/10/30

Scala 2.7.2-RC5

今天下午才升级到了Scala 2.7.2-RC4，没想到Scala team以迅雷不及掩耳之势，今天发布了2.7.2的RC5。根据Scala-lang主页上的说明，感觉这应该是2.7.2正式release之前的最后一个RC了。

We are taking particular care in polishing Scala 2.7.2, also considering that 2.7.2 will be the reference release for the "Programming in Scala" book, currently in print. This RC5 should really be our last release candidate for this release cycle, and should be shortly followed by the final release.

不过到现在，scala-tool的maven repository中，还没有rc4和rc5的版本。

2008/10/22

Scan package

很多时候，需要动态的检查某个package下是否包含特定的类，比如继承了某个接口，或是附加上了某个annotation。

这是个典型的听上去容易，做起来麻烦的任务。

如果package的classpath属于file://一类，可以直接使用Class.getResourceAsStream("your/package")打开一个text stream, 其中的每一行为package包含的每一个元素。


import scala.io.{Source}

for(
line <- Source.fromInputStream(aClass.getResourceAsStream("your/package")).getLines
name = line.trim
if match(name)
}yield{
Class.forName(name)
}

但是如果package位于某一个jar文件中则需要遍历jar file中对应的每一个jar entry。

庆幸的是，始终可以在open source社区中找到"先人"的贡献。

如果你使用Spring (v 2+)，一定会对Spring中新增的annotation wire印象深刻，其中包括一个自动在某个package下寻找符合条件的bean的功能。

这个package scanning的功能由org.springframework.context.annotation.ClassPathBeanDefinitionScanner这个类来完成。默认的设定是找出指定package下带有@Component的类。不过由于Spring良好的结构，很容易进行扩展。


val reg = new SimpleBeanDefinitionRegistry
val scanner = new ClassPathBeanDefinitionScanner(reg, false)
scanner.setIncludeAnnotationConfig(false)
scanner.addIncludeFilter(new AssignableTypeFilter(classOf[Object]))
scanner.scan("your.pkg")

Scanner依赖一个TypeFilter来选择感兴趣的类。在上面的代码中禁止了预先设置的filter，并且用一个AssignableTypeFilter来代替，因此选择所有的类。
一旦scan结束，找到的类定义(BeanDefinition)会被放入BeanDefinitionRegistry这样一个结构中。上面的代码中，使用了SimpleBeanDefinitionRegistry这样一个简单的实现。

2008/10/09

Scala implcit parameters on constructor

Scala's implicit conversion is a so much cool feature which allows you extends exiting libraries without changing source codes. It works like a wrapper but transparent to users.

Says I have a string representing a URL, by implicit conversion


implicit def toMyRichString(url:String) = new MyRichString(str){
   def getText={
    //...
   }
 }

to make a String richer (even richer than RichString). I can use


"http://www.scala-lang.org".getText

to retrieve the html text from Scala site.

The possibility is unlimited and the feature is almost free. It does have performance lost, but comparing to other dynamic language (such as Groovy), this is way much better.

Implicit parameter is even cooler but subtle. It kinds of allows you to provide a default value to your function parameter so you can invoke a function without specifying that parameter. The requirement is you must provide such a value (of same type) in the context of that invoking.

Implicit parameter can also be used on constructor. This gives me such an idea that this could be used in some scenario of dependency injection or auto-wiring. The first case I think out is about log. So here is an example:


class Log(val name:String){
 def log(msg:String){
   println("["+name+"] " + msg)
 }
}
implicit def anyRefToLogger(t:AnyRef) = new Log(t.getClass.getName)

class Test(val arg:String)(implicit log: AnyRef=>Log){
 log.log("Hello " + arg)
}

new Test("Implicit Cool")

Which may seems trivial but quite interesting to me.

2008/10/04

Lazy Fibonacci stream in Scala

I tried to create a lazy infinite Fibonacci stream using Scala, here are my two solutions.

The first one is a naive one using inner function:


def fib:Stream[Long]={
  def fibInner(f0:Long, f1:Long):Stream[Long]={
    Stream.cons(f0, fibInner(f1, f0+f1))
  }
  fibInner(1,2)
}

The second one is ported from Haskell:


def fib2:Stream[Long]={
  Stream.cons(1, Stream.cons(2, fib2.zip(fib2.tail).map(x=>x._1+x._2)))
}

Although both of them works but the second one is way slower than the first one.

2008/09/27

Scala syntax primer

Good Scala syntax reference on jim-mcbeath. Another worthy-reading is by A. Sundararajan and part 2.

2008/09/21

Scala smooths ORM and more

Two Scala features', object as module and implicit convert, can greatly ease your daily hibernate coding.

I do love the ORM module (GORM) provided Grails, which allow you to be focus on plain domain classes and these domain classes get enhanced later on by the container(Grails). So your plain domain classes (or POGOs) automatically have functionality of presist, such as save, delete, and queries.

Grails implements this by a lot of complicated Groovy MOP tricks(1,2). Basically, once such a protocol is defined, it is carried by domain classes along whole life cycle. Even your domain object is not currently working under persist layer, it still has save and delete method. This might not be a big deal in most cases. However, dynamic methods cost great much, it rings the bell about performance issue.

Implementing such feature in current Java programming work is also not good. Usually, such kind of framework-supported-enhancement involves lots of bytecode enhancement or instance proxying related reflection. Framework like ActiveObjects enhances your domain classes using the latter approach(correct me if I am wrong). As a result, you are not be able to create your domain instance using new keyword (your domain classes are enhanced hence changed). You have to create domain instance using a factory method provided by framework. It is faster, but break your POJOs in this or that ways. Your thin domain class will have this or that new injected methods or properties which might break even serialization.

So, how Scala implicit conversion do about this? My answer is implicit conversion and object as modules. Say I have a domain class Book which is implemented as following Scala code (you can do this in Java also).


import org.edgebox.orm.Entity
import scala.reflect.{BeanProperty=>BP}
class Book extends Entity{
@BP
var id=0L
@BP
var name:String = null
@BP
var price = 0
// other properties are omitted for simplicity
}

Later on when you use this domain class in your persist layer, you can simply do following:


import org.edgebox.orm.ORM
import org.edgebox.orm.Query._

class APersistService(val orm:ORM){
import orm._
def aPersistMethod = {
  val Book = classOf[Book]
  // doing query
  Book.all.filter(ilike("name", "scala")).order(asc("price")).fetch
  // doing get by id
  val book = Book.get(1)
  // update book ...
  book.save
  //...
}
}

The query syntax is stolen from Google AppEngine (or Django).

Quite simple, the trick is the import orm._ clause. You can treat instance orm as a module, the import clause introduces all elements of that orm instance into this PersistService. The orm instances has two interesting implicit convert functions:


type EntityType = T forSome{ type T <: Entity}   
implicit def entityWrapper(e:EntityType) = new RichEntity   
implicit def entityTypeWrapper(et:Class[_<:Entity])= new RichEntityType

So, when you tries to call book.save although your domain class knows nothing about persistence, the persist module (orm instance) and implicit conversion can nicely handle this for you, automatically and transparently by Scala compiler.

Benefits?

You keep your domain class clean, no one is gonna inject anything into your domain class or intercept any calls during your operations;
Very minor performance overhead, you gotta no performance lose when you are working with your domain class and very minor performance overhead when you are trying to using specific layer-related method;
Object as module also bring flexibility; instance of orm can be configured using various possibility and also Spring-enabled;
The most interesting point is (which is also my most favorite part), this approach introduces a more clean and flexible design onto multi-layered system. You can focus your domain class but when working on different layer you just need to import different function modules which automatically enhance your domain classes with layer-specific function. For example, our domain class might be able to do serialization by calling book.toJson or book.toRdf under layer handling Restful representation. And using which kind of serialization is totally relied on a simple import clause.

You can find above sample code under http://svn.assembla.com/svn/edgebox/edgebox/branch/edgebox/, under package org.edgebox.orm. At current moment, this toolkit is tightly based on Spring and Hibernate (as Grails does).

Updates

The code of RichEntityType and RichEntity are:


  class RichEntityType[T](val self:Class[T], val dao:Dao[T]) extends Proxy{
    def get(id:Serializable)={
      dao.get(id)
    }
  }
 
  class RichEntity[T<:Entity](val self:T, val dao:Dao[T]) extends Proxy{
    def save()={
      dao.save(self)
      self
    }
  }

2008/09/19

Scala Array and repeated parameters

I met an interesting Scala problem these days, which is about repeated parameter of a function. In Java you can specify varargs on a method,


public void a(String... args){
 ...
}

Inside that function, you get args as an Array of String.

In Scala, we have similar feature but called repeated parameter.


def a(args:String*)

The Scalabook (section 8.8) says you get that args as an instance of Array[String]. But Scala reference(pdf) has different explanation.

The type of such a repeated parameter inside the method is then
the sequence type scala.Seq[T ].

The Scala reference is right. Actually, the implementation gives you an instance of scala.runtime.BoxedObjectArray. You can treat it as an instance of Array[String] but you can not pass it as an instance of Array[String].

Say we have another method


def b(is:Array[String])

If you want to pass args instance you get in method a to method b, you get compile error:

error: type mismatch
found: String*
required: Array[String]

So here is my solution to this.

If you want to use args as an Array[String] with method a or pass it to Java, you need to unboxed it first.


args.asInstanceOf[BoxedObjectArray].unbox(classOf[String]).
    asInstanceOf[Array[String]]

If you want to pass args to another method, like b mentioned before, in Scala as paramter of type Array[String], you only need to get an array from that Seq


args.toArray[String]

Although you get another instance of BoxedAnyObject, but when passing parameter to method b, scala compiler can auto-unbox for you.

For interesting implementation detail, see David's excellent writing about Scala array.

2008/09/16

Scala, Object as Module

Package provides no way to abstract?

Package is not configurable
Package is not inheritable

Self type: (scalabook) self type is the assumed type for this.

If a formal parameter is given, it can be used as an alias for the reference this throughout the body of the template. If the formal parameter comes with a type T , this deﬁnition affects the self type S of the underlying class or object as follows: Let C be the type of the class or trait or object deﬁning the template. If a type T is given for the formal self parameter, S is the greatest lower bound of T and C. If no type T is given, S is just C. Inside the template, the type of this is assumed to be S.

举个简单的例子，对于一个trait A而言，无法预期会被哪一个class或是object继承(mixin)，因此在这个trait中，this默认为类型A，在这个trait中只能访问这个trait或是这个trait的父类型的元素。然而


trait A{
 self: B=>
}

引入了如下变化:

只有类型B或B的子类型class或object可以继承A(mixin)，否则编译错误；
trait A内部还可以使用类型B所定义的元素；

The Beta Things