Monday, February 27, 2012

Scala Source Code and Density

I love Scala. It lets me write code that is concise, dense, and clear all at the same time.

Take the piece of code below, for example.
Had I written it in Java, it would be much longer, less dense, and far less clear about what it is doing.

val site = config.sites.find(_.name == siteName).getOrElse(
    throw new IllegalArgumentException("Cannot find site " + siteName))
val group = config.groups.find(_.name == groupName).getOrElse(
    throw new IllegalArgumentException("Cannot find group " + groupName))
group.repositories.foreach { repo =>
  val repoDir = new File(group.path, repo.name)
  println("Backing up to repository %s at %s starting..." format (repo.name, repoDir))
  if (repoDir.mkdir())
    println("Created directory %s" format repoDir)
  repo.facets.foreach { facet =>
    println("Exporting facet %s..." format facet.name)
    val source = site.databases.find(_.name == facet.source).getOrElse(
        throw new IllegalArgumentException(
          "Facet %s references non-existing database %s" format (facet.name, facet.source)))
    println("Source database is %s:%s at %s" format (source.kind, source.name, source.url))
    (source.kind, facet) match {
      case ("neo4j", Facet(name, sourceName, "node", typeName, "json")) =>
        val outFileName = new File(name + ".json")
        println("Exporting %s nodes to %s" format (typeName, outFileName))
        val dbDir = new File(new URI(source.url))
        println("Neo4j database directory: %s" format dbDir)
        val graphDb = new EmbeddedReadOnlyGraphDatabase(dbDir.getPath)
        val (rowMeta, rows) = try {
          val exporter = new Neo4jNodeExporter(graphDb, typeName)
          println("Fetching meta...")
          val rowMeta = exporter.fetchMeta()
          println("Columns are: " + rowMeta.columns.mkString(", "))
          val rows = exporter.export(rowMeta)
          (rowMeta, rows)
        } finally {
          graphDb.shutdown()
        }
        
        println("Reading rows...")
        val mapper = new ObjectMapper
        mapper.getSerializationConfig.set(Feature.INDENT_OUTPUT, true)
        val jsonRows = rows.map { row =>
          val obj = mapper.createObjectNode()
          for ((value, i) <- row.values.view.zipWithIndex) {
            if (value != null)
              obj.put(rowMeta.columns(i), value)
          }
          obj
        }
        val jsonArray = mapper.createArrayNode().addAll(jsonRows.toList)
        val jsonData = mapper.createObjectNode()
        jsonData.put("data", jsonArray)
        val outFile = new File(repoDir, outFileName.toString)
        
        print("Writing %s..." format outFile)
        mapper.writeValue(outFile, jsonData)
        println(" [OK]")
        
      case x: Any => println("Skipping unrecognized facet: " + x)
    }
  }
  println("Backing up to repository %s finished." format repo.name)
}
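
Two idioms do most of the work in that listing: `find(...).getOrElse(throw ...)` for "look it up or fail loudly", and a single `match` on a tuple plus a case class. Here is a minimal standalone sketch of both. The `Database` and `Facet` definitions are my reconstruction from the names used above, and `FacetMatchSketch`, `lookup`, and `describe` are hypothetical names made up for this example, not part of the real tool.

```scala
// Hypothetical model, inferred from the field names used in the listing above.
case class Database(name: String, kind: String, url: String)
case class Facet(name: String, source: String, primitive: String,
                 typeName: String, format: String)

object FacetMatchSketch {
  // find(...) returns an Option; getOrElse(throw ...) turns a miss into an exception.
  def lookup(databases: List[Database], name: String): Database =
    databases.find(_.name == name).getOrElse(
      throw new IllegalArgumentException("Cannot find database " + name))

  // Match on the database kind and the facet's shape in one expression.
  // String literals in the pattern must match exactly; names bind the fields.
  def describe(source: Database, facet: Facet): String =
    (source.kind, facet) match {
      case ("neo4j", Facet(name, _, "node", typeName, "json")) =>
        "Exporting %s nodes to %s.json".format(typeName, name)
      case other =>
        "Skipping unrecognized facet: " + other
    }
}
```

Anything that is not a Neo4j node facet in JSON format falls through to the second case, which is the same behavior as the "Skipping unrecognized facet" branch in the listing.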

Scala is also fantastic for writing out ad-hoc object graphs and trees. Take a look at this:

  sites = List(
    Site(name = "dev", databases = List(
      Database(name = "graph", kind = "neo4j", url = "file:///together/project/SatukanCinta/satukancinta-neo4j-db_dev_1.6/"))),
    Site(name = "test",
      databases = List(Database(name = "graph", kind = "neo4j", url = "file:///together/project/SatukanCinta/dumptest/graph")))),
  groups = List(Group(name = "main", path = "/together/project/SatukanCinta/dump_main",
    repositories = List(
      Repository(name = "like", kind = "*", facets = List(
        Facet(name = "user", source = "graph", primitive = "node", typeName = "com.satukancinta.domain.User", format = "json"),
        Facet(name = "topic", source = "graph", primitive = "node", typeName = "com.satukancinta.domain.Interest", format = "json"),
        Facet(name = "like", source = "graph", primitive = "relationship", typeName = "LIKE", format = "graphml"))))))) {

Beat that, Java!
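
What makes a literal like that possible is a handful of case classes with named arguments. The definitions were not shown, so the following is only a guess at their shape: field names and order are inferred from the named arguments in the literal, while `Config`, `ConfigSketch`, the paths, and `com.example.User` are placeholders I made up for illustration.

```scala
// Hypothetical definitions; field names are inferred from the literal above.
case class Database(name: String, kind: String, url: String)
case class Site(name: String, databases: List[Database])
case class Facet(name: String, source: String, primitive: String,
                 typeName: String, format: String)
case class Repository(name: String, kind: String, facets: List[Facet])
case class Group(name: String, path: String, repositories: List[Repository])
case class Config(sites: List[Site], groups: List[Group])

object ConfigSketch {
  // A smaller tree than the original, built the same way (placeholder values).
  val config = Config(
    sites = List(
      Site(name = "dev", databases = List(
        Database(name = "graph", kind = "neo4j", url = "file:///tmp/example-db/")))),
    groups = List(
      Group(name = "main", path = "/tmp/example-dump", repositories = List(
        Repository(name = "like", kind = "*", facets = List(
          Facet(name = "user", source = "graph", primitive = "node",
                typeName = "com.example.User", format = "json")))))))

  // Walk the tree: every facet name in every repository of every group.
  val facetNames: List[String] =
    for {
      group <- config.groups
      repo  <- group.repositories
      facet <- repo.facets
    } yield facet.name
}
```

Six one-line declarations buy the constructors, equality, `toString`, and the pattern-matching support used earlier, which is why the tree can be written inline as plain data.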

Still don't believe me? How about processing a bunch of collections with a sprinkle of built-in parallelism:

    val indexHits = graphDb.getAllNodes.par.filter(_.getProperty("__type__") == typeName)
    log.info("Index for {} returned {} nodes", typeName, indexHits.size)
    val columnNames = indexHits.par.flatMap( node =>
      node.getPropertyKeys.filter( _ != "__type__" ) ).toSet
    val sortedColumns = columnNames.toList.sorted
    log.info("Columns for {}: {}", typeName, sortedColumns.mkString(", "))
    RowMeta(columns = sortedColumns)

I really can't imagine doing that (including the concurrency) in Java. Phew.
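
If you want to try the shape of that pipeline without a Neo4j database, here is a rough sketch on plain maps. It is deliberately sequential so that it runs on any Scala 2 version: `.par` ships with the standard library in Scala 2.9 through 2.12, but from 2.13 on it lives in the separate scala-parallel-collections module. `ColumnSketch`, `columnsFor`, and the `Node` alias are hypothetical stand-ins, not Neo4j API.

```scala
object ColumnSketch {
  // Stand-in for a Neo4j node: just a map of property name -> value.
  type Node = Map[String, Any]

  // Collect the sorted, de-duplicated property names of all nodes of one type.
  def columnsFor(nodes: Seq[Node], typeName: String): List[String] = {
    // get(...) returns an Option, so nodes without "__type__" are simply skipped.
    val hits = nodes.filter(_.get("__type__").contains(typeName))
    val columnNames = hits.flatMap(_.keys.filter(_ != "__type__")).toSet
    columnNames.toList.sorted
  }
}
```

Where `.par` is available, calling it on `nodes` before the `filter` should give the parallel variant with the rest of the pipeline unchanged.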

Verbose code == easy to read? Not always. This one is much easier on the eyes.

Tip: To learn more about Scala programming, I recommend Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition.

8 comments:

  1. nice post!
    although you could add some highlighting\formatting to these snippets, and make functions less lengthy, otherwise it still hurts my eyes, much like java :(

  3. The type system is complex and if you dive very deep into it, you might just spend a lot of time satisfying the compiler for little gain instead of writing useful code. I love to occasionally play with something like shapeless, but avoid doing it in serious code.

  7. > It enables me to write concise / dense and clear code at the same time.
    > ...
    > Verbose code == easy to read ? Not always. This one is much easier on the eyes.

    Looking at the chunks of dense code above, I'm speechless on the irony XD
