Chapter 3. Poorsmatic

3.1. Get the source

3.2. Poorsmatic application overview

3.2.1. Web
3.2.2. Database
3.2.3. Configuration

3.3. Launch the application

3.4. Basic TorqueBox features

3.4.1. Deployment descriptor
3.4.2. Service
3.4.3. Messaging
3.4.4. Transactions

3.5. Wrapping up

This chapter shows you how to build a simple web application based on Sinatra. Although the application itself is very small, it uses many TorqueBox features.

We will create an application called Poorsmatic, a "poor man's Prismatic". This is a truly awful content discovery service that merely returns URLs found in tweets, provided the linked page contains at least one occurrence of the search term used to find those tweets in the first place.

If this doesn't make sense yet, don't worry: everything will become clear later.

3.1. Get the source

The application source is available in the TorqueBox git repository, in the examples/poorsmatic/ directory. To get started, clone the repository.

$ git clone git://github.com/torquebox/torquebox.git
Cloning into 'torquebox'...
remote: Counting objects: 77466, done.
remote: Compressing objects: 100% (26065/26065), done.
remote: Total 77466 (delta 37601), reused 76739 (delta 36971)
Receiving objects: 100% (77466/77466), 47.06 MiB | 2.48 MiB/s, done.
Resolving deltas: 100% (37601/37601), done

File location

Unless stated otherwise, every file location used in this guide is relative to the examples/poorsmatic/ directory under the checkout directory.

3.2. Poorsmatic application overview

The main goal of the application is to grab tweets from the Twitter stream. We'll use a TwitterService implemented as a TorqueBox service. The service is told which keywords we're interested in via the terms topic. Received tweets are scanned for URLs, and those URLs are put into the urls queue for processing. The UrlScrapper is responsible for visiting every URL from the urls queue and counting the words found in the <body> tag of the page. Results are stored in the database. There is also a web interface for viewing the URLs and editing the terms.

3.2.1. Web

We use the Sinatra web framework for this application. Sinatra is a very simple DSL for writing web applications quickly, which makes it a perfect choice in our case.

If you're new to Sinatra, the project homepage is a great place to start. It has a lot of examples and pretty good documentation.

3.2.2. Database

We chose PostgreSQL mainly because it supports transactions, which our application uses.

3.2.3. Configuration

3.2.3.1. Database preparation

First, create a new user and database for the application.

$ su postgres -c psql
psql (9.2.2)
Type "help" for help.

postgres=# CREATE USER poorsmatic WITH PASSWORD 'poorsmatic';
CREATE ROLE
postgres=# CREATE DATABASE poorsmatic;
CREATE DATABASE
postgres=# GRANT ALL PRIVILEGES ON DATABASE poorsmatic to poorsmatic;
GRANT
postgres=# \q

3.2.3.2. Database authentication

To connect to the database we need to use the md5 authentication method instead of ident. To change it, open the /var/lib/pgsql/data/pg_hba.conf file (this is its location on Fedora; it may differ on other operating systems) and make sure the md5 method is used.

3.2.3.3. Enable transactions in database

To be able to use TorqueBox transactions, set max_prepared_transactions in /var/lib/pgsql/data/postgresql.conf to a value greater than 0. In our case 10 should be sufficient. Restart PostgreSQL afterwards so that the change takes effect.

3.2.3.4. ORM

Our object-relational mapper of choice is DataMapper. For the configuration details, look at torquebox_init.rb. Models are defined in the models/ directory.

Although this quick overview of the app does not discuss database access and the ORM, you can always see the full code in the repository.

3.3. Launch the application

It's time to launch the application for the first time. First we need to start TorqueBox by executing the torquebox run command. The output should be similar to this:

$ torquebox run
...
14:34:42,734 INFO  [org.torquebox.jobs.as] Initializing TorqueBox Jobs Subsystem
14:34:42,735 INFO  [org.torquebox.services.as] Initializing TorqueBox Services Subsystem
14:34:42,738 INFO  [org.torquebox.core.as] Initializing TorqueBox Core Subsystem
14:34:42,740 INFO  [org.torquebox.security.as] Initializing TorqueBox Auth Subsystem
14:34:42,740 INFO  [org.torquebox.web.as] Initializing TorqueBox Web Subsystem
14:34:42,733 INFO  [org.torquebox.cdi.as] Initializing TorqueBox CDI Subsystem
14:34:42,792 INFO  [org.torquebox.stomp.as] Initializing TorqueBox STOMP Subsystem
14:34:42,791 INFO  [org.projectodd.polyglot.stomp.as] Initializing Polyglot STOMP Subsystem
14:34:42,807 INFO  [org.projectodd.polyglot.hasingleton.as] Initializing HA-Singleton Subsystem
14:34:42,808 INFO  [org.projectodd.polyglot.cache.as] Initializing Polyglot Cache Subsystem
14:34:42,854 INFO  [org.torquebox.core.as] Welcome to TorqueBox AS - http://torquebox.org/
14:34:42,855 INFO  [org.torquebox.core.as]   version........... 2.2.0
14:34:42,856 INFO  [org.torquebox.core.as]   build............. 74
14:34:42,857 INFO  [org.torquebox.core.as]   revision.......... 530d7d30a5ba5ca953eba21b2aa6df1bf4022649
...
14:35:02,072 INFO  [org.jboss.as] (Controller Boot Thread) JBAS015961: Http management interface listening on http://127.0.0.1:9990/management
14:35:02,073 INFO  [org.jboss.as] (Controller Boot Thread) JBAS015951: Admin console listening on http://127.0.0.1:9990
14:35:02,073 INFO  [org.jboss.as] (Controller Boot Thread) JBAS015874: JBoss AS 7.1.x.incremental.129 "Arges" started in 21875ms - Started 281 of 400 services (118 services are passive or on-demand)

Now we're ready to deploy the application. To do so, go to the examples/poorsmatic/ directory and execute the torquebox deploy command:

$ torquebox deploy
Deployed: poorsmatic-knob.yml
    into: /home/goldmann/work/torquebox-2.2.0/jboss/standalone/deployments

You can now reach your application at http://localhost:8080/poorsmatic/. If you see the "Hello from Poorsmatic!" message, everything worked perfectly!

Congratulations! You now have a running web application. In the next steps we will discuss the TorqueBox features we used to build it.

3.4. Basic TorqueBox features

In this section we'll discuss the basic TorqueBox features which made it possible to build the application.

3.4.1. Deployment descriptor

A deployment descriptor is a file that tells TorqueBox what kind of application it is deploying and which features the application uses. There are two types of deployment descriptors: internal and external. You can write either type as a pure Ruby DSL or specify the options in YAML format. In our case we'll use the Ruby DSL syntax. Please refer to the TorqueBox manual if you want to read more about deployment descriptors.

TorqueBox does a pretty good job of guessing what kind of application we are trying to deploy. If it's a Rack-based application, it is registered by default at the root context (/). We can change this (and many other things) by using a deployment descriptor.

Example 3.1. torquebox.rb

TorqueBox.configure do
  web do
    ...
    context "/poorsmatic"
    ...
  end
end

As you can see, we chose the /poorsmatic context, which is why the application is reachable under /poorsmatic rather than at the root.

This simple setting lets you deploy many applications on one TorqueBox server in different context roots.

3.4.2. Service

In our application we use Twitter to receive tweets from the stream filtered by some keywords. The best way to run something constantly is to implement it as a TorqueBox service. Below you can find a simple skeleton.

Example 3.2. TorqueBox service skeleton

class AService
  def initialize(credentials = {})
  end

  def start
  end

  def stop
  end
end

The code is pretty self-explanatory. The only thing you need to keep in mind is that the start method is executed when you deploy the service. Similarly, the stop method is executed when you undeploy the service. As simple as that.
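
To make the lifecycle more concrete, here is a minimal plain-Ruby sketch of a service that does its work on a background thread between start and stop. It does not use TorqueBox at all: TickService and its 'interval' option are hypothetical names invented for this illustration, and running the loop in a thread so that start returns quickly is a common pattern we assume here rather than something taken from Poorsmatic.

```ruby
# Hypothetical service, not part of Poorsmatic: counts ticks on a
# background thread between start and stop.
class TickService
  attr_reader :ticks

  def initialize(options = {})
    # 'interval' is an assumed option name used only in this sketch.
    @interval = options['interval'] || 0.01
    @ticks = 0
    @running = false
  end

  # Would be called on deployment: returns immediately because the
  # loop runs in its own thread.
  def start
    @running = true
    @thread = Thread.new do
      while @running
        @ticks += 1
        sleep @interval
      end
    end
  end

  # Would be called on undeployment: signal the loop to finish and wait.
  def stop
    @running = false
    @thread.join if @thread
  end

  def running?
    @running
  end
end
```

Calling start and later stop by hand imitates what happens on deploy and undeploy; the options hash plays the same role as the credentials hash in the skeleton above.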

In our application we'll use the twitter4j4r Twitter client. Please don't ask about the name...

Example 3.3. twitter_service.rb

require 'twitter4j4r'

class TwitterService
  ...
  def initialize(credentials = {})

    @terms = []

    @client = Twitter4j4r::Client.new(
        :consumer_key     => credentials['consumer_key'],
        :consumer_secret  => credentials['consumer_secret'],
        :access_token     => credentials['access_token'],
        :access_secret    => credentials['access_secret']
    )

    @client.on_exception do |exception|
      puts "An error occurred while reading the stream: #{exception.message}"
    end
  end

  def start
    @client.track(*@terms) do |status, client|
    end
  end

  def stop
    @client.stop
  end
end

Now we need to tell TorqueBox that we want to deploy a service implemented in the TwitterService class. We do this in the torquebox.rb deployment descriptor mentioned earlier. You may also ask how we inject the credentials required to connect to Twitter. The answer is the same: in the deployment descriptor.

Example 3.4. torquebox.rb

TorqueBox.configure do
  ...
  service TwitterService do
    name 'twitter-service'
    config do
      consumer_key 'Consumer key'
      consumer_secret 'Consumer secret'
      access_token 'Access token'
      access_secret 'Access token secret'
    end
  end
end

Before you deploy the application, make sure you use correct credentials. You can generate them on the Twitter apps page. Just create a new application and you're ready to rock.

You may wonder how we update the list of keywords we want to watch on Twitter. We'll use the messaging features of TorqueBox.

3.4.3. Messaging

Messaging allows us to create loosely coupled applications. Using queues and topics, as well as message producers and consumers, is very easy, so let's start right away! If you're new to messaging terms, don't fear: take a look at the messaging section of the TorqueBox manual.

3.4.3.1. Queue and topic deployment

We need to create one queue and one topic. We'll use the topic to send and receive the keywords we want to watch on Twitter, and the queue to send URLs from tweets containing one or more of the specified keywords. Creating them is very simple: we just need to add the queue and topic constructs to the deployment descriptor.

Example 3.5. torquebox.rb

TorqueBox.configure do
  ...
  queue '/queues/urls'
  topic '/topics/terms'
end

That's everything required to deploy a queue and a topic with your application. Both are started when you deploy the application and stopped when you undeploy it. Handy feature, isn't it?

3.4.3.2. Message consumers

It's pretty easy to consume a message from a queue or topic. You can use message processors.

Example 3.6. term_consumer.rb

class TermConsumer < TorqueBox::Messaging::MessageProcessor
  ...
  def on_message(message)
    # do stuff here
  end
end

A new message processor is created for each message that arrives at the queue or topic. The message itself is passed to the on_message method, and you can do whatever you want with it afterwards. In our case we want to update the keywords in the Twitter service. The easiest way is to get access to the service and call an update method on it. Let's do it!

Example 3.7. term_consumer.rb

class TermConsumer < TorqueBox::Messaging::MessageProcessor
  include TorqueBox::Injectors

  def initialize
    @twitter_service = fetch('service:twitter-service')
  end

  def on_message(terms)
    @twitter_service.update(terms)
  end
end

Each time a TermConsumer message processor is created, TorqueBox injects the TwitterService service into the consumer. Afterwards the terms are passed to the on_message method, which calls the update method on the service. The only thing left now is to show the update method in the TwitterService class.

Example 3.8. twitter_service.rb

class TwitterService
  def initialize(credentials = {})
    @terms = []
    ...
  end

  def update(terms)
    @terms = terms

    stop
    start
  end
  ...
end

Execution of this method will update the terms list. Additionally the Twitter client will be restarted to watch new keywords.
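
The order of operations in update can be checked without Twitter by swapping in a fake client. The following is only a sketch: RecordingClient and SketchService are hypothetical stand-ins written for this illustration, with the same start/stop/update shape as the service above but none of its real behaviour.

```ruby
# Fake streaming client that records calls instead of talking to Twitter.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def track(*terms)
    @calls << [:track, terms]
  end

  def stop
    @calls << [:stop]
  end
end

# Trimmed-down service with the same start/stop/update shape.
class SketchService
  def initialize(client)
    @client = client
    @terms = []
  end

  def start
    @client.track(*@terms)
  end

  def stop
    @client.stop
  end

  # Store the new terms, then restart the client so it tracks them.
  def update(terms)
    @terms = terms
    stop
    start
  end
end

client = RecordingClient.new
service = SketchService.new(client)
service.start
service.update(%w[ruby jboss])
```

After this runs, client.calls holds the initial track with no terms, then a stop, then a second track with the new terms, which is the restart described above.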

The last thing left to do is to wire the message consumers to the queue and the topic. Once again we use the deployment descriptor. Here is how to do it:

Example 3.9. torquebox.rb

TorqueBox.configure do
  ...
  queue '/queues/urls' do
    processor UrlScrapper do
      concurrency 4
    end
  end

  topic '/topics/terms' do
    processor TermConsumer
  end
  ...
end

You may ask yourself what the concurrency parameter means. It tells TorqueBox how many message processors of the selected type should be connected to the selected queue. This is very handy if you expect many messages to arrive and need more workers to process them.

You can find another message processor, used to retrieve and parse web pages, in the url_scrapper.rb file. Since UrlScrapper doesn't do anything fancy besides counting the words in <body> and saving the result in a database, we won't discuss it this time, sorry.
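
As a rough mental model of concurrency 4, picture four workers pulling from the same queue. The sketch below imitates that with Ruby's built-in Queue and plain threads. It is an analogy only: FakeProcessor and run_workers are hypothetical names, and nothing here reflects how TorqueBox schedules processors internally.

```ruby
require 'thread'

# Stand-in for a message processor: the method name mirrors the chapter,
# but this is plain Ruby with no TorqueBox involved.
class FakeProcessor
  def on_message(url)
    "processed #{url}"
  end
end

# Start `concurrency` workers that all consume from the same queue.
def run_workers(queue, concurrency, results)
  lock = Mutex.new
  Array.new(concurrency) do
    Thread.new do
      processor = FakeProcessor.new
      # A nil message is this sketch's signal to shut the worker down.
      while (message = queue.pop)
        outcome = processor.on_message(message)
        lock.synchronize { results << outcome }
      end
    end
  end
end

queue = Queue.new
results = []
workers = run_workers(queue, 4, results)

10.times { |i| queue << "http://example.com/#{i}" }
4.times { queue << nil } # one shutdown signal per worker
workers.each(&:join)
```

Each message is handled by exactly one of the four workers, so all ten URLs are processed once, in no guaranteed order.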
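
For readers curious about the word-counting step, here is a small self-contained sketch. It is not the code from url_scrapper.rb: count_words_in_body is a hypothetical helper, it works on an HTML string that has already been downloaded, and it uses regular expressions where a real implementation would more likely use an HTML parser.

```ruby
# Hypothetical helper: count words inside the <body> element of an HTML string.
def count_words_in_body(html)
  # Take the text between <body ...> and </body>; use '' if there is no body.
  body = html[%r{<body[^>]*>(.*?)</body>}mi, 1] || ''
  # Replace remaining tags with spaces so neighbouring words stay separate.
  text = body.gsub(/<[^>]+>/, ' ')
  counts = Hash.new(0)
  # Treat runs of ASCII letters and digits as words, ignoring case.
  text.downcase.scan(/[a-z0-9]+/) { |word| counts[word] += 1 }
  counts
end

html = '<html><head><title>Ruby</title></head>' \
       '<body><h1>Ruby on TorqueBox</h1><p>ruby is fun</p></body></html>'
counts = count_words_in_body(html)
```

Here counts['ruby'] is 2, because the occurrence in <title> lies outside <body> and is ignored.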

3.4.3.3. Producing messages

You now know how to receive messages from topics (or queues), but the remaining question is how to put new messages into a queue. We'll look at that in this section.

The TwitterService besides retrieving tweets for particular keywords is also supposed to pull out every link from the received tweets. These links will be put into a queue for later processing.

We will use the twitter-text gem to retrieve the link.

Example 3.10. twitter_service.rb

require 'twitter-text'

class TwitterService
  include TorqueBox::Injectors
  ...
  def start
    @client.track(*@terms) do |status, client|
      urls = extract_urls(status.text)

      unless urls.empty?
        queue = fetch('/queues/urls')

        urls.each do |url|
          queue.publish(url)
        end
      end
    end
  end
  ...
end

For each received tweet we try to find URLs in it. If we find at least one, we send each URL as a string to the queue. To do this we first fetch the queue and then call the publish method on it. Easy.
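
The same extract-then-publish flow can be tried outside TorqueBox with stand-ins. In this sketch FakeQueue replaces the object returned by fetch('/queues/urls'), and simple_extract_urls is a deliberately naive, hypothetical replacement for the twitter-text extractor, so its results will differ from the real gem on trickier input.

```ruby
# Stand-in for the injected queue; it only records what was published.
class FakeQueue
  attr_reader :messages

  def initialize
    @messages = []
  end

  def publish(message)
    @messages << message
  end
end

# Naive replacement for twitter-text: grab anything that looks like an
# http or https link, up to the next whitespace.
def simple_extract_urls(text)
  text.scan(%r{https?://[^\s]+})
end

# Publish every URL found in the tweet text; returns how many were sent.
def handle_tweet(text, queue)
  urls = simple_extract_urls(text)
  urls.each { |url| queue.publish(url) }
  urls.size
end

queue = FakeQueue.new
handle_tweet('Reading http://torquebox.org/ and https://example.com/a now', queue)
handle_tweet('no links here', queue)
```

After both calls queue.messages contains the two URLs from the first tweet and nothing from the second.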

3.4.4. Transactions

The last TorqueBox feature used in this application is distributed transactions. A distributed transaction ensures that a unit of work is atomic. Transactions can span the whole application: for example, a transaction can start in the web layer and end in the database. This is exactly how we use them in Poorsmatic!

Example 3.11. poorsmatic.rb

require 'torquebox'
require 'sinatra'
require 'haml'
require 'models/term'

class Poorsmatic < Sinatra::Base
  include TorqueBox::Injectors
  ...
  helpers do
    def terms_changed
      terms = []

      Term.all.each {|t| terms << t.term}

      topic = fetch('/topics/terms')

      topic.publish(terms)
    end
  end
  ...
  post '/terms' do

    term = Term.new(:term => params[:term])

    TorqueBox.transaction do
      if term.save
        terms_changed
      else
        session[:errors] = []
        term.errors.each {|e| session[:errors] << e.first }
      end
    end

    redirect to('/terms')
  end

  delete '/term/:id' do
    TorqueBox.transaction do
      Term.get(params[:id]).destroy
      terms_changed
    end

    redirect to('/terms')
  end
  ...
end

Look at the post '/terms' block. This code is executed when we try to save a new term using the web page. First we create a Term object, then we start a transaction and try to save the object to the database. If the operation was successful, we send the list of terms (see the terms_changed method) as an array to the /topics/terms topic.

The nice thing about transactions is that they're atomic. This means that if an error occurs in the transaction block, the transaction is rolled back and the previous state is restored. For example, if an error occurs in our case while sending a message to the topic, the whole transaction is rolled back and the database state is restored too!

You can find another use of a transaction in the delete '/term/:id' block. It makes sure that we notify the terms topic only after the term has been successfully removed from the database.

Distributed transactions are an advanced topic; you can read more about them in the TorqueBox manual, which covers both the transactions themselves and their configuration options.
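
The all-or-nothing effect can be imitated in a few lines of plain Ruby. This toy is emphatically not how TorqueBox implements transactions: ToyTransaction is a hypothetical class that snapshots two in-memory arrays, standing in for the database and the topic, and restores them if the block raises.

```ruby
# Toy illustration of all-or-nothing behaviour on in-memory arrays.
class ToyTransaction
  def initialize(*resources)
    @resources = resources
  end

  # Run the block; if it raises, put every resource back as it was.
  # Returns true on success and false after a rollback.
  def run
    snapshots = @resources.map(&:dup)
    yield
    true
  rescue StandardError
    @resources.each_with_index { |r, i| r.replace(snapshots[i]) }
    false
  end
end

database = ['ruby'] # stands in for the terms table
topic    = []       # stands in for /topics/terms

tx = ToyTransaction.new(database, topic)

# Successful unit of work: both changes are kept.
committed = tx.run do
  database << 'jboss'
  topic << database.dup
end

# Failing unit of work: the term is saved, then publishing fails.
rolled_back = tx.run do
  database << 'broken'
  raise 'topic unavailable'
end
```

After the second block, database no longer contains 'broken': the save was undone because the publish step failed, which mirrors the behaviour described for the post '/terms' block.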

3.5. Wrapping up

Congratulations! You now know many of TorqueBox's features, and there are still many more to explore. To learn about them, open the TorqueBox manual and try them out.

The Poorsmatic application is just a simple example. You can always go back to the source code to see again how it was done.