Tony Gaskell: 2011

Monday, February 7, 2011

API Design for iHale

Many people in the technology field will be familiar with the term application programming interface (API). APIs are the essential building blocks of every program: code written by previous programmers to save you the hassle of reimplementing commonly used classes. They provide all programmers with standardized methods and classes that act as an interface between different programs.

Some qualities of good API design:

Modular.
Reusable.
Easy to use.
Difficult to misuse.
Powerful to do what it needs to do.
Easy to expand.

iHale

iHale top-level architecture

It's always exciting to get a chance to design a system from the ground up, which is exactly the opportunity that has been offered to our software engineering class.

But where do we begin?

The Scope
For this particular system there are a few design requirements that we must fulfill. Our system has to interact with sensors for all of the house devices. We decided to model it as a main system that manages different environments, each containing different devices. In our current Restlet API design, we tried to create resource objects that are sent over HTTP to issue commands to "actuators" (e.g., turning lights on and off) and to accept data from "sensors".

Generalizing the problem
Because we don't know exactly what type of hardware we are working with, we created a highly abstract API for our initial design. We felt that actuators and sensors are pretty similar, so we have them both extend a parent class called Device. This design gives each resource its own unique ID number, generated from a static variable that is incremented every time a device is created. This provides a simple way to reference a specific device without confusing it with another.
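A minimal sketch of that idea, assuming hypothetical class and method names (this is not the team's actual code). It also swaps the plain static int for an AtomicInteger so that IDs stay unique even if devices are created from several threads:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical parent class; names are illustrative only. */
abstract class Device {
    // Shared counter: each new device takes the next value, so IDs never repeat.
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    private final int id;

    protected Device() {
        this.id = NEXT_ID.getAndIncrement();
    }

    public int getId() {
        return id;
    }
}

/** A device that reports readings. */
class Sensor extends Device {
}

/** A device that accepts commands, such as switching a light on or off. */
class Actuator extends Device {
}

class DeviceDemo {
    public static void main(String[] args) {
        Device thermometer = new Sensor();
        Device lightSwitch = new Actuator();
        // The two IDs differ even though the classes differ.
        System.out.println(thermometer.getId() + " " + lightSwitch.getId());
    }
}
```

Because the counter lives in the parent class, sensors and actuators draw from the same sequence, which is what lets a single ID identify any device in the house.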

Though we don't know exactly what sensors the system will be talking to, we know that we will be using REST to send data between the iHale system and the devices. Therefore we need a way to send at least GET and PUT requests. We're unsure at the moment how we will handle the requests. The main thing is that we let users know which parameters they need to provide and what they can expect in return.

"Through Fire and Flames"
Our group met a few times over a week and spent a few good hours drawing up different designs, shooting down the bad ideas and keeping the good ones. But even after all the time that was spent on our design, in its current state I feel like we're going to have to end up redoing it. Why?

We were so focused on creating everything from the ground up, generalizing and abstracting, that I think we got a little carried away. What we forgot to do was look at the existing APIs for the systems we are using, namely Restlet's API, which already provides the Resource class. We ended up creating a class that already sort of exists instead of building on it. Apart from saving us the time of coding an entirely new class, using Restlet's class would allow for even more compatibility with Restlet and its pre-existing methods!

Thoughts
Even though I don't think our first API design was perfect, it was a great experience in software design. It pushed us to step back and view every part of the system, and even future expansions of the project. While it is good to be able to think abstractly about the system, you want to keep yourself grounded and work within your scope (in our case, the services we are using). Fortunately for us the API is in the early stages of development, so it's good for us to catch our mistakes now instead of later.

You can check out Team Maka's latest API distribution here.

Saturday, January 29, 2011

Berkeley DB and Persistency

Berkeley DB is the first non-relational database I've had the pleasure of working with. It has options to scale up to hundreds of terabytes. Though it is written in C, it provides API bindings for PHP, Java, Perl and Ruby, just to name a few.

Embedded Database
Berkeley DB is an embedded database, but what does that mean? It means that it must be integrated into the application that needs to access the stored data.
One weakness of embedded databases is that they only support one user (the application). But is that entirely bad? Not necessarily. Using another service, Restlet, we can integrate the two to provide multiple users with access to centralized data storage!

Kata 1: Timestamp as secondary index
The first distribution provided the user with three options: GET, PUT, DELETE. The task that was given was to implement a timestamp that could be used as a secondary index for each data entry.

get-timestamp: returns the Contact with the given timestamp, if such a Contact exists.
get-range: returns the set of Contacts with timestamps between two specified values.
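To make the two operations concrete, here is a small in-memory sketch. It does not use the Berkeley DB API at all: a TreeMap stands in for the secondary index, and the class name, method names and the use of a plain String for a Contact are all assumptions made for illustration. It also assumes timestamps are unique and that a contact is only stored once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a store with a primary key and a timestamp secondary index. */
class ContactIndex {
    // Primary index: unique ID -> contact name (a plain String keeps the sketch short).
    private final Map<String, String> byId = new HashMap<>();
    // Secondary index: timestamp -> unique ID, kept sorted so range lookups are simple.
    private final TreeMap<Long, String> idByTimestamp = new TreeMap<>();

    void put(String id, String name, long timestamp) {
        byId.put(id, name);
        idByTimestamp.put(timestamp, id);
    }

    /** get-timestamp: the contact stored at exactly this timestamp, or null if none. */
    String getByTimestamp(long timestamp) {
        String id = idByTimestamp.get(timestamp);
        return id == null ? null : byId.get(id);
    }

    /** get-range: contacts whose timestamps fall between start and end, inclusive. */
    List<String> getRange(long start, long end) {
        List<String> result = new ArrayList<>();
        for (String id : idByTimestamp.subMap(start, true, end, true).values()) {
            result.add(byId.get(id));
        }
        return result;
    }
}
```

The point of the secondary index is the same in both cases: the timestamp leads to the primary key, and the primary key leads to the record.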

Implementing the new options was actually much easier than I thought it would be. The Berkeley DB API provides very helpful examples that make implementation an easy process.

Completed
Time spent: 2 hours

Kata 2: Wicket, REST, Berkeley DB: A match made in heaven

Wicket Interface

Although I can agree with the name of this kata, there were a few nuances that plagued me throughout it. It is important to know how the three components interact with each other before diving into this system. I already mentioned Berkeley DB a little, so I want to talk about the other two components of this application.

Wicket
This is your user front end. Using markup language to create forms, textfields and buttons, along with Wicket tags, it allows you to create a structured program over stateless HTTP.

It's a very tough gap that Wicket tries to bridge, and I think it does a pretty good job at it. The one thing I haven't gotten down is the implementation of sessions. Though I feel I came pretty close to understanding the principle, I could not get the sessions to work properly. Wicket allows developers to create Session objects. Sessions have various applications depending on what you need. In this application we implement sessions in order to handle concurrent users. If you have many users trying to PUT and GET contacts from a centralized database, you want to make sure that people get what they asked for and not what someone else queried.

Restlet
This is the essential intermediary between Wicket and Berkeley DB. Berkeley DB is designed to only allow one user to update/query the database at a time. We add a Restlet layer on top of Berkeley DB in order to regulate all requests that are coming into the server. The good thing is that you won't have to worry about dirty reads (when a user requests data that is being modified by another user), because Restlet will sequence the reads and writes to the database. Another great thing about Restlet is that it can allow multiple people to access the database (even though it is designed for one). In future implementations you could have many users/devices sending requests to the server, as long as you can implement a Restlet layer on top of those devices to interact with the Restlet layer that handles the server's requests.
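To illustrate what "sequencing" requests in front of a single-user store can look like, here is a rough sketch. It is not how Restlet is implemented; it is just one common technique (a single worker thread) applied to a HashMap standing in for the database, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical gateway that funnels every request through one worker thread. */
class SerializedStore {
    // Stand-in for the single-user embedded database.
    private final Map<String, String> store = new HashMap<>();
    // One thread means requests run strictly one after another, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    Future<?> put(String key, String value) {
        Runnable write = () -> store.put(key, value);
        return worker.submit(write);
    }

    Future<String> get(String key) {
        Callable<String> read = () -> store.get(key);
        return worker.submit(read);
    }

    void shutdown() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SerializedStore db = new SerializedStore();
        db.put("contact", "first version");
        db.put("contact", "second version");
        // The read is queued behind both writes, so it never sees a half-finished update.
        System.out.println(db.get("contact").get());
        db.shutdown();
    }
}
```

Many callers can submit requests at once, but only the worker thread ever touches the store, so a read can never overlap with a write.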

Not Completed

Time spent: 10 hours+

Download my kata distribution here.

Sunday, January 23, 2011

Java Virtual Machine Heap

I came across an issue regarding memory allocation, and decided it would be in my best interest to learn how the Java Virtual Machine (JVM) manages its memory space.

JVM memory is made up of three segments: heap memory, non-heap memory, and the JVM's own code.

Heap Memory

A JVM heap is allocated at the start-up of every program. By default the initial heap size is 2MB and the max heap size is set to 64MB. But, in the event that your program needs more memory you can initialize it to have more using:

%java -Xmsn

Specifies the initial size of the memory allocation pool. This value must be a multiple of 1024 greater than 1 MB. Append the letter k or K to indicate kilobytes, the letter m or M to indicate megabytes, the letter g or G to indicate gigabytes, or the letter t or T to indicate terabytes. The default value is 2MB. Examples:

-Xms6291456
-Xms6144k
-Xms6m

Likewise, we can set a maximum heap size using:

%java -Xmxn
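To check which limits a running JVM actually picked up from -Xms and -Xmx, the standard Runtime class reports them. This is a small sketch; the class name HeapInfo and the helper toMegabytes are just names chosen for the example.

```java
/** Prints the heap limits the running JVM actually received. */
class HeapInfo {
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory: heap currently reserved by the JVM; it starts near the -Xms value.
        System.out.println("Current heap: " + toMegabytes(rt.totalMemory()) + " MB");
        // maxMemory: the ceiling the heap may grow to, which reflects -Xmx.
        System.out.println("Maximum heap: " + toMegabytes(rt.maxMemory()) + " MB");
        // freeMemory: unused space inside the currently reserved heap.
        System.out.println("Free in current heap: " + toMegabytes(rt.freeMemory()) + " MB");
    }
}
```

Running it once with no flags and once with something like -Xms6m -Xmx128m is a quick way to see the options take effect. The exact numbers printed vary by JVM and platform.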

Heap Dump Snapshots

There are a couple of commands you can use to obtain a view of your program's heap. The first is Java's HPROF tool, which enables CPU, heap, or monitor profiling. The second is the Java Heap Analysis Tool (JHAT).

%java -Xrunhprof:format=b,file=snapshot.hprof Classname
%jhat snapshot.hprof

Without the format option, HPROF writes the heap dump as a text file. But because sifting through possibly 40k+ lines of text is impractical, we can write the dump in binary format (format=b) so that it is easier to browse with JHAT, as well as make use of Object Query Language (OQL).

After running these two commands, you can view the heap file by opening up your browser and going to http://localhost:7000.

JConsole

Overview tab

This is a tool that was included in JDK 5.0 and later distributions. It provides the user with a GUI for monitoring specific data such as memory and thread usage.

Some useful tabs that are provided: Overview, Memory, Threads, Classes, and the VM Summary. You can adjust the time frames for even more specific analysis.
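The same kind of data is also available from inside a program through the standard java.lang.management API. The sketch below is only meant to show where a few of those numbers come from; the class name VmSnapshot and its helper methods are made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Reads a few of the values that the Memory, Threads and Classes tabs display. */
class VmSnapshot {
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    static int loadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Heap used: " + heapUsedBytes() + " bytes");
        System.out.println("Live threads: " + liveThreads());
        System.out.println("Loaded classes: " + loadedClasses());
    }
}
```

This prints a single snapshot, whereas JConsole keeps sampling and charts the values over time.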

Tuesday, January 18, 2011

Restlet Katas

As a follow-up to the Wicket Katas, our class was presented with another set for Restlet.
If you would like to follow along with this kata, the base system is located here.

Kata 1: Time Resources

Add three new resources called "hour", "minute" and "second" that return the current hour, minute and second.

Write unit tests for each of these resources and make sure they pass, and make sure you can run them both from within Eclipse and using Ant.
Make sure the DateClient system accepts these parameters.
Ensure that "ant -f verify.build.xml" passes with no errors.

We are provided with two jar files, one for the client and one for the server. By running them in two separate terminal windows, you can see how the client and server interact with each other. By default the client sends a GET request to the server to retrieve a certain resource defined by the parameters: <host> <resource>.

When the request is sent from the client side, you will see in the server window how the server accepts the request, as well as the response (whether the resource was returned, or not).

This demonstrates the basic request-response relationship between client and server in the REST architecture.
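The value each of the three new resources returns is simple to compute. The sketch below leaves out the Restlet resource classes entirely and only shows the lookup that would sit behind them. The class and method names are hypothetical, and it uses java.time, which needs Java 8 or later.

```java
import java.time.LocalTime;

/** Maps the resource names from the kata to the matching part of a time. */
class TimeResources {
    static int valueOf(String resource, LocalTime time) {
        switch (resource) {
            case "hour":
                return time.getHour();
            case "minute":
                return time.getMinute();
            case "second":
                return time.getSecond();
            default:
                throw new IllegalArgumentException("Unknown resource: " + resource);
        }
    }

    public static void main(String[] args) {
        LocalTime now = LocalTime.now();
        for (String resource : new String[] {"hour", "minute", "second"}) {
            System.out.println(resource + " = " + valueOf(resource, now));
        }
    }
}
```

Passing the time in as a parameter, rather than calling LocalTime.now() inside the method, is what makes unit tests for these resources easy to write.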

Kata 2: Logging

By default, Restlet logs data about each request to the command line using the Java Logging API. Read a tutorial (and/or google) to get familiar with the API, then do the following:

define a ~/.dateservice/logging.properties file that enables the user to specify (among other things) that the logging information should go to a file, not the command line.
tell the restlet-dateservice application to read and use the logging properties file in ~/.dateservice/logging.properties.

Using Java's FileHandler class, we can designate a properties file stored on the server side that creates and appends logs to an external file. This is a very useful kata because it shows that we can change the server's properties on the fly.
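As a rough sketch of the second step, the standard LogManager can read a properties file at start-up. The class and method names below are assumptions for illustration, and the properties shown in the comment are only an example of what such a file might contain (the log file name is made up).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.LogManager;

/**
 * Example logging.properties content (illustrative only):
 *   handlers = java.util.logging.FileHandler
 *   java.util.logging.FileHandler.pattern = %h/.dateservice/dateservice.log
 *   java.util.logging.FileHandler.append = true
 *   java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
 */
class LoggingSetup {
    /** Loads java.util.logging settings from the file; returns false if it does not exist. */
    static boolean configureFrom(Path propertiesFile) throws IOException {
        if (!Files.isRegularFile(propertiesFile)) {
            return false; // keep the JVM's default console logging
        }
        try (InputStream in = Files.newInputStream(propertiesFile)) {
            LogManager.getLogManager().readConfiguration(in);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(System.getProperty("user.home"), ".dateservice", "logging.properties");
        System.out.println("Loaded " + file + ": " + configureFrom(file));
    }
}
```

In the pattern, %h expands to the user's home directory, so the log file ends up next to the properties file.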

This blog post was also very helpful when trying to set up the REST server and the properties file.

Kata 3: Authentication

The current restlet-dateservice system performs no authentication. Check the user guide (and/or google) for guidance, then implement HTTP Basic authentication for all the resources in the restlet-dateservice system. To simplify things, all resources should be available as long as the requesting system uses HTTP Basic authentication and passes the username "guest" and password "pw". (You can hardcode that password into both the client and server sides of the system if you like.)

This kata makes use of Restlet's ChallengeAuthenticator class, allowing the developer to define a Map of entries (username/password) that are allowed access to the server. When a browser sends a request to the server, it will prompt the user for authentication.
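It helps to see what HTTP Basic authentication actually puts on the wire. The sketch below does not use Restlet's ChallengeAuthenticator. It only builds and checks the Authorization header with the standard Base64 class (Java 8 or later), and the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds and checks the header used by HTTP Basic authentication. */
class BasicAuth {
    /** The Authorization header value a client sends: "Basic " + base64("user:password"). */
    static String headerFor(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    /** Server-side check against the kata's single hardcoded account. */
    static boolean isAuthorized(String header) {
        return headerFor("guest", "pw").equals(header);
    }

    public static void main(String[] args) {
        String header = headerFor("guest", "pw");
        System.out.println("Authorization: " + header);
        System.out.println("Accepted: " + isAuthorized(header));
    }
}
```

Note that Base64 is an encoding, not encryption, so anyone who can see the request can recover the username and password.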

Kata 4: Wicket

For this Kata, create a new client for the DateService server that is a web application using Wicket rather than a command line application. The web application should bring up a single page with a single form that requests what aspect of the date (year, month, day, hour, minute, second) the user desires. When the user submits the form, the web application uses Restlet to obtain the appropriate information from the DateService server, then presents the results to the user in the page.

The web application should be placed in its own package. Note that you will have to modify the build.xml file to include Ivy-based retrieval of the Wicket libraries.

Modify the jar.build.xml to build a third jar file called restlet-dateservice-webapp.jar containing just the client classes and libraries required to run the web application.

Make sure to provide a JUnit test case for your web application, and that 'ant -f verify.build.xml' passes without errors.

In this final kata, we have to create a Wicket application to act as the user of the Restlet server. The application does the same thing as the first kata: it sends a GET request to the server and waits for a response. When it receives a response from the server, we can assign it to a label.

Using a session, we can initiate a new page session with the updated label object which will be the server's response depending on which resource was requested.

Thoughts

I think these katas were good to introduce us to the REST architecture. There's a bit of research that is needed and it can be quite a large time investment, but I think understanding how the architecture works is mandatory for dynamic web programming. At the time of this post I have not been able to complete the final kata because of this error when I try to send a GET request from the client-side:

INFO: A recoverable error was detected (1001), attempting again in 2000 ms.

GET http://localhost:8112/dateserver/year: Response 1001: Cannot allocate memory

Google shows that a 1001 response indicates a connection issue and it does not seem that my requests are being received by the server.