Saturday, January 29, 2011

Berkeley DB and Persistency



Berkeley DB is the first non-relational database I've had the pleasure of working with.  It has options to scale up to hundreds of terabytes.  Though it is written in C, it provides API bindings for PHP, Java, Perl and Ruby, just to name a few.


Embedded Database
Berkeley DB is an embedded database, but what does that mean?  It means that it runs as a library inside the application that needs to access the stored data, rather than as a separate server.
One weakness of an embedded database is that it only supports one user (the application).  But is that entirely bad?  Not necessarily.  Using another framework, Restlet, we can integrate the two to provide multiple-user access to centralized data storage!


Kata 1: Timestamp as secondary index 
The first distribution provided the user with three options: GET, PUT, DELETE.  The task that was given was to implement a timestamp that could be used as a secondary index for each data entry.

get-timestamp: returns the Contact with the given timestamp, if such a Contact exists.
get-range: returns the set of Contacts with timestamps between two specified values. 

Implementing the new options was actually much easier than I thought it would be.  The Berkeley DB API documentation provides very helpful examples that make implementation an easy process.
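To show the idea behind the two new commands, here is a minimal sketch that does not use the Berkeley DB API at all.  It swaps in plain in-memory maps: a HashMap as the primary index and a sorted TreeMap as the timestamp secondary index.  The Contact fields, the class names, and the assumption that every timestamp is unique are mine, not the kata's.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a contact store with a timestamp secondary index. */
final class ContactStore {
  /** Hypothetical record type; the kata's real Contact class may differ. */
  static final class Contact {
    final String id;
    final String name;
    final long timestamp;

    Contact(String id, String name, long timestamp) {
      this.id = id;
      this.name = name;
      this.timestamp = timestamp;
    }
  }

  // Primary index: unique id -> contact.
  private final Map<String, Contact> byId = new HashMap<>();
  // Secondary index: timestamp -> contact, kept sorted so range scans are cheap.
  // Assumption: no two contacts share a timestamp.
  private final NavigableMap<Long, Contact> byTimestamp = new TreeMap<>();

  void put(Contact contact) {
    Contact old = byId.put(contact.id, contact);
    if (old != null) {
      byTimestamp.remove(old.timestamp); // keep the secondary index in sync
    }
    byTimestamp.put(contact.timestamp, contact);
  }

  void delete(String id) {
    Contact old = byId.remove(id);
    if (old != null) {
      byTimestamp.remove(old.timestamp);
    }
  }

  /** get-timestamp: the contact with exactly this timestamp, or null. */
  Contact getTimestamp(long timestamp) {
    return byTimestamp.get(timestamp);
  }

  /** get-range: all contacts with start <= timestamp <= end, oldest first. */
  List<Contact> getRange(long start, long end) {
    if (start > end) {
      return new ArrayList<>(); // empty range instead of an exception
    }
    return new ArrayList<>(byTimestamp.subMap(start, true, end, true).values());
  }

  public static void main(String[] args) {
    ContactStore store = new ContactStore();
    store.put(new Contact("a", "Ann", 100L));
    store.put(new Contact("b", "Bob", 200L));
    store.put(new Contact("c", "Cy", 300L));
    System.out.println(store.getTimestamp(200L).name);
    System.out.println(store.getRange(100L, 200L).size());
  }
}
```

The point of the second map is that every PUT and DELETE has to update both indexes, which is the bookkeeping a database's secondary index support does for you.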

Completed
Time spent: 2 hours


Kata 2: Wicket, REST, Berkeley DB: A match made in heaven
Wicket Interface
Although I can agree with the name of this kata, there were a few nuances that plagued me throughout its duration.  It is important to know how the three components interact with each other before diving into this system.  I already mentioned Berkeley DB a little, so I want to talk about the other two components of this application.

Wicket
This is your user front end.  Using a markup language to create forms, text fields and buttons, along with Wicket tags, it allows you to create a structured program over stateless HTTP.

Bridging that gap is tough, and I think Wicket does a pretty good job of it.  The one thing I haven't gotten down is the implementation of sessions.  Though I feel I came pretty close to understanding the principle, I could not get the sessions to work properly.  Wicket allows developers to create Session objects, which have various uses depending on what you need.  In this application we use sessions to keep concurrent users separate: if many users are trying to PUT and GET contacts from a centralized database, you want to make sure that each person gets what they asked for and not what someone else queried.

Restlet
This is the essential intermediary between Wicket and Berkeley DB.  As an embedded database, Berkeley DB is set up here so that only one client (the application) updates or queries it at a time.  We add a Restlet layer on top of Berkeley DB to regulate all requests coming into the server.  The good thing is that you won't have to worry about dirty reads (a user reading data that another user is in the middle of modifying), because Restlet will sequence the reads and writes to the database.  Another great thing about Restlet is that it lets multiple people access the database, even though it is designed for one.  In future implementations you could have many users or devices sending requests to the server, as long as each has a Restlet client layer that talks to the Restlet layer handling the server's requests.
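The sequencing idea can be sketched without Restlet or Berkeley DB.  In the sketch below, which is only an illustration and not how Restlet is implemented, a HashMap stands in for the embedded database and a single-threaded executor stands in for the layer in front of it, so requests from many client threads are applied one at a time.  The class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Funnels every read and write through one worker thread, one request at a time. */
final class SerializedStore implements AutoCloseable {
  // Stand-in for the embedded database; only the worker thread ever touches it.
  private final Map<String, String> data = new HashMap<>();
  private final ExecutorService worker = Executors.newSingleThreadExecutor();

  /** Stores a value and returns the previous one (or null). */
  String put(String key, String value) {
    return run(() -> data.put(key, value));
  }

  /** Reads a value; it can never observe a half-finished write. */
  String get(String key) {
    return run(() -> data.get(key));
  }

  // Queues the task behind all earlier requests and waits for its result.
  private <T> T run(Callable<T> task) {
    try {
      return worker.submit(task).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for the store", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("store operation failed", e.getCause());
    }
  }

  @Override
  public void close() {
    worker.shutdown(); // lets the JVM exit once queued work is done
  }

  public static void main(String[] args) throws InterruptedException {
    try (SerializedStore store = new SerializedStore()) {
      Thread[] clients = new Thread[4];
      for (int i = 0; i < clients.length; i++) {
        final int n = i;
        clients[i] = new Thread(() -> store.put("contact-" + n, "client " + n));
        clients[i].start();
      }
      for (Thread t : clients) {
        t.join();
      }
      System.out.println(store.get("contact-2"));
    }
  }
}
```

The trade-off is throughput: every request waits its turn, which is fine for a small contact database but would become a bottleneck under heavy load.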

Not Completed
Time spent: 10 hours+


Download my kata distribution here.

Sunday, January 23, 2011

Java Virtual Machine Heap

I came across an issue regarding memory allocation, and decided it would be in my best interests to learn how the Java Virtual Machine (JVM) manages its memory space.
The JVM's memory is made up of three segments: heap memory, non-heap memory, and the code of the JVM itself.
Heap Memory

A JVM heap is allocated at the start-up of every program.  By default the initial heap size is 2MB and the max heap size is set to 64MB.  But in the event that your program needs more memory, you can start it with a larger initial heap using:

%java -Xmsn

Specifies the initial size of the memory allocation pool.  This value must be a multiple of 1024 greater than 1 MB.  Append the letter k or K to indicate kilobytes, the letter m or M to indicate megabytes, the letter g or G to indicate gigabytes, or the letter t or T to indicate terabytes.  The default value is 2MB.  Examples:
  • -Xms6291456
  • -Xms6144k
  • -Xms6m
Likewise, we can set a maximum heap size using:

%java -Xmxn
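One way to check which limits the JVM actually picked up is to ask the Runtime class from inside the program.  This is a small sketch; HeapInfo is a name I made up, and the reported maximum is only roughly the -Xmx value, since some JVMs report slightly less.

```java
/** Prints the heap limits the running JVM is using. */
final class HeapInfo {
  /** Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes). */
  static long toMegabytes(long bytes) {
    return bytes / (1024L * 1024L);
  }

  public static void main(String[] args) {
    Runtime rt = Runtime.getRuntime();
    // maxMemory() is the most the heap may grow to (governed by -Xmx).
    System.out.println("max heap:     " + toMegabytes(rt.maxMemory()) + " MB");
    // totalMemory() is the heap currently reserved; it starts near -Xms.
    System.out.println("current heap: " + toMegabytes(rt.totalMemory()) + " MB");
    // freeMemory() is the unused part of the currently reserved heap.
    System.out.println("free in heap: " + toMegabytes(rt.freeMemory()) + " MB");
  }
}
```

Running it as "java -Xms6m -Xmx64m HeapInfo" and then again with different values makes the effect of the two flags easy to see.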

Heap Dump Snapshots

There are a couple of commands you can use to obtain a view of your program's heap.  The first is HPROF, a profiling agent that ships with the JDK and enables CPU, heap, or monitor profiling.  The second is the Java Heap Analysis Tool (jhat).

%java -Xrunhprof:format=b,file=snapshot.hprof Classname
%jhat snapshot.hprof

Without the format option, HPROF writes the heap dump as a text file.  But because that can mean sifting through 40k+ lines of text, we use the binary format (format=b) so that the dump is easier to browse with jhat, which also lets us use Object Query Language (OQL).

After running these two commands, you can view the heap file by opening up your browser and going to http://localhost:7000.


JConsole
Overview tab
This is a tool that was included in JDK 5.0 and later distributions.  It provides the user with a GUI for monitoring specific data such as memory and thread usage.

Some useful tabs that are provided: Overview, Memory, Threads, Classes, and the VM Summary.  You can adjust the time frames for even more specific analysis.
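The numbers JConsole displays come from the JVM's management beans, which a program can also read directly through java.lang.management.  Here is a minimal sketch that prints a one-line summary similar to the Memory, Threads and Classes tabs; JvmSnapshot and describe are names I chose for the example.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

/** Reads a few of the values JConsole shows, from inside the running JVM. */
final class JvmSnapshot {
  static String describe() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
    // heap.getMax() is -1 when the JVM does not define a maximum.
    return "heap used=" + heap.getUsed() + " bytes"
        + ", heap committed=" + heap.getCommitted() + " bytes"
        + ", heap max=" + heap.getMax() + " bytes"
        + "; live threads=" + threads.getThreadCount()
        + "; loaded classes=" + classes.getLoadedClassCount();
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```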

Tuesday, January 18, 2011

Restlet Katas

As a follow-up to the Wicket Katas, our class was presented with another set for Restlet.
If you would like to follow along with this kata, the base system is located here.

Kata 1: Time Resources

Add three new resources called "hour", "minute" and "second" that return the current hour, minute and second.
  • Write unit tests for each of these resources and make sure they pass, and make sure you can run them both from within Eclipse and using Ant.  
  • Make sure the DateClient system accepts these parameters. 
  • Ensure that "ant -f verify.build.xml" passes with no errors.  

We are provided with two jar files, one for the client and one for the server.  By running each in its own terminal window, you can see how the client and server interact with each other.  By default the client sends a GET request to the server to retrieve a certain resource defined by the parameters: <host> <resource>.

When the request is sent from the client side, you will see in the server window how the server accepts the request, as well as the response (whether the resource was returned, or not).

This demonstrates the basic request-response relationship between client and server in the REST architecture.
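Stripped of the Restlet plumbing, each of the three new resources just returns one field of the current time as text.  The sketch below shows only that lookup, under my own names (TimeResources, valueOf); in the actual kata each value would be returned from a Restlet resource class, which is not shown here.

```java
import java.time.LocalTime;

/** Maps a resource name to the matching field of a given time. */
final class TimeResources {
  /** Returns the text for "hour", "minute" or "second", or null for an unknown name. */
  static String valueOf(String resource, LocalTime now) {
    switch (resource) {
      case "hour":
        return String.valueOf(now.getHour()); // 0-23
      case "minute":
        return String.valueOf(now.getMinute()); // 0-59
      case "second":
        return String.valueOf(now.getSecond()); // 0-59
      default:
        return null; // a real server would answer 404 Not Found here
    }
  }

  public static void main(String[] args) {
    LocalTime now = LocalTime.now();
    for (String resource : new String[] {"hour", "minute", "second"}) {
      System.out.println(resource + " = " + valueOf(resource, now));
    }
  }
}
```

Passing the time in as a parameter, rather than calling LocalTime.now() inside the method, is what makes the unit tests the kata asks for straightforward to write.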


Kata 2: Logging

By default, Restlet logs data about each request to the command line using the Java Logging API.  Read a tutorial (and/or google) to get familiar with the API, then do the following:
  • define a ~/.dateservice/logging.properties file that enables the user to specify (among other things) that the logging information should go to a file, not the command line.
  • tell the restlet-dateservice application to read and use the logging properties file in  ~/.dateservice/logging.properties.

Using Java's FileHandler class, we can designate a properties file, stored on the server side, that tells the logger to create and append logs to an external file.  This is a very useful kata because it shows that we can change the server's logging behavior without touching the code.
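Here is a rough sketch of the loading step using only java.util.logging.  The property keys in the comment are the standard FileHandler ones, but the log file name and the class and method names are my own choices, not taken from the restlet-dateservice code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Example ~/.dateservice/logging.properties (%h expands to the user's home directory;
// the log file name is hypothetical and its directory must already exist):
//   handlers = java.util.logging.FileHandler
//   .level = INFO
//   java.util.logging.FileHandler.pattern = %h/.dateservice/dateservice.log
//   java.util.logging.FileHandler.append = true
//   java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter

/** Points java.util.logging at a user-supplied properties file. */
final class LoggingSetup {
  /** The location named in the kata: ~/.dateservice/logging.properties. */
  static Path defaultConfig() {
    return Paths.get(System.getProperty("user.home"), ".dateservice", "logging.properties");
  }

  /** Replaces the current logging configuration with the one read from the stream. */
  static void loadFrom(InputStream in) {
    try {
      LogManager.getLogManager().readConfiguration(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Loads the file if it exists; returns false (keeping the defaults) if it does not. */
  static boolean loadFromFile(Path file) {
    if (!Files.isRegularFile(file)) {
      return false;
    }
    try (InputStream in = Files.newInputStream(file)) {
      loadFrom(in);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean loaded = loadFromFile(defaultConfig());
    Logger.getLogger("dateservice").info("logging.properties loaded: " + loaded);
  }
}
```

The call has to happen before the server starts handling requests, so that the first request is already logged to the file.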

This blog post was also very helpful when trying to set up the REST server and the properties file.

Kata 3: Authentication

The current restlet-dateservice system performs no authentication.  Check the user guide (and/or google) for guidance, then implement HTTP Basic authentication for all the resources in the restlet-dateservice system.  To simplify things, all resources should be available as long as the requesting system uses HTTP Basic authentication and passes the username "guest" and password "pw". (You can hardcode that password into both the client and server sides of the system if you like.)

This kata makes use of Restlet's ChallengeAuthenticator class, allowing the developer to define a Map of entries (username/password) that are allowed to access the server.  When a request arrives without valid credentials, the server challenges the client, and a browser will prompt the user for a username and password.
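Under the hood, HTTP Basic authentication is just a header: the client sends "Authorization: Basic " followed by the Base64 encoding of "username:password".  The sketch below shows that encoding and a matching server-side check with plain Java; it does not use Restlet's classes, and BasicAuth and its methods are names I invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds and checks HTTP Basic Authorization header values. */
final class BasicAuth {
  /** Client side: the value to send in the Authorization header. */
  static String headerValue(String user, String password) {
    String pair = user + ":" + password;
    return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
  }

  /** Server side: true only if the header carries exactly the expected credentials. */
  static boolean isAuthorized(String header, String user, String password) {
    return header != null && header.equals(headerValue(user, password));
  }

  public static void main(String[] args) {
    String header = headerValue("guest", "pw");
    System.out.println("Authorization: " + header); // prints "Authorization: Basic Z3Vlc3Q6cHc="
    System.out.println(isAuthorized(header, "guest", "pw"));
  }
}
```

Note that Base64 is an encoding, not encryption, so anyone who can see the request can recover the password; that is acceptable for a kata with a hardcoded "guest"/"pw" pair but not for real credentials over plain HTTP.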

Kata 4: Wicket

For this Kata, create a new client for the DateService server that is a web application using Wicket rather than a command line application.  The web application should bring up a single page with a single form that requests what aspect of the date (year, month, day, hour, minute, second) the user desires.  When the user submits the form, the web application uses Restlet to obtain the appropriate information from the DateService server, then presents the results to the user in the page.

The web application should be placed in its own package.  Note that you will have to modify the build.xml file to include Ivy-based retrieval of the Wicket libraries. 

Modify the jar.build.xml to build a third jar file called restlet-dateservice-webapp.jar containing just the client classes and libraries required to run the web application.  

Make sure to provide a JUnit test case for your web application, and that 'ant -f verify.build.xml' passes without errors. 

In this final kata, we have to create a Wicket application to act as the user of the Restlet server.  The application does the same thing as the client in the first kata: it sends a GET request to the server and waits for a response.  When it receives a response from the server, we can assign it to a label.

Using a session, we can initiate a new page session with the updated label object which will be the server's response depending on which resource was requested.

Thoughts 
I think these katas were good to introduce us to the REST architecture.  There's a bit of research that is needed and it can be quite a large time investment, but I think understanding how the architecture works is mandatory for dynamic web programming.  At the time of this post I have not been able to complete the final kata because of this error when I try to send a GET request from the client-side:

INFO: A recoverable error was detected (1001), attempting again in 2000 ms.
GET http://localhost:8112/dateserver/year: Response 1001: Cannot allocate memory

A Google search suggests that a 1001 response indicates a connection issue, and it does not seem that my requests are being received by the server.