Efficiently Producing and Consuming CSV in Java using HTTP-RPC

In a previous article, I discussed how the open-source HTTP-RPC framework can be used to efficiently transform JDBC query results into JSON. However, while JSON is an extremely common and well-supported data format, it may in some cases be preferable to return a CSV document instead. Because field keys are specified only at the beginning of a CSV document rather than being duplicated for every record, CSV generally requires less bandwidth than JSON. Additionally, consumers can begin processing CSV as soon as the first record arrives, rather than waiting for the entire document to download.

The previous article demonstrated how HTTP-RPC's ResultSetAdapter and JSONEncoder classes could be used to easily generate a JSON response from a SQL query on this table, taken from the MySQL sample database:

CREATE TABLE pet (
    name VARCHAR(20),
    owner VARCHAR(20),
    species VARCHAR(20), 
    sex CHAR(1), 
    birth DATE, 
    death DATE
);

The following service method returns the same results encoded as CSV. The only difference in this version is the use of HTTP-RPC's CSVEncoder class instead of JSONEncoder:

@RequestMethod("GET")
public void getPets(String owner) throws SQLException, IOException {
    try (Connection connection = DriverManager.getConnection(DB_URL)) {
        Parameters parameters = Parameters.parse("SELECT name, species, sex, birth FROM pet WHERE owner = :owner");

        parameters.put("owner", owner);

        try (PreparedStatement statement = connection.prepareStatement(parameters.getSQL())) {
            parameters.apply(statement);

            try (ResultSet resultSet = statement.executeQuery()) {
                CSVEncoder csvEncoder = new CSVEncoder();
                
                csvEncoder.writeValue(new ResultSetAdapter(resultSet), getResponse().getOutputStream());
            }
        }
    } finally {
        getResponse().flushBuffer();
    }
}

The response might look something like this, where each record in the document represents a row from the result set (dates are represented using epoch time):

"name","species","sex","birth"
"Claws","cat","m",763880400000
"Chirpy","bird","f",905486400000
"Whistler","bird",,881643600000

Consuming a CSV Response

On the other end, HTTP-RPC's CSVDecoder class can be used to efficiently process the contents of a CSV document. Rather than reading the entire payload into memory and returning the data as a random-access list of map values, CSVDecoder returns a cursor over the records in the document. This allows a client to read records essentially as they are being produced, reducing memory consumption and improving throughput.

For example, the following code could be used to consume the response generated by the web service:

WebServiceProxy webServiceProxy = new WebServiceProxy("GET", new URL("http://localhost:8080/httprpc-test/pets"));

webServiceProxy.getArguments().put("owner", "Gwen");
webServiceProxy.getArguments().put("format", "csv");

webServiceProxy.invoke((inputStream, contentType) -> {
    CSVDecoder csvDecoder = new CSVDecoder();

    CSVDecoder.Cursor pets = csvDecoder.readValues(inputStream);

    for (Map<String, String> pet : pets) {
        System.out.println(String.format("%s is a %s", pet.get("name"), pet.get("species")));
    }

    return null;
});

This code produces the following output:

Claws is a cat
Chirpy is a bird
Whistler is a bird

Additionally, the adapt() method of the CSVDecoder.Cursor class can be used to facilitate typed iteration of CSV data. This method produces an Iterable sequence of values of a given type representing the rows in the document. The returned adapter uses dynamic proxy invocation to map properties declared by the interface to map values in the cursor. A single proxy instance is used for all rows to minimize heap allocation.

For example, the following interface might be used to model the pet records shown above:

public interface Pet {
    public String getName();
    public String getOwner();
    public String getSpecies();
    public String getSex();
    public Date getBirth();
}

This code uses adapt() to create an iterable sequence of Pet values from the CSV response:

CSVDecoder csvDecoder = new CSVDecoder();

CSVDecoder.Cursor pets = csvDecoder.readValues(inputStream);

for (Pet pet : pets.adapt(Pet.class)) {
    System.out.println(String.format("%s is a %s", pet.getName(), pet.getSpecies()));
}

The output is identical to the previous example; however, in this case, the fields of each record are retrieved in a type-safe manner (a feature whose value becomes much more obvious when handling CSV documents containing numeric or boolean values, for example).

More Information

This article provided an example of how HTTP-RPC's CSVEncoder and CSVDecoder classes can be used to efficiently stream and process JDBC query results as CSV. For more information, see the project README.

HTTP-RPC 5.5 Released

HTTP-RPC is an open-source framework for implementing REST services in Java. It is extremely lightweight and requires only a Java runtime environment and a servlet container. The entire framework is distributed as a single JAR file that is about 52KB in size, making it an ideal choice for applications such as microservices where a minimal footprint is desired.

HTTP-RPC version 5.5 is now available for download and via Maven Central. The primary enhancement in this release is the new WebServiceProxy class, which allows an HTTP-RPC web service to act as a consumer of other REST services.

Service proxies are initialized via a constructor that takes the following arguments:

  • method – the HTTP method to execute
  • url – an instance of java.net.URL representing the target of the operation

Request headers and arguments are specified via the proxy's getHeaders() and getArguments() methods, respectively. Like HTML forms, arguments are submitted either via the query string or in the request body. As with HTML, POST requests may be submitted as either "application/x-www-form-urlencoded" or "multipart/form-data" (specified via the proxy's setEncoding() method). Custom request encodings are also supported.

Service operations are invoked via one of the following methods:

public <T> T invoke() throws IOException { ... }
public <T> T invoke(ResponseHandler<T> responseHandler) throws IOException { ... }

The first version automatically deserializes a successful response using HTTP-RPC's JSONDecoder class. The second version allows a caller to perform custom deserialization of the server response.

For example, the following code snippets demonstrate how WebServiceProxy might be used to access the operations of a simple math service:

// GET /math/sum?a=2&b=4
WebServiceProxy webServiceProxy = new WebServiceProxy("GET", new URL("http://localhost:8080/httprpc-test/math/sum"));

webServiceProxy.getArguments().put("a", 4);
webServiceProxy.getArguments().put("b", 2);

Number result = webServiceProxy.invoke();

System.out.println(result); // 6.0
// GET /math/sum?values=1&values=2&values=3
WebServiceProxy webServiceProxy = new WebServiceProxy("GET", new URL("http://localhost:8080/httprpc-test/math/sum"));

webServiceProxy.getArguments().put("values", Arrays.asList(1, 2, 3));

Number result = webServiceProxy.invoke();

System.out.println(result); // 6.0

Typed Web Service Access

Additionally, the adapt() methods of the WebServiceProxy class can be used to facilitate type-safe access to web services:

public static <T> T adapt(URL baseURL, Class<T> type) { ... }
public static <T> T adapt(URL baseURL, Class<T> type, Map<String, ?> headers) { ... }

Both versions take a base URL and an interface type as arguments and return an instance of the given type that can be used to invoke service operations. The second version also accepts a map of header values that will be submitted with every service request.

The RequestMethod annotation is used to associate an HTTP verb with an interface method. The optional ResourcePath annotation can be used to associate the method with a specific path relative to the base URL. If unspecified, the method is associated with the base URL itself.

For example, the following interface might be used to model the operations of the example math service:

public interface MathService {
    @RequestMethod("GET")
    @ResourcePath("sum")
    public Number getSum(double a, double b) throws IOException;

    @RequestMethod("GET")
    @ResourcePath("sum")
    public Number getSum(List<Double> values) throws IOException;
}

This code snippet uses the adapt() method to create an instance of MathService, then invokes the getSum() methods on the returned instance. The results are identical to the previous example:

MathService mathService = WebServiceProxy.adapt(new URL("http://localhost:8080/httprpc-test/math/"), MathService.class);

// GET /math/sum?a=2&b=4
mathService.getSum(4, 2); // 6.0

// GET /math/sum?values=1&values=2&values=3
mathService.getSum(Arrays.asList(1.0, 2.0, 3.0)); // 6.0

Summary

This article introduced HTTP-RPC's new WebServiceProxy class, which allows an HTTP-RPC service to act as a consumer of other REST services. For more information, see the project README.

HTTP-RPC 5.3 Released

HTTP-RPC 5.3 is now available for download. This release adds support for typed result set iteration.

As previously discussed, HTTP-RPC's ResultSetAdapter class can be used to optimize transformation of JDBC result set data to JSON. The new adapt() method the 5.3 release adds to ResultSetAdapter can be used to facilitate type-safe iteration of query results. This method takes a ResultSet and an interface type as arguments, and returns an Iterable of the given type representing the rows in the result set.

For example, this interface might be used to model the results of a query on the "pet" table from the MySQL sample database:

public interface Pet {
    public String getName();
    public String getOwner();
    public String getSpecies();
    public String getSex();
    public Date getBirth();
}

The following service method uses adapt() to create an iterable sequence of Pet values. It wraps the adapter's iterator in a stream, and then uses the stream to calculate the average age of all pets in the database. The getBirth() method declared by the Pet interface is used to retrieve each pet's age in epoch time. The average value is converted to years at the end of the method:

@RequestMethod("GET")
@ResourcePath("/average-age")
public double getAverageAge() throws SQLException {
    Date now = new Date();

    double averageAge;
    try (Connection connection = DriverManager.getConnection(DB_URL);
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("SELECT birth FROM pet")) {
        Iterable<Pet> pets = ResultSetAdapter.adapt(resultSet, Pet.class);

        Stream<Pet> stream = StreamSupport.stream(pets.spliterator(), false);

        averageAge = stream.mapToLong(pet -> now.getTime() - pet.getBirth().getTime()).average().getAsDouble();
    }

    return averageAge / (365.0 * 24.0 * 60.0 * 60.0 * 1000.0);
}

For more information, see the project README.

Efficiently Transforming JDBC Query Results to JSON

A lot of enterprise data is stored in relational databases and accessed via SQL queries. Many web services are little more than HTTP-based wrappers around such queries.

Unfortunately, transforming query results to JSON so it can be consumed by a client application often involves numerous inefficient steps, such as binding each row to a data object and loading the entire data set into memory before serializing it back to the caller. This type of approach has a negative impact on performance and scalabilty. Each row requires multiple heap allocations and constructor invocations, increasing latency and CPU load. Worse, the caller does not receive a response until the entire data set has been processed.

Further, since each response is loaded entirely into memory, high-volume applications require a large amount of RAM, and can only scale through the addition of more physical hardware. Eventually, the garbage collector has to run, slowing down the entire system.

A much more efficient approach is to stream response data. Instead of copying the query results into an in-memory data structure before sending the response, the web service can write a row of data to the output stream each time a row is read from the result set. This allows a client to begin receiving the data as soon as it is available, significantly reducing latency. Also, because no intermediate data structures are created, CPU and memory load is reduced, allowing each server to handle a higher number of concurrent requests. Finally, because fewer heap allocations are required, the garbage collector needs to run much less frequently, resulting in fewer system pauses.

Introducing HTTP-RPC

HTTP-RPC is an open-source framework for implementing REST services in Java. It is extremely lightweight and requires only a Java runtime environment and a servlet container. The entire framework is distributed as a single JAR file that is less than 30KB in size, making it an ideal choice for applications such as microservices where a minimal footprint is desired.

DispatcherServlet

HTTP-RPC’s DispatcherServlet type provides an abstract base class for REST-based web services. Service operations are defined by adding public methods to a concrete service implementation.

Methods are invoked by submitting an HTTP request for a path associated with a servlet instance. Arguments are provided either via the query string or in the request body, like an HTML form. DispatcherServlet converts the request parameters to the expected argument types, invokes the method, and writes the return value to the output stream as JSON.

The RequestMethod annotation is used to associate a service method with an HTTP verb such as GET or POST. The optional ResourcePath annotation can be used to associate the method with a specific path relative to the servlet. For example, the following class might be used to implement a web service that performs a simple addition operation:

@WebServlet(urlPatterns={"/math/*"})
public class MathServlet extends DispatcherServlet {
    @RequestMethod("GET")
    @ResourcePath("/sum")
    public double getSum(double a, double b) {
        return a + b;
    }
}

The following request would cause the method to be invoked, and the service would return the value 6 in response:

GET /math/sum?a=2&b=4

JSONEncoder

The JSONEncoder class, which is used internally by DispatcherServlet to serialize response data, converts return values to their JSON equivalents as follows:

  • CharSequence: string
  • Number: number
  • Boolean: true/false
  • Iterable: array
  • java.util.Map: object

Note that collection types are not required to support random access; iterability is sufficient. This is an important feature, as it allows service implementations to stream result data rather than buffering it in memory before it is written.

ResultSetAdapter

HTTP-RPC’s ResultSetAdapter class implements the Iterable interface and makes each row in a JDBC result set appear as an instance of Map, allowing query results to be efficiently serialized as an array of JSON objects.

For example, consider a web service that returns the result of a SQL query on this table, taken from the MySQL sample database:

CREATE TABLE pet (
    name VARCHAR(20),
    owner VARCHAR(20),
    species VARCHAR(20), 
    sex CHAR(1), 
    birth DATE, 
    death DATE
);

A method to retrieve a list of all pets belonging to a given owner might be implemented as shown below. Note that the example uses HTTP-RPC’s Parameters class to simplify query execution using named parameters rather than positional values. Also note that the method uses JSONEncoder to explicitly write the results to the output stream rather than simply returning the adapter instance, to ensure that the underlying result set is closed and system resources are not leaked:

@RequestMethod("GET")
public void getPets(String owner) throws SQLException, IOException {
    try (Connection connection = DriverManager.getConnection(DB_URL)) {
        Parameters parameters = Parameters.parse("SELECT name, species, sex, birth FROM pet WHERE owner = :owner");

        parameters.put("owner", owner);

        try (PreparedStatement statement = connection.prepareStatement(parameters.getSQL())) {
            parameters.apply(statement);

            try (ResultSet resultSet = statement.executeQuery()) {
                JSONEncoder jsonEncoder = new JSONEncoder();

                jsonEncoder.writeValue(new ResultSetAdapter(resultSet), getResponse().getOutputStream());
            }
        }
    } finally {
        getResponse().flushBuffer();
    }
}

A response produced by the method might look something like this, where each object in the array represents a row from the result set:

[
  {
    "name": "Claws",
    "species": "cat",
    "sex": "m",
    "birth": 763880400000
  },
  {
    "name": "Chirpy",
    "species": "bird",
    "sex": "f",
    "birth": 905486400000
  },
  {
    "name": "Whistler",
    "species": "bird",
    "sex": null,
    "birth": 881643600000
  }
]

With just a few lines of code, query results can be quickly and efficiently returned to the caller, with no intermediate buffering required.

More Information

This article introduced the HTTP-RPC framework and provided an example of how the ResultSetAdapter class can be used to efficiently transform JDBC query results into JSON. For more information, see the project README.

Named Parameters in JDBC Queries

Prepared statements are a common way to execute parameterized queries in JDBC. For example, the following SQL might be used to retrieve a list of all users whose first or last name matches a particular character sequence:

SELECT * FROM user WHERE first_name LIKE ? or last_name LIKE ?

Parameter values are supplied at runtime via indexed setter methods defined by the PreparedStatement class:

statement.setString(1, pattern);
statement.setString(2, pattern);

This works fine for simple queries, but it becomes increasingly difficult to manage as the number of parameters grows. It is also redundant – although this query only requires a single argument, two parameter values must be supplied.

The Java Persistence API (JPA) provides a more convenient alternative using named parameters. For example, the above query might be written as follows in JPQL:

SELECT u FROM User u WHERE u.firstName LIKE :pattern or u.lastName LIKE :pattern

This is more readable and less verbose, as the caller only needs to provide the value of the “pattern” parameter once. It is also more resilient to changes, as the arguments are not dependent on ordinal position. Unfortunately, it requires a JPA-compliant object-relational mapping (ORM) framework such as Hibernate, a dependency that may not be satisfiable in all situations.

The org.httprpc.sql.Parameters class provided by the HTTP-RPC framework brings named parameter support to JDBC. The parse() method of this class is used to create a Parameters instance from a JPA-like SQL query; for example:

SELECT * FROM user WHERE first_name LIKE :pattern or last_name LIKE :pattern

It takes a string or reader containing the query text as an argument:

Parameters parameters = Parameters.parse(sqlReader);

The getSQL() method of the Parameters class returns the processed query in standard JDBC syntax. This value can be used in a call to Connection#prepareStatement():

PreparedStatement statement = connection.prepareStatement(parameters.getSQL());

Parameter values are specified via the put() method:

parameters.put("pattern", pattern);

The values are applied to the statement via the apply() method:

parameters.apply(statement);

Once applied, the query can be executed:

ResultSet resultSet = statement.executeQuery();

Note that Parameters is not limited to queries; it can also be used for updates.

A complete example using the Parameters class can be found here. It is a simple REST service that allows a caller to search a database of pets by owner name.

See the project README for more information.

JTemplate: Data-Driven Presentation Templates for Java

Templates are documents that describe an output format such as HTML, XML, or CSV. They allow the ultimate representation of a data structure to be specified independently of the data itself, promoting a clear separation of responsibility.

The CTemplate system defines a set of "markers" that are replaced with values supplied by the data structure (which CTemplate calls a "data dictionary") when a template is processed. In JTemplate, the data dictionary is provided by an instance of java.util.Map whose entries represent the values supplied by the dictionary.

For example, the contents of the following map might represent the result of some simple statistical calculations:

{
    "count": 3, 
    "sum": 9.0,
    "average": 3.0
}

A template for transforming this data into HTML is shown below:

<html>
<head>
    <title>Statistics</title>
</head>
<body>
    <p>Count: {{count}}</p>
    <p>Sum: {{sum}}</p>
    <p>Average: {{average}}</p> 
</body>
</html>

At execution time, the "count", "sum", and "average" markers are replaced by their corresponding values from the data dictionary, producing the following markup:

<html>
<head>
    <title>Statistics</title>
</head>
<body>
    <p>Count: 3</p>
    <p>Sum: 9.0</p>
    <p>Average: 3.0</p> 
</body>
</html>

TemplateEncoder Class

JTemplate provides the TemplateEncoder class for merging a template document with a data dictionary. Templates are applied using one of the following TemplateEncoder methods:

public void writeValue(Object value, OutputStream outputStream) { ... }
public void writeValue(Object value, OutputStream outputStream, Locale locale) { ... }
public void writeValue(Object value, Writer writer) { ... }
public void writeValue(Object value, Writer writer, Locale locale) { ... }

The first argument represents the value to write (i.e. the data dictionary), and the second the output destination. The optional third argument represents the locale for which the template will be applied. If unspecified, the default locale is used.

For example, the following code snippet applies a template named map.txt to the contents of a data dictionary whose values are specified by a hash map:

HashMap<String, Object> map = new HashMap<>();

map.put("a", "hello");
map.put("b", 123");
map.put("c", true);

TemplateEncoder encoder = new TemplateEncoder(getClass().getResource("map.txt"), "text/plain");

String result;
try (StringWriter writer = new StringWriter()) {
    encoder.writeValue(map, writer);

    result = writer.toString();
}

System.out.println(result);

If map.txt is defined as follows:

a = {{a}}, b = {{b}}, c = {{c}}

this code would produce the following output:

a = hello, b = 123, c = true

Additional Information

This article introduced the JTemplate framework and provided a brief overview of its key features.

The latest JTemplate release can be downloaded here. For more information, see the project README.

Regular Expression to Validate a Comma-Separated List of Email Addresses

Recently, I needed to create a regular expression to validate the format of a comma-separated list of email addresses. Just thought I’d share the result in case it is of use to anyone. The pattern I came up with for matching a single address is as follows:

[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+

Note that I didn’t attempt to handle every possible combination of valid characters – only what I’d consider the most common ones.

The pattern is simply duplicated for the comma-separated list. Here’s an example of applying the pattern in Java:

Pattern addressListPattern = Pattern.compile(String.format("%1$s(,\\s*%1$s)*",
    "[\\w-]+(\\.[\\w-]+)*@[\\w-]+(\\.[\\w-]+)+"),
    Pattern.UNICODE_CHARACTER_CLASS);

System.out.println(addressListPattern.matcher("xyz").matches()); // false
System.out.println(addressListPattern.matcher("foo@bar.com").matches()); // true
System.out.println(addressListPattern.matcher("foo@bar.com, xyz").matches()); // false
System.out.println(addressListPattern.matcher("foo@b-ar.com, f.oo@b-ar.co.uk").matches()); // true
System.out.println(addressListPattern.matcher("føo@b-ar.com, f-oo@b-ar.co.uk").matches()); // true

Note that Pattern.UNICODE_CHARACTER_CLASS is not necessary (or supported) in Android, which supports Unicode character matching by default.

Hope it helps!