Validating XML files on Google App Engine

Validating an XML file against a schema on Google App Engine does not differ from standard validation; you just have to decide how to load the schema file. GAE uses Jetty as its web container, so a schema file (or any file under the WEB-INF directory) can be loaded as follows:

private static final String SCHEMA_RELATIVE_PATH = "/WEB-INF/schema/mathml2/mathml2.xsd";

public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException {
    ServletFileUpload fileUpload = new ServletFileUpload();
    // set the file size limit
    fileUpload.setSizeMax(MAX_SIZE_LIMIT);

    response.setContentType(CONTENT_TYPE);

    PrintWriter out = response.getWriter();

    String fileName = null;
    MathMLErrorHandler errorHandler = new MathMLErrorHandler();
    try {
        FileItemIterator iterator = fileUpload.getItemIterator(request);
        while (iterator.hasNext()) {
            FileItemStream item = iterator.next();
            InputStream mathMLContent = item.openStream();
            if (item.isFormField()) {
                out.println("Got a form field: " + item.getFieldName());
            } else {
                fileName = item.getName();
                // validate MathML content
                validateMathMLContent(mathMLContent, errorHandler);
                out.println("---------------------------------------");
                out.println(fileName + " is valid");
                out.println("---------------------------------------");
            }
        }
    } catch (FileUploadException e) {
        e.printStackTrace(out);
    } catch (SAXException e) {
        if (fileName != null) {
            out.println("---------------------------------------");
            out.println(fileName + " is NOT valid with respect to mathml2.xsd schema");
            out.println("---------------------------------------");
        }

        out.println(errorHandler.getExceptionStackTrace());

        errorHandler.getSAXParseException().printStackTrace(out);
    } catch (IllegalArgumentException e) {
        e.printStackTrace(out);
    } catch (ArithmeticException e) {
        e.printStackTrace(out);
    }
}

private void validateMathMLContent(InputStream mathMLContent, MathMLErrorHandler errorHandler) throws SAXException, IOException {
    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    ServletContext servletContext = getServletContext();

    Source schemaSource = new StreamSource(new File(servletContext.getRealPath(SCHEMA_RELATIVE_PATH)));

    Schema schema = schemaFactory.newSchema(schemaSource);

    MathMLUtilities.validateMathMLFile(mathMLContent, schema, errorHandler);
}


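The schema file is located through the ServletContext, which is the standard servlet API for reaching files inside the web application. Note that getRealPath() can return null when the container does not unpack the application onto a file system; ServletContext.getResourceAsStream() is an alternative in that case.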
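The MathMLErrorHandler and MathMLUtilities classes used above are not shown in this post. As a rough idea of what such a validation step might look like with the standard javax.xml.validation API, here is a minimal sketch. All names in it (SchemaValidationSketch, CollectingErrorHandler, validate, schemaFromString) are made up for illustration, and the schema is built from a string only so that the example runs on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not the MathMLUtilities class from the post.
final class SchemaValidationSketch {

    /** Collects warnings and errors instead of stopping at the first one. */
    static final class CollectingErrorHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        public void fatalError(SAXParseException e) throws SAXException {
            // Not well-formed XML: there is nothing sensible to continue with.
            throw e;
        }
    }

    /** Builds a Schema from XSD text (a file or URL source works the same way). */
    static Schema schemaFromString(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    /** Validates the stream and returns the collected messages; an empty list means valid. */
    static List<String> validate(InputStream xml, Schema schema) throws SAXException, IOException {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(handler);
        validator.validate(new StreamSource(xml));
        return handler.messages;
    }
}
```

Collecting the messages rather than throwing on the first error is a design choice: it lets the servlet print every schema violation for the uploaded file in one response.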


Posted in Uncategorized | Leave a comment

File Upload on Google App Engine

Google App Engine does not allow you to create files or write to files on its underlying GFS (Google File System). If you want to deploy an application that takes files as input, processes them and returns some output, a natural question arises: how do you upload a file to Google App Engine? Most of the blogs and forums that I have visited recommend Apache Commons FileUpload. You can download the binaries from the Apache Commons FileUpload project page. After you download the binary distribution and copy it under the WEB-INF/lib directory, you have to download one more jar file in order to upload a file to GAE successfully: commons-io-1.4.jar from the Apache Commons IO project, because commons-fileupload-1.2.1.jar depends on it. This is a little bit tricky, because when you copy commons-fileupload-1.2.1.jar you do not normally expect to need any other library. But when you try to upload a file, GAE simply reports an Internal Server Error and you start looking for possible reasons for the failure. The reason is the missing Commons IO library.
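One way to catch this kind of missing dependency before the first upload fails is to check at startup whether the expected classes can be loaded. The sketch below is only an illustration: ClasspathCheck and isClassPresent are hypothetical names, and the two class names in main are examples of classes shipped in the FileUpload and Commons IO jars.

```java
// Hypothetical diagnostic helper; not part of any of the libraries mentioned.
final class ClasspathCheck {

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isClassPresent(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.commons.fileupload.servlet.ServletFileUpload", // commons-fileupload
            "org.apache.commons.io.IOUtils"                            // commons-io
        };
        for (String name : required) {
            System.out.println(name + (isClassPresent(name) ? " found" : " MISSING"));
        }
    }
}
```

The same check could be called from a servlet's init() method and written to the log, so that a missing jar shows up as a readable message instead of a generic Internal Server Error.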

1. HTML Submit form used as a client sending the file

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>MathML on the cloud</title>
</head>
<body>
<div align="center">

<table border="1">
<tbody>
<tr>
<th>Calculator Form</th>
<th>Validation Form</th>
</tr>

<tr align="center">
<td>
<form name="filesForm" action="/mathmlproject_gae" method="post"
enctype="multipart/form-data">Content Markup File <input type="file" name="Content Markup File">
<br />
<br />
<input type="submit" name="Submit" value="Calculate">
</form>

</td>

</tr>
</tbody>
</table>
</div>
</body>
</html>

This is a simple upload page. The full page is composed of three forms (only one is shown above), each one uploading a file and sending it to its dedicated servlet. The action attribute of each form names the servlet that will process the HTTP request. Java servlets use HTTP as the transport for uploading files.

2. The servlet, waiting for requests

// some import statements

import org.apache.commons.fileupload.FileItemIterator;
import org.apache.commons.fileupload.FileItemStream;
import org.apache.commons.fileupload.FileUploadException;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

import org.xml.sax.SAXException;

@SuppressWarnings("serial")
public class MathMLServlet extends HttpServlet {
    private static final long MAX_SIZE_LIMIT = 50000;

    private static final String CONTENT_TYPE = "text/plain";

    public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException {
        ServletFileUpload fileUpload = new ServletFileUpload();
        // set the file size limit
        fileUpload.setSizeMax(MAX_SIZE_LIMIT);

        response.setContentType(CONTENT_TYPE);

        PrintWriter out = response.getWriter();
        try {
            FileItemIterator iterator = fileUpload.getItemIterator(request);
            while (iterator.hasNext()) {
                FileItemStream item = iterator.next();
                InputStream mathMLContent = item.openStream();
                if (item.isFormField()) {
                    out.println("Got a form field: " + item.getFieldName());
                } else {
                    String fileName = item.getName();
                    String fieldName = item.getFieldName();
                    String contentType = item.getContentType();

                    // some statements

The MathMLServlet gets the file from the request by using the Apache Commons FileUpload library.
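Because the uploaded file cannot be written to disk on App Engine, its content has to be processed straight from the InputStream returned by item.openStream(). If the processing step needs the whole file at once, the stream can be copied into memory with an explicit size guard. The following is a minimal sketch of that idea; UploadBuffer and readFully are hypothetical names, not part of Commons FileUpload.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for holding a small upload in memory.
final class UploadBuffer {

    /**
     * Reads the stream to its end and returns the bytes.
     * Throws IOException as soon as more than maxBytes have been read.
     */
    static byte[] readFully(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            if (buffer.size() + read > maxBytes) {
                throw new IOException("Upload exceeds " + maxBytes + " bytes");
            }
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }
}
```

In the servlet above this could be called as UploadBuffer.readFully(mathMLContent, 50000). The setSizeMax() call already limits the request size, so the check here is only a second line of defence.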

3. The deployment descriptor web.xml file for MathMLServlet

<?xml version="1.0" encoding="utf-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
    version="2.5">

    <!-- MathMLServlet -->
    <servlet>
        <servlet-name>MathMLGAEServlet</servlet-name>
        <servlet-class>mathml.servlet.MathMLServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>MathMLGAEServlet</servlet-name>
        <url-pattern>/mathmlproject_gae</url-pattern>
    </servlet-mapping>

    <!-- Welcome page -->
    <welcome-file-list>
        <welcome-file>index.html</welcome-file>
    </welcome-file-list>

</web-app>



“Inout” parameters in GWT RPC

There is no such thing as "inout" or "passed by reference" parameters in GWT RPC.

Whenever we pass parameters we tend to think in terms of "in", "out" and "inout", or, as they are sometimes called, "passed by value" and "passed by reference" parameters.

GWT RPC allows you to pass method parameters between the client and the server when invoking methods on the service interface.

The interface might be:

public interface GreetingService extends RemoteService {
    String greetServer(String input, List<String> strings) throws IllegalArgumentException;
}

with the following implementation on the server side:

public String greetServer(String input, List<String> strings) throws IllegalArgumentException {
    strings.add("input");
    return "Hello";
}

Invoking the service on the client side is shown below. A new list is created on the client side and passed to “greetServer” method. The initial expectation might be that when the service method is executed on the server side and the onSuccess() method is called on the client side the “list” would contain one string of value “input”.

final List<String> list = new ArrayList<String>();
greetingService.greetServer(textToServer, list,
        new AsyncCallback<String>() {
            public void onFailure(Throwable caught) {
            }
            public void onSuccess(String result) {
                // !! list.size() == 0 here: the client-side list is unchanged
            }
        });

But as I mentioned, there is no such thing as "inout" parameters in GWT RPC. The list is passed to the server and modified there, but the modified object is not returned to the client. If you would like to return more than one value, you have to implement a complex object containing all the results and declare this object type as the return type. An example can be found in this Google thread.

What bothered me here is that this is not easy to see from the GWT documentation. If you step back for a moment you can probably find a number of reasons why there are no "inout" parameters in GWT RPC, but I think there should be an explicit note in the documentation.
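To make the behaviour concrete without a running GWT application, the sketch below imitates the round trip in plain Java. This is an analogy under stated assumptions: GWT RPC uses its own serialization format, and here java.io serialization merely stands in for "the arguments and the result cross the wire as copies". GreetingResult, copyOverWire and call are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class InOutSketch {

    /** Hypothetical result type that bundles everything the client needs back. */
    static class GreetingResult implements Serializable {
        private static final long serialVersionUID = 1L;
        private String greeting;
        private List<String> strings;

        GreetingResult() { } // no-argument constructor, as serializable RPC types usually need

        GreetingResult(String greeting, List<String> strings) {
            this.greeting = greeting;
            this.strings = strings;
        }

        String getGreeting() { return greeting; }
        List<String> getStrings() { return strings; }
    }

    /** Stand-in for the network: returns a deep copy made by serializing the value. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyOverWire(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    /** "Server side": modifies its own copy and returns all results explicitly. */
    static GreetingResult greetServer(String input, List<String> strings) {
        strings.add("input");
        return new GreetingResult("Hello", strings);
    }

    /** "Client side" call: the argument goes out as a copy, the result comes back as a copy. */
    static GreetingResult call(String input, ArrayList<String> clientList)
            throws IOException, ClassNotFoundException {
        ArrayList<String> serverCopy = copyOverWire(clientList);
        return copyOverWire(greetServer(input, serverCopy));
    }
}
```

After call(...) returns, the list created on the client is still empty, while the list inside the returned GreetingResult holds the value added on the server. Declaring such a result type as the service method's return type is the "complex object" approach described above.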


Plugin (Bundle) “org.datanucleus.store.appengine” is already registered

While trying to just make a Web Application Project "build" in Eclipse, I probably made a number of copy/pastes which I am not particularly proud of. Once there were no compilation errors I ran the project and got the following error:

Caused by: org.datanucleus.exceptions.NucleusException: Plugin (Bundle) "org.datanucleus.store.appengine" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/home/kiko/tx1/war/WEB-INF/lib/datanucleus-appengine-1.0.7.final.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/home/kiko/tx1/war/WEB-INF/lib/datanucleus-appengine-1.0.6.final.jar."
at org.datanucleus.plugin.NonManagedPluginRegistry.registerBundle(NonManagedPluginRegistry.java:434)
at org.datanucleus.plugin.NonManagedPluginRegistry.registerBundle(NonManagedPluginRegistry.java:340)
at org.datanucleus.plugin.NonManagedPluginRegistry.registerExtensions(NonManagedPluginRegistry.java:222)
at org.datanucleus.plugin.NonManagedPluginRegistry.registerExtensionPoints(NonManagedPluginRegistry.java:153)
at org.datanucleus.plugin.PluginManager.registerExtensionPoints(PluginManager.java:82)
at org.datanucleus.OMFContext.<init>(OMFContext.java:160)
at org.datanucleus.OMFContext.<init>(OMFContext.java:141)
at org.datanucleus.ObjectManagerFactoryImpl.initialiseOMFContext(ObjectManagerFactoryImpl.java:144)
at org.datanucleus.jdo.JDOPersistenceManagerFactory.initialiseProperties(JDOPersistenceManagerFactory.java:316)
at org.datanucleus.jdo.JDOPersistenceManagerFactory.<init>(JDOPersistenceManagerFactory.java:260)
at org.datanucleus.store.appengine.jdo.DatastoreJDOPersistenceManagerFactory.<init>(DatastoreJDOPersistenceManagerFactory.java:71)
at org.datanucleus.store.appengine.jdo.DatastoreJDOPersistenceManagerFactory.getPersistenceManagerFactory(DatastoreJDOPersistenceManagerFactory.java:126)
... 50 more

As the exception message suggests, I probably have two versions of the same jar in my lib folder. And this is correct. Both datanucleus-appengine-1.0.6.final.jar and datanucleus-appengine-1.0.7.final.jar are in the lib folder. I shouldn't have copy/pasted that much. I still don't know which one I should use, but I will find that out in a minute.

What impresses me here is the clarity of the message. Problems in the class path have always troubled Java developers, and in many cases a different version of a jar in the class path would have led to "unexpected" errors. Great job, DataNucleus.
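Not every library reports this as clearly, so it can be useful to scan a lib folder for the same artifact in several versions. The sketch below does that for a list of file names. It is a rough heuristic under stated assumptions: DuplicateJarFinder is a hypothetical name, and the pattern only recognises names of the form artifact-version.jar where the version starts with a digit and contains no hyphen.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for spotting duplicate jar versions by file name.
final class DuplicateJarFinder {

    // group(1) = artifact name, group(2) = version (must start with a digit)
    private static final Pattern VERSIONED_JAR = Pattern.compile("^(.+?)-(\\d[\\w.]*)\\.jar$");

    /** Returns artifact name -> versions, only for artifacts present in two or more versions. */
    static Map<String, List<String>> findDuplicates(Collection<String> jarFileNames) {
        Map<String, List<String>> versionsByArtifact = new TreeMap<>();
        for (String fileName : jarFileNames) {
            Matcher matcher = VERSIONED_JAR.matcher(fileName);
            if (matcher.matches()) {
                versionsByArtifact
                        .computeIfAbsent(matcher.group(1), key -> new ArrayList<>())
                        .add(matcher.group(2));
            }
            // Names that do not match the pattern are ignored.
        }
        // Keep only artifacts that appear in more than one version.
        versionsByArtifact.values().removeIf(versions -> versions.size() < 2);
        return versionsByArtifact;
    }
}
```

For a real project the input could come from new java.io.File("war/WEB-INF/lib").list(); for the two DataNucleus jars named in the exception above, the result would contain a single entry for datanucleus-appengine with both versions.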


Unresolved type in GWT

I was continuing my work with GWT and Google App Engine when an error was shown in the Development Mode View:

11:09:15.233 [DEBUG] [tx1] Validating newly compiled units
11:09:15.244 [ERROR] [tx1] Errors in 'file:/home/kiko/tx1/src/com/tx1/shared/entities/Predicate.java'
11:09:15.260 [ERROR] [tx1] Line 3: The import java.net cannot be resolved
11:09:15.430 [ERROR] [tx1] Line 24: URI cannot be resolved to a type

The error message seems quite straightforward. I have a class com.tx1.shared.entities.Predicate, which contains a field of type java.net.URI.

public class Predicate {
private URI uri;
}

While working with Eclipse I start the application in Development Mode. This means that the GWT Java code is not compiled to JavaScript but runs as Java bytecode in a JVM. This gives great power to the developer, since you can debug a GWT application like a plain Java application.

Even so, the client code must eventually be translatable to JavaScript, so GWT only allows the JRE classes it can emulate in translatable packages such as client and, in this project, shared; java.net.URI is not one of them. A list of all the classes that GWT can emulate can be found in the JRE Emulation Reference section of the GWT documentation.

Further information can be found in this Google thread.
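One possible workaround, sketched below, is to keep the value as a plain String in the class that GWT has to translate and to convert it to java.net.URI only in server-side code. This is an assumption about how the class could be reshaped, not the fix the project actually used; PredicateSketch and toUri are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PredicateSketch {

    /** Shared variant of the entity: uses only types that GWT can emulate. */
    static class Predicate {
        private String uri;

        Predicate() { }

        Predicate(String uri) {
            this.uri = uri;
        }

        String getUri() {
            return uri;
        }
    }

    /**
     * Server-only helper. java.net.URI is fine here as long as this method
     * lives outside the packages that GWT translates to JavaScript.
     */
    static URI toUri(Predicate predicate) throws URISyntaxException {
        return new URI(predicate.getUri());
    }
}
```

The trade-off is that a malformed value is only detected when the server converts it, so validation has to happen there rather than when the object is constructed.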


Programming on Google App Engine

Hi folks,

This is our first blog dedicated to cloud computing. It is intended for everyone interested in the cloud, as a place to share ideas, questions and best practices on how to program on the cloud.

Currently, everyone seems to have a different definition of what cloud computing really means. That is why we decided to create this blog, hoping that it will guide you step by step through the whole process of using the cloud as a platform for your cutting-edge applications.

In this blog we will talk about Google and its web application hosting service – App Engine. We will cover its basic concepts and how to write applications for it.

Introducing Google App Engine

Gives you an overview of Google App Engine and its components, tools, and major features.

Creating an application

Describes how to create an application: setting up a development environment, setting up accounts and domain names, and deploying the application on Google App Engine. It also demonstrates how to use several App Engine features (Google Accounts, the datastore, and memcache) to implement a pattern common to many web applications: storing and retrieving user preferences.

Handling Web requests

Contains details about App Engine’s architecture, the various features of the frontend, app servers, and static file servers, and details about the App Server runtime environment for Python and Java. The frontend routes requests to the App Engine servers and the static file servers, and manages secure connections and Google Accounts authentication and authorization.

Datastore Entities

Gives a brief introduction to the App Engine datastore, a strongly consistent, scalable data object storage system with support for local transactions. It also introduces data entities, keys and properties, and the Python and Java APIs for creating, updating, and deleting entities.

Datastore Queries

Describes the datastore queries and indexes, and the Python and Java APIs for queries. The App Engine datastore’s query engine uses prebuilt indexes for all queries. This paragraph describes the features of the query engine in detail, and how each feature uses indexes. It also discusses how to define and manage indexes for your application queries.

Datastore Transactions

Talks about datastore transactions and how to keep your data consistent. The App Engine datastore uses local transactions in a scalable environment. This paragraph attempts to provide a complete explanation of how the datastore updates data, and how to design your data and your app to best take advantage of these features.

Data Modeling with Python

Introduces data modeling with Python and how to use the Python data modeling API to enforce invariants in your data schema. The datastore itself is schemaless, a fundamental aspect of its scalability.

The Java Persistence API

Introduces the Java Persistence API (JPA), how its concepts translate to the datastore, how to use it to model data schemas, and how using it makes your application easier to port to other environments. JPA is a J2EE standard interface. App Engine also supports another standard interface known as Java Data Objects (JDO), though JDO will not be covered.

The Memory Cache

Gives an overview of Google’s memory cache service – memcache, and its Python and Java APIs. Aggressive caching is essential for high-performance web applications.

Fetching URL and Web Resources

Talks about how to fetch resources on the Internet via HTTP by using the URL Fetch service.

Sending and Receiving Mail and Instant Messages

Talks about how to send and receive mail and instant messages. It covers receiving mail and XMPP chat messages relayed by App Engine using request handlers. It also discusses creating and processing messages using tools in the API.

Bulk Data Operations and Remote Access

Gives you an introduction to performing large maintenance operations on your live application using scripts running on your computer. Tools included in the SDK make it easy to back up, restore, load, and retrieve data in your datastore. You can also write your own tools using the remote access API for data transformations and other jobs, or run an interactive Python shell that uses the remote API to manipulate a live Python or Java app.

Task Queues and Scheduled Tasks

Introduces queues and scheduled tasks. Task queues perform tasks in parallel by running your code on multiple application servers. Tasks can also be executed on a regular schedule with no user interaction.

The Django Web Application Framework

Gives an overview of the Django Web Application Framework with the Python runtime environment. It discusses how to set up a Django project using the Django App Engine Helper, and how to take advantage of Django features via the Helper, such as using the App Engine data modeling interface with forms and test fixtures.

Deploying and Managing Applications

Talks about how to deploy, upload and run your applications on App Engine, to update and test an application using app versions, and how to manage and inspect the running application.
